Saturday, April 13, 2019

Putting the 'Fun' In Functional Indexes

I gave a thirty minute talk at Drupalcon this week on the features in MySQL 8.0 that would be of interest to developers and for such a short talk (and I cut slides to allow for audience questions) I could only cover the highlights. One attendee gently chastised me over no including their favorite new MySQL 8.0 feature -- functional indexes.


What is a Functional Index?


The manual says MySQL 8.0.13 and higher supports functional key parts that index expression values rather than column or column prefix values. Use of functional key parts enables indexing of values not stored directly in the table.   

There are some cool examples in the documentation on setting up some functional indexes, as can seen below.

CREATE TABLE t1 (
  col1 INT, col2 INT, 
  INDEX func_index ((ABS(col1)))
);
CREATE INDEX idx1 ON t1 ((col1 + col2));
CREATE INDEX idx2 ON t1 ((col1 + col2), (col1 - col2), col1);
ALTER TABLE t1 ADD INDEX ((col1 * 40) DESC);

But there are no example queries or examples with query plans provided.  So let us add some data.

select * from t1 where (col1 + col2) ;
+------+------+
| col1 | col2 |
+------+------+
|   10 |   10 |
|   20 |   11 |
|   30 |   12 |
|   40 |   15 |
|   50 |   18 |
+------+------+
5 rows in set (0.0008 sec)

And then lets look at a query plan.

explain select * from t1 where (col1 + col2) > 40\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: range
possible_keys: idx1
          key: idx1
      key_len: 9
          ref: NULL
         rows: 3
     filtered: 100
        Extra: Using where
1 row in set, 1 warning (0.0006 sec)
Note (code 1003): /* select#1 */ 
select `so`.`t1`.`col1` AS `col1`,
`so`.`t1`.`col2` AS `col2` 
from `so`.`t1` 
where ((`so`.`t1`.`col1` + `so`.`t1`.`col2`) > 40)

explain select * from t1 where (col1 * 40) > 90\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: range
possible_keys: functional_index
          key: functional_index
      key_len: 9
          ref: NULL
         rows: 5
     filtered: 100
        Extra: Using where
1 row in set, 1 warning (0.0006 sec)
Note (code 1003): /* select#1 */ 
select `so`.`t1`.`col1` AS `col1`,`so`.`t1`.`col2` AS `col2` 
 from `so`.`t1` where ((`so`.`t1`.`col1` * 40) > 90)

It is interesting to note that in the above case we are told the key's name is 'functional_index' (this is the one created by the ALTER TABLE command and not explicitly given a name).  


Implementation


Functional indexes are implemented as hidden virtual generated columns which means you will have to follow the rules for virtual generated columns And while this functional index takes up no space because virtual generated columns are only computed at select time the index itself does take up space.

Be sure to read the manual for all the restrictions and limitations.


Do They Work With JSON Data?


Well, yes, functional indexes do work with JSON data.  Once again the manual provides an interesting example.

 CREATE TABLE employees (
data JSON, INDEX idx ((CAST(data->>"$.name" AS CHAR(30)) COLLATE utf8mb4_bin)) ); INSERT INTO employees VALUES ('{ "name": "james", "salary": 9000 }'), ('{ "name": "James", "salary": 10000 }'), ('{ "name": "Mary", "salary": 12000 }'), ('{ "name": "Peter", "salary": 8000 }');
EXPLAIN SELECT * FROM employees WHERE data->>'$.name' = 'James';
*************************** 1. row *************************** id: 1 select_type: SIMPLE table: employees partitions: NULL type: ref possible_keys: idx key: idx key_len: 123 ref: const rows: 1 filtered: 100 Extra: NULL 1 row in set, 1 warning (0.0008 sec) Note (code 1003): /* select#1 */
select `so`.`employees`.`data` AS `data`
from `so`.`employees`
where ((cast(json_unquote(json_extract(`so`.`employees`.`data`,_utf8mb4'$.name'))
as char(30) charset utf8mb4) collate utf8mb4_bin) = 'James')

And we can see that it does use the index idx that was defined.



States


One of the examples you will see from other databases for functional indexes include
forcing lower case names. So I created a table with US state names.

create table states (id int, name char(30), primary key (id));

I did find that the syntax was a little fussy on creating the index as it needed an extra set of
parenthesis more than I originally thought it would

create index state_name_lower on states ((lower(name)));



explain select name from states where name = 'texas'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: states
   partitions: NULL
         type: ref
possible_keys: state_name
          key: state_name
      key_len: 121
          ref: const
         rows: 1
     filtered: 100
        Extra: Using where; Using index
1 row in set, 1 warning (0.0008 sec)
Note (code 1003): /* select#1 */
select `so`.`states`.`name` AS `name`
from `so`.`states` where (`so`.`states`.`name` = 'texas')

So please give functional indexes a try and please let me know if you find an interesting or unusual way to use them.


Wednesday, April 3, 2019

MySQL Workbench Spatial Viewer or How to See your GEOMETRY Data

The past couple of blog entries have been on Geographic Information Systems and Geometric Data.  Visualizing that data with MySQL Workbench makes it easier for me to see what the results really mean.
Example drawn by MySQL Workbench 8.0.15
Workbench 8.0.15 will draw the polygon with the Spatial View Option

So how do you get there?

Start Workbench, create a new SQL Tab in your favorite scratch schema, and crate the table below. 

CREATE TABLE `test` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `geom` GEOMETRY NULL,
  PRIMARY KEY (`id`));

Next add some data.

INSERT INTO `test`
  (`geom`)
VALUES
  (st_geomfromtext
  ('polygon((0 0,0 3,3 0, 2 2,0 0),(1 1,1 2,2 1,2 2, 1 1))')
   );

The run the query.

select geom from test;

However the result will default to the Result Grid. Look off to the right hand side of the results window to see a series of stacked icons and the default is the Result Grid.  And that 'BLOB' is the result of the query.  But that result is not exactly visually stunning.

Output from the query in the results grid view
The 'BLOB' is the result of the query.

Click on the Form Editor icon. It is directly below the Result Grid Icon

Select the Form Editor Icon
And you should see the image from the top of this blog entry.

Bonus!


Now scroll down below the Form Editor icon and select Spatial View.

Spatial View of the Query




Tuesday, April 2, 2019

MySQL 8.0 GIS -- Inserting Data & Fun Functions

The last blog entry was very popular and there were lots of requests for some introductory information on the spatial data types. 

Well Known Text Or Binary


I am going to use the GEOMETRY data type over POINT, LINESTRING, or POLYGON as it can store any of those three while the other three can only contain data matching their name (so POINT can holds only point data, etc.). The values are stored in an internal geometry format but it takes wither WKT or WKB formatted data.

Those are Well-Known Text (WKT) or Well-Known Binary (WKB) formats repectively. I am hoping most of your are better with text than binary so the following examples demonstrate how to insert geometry values into a table by converting WKT values to internal geometry format.

So let us start with a simple table.

mysql> create table geom (type text, g geometry);
Query OK, 0 rows affected (0.04 sec)

We can use the ST_GeomFromText function to take some strings and convert into the internal format.

mysql> insert into geom values 
       ('point', st_geomfromtext('point(1 1)'));
Query OK, 1 row affected (0.01 sec)

mysql> insert into geom values 
       ('linestring', st_geomfromtext('linestring(0 0,1 1, 2 2)'));
Query OK, 1 row affected (0.01 sec)

There are type specific functions for POINT, LINESTRING, and POLYGON that we can also take advantage of for this work.

mysql> SET @g = 
     'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))';
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO geom 
       VALUES ('polygon',ST_PolygonFromText(@g));
Query OK, 1 row affected (0.00 sec)

If you do a SELECT * FROM geom; you will get the g column output in binary.  Thankfully we can use ST_AsText() to provide us with something more readable.


mysql> select type, st_astext(g) from geom;
+------------+----------------------------------------------------------+
| type       | st_astext(g)                                             |
+------------+----------------------------------------------------------+
| point      | POINT(1 1)                                               |
| linestring | LINESTRING(0 0,1 1,2 2)                                  |
| polygon    | POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7,5 5)) |
+------------+----------------------------------------------------------+
3 rows in set (0.00 sec)

mysql>


Put the 'fun' in Function

There are functions that can be use with each of these type to provide information., For instance there are the x & Y coordinates for a point.

mysql> select type, st_x(g), st_y(g) from geom where type='point';
+-------+---------+---------+
| type  | st_x(g) | st_y(g) |
+-------+---------+---------+
| point |       1 |       1 |
+-------+---------+---------+
1 row in set (0.00 sec)

We can even use our linestring data to get the minimal bounding rectangle for the given coordinates (basically if we had to put an envelope around the points the results are the coordinates points to enclose those x,y coordinates)

mysql> select st_astext(st_envelope(g)) 
       from geom where type='linestring';
+--------------------------------+
| st_astext(st_envelope(g))      |
+--------------------------------+
| POLYGON((0 0,2 0,2 2,0 2,0 0)) |
+--------------------------------+
1 row in set (0.00 sec)


And we can get the area of a polygon too.

mysql> select type, st_area((g)) from geom where type='polygon';
+---------+--------------+
| type    | st_area((g)) |
+---------+--------------+
| polygon |           96 |
+---------+--------------+
1 row in set (0.00 sec)

And find the mathematical center of that polygon.

mysql> select type, 
       st_astext(st_centroid(g)) 
       from geom where type='polygon';
+---------+--------------------------------------------+
| type    | st_astext(st_centroid(g))                  |
+---------+--------------------------------------------+
| polygon | POINT(4.958333333333333 4.958333333333333) |
+---------+--------------------------------------------+
1 row in set (0.00 sec)

Plus we can get the linestring data if we wanted to draw out polygon.

mysql> select type, 
       st_astext(st_exteriorring(g)) 
       from geom where type='polygon';
+---------+-------------------------------------+
| type    | st_astext(st_exteriorring(g))       |
+---------+-------------------------------------+
| polygon | LINESTRING(0 0,10 0,10 10,0 10,0 0) |
+---------+-------------------------------------+
1 row in set (0.00 sec)



Wednesday, March 27, 2019

MySQL 8.0 Geographic Information Systems (GIS) - How to get the distance between two cities


MySQL before version 5.7 had less than stellar Geographic Information Systems (GIS) support.  In version 5.7 the Boost.Geometry two dimensional or 2D libraries were added.  And with 8.0 came the three dimensional or 3D libraries. But how do you use these features?

I would like to state up front that this is all new to me and this is an attempt for me to document what it takes to go from zero knowledge on GIS to something hopefully better.  What I want to do as an exercise is to get the distance between two places from their longitude and latitude, say two cities near where I live.  So what do we have to do to accomplish that?

It is actually easy with the functions provided if we have the longitude and latitude in an SRID 4326 format.

SELECT ST_Distance(
 (SELECT loc FROM cities WHERE name = 'Trondheim'),
 (SELECT loc FROM cities WHERE name = 'San Francisco')
) AS distance_in_meters;

Okay, but what is an SRID 4326 format, where do I get them, and how do I get data into that format? So let us dig deeper.

I have to thank Norvald Ryeng who is a senior software manager at Oracle for that query roughly transposed by me from his MySQL GIS 8.0 Overview.
Harrison's Clocks were the first successful one to be used by the Royal Navy for navigation and can be seen at the Royal Observatory, Greenwich, United Kingdom. Very interesting story of a man on the cutting edge of his technology for the time who had to overcome so many technical and political problems.

Geometry Class: Background


Geometric data is stored in a geometry data type. Each geometry has a type, a SRID, coordinates, and some other attributes that will be ignored for now. And we will mainly deal with the POINT class.  Think of a point as someplace on a map. We will skip CURVE, LineString, Surface, Polygons, GeometryCollection, MultiPoint, MultiCurve, MultiLineString, MultiSurface, and MultiPolygon.

Columns with a spatial data type have a SRID attribute to indicate the spatial reference system, or SRS, used to store data in that column.  There are SRSs for projecting a globe onto a flat surface (think Mercator Projection maps of the world), non projected representing longitude-latitude on an ellipsoid. In MySQL a SRID value of 0 represents an infinite flat Cartesian plane with no units assigned to the axes and is the default.  Now a group named the European Petroleum Survey Group has their own system with a value of 4326, which you will see later.

Coordinates are represented as double-precision (8-byte) numbers.  These number pairs are either planar (flat) or geodetic (think Earth's surface) and the same pair values from one system can be wildly different than the other.

In the query above we used a column named loc as the point where the cities were located.  You can see the details in Norvald's excellent slides or take my word that these queries were used in a column defined as loc POINT SRID 4326 NOT NULL for now.

INSERT INTO cities (name, loc) VALUES (
 'Trondheim',
 ST_GeomFromText('POINT(64.43048320193547 10.394972698312927)', 4326)
);
INSERT INTO cities (name, loc) VALUES (
 'San Francisco',
 ST_GeomFromText('POINT(37.272615666666667 -122.4425455194445)', 4326)
);

Now Wikipedia says Trondheim is at 63 degrees 25"47'N and 10 degrees 23"36'E which looks pretty close to the numbers above.  And the Nidaros Cathedral is listed as 63.4305 degrees North latitude and 10.3951 degrees East longitude. So saying POINT('latitude' 'longitude') seems to be what I am looking for to designate a location.

To peel back another layer of the onion, lets look at what ST_GeomFromText() does.  The states thusly that ST_GeomFromText 'Constructs a geometry value of any type using its WKT representation and SRID.'

Okay, that cleared up nothing. So what the heck is a WKT representation? The MySQL Manual has a section on Functions That Create Geometry Values from WKT Values.  The Well-Known Text (WKT) representation of geometry values is designed for exchanging geometry data in ASCII form. The OpenGIS specification provides a Backus-Naur grammar that specifies the formal production rules for writing WKT values.  So 'POINT(15 20)' is a WKT.

This is not getting easier, is it?  And we still do not know how to get the distance between two sets of longitude and latitude.

Getting Some Data


I was able to find a very nice list of US Zip Codes (postal codes) at https://gist.github.com/erichurst/7882666/ and used the MySQL Workbench data import agent to create a table. Just three columns ZIP for the Zip Code, LAT for latitude, and LNG for longitude. I stored the ZIP filed as TEXT (leading zeros that are significant disappear if the column in an INT) and the other two fields as REALs.  And there are 33,144 rows of data.

Lets look at the distance between the lovely cities of Justin and Trophy Club, both in the state of Texas. Justin is zip code 76247 and Trophy Club is 76262

 mysql> select * from data where ZIP in (76247, 76262);
+-------+-----------+------------+
| ZIP   | LAT       | LNG        | +-------+-----------+------------+
| 76247 | 33.099993 | -97.340499 |
| 76262 | 33.009335 |  -97.22672 | +-------+-----------+------------+
2 rows in set (0.03 sec)
 
And lets try a query.


mysql> select 
 st_distance(
  st_geomfromtext('point(33.099993 -97.34099)',4326),                     
  st_geomfromtext('point(33.009335 -97.22672)',4326)
  )  as distance;

+--------------------+


| distance           |
+--------------------+
| 14662.554615102901 |
+--------------------+
1 row in set (0.00 sec)



So there is roughly 15,000 meters between the cities of Justin and Trophy Club. And please note this is a flat model not one that takes in the curvature of the earth.  But we can get that easily enough by using ST_Distance_Sphere.

mysql> select st_distance_sphere(
    st_geomfromtext('point(33.099993 -97.34099)',4326),
    st_geomfromtext('point(33.009335 -97.22672)',4326)) 

   as sphere,
    st_distance(

     st_geomfromtext('point(33.099993 -97.34099)',4326),
     st_geomfromtext('point(33.009335 -97.22672)',4326)) 

    as flat;
+--------------------+--------------------+
| sphere             | flat               |
+--------------------+--------------------+
| 14664.184945803418 | 14662.554615102901 |
+--------------------+--------------------+
1 row in set (0.00 sec)







So even in the fairly flat area of North Texas, there is some difference between a spherical and flat world.



Sunday, March 10, 2019

MySQL and PHP Basics Part I

I have had some requests to write some blogs on the basics of using PHP and MySQL together.  This will not be a series for the experienced as it will start at a level where I will go into a lot of details but expect very few prerequisites from the reader.  If this is not you, please move on. If it is you and you read something you do not understand, please contact me to show me where I assumed too much.

PHP and MySQL are both in their mid twenties and both vital in the worlds of developers.  With the big improvements in PHP 7 and MySQL 8, I have found a lot of developers flocking to both but stymied by the examples they see as their are many details not explained. So let's get to the explaining!

1. Use the latest software

If you are not using PHP 7.2 or 7.3 (or maybe 7.1) then you are missing out in features and performance.  The PHP 5.x series is deprecated, no longer support, and is quickly disappearing.  

MySQL 8.0 is likewise a big advancement but many sites are using earlier versions.  If you are not on 5.6 with plans to upgrade to 5.7 then you are about to be left behind.  If you are running an earlier version then you are using antique code. Also get your MySQL from MySQL as your Linux distro may not be current, especially for the smaller ones.  The APT and DEB repos can be  found here and there are Docket containers available too. 

In many cases it is a fight to keep your software core tools up to date, or fairly up to to date.  The time and heartache in fighting problems resolved in a later version of software or creating a work around for a feature not in your older version will eventually bite you hard and should be viewed as a CRM (Career Limiting Move).  BTW hiring managers look for folks with current skills not skill for a decade old version of the skills.

2. Do not pick one connector over another (yet!)

PHP is a very rich environment for developers and it has three viable options for connecting to MySQL databases.  Please note that the older mysql connector is deprecated, no longer support, and is to be avoided.  It was replaced by the mysqli or mysqlnd (native driver) and is officially supported by Oracle's MySQL Engineers.  Next is the PDO (public data objects) connector that is designed to be database agnostic but there is no overall maintainer who watches out for the code but Oracle MySQL Engineers do try to fix MySQL related issues if they do not impinge on other PDO code. And the newest, using the new MySQL X DevAPI and X protocol is the X DevAPI connector which supports both SQL and NoSQL interfaces.

The good news for developers is that you can install all three, side by side, with no problem.  For those staring out the ability to transpose from connector can be a little confusing as they work just a bit differently from each other but the ability to use two or more is a good skill to have.  Such much like being able to drive a car with an automatic or manual transmission, it does give you more professional skills.

Next time we will install PHP, MySQL, the three connectors, and some other cool stuff so you can start using PHP to access your MySQL servers.

Friday, March 8, 2019

SCaLE 17x

The Southern California Linux Expo has a MySQL Track this year and if you are around Pasadena, you should be here.  Besides the amazing expo hall, this is the biggest open source show on the left coast.  The MySQL track features talks on the MySQL Document Store, Kuberneties, query optimization, analytics, consistency at Facebook, and MySQL Security. 

Thursday, February 28, 2019

Giving back to the Community -- ChickTech Austin

I am doing something this weekend that is way out of my comfort zone and I want to encourage you to do something similar.  On Saturday I will be teaching a pop up workshop for Chick Tech Austin titled An Introduction to Databases with MySQL.  

Why is this out of my comfort zone?  I regularly speak to computer professional on databases and programming and have done so for years.  But this Saturday the class is made up of young women between the ages of fourteen and eighteen.For us oldsters, the class is made up of people half the age of MySQL, Python, and Java.  Their parents are probably younger than Structured Query Language! But these young folks are going to be inheriting our code bases based on technologies used today and integrated with future innovations. 

Another point of reference is that the Hudson continuous integration tool came out fourteen years ago and it seems like the CI tools world has really blossomed since.  And that is time life span of some folks in the class.

Teaching up and coming developers how to use Structured Query Language and other long lived technologies is vital.  These cogs of our infrastructures have lasted as they provide utility and fulfill needed function.  SQL may have many oddities but it has lasted so very long because it is so very useful. But fewer developers each year seem to get any formal training in SQL, sets, relational calculus, symbolic logic, or the basics of relational databases.  Yet the cries of help on sites like Quora and Stackoverflow for relatively simple SQL and database questions seem to grow exponentially.

What I am asking (or pleading) you to do is to pass on your knowledge of basics to others.  You should be able to speak on something basic you do on a regular basis to a novice for at least five minutes.  Find a local user group, Meetup, or other organization and offer them a short presentation.  Can't find one, start one! Feynman said that teaching really sharpens your own skill sets. 

Real novices need very clear examples.  Saying, "Oh it is like a Generator in PHP" is not going to help them frame of reference wise.  Take something you do every day (use the vi editor, explain a query, use css) and write a very simple guide and/or cheat sheet.  Thee is an audience for you information.

Or contact local groups and ask what subjects they would want as a presentation.  A simple 'How I set up PHP with Apache' covering the steps you took and what you discovered on the way may be old hat to you but there are folks stumbling around who would welcome your guidance.  Organizers of meetups and user groups are dying for presentations and even a five to ten minutes talk is welcome.  Ask your local groups what they would like to see and if you do not know the subject, plunge in and build your own skills.  Nobody is expecting perfection and often talking about how to dig yourself out of the potholes you smacked into is the real value of your presentation.

I wish there was a formalized way so that anyone with X years of experience could share their knowledge.  But there is not.  So I am asking you to get up off your backside, find a group near you, and offer your expertise.  Think of this as meta-documentation for the future generations who will have to suffer with your code and systems long after you have logged off for the last time.

And if you want to talk about an introduction to databases, I will send you my materials and help you prep for the presentation.