Monday, July 8, 2019

MySQL Workbench and OpenStreetMap Data Visualization

I have a presentation next month on MySQL and GIS.  MySQL 8.0 has benefited greatly from the three dimensional libraries from Boost.Geometry. There are many facets to the Geographic Data world that it is damn near impossible not to lurch down one rabbit hole into another in an unending spiral of acronyms, standards, projections, and functions.  But thankfully I have MySQL Workbench to aid me.

Texas


I wanted some test data to use for some GIS exercises and was very happy to find many useful sets curated by the OpenStreetMaps folks. Shapefiles are used to hold the various data points of an item of interest.  I had assumed that the data would have some sort of longitude/latitude pairs but was wondering what I would need to do to work with that data and what ever came bundled with it.  I download the Texas data and then loaded it into the database.

You will need a copy of the ogr2org utility This wonderful program reads the raw shapefile and converts it into SQL. You may want to get the source but hopefully a version may be available for you Linux distribution.  There is a copy of the ogr2ogr program that comes with MySQL Workbench but I have not been able to get it to convert the data and load that data into a MySQL instance, with either Windows, Fedora, or Ubuntu.

$ogr2ogr -overwrite -progress -f "MySQL" mysql:texas,user=dave,password=****** gis_osm_natural_a_free_1.shp
0...10...20...30...40...50...60...70...80...90...100 - done.
$ ogr2ogr -overwrite -progress -f "MySQL" mysql:texas,user=dave,password=***** gis_osm_natural_free_1.shp
0...10...20...30...40...50...60...70...80...90...100 - done.


The Data


The ogr2org utility reads the shape files and puts all the attributes into a a table. 

mysql> DESC gis_osm_natural_free_1;
+---------+--------------+------+-----+---------+----------------+
| Field   | Type         | Null | Key | Default | Extra          |
+---------+--------------+------+-----+---------+----------------+
| OGR_FID | int(11)      | NO   | PRI | NULL    | auto_increment |
| SHAPE   | geometry     | NO   | MUL | NULL    |                |
| osm_id  | varchar(10)  | YES  |     | NULL    |                |
| code    | decimal(4,0) | YES  |     | NULL    |                |
| fclass  | varchar(28)  | YES  |     | NULL    |                |
| name    | varchar(100) | YES  |     | NULL    |                |
+---------+--------------+------+-----+---------+----------------+
6 rows in set (0.01 sec)

And a sample line from that table.

mysql> select OGR_FID,ST_ASTEXT(SHAPE) as Shape, osm_id, code, fclass, name FROM gis_osm_natural_free_1 limit 1;
+---------+-------------------------------+----------+------+--------+--------------------+
| OGR_FID | Shape                         | osm_id   | code | fclass | name               |
+---------+-------------------------------+----------+------+--------+--------------------+
|       1 | POINT(29.3060929 -94.7679897) | 80029566 | 4141 | beach  | Stewart Beach Park |
+---------+-------------------------------+----------+------+--------+--------------------+
1 row in set (0.00 sec)

The SHAPE column has the desired longitude and latitude. So now we have this data but I had zero clue to where Stewart Beach Park was located.

MySQL Workbench to the Rescue


Workbench makes it easy to 'see' where the data was.  You can see in the below example that Stewart Beach Park is the first row returned.

MySQL Workbench has an amazing number of features including the ability to help display GIS data.

If you right click on the 'BLOB' graphic under the shape column you will see an option to Show Point In Browser.  

The location of Stewart Beach Park on the Galveston Sea Wall

So Stewart Beach Park is right on the Galveston Sea Wall and around the corner from the cruise ship terminals. 

Mid-Atlantic Developer's Conference


I will be speaking on MySQL & GIS at the Mid-Atlantic Developer's Conference and you can still attend but tickets are going quickly. 








Monday, July 1, 2019

Plot your Location MySQL Workbench and OpenStreetMap

MySQL has added a lot of functionality for Geographical Information System (GIS) data in the last two releases. This has given us better ways to save and explore GIS data. MySQL Workbench is an amazing tool that many do not realize has vastly upped its game for supporting GIS information.  But did you know you can use MySQL Workbench with OpenStreetMap to plot locations?

1. Get your Longitude and Latitude


These, believe it or not, is the hard part.  Go to  https://www.latlong.net/ and type in the name of your location.  It will report back your longitude and your latitude.  For the example I will be using the information for the city of Justin, Texas.


Using www.longlat.net to find the location of Justin, Texas

2. Create Table and Add Data 

Now create a table using MySQL Workbench in your favorite test schema to store your data.

CREATE TABLE city (id int unsigned auto_increment primary key, 
             name char(25), 
  loc point);

And then add the information and as before the example is for my home town.

INSERT INTO city (name,loc) VALUES 
('Justin', ST_GeomFromText('point(33.084843 -97.296127)',4326));


3. Query With Workbench


Now query your data from workbench.  If you only have the one record SELECT * FROM city; will suffice or use SELECT loc FROM city WHERE name='<yourcitynamehere>';

Running the query to get the longitude and latitude of Jsutin, Texas using MySQL Workbench. Note the 'BLOB' under the loc column

4. Open In a Browser



If you right click on the 'BLOB' icon under the 'loc' column, one of the choice is show point in browser. Pick that choice and you will get redirected to OpenStreetMap.org

The OpenStreetMap.org map of Justin, Texas


Thursday, June 27, 2019

The '$' As The JSON Document

Recently on Stackoverflow was a question about the handling of  a JSON document stored in a MySQL Database. The data looked like this:

[{"name":"cdennett","address":"123 street","Postcode":"ABCDE"}]

The data above is valid JSON. That data is in an array because it is surrounded by []s while objects are surrounded by {}s. And the author was trying to use the following JSON_TABLE function

SELECT people.* 
FROM t1, 
     JSON_TABLE(json_col, '$.people[*]' 
COLUMNS (
                name VARCHAR(40)  PATH '$.name',
                address VARCHAR(100) PATH '$.address')
     ) people;

Some you who have been using the JSON data type are probably smiling at the code above.It is a simple problem caused by confusion about the path of the JSON document. The problem is the way the data is referenced. Or to put it another way the path to the document is wrong. And, sadly, you probably only run into this after getting confused and having to learn the proper way to look at the path of a JSON document.

'$' is Your Document

The '$' character refers to the entire document. Is we use JSON_EXTRACT or the arrow operator it is easy to retrieve the entire document. 

 select json_extract(json_col,'$') from t1;
+----------------------------------------------------------------------+
| json_extract(json_col,'$')                                           |
+----------------------------------------------------------------------+
| [{"name": "cdennett", "address": "123 street", "Postcode": "ABCDE"}] |
+----------------------------------------------------------------------+

Or we could use the shortcut method  select json_col->"$" from t1; to get the same results.

Where this gets confusing is if you use a function like JSON_KEYS like such:

select JSON_KEYS(json_col) from t1x;
+---------------------+
| JSON_KEYS(json_col) |
+---------------------+
| NULL                |
+---------------------+

Why did we get a NULL and not the same output as from the JSON_EXTRACT?  Well, JSON_KEYS will return null if a) any argument is NULL, the document in question is not an object, or a path.  And an array is not an object.

But other functions are not bothered by the fact that the document is not an object.

select json_pretty(json_col) from t1;
+------------------------------------------------------------------------------------------+
| json_pretty(json_col)                                                                    |
+------------------------------------------------------------------------------------------+
| [
  {
    "name": "cdennett",
    "address": "123 street",
    "Postcode": "ABCDE"
  }
] |
+------------------------------------------------------------------------------------------+

So if '$' is [{"name":"cdennett","address":"123 street","Postcode":"ABCDE"}] how do we 'peel off one onion layer' to get to the {"name":"cdennett","address":"123 street","Postcode":"ABCDE"}?

The contents of the first array is denoted as $[0].

'$' is [{"name":"cdennett","address":"123 street","Postcode":"ABCDE"}]
and
'$[0] is {"name":"cdennett","address":"123 street","Postcode":"ABCDE"}
and
'$[0].name is "cdbennet"

So if we refer back to the code snippet from Stackoverflow, it becomes evident that the path was certainly not '$.people[*]' but '$[*]' or '$[0]'.

Conclusions

So we end up with two conclusions. First is that to remember that '$' refers to the entire document and walking down the document of the structure means walking down a path that starts at '$'. And second, you might want to consider not burring things in a top level array.
























Monday, May 20, 2019

Structuring Your Unstructured JSON data

The world seems awash in unstructured, NoSQL data, mainly of the JSON variety.  While this has a great many benefits as far as data mutability and not being locked into a rigid structure there are some things missing that are common in the structured world of SQL databases.

What if there was a way to take this unstructured NoSQL JSON data and cast it, temporarily, into a structured table?  Then you could use all the processing functions and features found in a relation database on you data.  There is a way and it is the JSON_TABLE function.

JSON_TABLE


You can find the documentation for JSON_TABLE here  but there are some examples below that may make learning this valuable function easier than the simple RTFM.

I will be using the world_x dataset for the next example

If we run a simple SELECT JSON_PRETTY(doc) FROM countryinfo LIMIT 1;  the server will return something similar to the following:

{
  "GNP": 828,
  "_id": "ABW",
  "Name": "Aruba",
  "IndepYear": null,
  "geography": {
    "Region": "Caribbean",
    "Continent": "North America",
    "SurfaceArea": 193
  },
  "government": {
    "HeadOfState": "Beatrix",
    "GovernmentForm": "Nonmetropolitan Territory of The Netherlands"
  },
  "demographics": {
    "Population": 103000,
    "LifeExpectancy": 78.4000015258789
  }
}


We can use JSON_TABLE to extract the Name, the Head of State, and the Governmental Form easily with the following query. If you are not used to the MySQL JSON Data type, the "$" references the entire document in the doc column (and doc is out JSON data type column in the table).  And notice that the $.government.HeadOfState and $.government.GovernmentForm are the full path to the keys in the document.

select jt.* FROM countryinfo,
json_table(doc, "$" COLUMNS (
          name varchar(20) PATH "$.Name",
          hofstate varchar(20) PATH '$.government.HeadOfState',
          gform varchar(50) PATH '$.government.GovernmentForm')) as jt
limit 1;

The syntax is JSON_TABLE(expr, path COLUMNS (column_list) [AS] alias) where expr is either a column column from a table or a JSON document passed to the function ('{"Name": "Dave"}' as an example).  Then the desired columns are specified where we name the new column, give it a relational type, and then specify the path of the JSON values we want to cast.

And the results are in the form of a relational table.

+-------+----------+----------------------------------------------+
| name  | hofstate | gform                                        |
+-------+----------+----------------------------------------------+
| Aruba | Beatrix  | Nonmetropolitan Territory of The Netherlands |
+-------+----------+----------------------------------------------+

This is JSON_TABLE in its most basic form.  The only thing I would like to emphasize is that the keys of the JSON data are case sensitive and it is import to check your spelling!

Data Problems

There is also a nice feature to JSON_TABLE where you assign a default value if that key/value pair is missing or yet another value if there is something can not be cast. If we use a non-existent key/value pair here named 'xyz' for an example, we can insert the value '888' for any JSON document missing values.

select jt.* FROM countryinfo,
json_table(doc, "$" COLUMNS (
          name varchar(20) PATH "$.Name",
          hofstate varchar(20) PATH '$.government.HeadOfState',
          xyz int(4) PATH '$.xyz' DEFAULT '999' ON ERROR DEFAULT '888' ON EMPTY,
          gform varchar(50) PATH '$.government.GovernmentForm')) as jt
limit 1;

And how the result looks:

+-------+----------+-----+----------------------------------------------+
| name  | hofstate | xyz | gform                                        |
+-------+----------+-----+----------------------------------------------+
| Aruba | Beatrix  | 888 | Nonmetropolitan Territory of The Netherlands |
+-------+----------+-----+----------------------------------------------+


NULL Handling


Now be careful with Null values. If you change the new line to  xyz int(4) PATH '$.IndepYear' DEFAULT '999' ON ERROR DEFAULT '888' ON EMPTY, we can easily see that the NULL value for Aruba's year of independence will return the default '999' value.  And if you change the path to '$.Name' to try and force the string value into the integer it will trake the ON ERROR path.

And you can assign missing values to NULL 

Nested Path Data


Iterating nested arrays can be painful but JSON_TABLE makes it very simple.  So creating some dummy data, we can start work on digging through the nested information.

select * from a;
+----+-----------------------+
| id | x                     |
+----+-----------------------+
|  1 | {"a": 1, "b": [1, 2]} |
|  2 | {"a": 2, "b": [3, 4]} |
|  3 | {"a": 3, "b": [5, 6]} |

+----+-----------------------+

The query features the NESTED PATH argument

select d.* FROM a, 
JSON_TABLE(x, "$" columns 
        (mya varchar(50) PATH "$.a", 
NESTED PATH "$.b[*]" 
                columns (myb int path '$'))
) as d;

The output.

+-----+-----+
| mya | myb |
+-----+-----+
| 1   |   1 |
| 1   |   2 |
| 2   |   3 |
| 2   |   4 |
| 3   |   5 |
| 3   |   6 |
+-----+-----+
6 rows in set (0.0013 sec)

Not bad but lets add another level.

select * from b;
+----+-----------------------------------------------------+
| id | x                                                   |
+----+-----------------------------------------------------+
|  1 | {"a": 2, "b": [{"c": 101, "d": [44, 55, 66]}]}      |
|  2 | {"a": 1, "b": [{"c": 100, "d": [11, 22, 33]}]}      |
|  3 | {"a": 3, "b": [{"c": 102, "d": [77, 88, 99, 101]}]} |
+----+-----------------------------------------------------+
3 rows in set (0.0009 sec)

So lets embed another level

select d.* FROM b,  
       JSON_TABLE(x, "$" columns          
        (mya varchar(50) PATH "$.a",  
          NESTED PATH "$.b[*]"          
           columns (myc int path '$.c', 
           NESTED PATH '$.d[*]'  
           columns (dpath int path '$'))) 
) as d order by myc;
+-----+-----+-------+
| mya | myc | dpath |
+-----+-----+-------+
| 1   | 100 |    22 |
| 1   | 100 |    33 |
| 1   | 100 |    11 |
| 2   | 101 |    44 |
| 2   | 101 |    55 |
| 2   | 101 |    66 |
| 3   | 102 |    77 |
| 3   | 102 |    88 |
| 3   | 102 |    99 |
| 3   | 102 |   101 |
+-----+-----+-------+
10 rows in set (0.0006 sec)

And we can get ordinal numbers too.

select d.* FROM b,  
         JSON_TABLE(x, "$" columns          
        (mya varchar(50) PATH "$.a",  
           NESTED PATH "$.b[*]"          
           columns (myc int path '$.c', 
           nested path '$.d[*]'  
           columns (dcount for ordinality
           dpath int path '$'))) ) as d 
order by dpath;
+-----+-----+--------+-------+
| mya | myc | dcount | dpath |
+-----+-----+--------+-------+
| 1   | 100 |      1 |    11 |
| 1   | 100 |      2 |    22 |
| 1   | 100 |      3 |    33 |
| 2   | 101 |      1 |    44 |
| 2   | 101 |      2 |    55 |
| 2   | 101 |      3 |    66 |
| 3   | 102 |      1 |    77 |
| 3   | 102 |      2 |    88 |
| 3   | 102 |      3 |    99 |
| 3   | 102 |      4 |   101 |
+-----+-----+--------+-------+
10 rows in set (0.0009 sec)

And not that we have the data structured, we can start using the WHERE clause such as  where myc > 100 and dpath < 100.














Monday, April 29, 2019

MySQL 8.0 GIS Units of Measure - Meter, foot, Clarke's yard, or Indian Foot

The ST_DISTANCE function has been upgraded in MySQL 8.0.16 to allow you to specify the unit of measure between to locations.  Previously you had to convert from meters to what you desired but now you can use the INFORMATION_SCHEMA.ST_UNITS_OF_MEASURE table to help you get many of the more popular measurements (foot, yard, statue mile, nautical mile, fathom) and some ones that are new to me (chain, link, various feet).   However some measures are omitted (furlong,smoot) that may have some relevance in your life.

select * from information_schema.ST_UNITS_OF_MEASURE;
Fetching table and column names from `mysql` for auto-completion... Press ^C to stop.
+--------------------------------------+-----------+---------------------+-------------+
| UNIT_NAME                            | UNIT_TYPE | CONVERSION_FACTOR   | DESCRIPTION |
+--------------------------------------+-----------+---------------------+-------------+
| metre                                | LINEAR    |                   1 |             |
| millimetre                           | LINEAR    |               0.001 |             |
| centimetre                           | LINEAR    |                0.01 |             |
| German legal metre                   | LINEAR    |        1.0000135965 |             |
| foot                                 | LINEAR    |              0.3048 |             |
| US survey foot                       | LINEAR    | 0.30480060960121924 |             |
| Clarke's yard                        | LINEAR    |        0.9143917962 |             |
| Clarke's foot                        | LINEAR    |        0.3047972654 |             |
| British link (Sears 1922 truncated)  | LINEAR    |          0.20116756 |             |
| nautical mile                        | LINEAR    |                1852 |             |
| fathom                               | LINEAR    |              1.8288 |             |
| US survey chain                      | LINEAR    |   20.11684023368047 |             |
| US survey link                       | LINEAR    |  0.2011684023368047 |             |
| US survey mile                       | LINEAR    |  1609.3472186944375 |             |
| Indian yard                          | LINEAR    |  0.9143985307444408 |             |
| kilometre                            | LINEAR    |                1000 |             |
| Clarke's chain                       | LINEAR    |       20.1166195164 |             |
| Clarke's link                        | LINEAR    |      0.201166195164 |             |
| British yard (Benoit 1895 A)         | LINEAR    |           0.9143992 |             |
| British yard (Sears 1922)            | LINEAR    |  0.9143984146160288 |             |
| British foot (Sears 1922)            | LINEAR    |  0.3047994715386762 |             |
| Gold Coast foot                      | LINEAR    |  0.3047997101815088 |             |
| British chain (Sears 1922)           | LINEAR    |  20.116765121552632 |             |
| yard                                 | LINEAR    |              0.9144 |             |
| British link (Sears 1922)            | LINEAR    |  0.2011676512155263 |             |
| British foot (Benoit 1895 A)         | LINEAR    |  0.3047997333333333 |             |
| Indian foot (1962)                   | LINEAR    |           0.3047996 |             |
| British chain (Benoit 1895 A)        | LINEAR    |          20.1167824 |             |
| chain                                | LINEAR    |             20.1168 |             |
| British link (Benoit 1895 A)         | LINEAR    |         0.201167824 |             |
| British yard (Benoit 1895 B)         | LINEAR    |  0.9143992042898124 |             |
| British foot (Benoit 1895 B)         | LINEAR    | 0.30479973476327077 |             |
| British chain (Benoit 1895 B)        | LINEAR    |  20.116782494375872 |             |
| British link (Benoit 1895 B)         | LINEAR    |  0.2011678249437587 |             |
| British foot (1865)                  | LINEAR    | 0.30480083333333335 |             |
| Indian foot                          | LINEAR    | 0.30479951024814694 |             |
| Indian foot (1937)                   | LINEAR    |          0.30479841 |             |
| Indian foot (1975)                   | LINEAR    |           0.3047995 |             |
| British foot (1936)                  | LINEAR    |        0.3048007491 |             |
| Indian yard (1937)                   | LINEAR    |          0.91439523 |             |
| Indian yard (1962)                   | LINEAR    |           0.9143988 |             |
| Indian yard (1975)                   | LINEAR    |           0.9143985 |             |
| Statute mile                         | LINEAR    |            1609.344 |             |
| link                                 | LINEAR    |            0.201168 |             |
| British yard (Sears 1922 truncated)  | LINEAR    |            0.914398 |             |
| British foot (Sears 1922 truncated)  | LINEAR    | 0.30479933333333337 |             |
| British chain (Sears 1922 truncated) | LINEAR    |           20.116756 |             |
+--------------------------------------+-----------+---------------------+-------------+
47 rows in set (0.0019 sec)



Friday, April 26, 2019

Bye Bye to mysql_upgrade, change to skip_grant_tables, and One Year of MySQL 8.0 GA

The MySQL 8.0.16 Release Notes are very interesting and sadly not read enough. One thing that may have escaped attention is that you no longer have to run mysql_upgrade after updating the binaries.

Let me repeat: you no longer have to run mysql_upgrade after updating the binaries. 

From the release notes:
Previously, after installation of a new version of MySQL, the MySQL server automatically upgrades the data dictionary tables at the next startup, after which the DBA is expected to invoke mysql_upgrade manually to upgrade the system tables in the mysql schema, as well as objects in other schemas such as the sys schema and user schemas.

The server, starting 8.0.16, does the work previously done by mysql_upgrade for itself.  And mysql_upgrade itself is deprecated.

I have had to help too many folks who either forgot to run mysql_upgrade after an upgrade or did not know they could not run it properly do to a lack of permissions from their unprivileged user account. 

One Year of 8.0

And speaking of MySQL 8.0, it has been out for OVER one year now.  Woot!

Skip_grant_tables

Another change to note concerns the much abused skip_grant_tables option.

Previously, if the grant tables were corrupted, the MySQL server wrote a message to the error log but continued as if the --skip-grant-tables option had been specified. This resulted in the server operating in an unexpected state unless --skip-grant-tables had in fact been specified. Now, the server stops after writing a message to the error log unless started with --skip-grant-tables.








Thursday, April 25, 2019

MySQL 8.0.16 Check Constraints

Before MySQL 8.0.16 you could put constraint checks into your Data Definition Language (DDL) when creating tables but the server ignored them.  There was much gnashing of teeth as taunts of "It is not a real database" from other databases taunted the MySQL Community. 
 
But with 8.0.16 this has all changed. You can now have your data constraints checked by the server. Below is an example table with two constraints.

mysql>CREATE TABLE parts 
            (id int, cost decimal(5,2) not null check (cost > 0),
             price decimal(5,2) not null check (price > 1.0)
          );
Query OK, 0 rows affected (0.09 sec)

mysql> insert into parts (id,cost,price) values (1,1.10,2.25);
Query OK, 1 row affected (0.03 sec)

mysql> insert into parts (id,cost,price) values (1,1.10,0.25);
ERROR 3819 (HY000): Check constraint 'parts_chk_2' is violated.


mysql> insert into parts (id,cost,price) values (2,-1.1,4.25);
ERROR 3819 (HY000): Check constraint 'parts_chk_1' is violated.


The first data input above does passes the constraints since both the cost and price columns are greater than the minimum. But not so with the next inserts, So the constraints work!

Finding Out What The Error Message Means

Now seeing Check constraint 'parts_chk_1' is violated. tells me the data is out of value. But how do I find out what parts_chk_1 is?

The fist thing to check is the table itself.


mysql> show create table parts;
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| parts | CREATE TABLE `parts` (
  `id` int(11) DEFAULT NULL,
  `cost` decimal(5,2) NOT NULL,
  `price` decimal(5,2) NOT NULL,
  CONSTRAINT `parts_chk_1` CHECK ((`cost` > 0)),
  CONSTRAINT `parts_chk_2` CHECK ((`price` > 1.0))

) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)


And you can also see the information in the IS.

mysql> select * from INFORMATION_SCHEMA.CHECK_CONSTRAINTS;
+--------------------+-------------------+-----------------+-----------------+
| CONSTRAINT_CATALOG | CONSTRAINT_SCHEMA | CONSTRAINT_NAME | CHECK_CLAUSE    |
+--------------------+-------------------+-----------------+-----------------+
| def                | davetest          | xxx_chk_1       | (`age` > 18)    |
| def                | davetest          | parts_chk_1     | (`cost` > 0)    |
| def                | davetest          | parts_chk_2     | (`price` > 1.0) |
| def                | davetest          | client_chk_1    | (`age` > 17)    |
+--------------------+-------------------+-----------------+-----------------+
4 rows in set (0.00 sec)