Wednesday, March 27, 2019

MySQL 8.0 Geographic Information Systems (GIS) - How to get the distance between two cities


MySQL before version 5.7 had less than stellar Geographic Information Systems (GIS) support.  In version 5.7 the Boost.Geometry two dimensional or 2D libraries were added.  And with 8.0 came the three dimensional or 3D libraries. But how do you use these features?

I would like to state up front that this is all new to me and this is an attempt for me to document what it takes to go from zero knowledge on GIS to something hopefully better.  What I want to do as an exercise is to get the distance between two places from their longitude and latitude, say two cities near where I live.  So what do we have to do to accomplish that?

It is actually easy with the functions provided if we have the longitude and latitude in an SRID 4326 format.

SELECT ST_Distance(
 (SELECT loc FROM cities WHERE name = 'Trondheim'),
 (SELECT loc FROM cities WHERE name = 'San Francisco')
) AS distance_in_meters;

Okay, but what is an SRID 4326 format, where do I get them, and how do I get data into that format? So let us dig deeper.

I have to thank Norvald Ryeng who is a senior software manager at Oracle for that query roughly transposed by me from his MySQL GIS 8.0 Overview.
Harrison's Clocks were the first successful one to be used by the Royal Navy for navigation and can be seen at the Royal Observatory, Greenwich, United Kingdom. Very interesting story of a man on the cutting edge of his technology for the time who had to overcome so many technical and political problems.

Geometry Class: Background


Geometric data is stored in a geometry data type. Each geometry has a type, a SRID, coordinates, and some other attributes that will be ignored for now. And we will mainly deal with the POINT class.  Think of a point as someplace on a map. We will skip CURVE, LineString, Surface, Polygons, GeometryCollection, MultiPoint, MultiCurve, MultiLineString, MultiSurface, and MultiPolygon.

Columns with a spatial data type have a SRID attribute to indicate the spatial reference system, or SRS, used to store data in that column.  There are SRSs for projecting a globe onto a flat surface (think Mercator Projection maps of the world), non projected representing longitude-latitude on an ellipsoid. In MySQL a SRID value of 0 represents an infinite flat Cartesian plane with no units assigned to the axes and is the default.  Now a group named the European Petroleum Survey Group has their own system with a value of 4326, which you will see later.

Coordinates are represented as double-precision (8-byte) numbers.  These number pairs are either planar (flat) or geodetic (think Earth's surface) and the same pair values from one system can be wildly different than the other.

In the query above we used a column named loc as the point where the cities were located.  You can see the details in Norvald's excellent slides or take my word that these queries were used in a column defined as loc POINT SRID 4326 NOT NULL for now.

INSERT INTO cities (name, loc) VALUES (
 'Trondheim',
 ST_GeomFromText('POINT(64.43048320193547 10.394972698312927)', 4326)
);
INSERT INTO cities (name, loc) VALUES (
 'San Francisco',
 ST_GeomFromText('POINT(37.272615666666667 -122.4425455194445)', 4326)
);

Now Wikipedia says Trondheim is at 63 degrees 25"47'N and 10 degrees 23"36'E which looks pretty close to the numbers above.  And the Nidaros Cathedral is listed as 63.4305 degrees North latitude and 10.3951 degrees East longitude. So saying POINT('latitude' 'longitude') seems to be what I am looking for to designate a location.

To peel back another layer of the onion, lets look at what ST_GeomFromText() does.  The states thusly that ST_GeomFromText 'Constructs a geometry value of any type using its WKT representation and SRID.'

Okay, that cleared up nothing. So what the heck is a WKT representation? The MySQL Manual has a section on Functions That Create Geometry Values from WKT Values.  The Well-Known Text (WKT) representation of geometry values is designed for exchanging geometry data in ASCII form. The OpenGIS specification provides a Backus-Naur grammar that specifies the formal production rules for writing WKT values.  So 'POINT(15 20)' is a WKT.

This is not getting easier, is it?  And we still do not know how to get the distance between two sets of longitude and latitude.

Getting Some Data


I was able to find a very nice list of US Zip Codes (postal codes) at https://gist.github.com/erichurst/7882666/ and used the MySQL Workbench data import agent to create a table. Just three columns ZIP for the Zip Code, LAT for latitude, and LNG for longitude. I stored the ZIP filed as TEXT (leading zeros that are significant disappear if the column in an INT) and the other two fields as REALs.  And there are 33,144 rows of data.

Lets look at the distance between the lovely cities of Justin and Trophy Club, both in the state of Texas. Justin is zip code 76247 and Trophy Club is 76262

 mysql> select * from data where ZIP in (76247, 76262);
+-------+-----------+------------+
| ZIP   | LAT       | LNG        | +-------+-----------+------------+
| 76247 | 33.099993 | -97.340499 |
| 76262 | 33.009335 |  -97.22672 | +-------+-----------+------------+
2 rows in set (0.03 sec)
 
And lets try a query.


mysql> select 
 st_distance(
  st_geomfromtext('point(33.099993 -97.34099)',4326),                     
  st_geomfromtext('point(33.009335 -97.22672)',4326)
  )  as distance;

+--------------------+


| distance           |
+--------------------+
| 14662.554615102901 |
+--------------------+
1 row in set (0.00 sec)



So there is roughly 15,000 meters between the cities of Justin and Trophy Club. And please note this is a flat model not one that takes in the curvature of the earth.  But we can get that easily enough by using ST_Distance_Sphere.

mysql> select st_distance_sphere(
    st_geomfromtext('point(33.099993 -97.34099)',4326),
    st_geomfromtext('point(33.009335 -97.22672)',4326)) 

   as sphere,
    st_distance(

     st_geomfromtext('point(33.099993 -97.34099)',4326),
     st_geomfromtext('point(33.009335 -97.22672)',4326)) 

    as flat;
+--------------------+--------------------+
| sphere             | flat               |
+--------------------+--------------------+
| 14664.184945803418 | 14662.554615102901 |
+--------------------+--------------------+
1 row in set (0.00 sec)







So even in the fairly flat area of North Texas, there is some difference between a spherical and flat world.



Sunday, March 10, 2019

MySQL and PHP Basics Part I

I have had some requests to write some blogs on the basics of using PHP and MySQL together.  This will not be a series for the experienced as it will start at a level where I will go into a lot of details but expect very few prerequisites from the reader.  If this is not you, please move on. If it is you and you read something you do not understand, please contact me to show me where I assumed too much.

PHP and MySQL are both in their mid twenties and both vital in the worlds of developers.  With the big improvements in PHP 7 and MySQL 8, I have found a lot of developers flocking to both but stymied by the examples they see as their are many details not explained. So let's get to the explaining!

1. Use the latest software

If you are not using PHP 7.2 or 7.3 (or maybe 7.1) then you are missing out in features and performance.  The PHP 5.x series is deprecated, no longer support, and is quickly disappearing.  

MySQL 8.0 is likewise a big advancement but many sites are using earlier versions.  If you are not on 5.6 with plans to upgrade to 5.7 then you are about to be left behind.  If you are running an earlier version then you are using antique code. Also get your MySQL from MySQL as your Linux distro may not be current, especially for the smaller ones.  The APT and DEB repos can be  found here and there are Docket containers available too. 

In many cases it is a fight to keep your software core tools up to date, or fairly up to to date.  The time and heartache in fighting problems resolved in a later version of software or creating a work around for a feature not in your older version will eventually bite you hard and should be viewed as a CRM (Career Limiting Move).  BTW hiring managers look for folks with current skills not skill for a decade old version of the skills.

2. Do not pick one connector over another (yet!)

PHP is a very rich environment for developers and it has three viable options for connecting to MySQL databases.  Please note that the older mysql connector is deprecated, no longer support, and is to be avoided.  It was replaced by the mysqli or mysqlnd (native driver) and is officially supported by Oracle's MySQL Engineers.  Next is the PDO (public data objects) connector that is designed to be database agnostic but there is no overall maintainer who watches out for the code but Oracle MySQL Engineers do try to fix MySQL related issues if they do not impinge on other PDO code. And the newest, using the new MySQL X DevAPI and X protocol is the X DevAPI connector which supports both SQL and NoSQL interfaces.

The good news for developers is that you can install all three, side by side, with no problem.  For those staring out the ability to transpose from connector can be a little confusing as they work just a bit differently from each other but the ability to use two or more is a good skill to have.  Such much like being able to drive a car with an automatic or manual transmission, it does give you more professional skills.

Next time we will install PHP, MySQL, the three connectors, and some other cool stuff so you can start using PHP to access your MySQL servers.

Friday, March 8, 2019

SCaLE 17x

The Southern California Linux Expo has a MySQL Track this year and if you are around Pasadena, you should be here.  Besides the amazing expo hall, this is the biggest open source show on the left coast.  The MySQL track features talks on the MySQL Document Store, Kuberneties, query optimization, analytics, consistency at Facebook, and MySQL Security.