JSON in Flat File to MySQL Database
So how do you load that JSON data file into MySQL. Recently I had this question presented to me and I thought I would share a handy script I use to do such work. For this example I will use the US Zip (postal) codes from
JSONAR. Download and unzip the file. The data file is named
zips.json and it can not be bread directly into MySQL using the SOURCE command. It needs to have the information wrapped in a more palatable fashion.
head zips.json
{ "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA", "_id" : "01001" }
{ "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA", "_id" : "01002" }
{ "city" : "BARRE", "loc" : [ -72.10835400000001, 42.409698 ], "pop" : 4546, "state" : "MA", "_id" : "01005" }
{ "city" : "BELCHERTOWN", "loc" : [ -72.41095300000001, 42.275103 ], "pop" : 10579, "state" : "MA", "_id" : "01007" }
{ "city" : "BLANDFORD", "loc" : [ -72.936114, 42.182949 ], "pop" : 1240, "state" : "MA", "_id" : "01008" }
{ "city" : "BRIMFIELD", "loc" : [ -72.188455, 42.116543 ], "pop" : 3706, "state" : "MA", "_id" : "01010" }
{ "city" : "CHESTER", "loc" : [ -72.988761, 42.279421 ], "pop" : 1688, "state" : "MA", "_id" : "01011" }
{ "city" : "CHESTERFIELD", "loc" : [ -72.833309, 42.38167 ], "pop" : 177, "state" : "MA", "_id" : "01012" }
{ "city" : "CHICOPEE", "loc" : [ -72.607962, 42.162046 ], "pop" : 23396, "state" : "MA", "_id" : "01013" }
{ "city" : "CHICOPEE", "loc" : [ -72.576142, 42.176443 ], "pop" : 31495, "state" : "MA", "_id" : "01020" }
Follow the Document Store Example
The MySQL Document Store is designed for storing JSON data and this example will follow its practices by having a two column table -- a JSON column, and another column for a primary key (remember InnoDB wants so badly to have a primary key on each table that it will create one for you but it is better practice to make it yourself; besides we want to search on the zipcode which is labeled as
_id in the data. So we use a stored generated column that uses
JSON_UNQUOTE(JSON_EXTRACT(doc,"$_id")) and saves that info in a column named
zip.
So a simple table is created and it looks like this:
mysql> desc zipcode\g
+-------------+-------------+------+-----+---------+-------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------------------+
| doc | json | YES | | NULL | |
| zip | char(5) | NO | PRI | NULL | STORED GENERATED |
+-------------+-------------+------+-----+---------+-------------------+
2 rows in set (0.00 sec)
Handy Script
So now we have the data, we have the table, and now we need to convert the data into something MySQL can use to laod the data.
Bash is one of those shells with so many rich built-in tools that is hard to remember them all. But it does have a hand read line feature that can be used for the task.
#!/bin/bash
file="/home/dstokes/Downloads/zips.json"
while IFS= read line
do
echo "INSERT INTO zipcode (doc) VALUES ('$line');"
done <"$file"
Run the script and output the data to a file named foo,
./loader.sh > foo.
The output shows how the data is wrapped:
$head foo
INSERT INTO zipcode (doc) VALUES ('{ "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA", "_id" : "01001" }');
INSERT INTO zipcode (doc) VALUES ('{ "city" : "CUSHMAN", "loc" : [ -72.51564999999999, 42.377017 ], "pop" : 36963, "state" : "MA", "_id" : "01002" }');
INSERT INTO zipcode (doc) VALUES ('{ "city" : "BARRE", "loc" : [ -72.10835400000001, 42.409698 ], "pop" : 4546, "state" : "MA", "_id" : "01005" }');
INSERT INTO zipcode (doc) VALUES ('{ "city" : "BELCHERTOWN", "loc" : [ -72.41095300000001, 42.275103 ], "pop" : 10579, "state" : "MA", "_id" : "01007" }');
INSERT INTO zipcode (doc) VALUES ('{ "city" : "BLANDFORD", "loc" : [ -72.936114, 42.182949 ], "pop" : 1240, "state" : "MA", "_id" : "01008" }');
INSERT INTO zipcode (doc) VALUES ('{ "city" : "BRIMFIELD", "loc" : [ -72.188455, 42.116543 ], "pop" : 3706, "state" : "MA", "_id" : "01010" }');
INSERT INTO zipcode (doc) VALUES ('{ "city" : "CHESTER", "loc" : [ -72.988761, 42.279421 ], "pop" : 1688, "state" : "MA", "_id" : "01011" }');
INSERT INTO zipcode (doc) VALUES ('{ "city" : "CHESTERFIELD", "loc" : [ -72.833309, 42.38167 ], "pop" : 177, "state" : "MA", "_id" : "01012" }');
INSERT INTO zipcode (doc) VALUES ('{ "city" : "CHICOPEE", "loc" : [ -72.607962, 42.162046 ], "pop" : 23396, "state" : "MA", "_id" : "01013" }');
INSERT INTO zipcode (doc) VALUES ('{ "city" : "CHICOPEE", "loc" : [ -72.576142, 42.176443 ], "pop" : 31495, "state" : "MA", "_id" : "01020" }');
So now the data can be loaded with mysql -u itisme test < foo.