Friday, January 10, 2020

Indexing the MySQL Document Store

I am writing a tutorial on the MySQL Document Store for the sold out (sorry) pre-FOSDEM MySQL days.  For those who do not write such exercise they are often a difficult task as you have a limited time to convey information, need to provide vivid examples, and create exercises that give a good idea of what the topic is all about.  And my personal preference is to write once and use the tutorial at other events (please let me know if you have such an event).

Indexing records is a well know performance step when creating databases, SQL or NoSQL.  And back in June of 2017 I wrote a blog post on using createIndex() to index documents in the MySQL Document Store. And as part of creating the tutorial I referred to that blog post as a reference and was quite surprised that it was not working.

What happened? Well back in 8.0.11 the function was revised and it is no longer a series of chained calls but a function that receives 2 parameters, details can be found at: https://dev.mysql.com/doc/x-devapi-userguide/en/collection-indexing.html

So what follows is an update to the old blog post with the new version of the function.

Indexing and the MySQL Document Store

The MySQL Document Store allows developers who do not know Structured Query Language (SQL) to use MySQL as a high efficient NoSQL document store. It has several great features but databases, NoSQL and SQL, have a problem searching through data efficiently. To help searching, you can add an index on certain fields to go directly to certain records. Traditional databases, like MySQL, allow you to add indexes and NoSQL databases, for example MongoDB, lets you add indexes. The MySQL Document Store also allows indexing.

So lets take a quick look at some simple data and then create an index.

db.b.find()
{
    "_id": "00005e163bc70000000000000001",
    "nbr": 1
}
{
    "_id": "00005e163bc70000000000000002",
    "nbr": 3
}
{
    "_id": "00005e163bc70000000000000003",
    "nbr": 5
}
{
    "_id": "00005e163bc70000000000000004",
    "nbr": 7
}
{
    "_id": "00005e163bc70000000000000005",
    "nbr": 99
}
{
    "_id": "00005e163bc70000000000000006",
    "nbr": 2
}
6 documents in set (0.0037 sec)

To index the nbr field with the 8.0.11 syntax we need to specify the name of the index and then the parameters for the index. In the example below we name the index nbr_idx and provide a JSON object of {fields:[{"field": "$.nbr", "type":"INT", required:true}]} with the desired settings, which is called the index definition.


db.b.createIndex("nbr_idx", {fields:[{"field": "$.nbr", "type":"INT", required:true}]});

The index name is up to you but please keep it useful and relevant.

What you Specify to Create and Index

Rule # 1 -- All the values in an key/value to be indexed MUST be of the same type! So no '1,2,3,Ralph,3.4' plese

The JSON document used for defining an index, such as {fields: [{field: '$.username', type: 'TEXT'}]}, can contain the following:

fields: an array of at least one IndexField object, each of which describes a JSON document field to be included in the index.

A single IndexField description consists of the following fields:


  • field: a string with the full document path to the document member or field to be indexed
  • type: a string with one of the supported column types to map the field to. For numeric types, the optional UNSIGNED keyword can follow. For the TEXT type you can define the length to consider for indexing (you do not need to index all that TEXT column, just enough to narrow down your search).
  • required: an optional boolean, set to true if the field is required to exist in the document. Defaults to false for all types except GEOJSON, which defaults to true.
  • options: an optional integer, used as special option flags to use when decoding GEOJSON data.
  • srid: an optional integer, srid value to use when decoding GEOJSON data.
  • array: (for MySQL 8.0.17 and later) an optional boolean, set to true if the field contains arrays. The default value is false. 
  • type: an optional string which defines the type of index. One of INDEX or SPATIAL. The default is INDEX and can be omitted.

Quick Review


Old syntax (MySQL 8.0.10 and earlier):

db.foo.createIndex("nbr_idx").field("nbr","INTEGER", false).execute()

New syntax (MySQL 8.0.11 and later):

db.b.createIndex("nbr_idx", {fields:[{"field": "$.nbr", "type":"INT", required:true}]});