A post on the MySQL forums caught my eye today. The distilled question was why was an index not being used to search JSON data. The poster (in blue) had done most of the steps needed almost correctly but just missed the crucial steps.
Suppose we have this table with an index on a virtual generated column as JSON attribute index pattern
```
CREATE TABLE applications (
id BINARY(16) PRIMARY KEY,
data JSON NOT NULL,
) ENGINE=InnoDB;
CREATE INDEX idx ON applications ((CAST(data->>"$.name" AS CHAR(10))));
One of the frustrations for novices with databases is the combined richness of Structured Query Language or SQL and the many options available with MySQL which can cause problems. In this case the index was created from a generated data but is it really a generated column?
For queries with `=` operator to compare the text field of data, the index is hit and works as expected
```
mysql> EXPLAIN SELECT * FROM applications FORCE INDEX (idx) WHERE CAST(data->>'$.name' AS CHAR(10)) = 'app-1';
+----+----+----+-------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
+----+-----+-------+---+------+---------------+------+---------+-------+------+----------+-------+
| 1 | SIMPLE | applications | NULL | ref | idx | idx | 13 | const | 1 | 100.00 | NULL |
+----+----+----+-------+------+---------------+------+---------+-------+------+----------+-------+1 row in set, 1 warning (0.00 sec)
So the index works as expected. But there is trouble on the horizon. And by the way, I do not like to use FORCE INDEX.
But for queries with `LIKE` operator, even in case search without wildcard, the indexes is not hit
```
mysql> EXPLAIN SELECT * FROM applications FORCE INDEX (idx) WHERE CAST(data->>'$.name' AS CHAR(10)) LIKE 'app-1';
+----+--+-----+---+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+----+----+-------+------+----------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | applications | NULL | ALL | NULL | NULL | NULL | NULL | 6 | 100.00 | Using where |
+----+---+------+-+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT * FROM applications FORCE INDEX (idx) WHERE CAST(data->>'$.name' AS CHAR(10)) LIKE 'app-%';
+----+---+------+--+------+----------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+----+-----+---+------+-------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | applications | NULL | ALL | NULL | NULL | NULL | NULL | 6 | 100.00 | Using where |
+----+------------+------+----+------+------+------+------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
So why would LIKE cause a problem? And is there a work around? Some of you may be wondering if this is a case for using Full Text Search.
Digging Deeper
First I wanted to 'trust but verify' the examples. I have learned the hard way that sometimes that the example code can be wrong or omit something critical.
data JSON not null);
create index ap1_idx on ap1 ((cast(data->>"$.name" AS CHAR(10))));
I created some test data and ran the query wondering if I would get the same result. The equal operator worked fine but not the like with a wildcard.
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ap1
partitions: NULL
type: ref
possible_keys: ap1_idx
key: ap1_idx
key_len: 43
ref: const
rows: 1
filtered: 100
Extra: NULL
A query with a an equal sign worked but not with the wildcard. Well dang! It does not work. Just as the original poster described.
The thing nagging at the back of my mind was the index being set up as CREATE INDEX idx ON applications cast(json_unquote(json_extract(`data`,_utf8mb4\'$.name\')) as char(10) charset utf8mb4). Sometimes you can get a 'code smell' where you can not point out exactly what is wrong but you know somethings is not right.
But is there a better way?
Query OK, 0 rows affected (0.0479 sec)
SQL > explain select * from app where extracted_name like 'app-%' \G
id: 1
select_type: SIMPLE
table: app
partitions: NULL
type: range
possible_keys: i
key: i
key_len: 81
ref: NULL
rows: 3
filtered: 100
Extra: Using where
1 row in set, 1 warning (0.0010 sec)
This provides the ability to search with wildcards. I will dig into the troublesome index but wanted to share this work around.