MySQL has come a long way, and keeps on bringing in new improvements and new collections of JSON functions. Here are a few techniques to store data in JSON columns.
MySQL Vs MariaDB JSON Implementation :
There is a key difference on how JSON is implemented in MySQL and in MariaDB.
MySQL has implemented JSON using RFC 7159 standard that enables native support and efficient way to store and access data in JSON documents.
MariaDB decided not to implement this enhancement as they claim it as not a part of the SQL standard. Instead, to support replication from MySQL, they only defined an alias for JSON, which is actually a LONGTEXT column. MariaDB claims that there is no significant performance difference between the two, but there is no benchmark recently to support that claim.
To conclude, MySQL has rich features and options when using JSON and it has added more enhancements on the recent version 8.0. , which are explained in this blog.
Overview of new JSON Functions in MySQL 8.0 :
Operator “->>” has been added, this is the equivalent to JSON_UNQUOTE(JSON_EXTRACT()).
New aggregation functions
JSON_ARRAYAGG()
JSON_OBJECTAGG()
New JSON utility function
JSON_STORAGE_SIZE()
JSON_STORAGE_FREE()
JSON_TABLE() – One of the most impressive function added in MySQL 8.0
JSON_OVERLAPS() – This functions compares two JSON documents.
JSON_SCHEMA_VALID() – This function validates a JSON schema
JSON_SCHEMA_VALIDATION_REPORT() – Validates a JSON document and a JSON schema.
JSON_VALUE() – this function extracts a scalar value from JSON document.
MySQL Architecture :
JSON Operator “->” :
-> operator serves as an alias for the JSON_EXTRACT() function.
When used with two arguments ( a column identifier on the left and a JSON path on the right) , it evaluates against the JSON document (the column value).
mysql> select id,browser->'$.name' from user_events where JSON_EXTRACT(properties,'$.amount')>50;
+----+-------------------+
| id | browser->'$.name' |
+----+-------------------+
| 4 | "Firefox" |
| 5 | "Firefox" |
| 6 | "Chrome" |
+----+-------------------+
3 rows in set (0.00 sec)
This is an improved, unquoting extraction operator.
The -> operator simply extracts a value, and in addition unquotes the extracted result.
In other words, given a JSON column value column and a path expression path, the following three expressions return the same value – Check please
The ->> operator can be used wherever JSON_UNQUOTE(JSON_EXTRACT()) is allowed.
Example :
mysql> select id,browser->>'$.name' from user_events where JSON_EXTRACT(properties,'$.amount')>50 order by id;
+----+--------------------+
| id | browser->>'$.name' |
+----+--------------------+
| 4 | Firefox |
| 5 | Firefox |
| 6 | Chrome |
+----+--------------------+
3 rows in set (0.00 sec)
2. JSON_ARRAYAGG() :
In MySQL 8.0, a new aggregate function is introduced called JSON_ARRAYAGG() and JSON_OBJECTAGG().
The JSON_ARRYAGG() function reads a column and shows the output as a single JSON array with aggregated results.
Following is an example of ordering the elements in an array which would be undetermined.
Example :
The following is an example output of the JSON_ARRAYAGG() aggregate function:
mysql> select CountryCode,Language,JSON_ARRAYAGG(Percentage) as percent,JSON_ARRAYAGG(IsOfficial) as Official from countrylanguage where CountryCode='ABW';
+-------------+----------+----------------------------------------------------------------+----------------------+
| CountryCode | Language | percent | Official |
+-------------+----------+----------------------------------------------------------------+----------------------+
| ABW | Dutch | [5.300000190734863, 9.5, 76.69999694824219, 7.400000095367432] | ["T", "F", "F", "F"] |
+-------------+----------+----------------------------------------------------------------+----------------------+
1 row in set (0.00 sec)
3. JSON_OBJECTAGG() :
This function enables to create a JSON object containing key-value pairs.
It accepts two arguments, the first of these being used as a key and the second as a value. These arguments can be column names or expressions.
Syntax :
The syntax goes like this:
JSON_OBJECTAGG(key, value)
Here key is the column or expression that represents the key of the key/value pair, and value is the column or expression that represents the value of the key/value pair.
Example :
mysql> select district AS 'State',JSON_OBJECTAGG(Name, Population) AS 'City/Population' from city where CountryCode = 'AUS' GROUP BY State;
+-----------------+-----------------------------------------------------------------------------------------+
| State | City/Population |
+-----------------+-----------------------------------------------------------------------------------------+
| Capital Region | {"Canberra": 322723} |
| New South Wales | {"Sydney": 3276207, "Newcastle": 270324, "Wollongong": 219761, "Central Coast": 227657} |
| Queensland | {"Cairns": 92273, "Brisbane": 1291117, "Gold Coast": 311932, "Townsville": 109914} |
| South Australia | {"Adelaide": 978100} |
| Tasmania | {"Hobart": 126118} |
| Victoria | {"Geelong": 125382, "Melbourne": 2865329} |
| West Australia | {"Perth": 1096829} |
+-----------------+-----------------------------------------------------------------------------------------+
7 rows in set (0.00 sec)
In this example we grouped the cities by their state/district. However, this can be used only if one big JSON object ( in above example , all cities/populations for that country) is needed.
4. JSON_STORAGE_SIZE() :
This function returns the number of bytes used to store the binary representation of a JSON document.
When the argument is a string, the function returns the amount of storage space in binary representation. It is created by parsing the string as JSON and converting it to binary.
Syntax :
JSON_STORAGE_SIZE(json_val)
In the above example, json_val is the JSON document for which the storage size will be returned. This must be a valid JSON document (or a string which can be parsed as one).
Example :
mysql> select event_name,JSON_STORAGE_SIZE(browser) as "Size in Bytes",visitor from user_events where id=6;
+------------+---------------+---------+
| event_name | Size in Bytes | visitor |
+------------+---------------+---------+
| purchase | 977272832 | 4 |
+------------+---------------+---------+
1 row in set (0.00 sec)
mysql> select format_bytes(977272832);
+-------------------------+
| format_bytes(977272832) |
+-------------------------+
| 932.00 MiB |
+-------------------------+
1 row in set (0.00 sec)
5.JSON_STORAGE_FREE() :
In MySQL, the JSON_STORAGE_FREE() function shows how much storage space is released after a JSON Document was updated.
It shows how much storage space was freed in it’s binary representation after it was updated or replaced using JSON_REPLACE(), or JSON_REMOVE().
For a JSON document (or a string which can be parsed as one), this function returns 0.
Syntax :
JSON_STORAGE_FREE(json_val)
Example :
mysql> select visitor,JSON_STORAGE_FREE(browser) from user_events where id=5;
+---------+----------------------------+
| visitor | JSON_STORAGE_FREE(browser) |
+---------+----------------------------+
| 4 | 135266304 |
+---------+----------------------------+
1 row in set (0.00 sec)
mysql> select format_bytes(135266304);
+-------------------------+
| format_bytes(135266304) |
+-------------------------+
| 129.00 MiB |
+-------------------------+
1 row in set (0.00 sec)
6.JSON_TABLE():
This will convert a JSON document into a relationaltable.
When extracting JSON data into a relational table, we must specify what to place in the two dimensions of the relational table.
For the first dimension, we must specify which columns are in the relational table.
Syntax :
JSON_TABLE(expr,path columns)
In JSON path expressions, the following directions have an associated syntax:
$ – Start with the current object.
. – Look inside an object.
[ ] – Look inside an array.
– Reference the value associated with a key.
Example :
mysql> select os,name from user_events,json_table(browser,'columns(os varchar(20) path '$.os',name varchar(20) path '$.name')) as json_to_sql_table;
+---------+---------+
| os | name |
+---------+---------+
| Mac | Safari |
| Windows | Firefox |
| Mac | Safari |
| Windows | Firefox |
| Windows | Firefox |
| Windows | Chrome |
+---------+---------+
6 rows in set (0.00 sec)
7.JSON_OVERLAPS():
This function compares two JSON documents and returns 1 if it has any key/value pairs or array elements in common.
Diagram :
Example
mysql> select browser->>'$.resolution' as Resolution,browser->>'$.os' as Browser from user_events where JSON_OVERLAPS(browser->'$.name[*].os','["Mac","Firefox"]');
+-------------------------+---------+
| Resolution | Browser |
+-------------------------+---------+
| {"x": 1920, "y": 1080} | Mac |
+-------------------------+---------+
1 row in set (0.00 sec)
8.JSON_SCHEMA_VALID() :
As we know MySQL will reject an invalid JSON document when using the JSON data type.
But there is a difference between valid and validation against a schema. With schema validation you can define how the data should be formatted.
Example :
Let’s create a simple document schema that looks at a key named ‘salary’ and set up rules that the minimum value is 5000 and the maximum value is 10000.
The salary is set to around 9000 and we use JSON_SCHEMA_VALID() to test. The test document passes the validation test, with 1 as a pass and 0 as a fail.
mysql> set @cmp='{ "salary": 9000}';
Query OK, 0 rows affected (0.00 sec)
mysql> select JSON_SCHEMA_VALID(@schema,@cmp);
+---------------------------------+
| JSON_SCHEMA_VALID(@schema,@cmp) |
+---------------------------------+
| 1 |
+---------------------------------+
1 row in set (0.00 sec)
If set more than 10000, the object stands invalid.
mysql> set @cmp='{ "salary": 20000}';
Query OK, 0 rows affected (0.00 sec)
mysql> select JSON_SCHEMA_VALID(@schema,@cmp);
+---------------------------------+
| JSON_SCHEMA_VALID(@schema,@cmp) |
+---------------------------------+
| 0 |
+---------------------------------+
1 row in set (0.00 sec)
9.JSON_SCHEMA_VALIDATION_REPORT() :
The JSON_SCHEMA_VALIDATION_REPORT() function Validates a JSON document and a JSON schema.
The schema must be a valid JSON object, and the document must be a valid JSON document. If all the provided conditions are met, the function returns a valid report.