A few days ago one of our intern @mydbops reached me with a SQL query. The query scans only a row according to the execution plan. But query does not seems optimally performing.
Below is the SQL query and its explain plan. ( MySQL 5.7 )
select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;
Execution plan and table structure
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: user_roles
partitions: NULL
type: index
possible_keys: NULL
key: PRIMARY
key_len: 4
ref: NULL
rows: 1
filtered: 10.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
show create table user_roles\G
*************************** 1. row ***************************
Table: user_roles
Create Table: CREATE TABLE `user_roles` (
`role_id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Role ID',
`role` varchar(255) DEFAULT NULL COMMENT 'Role',
`username` varchar(255) DEFAULT NULL COMMENT 'Username',
PRIMARY KEY (`role_id`)
) ENGINE=InnoDB;
The execution plan estimates the query scans one row ( rows : 1).
The 'rows'
column indicates the number of rows MySQL believes it must examine to execute the query.
MySQL Ref Manual
Let us review the slow query log file too
# Query_time: 0.261564 Lock_time: 0.000181 Rows_sent: 1 Rows_examined: 486271 Rows_affected: 0
# Bytes_sent: 174
SET timestamp=1584903045;
select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;
The slow query logs shows a high number in estimated rows Rows_examined. The estimated rows in slow log equals the counts of records in tables, So the query is actually doing a FULL TABLE SCAN (FTS). But according to the optimiser it choses the primary key over the ORDER BY clause as an optimal index.
Let us validate the query cost via format=JSON
EXPLAIN: {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "65861.40"
},
"ordering_operation": {
"using_filesort": false,
"table": {
"table_name": "user_roles",
"access_type": "index",
"key": "PRIMARY",
"used_key_parts": [
"role_id"
],
"key_length": "4",
"rows_examined_per_scan": 1,
"rows_produced_per_join": 32434,
"filtered": "10.00",
"cost_info": {
"read_cost": "59374.56",
"eval_cost": "6486.84",
"prefix_cost": "65861.40",
"data_read_per_join": "16M"
},
"used_columns": [
"role_id",
"role",
"username"
],
"attached_condition": "(`user_roles`.`username` = '9977223389')"
}
}
}
}
1 row in set, 1 warning (0.00 sec)
Though the access type is Index the query cost ( 65861) is huge for examining a single row based on Primary KEY. The execution plan is misleading in this case.
A better index for performance can be a composite index on (username, id ) and can be a covering index over column role too.
create index idx_combo on user_roles(username,role_id,role);
Query OK, 0 rows affected (2.80 sec)
Records: 0 Duplicates: 0 Warnings: 0
explain select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;
+----+-------------+------------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
| 1 | SIMPLE | user_roles | NULL | ref | idx_combo | idx_combo | 258 | const | 1 | 100.00 | Using where; Using index |
+----+-------------+------------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
## Format = JSON
EXPLAIN: {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "1.20"
},
The query performs more optimally and there is a huge reduction in cost. The optimiser can be misleading with “Order By” over primary keys. Now our interns understands how the query can be optimised and dig in further. Hope this will be fixed in upcoming releases.
Update 15-07-2020
A new optimiZer switch to avoid picking index in sort was introduced in MySQL 8.0.21 "prefer_indexing_order". It might affect other queries, please do valid checks before turning it OFF.
More good reads.
https://blog.jcole.us/2019/09/30/reconsidering-access-paths-for-index-ordering-a-dangerous-optimization-and-a-fix/
https://dom.as/2015/07/30/on-order-by-optimization/
www.percona.com/community-blog/2019/07/29/mysql-optimizer-naughty-aberrations-on-queries-combining-where-order-by-and-limit/