Row scanned equals to 1, Is the query is optimally tuned ?

Mar 24, 2020

Mins to Read

All

Understanding MySQL Query Optimization: When Execution Plans Can Be Misleading

A few days ago one of our intern @mydbops reached me with a SQL query. The query scans only a row according to the execution plan. But query does not seems optimally performing.

Below is the SQL query and its explain plan. ( MySQL 5.7 )

 
select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;

Execution plan and table structure

 
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: user_roles
   partitions: NULL
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 1
     filtered: 10.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

show create table user_roles\G
*************************** 1. row ***************************
       Table: user_roles
Create Table: CREATE TABLE `user_roles` (
  `role_id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Role ID',
  `role` varchar(255) DEFAULT NULL COMMENT 'Role',
  `username` varchar(255) DEFAULT NULL COMMENT 'Username',
  PRIMARY KEY (`role_id`)
) ENGINE=InnoDB;

The execution plan estimates the query scans one row ( rows : 1).

The 'rows' column indicates the number of rows MySQL believes it must examine to execute the query.

MySQL Ref Manual

Let us review the slow query log file too

 
# Query_time: 0.261564  Lock_time: 0.000181  Rows_sent: 1  Rows_examined: 486271  Rows_affected: 0
# Bytes_sent: 174
SET timestamp=1584903045;
select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;

The slow query logs shows a high number in estimated rows Rows_examined. The estimated rows in slow log equals the counts of records in tables, So the query is actually doing a FULL TABLE SCAN (FTS). But according to the optimiser it choses the primary key over the ORDER BY clause as an optimal index.

Let us validate the query cost via format=JSON

 
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "65861.40"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "user_roles",
        "access_type": "index",
        "key": "PRIMARY",
        "used_key_parts": [
          "role_id"
        ],
        "key_length": "4",
        "rows_examined_per_scan": 1,
        "rows_produced_per_join": 32434,
        "filtered": "10.00",
        "cost_info": {
          "read_cost": "59374.56",
          "eval_cost": "6486.84",
          "prefix_cost": "65861.40",
          "data_read_per_join": "16M"
        },
        "used_columns": [
          "role_id",
          "role",
          "username"
        ],
        "attached_condition": "(`user_roles`.`username` = '9977223389')"
      }
    }
  }
}
1 row in set, 1 warning (0.00 sec)

Though the access type is Index the query cost ( 65861) is huge for examining a single row based on Primary KEY. The execution plan is misleading in this case.

A better index for performance can be a composite index on (username, id ) and can be a covering index over column role too.

 
create index idx_combo on user_roles(username,role_id,role);
Query OK, 0 rows affected (2.80 sec)
Records: 0  Duplicates: 0  Warnings: 0

explain  select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1; 
+----+-------------+------------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
| id | select_type | table      | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra                    |
+----+-------------+------------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
|  1 | SIMPLE      | user_roles | NULL       | ref  | idx_combo     | idx_combo | 258     | const |    1 |   100.00 | Using where; Using index |
+----+-------------+------------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

## Format = JSON

EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "1.20"
    },

The query performs more optimally and there is a huge reduction in cost. The optimiser can be misleading with “Order By” over primary keys. Now our interns understands how the query can be optimised and dig in further. Hope this will be fixed in upcoming releases.

Update 15-07-2020

A new optimiZer switch to avoid picking index in sort was introduced in MySQL 8.0.21 "prefer_indexing_order". It might affect other queries, please do valid checks before turning it OFF.

More good reads.

https://blog.jcole.us/2019/09/30/reconsidering-access-paths-for-index-ordering-a-dangerous-optimization-and-a-fix/

https://dom.as/2015/07/30/on-order-by-optimization/

www.percona.com/community-blog/2019/07/29/mysql-optimizer-naughty-aberrations-on-queries-combining-where-order-by-and-limit/

optimization

Performance

MySQL

No items found.