Optimiser is the brain of the RDBMS. Optimiser decides the right access method , algorithms , join order and right index to be used for better execution of the query. This blog is made to shed some lights on Cost based optimiser in MySQL 5.7. The cost or statistics are stored in the data dictionary .
What is cost-based optimization ?
- The cost model is based on estimates of cost various operations occur during query execution.
- The optimizer has a set of default “cost constants” it will make decision on execution plans.
- In MySQL 5.7, the optimizer has addition a database of cost estimates to use during the execution plan.
- These cost estimates are stored in server_cost & engine_cost tables in MySQL schema. For more details Cost Model
MySQL Optimizer
Cost Model Databases:
The MySQL system database contain cost estimate information tables inside MySQL schema.
==> Server_cost
==> Engine_cost
Server_cost:
- The server_cost estimates general server operations.The server_cost contain these columns.
Cost_name:
- The name of a cost estimate used in cost-model,it’s not case sensitive .
Cost_value:
- If the value is not null the server use’s the cost .Otherwise it use’s default estimate value.
Last_Update:
- It show’s time of the last row updated.
Comment:
- It’s a descriptive comment associated with cost model.
Cost_name values for server_cost tables:
Engine_cost:
- It estimates the operations specific for particular storage engines.
Engine_name:
- The name of the storage engine this cost estimate applied .If the value is default it applies all storage engines.
Device_type:
- It’s intended for specifying storage devices like hard drives (vs) solid state drives.
Cost_name:
- The name of a cost estimate used in cost-model,it’s not case sensitive .
Cost_value:
- If the value is not null the server use’s the cost .Otherwise it use’s default estimate value.
Last_Update:
- It show’s time of the last row updated.
Comment:
- Same as the server cost table.
Cost_name’s for engine_cost table:
io_block_read_cost:
- The default value of io_block_read_cost is (1.0).It will reading an index or data block from disk.
memory_block_read_cost:
- The default value is (1.0),it’s similar to io_block_read_cost. It represents the cost of reading an index or data block from an memory database buffer.
Cost-based Model Architecture:
How to estimate a Cost?
The cost for executing a query below this method used.
Cost Unit:
The cost unit is calculated form executing query read a random data pages.
Cost Factors:
There are 2 main factors are used.
==> IO cost
==> CPU cost
IO cost:
- The IO cost is estimates from storage engine based on number of pages to read.
- IO is estimated in pages read from table and pages read from index.
CPU cost:
The CPU cost is based on evaluating query conditions , comparing keys/records and sorting keys.
Example:
This is cost model for a range scan
Query:
select * from City where Population between 731200 and 801200;
In this table City contain a 4013 records.
Cost-model:
Query:
explain format=json select * from City where Population between 731200 and 801200;
Calculating IO cost:
IO cost = pages in table * io_block_read_cost
Example:
Let us assume io_block_read_cost is 1.0
IO cost = 445 * 1.0
Calculating CPU cost:
CPU cost = records * row_evaluate_cost
Example:
Let’s assume row_evaluate_cost=0.2
CPU cost = 4013 * 0.2
Total cost = 525.26
Cost-model for joins:
- The query have a two or more table joins a MySQL optimizer mission is to find best combination of join order and access method.
- So the response time becomes come to low as possible.
Query:
- In this join query MySQL may choose (City,Country) or (Country,City) join order. The join order does not matter when deciding a access method.
- In either case query execution plan will be table scan on first table and access for second table.
- This is the graphical representation of this query.
- This query scanning a full table so it will increase a total cost of the query. How to reduce the total cost of this query?.
- The best solution is adding a indexes for specific columns & most important thing is enable the “condition_fanout_filter” in optimizer switch.
Condition_Fanout_Filter:
- It will make the better decision to use all condition’s on table, and it’s estimate a qualifying row’s that will be join to the next table.
Example:
Enable a Condition filter:Enable the condition filter in globally,
After that create a index for above that query in population field, it will give better performance.
It will reduce the number of rows scanning and decrease the query cost.
Configurable cost:
- In MySQL 5.7 we can able to configure the server_cost and engine_cost to manually.
- In default cost value have a NULL values in both server_cost & engine_cost tables.
Example:
Advantages:
- In MySQL 5.6 we cannot assign the value for server cost and engine cost.But MySQL 5.7 we can assign the values.
- The MySQL 5.7 they introduce a condition filter.it will make a better decision to use all condition’s in the table.
- The cost model will chose the scenario for least cost and most efficient way to run the query.
Conclusion:
- In MySQL 5.7 they improved the optimizer performance,the cost-based optimization is best method for analyzing the performance & cost of the query.
- In this blog I explained basics of cost-based optimization and how it’s works. Next series I will explain very detailed about this cost model.