Partitioning is a process in which a single larger table is split into several smaller tables (physically) and still considered as a single table.It is generally a good idea for the tables whose size is in a few 100 GB’s.
While performing select,update,delete operations in a partitioned tables, we can notice a better performance in queries .This simple process of optimization is called partition pruning in which we can avoid scanning the certain partitions which does not have any matching values based on partition key.
The following example will explain you in detail about the partition pruning
Here I have used an Amazon linux 2 machine
Hardware Info
1 core CPU
1 GB RAM
20 GB Disk (SSD)
maximum of 3000 IOPS
MySQL 5.7 is installed in this machine and two tables (one is not partitioned and other is partitioned) with 1.7 million records of same data is loaded in using Sysbench.
Table Structure without partition
mysql> show create table sbtest.sbtest1\G
*************************** 1. row ***************************
Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`k` int(11) NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT ,
`pad` char(60) NOT NULL DEFAULT ,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`,`created`)
) ENGINE=InnoDB AUTO_INCREMENT=1734978 DEFAULT CHARSET=latin1
I have partitioned the second table with the help of the partition manager.we have explained about partition manager in our previous blog. partition_manager
mysql> show create table sbtest.sbtest1\G
*************************** 1. row ***************************
Table: sbtest1
Create Table: CREATE TABLE `sbtest1`
`id` int(11) NOT NULL AUTO_INCREMENT,
`k` int(11) NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`,`created`)
) ENGINE=InnoDB AUTO_INCREMENT=1734978 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (unix_timestamp(created))
(PARTITION p_START VALUES LESS THAN (0) ENGINE = InnoDB,
PARTITION p_1544832000 VALUES LESS THAN (1544832000) ENGINE = InnoDB,
PARTITION p_1547424000 VALUES LESS THAN (1547424000) ENGINE = InnoDB,
PARTITION p_1550016000 VALUES LESS THAN (1550016000) ENGINE = InnoDB,
PARTITION p_1552608000 VALUES LESS THAN (1552608000) ENGINE = InnoDB,
PARTITION p_1555200000 VALUES LESS THAN (1555200000) ENGINE = InnoDB,
PARTITION p_1557792000 VALUES LESS THAN (1557792000) ENGINE = InnoDB,
PARTITION p_1560384000 VALUES LESS THAN (1560384000) ENGINE = InnoDB,
PARTITION p_1562976000 VALUES LESS THAN (1562976000) ENGINE = InnoDB,
PARTITION p_1565568000 VALUES LESS THAN (1565568000) ENGINE = InnoDB,
PARTITION p_1568160000 VALUES LESS THAN (1568160000) ENGINE = InnoDB,
PARTITION p_1570752000 VALUES LESS THAN (1570752000) ENGINE = InnoDB,
PARTITION p_1573344000 VALUES LESS THAN (1573344000) ENGINE = InnoDB,
PARTITION p_1575936000 VALUES LESS THAN (1575936000) ENGINE = InnoDB,
PARTITION p_1578528000 VALUES LESS THAN (1578528000) ENGINE = InnoDB,
PARTITION p_1581120000 VALUES LESS THAN (1581120000) ENGINE = InnoDB,
PARTITION p_1583712000 VALUES LESS THAN (1583712000) ENGINE = InnoDB,
PARTITION p_1586304000 VALUES LESS THAN (1586304000) ENGINE = InnoDB,
PARTITION p_1588896000 VALUES LESS THAN (1588896000) ENGINE = InnoDB,
PARTITION p_1591488000 VALUES LESS THAN (1591488000) ENGINE = InnoDB,
PARTITION p_1594080000 VALUES LESS THAN (1594080000) ENGINE = InnoDB,
PARTITION p_1596672000 VALUES LESS THAN (1596672000) ENGINE = InnoDB,
PARTITION p_1599264000 VALUES LESS THAN (1599264000) ENGINE = InnoDB,
PARTITION p_1607040000 VALUES LESS THAN (1607040000) ENGINE = InnoDB,
PARTITION p_END VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)
Now Let us try to fetch the same data from both tables.we will see the difference between them in the Explain plan.
mysql> explain select * from sbtest1.sbtest1 where created between "2019-01-31 00:00:00" and "2019-05-12 00:00:00"\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sbtest1
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1574797
filtered: 11.11
Extra: Using where
The table without partition nearly examines the full table scan and it takes around 8 secs to fetch the records.
mysql> select * from sbtest1.sbtest1 where created between "2019-01-31 00:00:00" and "2019-05-12 00:00:00";
0f7550ef001dab270bd4a6849151ce79 -
175616 rows in set (8.25 sec)
Repeating the same in partitioned table
mysql> explain select * from sbtest.sbtest1 where created between "2019-01-31 00:00:00" and "2019-05-12 00:00:00"\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sbtest1
partitions: p_1550016000,p_1552608000,p_1555200000,p_1557792000
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1
filtered: 100.00
Extra: Using where
mysql> select * from sbtest.sbtest1 where created between "2019-01-31 00:00:00" and "2019-05-12 00:00:00";
0f7550ef001dab270bd4a6849151ce79 -
175616 rows in set (0.47 sec)
It is taken only less than 0.5 seconds to fetch the same data.Instead of searching the entire table, it scans only the certain partition where the matched value will be present.By limiting the search,it is possible to find matching rows in short span when compared with unpartitioned tables.This process of avoiding the unwanted partitions is known as pruning.
The below pictorial representation shows the disk operations while fetching the data
1) While scanning the full table
Without partition, it consumes more resources.Disk IOPS reaches the maximum of 900 and utilization goes above 95%
2) While scanning only specific partitions(Pruning)
While pruning the specific partitions are accessed and consumption of system resources is greatly reduced. The optimiser performs pruning in the following two cases (in Where clause)
A) Partition_column = constant [Eq Ref]
B) Partition_column in (constant 1, constant 2,….) [Range]
In case A,
The optimiser search against a single value(key) and find a partition where the matching value will be present and scans only that partition
In case B,
The optimiser search for each given values in the where clause and makes the list of matched partitioning and then scan only that partitioning.
One of our clients reported an issue that partition is not pruning is not working well as expected. While checking the issue further, we analysed that they had given incorrect values in WHERE clause. If we are giving invalid values in where clause, it will be treated as NULL. So partition will not be efficient for those kind of scenarios.
mysql> explain select * from sbtest.sbtest1 where created between "2019-01-31 00:00:00" and "2019-05-121 00:00:00"\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sbtest1
partitions:p_1550016000,p_1552608000,p_1555200000,p_1557792000,p_1560384000,p_1562976000,p_1565568000,p_1568160000,p_1570752000,p_1573344000,p_1575936000,p_1578528000,p_1581120000,p_1583712000,p_1586304000,p_1588896000,p_1591488000,p_1594080000,p_1596672000,p_1599264000,p_1601856000,p_END
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1
filtered: 100.00
Extra: Using where
The wrong value in above where clause causes the optimiser to scan all the partitions.Here the pruning is not working.
As like read operations(select), write operations (only update,insert is not supported currently)can also speed up by this process.
mysql> explain update sbtest1.sbtest1 set k=100 where created between "2018-12-25 09:08:00" and "2019-07-17 00:00:00"\G
*************************** 1. row ***************************
id: 1
select_type: UPDATE
table: sbtest1
partitions: NULL
type: index
possible_keys: NULL
key: PRIMARY
key_len: 8
ref: NULL
rows: 1574797
filtered: 100.00
Extra: Using where
1 row in set (0.00 sec)
mysql> update sbtest1.sbtest1 set k=100 where created between "2018-12-25 09:08:00" and "2019-07-17 00:00:00";
Query OK, 353397 rows affected (17.36 sec)
Rows matched: 353397 Changed: 353397 Warnings: 0
To update rows, it took nearly 18 seconds.Also have a look on the below graphs. Again with partitioned tables
mysql> explain update sbtest.sbtest1 set k=100 where created between "2018-12-25 09:08:00" and "2019-07-17 00:00:00"\G
*************************** 1. row ***************************
id: 1
select_type: UPDATE
table: sbtest1 partitions:p_1547424000,p_1550016000,p_1552608000,p_1555200000,p_1557792000,p_1560384000,p_1562976000,p_1565568000
type: index
possible_keys: NULL
key: PRIMARY
key_len: 8
ref: NULL
rows: 1
filtered: 100.00
Extra: Using where
1 row in set (0.00 sec)
mysql> update sbtest.sbtest1 set k=100 where created between "2018-12-25 09:08:00" and "2019-07-17 00:00:00";
Query OK, 353397 rows affected (2.51 sec)
Rows matched: 353397 Changed: 353397 Warnings: 0
Benefits of Partition Pruning:
Faster looks up with Point Select and range conditions
Better response time with write operations updates and deletes
Reduced system resources like IO utilisation.
This is all about partition pruning and how it will be useful in large tables of production environment.
Enhance database performance with Mydbops MySQL services! We provide comprehensive solutions including managed services, consulting, and remote DBA support.