Conquer massive datasets with ease! Partitioning is a way of splitting the actual data down into separate .ibd files (data compartments) in the file system based on the defined ranges using the partitioning key. It can help us with maintaining the enormous amount of data in different partitions without much hassle.
In this blog post, we are going to see how to manage table partitioning using yahoo partition manager.
Needs for partitioning:
- Enhanced data retrieval ( reduced IO ) with smaller B+Tree.
- Easy Archival or Purge by dropping or truncate of partition
- Lesser fragmentation, hence avoiding frequent table optimization.
Partitions management activity like adding/Dropping partition is made easy and automated by using yahoo partition manager.
Implementation:
Implementation of this partition manager tool is quite simple. We can start using by importing this sql file , We will walk through the Internal working of partition manager in this blog.
The partition_manager_settings table
CREATE TABLE `partition_manager_settings` (
`table` varchar(64) NOT NULL COMMENT 'table name',
`column` varchar(64) NOT NULL COMMENT 'numeric column with time info',
`granularity` int(10) unsigned NOT NULL COMMENT 'granularity of column, i.e. 1=seconds, 60=minutes...',
`increment` int(10) unsigned NOT NULL COMMENT 'seconds per individual partition',
`retain` int(10) unsigned NULL COMMENT 'seconds of data to retain, null for infinite',
`buffer` int(10) unsigned NULL COMMENT 'seconds of empty future partitions to create',
PRIMARY KEY (`table`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=Dynamic;
Table breakdown:
This table (partition_manager_settings) will be created by executing the partition manager script.
Along with partition_manager_settings table by default, it will create an event for automatic execution of this procedure in the specified interval to have a check for dropping and adding partitions based on the conditions.
mysql> show events\G
*************************** 1. row ***************************
Db: mydbops
Name: run_partition_manager
Definer: root@localhost
Time zone: SYSTEM
Type: RECURRING
Execute at: NULL
Interval value: 86400
Interval field: SECOND
Starts: 2000-01-01 00:00:00
Ends: NULL
Status: ENABLED
Originator: 2345
character_set_client: utf8mb4
collation_connection: utf8mb4_general_ci
Database Collation: utf8_general_ci
Event structure:
mysql> show create event run_partition_manager\G
*************************** 1. row ***************************
Event: run_partition_manager
sql_mode: ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
time_zone: SYSTEM
Create Event: CREATE DEFINER=`root`@`localhost` EVENT
`run_partition_manager` ON SCHEDULE EVERY 86400 SECOND STARTS
'2000-01-01 00:00:00' ON COMPLETION PRESERVE ENABLE DO BEGIN
IF @@global.read_only=0 THEN
CALL partition_manager();
END IF;
END
character_set_client: utf8mb4
collation_connection: utf8mb4_general_ci
Database Collation: utf8_general_ci
By default, this event will run for every day (86400 seconds).
Now let’s see our demo implementation.
A sample test table structure below,
mysql> show create table data\G
*************************** 1. row ***************************
Table: data
Create Table: CREATE TABLE `data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`,`created`),
KEY `index_created` (`created`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
A mandatory thing for creating partitions for a table is to have the column(partition key) based on which are adding has to be a part of the primary key/Unique key
Here we are making `created` as our partition key, which is part of composite primary key “PRIMARY KEY (`id`,`created`)”
You need to insert the table and column which you need to do partitions.
mysql> select * from partition_manager_settings\G
*************************** 1. row ***************************
table: data
column: unix_timestamp(created)
granularity: 1
increment: 3600
retain: 7200
buffer: 36000
Here I have added the table to create an hourly based partition and drop the partitions which are older than 2 hours. Along with that, it will create 10 empty partitions as a buffer, each time when the partition manager event is being called.
Below is table structure after execution of the partition manager procedure.
mysql> show create table data\G
*************************** 1. row ***************************
Table: data
Create Table: CREATE TABLE `data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`,`created`),
KEY `index_created` (`created`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (unix_timestamp(created))
(PARTITION p_START VALUES LESS THAN (0) ENGINE = InnoDB,
PARTITION p_1534402800 VALUES LESS THAN (1534402800) ENGINE = InnoDB,
PARTITION p_1534406400 VALUES LESS THAN (1534406400) ENGINE = InnoDB,
PARTITION p_1534410000 VALUES LESS THAN (1534410000) ENGINE = InnoDB,
PARTITION p_1534413600 VALUES LESS THAN (1534413600) ENGINE = InnoDB,
PARTITION p_1534417200 VALUES LESS THAN (1534417200) ENGINE = InnoDB,
PARTITION p_1534420800 VALUES LESS THAN (1534420800) ENGINE = InnoDB,
PARTITION p_1534424400 VALUES LESS THAN (1534424400) ENGINE = InnoDB,
PARTITION p_1534428000 VALUES LESS THAN (1534428000) ENGINE = InnoDB,
PARTITION p_1534431600 VALUES LESS THAN (1534431600) ENGINE = InnoDB,
PARTITION p_1534435200 VALUES LESS THAN (1534435200) ENGINE = InnoDB,
PARTITION p_1534438800 VALUES LESS THAN (1534438800) ENGINE = InnoDB,
PARTITION p_1534442400 VALUES LESS THAN (1534442400) ENGINE = InnoDB,
PARTITION p_1534446000 VALUES LESS THAN (1534446000) ENGINE = InnoDB,
PARTITION p_END VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
In the file system level, each partition has its own tablespace.
mysql> \! ls -lrth /usr/local/var/mysql/mydbops/
total 3608
-rw-r----- 1 dhanasekar admin 61B Jul 18 19:24 db.opt
-rw-r----- 1 dhanasekar admin 8.7K Aug 16 13:52 partition_manager_settings.frm
-rw-r----- 1 mydbops admin 96K Aug 16 13:54 partition_manager_settings.ibd
-rw-r----- 1 mydbops admin 8.5K Aug 16 13:54 data.frm
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_START.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534402800.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534406400.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534417200.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534420800.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534424400.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534410000.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534413600.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534428000.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534431600.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534435200.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534442400.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534446000.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_END.ibd
-rw-r----- 1 mydbops admin 112K Aug 16 13:54 data#P#p_1534438800.ibd
So Now its easy to maintain the partitions for n number of tables using the partition manager.
The column retains will have the value of how much data which needs to be active in the table. In this case, the retain has the value as 7200 seconds which is 2 hours So it will drop the partitions after the certain period of time.
The next iteration of the procedure call is based on the value of a minimum of increment from the partition_manager_settings table. There is one another partition to correct this setting at the end of the partition_manager procedure.
Last few lines of partition manager procedure:
....
....
close cur_table_list;
# confirm schedule for next run
call schedule_partition_manager(); /* 5.6.29+/5.7.11+ only - mysql bug 77288 */
END;;
DELIMITER ;
The procedure Schedule_partition_manager controls the execution of the “run_partition_manager“, Based on the min increment value of each table which is being managed.
mysql> show create procedure schedule_partition_manager\G
*************************** 1. row ***************************
Procedure: schedule_partition_manager
sql_mode: ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
Create Procedure: CREATE DEFINER=`root`@`localhost` PROCEDURE
`schedule_partition_manager`()
begin
declare min_increment int unsigned;
set min_increment=null;
select min(s.increment)
from partition_manager_settings s
into min_increment;
if min_increment is not null then
ALTER DEFINER='root'@'localhost' EVENT run_partition_manager ON
SCHEDULE EVERY min_increment SECOND STARTS '2000-01-01 00:00:00'
ENABLE;
end if;
end
character_set_client: utf8mb4
collation_connection: utf8mb4_general_ci
Database Collation: utf8_general_ci
Manageability:
The partition_manager_settings table is where we need to add/remove tables to make partitions. This will be executed in an order of insertion in this table.
Key Takeaways/Limitations:
- This partition manager will only work for range partitions.
- This directly won’t work for the partitions for timezone dependent.
- We have to make use of functions to get the required partitions.
- Ensure that events are always on
This procedure will help you in managing the partition in an efficient way. I hope this blog post gives you a better idea about yahoo partition manager.
Mydbops, your trusted opensource database management partner, provides comprehensive management services to Expert Partitioning Strategy, Automated Partition Management and Performance Optimization.
{{cta}}