Fastest Parallel replication method in MySQL 8.

Mydbops
Feb 14, 2021
10
Mins to Read
All

From MySQL 5.7, we had a Multi-threaded Slave (MTS) Applier mechanism called LOGICAL_CLOCK to overcome the problems of parallel replication within a database.

To further improve the parallelisation mechanism, from MySQL 8 (5.7.22) we have write-set replication, so before going further , lets look at the difference between Logical clock (LC) and Writeset.

LOGICAL_CLOCK

Transactions that are part of the same binary log group commit on a master are applied in parallel on a slave. The dependencies between transactions are tracked based on their timestamps to provide additional parallelisation where possible.

WRITESET

Write-set is a mechanism to track independent transactions that can be executed in parallel in the slave. Parallelising on write sets has potentially much more parallelism than logical_clock ,since it does not depend on the commit history, and as such, applying binary logs on a slave, it mean that replication can become much faster when compared to other parallel mechanism.

Below are the variables related write-set are :

 
mysql> show variables where variable_Name in ('transaction_write_set_extraction', 'binlog_transaction_dependency_tracking');
+----------------------------------------+--------------+
| Variable_name                          | Value        |
+----------------------------------------+--------------+
| binlog_transaction_dependency_tracking | COMMIT_ORDER |
| transaction_write_set_extraction       | XXHASH64     |
+----------------------------------------+--------------+
	

transaction_write_set_extraction : specifies the algorithm used to hash the writes extracted during a transaction. binlog_format must be set to ROW to change the value of this system variable. XXHASH64 is the most recommended method.

Note : The variable transaction_write_set_extraction is not enabled by default in MySQL 5.7.22 and above.

binlog_transaction_dependency_tracking :  specifies the source of dependency information that the source records in the binary log to help replicas determine which transactions can be executed in parallel. The possible values are:

  • COMMIT_ORDER: Dependency information is generated from the source’s commit timestamps. This is the default.
  • WRITESET: Dependency information is generated from the source’s write set, and any transactions that write different tuples can be can be made parallel. in other words the data is parallelised the transactions has not touched or modified the same data row.
  • WRITESET_SESSION: Dependency information is generated from the source’s write set, and any transactions that write different tuples can be made parallel, with the exception that no two updates from the same session can be reordered.

now we will dive into testing environment and look at how the write-set parallelises the replication?.

I have installed latest MySQL 8.0.23  version for testing purpose

 
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.23    |
+-----------+
1 row in set (0.00 sec)
	

In order to enable the write-set parallelism ,Binlog_transacion_dependency_tracking is to be set from COMMIT_ORDER (default) to WRITESET

 
mysql> SET GLOBAL binlog_transaction_dependency_tracking = WRITESET;
Query OK, 0 rows affected (0.00 sec)
	

Creating a test database and table.

 
mysql> create database mydbops_test;
Query OK, 1 row affected (0.00 sec)

mysql> CREATE TABLE writeset_test ( 
`id` int(10) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY, 
`city` varchar(50) NOT NULL);
Query OK, 0 rows affected, 1 warning (0.02 sec)
	
 
mysql> show binary logs;
+---------------+-----------+-----------+
| Log_name      | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000001 |      1527 | No        |
| binlog.000002 |      2291 | No        |
| binlog.000003 |      1659 | No        |
+---------------+-----------+-----------+
3 rows in set (0.00 sec)
	

Let us insert some sample data into the table writeset_test table.

 
mysql> flush logs;
Query OK, 0 rows affected (0.02 sec)
	
 
mysql> insert into writeset_test (city) values ('Delhi');
Query OK, 1 row affected (0.00 sec)

mysql> insert into writeset_test (city) values ('Mangalore');
Query OK, 1 row affected (0.01 sec)

mysql> insert into writeset_test (city) values ('Kolkata');
Query OK, 1 row affected (0.00 sec)

mysql> insert into writeset_test (city) values ('Kanpur');
Query OK, 1 row affected (0.01 sec)
	

To observe the behaviour of WRITESET , let us decode the binlog and see the how the above inserted transactions will be parallelised.

 
[root@vm8 mysql]# mysqlbinlog binlog.000004 | grep last_ |     sed -e 's/server id.*last/[...] last/' -e 's/.rbr_only.*/ [...]/'

#210119  3:59:07 [...] last_committed=0 sequence_number=1 [...]

#210119  4:03:26 [...] last_committed=1 sequence_number=2 [...]

#210119  4:03:34 [...] last_committed=1 sequence_number=3 [...]

#210119  4:04:11 [...] last_committed=1 sequence_number=4 [...]

#210119  4:17:49 [...] last_committed=1 sequence_number=5 [...]
	

As we can see from the above logs , the transaction from sequence_number 2-5 will be executed parallel in slave, as the all the transactions are independent and non conflicting to each other.

When the same the row is modified or deleted from multiple transaction , those transactions will not be part of WRITESET

 
mysql> delete from writeset_test where id=6;
Query OK, 1 row affected (0.00 sec)

mysql> insert into writeset_test (id,city) values (6,'Kanpur');
Query OK, 1 row affected (0.01 sec)

#210119 11:22:40 [...] last_committed=4 sequence_number=5 [...]
#210119 11:22:41 [...] last_committed=5 sequence_number=6 [...]
	

In above example, the id 6 is deleted and inserted back , as this rows will conflicted with 2 transactions, it will not  replicated in parallel.

MySQL 8 Parallel Replication
MySQL 8 Parallel Replication

WRITESET_SESSION

Now lets look at another mode Binlog_transacion_dependency_tracking is WRITESET_SESSION

When we want to restrict the two transaction from same session should not be executed on slave when slave-preserve-commit-order option is not enabled , then WRITESET_SESSION mode can be used, which  basically never allows the transactions from the same session to be executed in parallel.

lets look at example to see the behaviour of WRITESET_SESSION , now we already now that the 2 transactions in same session cannot be parallelised , so the will execute the the queries from 2 different session and see the results by decode the the binlog

Session 1

 
mysql> insert into mydbops_test.writeset_test (city) values ('Ahmedabad');
Query OK, 1 row affected (0.01 sec)

mysql> update writeset_test set city='Mangalore' where id=13;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

#210119 12:41:30 [...] last_committed=1 sequence_number=2 [...]

#210119 12:42:22 [...] last_committed=2 sequence_number=3 [...]
	

In session 1 ,even the transactions are on non-conflicting rows, it cannot be parallelised.

Session 2

 
mysql> delete from  writeset_test where id=11;
Query OK, 1 row affected (0.01 sec)

mysql> insert into mydbops_test.writeset_test (city) values ('Patna');
Query OK, 1 row affected (0.00 sec)

#210119 12:41:30 [...] last_committed=1 sequence_number=2 [...]

#210119 12:42:22 [...] last_committed=2 sequence_number=3 [...]

#210119 12:44:34 [...] last_committed=1 sequence_number=4 [...]

#210119 12:45:13 [...] last_committed=4 sequence_number=5 [...]
	

From session 2 , the transaction 2(from session1) and  transaction 3 are taken in write-set which will can be executed in parallel.

Writeset based replication is the fastest replication method in MySQL. It ensures the faster synchronisation of Replica set by utilising the underlying the hardware efficiently.

Limitations of Writeset

  • Supports only InnoDB .
  • Ensure tables have Primary keys.
  • Foreign key tables will not take benefit of WRITESET mode.
  • Slave_parallel_workers should be greater than zero.
  • Larger value of binlog_transaction_dependency_history_size will keep more row hashes in memory.

The writset replication needs fine tuning based on the workload to reap a very optimal performance out of it.

No items found.

About the Author

Mydbops

Subscribe Now!

Subscribe here to get exclusive updates on upcoming webinars, meetups, and to receive instant updates on new database technologies.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.