An Overview of DDL Algorithm’s in MySQL ( covers MySQL 8)

Mydbops
Mar 4, 2020
10
Mins to Read
All

Database schema change is becoming more frequent than before, Four out of five application updates(Releases) requires a corresponding database change, For a DBA schema change is a more often a repetitive task, it might be a request from the application team for adding or modifying columns in a table and many more cases.

MySQL supports online DDL from 5.6 and the latest MySQL 8.0 supports instant columns addition.

This blog post will look at the online DDL algorithms inbuilt which can be used to perform schema changes in MySQL.

DDL Algorithms supported by InnoDB is,

  • COPY
  • INPLACE
  • INSTANT ( from 8.0 versions)

INPLACE Algorithm:

INPLACE algorithm performs operations in-place to the original table and avoids the table copy and rebuild, whenever possible.

If the INPLACE algorithm is specified with the ALGORITHM clause if the ALTER TABLE  operation does not support the INPLACE algorithm, then an alter will be exited with an error by suggesting possible algorithm which can be used.

 
mysql> alter table sbtest1 add column h int(11) default null,algorithm=inplace;
Query OK, 0 rows affected (14.12 sec)
Records: 0  Duplicates: 0  Warnings: 0
	

INPLACE algorithm is dependent on two important variables when it performs the table operation.

  • It uses tmp dir to write sort files, in the defined tmp dir(uses /tmp by default), if defined tmp dir is not enough, we can explicitly define by the innodb_tmpdir system variable.
  • It also uses a temporary log file called innodb_online_alter_log_max_size to track data changes by DML queries executed like INSERT, UPDATE , DELETE in the table during the DDL operation, The maximum size for this log file can be configured by the dynamic variable innodb_online_alter_log_max_size (default is 128MB) system variable.

the incoming writes during the process of altering are stored with a size defined in innodb_online_alter_log_max_size are applied at the end of the DDL operation by locking the table for some seconds based on the write rate.

If the incoming writes floods the innodb_online_alter_log_max_size defined size, then DDL operation fails and the uncommitted transactions are rolled back.

Example:

 
mysql> alter table sbtest.sbtest5 add column l varchar(100),algorithm=inplace;
ERROR 1799 (HY000): Creating index 'PRIMARY' required more than 'innodb_online_alter_log_max_size' bytes of modification log. Please try again.
	

Below is an internal flow of INPLACE algorithm, when some operations perform a table rebuilt.

The below table provides operations that use an online DDL with INPLACE algorithm to perform table rebuilt.

Note: The InnoDB needs extra disk space to perform the above-listed operations, either equal to the size of the original table in the datadir or more in some cases.

Drawbacks of ‘INPLACE’ algorithm

  • Long-running online DDL operations can cause replication lag in slaves. Online DDL operation must finish running on the master before it is run on the slave. Also, DML that was processed concurrently on the master is only processed on the slave after the DDL operation on the slave is completed.
  • larger innodb_online_alter_log_max_size size extends the period of time at the end of the DDL operation when the table is locked to apply the data from the log.
  • At time can cause high IO usage for a larger table at high concurrency servers ( Aggressive in terms of resource consumption)

COPY Algorithm

Algorithm COPY alters the schema of the existing table by creating a new temporary table with the altered schema, once it migrates the data completely to the new temporary table it swaps and drops the old table.

Example:

 
mysql> alter table sbtest1 modify column h varchar(20) not null,algorithm=inplace;
ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.
	

When the INPLACE algorithm is not supported, MySQL throws an error and prescribes using COPY algorithm.

 
mysql> alter table sbtest1 modify column h varchar(20) not null,algorithm=COPY;
Query OK, 1024578 rows affected (17.95 sec)
Records: 1024578  Duplicates: 0 Warnings: 0
	

ALTER TABLE with ALGORITHM=COPY is an expensive operation as it blocks concurrent DML’s (inserts,updates,deletes) operations, but it allows concurrent read queries(SELECT’S) when LOCK=SHARED.

if the lock mode LOCK=EXCLUSIVE is used, both reads/writes are blocked until the completion of the alter.

Below is an internal flow of the COPY algorithm when it creates a copy of a table.

Drawbacks of COPY Algorithm

  • There is no mechanism to pause a DDL operation or to throttle I/O or CPU usage during the operation.
  • Rollback of operation can be an expensive process.
  • Blocks Concurrent DML’s are not allowed during the ALTER table
  • Causes replication lag

INSTANT Algorithm

In further improvement in online DDL’s ( column addition ) MySQL 8.0 has come up INSTANT algorithm ( a patch from TENCENT ) . This feature makes instant and in-place table alterations for column addition and allows concurrent DML with Improved responsiveness and availability in busy production environments.

If ALGORITHM is not specified, the server will first try the DEFAULT=INSTANT algorithm for all column addition. If it can not be done, then the server will try INPLACE algorithm; and if that can not be supported, at last server will finally try COPY algorithm.

INSTANT algorithm performs only metadata changes in the data dictionary. It doesn’t acquire any metadata lock during schema changes and as it doesn’t touch the data file of the table.

Example:

 
mysql> alter table city add pincode int(11) default null, algorithm=INSTANT;
Query OK, 0 rows affected (0.04 sec)
Records: 0  Duplicates: 0  Warnings: 0
	

The INSTANT algorithm supports only a few of the operations which are listed below.

MySQL 8 versions have two new views added i.e I_S.innodb_tables and I_S.innodb_columns .

Example :

Table structure before adding a column using INSTANT

 
mysql> show create table sbtest.sbtest7\G
*************************** 1. row ***************************
       Table: sbtest7
Create Table: CREATE TABLE `sbtest7` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  `f` varchar(200) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `k_7` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)
	

When we query the innodb_tables table, the number of instant_columns will be zero.

 
mysql> SELECT table_id, name, instant_cols FROM information_schema.innodb_tables where name like '%sbtest7%';
+----------+----------------+--------------+
| table_id | name           | instant_cols |
+----------+----------------+--------------+
|     1258 | sbtest/sbtest7 |            0 |
+----------+----------------+--------------+
1 row in set (0.00 sec)
	

Let’s add a column using INSTANT algorithm,

 
mysql> alter table sbtest7 add g varchar(100) not null default 'Mysql 8 is great', algorithm=instant;
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 
	

after adding column instant_cols becomes 5.

 
mysql> SELECT table_id, name, instant_cols FROM information_schema.innodb_tables where name like '%sbtest7%';
+----------+----------------+--------------+
| table_id | name           | instant_cols |
+----------+----------------+--------------+
|     1258 | sbtest/sbtest7 |            5 |
+----------+----------------+--------------+
1 row in set (0.00 sec)
	

This means that  instant_cols keeps a track of the number of columns present in table sbtest7 before an instant column addition.

And the default values of columns that are added by the INSTANT algorithm are stored in the I_S.innodb_columns table.

 
mysql> SELECT table_id, name, has_default, default_value FROM information_schema.innodb_columns WHERE table_id = 1258;
+----------+------+-------------+----------------------------------+
| table_id | name | has_default | default_value                    |
+----------+------+-------------+----------------------------------+
|     1258 | id   | 0           | NULL                             |
|     1258 | k    | 0           | NULL                             |
|     1258 | c    | 0           | NULL                             |
|     1258 | pad  | 0           | NULL                             |
|     1258 | f    | 0           | NULL                             |
|     1258 | g    | 1           | 4d7973716c2038206973206772656174 |
+----------+------+-------------+----------------------------------+
6 rows in set (0.23 sec
	

Column g has_deafult value 1 and default_valuestored in hash format.

Drawbacks of INSTANT algorithm

  • A column can only be added as the last column of the table. Adding a column to any other position among other columns is not supported.
  • Columns cannot be added to tables that use  ROW_FORMAT=COMPRESSED.
  • Columns cannot be added to tables that include a FULLTEXT index.
  • Columns cannot be added to temporary tables. Temporary tables only support ALGORITHM=COPY.
  • Columns cannot be added to tables that reside in the data dictionary tablespace(shared tablespace).

Comparison of INPLACE, COPY AND INSTANT Algorithms.

we are the end of the blog the let us calculate the time difference between all 3 algorithm over a table with 1M records.

INPLACE – 7.09 sec

 
mysql> alter table sbtest7 add g varchar(100) not null default 0, algorithm=inplace;
Query OK, 0 rows affected (7.09 sec)
Records: 0  Duplicates: 0  Warnings: 0
	

COPY – 14.34 sec

 
mysql> alter table sbtest7 add g varchar(100) not null default 0, algorithm=copy;
Query OK, 1000000 rows affected (14.34 sec)
Records: 1000000  Duplicates: 0 Warnings: 0
	

INSTANT – 0.03 sec

 
mysql> alter table sbtest7 add g varchar(100) not null default 0, algorithm=instant;
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0
	

As we can see, using INSTANT the column was added in no time (just 0.03secs) when compared to the other two algorithms.

Key Takeaways:

  • INSTANT will be the right algorithm to choose when our current MySQL Version is greater than 8 and also based on Alter type which we are trying to achieve.
  • Using the COPY algorithm for larger tables, with more no of async slaves, will be an expensive operation.
  • For alternatives, we can use other tools like pt-online-schema-change or gh-ost for schema changes, which provides more options for throttling the resource usage and more control.

I hope these blogs give you the insights of online DDL and the right algorithms to choose for performing schema changes.

Featured image by Glenn Carstens-Peters on Unsplash

No items found.
No items found.

About the Author

Mydbops

Subscribe Now!

Subscribe here to get exclusive updates on upcoming webinars, meetups, and to receive instant updates on new database technologies.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.