Database schema change is becoming more frequent than before, Four out of five application updates(Releases) requires a corresponding database change, For a DBA schema change is a more often a repetitive task, it might be a request from the application team for adding or modifying columns in a table and many more cases.
MySQL supports online DDL from 5.6 and the latest MySQL 8.0 supports instant columns addition.
This blog post will look at the online DDL algorithms inbuilt which can be used to perform schema changes in MySQL.
DDL Algorithms supported by InnoDB is,
- COPY
- INPLACE
- INSTANT ( from 8.0 versions)
INPLACE Algorithm:
INPLACE
algorithm performs operations in-place to the original table and avoids the table copy and rebuild, whenever possible.
If the INPLACE
algorithm is specified with the ALGORITHM clause if the ALTER TABLE
operation does not support the INPLACE
algorithm, then an alter will be exited with an error by suggesting possible algorithm which can be used.
INPLACE
algorithm is dependent on two important variables when it performs the table operation.
- It uses tmp dir to write sort files, in the defined tmp dir(uses /tmp by default), if defined tmp dir is not enough, we can explicitly define by the
innodb_tmpdir
system variable.
- It also uses a temporary log file called innodb_online_alter_log_max_size to track data changes by DML queries executed like
INSERT
,UPDATE
,DELETE
in the table during the DDL operation, The maximum size for this log file can be configured by the dynamic variableinnodb_online_alter_log_max_size
(default is 128MB) system variable.
the incoming writes during the process of altering are stored with a size defined in innodb_online_alter_log_max_size
are applied at the end of the DDL operation by locking the table for some seconds based on the write rate.
If the incoming writes floods the innodb_online_alter_log_max_size
defined size, then DDL operation fails and the uncommitted transactions are rolled back.
Example:
Below is an internal flow of INPLACE
algorithm, when some operations perform a table rebuilt.
The below table provides operations that use an online DDL with INPLACE
algorithm to perform table rebuilt.
Note: The InnoDB needs extra disk space to perform the above-listed operations, either equal to the size of the original table in the datadir or more in some cases.
Drawbacks of ‘INPLACE
’ algorithm
- Long-running online DDL operations can cause replication lag in slaves. Online DDL operation must finish running on the master before it is run on the slave. Also, DML that was processed concurrently on the master is only processed on the slave after the DDL operation on the slave is completed.
- larger
innodb_online_alter_log_max_size
size extends the period of time at the end of the DDL operation when the table is locked to apply the data from the log. - At time can cause high IO usage for a larger table at high concurrency servers ( Aggressive in terms of resource consumption)
COPY Algorithm
Algorithm COPY
alters the schema of the existing table by creating a new temporary table with the altered schema, once it migrates the data completely to the new temporary table it swaps and drops the old table.
Example:
When the INPLACE
algorithm is not supported, MySQL throws an error and prescribes using COPY algorithm.
ALTER TABLE with ALGORITHM
=COPY
is an expensive operation as it blocks concurrent DML’s (inserts,updates,deletes) operations, but it allows concurrent read queries(SELECT’S) when LOCK
=SHARED
.
if the lock mode LOCK
=EXCLUSIVE
is used, both reads/writes are blocked until the completion of the alter.
Below is an internal flow of the COPY
algorithm when it creates a copy of a table.
Drawbacks of COPY Algorithm
- There is no mechanism to pause a DDL operation or to throttle I/O or CPU usage during the operation.
- Rollback of operation can be an expensive process.
- Blocks Concurrent DML’s are not allowed during the ALTER table
- Causes replication lag
INSTANT Algorithm
In further improvement in online DDL’s ( column addition ) MySQL 8.0 has come up INSTANT
algorithm ( a patch from TENCENT ) . This feature makes instant and in-place table alterations for column addition and allows concurrent DML with Improved responsiveness and availability in busy production environments.
If ALGORITHM is not specified, the server will first try the DEFAULT
=INSTANT
algorithm for all column addition. If it can not be done, then the server will try INPLACE
algorithm; and if that can not be supported, at last server will finally try COPY
algorithm.
INSTANT
algorithm performs only metadata changes in the data dictionary. It doesn’t acquire any metadata lock during schema changes and as it doesn’t touch the data file of the table.
Example:
The INSTANT
algorithm supports only a few of the operations which are listed below.
MySQL 8 versions have two new views added i.e I_S.innodb_tables
and I_S.innodb_columns
.
Example :
Table structure before adding a column using INSTANT
When we query the innodb_tables table, the number of instant_columns will be zero.
Let’s add a column using INSTANT
algorithm,
after adding column instant_cols
becomes 5.
This means that instant_cols
keeps a track of the number of columns present in table sbtest7 before an instant column addition.
And the default values of columns that are added by the INSTANT
algorithm are stored in the I_S.innodb_columns
table.
Column g has_deafult
value 1 and default_value
stored in hash format.
Drawbacks of INSTANT
algorithm
- A column can only be added as the last column of the table. Adding a column to any other position among other columns is not supported.
- Columns cannot be added to tables that use
ROW_FORMAT
=COMPRESSED
. - Columns cannot be added to tables that include a FULLTEXT index.
- Columns cannot be added to temporary tables. Temporary tables only support
ALGORITHM
=COPY
. - Columns cannot be added to tables that reside in the data dictionary tablespace(shared tablespace).
Comparison of INPLACE, COPY AND INSTANT Algorithms.
we are the end of the blog the let us calculate the time difference between all 3 algorithm over a table with 1M records.
INPLACE
– 7.09 sec
COPY
– 14.34 sec
INSTANT
– 0.03 sec
As we can see, using INSTANT
the column was added in no time (just 0.03secs) when compared to the other two algorithms.
Key Takeaways:
INSTANT
will be the right algorithm to choose when our current MySQL Version is greater than 8 and also based on Alter type which we are trying to achieve.- Using the
COPY
algorithm for larger tables, with more no of async slaves, will be an expensive operation. - For alternatives, we can use other tools like pt-online-schema-change or gh-ost for schema changes, which provides more options for throttling the resource usage and more control.
I hope these blogs give you the insights of online DDL and the right algorithms to choose for performing schema changes.
Featured image by Glenn Carstens-Peters on Unsplash