DDL (Data Definition Language) statements create, alter, and remove database objects. These types of changes can be a very dangerous action to take on such a critical piece of your infrastructure. You want to make sure that the command that you are executing has been given proper thought and testing.
In this post I go through multiple version of MySQL and verify the best course of action to take in regards to executing DDL statements. There are many things that you have to consider when making these types of changes, such as disk space, load on the database server, slave replication, the type of DDL statement you are executing, and if it will lock the table.
Because of these risks, there are tools that can be used to help mitigate some of the dangers. But unless you have tested and verified their functionality, these tools in themselves can cause trouble. Whenever in doubt, take the time to test and verify any changes that you will make. In my testing I will be using :
- MySQL 5.6+ Inherent Online DDL
pt-online-schema-change in particular since it is a very popular tool and I have used it many times. Also, the primary reason it was created was before MySQL offered online DDL changes. In some cases, depending on your environment, the best course of action may be removing the database server from being accessed, by failing over to a slave, or taking a cluster node offline.
I will be focusing on the most common DDL statements as I want to keep this post to a reasonable size. Many of the MySQL DDL statements by default are using the INPLACE algorithm where it is able, which is only available in MySQL 5.6 or later. In earlier versions 5.5 and 5.1 with the InnoDB plugin they had fast index creation but all other table alters were blocking. Online DDL with the INPLACE algorithm allows MySQL to make a copy of the table in the background, copy the data to this table, make your table alters, and then swap the tables, all without locking the table. Some DDL statements can be done instantaneously, such as dropping an index or renaming a column. When MySQL isn’t able to use the INPLACE algorithm it will have to revert to using the COPY algorithm which will in turn lock the table. An example of this is changing a column definition from VARCHAR to BLOB. Whenever you are doing an INPLACE alter you will want to specify the algorithm in your command. This will help protect you in the case that MySQL is unable to do an INPLACE alter. MySQL will return an error rather than running the command with the COPY algorithm.
ALTER TABLE employee_test ALGORITHM=INPLACE, CHANGE COLUMN first_name first_name BLOB NULL; ERROR 1846 (0A000): ALGORITHM=INPLACE is not supported. Reason: Cannot change column type INPLACE. Try ALGORITHM=COPY.
All of my testing was done without specifying the algorithm, allowing MySQL to determine the best algorithm to use. If there are any DDL statements that you want more information on, please refer to the documentation for the release of MySQL that you are using, as I will not be going into foreign keys.
All of my testing was done in virtual machines (VMs) on my laptop. I have a VM that will be running mysqlslap to perform remote DML statements such as SELECT, UPDATE, DEELTE and INSERT, causing load on the database server. This will allow me to see any potential table locks or performance impact. Here is the setup of the MySQL machine and it’s components. I have created the table shown below and imported 10 million rows. While mysqlslap was running I performed each of the DDL statements and watched that the DML statements were being executed with no table locks. I then recorded the time as they completed.
MySQL Server Stats
- CPU : 4x CPUs at 2.6 GHz Intel Core i7
- Memory allocated to VM : 2 Gig
- Memory allocated to MySQL Innodb buffer pool: 1 Gig
- Flash Storage
- Table has 10 Million Rows.
- DML (Data Manipulation Language) statements such as select, insert, update, and delete, that will be executed against the table during DDL statements
CREATE TABLE `employee_test` ( `emp_no` int(11) NOT NULL AUTO_INCREMENT, `birth_date` date NOT NULL, `first_name` varchar(14) NOT NULL, `last_name` varchar(16) NOT NULL, `gender` enum('M','F') NOT NULL, `hire_date` date NOT NULL, PRIMARY KEY (`emp_no`), KEY `ix_lastname` (`last_name`), KEY `ix_firstname` (`first_name`) ) ENGINE=InnoDB AUTO_INCREMENT=10968502 DEFAULT CHARSET=latin1
MySQL DDL Commands
CREATE INDEX ix_hire_date ON employee_test (hire_date); --CREATE INDEX CREATE FULLTEXT INDEX ix_lastname_fulltext ON employee_test(last_name); --CREATE FULLTEXT INDEX DROP INDEX ix_hire_date ON employee_test; --DROP INDEX OPTIMIZE TABLE employee_test; --OPTIMIZE TABLE ALTER TABLE employee_test ADD COLUMN test_column INT NULL; --ADD COLUMN ALTER TABLE employee_test DROP COLUMN f_name; --DROP COLUMN ALTER TABLE employee_test CHANGE first_name f_name varchar(14) NOT NULL; --RENAME COLUMN ALTER TABLE employee_test MODIFY COLUMN emp_no BIGINT AUTO_INCREMENT NOT NULL; --CHANGE COLUMN TYPE
pt-online-schema-change DDL Commands
pt-online-schema-change --execute --alter 'ADD FULLTEXT INDEX ix_lastname_fulltext (last_name)' D=employees,t=employee_test pt-online-schema-change --execute --alter 'ENGINE=INNODB' D=employees,t=employee_test pt-online-schema-change --execute --alter 'ADD COLUMN test_column3 INT NULL' D=employees,t=employee_test pt-online-schema-change --execute --alter 'MODIFY COLUMN gender BLOB NULL' D=employees,t=employee_test
This matrix is a representation of the testing that I performed and how quickly the commands took to execute. Be careful with Fulltext indexes on your tables since they potentially can cause additional locking by creating the necessary infrastructure in the background. Please see MySQL Innodb Fulltext Indexes for more details. This requirement causes a great deal of locking on the table.
For the DDL statements that cause locking of the table we wanted to look at incorporating pt-online-schema-change, to help us overcome this obstacle.
pt-online-schema-change allowed us to perform the operations that locked the table previously with no locking. pt-onilne-schema-change also has many other features such as helping with the impact on slave replication, and handling foreign keys. But it also has it’s limitation such as not being able to run it on a table that already has triggers, or complications with foreign keys. There are also impacts on your environment if it is not properly tested and verified. One such example is, every time that I ran pt-online-schema-change in my test it caused a deadlock causing mysqlslap to die and no longer perform and further statements.
mysqlslap: Cannot run query UPDATE employee_test SET first_name = ‘BigPurpleDog’ WHERE last_name = ‘SmallGreenCat’; ERROR : Deadlock found when trying to get lock; try restarting transaction
This is why it is very important to try and determine the impact if any that pt-online-schema-change may have on your environment before starting to use it. I did not encounter this behavior with any of the MySQL DDL statements that I ran.
While performing the changes there were consistent increases in CPU load, disk I/O, and disk usage as the new tables were being created for the table alters. We have to remember that when certain DDL statements are being executed, a full copy of the table is being performed, so you will want to make sure you have enough disk space to complete the change. This is why it is very important to take into consideration the size of the table you are altering and the load on the MySQL server while performing DDL statements. It is preferred that you run any of the DDL statements that cause table copies, off hours as to avoid any delays or outages to the application that is using the data.
Query Execution Impact
Server Performance Impact
As I have observed in performing these tests, there are many things to consider when performing DDL statements to avoid potential downfalls. Here is a summary of the recommendations to executing DDL statements or using pt-online-schema-change. Before considering any of this determine if the statement you are going to perform is going to copy a table, and if it does, make sure you have enough disk space.
If you are going to make changes to your production servers, make sure that you run your DDL statements during off hours when the server is at it’s lowest utilization for both CPU and disk.
For an added safety measure when you are performing any of the MySQL DDL statements that you are expecting to be executed INPLACE and will not lock the table, make sure you specify ALGORITHM=INPLACE in your statement. If MySQL is unable to execute the command in place, it will just return an error, instead of executing the statement with the COPY algorithm which will lock the table. Here are samples of the DDL statements that you should be able run INPLACE and not cause any locking of your table.
ALTER TABLE employee_test ALGORITHM=INPLACE, ADD INDEX ix_hire_date (hire_date); --CREATE INDEX ALTER TABLE employee_test ALGORITHM=INPLACE, DROP INDEX ix_firstname; --DROP INDEX ALTER TABLE employee_test ALGORITHM=INPLACE, ENGINE=INNODB; --OPTIMIZE TABLE ALTER TABLE employee_test ALGORITHM=INPLACE, ADD COLUMN test_column INT NULL; --ADD COLUMN ALTER TABLE employee_test ALGORITHM=INPLACE, DROP COLUMN f_name; --DROP COLUMN ALTER TABLE employee_test ALGORITHM=INPLACE, CHANGE first_name f_name varchar(14) NOT NULL; --RENAME COLUMN