Using gh-ost for online schema changes

Posted in: MySQL

The pt-online-schema-change tool has been a workhorse for years, allowing MySQL DBAs to alter tables with minimal impact to the application workload, even before MySQL had native online alter capabilities. Although I remain grateful for the tool's existence, I never liked the messiness of having to use and clean up triggers, and most DBAs have a horror story or two to tell about using any of the online alter methods.

When GitHub's online schema transmogrifier, gh-ost (https://github.com/github/gh-ost), was released, I jumped at the chance to test it out.

Following are my testing notes. I tried out the flags available to run gh-ost from different locations in a replication setup, both functionally and under load. The load applied was via mysqlslap, using a combination of reads and writes, with multiple threads for concurrency. This is to be considered minimal load testing, but it allowed me to observe gh-ost throttling itself, to see the wait time for cutover, and to use the operational controls.
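
For reference, the shape of the mysqlslap load was roughly along these lines (an illustrative sketch rather than the exact command I ran; host, credentials, and counts are placeholders):

mysqlslap --host=192.168.56.145 --user=ghost --password=ghost \
--concurrency=8 --iterations=10 --number-of-queries=10000 \
--auto-generate-sql --auto-generate-sql-load-type=mixed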

The two main benefits I saw in gh-ost were the avoidance of triggers and the operational controls that let you pause or cleanly stop the change process. The nicely verbose output is the best way to understand what the tool is doing; I've pasted sample output below. I also cover the operational controls in a section below.

Don't miss reading the helpful docs, but here are some highlights:

You can run gh-ost from the master or a replica. The preferred way is from a replica, to avoid extra load on the master.

Row-based replication is required on the node you are running gh-ost from.

Any table being altered requires a primary key. (A quick check for this, and for the binlog format, is sketched just after this list.)

It is functionally possible to run multiple gh-ost changes at once.
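
A quick way to check the binlog format and primary key requirements on the node you plan to run gh-ost from (a sketch using the mysqlslap.t1 test table from the examples below; a count of 1 from the second query means the table has a primary key):

slave> show variables like 'binlog_format';

slave> select count(*) from information_schema.table_constraints where table_schema='mysqlslap' and table_name='t1' and constraint_type='PRIMARY KEY';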

The installation process is simple:

Visit the download page at https://github.com/github/gh-ost/releases/latest to get the link for the latest version to use below.

wget https://github.com/github/gh-ost/releases/download/v[VERSION]/gh-ost-binary-linux-[VERSION].tar.gz

tar -xvf gh-ost-binary-linux-[VERSION].tar.gz

I tested on bare-bones CentOS 7 servers and did not find that any other software needed to be installed.
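
Once extracted, a quick sanity check that the binary runs on your system:

./gh-ost --version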

Just a few changes to make on the database:

Add these permissions:

master> grant ALTER, CREATE, DELETE, DROP, INDEX, INSERT, LOCK TABLES, SELECT, TRIGGER, UPDATE on *.* to ghost@'192.168.56%' identified by 'xxx';

master> grant SUPER, REPLICATION SLAVE on *.* to ghost@'192.168.56%';
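
You can verify what was granted with:

master> show grants for ghost@'192.168.56%';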

Add this setting on the node you will run gh-ost from. It's not strictly needed up front when running from a replica, because gh-ost can set it for you (via the --switch-to-rbr flag used in the examples below), but row-based binary logging is still a requirement.

slave> set global binlog_format=ROW;

You do need to set it yourself if you are running gh-ost on the master; otherwise you will receive an error message like this:

2016-09-01 16:07:29 FATAL 192.168.56.68:3306 has STATEMENT binlog_format, but I’m too scared to change it to ROW because it has replicas. Bailing out
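
Also note that set global binlog_format=ROW does not survive a server restart. To make the setting persistent, add it to the MySQL configuration file as well (a sketch; the file location varies by distribution):

# /etc/my.cnf
[mysqld]
binlog_format = ROW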

 

Notes about the files created:

You'll see some tables created during the process:

_t1_ghc is gh-ost's changelog table. It records the tool's own status and heartbeat entries, not captured data changes.

_t1_gho is the ghost table: an initial copy of the original table that then receives ongoing changes replayed from the binary logs.

_t1_del is the original table, renamed out of the way at cutover. Unless you specify otherwise, _t1_del will remain after the process is complete, as a backup.
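
If you want to see what was left behind after a run against the mysqlslap.t1 test table, and clean up the backup once you have verified the change, something like this works (a sketch; drop the _del table only when you are sure you no longer need it for rollback):

master> show tables from mysqlslap like '%t1%';

master> drop table mysqlslap._t1_del;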

Sample statements for various setups are below. A few brief notes on usage:

As mentioned above, the tool is flexible enough to be run from different locations in the topology; the variations are listed below.

Drop the --execute flag in the statements below to do a dry run.

Using --exact-rowcount causes a select count(*) and doesn't help with the accuracy of the output with regard to the percentage complete. In a federated setup, select count(*) can crash the server, so I'd avoid this flag unless you've tested it well for your setup. In any case, the percentage complete and estimated time to completion will not be very accurate even if you do an exact rowcount; this is especially true on a cluster with heavy writes.

1. Running gh-ost on a replica, with changes going to the master and all replicas. This is the desired method to avoid load on the master.

The tool will detect the master from a “show slave status” statement, which causes obvious issues with non-native replication such as Tungsten. See more on this at https://github.com/github/gh-ost/issues/212.
./gh-ost \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--throttle-control-replicas="192.168.56.144" \
--max-lag-millis=1500 \
--user="ghost" \
--password="ghost" \
--host=192.168.56.144 \
--database="mysqlslap" \
--table="t1" \
--verbose \
--alter="add column whatever3 varchar(50)" \
--switch-to-rbr \
--allow-master-master \
--cut-over=default \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--initially-drop-ghost-table \
--execute

 

2. Test on replica, without affecting master or other replicas

See more at: https://github.com/github/gh-ost/blob/master/doc/testing-on-replica.md

./gh-ost \
--test-on-replica \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--throttle-control-replicas="192.168.56.20" \
--max-lag-millis=1500 \
--user="ghost" \
--password="ghost" \
--host=192.168.56.20 \
--database="mysqlslap" \
--table="t1" \
--verbose \
--alter="add column charcol6 varchar(25)" \
--switch-to-rbr \
--cut-over=default \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--execute

 

3. Migrate on the replica only (perhaps used with “shell game” method)

./gh-ost \
--migrate-on-replica \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--throttle-control-replicas="192.168.56.20" \
--max-lag-millis=1500 \
--user="ghost" \
--password="ghost" \
--host=192.168.56.20 \
--database="mysqlslap" \
--table="t1" \
--verbose \
--alter="add column charcol6 varchar(25)" \
--switch-to-rbr \
--cut-over=default \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--execute

 

4. Run from the master, which replicates down to the slaves as configured within MySQL

./gh-ost \
--allow-on-master \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--throttle-control-replicas="192.168.56.144" \
--max-lag-millis=1500 \
--user="ghost" \
--password="ghost" \
--host=192.168.56.145 \
--database="mysqlslap" \
--table="t1" \
--verbose \
--alter="add column whatever6 varchar(50)" \
--cut-over=default \
--default-retries=120 \
--switch-to-rbr \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--execute

 

Sample output:

Here is some sample status output. You can see the automatic throttling when lag exceeds my setting of --max-lag-millis. Adding the --debug flag gives even more output.

 

Copy: 8000/11966 66.9%; Applied: 0; Backlog: 0/100; Time: 28s(total), 28s(copy); streamer: mysqld-bin.000075:73975014; ETA: 13s
Copy: 8000/11966 66.9%; Applied: 0; Backlog: 0/100; Time: 29s(total), 29s(copy); streamer: mysqld-bin.000075:73975014; ETA: 14s
Copy: 8000/11966 66.9%; Applied: 0; Backlog: 0/100; Time: 30s(total), 30s(copy); streamer: mysqld-bin.000075:73975014; ETA: throttled, 192.168.56.20:3306 replica-lag=2.000000s
Copy: 9000/11966 75.2%; Applied: 0; Backlog: 0/100; Time: 31s(total), 31s(copy); streamer: mysqld-bin.000075:84539586; ETA: throttled, lag=2.070653s
Copy: 9000/11966 75.2%; Applied: 0; Backlog: 0/100; Time: 32s(total), 32s(copy); streamer: mysqld-bin.000075:84539586; ETA: 10s
Copy: 9000/11966 75.2%; Applied: 0; Backlog: 0/100; Time: 33s(total), 33s(copy); streamer: mysqld-bin.000075:84539586; ETA: throttled, 192.168.56.20:3306 replica-lag=2.000000s
Copy: 9000/11966 75.2%; Applied: 0; Backlog: 0/100; Time: 34s(total), 34s(copy); streamer: mysqld-bin.000075:95104741; ETA: throttled, lag=2.070911s
Copy: 10000/11966 83.6%; Applied: 0; Backlog: 0/100; Time: 35s(total), 35s(copy); streamer: mysqld-bin.000075:95106041; ETA: 6s
Copy: 10000/11966 83.6%; Applied: 0; Backlog: 0/100; Time: 36s(total), 36s(copy); streamer: mysqld-bin.000075:95106041; ETA: 7s
Copy: 10000/11966 83.6%; Applied: 0; Backlog: 0/100; Time: 37s(total), 37s(copy); streamer: mysqld-bin.000075:95106041; ETA: throttled, 192.168.56.20:3306 replica-lag=2.000000s

 

The “downtime” for this tool is during the rename. Here is sample output during a heavy read and write load. The timing is similar to other online schema change tools.

2016-08-22 11:56:06 INFO Lock & rename duration: 3.02072298s. During this time, queries on `sbtest5` were blocked

 

Operational controls:

In addition to not using triggers (and the load they bring), a great feature of gh-ost is its operational controls. You can change the configuration while the process is running, and you can throttle it (that is, pause the process so it stops adding load).

 

Run operational commands against the gh-ost*.sock file created in the /tmp directory. The file is named according to the change you’re making (in cases of multiple concurrent changes).

 

Manually throttle (pause) the process via the command line interface:

echo throttle | nc -U /tmp/gh-ost.mysqlslap.t1.sock

Restart a throttled process:

echo no-throttle | nc -U /tmp/gh-ost.mysqlslap.t1.sock

 

Example of changing the configuration while the process is running:

echo "chunk-size=250" | nc -U /tmp/gh-ost.mysqlslap.t1.sock

View the status of the process (useful, for example, if you started the process in a screen session and don't want to reattach just to check on it). "sup" is brief and "status" is detailed:

echo sup | nc -U /tmp/gh-ost.mysqlslap.t1.sock

echo status | nc -U /tmp/gh-ost.mysqlslap.t1.sock
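
Other settings can be adjusted the same way while the migration runs. For example (these interactive commands come from the gh-ost documentation; double-check them against your version):

echo max-lag-millis=2000 | nc -U /tmp/gh-ost.mysqlslap.t1.sock

echo unpostpone | nc -U /tmp/gh-ost.mysqlslap.t1.sock

The first adjusts the replication-lag throttling threshold on the fly; the second tells gh-ost to proceed with the cut-over when it was started with --postpone-cut-over-flag-file.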


About the Author

Internal Principal Consultant
With experience as an open-source DBA and developer for software-as-a-service environments, Valerie has expertise in web-scale data storage and data delivery, including MySQL, Cassandra, Postgres, and MongoDB.

2 Comments

Thank you for testing `gh-ost`, and your feedback is appreciated!

A few comments:

> Any table being altered requires a primary key.

That is not the case: `gh-ost` is able to utilize any `UNIQUE KEY`; the one requirement is that the original and modified table schemas both share the same `UNIQUE KEY`. For a `UNIQUE KEY` that is not the `PRIMARY KEY`, you want to make sure no `NULL` exists in the indexed columns; so either all columns are `NOT NULL`, or you need to provide `gh-ost` with `--allow-nullable-unique-key` to make it trust you that no actual `NULL`s exist. Of course `PRIMARY KEY` is preferred and makes life simpler, and is the key of choice when possible.

> Using --exact-rowcount causes a select count(*) and doesn't help with accuracy of output with regard to the percentage complete

We notice otherwise, and I guess this boils down to how accurate your InnoDB statistics are. By the way, we use `--exact-rowcount` along with `--concurrent-rowcount` so that we don't waste time before actually starting the migration. In any case, we observe very accurate progress indication with these flags.

> In a federated setup, select count(*) can crash the server.

Curious: do you mean a setup with a `federated` engine?

Valerie Parham-Thompson
October 21, 2016 1:36 pm

Thank you for the clarification on primary key, Shlomi.

I will test this with exact-rowcount and concurrent-rowcount together. Maybe this will improve what I've seen, and I will share the output. My previous tests were under a high-insert workload.

Yes, federated engine.

