Using gh-ost for online schema changes

Posted in: MySQL

The pt-online-schema-change tool has been a workhorse for years, allowing MySQL DBAs to alter tables with minimal impact to the application workload, even before MySQL had native online alter capabilities. Although I remain grateful for the tool's existence, I never liked the messiness of having to use and clean up triggers, and most DBAs have a horror story or two to tell about using any of the online alter methods.

When GitHub's online schema transmogrifier, gh-ost (https://github.com/github/gh-ost), was released, I jumped at the chance to test it out.

Following are my testing notes. I tried out the flags available to run gh-ost from different locations in a replication setup, both functionally and under load. The load applied was via mysqlslap, using a combination of reads and writes, with multiple threads for concurrency. This is to be considered minimal load testing, but it allowed me to observe gh-ost throttling itself, to see the wait time for cutover, and to use the operational controls.
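
For reference, the shape of the mysqlslap load was roughly along these lines (an illustrative sketch rather than the exact command I ran; host, credentials, and counts are placeholders):

mysqlslap --host=192.168.56.145 --user=ghost --password=ghost \
--concurrency=8 --iterations=10 --number-of-queries=10000 \
--auto-generate-sql --auto-generate-sql-load-type=mixed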

The two main benefits I saw in gh-ost were the avoidance of triggers and the operational controls that let you pause or cleanly stop the change process. The nicely verbose output is the best way to understand what the tool is doing; I've pasted sample output below. I also cover the operational controls in a section below.

Don't miss reading the helpful docs, but here are some highlights:

You can run gh-ost from the master or a replica. The preferred way is from a replica, to avoid extra load on the master.

Row-based replication is required on the node you are running gh-ost from.

Any table being altered requires a primary key. (A quick check for this, and for the binlog format, is sketched just after this list.)

It is functionally possible to run multiple gh-ost changes at once.
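
A quick way to check the binlog format and primary key requirements on the node you plan to run gh-ost from (a sketch using the mysqlslap.t1 test table from the examples below; a count of 1 from the second query means the table has a primary key):

slave> show variables like 'binlog_format';

slave> select count(*) from information_schema.table_constraints where table_schema='mysqlslap' and table_name='t1' and constraint_type='PRIMARY KEY';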

The installation process is simple:

Visit the download page at https://github.com/github/gh-ost/releases/latest to get the link for the latest version to use below.

wget https://github.com/github/gh-ost/releases/download/v[VERSION]/gh-ost-binary-linux-[VERSION].tar.gz

tar -xvf gh-ost-binary-linux-[VERSION].tar.gz

I tested on bare-bones CentOS 7 servers and did not find that any other software needed to be installed.
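
Once extracted, a quick sanity check that the binary runs on your system:

./gh-ost --version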

Just a few changes to make on the database:

Add these permissions:

master> grant ALTER, CREATE, DELETE, DROP, INDEX, INSERT, LOCK TABLES, SELECT, TRIGGER, UPDATE on *.* to ghost@'192.168.56%' identified by 'xxx';

master> grant SUPER, REPLICATION SLAVE on *.* to ghost@'192.168.56%';
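
You can verify what was granted with:

master> show grants for ghost@'192.168.56%';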

Add this setting on the node you will run gh-ost from. It's not strictly needed up front when running from a replica, because gh-ost can set it for you (via the --switch-to-rbr flag used in the examples below), but row-based binary logging is still a requirement.

slave> set global binlog_format=ROW;

You do need to set it yourself if you are running gh-ost on the master; otherwise you will receive an error message like this:

2016-09-01 16:07:29 FATAL 192.168.56.68:3306 has STATEMENT binlog_format, but I’m too scared to change it to ROW because it has replicas. Bailing out
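
Also note that set global binlog_format=ROW does not survive a server restart. To make the setting persistent, add it to the MySQL configuration file as well (a sketch; the file location varies by distribution):

# /etc/my.cnf
[mysqld]
binlog_format = ROW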

 

Notes about the files created:

You'll see some tables created during the process:

_t1_ghc is gh-ost's changelog table. It records the tool's own status and heartbeat entries, not captured data changes.

_t1_gho is the ghost table: an initial copy of the original table that then receives ongoing changes replayed from the binary logs.

_t1_del is the original table, renamed out of the way at cutover. Unless you specify otherwise, _t1_del will remain after the process is complete, as a backup.
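
If you want to see what was left behind after a run against the mysqlslap.t1 test table, and clean up the backup once you have verified the change, something like this works (a sketch; drop the _del table only when you are sure you no longer need it for rollback):

master> show tables from mysqlslap like '%t1%';

master> drop table mysqlslap._t1_del;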

Sample statements for various setups are below. A few brief notes on usage:

As mentioned above, the tool is flexible enough to be run from different locations in the topology; the variations are listed below.

Drop the --execute flag in the statements below to do a dry run.

Using --exact-rowcount causes a select count(*) and doesn't help with the accuracy of the output with regard to the percentage complete. In a federated setup, select count(*) can crash the server, so I'd avoid this flag unless you've tested it well for your setup. In any case, the percentage complete and estimated time to completion will not be very accurate even if you do an exact rowcount; this is especially true on a cluster with heavy writes.

1. Running gh-ost on a replica, with changes going to the master and all replicas. This is the desired method to avoid load on the master.

The tool will detect the master from a “show slave status” statement, which causes obvious issues with non-native replication such as Tungsten. See more on this at https://github.com/github/gh-ost/issues/212.
./gh-ost \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--throttle-control-replicas="192.168.56.144" \
--max-lag-millis=1500 \
--user="ghost" \
--password="ghost" \
--host=192.168.56.144 \
--database="mysqlslap" \
--table="t1" \
--verbose \
--alter="add column whatever3 varchar(50)" \
--switch-to-rbr \
--allow-master-master \
--cut-over=default \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--initially-drop-ghost-table \
--execute

 

2. Test on replica, without affecting master or other replicas

See more at: https://github.com/github/gh-ost/blob/master/doc/testing-on-replica.md

./gh-ost \
--test-on-replica \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--throttle-control-replicas="192.168.56.20" \
--max-lag-millis=1500 \
--user="ghost" \
--password="ghost" \
--host=192.168.56.20 \
--database="mysqlslap" \
--table="t1" \
--verbose \
--alter="add column charcol6 varchar(25)" \
--switch-to-rbr \
--cut-over=default \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--execute

 

3. Migrate on the replica only (perhaps used with “shell game” method)

./gh-ost \
--migrate-on-replica \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--throttle-control-replicas="192.168.56.20" \
--max-lag-millis=1500 \
--user="ghost" \
--password="ghost" \
--host=192.168.56.20 \
--database="mysqlslap" \
--table="t1" \
--verbose \
--alter="add column charcol6 varchar(25)" \
--switch-to-rbr \
--cut-over=default \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--execute

 

4. Run from the master, which replicates down to the slaves as configured within MySQL

./gh-ost \
--allow-on-master \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--throttle-control-replicas="192.168.56.144" \
--max-lag-millis=1500 \
--user="ghost" \
--password="ghost" \
--host=192.168.56.145 \
--database="mysqlslap" \
--table="t1" \
--verbose \
--alter="add column whatever6 varchar(50)" \
--cut-over=default \
--default-retries=120 \
--switch-to-rbr \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--execute

 

Sample output:

Here is some sample status output. You can see the automatic throttling when lag exceeds my setting of --max-lag-millis. Adding the --debug flag gives even more output.

 

Copy: 8000/11966 66.9%; Applied: 0; Backlog: 0/100; Time: 28s(total), 28s(copy); streamer: mysqld-bin.000075:73975014; ETA: 13s
Copy: 8000/11966 66.9%; Applied: 0; Backlog: 0/100; Time: 29s(total), 29s(copy); streamer: mysqld-bin.000075:73975014; ETA: 14s
Copy: 8000/11966 66.9%; Applied: 0; Backlog: 0/100; Time: 30s(total), 30s(copy); streamer: mysqld-bin.000075:73975014; ETA: throttled, 192.168.56.20:3306 replica-lag=2.000000s
Copy: 9000/11966 75.2%; Applied: 0; Backlog: 0/100; Time: 31s(total), 31s(copy); streamer: mysqld-bin.000075:84539586; ETA: throttled, lag=2.070653s
Copy: 9000/11966 75.2%; Applied: 0; Backlog: 0/100; Time: 32s(total), 32s(copy); streamer: mysqld-bin.000075:84539586; ETA: 10s
Copy: 9000/11966 75.2%; Applied: 0; Backlog: 0/100; Time: 33s(total), 33s(copy); streamer: mysqld-bin.000075:84539586; ETA: throttled, 192.168.56.20:3306 replica-lag=2.000000s
Copy: 9000/11966 75.2%; Applied: 0; Backlog: 0/100; Time: 34s(total), 34s(copy); streamer: mysqld-bin.000075:95104741; ETA: throttled, lag=2.070911s
Copy: 10000/11966 83.6%; Applied: 0; Backlog: 0/100; Time: 35s(total), 35s(copy); streamer: mysqld-bin.000075:95106041; ETA: 6s
Copy: 10000/11966 83.6%; Applied: 0; Backlog: 0/100; Time: 36s(total), 36s(copy); streamer: mysqld-bin.000075:95106041; ETA: 7s
Copy: 10000/11966 83.6%; Applied: 0; Backlog: 0/100; Time: 37s(total), 37s(copy); streamer: mysqld-bin.000075:95106041; ETA: throttled, 192.168.56.20:3306 replica-lag=2.000000s

 

The “downtime” for this tool is during the rename. Here is sample output during a heavy read and write load. The timing is similar to other online schema change tools.

2016-08-22 11:56:06 INFO Lock & rename duration: 3.02072298s. During this time, queries on `sbtest5` were blocked

 

Operational controls:

In addition to not using triggers (and the load they bring), a great feature of gh-ost is its operational controls. You can change the configuration while the process is running, and you can throttle it (that is, pause the process so it stops adding load).

 

Run operational commands against the gh-ost*.sock file created in the /tmp directory. The file is named according to the change you’re making (in cases of multiple concurrent changes).

 

Manually throttle (pause) the process via the command line interface:

echo throttle | nc -U /tmp/gh-ost.mysqlslap.t1.sock

Restart a throttled process:

echo no-throttle | nc -U /tmp/gh-ost.mysqlslap.t1.sock

 

Example of changing the configuration while the process is running:

echo "chunk-size=250" | nc -U /tmp/gh-ost.mysqlslap.t1.sock

View the status of the process (useful, for example, if you started the process in a screen session and don't want to reattach just to check on it). "sup" is brief and "status" is detailed:

echo sup | nc -U /tmp/gh-ost.mysqlslap.t1.sock

echo status | nc -U /tmp/gh-ost.mysqlslap.t1.sock
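
Other settings can be adjusted the same way while the migration runs. For example (these interactive commands come from the gh-ost documentation; double-check them against your version):

echo max-lag-millis=2000 | nc -U /tmp/gh-ost.mysqlslap.t1.sock

echo unpostpone | nc -U /tmp/gh-ost.mysqlslap.t1.sock

The first adjusts the replication-lag throttling threshold on the fly; the second tells gh-ost to proceed with the cut-over when it was started with --postpone-cut-over-flag-file.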


About the Author

Internal Principal Consultant
With experience as an open-source DBA and developer for software-as-a-service environments, Valerie has expertise in web-scale data storage and data delivery, including MySQL, Cassandra, Postgres, and MongoDB.

2 Comments

Thank you for testing `gh-ost`, and your feedback is appreciated!

A few comments:

> Any table being altered requires a primary key.

That is not the case: `gh-ost` is able to utilize any `UNIQUE KEY`; the one requirement is that the original and modified table schemas both share the same `UNIQUE KEY`. For a `UNIQUE KEY` that is not the `PRIMARY KEY`, you want to make sure no `NULL` exists in the indexed columns; so either all columns are `NOT NULL`, or you need to provide `gh-ost` with `--allow-nullable-unique-key` to make it trust you that no actual `NULL`s exist. Of course `PRIMARY KEY` is preferred and makes life simpler, and is the key of choice when possible.

> Using --exact-rowcount causes a select count(*) and doesn't help with accuracy of output with regard to the percentage complete

We notice otherwise, and I guess this boils down to how accurate your InnoDB statistics are. By the way, we use `--exact-rowcount` along with `--concurrent-rowcount` so that we don't waste time before actually starting the migration. In any case, we observe very accurate progress indication with these flags.

> In a federated setup, select count(*) can crash the server.

Curious: do you mean a setup with a `federated` engine?

Valerie Parham-Thompson
October 21, 2016 1:36 pm

Thank you for the clarification on primary key, Shlomi.

I will test this with exact-rowcount and concurrent-rowcount together. Maybe this will improve what I've seen, and I will share the output. My previous tests were under a high-insert workload.

Yes, federated engine.

