MySQL Streaming Xtrabackup to Slave Recovery

Posted in: MySQL, Technical Track

Overview

There are times when I need to restore a slave from a backup of a master or another slave, but too many times I have taken the typical approach of taking the backup on the source server and then copying it to the target server.  This takes a great deal of time, especially as your database grows in size.

These steps are going to detail how to use Netcat (nc) and Percona Xtrabackup (innobackupexec) to stream the backup from the source server to the target server, saving a great deal of time by copying the data only once to the desired location.  While the data is streaming to the target server, it’s being compressed and then uncompressed on the fly, reducing the amount of traffic going across the network by around 85% to 90% (typical backup compression ratios of MySQL Innodb I have witnessed are in this range).  I will also provide a simple script to complete these steps to give you a start at creating your own customized solution.

Requirements

In order to accomplish this task, you need to keep the following items in mind:

  • Netcat (nc) – Application needed on both servers, used for data streaming.
  • Percona Xtrabackup – Application needed on both servers, used to perform the backup.
  • MySQL Access – MySQL access is required in order to do the backup on the master, and the slave configuration on the slave.
  • Pigz (optional) – This is only needed if you want to compress and uncompress it on the fly.  If you are going to use this, it’s needed on both servers.
  • Debian – All code and scripts were tested using Debian.  Commands may slightly change with different OS’s.

Steps

Here are the steps that are required to accomplish this task.  The source server is the server where the backup is coming from.  The target server is where the backup is going to.

Step 1. Stop MySQL on Target Server and Clear MySQL Data

On the server that needs to be restored, we will make sure that MySQL is stopped.  Then we will clear out the old data as this will all be replaced with the backup coming from the source server.  The example assumes your MySQL data directory is /var/lib/mysql.

service mysql stop
rm -rf /var/lib/mysql/*

Step 2. Start Listener on Target Server

Now that the target server has its MySQL data directory cleared of its old data, it is now ready to receive the new backup directly from the source server. The port 2112 can be changed, but the port needs to match on both source and target commands. Nc command and options may vary by OS.

nc -l -p 2112 | unpigz -c | xbstream -x -C /var/lib/mysql

No Compression would be

nc -l -p 2112 | xbstream -x -C /var/lib/mysql

Step 3. Start Backup on Source Server

The listener is now up on the target server, ready to accept the connection. We will now start the backup on the source server to stream to the target server. Update the parallel option to match the number of cpu cores on your server. Use the lower core count between the source and target server. The port 2112 can be changed, but the port needs to match on both source and target commands. Nc command and options may vary by OS.

innobackupex --stream=xbstream --parallel=4 /tmp | pigz -c --fast | nc -w 2 TARGET_SERVER 2112

No Compression would be

innobackupex --stream=xbstream --parallel=4 /tmp | nc -w 2 TARGET_SERVER 2112

Step 4. Prepare Backup on Target Server

Percona xtrabackup requires that you prepare the backup after it has been completed to apply any outstanding logs and get the database ready to be started. Use as much memory on your target server as you can without causing it to go OOM (Out of Memory). As an example, 75% of your total memory would be a good starting point if there is nothing else running on your server. On a server with 4G of RAM you could safely set user memory to 3G.

innobackupex --use-memory=3G --apply-log /var/lib/mysql

Step 5. Update Ownership and Start MySQL

Now that the apply logs step has completed on your backup, you should be able to update the ownership of the files to mysql:mysql, and then start the MySQL service.

chown -R mysql:mysql /var/lib/mysql
service mysql start

Step 6. Configure Replication

If the source server is a slave, you should be able to just start the new slave as the positioning will be already configured. If your source server is the master, then you will have to figure out if you are using GTID or legacy replication. If you are using GTIDs, you should be able start replication with gtid_purged being set and the master auto position parameter. If you are using legacy replication, you can find the master log and position in the xtrabackup_binlog_info file in your backup directory. In this scenario, the backup directory is the MySQL data directory (/var/lib/mysql/) on the target server.

GTID Replication

mysql
CHANGE MASTER TO MASTER_HOST='<MASTER_SERVER>', 
MASTER_USER='<REPL_USER>', 
MASTER_PASSWORD='<REPL_PASS>', 
MASTER_AUTO_POSITION = 1; 
START SLAVE;"

Legacy Replication

cat /var/lib/mysql/xtrabackup_binlog_info
mysql 
CHANGE MASTER TO MASTER_HOST='<MASTER_SERVER>', 
MASTER_USER='<REPL_USER>', 
MASTER_PASSWORD='<REPL_PASS>', 
MASTER_LOG_FILE='<LOG_FROM_xtrabackup_binlog_info', 
MASTER_LOG_POS=<POSITION_FROM_xtrabackup_binlog_info>; 
START SLAVE;"

Script

Below you will find the simple script that I came up with to get you started on automating this task to quickly rebuild a slave in your own environment.  The script was created with a lot of assumptions that you may not have in your environment.  Please make sure to update accordingly.

  • The script is running from a separate server that has access to both MySQL servers. May work by running on one of the MySQL servers but not tested to do so.
  • The account running the commands on the remote servers have SUDO access to run commands.
  • SSH key of the remote server is set up to allow ssh access to both MySQL servers allowing for ssh with no password prompt.
  • .my.cnf is configured in the home directory of the account being used to SSH on the MySQL servers allowing the script to run MySQL and Xtrabackup commands with no password prompt.
  • The following software is installed on the MySQL servers : netcat (nc), pigz, xtrabackup.
  • Firewall rules are open for the port being used by NETCAT streaming.
  • All my testing was on Debian servers. Found with other OS’s and Netcac versions there is a -d flag for running nc in the background. In Debian you have to use -p for netcat when telling it to listen on a port.

Script

#!/bin/bash

SOURCE_SERVER=$1
TARGET_SERVER=$2

# This is if you have /var/lib/mysql on the / (Root) drive.  You can change this to '/' to capture the size of the drive to get an idea of how long is left on your backup
MOUNT_CONTAINING_DATADIR='/var'  

#Should match the number of CPU's on your server
BACKUP_THREADS=2  

# Seconds to wait in the loop to check that the backup has completed
TIMER=5  

# Amount of memory to use to apply the logs to the newly backed up server
MEMORY_GB_USED_APPLY_LOGS='1G' 

# Change this to a 1 if you want to configure the target a slave of the source server
CONFIGURE_TARGET_AS_SLAVE_OF_SOURCE=1 

GTID_STATUS=''
SSH='ssh -q -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null'
NETCAT_PORT=2112
REPL_USER='repl'
REPL_PASS='repl_PASSWORD'
MYSQL_DATADIR='/var/lib/mysql'

echo ""
echo "Starting MySQL Streaming Slave Rebuild Script"
echo ""

if [ ! $1 ];then
	echo "SOURCE Server not set.  Please run like :"
	echo ""
	echo "$0 <source_server> <target_server>"
	echo ""
	exit 1
fi

# VALIDATE EXECUTION

echo "Are you sure you wish to perform this rebuild.  THIS IS DESTRUCTIVE!!!"
echo "  SOURCE SERVER (Performing Backup) : $SOURCE_SERVER"
echo "  TARGET SERVER (Receiving Backup) : $TARGET_SERVER"
echo "    All files in $MYSQL_DATADIR on $TARGET_SERVER will be DELETED!!!"
echo -n "START NOW ? (y/n) : "
read CONFIRM
if [ $CONFIRM == "y" -o $CONFIRM == "Y" ]; then
	echo ""
	echo "STARTING REBUILD!!!"
else
	echo "Y or y was not chosen. Exiting."
	exit 1
fi

# PREPARE TARGET

echo "  Stopping MySQL"
$SSH $TARGET_SERVER "service mysql stop"
echo "  Clearing $MYSQL_DATADIR"
$SSH $TARGET_SERVER "rm -rf $MYSQL_DATADIR/*"

# PERFORM STREAMING BACKUP

echo "  Start Listener on Target server $TARGET_SERVER:$NETCAT_PORT to accept the backup and place it in $MYSQL_DATADIR"
$SSH $TARGET_SERVER "nc -l -p $NETCAT_PORT | unpigz -c | xbstream -x -C $MYSQL_DATADIR" > /dev/null 2>&1 &

echo "  Starting backup on source server $SOURCE_SERVER:$NETCAT_PORT to stream backup"
$SSH $SOURCE_SERVER "innobackupex --stream=xbstream --parallel=$BACKUP_THREADS /tmp | pigz -c --fast | nc -w 2 $TARGET_SERVER $NETCAT_PORT" > /tmp/backup_log 2>&1 &
sleep 4
echo "    Watching backup every $TIMER seconds to validate when the backup is complete"
LOOP=1
while [ 1 -eq $LOOP ];do
	BACKUP_PROCESSES=`$SSH $SOURCE_SERVER "ps aux | grep -v grep | grep -w innobackupex | wc -l"`
	if [ $BACKUP_PROCESSES -eq 0 ]; then
		echo "    Backup has COMPLETED!!"
		LOOP=2
	else
		echo "    Backup is Running!"
		sleep $TIMER
	fi
done

# PREPARE AND COMPLETE BACKUP ON TARGET

echo "  Applying logs to the Xtrabackup"
$SSH $TARGET_SERVER "innobackupex --use-memory=$MEMORY_GB_USED_APPLY_LOGS --apply-log $MYSQL_DATADIR" > /tmp/backup_log 2>&1 &
sleep 3
LOOP=1
while [ 1 -eq $LOOP ];do
        APPLY_PROCESSES=`$SSH $TARGET_SERVER "ps aux | grep -v grep | grep innobackupex | wc -l"`
        if [ $APPLY_PROCESSES -eq 0 ]; then
                echo "    Apply logs has COMPLETED!!"
                LOOP=2
        else
                echo "    Apply Logs Running!"
                sleep $TIMER
        fi
done

sleep 1

echo "  Updating ownership on the backup files so that MySQL owns them"
$SSH $TARGET_SERVER "chown -R mysql:mysql $MYSQL_DATADIR"

echo "  Starting MySQL"
$SSH $TARGET_SERVER "service mysql start"

if [ $CONFIGURE_TARGET_AS_SLAVE_OF_SOURCE -eq 1 ]; then
	echo "  Configuring Replication"
	GTID_STATUS=`$SSH $SOURCE_SERVER "mysql -BN -e \"SHOW VARIABLES LIKE 'gtid_mode'\"" | grep -w ON | wc -l`
	if [ $GTID_STATUS -gt 0 ]; then
                echo "Found GTID ON.  Using Master Auto Position.  SLAVE STARTED"
                GTID_POS=`$SSH $TARGET_SERVER "cat $MYSQL_DATADIR/xtrabackup_binlog_info" | awk '{print $3}' | head -n 1 | sed 's/,//'`
                $SSH $TARGET_SERVER "mysql -e \"SET GLOBAL gtid_purged='$GTID_POS';\""
                $SSH $TARGET_SERVER "mysql -e \"CHANGE MASTER TO MASTER_HOST='$SOURCE_SERVER', MASTER_USER='$REPL_USER', MASTER_PASSWORD='$REPL_PASS', MASTER_AUTO_POSITION = 1; START SLAVE; \""
	else
		echo "Found GTID not ON.  Grabbing positioning from the backup file and using that to configure replication"
		MASTER_LOG=`$SSH $TARGET_SERVER "cat $MYSQL_DATADIR/xtrabackup_binlog_info" | awk '{print $1}'`
		MASTER_POS=`$SSH $TARGET_SERVER "cat $MYSQL_DATADIR/xtrabackup_binlog_info" | awk '{print $2}'`
		echo "Setting the slave to $MASTER_LOG and $MASTER_POS.  SLAVE STARTED"
		$SSH $TARGET_SERVER "mysql -e \"CHANGE MASTER TO MASTER_HOST='$SOURCE_SERVER', MASTER_USER='$REPL_USER', MASTER_PASSWORD='$REPL_PASS', MASTER_LOG_FILE='$MASTER_LOG', MASTER_LOG_POS=$MASTER_POS; START SLAVE;\""
	fi
fi

Script Output


[root@bastion DB]# ./mysql_file_streaming_slave_rebuild.sh 10.10.10.198 10.10.10.197

Starting MySQL Streaming Slave Rebuild Script

Are you sure you wish to perform this rebuild.  THIS IS DESTRUCTIVE!!!
  SOURCE SERVER (Performing Backup) : 10.10.10.198
  TARGET SERVER (Receiving Backup) : 10.10.10.197
    All files in /var/lib/mysql on 10.10.10.197 will be DELETED!!!
START NOW ? (y/n) : y

STARTING REBUILD!!!
  Stopping MySQL
  Clearing /var/lib/mysql
  Start Listener on Target server 10.10.10.197:2112 to accept the backup and place it in /var/lib/mysql
  Starting backup on source server 10.10.10.198:2112 to stream backup
    Watching backup every 5 seconds to validate when the backup is complete
    Backup is Running!
    Backup is Running!
    Backup is Running!
    Backup has COMPLETED!!
  Applying logs to the Xtrabackup
    Apply Logs Running!
    Apply Logs Running!
    Apply logs has COMPLETED!!
  Updating ownership on the backup files so that MySQL owns them
  Starting MySQL
  Configuring Replication
Found GTID ON.  Using Master Auto Position.  SLAVE STARTED

 

Conclusion

I have found that this has greatly increased the timeframe in which it took me to recover a failed slave. By transferring the data only once from the source server to the target server, and with the data being compressed during the transfer, I feel this is one of the most efficient methods of recovering a failed slave, or building a new one.

email

Interested in working with Kevin? Schedule a tech call.

MySQL Database Consultant

No comments

Leave a Reply

Your email address will not be published. Required fields are marked *