While investigating alternatives to migrate to Google Cloud SQL, I encountered a lack of support for external masters. However, it’s possible to overcome this limitation by replicating into Google Cloud SQL using Tungsten replicator.
Cloud SQL is Google’s database-as-a-service solution, similar to RDS for Amazon Web Services. You can get a fully managed database in only a few clicks (or API calls). At the time of writing this, the only supported databases are MySQL and Postgres.
Cloud SQL alternatives
Google offers two different options for MySQL deployments.
1st generation instances:
- Only MySQL versions 5.5 and 5.6 can provisioned
- Max memory is limited to 16 Gb
- Max of 250 Gb storage (up to 500 Gb with Silver or higher support package)
- MyISAM and InnoDB
- Asynchronous replication for read replicas
2nd generation instances:
- Only MySQL versions 5.6 and 5.7 can be deployed
- Maximum memory is limited to 205 Gb
- Maximum of 10 Tb storage
- InnoDB storage engine only
- Semi-sync replication only
- GTID replication only
- No support for external master
There are some limitations that are common to both flavors:
- no SUPER privilege
- no triggers
- no performance schema
- no replication between 1st and 2nd generation instances is possible
From the above, it is quite obvious most production deployments would want to use 2nd gen instances.
The problem is there is no migration path that doesn’t involve stopping application activity to be able to take a dump of the data, due to the fact that external masters are not supported on 2nd gen instances.
So how do we migrate our database to Google Cloud SQL while keeping downtime as low as possible?
The solution
The answer is to use Tungsten Replicator, so that replication is completely external to the database.
Note that since we only need the replicator, the FOSS version available on Github is enough for our purposes. There is no need to buy Tungsten commercial version, which includes the cluster functionality and official support.
We will need to install two different Tungsten processes: One will attach to the source database to read transactions from the binary logs, while the second will apply those transactions to the Cloud SQL instance.
This is what it looks like:
Preparing the environment
The first thing you will need is a place to install the replicator. I suggest to provision a dedicated instance (instance-1 in the diagram) in the same zone as your Cloud SQL instance.
At minimum you would want an n1-standard-1 size, as Tungsten does consume its fair share of memory.
Tungsten needs some packages (do check out the complete requirements list here) so let’s go ahead and install them:
apt-get install ruby default-jre
Now we need a database user for Tungsten on the source and target databases:
GRANT ALL ON *.* TO [email protected]'%' IDENTIFIED BY 'secret';
It is usually a good idea to have a dedicated OS user as well:
useradd -m -d /opt/continuent tungsten
Get the replicator package from GitHub and extract the contents as tungsten OS user:
wget https://github.com/continuent/tungsten-replicator/releases/download/5.2.1/tungsten-replicator-5.2.1.tar.gz
tar zxf tungsten-replicator-5.2.1.tar.gz
Tungsten installation
Prepare the Tungsten config files on instance-1 as follows. instance-1 is the server where Tungsten will run from, and instance-2 is the server that has the source database.
Process that will read from MySQL
vi /etc/tungsten/tungsten-mysqlreader.ini
[defaults] replication-user=tungsten replication-password=secret skip-validation-check=MySQLUnsupportedDataTypesCheck skip-validation-check=MySQLPermissionsCheck skip-validation-check=MySQLMyISAMCheck [mysqlreader] install-directory=/opt/continuent/mysqlreader master=instance-1 members=instance-1 datasource-host=instance-2 datasource-user=tungsten datasource-password=secret
Process that will write to Cloud SQL
vi /etc/tungsten/tungsten-writetocloudsql.ini
[defaults] replication-user=tungsten replication-password=secret [writetocloudsql] datasource-type=mysql install-directory=/opt/continuent/writetocloudsql master=instance-1 members=instance-1 topology=master-slave datasource-host=cloudsql_ip_address datasource-user=tungsten datasource-password=secret privileged-slave=false skip-validation-check=InstallerMasterSlaveCheck skip-validation-check=MySQLPermissionsCheck skip-validation-check=MySQLBinaryLogsEnabledCheck rmi-port=10002 master-thl-port=2112 master-thl-host=instance-1 thl-port=2113
Note the use of privileged-slave=false and the various validation checks that need to be skipped for the applier process. That allows us to get past the SUPER requirement and the other Cloud SQL limitations.
Since we are running both extractor and applier processes on the same instance, we need to manually specify the ports on the second process so that there are no conflicts.
Now we are ready to install the replicators, by running the following as tungsten OS user:
cd tungsten-replicator-5.2.1/tools
tpm install
Replicating into Google Cloud SQL using Tungsten
At this point you would start the reader process using trepctl online command to start capturing events from the source instance, and use something like mysqldump to dump & load your dataset into a Google Cloud SQL instance. Make sure you have the binlog coordinates of that dump available.
Once the Cloud SQL instance is properly seeded, we would start the applier process from the appropriate position e.g.
/opt/continuent/writetocloudsql/tungsten/tungsten-replicator/bin/trepctl online -from-event 'mysql-bin.000011:0000000000002552;0'
Checking the status of each process:
[email protected]:~$ /opt/continuent/mysqlreader/tungsten/tungsten-replicator/bin/trepctl status
Processing status command...
NAME VALUE
---- -----
appliedLastEventId : mysql-bin.000007:0000000000000520;112
appliedLastSeqno : 1
appliedLatency : 0.355
autoRecoveryEnabled : false
autoRecoveryTotal : 0
channels : 1
clusterName : mysqlreader
currentEventId : mysql-bin.000007:0000000000000520
currentTimeMillis : 1511354659857
dataServerHost : instance-2
extensions :
host : instance-2
latestEpochNumber : 0
masterConnectUri : thl://localhost:/
masterListenUri : thl://instance-1:2112/
maximumStoredSeqNo : 1
minimumStoredSeqNo : 0
offlineRequests : NONE
pendingError : NONE
pendingErrorCode : NONE
ndingErrorEventId : NONE
pendingErrorSeqno : -1
pendingExceptionMessage: NONE
pipelineSource : jdbc:mysql:thin://instance-2:3306/tungsten_mysqlreader?noPrepStmtCache=true
relativeLatency : 3.857
resourcePrecedence : 99
rmiPort : 10000
role : master
seqnoType : java.lang.Long
serviceName : mysqlreader
serviceType : local
simpleServiceName : mysqlreader
siteName : default
sourceId : instance-2
state : ONLINE
timeInStateSeconds : 72.806
timezone : GMT
transitioningTo :
uptimeSeconds : 74.46
useSSLConnection : false
version : Tungsten Replicator 5.2.1
Finished status command...
[email protected]:/etc/tungsten$ /opt/continuent/writetocloudsql/tungsten/tungsten-replicator/bin/trepctl status
Processing status command...
NAME VALUE
---- -----
appliedLastEventId : NONE
appliedLastSeqno : -1
appliedLatency : -1.0
autoRecoveryEnabled : false
autoRecoveryTotal : 0
channels : -1
clusterName : writetocloudsql
currentEventId : NONE
currentTimeMillis : 1510939133227
dataServerHost : cloudsql
extensions :
host : cloudsql
latestEpochNumber : -1
masterConnectUri : thl://localhost:/
masterListenUri : thl://instance-1:2113/
maximumStoredSeqNo : -1
minimumStoredSeqNo : -1
offlineRequests : NONE
pendingError : Replicator configuration failed
pendingErrorCode : NONE
pendingErrorEventId : NONE
pendingErrorSeqno : -1
pendingExceptionMessage: Unable to translate property value: key=serverId value = 3555962359
pipelineSource : UNKNOWN
relativeLatency : -1.0
resourcePrecedence : 99
rmiPort : 10002
role : master
seqnoType : java.lang.Long
serviceName : writetocloudsql
serviceType : unknown
simpleServiceName : writetocloudsql
siteName : default
sourceId : 35.184.133.21
state : OFFLINE:ERROR
timeInStateSeconds : 431.65
timezone : GMT
transitioningTo :
uptimeSeconds : 433.596
useSSLConnection : false
version : Tungsten Replicator 5.2.1
Finished status command...
I’ve discovered there is a bug with Tungsten 5.2, where high values of server-id parameter prevent replicator from working. CloudSQL sets very high server-ids by default, and this cannot be modified by a user.
I have already reported this to Continuent so hopefully they will come up with a way to fix this soon.
In the meantime the only way to get past this is to open a ticket with Google support, and have them modify the server-id on Cloud SQL instance for you.
Conclusion
Cloud SQL is a very interesting platform for those wanting a fully managed database solution.
Until 2nd gen instances have the ability to replicate from an external master, replicating into Google Cloud SQL using Tungsten after the initial load is one valid alternative. By doing so, you can keep data in sync until you are ready to do the cutover to the new platform.
10 Comments. Leave new
Hi,
We want to keep the tungesten replicator at the source side (master)
We have skipped server id check by using –skip-validation-check=MySQLApplierServerIDCheck option.
Forgot to say that this article helped me a lot. Thank you very much.
Hi Kasi, I believe that would help you get past the installer in case it is unable to check for the server-id value, but you’d still see the error I’ve described in the writer process
We used tungsten replicator 5.2.1. to replicate a mysql instances 5.7 with google cloud sql second generation 5.7 and its working fine.
Can i have an monitoring shell script to monitor this replication
Thanks and Regards,
Preethi
Hi Preethi, glad to hear its working fine now. May I ask what server-id value did you get for your cloudsql instance?
Re: monitoring I am not aware of any scripts available. We usually just grep the state output of trepctl status command and alert if that’s different than ONLINE. Also gather the value of latency for trending.
Hi,
We would like to update the writetocloudsql.ini file with cloud sql proxy details instead of cloud sql instances.
Is it possible.
Regards.
Preethi
Hi Preethi, that should be possible as far as I know, but I haven’t tested it.
We use MySQL 5.7 with JSON data types. Apparently, the newest version of Tungsten supports this now, however, the GitHub project seems to be behind and is only on 5.2.1. Any idea why this is? Are there other products that achieve the same goal?
Hi Gustav, usually the make the new releases available in Github, not sure what’s going on. I suggest you open an issue to request Continuent folks to update the code in Github. I am not aware of any other tools that support JSON.