Best practice for setting up MySQL replication filters

Posted in: MySQL, Technical Track

It is not uncommon that we need to filter out some DBs or Tables while setting up replication. It is important to understand how MySQL evaluates/process the replication filtering rules to avoid the conflicting or confusion while we setting them up.The purpose of this blog is to illustrate the rules and provide some suggestions for best practice.

MySQL provides 3 levels of filters for setting up replication: Binary log, DB and Table. The binlog filters apply on the master to control how to log the changes. Since MySQL replication is based on the binlog, it is the first level filter and has the highest priority. While the DB-level and Table-level filters apply on the slaves, since each table belongs to a schema, the DB-level filters have higher priority than the Table-level ones. Inside the Table-level filters, MySQl will evaluate the options in the order of: –replicate-do-table, –replicate-ignore-table ,  –replicate-wild-do-table , –replicate-wild-ignore-table.

Based on that, we have the following suggestions for setting up MySQL replication filter as best practice:

I)Do not setup any binlog-level filters unless you really need to and can afford losing the chance of  having an extra full copy of data changes for the master.

II)In DB-level filters, use either one or none of the two options: –replicate-do-db or –replicate-ignore-db. Never use both at the same time.

III) While using binlog_format=’statement’ OR ‘mixed’ (in mixed mode, if  a transaction is deterministic then it will be stored in statement format) and set up –replicate-do-db or –replicate-ignore-db on slaves, make sure never make changes on the tables across the default database on master otherwise you might lose the changes on slave due to default database not matching.

IV)In Table-level filters, use only one of the 2 options, or use the following two combination: –replicate-ignore-table and —replicate-wild-do-table to avoid conflicting and confusing.

For MariaDB replication filters within Galera cluster, it should be used with caution. As a general rule except for InnoDB DML updates, the following replication filters are not honored in a Galera cluster :  binlog-do-db ,binlog-ignore-db, replicate-wild-do-db, replicate-wild-ignore-db. However, replicate-do-db,replicate-ignore-db filters are honored for DDL and DML for both InnoDB & MyISAM engines. As they might create discrepancies and replication may abort (see MDEV-421, MDEV-6229). (https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/), For the slaves replicating from cluster, the rules are similar with normal replication settings as above.

Here are the details/reasons:

1)Binlog-level filters

A)How MySQL process the Binlog-level filters

There are 2 options for setting binlog filter on master:  –binlog-do-db and –binlog-ignore-db. MySQL will check –binlog-do-db first, if there are any options, it will apply this one and ignore –binlog-ignore-db. If the –binlog-do-db is NOT set, then mysql will check –binlog-ignore-db.If both of them are empty, it will log changes for all DBs.

See the below examples. In scenario 1) no binlog level filters are set and so all changes were logged; In scenario 2) -binlog-do-db and –binlog-ignore-db are all set to m_test and changes on the DB m_test were logged and changes on the DB test were NOT logged;In scenario 3) only –binlog-ignore-db is set to m_test and so changes on the DB m_test were NOT logged and changes on the DB test were  logged;
scenario 1)–binlog-do-db and –binlog-ignore-db is NOT set:

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000003 |      120 |              |                  |                   |

+——————+———-+————–+——————+——————-+

1 row in set (0.00 sec)

mysql> show binlog events in “vm-01-bin.000003” from 120;  

Empty set (0.00 sec)

mysql> insert into t1(id,insert_time) values(10,now());

Query OK, 1 row affected (0.05 sec)

 

mysql> show binlog events in “vm-01-bin.000003” from 120;

+——————+—–+————+———–+————-+—————————————————————+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                          |

+——————+—–+————+———–+————-+—————————————————————+

| vm-01-bin.000003 | 120 | Query      |         1 |         211 | BEGIN                                                         |

| vm-01-bin.000003 | 211 | Query      |         1 |         344 | use `m_test`; insert into t1(id,insert_time) values(10,now()) |

| vm-01-bin.000003 | 344 | Xid        |         1 |         375 | COMMIT /* xid=17 */                                           |

+——————+—–+————+———–+————-+—————————————————————+

3 rows in set (0.00 sec)

scenario 2)–binlog-do-db=m_test and –binlog-ignore-db=m_test:

— insert into tables of DB m_test was logged

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000004 |      656 | m_test       | m_test           |                   |

+——————+———-+————–+——————+——————-+

1 row in set (0.00 sec)

 

mysql> use m_test

 

mysql> insert into t1(insert_time) values(now());

Query OK, 1 row affected (0.02 sec)

 

mysql> show binlog events in “vm-01-bin.000004” from 656;

+——————+—–+————+———–+————-+———————————————————+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                    |

+——————+—–+————+———–+————-+———————————————————+

| vm-01-bin.000004 | 656 | Query      |         1 |         747 | BEGIN                                                   |

| vm-01-bin.000004 | 747 | Intvar     |         1 |         779 | INSERT_ID=13                                            |

| vm-01-bin.000004 | 779 | Query      |         1 |         906 | use `m_test`; insert into t1(insert_time) values(now()) |

| vm-01-bin.000004 | 906 | Xid        |         1 |         937 | COMMIT /* xid=26 */                                     |

+——————+—–+————+———–+————-+———————————————————+

4 rows in set (0.00 sec)

— insert into tables of DB test was NOT logged

mysql> use test;

 

mysql> show master status ;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000004 |      937 | m_test       | m_test           |                   |

+——————+———-+————–+——————+——————-+

 

mysql> insert into t1(`a`) values(‘ab’);

Query OK, 1 row affected (0.03 sec)

 

mysql> show binlog events in “vm-01-bin.000004” from 937;

Empty set (0.00 sec)

 

scenario 3)–Binlog_Do_DB=null –binlog-ignore-db=m_test:

mysql> use m_test

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000005 |      120 |              | m_test           |                   |

+——————+———-+————–+——————+——————-+

mysql> insert into t1(insert_time) values(now());

Query OK, 1 row affected (0.01 sec)

 

mysql> show binlog events in “vm-01-bin.000005” from 120;

Empty set (0.00 sec)

 

mysql> use test

mysql> insert into t1(`a`) values(‘ba’);

Query OK, 1 row affected (0.03 sec)

 

mysql> show binlog events in “vm-01-bin.000005” from 120;

+——————+—–+————+———–+————-+———————————————-+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                         |

+——————+—–+————+———–+————-+———————————————-+

| vm-01-bin.000005 | 120 | Query      |         1 |         199 | BEGIN                                        |

| vm-01-bin.000005 | 199 | Query      |         1 |         305 | use `test`; insert into t1(`a`) values(‘ba’) |

| vm-01-bin.000005 | 305 | Xid        |         1 |         336 | COMMIT /* xid=22 */                          |

+——————+—–+————+———–+————-+———————————————-+

3 rows in set (0.00 sec)

 

B)Best practice for setting up the Binlog-level filters

So, for Binlog-level filter, we will use either one (and ONLY one or none) of the 2 options: –binlog-do-db to make MySQL log changes for the DBs in the list. OR, –binlog-ignore-db to make MySQL log changes for the DBs NOT in the list. Or leave both of them empty to log changes for all the DBs.

However, we usually recommend NOT to setup any binlog-level filters. The reason is that to log changes for all DBs and set up filters only on slaves will achieve the same purpose and let us have an extra full copy of data changes for the master, in case we will need that for recovery.

 

2)DB-level filters

A)How MySQL process the DB-level filters

There are 2 options for setting DB-level filters:  –replicate-do-db or –replicate-ignore-db. MySQL processes these two filters the similar way as it processes the Binlog-level filters, the difference is that it ONLY applies on the slaves and so affects how the slaves replicate from its master. It will check –replicate-do-db first, if there are any options, it will replicate the DBs in the list and ignore –replicate-ignore-db. If the –replicate-do-db is NOT set, then mysql will check –replicate-ignore-db and replicate all the DBs except for the ones in this list.If both of them are empty, it will replicate all the DBs. you can find the process in the below chart from https://dev.mysql.com/doc/refman/5.7/en/replication-rules-db-options.html

There is a trick for DB-level filters though If the binlog_format is set as statement or mixed. (The binlog_format =mixed also applies here, it is because that  in mixed mode replication, in case the transaction  is deterministic it will be resolved to statement which is equivalent to statement mode) .. Since “With statement-based replication, the default database is checked for a match.” (https://dev.mysql.com/doc/refman/5.7/en/replication-rules-db-options.html). If you set up –replicate-do-db and you update a table out of the default database in master, the update statement will not be replicated if the default database you are running command from is not in the  –replicate-do-db. For example, there are 2 DBs in master, you set binlog_format=’statement’ OR ‘mixed’ and set –replicate-do-db=DB1 on slave. when execute the following commands: use DB2; update DB1.t1 … This update command will not be executed on slave. To make the update statement replicated to slave, you need to do: use DB1, update t1 …

For example: with binlog_format=statement or binlog_format=mixed,  we insert into m_test.t1 in two approaches: one is using default DB as m_test, the other one is using default DB test, the changes are all logged in the master. But in slave, after it caught up, only the insert(default DB is m_test) was replicated to slave, and the insert (default DB is test) was NOT replicated. As shown below:

Scenario 1) binlog_format=statement

In master: insert into m_test.t1 in two approaches: one is using default DB as m_test, the other one is using default DB test, the changes are all logged

mysql> use m_test

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> delete from t1;

Query OK, 16 rows affected (0.02 sec)

 

mysql> select * from m_test.t1;

Empty set (0.00 sec)

 

mysql> use m_test

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.04 sec)

 

mysql> use test;

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.03 sec)

 

mysql> show binlog events in “vm-01-bin.000006” from 654;

+——————+——+————+———–+————-+—————————————————————-+

| Log_name         | Pos  | Event_type | Server_id | End_log_pos | Info                                                           |

+——————+——+————+———–+————-+—————————————————————-+

| vm-01-bin.000006 |  654 | Xid        |         1 |         685 | COMMIT /* xid=39 */                                            |

| vm-01-bin.000006 |  685 | Query      |         1 |         768 | BEGIN                                                          |

| vm-01-bin.000006 |  768 | Query      |         1 |         860 | use `m_test`; delete from t1                                   |

| vm-01-bin.000006 |  860 | Xid        |         1 |         891 | COMMIT /* xid=48 */                                            |

| vm-01-bin.000006 |  891 | Query      |         1 |         982 | BEGIN                                                          |

| vm-01-bin.000006 |  982 | Intvar     |         1 |        1014 | INSERT_ID=17                                                   |

| vm-01-bin.000006 | 1014 | Query      |         1 |        1148 | use `m_test`; insert into m_test.t1(insert_time) values(now()) |

| vm-01-bin.000006 | 1148 | Xid        |         1 |        1179 | COMMIT /* xid=52 */                                            |

| vm-01-bin.000006 | 1179 | Query      |         1 |        1268 | BEGIN                                                          |

| vm-01-bin.000006 | 1268 | Intvar     |         1 |        1300 | INSERT_ID=18                                                   |

| vm-01-bin.000006 | 1300 | Query      |         1 |        1432 | use `test`; insert into m_test.t1(insert_time) values(now())   |

| vm-01-bin.000006 | 1432 | Xid        |         1 |        1463 | COMMIT /* xid=60 */                                            |

+——————+——+————+———–+————-+—————————————————————-+

12 rows in set (0.00 sec)

 

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

| 18 | 2016-03-20 15:00:01 |

+—-+———————+

2 rows in set (0.00 sec)

 

In slave: after it caught up, only the first insert(default DB is m_test) was replicated to slave, and the insert (default DB is test) was NOT replicated

mysql> show slave status\G

*************************** 1. row ***************************

              Slave_IO_State: Waiting for master to send event

                 Master_Host: 10.0.2.6

                 Master_User: repl

                 Master_Port: 3306

               Connect_Retry: 10

             Master_Log_File: vm-01-bin.000006

         Read_Master_Log_Pos: 1463

              Relay_Log_File: ewang-vm-03-relay-bin.000017

               Relay_Log_Pos: 1626

       Relay_Master_Log_File: vm-01-bin.000006

            Slave_IO_Running: Yes

           Slave_SQL_Running: Yes

             Replicate_Do_DB: m_test

         Replicate_Ignore_DB:

          Replicate_Do_Table:

      Replicate_Ignore_Table:

     Replicate_Wild_Do_Table:

 Replicate_Wild_Ignore_Table:

                  Last_Errno: 0

                  Last_Error:

                Skip_Counter: 0

         Exec_Master_Log_Pos: 1463

             Relay_Log_Space: 1805

             Until_Condition: None

              Until_Log_File:

               Until_Log_Pos: 0

          Master_SSL_Allowed: No

          Master_SSL_CA_File:

          Master_SSL_CA_Path:

             Master_SSL_Cert:

           Master_SSL_Cipher:

              Master_SSL_Key:

       Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

               Last_IO_Errno: 0

               Last_IO_Error:

              Last_SQL_Errno: 0

              Last_SQL_Error:

 Replicate_Ignore_Server_Ids:

            Master_Server_Id: 1

                 Master_UUID: a22b3fb2-5e70-11e5-b55a-0800279d00c5

            Master_Info_File: /mysql/data/master.info

                   SQL_Delay: 0

         SQL_Remaining_Delay: NULL

     Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it

          Master_Retry_Count: 86400

                 Master_Bind:

     Last_IO_Error_Timestamp:

    Last_SQL_Error_Timestamp:

              Master_SSL_Crl:

          Master_SSL_Crlpath:

          Retrieved_Gtid_Set:

           Executed_Gtid_Set:

               Auto_Position: 0

1 row in set (0.00 sec)

 

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

+—-+———————+

1 row in set (0.00 sec)

 

Scenario 2) binlog_format=mixed

In master:

mysql> show variables like ‘binlog_format’;

+—————+——-+

| Variable_name | Value |

+—————+——-+

| binlog_format | MIXED |

+—————+——-+

1 row in set (0.00 sec)

 

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000007 |      120 |              |                  |                   |

+——————+———-+————–+——————+——————-+

1 row in set (0.00 sec)

 

mysql> use m_test

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.04 sec)

 

mysql> use test;

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.04 sec)

 

mysql> show binlog events in “vm-01-bin.000007” from 120;

+——————+—–+————+———–+————-+—————————————————————-+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                           |

+——————+—–+————+———–+————-+—————————————————————-+

| vm-01-bin.000007 | 120 | Query      |         1 |         211 | BEGIN                                                          |

| vm-01-bin.000007 | 211 | Intvar     |         1 |         243 | INSERT_ID=19                                                   |

| vm-01-bin.000007 | 243 | Query      |         1 |         377 | use `m_test`; insert into m_test.t1(insert_time) values(now()) |

| vm-01-bin.000007 | 377 | Xid        |         1 |         408 | COMMIT /* xid=45 */                                            |

| vm-01-bin.000007 | 408 | Query      |         1 |         497 | BEGIN                                                          |

| vm-01-bin.000007 | 497 | Intvar     |         1 |         529 | INSERT_ID=20                                                   |

| vm-01-bin.000007 | 529 | Query      |         1 |         661 | use `test`; insert into m_test.t1(insert_time) values(now())   |

| vm-01-bin.000007 | 661 | Xid        |         1 |         692 | COMMIT /* xid=53 */                                            |

+——————+—–+————+———–+————-+—————————————————————-+

8 rows in set (0.00 sec)

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

| 18 | 2016-03-20 15:00:01 |

| 19 | 2016-03-20 15:09:14 |

| 20 | 2016-03-20 15:09:25 |

+—-+———————+

4 rows in set (0.00 sec)

 

In slave:

mysql> show variables like ‘binlog_format’;

+—————+——-+

| Variable_name | Value |

+—————+——-+

| binlog_format | MIXED |

+—————+——-+

1 row in set (0.00 sec)

 

mysql> show slave status\G

*************************** 1. row ***************************

              Slave_IO_State: Waiting for master to send event

                 Master_Host: 10.0.2.6

                 Master_User: repl

                 Master_Port: 3306

               Connect_Retry: 10

             Master_Log_File: vm-01-bin.000007

         Read_Master_Log_Pos: 692

              Relay_Log_File: ewang-vm-03-relay-bin.000023

               Relay_Log_Pos: 855

       Relay_Master_Log_File: vm-01-bin.000007

            Slave_IO_Running: Yes

           Slave_SQL_Running: Yes

             Replicate_Do_DB: m_test

         Replicate_Ignore_DB:

          Replicate_Do_Table:

      Replicate_Ignore_Table:

     Replicate_Wild_Do_Table:

 Replicate_Wild_Ignore_Table:

                  Last_Errno: 0

                  Last_Error:

                Skip_Counter: 0

         Exec_Master_Log_Pos: 692

             Relay_Log_Space: 1034

             Until_Condition: None

              Until_Log_File:

               Until_Log_Pos: 0

          Master_SSL_Allowed: No

          Master_SSL_CA_File:

          Master_SSL_CA_Path:

             Master_SSL_Cert:

           Master_SSL_Cipher:

              Master_SSL_Key:

       Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

               Last_IO_Errno: 0

               Last_IO_Error:

              Last_SQL_Errno: 0

              Last_SQL_Error:

 Replicate_Ignore_Server_Ids:

            Master_Server_Id: 1

                 Master_UUID: a22b3fb2-5e70-11e5-b55a-0800279d00c5

            Master_Info_File: /mysql/data/master.info

                   SQL_Delay: 0

         SQL_Remaining_Delay: NULL

     Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it

          Master_Retry_Count: 86400

                 Master_Bind:

     Last_IO_Error_Timestamp:

    Last_SQL_Error_Timestamp:

              Master_SSL_Crl:

          Master_SSL_Crlpath:

          Retrieved_Gtid_Set:

           Executed_Gtid_Set:

               Auto_Position: 0

1 row in set (0.00 sec)

 

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

| 19 | 2016-03-20 15:09:14 |

+—-+———————+

2 rows in set (0.00 sec)

 

B)Best practice for setting up the DB-level filters

Use either one or none of the two options: –replicate-do-db or –replicate-ignore-db. Never use both at the same time.

If you use binlog_format=’statement’  OR ‘mixed’ and set up –replicate-do-db or –replicate-ignore-db on slaves, make sure never make changes on the tables across the default database, otherwise the data discrepancy will be expected in the slaves.

 

3)Table-level filters

There are 4 options for setting Table-level filters: –replicate-do-table, –replicate-ignore-table ,  –replicate-wild-do-table or –replicate-wild-ignore-table. MySQL evaluates the options in order. you can find the process in the below chart from https://dev.mysql.com/doc/refman/5.6/en/replication-rules-table-options.html

 

The above chart shows us that MySQL will first check –replicate-do-table, the tables listed here will be replicated and so won’t be ignored by the following options like –replicate-ignore-table , or –replicate-wild-ignore-table. Then MySQL will check –replicate-ignore-table, the tables listed here will be ignored even if it shows up in the following options  –replicate-wild-do-table. The lowest priority is –replicate-wild-ignore-table.

B)Best practice for setting up the Table-level filters

Due to the priorities for the 4 Table_level options, to avoid confusing/conflicting, we suggest using only one of the 4 options, or using the following two options: –replicate-ignore-table and replicate-wild-do-table so that it is clearly that the tables in –replicate-ignore-table will be ignored and the tables in replicate-wild-do-table will be replicated.

 

email

Author

Interested in working with Edwin? Schedule a tech call.

No comments

Leave a Reply

Your email address will not be published. Required fields are marked *