欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

GTIDs in MySQL 5.6: New replication protocol; new ways to br_MySQL

程序员文章站 2022-05-02 13:38:19
...
One of the MySQL 5.6 features many people are interested in is Global Transactions IDs (GTIDs). This is for a good reason: Reconnecting a slave to a new master has always been a challenge while it is so trivial when GTIDs are enabled. However, using GTIDs is not only about replacing good old binlog file/position with unique identifiers, it is also using a new replication protocol. And if you are not aware of it, it can bite.

Replication protocols: old vs new

The old protocol is pretty straightforward: the slave connects to a given binary log file at a specific offset, and the master sends all the transactions from there.

The new protocol is slightly different: the slave first sends the range of GTIDs it has executed, and then the master sends every missing transaction. It also guarantees that a transaction with a given GTID can only be executed once on a specific slave.

In practice, does it change anything? Well, it may change a lot of things. Imagine the following situation: you want to start replicating from trx 4, but trx 2 is missing on the slave for some reason.

GTIDs in MySQL 5.6: New replication protocol; new ways to br_MySQL

With the old replication protocol, trx 2 will never be executed while with the new replication protocol, itWILLbe executed automatically.

Here are 2 common situations where you can see the new replication protocol in action.

Skipping transactions

It is well known that the good oldSET GLOBAL sql_slave_skip_counter = Nis no longer supported when you want to skip a transaction and GTIDs are enabled. Instead, to skip the transaction withGTID XXX:N, you have toinject an empty transaction:

mysql> SET gtid_next = 'XXX:N';mysql> BEGIN; COMMIT;mysql> SET gtid_next = 'AUTOMATIC';

mysql>SETgtid_next='XXX:N';

mysql>BEGIN;COMMIT;

mysql>SETgtid_next='AUTOMATIC';

Why can’t we usesql_slave_skip_counter? Because of the new replication protocol!

Imagine that we have 3 servers like the picture below:

GTIDs in MySQL 5.6: New replication protocol; new ways to br_MySQL

Let’s assume thatsql_slave_skip_counteris allowed and has been used on S2 to skip trx 2. What happens if you make S2 a slave of S1?

Both servers will exchange the range of executed GTIDs, and S1 will realize that it has to send trx 2 to S2. Two options then:

  • If trx 2 is still in the binary logs of S1, it will be sent to S2, and the transaction is no longer skipped.
  • If trx 2 no longer exists in the binary logs of S1, you will get a replication error.

This is clearly not safe, that’s whysql_slave_skip_counteris not allowed with GTIDs. The only safe option to skip a transaction is to execute a fake transaction instead of the real one.

Errant transactions

If you execute a transaction locally on a slave (called errant transaction in the MySQL documentation), what will happen if you promote this slave to be the new master?

With the old replication protocol, basically nothing (to be accurate, data will be inconsistent between the new master and its slaves, but that can probably be fixed later).

With the new protocol, the errant transaction will be identified as missing everywhere and will be automatically executed on failover, which has the potential to break replication.

Let’s say you have a master (M), and 2 slaves (S1 and S2). Here are 2 simple scenarios where reconnecting slaves to the new master will fail (with different replication errors):

# Scenario 1

# S1mysql> CREATE DATABASE mydb;# Mmysql> CREATE DATABASE IF NOT EXISTS mydb;# Thanks to 'IF NOT EXITS', replication doesn't break on S1. Now move S2 to S1:# S2mysql> STOP SLAVE; CHANGE MASTER TO MASTER_HOST='S1'; START SLAVE;# This creates a conflict with existing data!mysql> SHOW SLAVE STATUS/G[...]Last_SQL_Errno: 1007 Last_SQL_Error: Error 'Can't create database 'mydb'; database exists' on query. Default database: 'mydb'. Query: 'CREATE DATABASE mydb'[...]
# S1

mysql>CREATEDATABASEmydb;

# M

mysql>CREATEDATABASEIFNOTEXISTSmydb;

# Thanks to 'IF NOT EXITS', replication doesn't break on S1. Now move S2 to S1:

# S2

mysql>STOPSLAVE;CHANGEMASTERTOMASTER_HOST='S1';STARTSLAVE;

# This creates a conflict with existing data!

mysql>SHOWSLAVESTATUS/G

[...]

Last_SQL_Errno:1007

Last_SQL_Error:Error'Can'tcreatedatabase'mydb';databaseexists' on query. Default database: 'mydb'. Query: 'CREATEDATABASEmydb'

[...]

# Scenario 2

# S1mysql> CREATE DATABASE mydb;# Now, we'll remove this transaction from the binary logs# S1mysql> FLUSH LOGS;mysql> PURGE BINARY LOGS TO 'mysql-bin.000008';# Mmysql> CREATE DATABASE IF NOT EXISTS mydb;# S2mysql> STOP SLAVE; CHANGE MASTER TO MASTER_HOST='S1'; START SLAVE;# The missing transaction is no longer available in the master's binary logs!mysql> SHOW SLAVE STATUS/G[...]Last_IO_Errno: 1236Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'[...]
# S1

mysql>CREATEDATABASEmydb;

# Now, we'll remove this transaction from the binary logs

# S1

mysql>FLUSHLOGS;

mysql>PURGEBINARYLOGSTO'mysql-bin.000008';

# M

mysql>CREATEDATABASEIFNOTEXISTSmydb;

# S2

mysql>STOPSLAVE;CHANGEMASTERTOMASTER_HOST='S1';STARTSLAVE;

# The missing transaction is no longer available in the master's binary logs!

mysql>SHOWSLAVESTATUS/G

[...]

Last_IO_Errno:1236

Last_IO_Error:Gotfatalerror1236frommasterwhenreadingdatafrombinarylog:'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'

[...]

As you can understand, errant transactions should be avoided with GTID-based replication. If you need to run a local transaction, your best option is to disable binary logging for that specific statement:

mysql> SET SQL_LOG_BIN = 0;mysql> # Run local transaction

mysql>SETSQL_LOG_BIN=0;

mysql># Run local transaction

Conclusion

GTIDs are a great step forward in the way we are able to reconnect replicas to other servers. But they also come with new operational challenges. If you plan to use GTIDs, make sure you correctly understand the new replication protocol, otherwise you may end up breaking replication in new and unexpected ways.

I’ll do more exploration about errant transactions in a future post.