From Percona Server to MySQL and back to Percona Server: beware

程序员文章站 2022-05-15 12:23:25
We're migrating some of our "vanilla" MySQL 5.5 servers to Percona Server 5.5. One of the major incentives is the crash-safe replication feature, which allows slaves to die (power failure) and resume replication without losing their position in the relay logs.

Whether or not we will migrate all our servers depends on further benchmarking; so far we've noticed unexpected results, but these are still premature to publish.

However, the fact that we are using both MySQL and Percona Server has led us into a peculiar situation which I'd like to share. We reseed our servers via LVM snapshots. If we need a new machine, or have a corrupted slave, we capture an image of a running slave and duplicate it, a process which takes the better part of a day. This duplicates not only the data, of course, but also the relay logs, the relay-log.info file and the master.info file, and with them the position within the topology.

With crash-safe replication this also means the transactional relay log position. Recap: crash-safe replication writes, per transaction, the relay log status into the ibdata1 file. So the relay log info in ibdata1 is in perfect alignment with your committed transactions. Upon server startup, Percona Server reads the info from ibdata1 and overwrites the relay-log.info file (it completely disregards whatever was in that file prior to startup).

Can you guess what could go wrong here? Here's the scenario we had; the same problem can unfold in different scenarios.

Take a look at the following topology:

[Image: replication topology]

(this image is an actual online visualization of a replication topology; for purposes of this blog it's a sandbox topology on my laptop. Please stand by for some very cool open source release announcement shortly)

We copied srv-2 (Percona Server) into srv-3 (MySQL). They both ran well. A few days later we added srv-4 as Percona Server and (I'm cutting the story short here) reseeded it from srv-3. We started srv-4. Bam! It wouldn't replicate, since it couldn't find the required master logs.

Why? It was reseeded from srv-3, which was replicating well. It took less than 24 hours to complete the process, and the master has 4 days of binary log retention. Why would the new srv-4 fail to find the required logs on the master?

The catch here is that the crash-safe replication info residing in ibdata1 was copied from srv-2 to srv-3, where it was ignored (remember, srv-3 is plain old MySQL and is ignorant of this info). This left the info on srv-3 stale; it never got updated. Not only was it stale, it was also out of sync with srv-3's execution. But when the data was copied to srv-4, the crash-safe replication info was copied along, and srv-4 was happy to read this info upon startup and use it to overwrite the perfectly valid relay-log.info file. By that time the master had long since purged the binary logs indicated in the newly rewritten relay-log.info file.
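A mismatch like this can be spotted before replication is started. The following is a minimal sketch, not what we actually ran: it assumes the four-line relay-log.info layout used by MySQL 5.5 (relay log name, relay log position, master log name, master log position), and it assumes you have saved the master's binary log names to a file, one per line. Both function names are hypothetical.

```shell
#!/bin/sh
# Sketch under stated assumptions; the function names are hypothetical.

# Print the master binary log name recorded in a relay-log.info file.
# Assumes the 5.5-style layout, where line 3 holds the master log name.
master_log_from_relay_info() {
    sed -n '3p' "$1"
}

# Succeed if the log name ($1) appears in a file ($2) that lists the
# master's binary logs, one name per line (for example the first column
# of SHOW BINARY LOGS, saved by hand).
master_log_available() {
    grep -Fxq -- "$1" "$2"
}
```

One way to use it: read the log name with master_log_from_relay_info "$datadir/relay-log.info", then pass it to master_log_available together with the saved list. A non-zero exit status means the slave is pointing at a log the master no longer has.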

In some respects we were lucky, because this gave us immediate feedback and insight into what went wrong. Had replication found the logs on the master, it would probably have run for a while and then crashed on some Duplicate Key error, at which point the origin of the problem would have been much more difficult to track down.

Now that we are aware of the problem, we are more careful: you need to be careful once for each newly reseeded Percona Server instance, upon startup. We've added the following line to our /etc/init.d/mysql script, just before starting the server:

cp $datadir/relay-log.info $datadir/relay-log.info.pre-start

When we start a Percona Server for the first time we make sure to reset relay-log.info using relay-log.info.pre-start. We then go on with our lives. Until such time as all of our topology is composed of Percona Server, we have one more thing to be careful about.
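The file handling for these two steps can be sketched as follows. This is an illustration only: the function names are hypothetical, and the sketch does not cover when the server or the slave threads need to be stopped before the file is put back. That is left to the operator.

```shell
#!/bin/sh
# Sketch under stated assumptions; the function names are hypothetical.

# Before the server starts: keep a copy of relay-log.info as it was
# in the snapshot.
save_relay_info() {
    datadir=$1
    if [ -f "$datadir/relay-log.info" ]; then
        cp -p "$datadir/relay-log.info" "$datadir/relay-log.info.pre-start"
    fi
}

# After the first start of a reseeded instance: if relay-log.info was
# rewritten, put the saved copy back. Returns non-zero if no saved copy exists.
restore_relay_info() {
    datadir=$1
    saved="$datadir/relay-log.info.pre-start"
    [ -f "$saved" ] || return 1
    if cmp -s "$saved" "$datadir/relay-log.info"; then
        return 0    # file is unchanged, nothing to restore
    fi
    cp -p "$saved" "$datadir/relay-log.info"
}
```

Comparing the two files first also tells you whether the startup overwrite actually happened, which is worth knowing in itself.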