Tuesday, September 8, 2009

How to set up mysql multi-master replication

Here's one way to implement multi-master replication in MySql for fault-tolerance. I'm assuming you have two databases with each one replicating from the other one. In principle you could extend this to more databases replicating to each other in a ring. It's important to note that the reason to do this is NOT to gain more write scalability (you actually loose write scalability since every write has to happen twice). What you gain is reliability since the different db's can be in different data centers on different networks, with different power grids, in different countries etc.

The basic problem you need to solve with multi-master replication is that there is no single db that holds locks that synchronize transactions. Two transactions on the different dbs will happily run completely unaware of each other and then try to replicate their results to each other and potentially conflict in the updates they made.

1) Set up each mysql instance to use its own unique id sequence so that inserts done on one database will replicate to the other database without conflict. Having unique IDs on each db also helps you debug problems later if something goes wrong since you can tell which db an insert was done on.

Set one db to start inserting at auto increment value 1, the other to start at 2 and then have them each increment by 10 between successive IDs. So all the IDs on the first db will be 1, 11, 21, etc and the ones on the second will be 2, 12, 22, 32 etc. You can also set server-id to the start value so that you remember which db uses which sequence of values.

Read more on how to do this at the mysql website

2) Don't use the pattern "SELECT something, operate on it in your app code and then INSERT or UPDATE". For example, to increment a counter don't read the value, increment it in your app and then write it back with an UPDATE.

Normally innodb tables will create a read lock when you do the SELECT and thus prevent any other transaction from inserting or modifying a record that would have matched the SELECT's WHERE clause. However, since transactions don't span across replication, you could have both dbs run the SELECT, not see a record and both decide to insert it. Or two different transactions on the two db's could read a value, operate on it and then write it back and conflict with each other.

Instead, you can use an INSERT ... ON DUPLICATE KEY UPDATE ... combined with a judicious choice of unique index fields to make sure that the record is only inserted once and then updated in the future. You can also move the update logic out of your application and into your UPDATE statement.

For example, if you're trying to update a counter don't SELECT it's value, if it's not there INSERT it with a zero and if it is there UPDATE it with a new value. Instead create an index on the stat's name perhaps and then use INSERT ... ON DUPLICATE KEY UPDATE to either initialize the counter with a zero value or increment it by one if the counter name already exists. Or a simple UPDATE bar SET foo = foo + 1 could be used if you know the stat record will be there since this can run on both db's and give the correct result without requiring an locks.

3) Since writes are 2x more expensive now that they have to be executed on two different db's, you may want to consider ways to remove writes from your application. Maybe you can store the data in memcache? Or, if you can't remove the write from the db, maybe you can write the data to non-replicated tables and later merge them together in a batch job?

4) Try appending to tables instead of inserting or updating values in table. INSERTs to a table where the only unique index is an auto-increment field can run without any locks (other than for the auto increment value) and so can safely replicate across db's. Maybe you can insert values from your web app running against both db's and then later have a back-end batch process merge the results together?

5) You probably can't use ORMs like sqlalchemy since you can't tell what sql they're generating and it probably makes assumptions about transaction that don't hold.

6) Make users sticky to the same db if at all possible. For example, if you're using geographic DNS to load balance across two different data centers, always send the same client subnet to the same datacenter if possible. This will reduce the chance the the user will write to one db, bounce to the other db, and not see their change or somehow make a conflicting change.

7) Try to avoid using unique values beyond auto increment values. They're impossible to implement since two users could exactly simultaneously try to create the same unique value on both dbs, succeed and then have the replication fail since the value being INSERTed or UPDATEd into the other db will already exist.

Sometimes you just can't avoid unique values. For example, if your site lets you create usernames there's a tiny chance that two users could try to create the same username at the same time. You can at least make this less likely by requiring minimum username lengths.
blog comments powered by Disqus