MySQL Replication from Production to staging to development

We've recently gone through a project with some clients whose Drupal Database was very large.  Once your Drupal database gets big, there are some typical problems that you'll encounter.  Most of the problems center around copying the Production instance on your database.  Depending on the size, making a copy can take hours, even days.  The way to remove these costly obstacles is to implement a set of chained slaves through database replication. 

Let's pause for a brief overview of what replication actually is.  Web applications like Drupal are run from one key database, The Master.   Slaves are copies of Master databases.  Replication is the act of live copying all changes that happen on the Master to a slave.  This is the simple form of database replication: live copying data from a Master databases to a copy or Slave database.

In most software design methods Production, staging, and development instances of the database are maintained, in order to support constant coding development.  Production database are sacred.  The staging database is used to do final testing of newly developed code, before it gets promoted to production.  Development is used before everything gets promoted to Staging.  If you're copying databases around, it's time consuming, and often troublesome.

The way to solve this problem, once and for all is to use replication.  Production copies it's data to a staging slave and then a development slave.  Staging and development feed testing instances within the networks where they live.

It's important to note that there's more than time saved by doing it this way.  By employing replication, the Production, Staging and development databases can live in a hosting environment properly suited for them.  There's no need, for example, to have a development server running on Amazon.  This should ideally live on an internal server, costing nothing by comparison. 

The diagram above illustrates the flow of data from Production, to slave, then development through chained slaves.  The value add is that everything works faster between sprints.

You'll find this solution will work for you too, if you employ it in this way, refreshing the stage testing and development testing instances as needed.  It should be noted, that this is not an easy setup.  It requires an advanced MySQL sys admin to properly tune and check.  Once it's setup properly though, your development can hum along nicely.




Already had a great comment from someone on MySQL professionals through LinkedIN.  It's a private group so I'll give you the short version of his warning.  This solution is error prone.   It is.  He's right.  You need a MySQL professional, if you need something like this. 

In this solution, I was never a big fan of maintaining replication to the Development Test box from Development slave, for example.  I turned it on, but only after one of the primary Leads acknowledged the weaknesses, in favor of doing it.

To stress his point about things needing a senior level DBA, google pt-table-checksum and pt-table-sync, if you're not familiar with them.  This solution requires those tools being used properly, which is not easy.

Hope to hear more.


Add new comment