Replication Documentation

Overview
Things Never to Do
Issues in Replication
How Do I Delete Records?
How the Programs Work
Important Things to Be Aware Of
Local Records
What the System Tables Are
What the Replication Variables Are
Programs and Directories
What the Programs Do
Known Limitations
Wishlist

Overview

This replication software, which is is released to the public domain by the author without any warranty whatsoever, is designed to do two-way replication of mysql databases. It is written in PHP, and could be adopted to two-way replication of other databases relatively easier. The "gotcha:" this software is not transparent and will not work with arbitrary databases The databases and the software that uses them must be built with replication in mind. This guide explains how to build and use a replicated database step-by-step.

Create a Replicable Database

To create a replicable database, you must create a database with special system tables to track the replication process, and those tables you wish to replicate must have special fields to keep track of information for replication. The utility "createreplicable.php3" creates a new database, overwriting any existing database with the same name, adds the systems tables, and defines a variable $repsql which provides the fields needed for a replicable table. You should write your own PHP program to create the tables you need for your database. Define the variables $connect, $user, $pass and $db, then include the files "createreplicable.php3". Following this, create your tables. You may use two variables defined in createreplicable.php3: $conn which is the connection to the new database, and $repsql which is used to help creating replicable tables. If you do not wish a table to be replicable, do nothing special. If you wish to define a replicable table, use the sql:

"CREATE TABLE name_of_your_table (" . $repsql . ", your_field your_field type, etc....)"

For the record

$repsql = "rep_db int default 0, id bigint PRIMARY KEY, rep_modified timestamp, rep_conflict enum('y', 'n', 'l') default 'n', deleted enum('y', 'n') default 'n', INDEX(rep_conflict, deleted), INDEX(rep_modified)";

Obviously since, $repsql defines a primary key, you should not define one of your own, nor should you use any of the field names used by $repsql. From the user point of view, the key variables are "id", "rep_db" and "deleted".

After you define the table, you must make it known to the database as a replicated table. You do this by adding the name of the table to the system table that tracks all replicated tables, "rep_tables".

"INSERT INTO rep_tables (name) VALUES ('name_of_your_table')"

Populate Your Database

Once the database is created, you should populate it with data. The only thing you need to be aware of is that when you insert records into a replicated table, you must generate the primary key "id" in a special way. You can do this by including the file "getkey.php3", which defines a single function "getkey($conn)" where the argument $conn is a valid PHP connection to your database. It return a number that you can use as a key.

"INSERT INTO name_of_your_table (id, your_fields) VALUES (" . getkey($conn) . ", values_of_your_fields)"

Make a Replica

Now you run "makereplica.php3" to create a replica of your database.

Update Your Databases

You may now replicate changes between your databases by running "update.php3". This program is strictly one way: there is a source database and a target database. No data is ever changed in the source database. If you want to do two way replication, run the program twice, with the source and target interchanged.

Things Never to Do

Never make a copy of a replicated database. Make a new replica using "makereplica.php3".
Never delete a record from a replicated table.
When updating a record in a replicated table ALWAYS set rep_db = 0.
Remember to use getkey (or an equivalent) to generate new keys for replicated table.

Issues in Replication

There are two basic types of replication. One way replication assumes that one database is a master copy and receives all transactions. Other slave databases mirror the master, and are used as mirrors or for backup. Mysql supports one-way replication directly. A simple and reliable method of one-way replication is simply to copy the entire database. However, as databases can be quite large this can be costly. So replication programs attempt to identify which records have been changed and copy only the changes.

In two way replication all replicas may receive transactions. This poses two problems:

Key generation: ordinarily records are identified by a unique key. Typically these keys are generated by autonumbering - each new record receives a key value one larger than the previous record. However, this scheme will not work with replicated databases: if there are two identical replicas each with one hundred records, if each replica has a new record added to it, both get the same key "101", even though the records are intended to be different. Hence the use of "getkey" to generate keys. Each database has a unique ID. Keys for that database always contain that id imbedded in the key, so that keys added in two different replicas are always different. The "getkey" utility simply reads the database to determine the next available key for that particular replica.
Conflicts: two identical records may be modified in different ways in different replicas before the update program is run. This generates a conflict: the program must somehow decide which record is correct. Here the solution is to keep the most recent record, but also make a copy of the older record so that it can be restored by hand, if it turns out it was the one wanted.
Deletes: if a record exists in one replica but not the other, is this because it was added to one, or deleted from the other? The only way to tell is to keep track of which records have been deleted. Hence the rule: never delete a record from a replicated table.

How Do I Delete A Record?

If you are not allowed to delete records from a replicated table, what do you do with records that you want to get rid of? Answer: update the record and set deleted = 'y'. Specifically, to delete record "id" use the sql

"UPDATE name_of_your_table SET rep_db = 0, deleted = 'y' WHERE id = " . $id

Please do not forget the where clause. The biggest source of programmer errors I make is forgetting the where clause and accidentally deleting or otherwise modifying all of the records in the table.

This has the following implication for SELECT queries if you do not want to see the deleted records: when you SELECT, you should generally include in yhour WHERE clause "WHERE deleted = 'n' AND rep_conflict <> 'y'" (presuming you do not want to see duplicate records generated by conflicts).

This strategy of never deleting a record might seem expensive, since the record might be quite long and take up a lot of space in the database. For this reason, there is a system table called "rep_graveyard" where old records go to die. If you run the utility program "undertaker.php3", every night, for example, it will delete the records marked deleted = 'y' from every replicated table and record the essential system information in the "rep_graveyard" file so the the update routine can continue to work correctly. So: do not delete records, mark them deleted and let the undertaker do the actual deleting. Notice that every record ever inserted into a replicated table will always be represented at least by a stub in the graveyard. This is unavoidable overhead if replication is to be reliable.

How the Programs Work

Each time you insert or update a record, we must keep track of which database the change was made in, and the time at which the change occured. The field rep_db identifies the database, the field rep_modified the modification time. For programming simplicity, the current database is always identified as "0". Hence the rule that when you update a record, you should always set rep_db = 0. You do not need to do this when you insert a record (although you can), because 0 is the default value for rep_db. You do not need to explicitly set rep_modified, because it is a timestamp, and it is the first timestamp in the table, so Mysql always sets it equal to the current time unless you explictly set it, which you should never do.

In addition to tracking changes, the update program tracks the updates. There is a system table rep_dbs which has a list of every known replica, and the latest time at which all records from that replica have been received by the current database. In particular, at the end of "update.php3" the field "updated_from" is updated. Notice that if the program fails, the field will not be updated, since we do not know in that case that all records have been received.

Whenever "update.php3" runs, it iterates through each replica in turn. Suppose that we are currently retrieving records from replica 2, when the source database is replica 3 and the target database is replica 4. Then the update program asks replica 3 to provide it with all records identified as rep_db = 2 with modification dates on or after the "updated_from" in replica 4 for replica 2. If the records are successfully retrieved, the "updated_from" in the target database 4 to the value in the source database 3 (if it is later than the existing "updated_from"), since database 4 now knows the records that database 3 knew.

This type of tracking makes it fairly easy to identify when there is no conflict. If a database knew about a particular record prior to changing it there is no conflict. So in particular, if update retrieves a record from database 2 with modification time 37, and the target database 4's "updated_from" is 32 (that is, less than 37), then database 4 knew about the record from database 2, and the current record in database 4 should supersede the record from database 2. For safety when there is a conflict, the most recent record is chosen, but the old record is backed up, and marked rep_conflict = 'y'.

Important Things to Be Aware Of

Problems with the Clock

The replication program assumes that a records modified in a particular replica (with the same value of rep_db) with a later rep_modified timestamp have actually been modified later than records with earlier timestamps. However, this need not be true for several reasons:

Different computers may have their clocks set differently. If you move a replica from one system to another this can cause the time to change. It is generally better if you want to move a database to a different system, to make a replica and move the replica, so that all transactions in that replica take place on the same system.
Even on a single computer, the clock may go backwards. For example, someone could come along an reset it. Or an automatic program updating the time from the internet could set the clock back. This is not likely to be a terrible problem, since the change is likely to be less than a second. Finally, the switch from daylight savings time to standard time will set the clock back by an hour.

Since the clock may move the wrong direction, it is useful to identify what problems this may cause. First, the update program retrieves only records modified since the last (successful) retrieval. This could miss records altogether if the clock was set back. To guard against this, each database has a value "time_adjust" which default to 1.5 hours. When records are retrieved, not only are records since the last update time retrieved, but all records modified during the preceeding hour and a half are retrieved as well. This adds a little work, since probably those records have been updated already, but is safe, since retrieving the records a second time does not harm. If you know that the clock has been set back by more than an hour and a half, you can change the value of this variable to collect the extra records.

Second, the system may fail to identify records as having conflicts, because it believes that one database knew about a record, when in fact the modification time of the record is wrong. This is a known limitation of the system, but should not occur frequently.

The best practice: if you know the clock has been set back by a substantial amount of time, replace the database with a new replica.

How to Make Backups, and Why You Should NOT Copy Replicated Databases

Recall that every database has a unique id that is used to generate keys, as well as keeping track of which records have been modified in which database. Moreover, records modified in the current database are identified as "0" to ease programming. Hence the program that creates new replicas first copies the original database, then creates a new id for the new database, then stamps all the rep_db = 0 records with the actual id of the database that is being replicated.

What happens if you have two databases with the same id? Lots of bad things: first, new records will entered in the two databases with duplicate keys. Second, records modified in one database will be confused with those modified in another. You might think it would be safe to delete a database, and replace it with another database with the same id, but this will also confuse the replication system: if the database deleted has transactions since it was duplicated, and some of those transactions have made their way into other replicas, the update program will be confused to find records with later dates supposedly generated by a database that does not contain those records.

The most brutal error, of trying to replicate directly from one database to another replica with the same id, has been blocked: "update.php3" checks the ids and if it finds they are the same, it stops and does not run. However, if you replicate indirectly, the program cannot catch the problem: for example, replica id 1 is on machine A, replica id 2 on machine B and replica id 1 on machine C. If you replicate between A and B and B and C, but not A and C, you will get all sorts of errors.

The solution: do not copy, replicate. That is, if you want to back up a database, do not copy make a replica. Whenever you move a replica to a new location, always delete the old copy.

What if you somehow have to delete a database with a particular id, and the copy you have available for backup has the same id (that is, you disregarded the instructions above). Is there anything safe to do? Yes: make a replica of the copy, and replace the original with the replica. This is completely safe. So again the rule: when in doubt replicate.

Note: when a new replica is created, the conflict records are retained, so they are not lost if the replica is used as a backup.

Local Records

If you set rep_conflict = 'l' (that is the lower case latin letter L, not the number 1), the record will be excluded from replication. For example, I have a calendar database that automatically adds holidays to the calendar, and marks them as externally generated. Periodically, it deletes all these records, and regenerates them. This generates a lot of adds and deletes, and can be done separately on each replica, so there is little reason to waste time replicating the records. So I mark all these records as local, and they do not get replicated.

What the System Tables Are

rep_dbs contains a list of all known replicas. Fields:

rep_db primary key, the unique identifier of that particular replica
updated_from, the time at which records from rep_db are known to have been recieved by the current database
current enum('y', 'n'), one and only one record in rep_dbs should be marked 'y', the rest 'n'. The record marked 'y' identifies the current replica
priority int default 100, priorities for resolving conflicts, not currently implemented
time_adjust int default 5400, time (in seconds) prior to updated_from time for which records should actually be recieved

rep_tables a list of all tables which are to be replicated

rep_table, identifies the table by number, used by other system tables, should be the same in all replicas, generated automatically through autoincrement
name varchar(255) NOT NULL UNIQUE INDEX, the name of the table: do not use reserved keywords please

rep_gen a table that should contain one and only one record, used for generating ids

gen, last used key
rep_time timestamp, a timestamp that can be used to find the system time (it is better to do "SELECT now()" but I did not know this when I wrote the code)
rep_code, a unique code generated by createreplicable that should be the same in every replica. If you make design changes, you should change this code so you do not try to replicate between databases with different designs
rep_lock, used to make sure only one replication program uses the database at a time
rep_lock_time, time at which the lock was obtained
rep_lock_max default 3600, length of time in seconds after rep_lock_time at which it is assumed the lock is released

rep_graveyard holds records that have been deleted

rep_db, database in which the record was deleted
rep_table, table from which it was deleted, refers to rep_table in rep_tables
id, id of the deleted record
rep_modified, time at which the record was marked deleted

rep_conflicts used to keep track of conflict records

rep_table, table which has the conflict
id, id of the record with rep_conflict = 'n', that is the currently used version of the record
conflict_id, id of the record with rep_conflict = 'y', that is the alternate version of the record

rep_errors log of errors discovered during the replication process

error_id, key identifies the error
rep_table, table in which the error occured
id, id of the record with the error
error_time, time of error
is_warning, is this a warning or an actual error?
error_description, verbal description of the error

What the Replication Variables Are

rep_db int default 0, identifies the replica in which the modification took place
id bigint PRIMARY KEY
rep_modified timestamp, identifies the time at which the record was modified
rep_conflict enum('y', 'n', 'l') default 'n', identifies whether the record has a conflict or is local. conflict and local records are not replicated.
deleted enum('y', 'n') default 'n'

Programs and Directories

The distribution consists of a main directory and a subdirectory called win32. The main directory contains subroutines designed to be included in PHP scripts. The win32 subdirectory has the the NT executables, including DLLs and ini files for PHP. Please: Regardless of what operating system you use, make sure to turn "magic-quotes" off in PHP since this escapes characters in a not terribly useful way. Instead use the fxs() function in fixsql.php3 to fix your strings so you can use them as sql literals. Also set the max_execution_time = 0 to keep your scripts from timing out.
You should create a master directory that will contain your replication scripts and the shell scripts that invoke the php interpreter. You will probably want to put the interpreter and its configuration file in this directory as well. The main directory from this distribution should be a subdirectory of the master directory called "replication".

The utility programs "getkey.php3", "getbasekey.php3" and "fixsql.php3" should be copied to the master directory.

Samples of programs can be found in the "win32" directory; to use them first copy them to your master directory.

The programs are commented.

What the Programs Do

createreplicable.php3, creates a new replicable database
dbcopy.php3, copies a database safely, locking all tables before copying: used by makereplica.php3 - you should not use it for reasons given above
dbinspect.php3, shows a list of databases, checks tables in a replicated database for errors, displays the error log, displays conflict and has options for deleting or accepting conflict records
dblock.php3, locks and unlocks all tables in a database, run this after creating or updating a backup, and before making a copy (for example to tape or cd-rw) to make sure the database is consistent
fixsql.php3, contains a single function fxs($string) returning a string that is safe for use as an sql literal by escaping the required characters
getbasekey.php3, gets a database id for a new replica, should be used by replication routines only
getkey.php3, contains a single function getkey($conn) generating a new key for the database on $conn
makereplica.php3, makes a replica of a database
markreplica.php3, takes a database, preferably copied by dbcopy.php3, and gives it a new id and makes the other changes needed to make it a new replica: used by makereplica.php3 - you should not use it.
optimize.php3, optimizes all the tables in a database
repair.php3, repairs all the tables in a database
undertaker.php3, repairs the database and puts records marked as deleted into the graveyard; set $norep to nonzero to skip the database repair
undertakerf.php3, contains all functions needed for undertaker.php3
update.php3, one way replication from source to target. Changes only the target. To use for two-way replication, just run it in both directions.
updatef.php3, contains functions needed for update.php3

Common Information About Scripts

All scripts are designed so that they can be included multiple times and so that they can run from any directory. All php include paths are relative to the directory in which the initial php script originally loaded. That is, if a webserver loads a script, or you run php standalong against a script, called lochou.php3 in directory foo, then the include statement
include "z.php3"; include "dir/z.php3"; include "/top/z.php3";
refers to "foo/z.php3"; "foo/dir/z.php3"; "/top/z.php3"
in lochou.php3 and in every script included or required in locou.php3, regardless of where those scripts are actually located. The scripts createreplicable.php3, makereplica.php3, markreplica.php3, update.php3 and undertaker.php3 all "include" other scripts from the replication directory. To make this work, they must know where the originally loaded script in which they are directly or indirectly included resides, so that they can construct the correct path to the included files. The default is that this "originally loaded" script is located in the parent directory of the replication directory. However, this can be overridden by setting the variable $relative. This variable should be set to the relative path from the "originally loaded script" to the parent directory of the "replication directory." For example, if the "originally loaded script" is a "sibling directory" residing under the same parent as the "replication directory" set $relative = "../" which is the path to the common parent directory. Or you can set $relative to the absolute path of the parent directory of the "replication directory." Note that you should always use "/" in directories and never "\" since "/" will work both under unix and windows, but "\" will not work under unix.

PHP also has the property that functions cannot be loaded more than once. That is, if an included file contains a function definition and the file is included more than once (for example to update more than one database), PHP will terminate with an error. To avoid this problem, markreplica.php3, update.php3 and undertaker.php3 place all functions in a separate file, and use a flag to make sure that file only gets loaded once. update.php3 and undertaker.php3 load specialized functions in updatef.php3 and undertakerf.php3 and the lock variable is "$rep_loaded_functions_u" and "$rep_loaded_functions" respectively. All routines use the "$includeexclude" flag to determine whether the pair of files "fixsql.php3" and "getkey.php3" have been loaded; since these functions are commonly loaded by other routines, after loading them you should also set $includeexclude=true (or any nonzero value) to prevent them from being loaded again. In addition, markreplica.php3 uses the flag $includeexclude_base to determine if the getbasekey.php3 file has been loaded.

Known Limitations

NULL fields of type char, varchar, text and blob will be replicated as zero length strings, not as NULLs