MySQL replication is based on the master server keeping track of all changes to your databases (updates, deletes, and so on) in its binary logs. Therefore, to use replication, you must enable binary logging on the master server. See Section 5.2.3, “The Binary Log”.
Each slave server receives from the master the saved updates that the master has recorded in its binary log, so that the slave can execute the same updates on its copy of the data.
It is extremely important to realize that the binary log is simply a record starting from the fixed point in time at which you enable binary logging. Any slaves that you set up need copies of the databases on your master as they existed at the moment you enabled binary logging on the master. If you start your slaves with databases that are not in the same state as those on the master when the binary log was started, your slaves are quite likely to fail.
After the slave has been set up with a copy of the master's data, it
connects to the master and waits for updates to process. If the
master fails, or the slave loses connectivity with your master, the
slave keeps trying to connect periodically until it is able to
resume listening for updates. The
--master-connect-retry option controls the retry
interval. The default is 60 seconds.
Each slave keeps track of where it left off when it last read from its master server. The master has no knowledge of how many slaves it has or which ones are up to date at any given time.
MySQL replication capabilities are implemented using three threads
(one on the master server and two on the slave). When a
START SLAVE statement is issued on a slave
server, the slave creates an I/O thread, which connects to the
master and asks it to send the updates recorded in its binary
logs. The master creates a thread to send the binary log contents
to the slave. This thread can be identified as the Binlog
Dump thread in the output of SHOW
PROCESSLIST on the master. The slave I/O thread reads
the updates that the master Binlog Dump thread
sends and copies them to local files, known as relay
logs, in the slave's data directory. The third thread
is the SQL thread, which the slave creates to read the relay logs
and to execute the updates they contain.
In the preceding description, there are three threads per master/slave connection. A master that has multiple slaves creates one thread for each currently-connected slave, and each slave has its own I/O and SQL threads.
The slave uses two threads so that reading updates from the master and executing them can be separated into two independent tasks. Thus, the task of reading statements is not slowed down if statement execution is slow. For example, if the slave server has not been running for a while, its I/O thread can quickly fetch all the binary log contents from the master when the slave starts, even if the SQL thread lags far behind. If the slave stops before the SQL thread has executed all the fetched statements, the I/O thread has at least fetched everything so that a safe copy of the statements is stored locally in the slave's relay logs, ready for execution the next time that the slave starts. This enables the master server to purge its binary logs sooner because it no longer needs to wait for the slave to fetch their contents.
The SHOW PROCESSLIST statement provides
information that tells you what is happening on the master and on
the slave regarding replication. See
Section 7.5.5, “Examining Thread Information”, for descriptions of all
replication-related states.
The following example illustrates how the three threads show up in
the output from SHOW PROCESSLIST.
On the master server, the output from SHOW
PROCESSLIST looks like this:
mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
Id: 2
User: root
Host: localhost:32931
db: NULL
Command: Binlog Dump
Time: 94
State: Has sent all binlog to slave; waiting for binlog to be updated
Info: NULL
Here, thread 2 is a Binlog Dump replication
thread for a connected slave. The State
information indicates that all outstanding updates have been sent
to the slave and that the master is waiting for more updates to
occur. If you see no Binlog Dump threads on a
master server, this means that replication is not running —
that is, that no slaves are currently connected.
On the slave server, the output from SHOW
PROCESSLIST looks like this:
mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
Id: 10
User: system user
Host:
db: NULL
Command: Connect
Time: 11
State: Waiting for master to send event
Info: NULL
*************************** 2. row ***************************
Id: 11
User: system user
Host:
db: NULL
Command: Connect
Time: 11
State: Has read all relay log; waiting for the slave I/O thread to update it
Info: NULL
This information indicates that thread 10 is the I/O thread that
is communicating with the master server, and thread 11 is the SQL
thread that is processing the updates stored in the relay logs. At
the time that the SHOW PROCESSLIST was run,
both threads were idle, waiting for further updates.
The value in the Time column can show how late
the slave is compared to the master. See
Section 15.3.4, “Replication FAQ”.
By default, relay log filenames have the form
host_name-relay-bin.nnnnnn,
where host_name is the name of the
slave server host and nnnnnn is a
sequence number. Successive relay log files are created using
successive sequence numbers, beginning with
000001. The slave uses an index file to track
the relay log files currently in use. The default relay log index
filename is
host_name-relay-bin.index.
By default, the slave server creates relay log files in its data
directory. The default filenames can be overridden with the
--relay-log and
--relay-log-index server options. See
Section 15.1.2, “Replication Startup Options and Variables”.
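The default naming scheme (host_name-relay-bin.nnnnnn for the log files and host_name-relay-bin.index for the index) can be illustrated with a short Python sketch. The function names and the host name slave1 are hypothetical and used only for illustration; the server generates these names itself.

```python
def relay_log_names(host_name, first=1, count=3):
    """Return default-style relay log file names for a slave host.

    Sequence numbers are zero-padded to six digits and start at 000001.
    """
    return [f"{host_name}-relay-bin.{n:06d}" for n in range(first, first + count)]


def relay_log_index_name(host_name):
    """Return the default-style relay log index file name."""
    return f"{host_name}-relay-bin.index"


if __name__ == "__main__":
    for name in relay_log_names("slave1"):
        print(name)
    print(relay_log_index_name("slave1"))
```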
Relay logs have the same format as binary logs and can be read
using mysqlbinlog. The SQL thread automatically
deletes each relay log file as soon as it has executed all events
in the file and no longer needs it. There is no explicit mechanism
for deleting relay logs because the SQL thread takes care of doing
so. However, FLUSH LOGS rotates relay logs,
which influences when the SQL thread deletes them.
A slave server creates a new relay log file under the following conditions:
Each time the I/O thread starts.
When the logs are flushed; for example, with FLUSH
LOGS or mysqladmin flush-logs.
When the size of the current relay log file becomes too large. The meaning of “too large” is determined as follows:
If the value of max_relay_log_size is
greater than 0, that is the maximum relay log file size.
If the value of max_relay_log_size is
0, max_binlog_size determines the
maximum relay log file size.
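The size rule above amounts to a simple fallback, which the following Python sketch restates. The function name is hypothetical, and the numeric values in the usage lines are arbitrary examples rather than recommended settings.

```python
def effective_max_relay_log_size(max_relay_log_size, max_binlog_size):
    """Return the size limit that triggers relay log rotation.

    A max_relay_log_size greater than 0 is used directly; a value of 0
    means that max_binlog_size applies instead.
    """
    if max_relay_log_size > 0:
        return max_relay_log_size
    return max_binlog_size


if __name__ == "__main__":
    # max_relay_log_size is set, so it wins.
    print(effective_max_relay_log_size(104857600, 1073741824))
    # max_relay_log_size is 0, so max_binlog_size is used.
    print(effective_max_relay_log_size(0, 1073741824))
```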
A slave replication server creates two additional small files in
the data directory. These status files are
named master.info and
relay-log.info by default. Their names can be
changed by using the --master-info-file and
--relay-log-info-file options. See
Section 15.1.2, “Replication Startup Options and Variables”.
The two status files contain information like that shown in the
output of the SHOW SLAVE STATUS statement,
which is discussed in Section 12.6.2, “SQL Statements for Controlling Slave Servers”.
Because the status files are stored on disk, they survive a slave
server's shutdown. The next time the slave starts up, it reads the
two files to determine how far it has proceeded in reading binary
logs from the master and in processing its own relay logs.
The I/O thread updates the master.info file.
The following table shows the correspondence between the lines in
the file and the columns displayed by SHOW SLAVE
STATUS.
| Line | Description |
| 1 | Number of lines in the file |
| 2 | Master_Log_File |
| 3 | Read_Master_Log_Pos |
| 4 | Master_Host |
| 5 | Master_User |
| 6 | Password (not shown by SHOW SLAVE STATUS) |
| 7 | Master_Port |
| 8 | Connect_Retry |
| 9 | Master_SSL_Allowed |
| 10 | Master_SSL_CA_File |
| 11 | Master_SSL_CA_Path |
| 12 | Master_SSL_Cert |
| 13 | Master_SSL_Cipher |
| 14 | Master_SSL_Key |
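As an illustration of the table above, the following Python sketch maps the lines of a master.info file to the corresponding column names. It assumes the 14-line layout shown in the table; the keys line_count and password are hypothetical labels for the two lines that have no SHOW SLAVE STATUS column, and the sample file contents are invented.

```python
# Field names in file order, following the table above.
MASTER_INFO_FIELDS = [
    "line_count",           # line 1: number of lines in the file
    "Master_Log_File",
    "Read_Master_Log_Pos",
    "Master_Host",
    "Master_User",
    "password",             # line 6: not shown by SHOW SLAVE STATUS
    "Master_Port",
    "Connect_Retry",
    "Master_SSL_Allowed",
    "Master_SSL_CA_File",
    "Master_SSL_CA_Path",
    "Master_SSL_Cert",
    "Master_SSL_Cipher",
    "Master_SSL_Key",
]


def parse_master_info(text):
    """Map each line of master.info text to its field name (values stay strings)."""
    return dict(zip(MASTER_INFO_FIELDS, text.splitlines()))


if __name__ == "__main__":
    # Invented sample contents; the five SSL lines are empty.
    sample = "\n".join(
        ["14", "binlog.000003", "98", "master.example.com", "repl",
         "secret", "3306", "60", "0", "", "", "", "", ""]
    ) + "\n"
    info = parse_master_info(sample)
    print(info["Master_Log_File"], info["Read_Master_Log_Pos"])
```

On a running slave, SHOW SLAVE STATUS remains the better source for these values; reading the file directly is mainly useful when the server is stopped.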
The SQL thread updates the relay-log.info
file. The following table shows the correspondence between the
lines in the file and the columns displayed by SHOW SLAVE
STATUS.
| Line | Description |
| 1 | Relay_Log_File |
| 2 | Relay_Log_Pos |
| 3 | Relay_Master_Log_File |
| 4 | Exec_Master_Log_Pos |
The contents of the relay-log.info file and
the values shown by the SHOW SLAVE STATUS
statement may not match if the relay-log.info
file has not been flushed to disk. Ideally, you should view
relay-log.info only on a slave that is offline
(that is, when mysqld is not running). For a running
system, use SHOW SLAVE STATUS instead.
When you back up the slave's data, you should back up these two
status files as well, along with the relay log files. They are
needed to resume replication after you restore the slave's data.
If you lose the relay logs but still have the
relay-log.info file, you can check it to
determine how far the SQL thread has executed in the master binary
logs. Then you can use CHANGE MASTER TO with
the MASTER_LOG_FILE and
MASTER_LOG_POS options to tell the slave to
re-read the binary logs from that point. Of course, this requires
that the binary logs still exist on the master server.
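The recovery step just described can be sketched in Python. The sketch assumes the four-line relay-log.info layout (Relay_Log_File, Relay_Log_Pos, Relay_Master_Log_File, Exec_Master_Log_Pos) and only builds the statement text; the function name and the sample file contents are hypothetical.

```python
def change_master_statement(relay_log_info_text):
    """Build a CHANGE MASTER TO statement from relay-log.info contents.

    Line 3 holds Relay_Master_Log_File and line 4 holds
    Exec_Master_Log_Pos, that is, the point in the master's binary logs
    up to which the SQL thread has executed.
    """
    lines = relay_log_info_text.splitlines()
    if len(lines) < 4:
        raise ValueError("expected at least 4 lines in relay-log.info")
    master_log_file = lines[2]
    master_log_pos = int(lines[3])
    # Refuse names that would need quoting rather than trying to escape them.
    if "'" in master_log_file or "\\" in master_log_file:
        raise ValueError("unexpected character in binary log file name")
    return (
        f"CHANGE MASTER TO MASTER_LOG_FILE='{master_log_file}', "
        f"MASTER_LOG_POS={master_log_pos};"
    )


if __name__ == "__main__":
    # Invented sample contents of relay-log.info.
    sample = "./slave1-relay-bin.000004\n235\nbinlog.000003\n1045\n"
    print(change_master_statement(sample))
```

The generated statement would be run on the slave while the slave threads are stopped, and it helps only if the named binary log still exists on the master.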
If your slave replicates LOAD DATA
INFILE statements, you should also back up any
SQL_LOAD-* files that exist in the directory
that the slave uses for this purpose. The slave needs these files
to resume replication of any interrupted LOAD DATA
INFILE operations. The directory location is specified
using the --slave-load-tmpdir option. If this
option is not specified, the directory location is the value of
the tmpdir system variable.
If a master server does not write a statement to its binary log, the statement is not replicated. If the server does log the statement, the statement is sent to all slaves and each slave determines whether to execute it or ignore it.
On the master side, decisions about which statements to log are
based on the --binlog-do-db and
--binlog-ignore-db options that control binary
logging. For a description of the rules that servers use in
evaluating these options, see Section 5.2.3, “The Binary Log”.
On the slave side, decisions about whether to execute or ignore
statements received from the master are made according to the
--replicate-* options that the slave was started
with. (See Section 15.1.2, “Replication Startup Options and Variables”.) The slave
evaluates these options using the following procedure, which first
checks the database-level options and then the table-level
options.
In the simplest case, when there are no
--replicate-* options, the procedure yields the
result that the slave executes all statements that it receives
from the master. Otherwise, the result depends on the particular
options given. In general, to make it easier to determine what
effect an option set will have, it is recommended that you avoid
mixing “do” and “ignore” options, or
wildcard and non-wildcard options.
Stage 1. Check the database options.
At this stage, the slave checks whether there are any
--replicate-do-db or
--replicate-ignore-db options that specify
database-specific conditions:
No: Permit the statement and proceed to the table-checking stage.
Yes: Test the options using the same
rules as for the --binlog-do-db and
--binlog-ignore-db options to determine
whether to permit or ignore the statement. What is the result
of the test?
Permit: Do not execute the statement immediately. Defer the decision and proceed to the table-checking stage.
Ignore: Ignore the statement and exit.
This stage can permit a statement for further option-checking, or cause it to be ignored. However, statements that are permitted at this stage are not actually executed yet. Instead, they pass to the following stage that checks the table options.
Stage 2. Check the table options.
First, as a preliminary condition, the slave checks whether the
statement occurs within a stored function or (prior to MySQL
5.0.12) a stored procedure. If so, execute the statement and exit.
(Stored procedures are exempt from this test as of MySQL 5.0.12
because procedure logging occurs at the level of statements that
are executed within the routine rather than at the
CALL level.)
Next, the slave checks for table options and evaluates them. If
the server reaches this point, it executes all statements if there
are no table options. If there are “do” table
options, the statement must match one of them if it is to be
executed; otherwise, it is ignored. If there are any
“ignore” options, all statements are executed except
those that match any ignore option. The
following steps describe how this evaluation occurs in more
detail.
Are there any --replicate-*-table options?
No: There are no table restrictions, so all statements match. Execute the statement and exit.
Yes: There are table restrictions.
Evaluate the tables to be updated against them. There
might be multiple tables to update, so loop through the
following steps for each table looking for a matching
option (first the non-wild options, and then the wild
options). Only tables that are to be updated are compared
to the options. For example, if the statement is
INSERT INTO sales SELECT * FROM prices,
only sales is compared to the options.
If several tables are to be updated (multiple-table
statement), the first table that matches “do”
or “ignore” wins. That is, the server checks
the first table against the options. If no decision could
be made, it checks the second table against the options,
and so on.
Are there any --replicate-do-table options?
No: Proceed to the next step.
Yes: Does the table match any of them?
No: Proceed to the next step.
Yes: Execute the statement and exit.
Are there any --replicate-ignore-table
options?
No: Proceed to the next step.
Yes: Does the table match any of them?
No: Proceed to the next step.
Yes: Ignore the statement and exit.
Are there any --replicate-wild-do-table
options?
No: Proceed to the next step.
Yes: Does the table match any of them?
No: Proceed to the next step.
Yes: Execute the statement and exit.
Are there any --replicate-wild-ignore-table
options?
No: Proceed to the next step.
Yes: Does the table match any of them?
No: Proceed to the next step.
Yes: Ignore the statement and exit.
No --replicate-*-table option was matched. Is
there another table to test against these options?
No: We have now tested all tables to
be updated and could not match any option. Are there
--replicate-do-table or
--replicate-wild-do-table options?
No: There were no “do” table options, so no explicit “do” match is required. Execute the statement and exit.
Yes: There were “do” table options, so the statement is executed only with an explicit match to one of them. Ignore the statement and exit.
Yes: Loop.
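The table-checking steps above can be summarized in a short Python sketch. This is an illustrative model, not the server's implementation: the function and parameter names are hypothetical, tables are given as db_name.tbl_name strings, the preliminary stored-function check is omitted, and wildcard matching is simplified to the % and _ pattern characters with no escape handling.

```python
import re


def _wild_match(pattern, name):
    """Simplified wildcard match: % matches any sequence, _ any one character."""
    regex = "".join(
        ".*" if c == "%" else "." if c == "_" else re.escape(c) for c in pattern
    )
    return re.fullmatch(regex, name) is not None


def table_stage(tables, do=(), ignore=(), wild_do=(), wild_ignore=()):
    """Return True to execute the statement, False to ignore it.

    tables lists only the tables the statement updates, as 'db.tbl'.
    """
    if not (do or ignore or wild_do or wild_ignore):
        return True  # no table options: execute
    for table in tables:  # the first table that matches an option wins
        if table in do:
            return True
        if table in ignore:
            return False
        if any(_wild_match(p, table) for p in wild_do):
            return True
        if any(_wild_match(p, table) for p in wild_ignore):
            return False
    # No option matched any table: execute only if there were no "do" options.
    return not (do or wild_do)


if __name__ == "__main__":
    # A "do" option exists but does not match, so the statement is ignored.
    print(table_stage(["db1.mytbl1"], do=["db2.mytbl2"]))
    # Only "ignore" options exist and none match, so the statement is executed.
    print(table_stage(["db1.sales"], ignore=["db1.prices"]))
```

The final return statement mirrors the last step of the procedure: when at least one “do” option is present, a statement needs an explicit match to be executed.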
Examples:
No --replicate-* options at all
The slave executes all statements that it receives from the master.
--replicate-*-db options, but no table
options
The slave permits or ignores statements using the database options. Then it executes all statements permitted by those options because there are no table restrictions.
--replicate-*-table options, but no database
options
All statements are permitted at the database-checking stage because there are no database conditions. The slave executes or ignores statements based on the table options.
A mix of database and table options
The slave permits or ignores statements using the database options. Then it evaluates all statements permitted by those options according to the table options. In some cases, this process can yield what might seem a counterintuitive result. Consider the following set of options:
[mysqld]
replicate-do-db = db1
replicate-do-table = db2.mytbl2
Suppose that db1 is the default database
and the slave receives this statement:
INSERT INTO mytbl1 VALUES(1,2,3);
The database is db1, which matches the
--replicate-do-db option at the
database-checking stage. The algorithm then proceeds to the
table-checking stage. If there were no table options, the
statement would be executed. However, because the options
include a “do” table option, the statement must
match if it is to be executed. The statement does not match,
so it is ignored. (The same would happen for any table in
db1.)