27 What to Do When the Master Database Encounters Problems #

In the previous articles, [24], [25], and [26], I introduced the basic structure of MySQL master-slave replication. However, those articles only covered a one-master-one-slave structure.

In most internet application scenarios, there is a higher ratio of read operations to write operations. Therefore, in the development process of your business, you are likely to encounter performance issues related to read operations. To solve read performance problems at the database level, we need to discuss a different structure in the next two articles: a one-master-multiple-slave architecture.

In this article, we will first discuss the correctness of switching to a one-master-multiple-slave structure. Then, in the next article, we will discuss the methods for ensuring the correctness of query logic in a one-master-multiple-slave structure.

As shown in Figure 1, this is a basic one-master-multiple-slave structure.

Figure 1 Basic one-master-multiple-slave structure

In this figure, the dashed arrows represent the master-slave relationship, where A and A’ are each other’s primary and standby databases respectively. The slave databases B, C, and D are connected to the master database A. The one-master-multiple-slave structure is usually used for read-write separation, where the master database is responsible for all writes and some reads, while other read requests are shared by the slave databases.

Today, we will discuss the process of switching the primary and standby databases in a one-master-multiple-slave architecture when the primary database fails.

As shown in Figure 2, this is the result after the primary database fails and the switch to the standby database is completed.

Figure 2 Basic one-master-multiple-slave structure - primary-standby switch

Compared to the switch process in a one-master-one-slave structure, the complexity of the primary-standby switch increases in a one-master-multiple-slave structure because the slave databases B, C, and D need to be redirected to the new primary database A’. Therefore, we will now explore how a switching system completes the process of primary-standby switch in a one-master-multiple-slave architecture.

Position-based Primary-Standby Switch #

Here, we need to review a key concept.

When we set node B as a slave to node A’, the “change master” command needs to be executed:

CHANGE MASTER TO 
MASTER_HOST=$host_name 
MASTER_PORT=$port 
MASTER_USER=$user_name 
MASTER_PASSWORD=$password 
MASTER_LOG_FILE=$master_log_name 
MASTER_LOG_POS=$master_log_pos

This command has six parameters:

MASTER_HOST, MASTER_PORT, MASTER_USER, and MASTER_PASSWORD represent the IP, port, username, and password of the new primary database A'.
The last two parameters, MASTER_LOG_FILE and MASTER_LOG_POS, indicate that the slave database B needs to continue synchronizing the log from the position specified by the log file name and log position of the new primary database.

Now, we encounter a problem: to set node B as a slave to A’, it is inevitable to set the two log position parameters, but how should these parameters be set?

Initially, node B was a slave to node A, and it recorded the log position of node A. However, the log position values of node A and A’ are different for the same logs. Therefore, when switching the slave database B, the logic of “finding the synchronization position” needs to be performed.

It is difficult to obtain an exact synchronization position, so only an approximate position can be obtained. Why? Let me explain how this position is generally determined, and then you will understand the reason for the lack of precision.

Considering that no data can be lost during the switch process, when finding the position, we always need to find a position that is slightly ahead and then skip those transactions that have already been executed on the slave database B.

One method to obtain the synchronization position is as follows:

Wait for the new primary database A’ to finish synchronizing all relay logs.
Execute the command “show master status” on database A’ to obtain the latest “File” and “Position” of A'.
Record the time T when the original primary database A encountered a failure.
Use the mysqlbinlog tool to parse file “File” on A’ and obtain the position at time T.

mysqlbinlog File --stop-datetime=T --start-datetime=T

Figure 3 Partial output of mysqlbinlog

In the figure, the value “123” after “end_log_pos” represents the position where the new instance A’ wrote new binlogs at time T. Then, we can use this value of 123 as $master_log_pos in the “change master” command for node B. Of course, this value is not accurate. Why?

Let’s imagine a scenario. Suppose at time T, the master database A has completed executing an insert statement to insert a row of data R and has sent the binlog to A’ and B. And then, just after sending, the host of the master database A loses power.

At this point, the system is in the following state:

On replica B, since it has synchronized the binlog, the row R already exists.
On the new master database A’, the row R also exists, and the log is written after position 123.
We execute the change master command on replica B, pointing to position 123 of the File file of A’, which will synchronize the binlog that inserted the row R to replica B for execution.

At this point, the replication thread of replica B will report a “Duplicate entry ‘id_of_R’ for key ‘PRIMARY’” error, indicating a primary key conflict, and then stop replication.

Therefore, in most cases, when switching tasks, we need to proactively skip these errors, and there are two commonly used methods.

One approach is to proactively skip a transaction. The syntax for skipping a command is as follows:

set global sql_slave_skip_counter=1;
start slave;

Because during the switching process, more than one transaction may be repeated, so we need to keep an eye on it when replica B first receives the new master database A’. Whenever encountering these errors and stopping, we execute the skip command once, until no more stopping occurs, in order to skip all possible transactions involved.

Another approach is to skip specific errors by setting the slave_skip_errors parameter.

When performing a master-slave switch, there are two types of errors that are often encountered:

The 1062 error is a unique key conflict when inserting data.
The 1032 error is a row not found when deleting data.

Therefore, we can set the slave_skip_errors to “1032,1062”, so that when encountering these two errors, they will be skipped directly.

It is worth noting that this method of directly skipping specified errors is only applicable to establishing the master-slave relationship between the replica and the new master database when there is no accurate synchronization position available.

The context here is that we are very clear that skipping errors of type 1032 and 1062 during the master-slave switch is lossless. This is why we can set the slave_skip_errors parameter like this. After the synchronization relationship between the master and slave is established and executes stably for a period of time, we still need to clear this parameter to avoid skipping them in case of actual data inconsistency between the master and slave later.

GTID #

Although skipping transactions through sql_slave_skip_counter and ignoring errors through slave_skip_errors ultimately establish the master-slave relationship between replica B and the new master database A’, both of these operations are complex and prone to errors. Therefore, MySQL 5.6 introduced GTID to solve this difficulty completely.

So what exactly does GTID mean and how does it solve the synchronization position problem? Now, let me briefly introduce it to you.

GTID stands for Global Transaction Identifier, which is a unique identifier generated for a transaction when it is committed. It consists of two parts and has the following format:

GTID = server_uuid:gno

where:

server_uuid is a globally unique value automatically generated when an instance starts for the first time.
gno is an integer whose initial value is 1 and is assigned to each transaction when it is committed, and then incremented by 1.

I need to explain to you here that in the MySQL official documentation, the GTID format is defined as follows:

GTID = source_id:transaction_id

Here, source_id is the server_uuid, and the following transaction_id can easily be misleading. Therefore, I changed it to gno. Why is using transaction_id misleading?

Because in MySQL, we say that transaction_id refers to the transaction ID, which is assigned during the transaction execution. If the transaction is rolled back, the transaction ID will also increase. However, gno is only assigned when the transaction is committed.

In terms of effect, GTID is often continuous, so it is easier to understand using gno.

Enabling GTID mode is also simple. We just need to add the gtid_mode=on and enforce_gtid_consistency=on parameters when starting a MySQL instance.

In GTID mode, each transaction corresponds to a GTID. There are two ways to generate this GTID, which depends on the value of the session variable gtid_next.

If gtid_next=automatic, it means using the default value. In this case, MySQL assigns server_uuid:gno to the transaction. a. When recording the binlog, it first records one line SET @@SESSION.GTID_NEXT='server_uuid:gno'; b. It adds this GTID to the GTID set of the current instance.
If gtid_next is a specified GTID value, such as setting gtid_next='current_gtid' using the set command, then there are two possibilities: a. If current_gtid already exists in the GTID set of the instance, the upcoming transaction will be ignored by the system. b. If current_gtid does not exist in the GTID set of the instance, it will be assigned to the upcoming transaction. This means the system does not need to generate a new GTID for this transaction, so gno does not need to be incremented.

Note that each current_gtid can only be used for one transaction. After this transaction is committed, if the next transaction needs to be executed, a set command must be executed to set gtid_next to another GTID or automatic. In this way, each MySQL instance maintains a set of GTIDs, which corresponds to “all transactions executed by this instance”.

It may not be easy to understand from the description above, so next I will use a simple example to explain the basic usage of GTIDs.

We create a table t in instance X.

CREATE TABLE `t` (
  `id` int(11) NOT NULL,
  `c` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

INSERT INTO t VALUES (1,1);

Figure 4 shows the binlog of the initial data.

As you can see, there is a SET @@SESSION.GTID_NEXT command before the transaction’s BEGIN. If instance X has a replica, when the binlog of the CREATE TABLE and INSERT statements is synchronized to the replica and executed, the replica will first execute these two SET commands. The GTID set that is added to the replica is the two GTIDs in the figure.

Let’s assume that instance X is a replica of another instance Y, and at this time, the following INSERT statement is executed on instance Y:

INSERT INTO t VALUES (1,1);

And the GTID of this statement on instance Y is “aaaaaaaa-cccc-dddd-eeee-ffffffffffff:10”.

When instance X, as a replica of instance Y, synchronizes and executes this transaction, a primary key conflict will obviously occur, causing the replication thread on instance X to stop. So how do we handle this situation?

The solution is to execute the following sequence of statements:

SET gtid_next='aaaaaaaa-cccc-dddd-eeee-ffffffffffff:10';
BEGIN;
COMMIT;
SET gtid_next=automatic;
START SLAVE;

The purpose of the first three statements is to add this GTID to the GTID set of instance X by committing an empty transaction. Figure 5 shows the result of executing SHOW MASTER STATUS after completing this empty transaction.

Figure 5 shows the result of SHOW MASTER STATUS.

As you can see, the GTID “aaaaaaaa-cccc-dddd-eeee-ffffffffffff:10” has been added to the Executed_Gtid_Set on instance X.

In this way, when I execute the START SLAVE command to start the replication thread, although instance X will continue to execute the transactions passed by instance Y, since “aaaaaaaa-cccc-dddd-eeee-ffffffffffff:10” already exists in the GTID set of instance X, instance X will directly skip this transaction and the primary key conflict error will not occur again.

In the above sequence of statements, there is another statement SET gtid_next=automatic before the START SLAVE command. The purpose of this statement is to “restore the default GTID allocation behavior”, which means that if there are new transactions to be executed in the future, they will be allocated according to the original allocation method and continue to allocate gno=3.

GTID-based Master-Slave Switching #

Now that we understand the concept of GTIDs, let’s take a look at the usage of GTID-based master-slave replication.

In GTID mode, the syntax for setting a backup instance B as a replica of a new master instance A’ is as follows:

CHANGE MASTER TO 
MASTER_HOST=$host_name 
MASTER_PORT=$port 
MASTER_USER=$user_name 
MASTER_PASSWORD=$password 
master_auto_position=1

Where master_auto_position=1 indicates that GTID protocol is used for this master-slave relationship. As you can see, the troublesome MASTER_LOG_FILE and MASTER_LOG_POS parameters that we encountered before are no longer needed.

Let’s denote the GTID set of instance A’ at the current moment as set_a, and the GTID set of instance B as set_b. Next, let’s take a look at the logic of the master-slave switch now.

To start replication on instance B, the process for obtaining binlogs is as follows:

Instance B specifies instance A’ as the master and establishes a connection based on the master-slave protocol.
Instance B sends set_b to instance A’.
Instance A’ calculates the difference between set_a and set_b, which is the set of all GTIDs that exist in set_a but not in set_b. It then determines if all the binlog transactions required for this difference set are present locally on A’.
- If they are not present, it indicates that instances A’ has already deleted the binlogs needed by instance B, so it returns an error.
- If they are all present, A’ finds the first transaction that is not in set_b from its own binlog files and sends it to B.
After this, it reads the file from this transaction onwards, sequentially retrieves the binlogs, and sends them to B for execution. Actually, this logic contains a design idea: in a GTID-based master-slave relationship, the system assumes that as long as the master-slave relationship is established, it must ensure that the logs sent from the master to the slave are complete. Therefore, if Instance B needs logs that no longer exist, Instance A’ will refuse to send the logs to B.

This is different from the binlog position-based master-slave protocol. In the binlog position-based protocol, it is decided by the slave which position to specify, and the master sends the corresponding position without verifying the completeness of the logs.

Based on the above introduction, let’s take a look at how master-slave switch is implemented in a one-master-multiple-slave scenario after the introduction of GTID.

Since there is no need to find the binlog position anymore, the slave servers B, C, and D only need to execute the change master command to point to instance A'.

Strictly speaking, the master-slave switch still requires finding the binlog position, but this work is automatically completed within instance A’. However, since this work is automatic, it is very friendly to the developers of the HA system.

After that, the new master A’ writes to the binlog it generates, and the format of the GTID set in the binlog of the master A’ is: server_uuid_of_A':1-M.

If the GTID set format of the slave B was server_uuid_of_A:1-N before, then after the switch, the GTID set format becomes server_uuid_of_A:1-N, server_uuid_of_A':1-M.

Of course, the previous master A’ was also a slave of A, so the GTID set of master A’ and the slave B are the same. This achieves the desired result.

GTID and Online DDL #

Next, let me give you another example to help you understand GTID.

In a previous article, when discussing the performance problem of slow queries during business peak periods, I mentioned that if the performance problem is caused by missing indexes, it can be solved by online index addition. However, considering avoiding the impact of adding indexes on the performance of the master, we can first add the index on the slave and then switch.

At that time, I mentioned that in the dual-master structure, the DDL statements executed by the slave will also be sent to the master. To avoid impacting the master after being sent back, the binlog needs to be turned off by executing set sql_log_bin=off.

A classmate in the comments section raised a question: if we do this, the index will be added to the database, but the binlog will not record this update, will it cause inconsistency between the data and the logs?

This question is excellent. At the time, I used GTID as an example in my reply. Today, let me explain it further.

Assume that the two databases with a master-slave relationship are still instances X and Y, and the current master is X, and both have the GTID mode enabled. The switch process between the master and slave can be as follows:

Execute stop slave on instance X.
Execute the DDL statement on instance Y. Note that it is not necessary to turn off the binlog here.
After the execution is completed, retrieve the corresponding GTID for this DDL statement and record it as server_uuid_of_Y:gno.
Execute the following sequence of statements on instance X:

set GTID_NEXT=“server_uuid_of_Y:gno”; begin; commit; set gtid_next=automatic; start slave;

The purpose of doing this is to record the update on instance Y in the binlog and ensure that this update is not executed on instance X.

Next, complete the master-slave switch, and then repeat the above process.

Summary #

In today’s article, I introduced the master-slave switch process in a one-master-multiple-slave configuration. In this process, the slave finding the new master’s binlog position is a pain point. Based on this, I introduced the GTID mode introduced in MySQL 5.6, and explained the basic concepts and usage of GTID.

As you can see, in GTID mode, the switch in a one-master-multiple-slave configuration becomes very convenient.

Therefore, if your MySQL version supports GTID, I recommend using the GTID mode as much as possible for the master-slave switch.

In the next article, we will also see the application of GTID mode in the read-write separation scenario.

Finally, it’s time for our question.

In GTID mode, when setting up the master-slave relationship and the slave executes the start slave command, if the master finds that the required binlog has been deleted, resulting in the failure to create the master-slave relationship, how do you think this can be handled?

Please write your method in the comment section, and I will discuss this question with you at the end of the next article. Thank you for listening, and feel free to share this article with more friends to read together.

Previous Question #

In the previous article, I left you with a question: if the master database is under single-threaded stress mode and the slave is catching up with the master, what parameter should be set for binlog-transaction-dependency-tracking?

The answer is that this parameter should be set to WRITESET.

Since the master database is under single-threaded stress mode, the commit_id of each transaction is different. If the parameter is set to COMMIT_ORDER, the slave can only execute in single-threaded mode.

Similarly, since the WRITESET_SESSION mode requires that the logs of the same thread must be applied in the same order as executed on the master, it will also result in degraded to single-threaded replication under the single-threaded stress mode of the master.

Therefore, binlog-transaction-dependency-tracking should be set to WRITESET.