15 My SQL Replication the Simplest Yet Easiest to Misconfigure

15 MySQL Replication - The Simplest Yet Easiest to Misconfigure #

Starting today, we officially enter the stage of designing high availability architectures.

In the previous two modules, we learned about table structure design and index design in the MySQL architecture. For developers, mastering these concepts is sufficient for coding based on business logic.

However, when it comes to deploying your applications, you need to have a high availability design. In a production environment, without a proper high availability architecture, any physical hardware failure would render your application completely unusable.

This is unimaginable in the world of Internet services with massive concurrent access. Therefore, in addition to business architecture, you need to have a design for availability.

In this lesson, we will learn about the most fundamental and core concept in MySQL high availability architecture: MySQL Replication.

MySQL Replication Architecture #

Database replication is essentially data synchronization. MySQL database synchronizes data incrementally based on its binary log, which records all modifications made to the database.

In the default ROW format binary log, all records affected by a single SQL operation are logged. For example, if a SQL statement updates three rows, the binary log will record the before and after images of those three modified records.

For INSERT or DELETE operations, the binary log records information about all columns of the inserted or deleted record. Let’s look at an example:

DELETE FROM orders_test 
WHERE o_orderdate = '1997-12-31';
Query OK, 2482 rows affected (0.07 sec)

In the above example, the SQL statement performs a delete operation, deleting a total of 2482 rows. You can use the command SHOW BINLOG EVENTS in the MySQL command line to view the content of a specific binary log file. For example, the delete operation mentioned above occurred in the binary log file binlog.000004. You can see:

Drawing 0.png

Using the built-in mysqlbinlog command, you can parse the binary log and observe the detailed information of each record, such as:

Drawing 1.png

From the figure, you can see the complete information of the deleted records and the attributes of each column, such as the column type and whether it allows NULL values.

With the binary log, MySQL can achieve data synchronization through replication technology. The essence of data replication is to synchronize changes made in one MySQL database to another. The following diagram shows the current replication architecture of MySQL:

Drawing 3.png

As shown in the diagram, in MySQL replication, there is one server called the Master (also known as the Primary), and the other servers are Slaves (also known as Standby):

The Master server sends the binary log generated by data modifications to the Slave servers via a Dump thread.
The I/O thread in the Slave server receives the binary log and saves it as a relay log.
The SQL/Worker thread in the Slave server executes the relay log in parallel, which replays the logs generated by the Master.

Thanks to the binary log, MySQL replication is more flexible compared to other databases such as Oracle and PostgreSQL. Users can build replication topologies according to their own needs, for example:

Drawing 4.png

In the above diagram, Slave1, Slave2, and Slave3 are all Slave servers of the Master, and Slave11 is a Slave server of Slave1. Slave1 is both a Slave of the Master and a Master to Slave11, so it is a cascading Slave. Similarly, Slave3 is also a cascading Slave.

Now that we have understood the basic concepts of replication, let’s continue with configuring MySQL replication.

MySQL Replication Configuration #

Configuring MySQL replication is very simple. The basic steps are as follows:

Create the necessary accounts and permissions for replication.
Copy the data from the Master server, which can be done using logical backup tools like mysqldump, mysqlpump, or physical backup tools like the Clone Plugin.
Establish the replication relationship using the CHANGE MASTER TO command.
Observe the replication status using the SHOW SLAVE STATUS command.

Although the principles and implementation of MySQL replication are straightforward, it is easy to make mistakes during the configuration process. Therefore, make sure to include the following configurations in your configuration file:

gtid_mode = on
enforce_gtid_consistency = 1
binlog_gtid_simple_recovery = 1
relay_log_recovery = ON
master_info_repository = TABLE 
relay_log_info_repository = TABLE

These settings are used to ensure crash safety, which means that regardless of whether the Master or Slave servers crash, when they recover and reconnect to the host, the data on both ends will remain consistent and no inconsistencies will occur.

I often hear feedback from students saying that there are instances of data inconsistencies in their MySQL replication. Please confirm that the above parameters have been configured, otherwise any inconsistencies observed are not caused by MySQL itself, but rather due to incorrect usage of MySQL.

Now that we have learned about replication configurations, let’s take a look at the types of replication supported by MySQL.

Types of Replication in MySQL and Application Options #

MySQL replication can be divided into the following types: Drawing 5.png

The default replication method in MySQL is asynchronous replication. Many new developers mistakenly use asynchronous replication in their businesses because they are not aware that there are other types of replication available. In order to address this issue, let’s take a detailed look at each replication type and their suitability for different business scenarios, so that you can make the correct choice for your business.

Asynchronous Replication #

In asynchronous replication, the master server doesn’t concern itself with whether the slave server has received the binary logs. Therefore, there is no dependency between the master and slave servers. You can think of the master and slave servers as separate entities working independently, and eventually the data will be consistent through the use of binary log synchronization.

Asynchronous replication offers the best performance because it has minimal impact on the database itself, unless there is a significant delay between the master and slave servers, which would require the Dump Thread to read a large number of binary log files.

If the business doesn’t require strict data consistency and can tolerate data loss, even a significant amount of data loss, asynchronous replication is recommended due to its superior performance. For example, businesses like Weibo, which have high performance requirements but can usually tolerate data loss, would benefit from using asynchronous replication. However, data security is often the primary concern for core business systems, such as monitoring and alert systems.

Semi-Synchronous Replication #

Semi-synchronous replication requires that during the commit process on the master server, at least N slaves have received the binary logs. This ensures that if the master server fails, at least N of the slave servers will have complete data.

Semi-synchronous replication is not a built-in feature of MySQL, but rather requires the installation of a semi-synchronous plugin and enabling the semi-synchronous replication feature. You can set N to the number of slave servers that successfully receive the binary logs. For example:

plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"

rpl-semi-sync-master-enabled = 1

rpl-semi-sync-slave-enabled = 1

rpl_semi_sync_master_wait_no_slave = 1

In the above configuration:

The first line requires the installation of the semi-synchronous plugin when the database starts.
The second and third lines enable the semi-synchronous master and slave plugins, respectively.
The fourth line specifies that during the semi-synchronous replication process, the committed transaction must be received by at least one slave server.

In semi-synchronous replication, there is a lossy version called “lossy semi-synchronous replication” that existed before MySQL 5.7. In this version, when the master server fails, the slave server loses the last batch of committed data. If at this point the slave server is promoted (failover) to become the master server, it is possible to lose the already committed data, resulting in a rollback.

The principle of lossy semi-synchronous replication is illustrated in the following diagram:

Drawing 6.png

As you can see, lossy semi-synchronous replication waits for the slave to return an ACK after the master transaction is committed, as shown in step 4. This ACK indicates that at least one slave has received the binary logs. If at this point the binary logs have not been sent to the slave server and the master server fails, the slave server will lose the data that the master server had already committed.

MySQL 5.7 introduced lossless semi-synchronous replication to address this issue. The principle of lossless semi-synchronous replication is illustrated in the following diagram:

Drawing 7.png From the figure above, it can be seen that the WAIT ACK of lossless semi-synchronous replication occurs before the transaction is committed. Even if the slave has not received the binary log, if the master crashes and the last transaction has not been committed, the data is not visible externally and there is no data loss problem.

Therefore, for any business with data consistency requirements, such as the core order business of e-commerce, banking, insurance, securities, and other businesses closely related to funds, it is necessary to use lossless semi-synchronous replication. This ensures that the data is safe and guaranteed, even if a crash occurs, the slave still has a complete copy of the data.

Multi-source Replication #

Whether it is asynchronous replication or semi-synchronous replication, it is a 1 Master to N Slave architecture. However, MySQL also supports N Masters to 1 Slave, and this architecture is called multi-source replication.

Multi-source replication allows data from different MySQL instances to be synchronized to 1 MySQL instance, making it convenient to perform some statistical queries on a single slave server, such as common OLAP business queries.

The architecture of multi-source replication is as follows:

Drawing 8.png

The figure above shows the Order database, Inventory database, and Supplier database being synchronized to a single MySQL instance via multi-source replication. Then, with the complex SQL capabilities provided by MySQL 8.0, deep data analysis and mining can be performed on the business.

Delayed Replication #

In the replication architecture introduced earlier, the slave replays the binary log as quickly as possible after receiving it in order to avoid delays between the master and the slave. However, delayed replication allows the slave to delay replaying the received binary log in order to prevent immediate synchronization to the slave from causing complete data loss due to operational mistakes on the master server.

Delayed replication can be set using the following command:

CHANGE MASTER TO master_delay = 3600

This way, the slave is intentionally set to be 1 hour behind the master.

Delayed replication is very common in the design of database backup architectures. For example, a delayed backup server with a one-day delay can be set up, essentially providing users with a snapshot from 24 hours ago.

When there is a catastrophic mistake online, such as executing commands like DROP TABLE or DROP DATABASE, users have a snapshot from 24 hours ago, allowing for quick data recovery.

For the financial industry, delayed replication is a necessary part of backup design.

Summary #

After learning the content today, you should have a clear understanding of MySQL replication technology and realize that replication is the foundation of data synchronization, and the binary log is the cornerstone of replication. Let’s summarize the key points today:

The binary log records all changes to MySQL.
The command SHOW BINLOG EVENTS IN … FROM … can be used to view basic information about the binary log.
The tool mysqlbinlog can be used to view detailed content of the binary log.
Although setting up replication is simple, do not forget to configure crash-safe-related parameters, otherwise it may lead to inconsistent data between the master and slave.
Asynchronous replication is used in non-core business scenarios that do not require data consistency.
Lossless semi-synchronous replication is used in core business scenarios such as banking, insurance, securities, and other core businesses that require strict data consistency.
Multi-source replication allows multiple master data to be consolidated into one database instance for analysis.
Delayed replication is mainly used for preventing operational mistakes, and the financial industry should pay special attention to such scenarios.