28 How to Deploy a Distributed System Across Regions in Multiple Data Centers #

Hello, I’m Tang Yang.

Let’s imagine the following scenario: your vertical e-commerce system is deployed in a single IDC data center, and one day the data center announces a network equipment migration for the following day, which may cause intermittent, short network interruptions.

A network interruption in the data center will certainly hurt your business. Even though the migration is scheduled for the early morning hours, the low-traffic period for your business, as the technical leader you still need to find ways to mitigate the impact of this planned interruption. Unfortunately, under the current architecture all of your e-commerce services sit in a single IDC data center, and there is no good workaround.

The availability of that IDC data center is the Achilles’ heel of the entire system: once the data center has a serious problem, as has happened at several large companies, overall service availability takes a severe hit. For example:

In July 2016, China Unicom rectified more than 40 IDC data centers under its management and cut off a large number of non-compliant connections. This action affected the Blue Toon data center used by Maimai, taking Maimai offline for 15 hours; the well-known video site AcFun was down for more than 48 hours, suffering significant losses.

With a single-data-center deployment, the availability of your system depends entirely on the availability of that data center; in other words, the data center holds the system’s lifeline. So you start thinking about how to further improve availability through architectural change. After researching solutions and studying the experience of some large companies, you find that “multi-data center deployment” addresses exactly this problem.

What are the challenges of multi-data center deployment? #

Multi-data center deployment means deploying multiple sets of services in different IDC data centers, where all sets share the same business data and can all serve user traffic.

In this way, when one data center experiences network failures, fires, or even major unavoidable disasters such as earthquakes or floods, you can switch user traffic to other regional data centers at any time to ensure that the system can continue to run uninterrupted. This architecture sounds very good, but it is actually very complex and difficult to implement. So what makes it complex?

Let’s say we have two data centers, A and B, both of which deploy application services. The master and slave databases are deployed in data center A. So how does the application in data center B access the data? There are two approaches.

One approach is to directly read the slave database in data center A across data centers:

[Figure: the application in data center B reads directly, across data centers, from the slave database in data center A]

Another approach is to deploy a slave database in data center B, synchronize the data from the master database across data centers, and then the application in data center B can read the data from this slave database:

[Figure: a slave database deployed in data center B synchronizes from the master in data center A, and the application in B reads it locally]
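To make the two options concrete, here is a minimal sketch of the read path each one implies; the hostnames and the helper function are illustrative assumptions, not part of the original plan:

```python
# The two ways an application in data center B can read when the master and
# slave databases live in data center A (hostnames are illustrative).

def execute(query, host):
    """Stand-in for a real database call."""
    return f"ran {query!r} on {host}"

# Approach 1: read across data centers from the slave in data center A.
def read_cross_dc(query):
    return execute(query, host="slave.dc-a.internal")   # pays the cross-DC RTT

# Approach 2: replicate a slave into data center B and read it locally.
def read_local_replica(query):
    return execute(query, host="slave.dc-b.internal")   # no RTT, but the data
                                                        # lags by the replication delay

print(read_cross_dc("SELECT * FROM orders"))
print(read_local_replica("SELECT * FROM orders"))
```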

Either approach involves transferring data across data centers, which makes the latency between data centers critical, and that latency is closely tied to the distance between them. You can remember a few numbers:

  1. The latency between two data centers in the same city generally ranges from 1ms to 3ms.

What impact does this latency have? You should know that our interface response time needs to be controlled within 200ms, and an interface may make several calls to third-party HTTP services or RPC services. If these services are deployed in different data centers, the interface response time will increase by several milliseconds, which is acceptable.

An interface may also involve several database writes. If the master database sits in another data center, those cross-data-center writes add several to tens of milliseconds to the response time, which is still acceptable.

However, an interface may perform a dozen or even dozens of cache and database reads; if each of those crosses data centers, the added delay reaches tens or even hundreds of milliseconds, which is unacceptable.

  2. The latency between two data centers in different regions of the same country is generally within 50ms.

The specific latency data varies depending on the distance. For example, the latency of a dedicated line between Beijing and Tianjin is within 10ms; the latency between Beijing and Shanghai will increase to nearly 30ms; if you want to deploy two data centers in Beijing and Guangzhou, the latency will reach 50ms. With this latency data, to ensure that the interface response time is within 200ms, you should minimize cross-data center service calls and avoid cross-data center database and cache operations.

  3. If your business serves international users and requires data centers in different countries, the latency between data centers is higher still. According to figures from major cloud providers, accessing a service deployed on the US West Coast from China takes around 100ms to 200ms. At that latency, you must avoid synchronous cross-data-center data calls entirely and synchronize data only asynchronously.

If you are considering a multi-data center deployment architecture, these numbers are crucial foundational data that you need to firmly remember to avoid performance degradation caused by cross-data center data access.
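To see how quickly these round trips consume a 200ms budget, here is a back-of-the-envelope sketch; the local processing time and call counts are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope latency budget for a single interface call.
# All numbers are illustrative assumptions based on the figures above.

BUDGET_MS = 200

def response_time(base_ms, cross_dc_calls, rtt_ms):
    """Base local processing time plus one round trip per cross-DC call."""
    return base_ms + cross_dc_calls * rtt_ms

# Same-city data centers (RTT ~ 2ms): 3 RPC calls + 2 DB writes cross-DC.
print(response_time(50, 3 + 2, 2))        # 60ms   -> fine

# Same city, but 20 cache/DB reads also go cross-DC.
print(response_time(50, 3 + 2 + 20, 2))   # 100ms  -> already half the budget

# Beijing <-> Guangzhou (RTT ~ 50ms): the same 25 cross-DC calls.
print(response_time(50, 25, 50))          # 1300ms -> far beyond 200ms
```

The same two dozen calls that are harmless within a city blow the budget completely once the round trip reaches 50ms, which is why the plans below keep reads and service calls local.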

The latency between data centers is an objective reality you cannot change; what you can do is minimize its impact on interface response time. So how do you design a multi-data center deployment under this constraint?

A step-by-step, iterative multi-data center deployment plan #

1. Active-active within the same city #

A multi-data center deployment is not achieved overnight; it is a process of continuous iteration. As mentioned earlier, latency between data centers in the same city is around 1ms to 3ms, so cross-data-center calls are tolerable there, and the complexity of a same-city active-active plan is relatively low.

However, this plan only provides data-center-level disaster recovery; it cannot survive a city-level disaster. That said, compared with natural disasters such as earthquakes or floods, network failures or power outages in a data center are far more likely, so if your system does not need city-level disaster recovery, same-city active-active is generally sufficient. How, then, should the plan be designed?

Consider the following scenario: you have two data centers in Beijing, A and B. Data center A is hosted with China Unicom and data center B with China Telecom, and the two are connected by a dedicated line. The core idea of the plan is to avoid cross-data-center calls as much as possible. The specifics are as follows:

First, deploy the master database in one data center, say data center A, so that writes from both A and B go to A. Then deploy a slave database in each data center and synchronize them from the master via master-slave replication, so that query requests in either data center can be served by the local slave. If data center A fails, the slave in data center B can be promoted to master, achieving disaster recovery.
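A minimal sketch of this routing rule, with illustrative hostnames and an in-process function standing in for real connection management:

```python
# Sketch of same-city active-active database routing (hostnames illustrative).
# Writes always cross the dedicated line to the master in data center A;
# reads are served by the slave in the local data center.

MASTER = "master.dc-a.internal"
SLAVES = {"A": "slave.dc-a.internal", "B": "slave.dc-b.internal"}

def db_host(operation: str, local_dc: str) -> str:
    """Route a write to the master, a read to the local slave."""
    if operation == "write":
        return MASTER            # cross-DC from B: a few ms over the line
    return SLAVES[local_dc]      # local read: no cross-DC round trip

# An application instance running in data center B:
print(db_host("write", "B"))   # master.dc-a.internal
print(db_host("read", "B"))    # slave.dc-b.internal
```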

Caches can also be deployed in both data centers, with query requests hitting the local cache first; on a cache miss, the request falls through to the local slave database to load the data. When data is updated, the caches in both data centers can be updated to keep them consistent.
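Here is a minimal sketch of that read-and-refresh path, using in-memory dictionaries as stand-ins for the real caches and slave database (all names are illustrative):

```python
# Cache-aside read path for same-city active-active. In-memory dictionaries
# stand in for the real caches and slave database; all names are illustrative.

local_cache = {}                                # this data center's cache
remote_cache = {}                               # the other data center's cache
local_slave = {"user:1": {"name": "Alice"}}     # this data center's slave DB

def read(key):
    """Serve reads entirely from the local data center."""
    value = local_cache.get(key)
    if value is None:                   # cache miss: fall through to the
        value = local_slave.get(key)    # local slave, never cross-DC
        if value is not None:
            local_cache[key] = value
    return value

def on_update(key, value):
    """After the master in data center A is written, refresh both caches."""
    for cache in (local_cache, remote_cache):
        cache[key] = value              # keeps the two caches consistent

print(read("user:1"))   # miss -> loaded from local slave, then cached
print(read("user:1"))   # hit from the local cache
```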

RPC services in the two data centers register with the service registry under different service groups, and the RPC clients (the web services in each data center) subscribe only to the service group of their own data center. RPC calls therefore stay within the local data center, avoiding cross-data-center RPC.
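The grouping mechanism can be sketched with a toy in-memory registry; a real deployment would rely on whatever group or tag metadata your registry (ZooKeeper, Nacos, Eureka, and so on) supports, so treat the API below as hypothetical:

```python
# Toy in-memory registry showing per-data-center service groups. The
# register/subscribe API below is hypothetical, not a real library's.

registry = {}   # service name -> group -> list of provider addresses

def register(service, group, addr):
    """A provider registers under the group of its own data center."""
    registry.setdefault(service, {}).setdefault(group, []).append(addr)

def subscribe(service, group):
    """A client sees only the providers in its own data center's group."""
    return registry.get(service, {}).get(group, [])

register("order-service", "dc-a", "10.0.1.5:8080")
register("order-service", "dc-b", "10.1.1.7:8080")

# A web service in data center B subscribes only to group "dc-b",
# so its RPC calls never leave the data center.
print(subscribe("order-service", "dc-b"))   # ['10.1.1.7:8080']
```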

[Figure: same-city active-active architecture. Writes from both data centers go to the master database in data center A; each data center serves reads from its local slave and cache, and RPC calls stay within the local data center.]

Your system will also depend on other in-company services, such as audit and search services. If these are likewise deployed active-active across the data centers, you should ensure calls go to the instances in the same data center to reduce latency.

With same-city active-active, you achieve data-center-level disaster recovery and break free of the single-data-center constraint. Writes still cross data centers, but since write traffic is relatively low, the performance cost is tolerable.

2. Active-active across different cities #

The previous plan is sufficient for your current needs. But your business keeps growing, and if one day your e-commerce system reaches the traffic of JD.com or Taobao, you will need it to stay available even when a major natural disaster strikes the city hosting your data centers. That is when you need active-active across different cities (as far as I know, both Alibaba and Ele.me use this approach).

When designing this plan, the first thing to consider is where to put the data centers. They should not be too close together, or a single disaster could take out both. So if your primary data center is in Beijing, do not put the secondary one in Tianjin; choose somewhere farther away, such as Shanghai or Guangzhou. The price is higher data-transmission latency, which rules out the cross-data-center database writes used in same-city active-active.

Therefore, you need to write data only to the local storage services and rely on a data synchronization scheme to propagate it to the other data center. There are generally two synchronization approaches:

One is based on the master-slave replication of storage systems such as MySQL and Redis: the master is deployed in the primary data center, the slave in the secondary data center, and master-slave replication keeps the two in sync.

The other is based on a message queue: when a write occurs in one data center, a message describing it is written to the queue; the application in the other data center consumes the message and executes the same business logic, writing the data to its local storage services.

I suggest that you use a combination of these two synchronization approaches. For example, you can use the message-based approach to synchronize cache data and HBase data, and use storage-based master-slave replication to synchronize data in MySQL and Redis.
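A minimal sketch of the message-queue approach, using a plain in-process queue as a stand-in for a real broker such as Kafka (the payload shape and handler are illustrative assumptions):

```python
# Message-queue data sync between cities. A plain in-process deque stands in
# for a real broker; the payload shape and handler are illustrative.

from collections import deque

queue = deque()      # the replicated channel between the two data centers
storage_b = {}       # a storage service in the secondary data center

def write_in_dc_a(key, value):
    """A write in data center A also publishes a change message for B."""
    # ... write to data center A's local storage here ...
    queue.append({"key": key, "value": value})

def consume_in_dc_b():
    """B's consumer replays the business write against its local storage."""
    while queue:
        msg = queue.popleft()
        storage_b[msg["key"]] = msg["value"]   # applied asynchronously

write_in_dc_a("item:42", {"stock": 10})
consume_in_dc_b()
print(storage_b)   # {'item:42': {'stock': 10}}
```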

Whichever approach you take, data arrives at the other data center with some delay. You therefore need to ensure that users read their own data from the data center that holds the master copy of that data. To achieve this, shard your users so that each user’s reads and writes happen, as far as possible, within a single data center, and prefer the local data center when reading data or calling services (see the sketch below).

Here’s a scenario: in an e-commerce system, user A wants to view all of their orders, and the shop and seller information in those orders was likely written in the other data center. You should still make service calls and read data within the local data center, serving the shop and seller data from the local slave copy; it may lag slightly because of cross-data-center synchronization, but that delay is acceptable.
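A minimal sketch of user sharding, assuming a simple modulo policy over two data centers (the policy and names are illustrative; real systems often use a mapping table so users can be migrated):

```python
# User sharding for cross-city active-active: pin each user to a "home"
# data center so their reads and writes stay in one place. The modulo
# policy and data center names are illustrative assumptions.

DCS = ["beijing", "guangzhou"]

def home_dc(user_id: int) -> str:
    """Deterministically map a user to the data center owning their data."""
    return DCS[user_id % len(DCS)]

def route_request(user_id: int, entry_dc: str) -> str:
    """Ideally traffic enters at the home DC; otherwise redirect it there."""
    home = home_dc(user_id)
    return entry_dc if home == entry_dc else f"redirect to {home}"

print(home_dc(1001))                    # guangzhou
print(route_request(1001, "beijing"))   # redirect to guangzhou
```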

[Figure: active-active across different cities. Each data center writes to its local storage, data is synchronized asynchronously between data centers via replication and message queues, and user traffic is sharded so that each user’s reads and writes stay within one data center.]

Course Summary #

In this lesson, with the goal of improving system availability and stability, I discussed the challenges of multi-data center deployment and walked through the same-city dual-data-center and cross-city active-active architectures. Here are the key points to emphasize:

The main difficulty of multi-data center deployment is the data-transmission latency between data centers. Keep these numbers in mind: latency between data centers in the same city is generally 1ms to 3ms, latency between data centers in different regions of the same country is below 50ms, and latency between data centers in different countries is below 200ms.

The dual data center solution in the same city allows cross-data center data writing, but data reading and service invocation should be kept within the same data center as much as possible.

The active-active solution in different locations should avoid cross-data center synchronization for data writing and reading, and should use an asynchronous approach to synchronize data from one data center to another.

Multi-data center deployment should only be considered once the business has grown to a certain scale and data-center-level disaster recovery is genuinely needed; if you can avoid it, do. If your team does decide to go multi-data center, same-city active-active should be sufficient for most needs, and it is much simpler than active-active across different cities. Very few companies in the industry have built a true cross-city active-active architecture, because it is simply too complex to implement, so do not attempt it lightly.

In short, architecture must evolve with the scale of the system and its requirements for availability, performance, and scalability. Blindly chasing an “advanced” architecture only produces overly complex solutions, raises operational costs, and makes the system harder to maintain.