08 Service Discovery - Whether to Choose CP or AP #

Hello, my name is He Xiaofeng. In the previous lecture, I talked about “how to design a flexible RPC framework”. In short, the idea is to apply plugins throughout the RPC framework and build a microkernel-style framework out of them; the key point is “pluginization”.

Today, I want to talk to you about the challenges faced by “service discovery” in RPC in the context of a large-scale cluster.

Why do we need service discovery? #

Let’s start with an example. Imagine you need to send an email to a colleague whom you have never worked with before, but you don’t have their email address. What would you do in this situation? If it were me, I would check the company’s corporate directory.

Similarly, in order to achieve high availability, in a production environment, service providers offer their services externally in the form of clusters. The IP addresses within these clusters can change at any time, so we need a “directory” to promptly obtain the corresponding service nodes. This process of obtaining the nodes is generally referred to as “service discovery”.

For both the service caller and the service provider, the contract between them is the interface, which plays the role of a name in the “directory”. The service nodes are the specific instances that fulfill this contract, and the set of their IP addresses serves as the “address book”: given an interface, we can look up the collection of provider IP addresses. This is the service discovery mechanism in RPC frameworks, as shown in the following figure:

  1. Service registration: when a service provider starts up, it registers its exposed interfaces with the registry, which stores the node’s IP address together with each interface.
  2. Service subscription: when a service caller starts up, it looks up and subscribes to the provider IP addresses for the interfaces it needs, then caches them locally for subsequent remote calls (a minimal sketch of these two operations follows).
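To make the two steps concrete, here is a minimal, hypothetical sketch of what a registry client’s interface might look like. The names RegistryClient, ServiceInstance, register and subscribe are illustrative assumptions, not the API of any particular framework:

```java
// Hypothetical registry-client sketch; names are illustrative only.
import java.util.List;
import java.util.function.Consumer;

interface RegistryClient {

    // Service registration: a provider publishes "interface name -> my address".
    void register(String interfaceName, ServiceInstance self);

    // Service subscription: a caller fetches the current provider list and
    // installs a listener so that later changes refresh its local cache.
    List<ServiceInstance> subscribe(String interfaceName,
                                    Consumer<List<ServiceInstance>> onChange);
}

// A provider node is simply an entry in the "address book".
record ServiceInstance(String ip, int port) { }
```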

Why not use DNS? #

Since service discovery is so important, is it difficult to implement? In fact, similar mechanisms have always been around us. The essence of service discovery is to establish a mapping between an interface and the IP addresses of its service providers. Couldn’t we simply put all of the service providers behind a single domain name and rely on the mature DNS mechanism to do this?

Now, let’s take a look at the process of DNS:

If we use DNS to implement service discovery, all service provider nodes are configured under the same domain name. The service consumer can indeed obtain the IP address of a random service provider through DNS and establish a long connection with it. This seems to be fine, but why is this solution rarely used in our industry? If you haven’t thought about this question, you can stop for a moment and consider these two questions:

  • If an IP port goes offline, can the service consumer remove the service node in a timely manner?
  • If the existing service nodes are already serving traffic and I suddenly scale the service up, can the newly added nodes receive traffic promptly?

The answer to both of these questions is “no”. To improve performance and reduce the load on DNS servers, DNS uses a multi-level caching mechanism, and the cache TTLs are generally long; in particular, the JVM’s default DNS cache can be valid forever. As a result, the service consumer cannot perceive changes to the service nodes in a timely manner.
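As a point of reference, the JVM-side cache can be tuned with the standard networkaddress.cache.ttl security property; a minimal sketch (the host name is just a placeholder):

```java
import java.net.InetAddress;
import java.security.Security;

public class DnsCacheDemo {
    public static void main(String[] args) throws Exception {
        // Cache successful DNS lookups for at most 30 seconds instead of the
        // default, which can be "cache forever". Must be set before the first
        // lookup (or configured in $JAVA_HOME/conf/security/java.security).
        Security.setProperty("networkaddress.cache.ttl", "30");

        InetAddress addr = InetAddress.getByName("example.com");
        System.out.println(addr.getHostAddress());
    }
}
```

Even with a short TTL, though, this only shrinks the staleness window; it does not give the consumer an active notification of node changes.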

At this point, you may think: can we add a load balancer? Bind the domain name to the load balancer, resolve the load balancer’s IP address through DNS, and have the service consumer establish its connection directly with the VIP (Virtual IP) machine, which then completes the TCP forwarding, as shown in the following figure:

This solution can indeed solve some of the problems encountered by DNS. However, it is not very suitable for the RPC scenario for the following reasons:

  • Building a load balancer or TCP/IP layer 4 proxy incurs additional costs.
  • Request traffic goes through the load balancer, resulting in additional network transmission and performance waste.
  • Adding or removing nodes behind the load balancer usually requires manual operation, which means significant human effort and delay during large-scale scale-ups or scale-downs.
  • In service governance, we need more flexible load-balancing strategies, and the algorithms of off-the-shelf load balancers cannot fully meet these requirements.

Therefore, although DNS- or VIP-based solutions can play the role of service discovery, neither is well suited for direct use in the RPC scenario.

Service Discovery Based on ZooKeeper #

So how do we implement service discovery in RPC? Let’s go back to its essence: establishing a mapping between interfaces and service provider IPs, which is essentially a naming service. We also want the registry to deliver change notifications in real time. Can this be achieved with open-source solutions like ZooKeeper and etcd? I can say with certainty: yes, it can. Let me introduce a service discovery method based on ZooKeeper.

The overall idea is simple: build a ZooKeeper cluster as the registration center cluster. When a service is registered, the service nodes only need to write registration information to the ZooKeeper nodes. The service subscription and service distribution functions are completed using ZooKeeper’s Watcher mechanism. The overall process is as follows:

  1. The service platform management end first creates a root path in ZooKeeper, which can be named according to the interface name (e.g., /service/com.demo.xxService). In this path, create directories for service providers and service consumers (e.g., provider, consumer), which are used to store the service provider’s node information and the service consumer’s node information, respectively.
  2. When a service provider initiates registration, it creates an ephemeral node in the service provider directory, storing the provider’s registration information in it.
  3. When a service consumer initiates subscription, it creates an ephemeral node in the directory of service consumers, where it stores the information of the service consumer. At the same time, the service consumer watches all the service node data in the service provider’s directory (/service/com.demo.xxService/provider).
  4. When the data in the service provider directory changes, ZooKeeper notifies the service consumers that subscribed to it (see the sketch below).
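Here is a hedged sketch of steps 2 through 4 using the raw ZooKeeper client API. The connection string, timeout, and provider address are placeholders, error handling and reconnection are omitted, and the parent paths are assumed to have been created already by the management end (step 1):

```java
// Sketch only: no error handling, reconnect logic, or session management.
import java.util.List;
import org.apache.zookeeper.*;

public class ZkDiscoverySketch {
    private static final String PROVIDER_DIR = "/service/com.demo.xxService/provider";

    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 15000, event -> { });

        // Step 2: the provider registers itself as an EPHEMERAL node, so the
        // entry disappears automatically when the provider's session dies.
        zk.create(PROVIDER_DIR + "/192.168.0.1:8080", new byte[0],
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // Steps 3-4: the consumer lists providers and leaves a watch.
        watchProviders(zk);
    }

    static void watchProviders(ZooKeeper zk) throws Exception {
        // ZooKeeper watches fire only once, so on each change we re-arm the
        // watch by fetching the children (and the notification) again.
        List<String> providers = zk.getChildren(PROVIDER_DIR, event -> {
            if (event.getType() == Watcher.Event.EventType.NodeChildrenChanged) {
                try {
                    watchProviders(zk);
                } catch (Exception ignored) { }
            }
        });
        System.out.println("current providers: " + providers);
    }
}
```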

The service discovery our technical team implemented in the early days was based on ZooKeeper, and it ran smoothly for over a year. However, as the team’s adoption of microservices deepened, the overall pressure on the ZooKeeper cluster kept growing. This became especially apparent during concentrated, large-scale deployments: a large number of service nodes would initiate registration operations at the same time, the CPU usage of the ZooKeeper cluster would suddenly spike, and the cluster would stop working. We could not even restart the ZooKeeper cluster immediately, and business could only resume after the cluster had been restored.

Our investigation showed that the root cause was a performance limitation of ZooKeeper itself: when a large number of nodes are connected, the frequency of reads and writes is high, and the number of directories stored in ZooKeeper exceeds a certain threshold, ZooKeeper becomes unstable, its CPU usage keeps climbing, and it ultimately crashes. After the crash, because the business nodes keep sending read and write requests, ZooKeeper cannot handle the instantaneous load as soon as it restarts, and it crashes again.

This “incident” made us realize that the performance of the ZooKeeper cluster is clearly unable to support our existing scale of service clusters. We need to rethink our service discovery solution.

Message Bus-Based Eventually Consistent Registry #

We know that one of ZooKeeper’s key characteristics is strong consistency. Whenever data is updated on any node of the ZooKeeper cluster, the other nodes must be updated along with it; ZooKeeper guarantees that the data on every node is complete and consistent in real time, and this directly drags down the performance of the cluster. It is like a relay game in which the next round can only begin once everyone has received the item in the current round, rather than each person moving on as soon as they themselves have received it.

On the other hand, in the service discovery of an RPC framework, it is acceptable for the service consumer to discover a newly deployed node only after a short delay, say a few seconds. After all, in the first few seconds (or longer) after a service node comes online, it carries no request traffic, so the delay has no impact on the service cluster as a whole. Therefore, we can sacrifice strong consistency, giving up the “C” of a CP system, and settle for eventual consistency (AP) in exchange for the performance and stability of the whole registry cluster.

Is there a simple, efficient, and eventually consistent update mechanism that can replace ZooKeeper’s strongly consistent data update mechanism?

Because only eventual consistency is required, we can consider a message bus mechanism. The registration data is fully cached in the memory of every registry node, and the nodes synchronize through the message bus: when a registry node receives a service registration, it produces a message and pushes it onto the bus, and the bus notifies the other registry nodes to update their data and distribute the service. This ensures eventual consistency of the data among registry nodes. The specific process is shown in the diagram below:

  • When a service comes online, the registry node receives the registration request, and the service list data changes, generating a message that is pushed to the message bus. Each message has an incrementing version.
  • The message bus actively pushes messages to all registry nodes, and the registry nodes also periodically pull messages. Received messages go through a message playback module that accepts only messages whose version is greater than the local version; messages with versions no greater than the local version are discarded, which achieves eventual consistency (see the sketch after this list).
  • Consumers subscribe to get all service instances of specific interfaces from the registry memory and cache them in their own memory.
  • Using a push-pull mode, consumers can promptly obtain incremental changes of service instances and merge them with the cached data in memory.
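Below is a minimal, illustrative sketch of the version-based playback just described. The class and message shapes are assumptions made for this example, not the actual implementation:

```java
// Illustrative sketch of version-based message replay in one registry node.
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class MessagePlayback {
    // Interface name -> provider addresses, fully cached in registry memory.
    private final Map<String, Set<String>> serviceLists = new ConcurrentHashMap<>();
    private long localVersion = 0;

    // One change notification on the bus, carrying an incrementing version.
    record BusMessage(long version, String interfaceName, Set<String> providers) { }

    /** Replay one message from the bus; returns false if it was discarded. */
    synchronized boolean replay(BusMessage msg) {
        // Accept only versions strictly greater than the local one; older or
        // duplicate messages are dropped. The periodic pull fills in any gaps,
        // so all registry nodes eventually converge on the same data.
        if (msg.version() <= localVersion) {
            return false;
        }
        serviceLists.put(msg.interfaceName(), msg.providers());
        localVersion = msg.version();
        return true;
    }
}
```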

To optimize performance, a two-level cache is used here: one in the registry’s memory and one in the consumer’s. Eventual consistency is ensured through the asynchronous push-pull mode.

Furthermore, you may wonder: since the service consumer may not have received the latest service list, it might send a request to a node that has already gone offline or no longer provides the specified interface. Is that a problem? We handle it inside the RPC framework. When the service consumer sends a request, the target node first performs a validity check; if the specified interface is not served there, or the node is shutting down, the request is rejected. On receiving the rejection exception, the service consumer can safely retry on another node.
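A hedged sketch of this safety net, with hypothetical names and the real transport omitted: the provider validates before executing, and because a rejected request never started executing, the consumer can retry it on the next cached node without side effects:

```java
// Sketch only; Node, handle and invokeWithFailover are illustrative names.
import java.util.List;
import java.util.Set;

class FailoverSketch {

    static class NodeRejectedException extends RuntimeException { }

    // What one provider node exposes to callers (real transport omitted).
    interface Node { String call(String iface, String payload); }

    // Provider side: validate before executing. If the interface is not
    // (or no longer) exported here, reject instead of processing the call.
    static String handle(Set<String> exportedInterfaces, String iface, String payload) {
        if (!exportedInterfaces.contains(iface)) {
            throw new NodeRejectedException();
        }
        return "ok:" + payload;
    }

    // Consumer side: a rejected request never began executing, so retrying
    // the same call on the next node from the local cache is safe.
    static String invokeWithFailover(List<Node> cachedNodes, String iface, String payload) {
        RuntimeException last = new IllegalStateException("no nodes available");
        for (Node node : cachedNodes) {
            try {
                return node.call(iface, payload);
            } catch (NodeRejectedException e) {
                last = e; // stale node in the cache: try the next one
            }
        }
        throw last;
    }
}
```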

Through the message bus approach, we can propagate data changes among the registry nodes, guarantee eventual consistency of the data, and promptly trigger the registry’s service distribution operations. Once you have spent some time in the RPC field, you will realize that the nature of service discovery lets us sacrifice strong consistency and prioritize the robustness of the system when designing service discovery for large-scale clusters. Eventual consistency is, in fact, the more commonly used strategy in distributed system design.

Summary #

Today, I shared the service discovery mechanism of RPC frameworks and how to implement “service discovery” with ZooKeeper. I also discussed the issues ZooKeeper may face as a registry in a large-scale cluster.

Usually, we can use ZooKeeper, etcd, or distributed caches (like Hazelcast) to solve the problem of event notification. However, when the cluster reaches a certain scale, the ZooKeeper cluster or etcd cluster that we rely on may become unstable and fail to meet our needs.

In a large-scale service cluster, the registration center faces challenges when a large number of service nodes come online or go offline simultaneously. The registration center cluster receives a lot of service change requests, and the data between nodes becomes inconsistent, leading to the following problems:

  • High load on the registration center
  • Inconsistent data among nodes
  • Delayed or incorrect service node list distribution

The consistency of the registry’s service data, on which the RPC framework depends, does not have to satisfy the “C” in the CAP theorem; satisfying “AP” is enough. We use a “message bus” notification mechanism to ensure eventual consistency of the registry data and solve these problems.

Furthermore, many concepts discussed today can be applied not only to “service discovery” in RPC frameworks. For example, the incremental update approach for pushing service node data improves the efficiency of “service distribution” in the registration center. This approach can also be used in other areas, such as a unified configuration center, to enhance the efficiency of distributing configurations.

After-class Reflection #

Currently, once a service provider comes online, it automatically registers with the registry, the service consumer automatically detects the new instances, and traffic quickly reaches them. If I want to steer traffic away from certain provider instances without taking them offline, can you think of a convenient way to do it?

Please leave a comment and share your thoughts and doubts with me. You are also welcome to share this article with your friends and invite them to join the study. See you in the next class!