02 Protocol Design - How to design a protocol that is extensible and backwards compatible #

Hello, I am He Xiaofeng. In my previous talk, I covered the principle behind Remote Procedure Call (RPC): it lets us call a remote function as if it were a local one, hiding the complexity of remote invocation so that we can build distributed systems conveniently. In short, the key is transparency.

Building upon the previous talk, let’s now talk about RPC protocols.

When it comes to protocols, the first things that come to mind may be TCP, UDP, and so on. I find the implementation of these network transport protocols somewhat difficult to understand. RPC uses them too, but they are mostly transparent to higher-level applications, so we don’t need to focus on their details when using RPC. So what is the RPC protocol that I’m going to talk about today?

Let me give you an example, and you will understand immediately. Are you familiar with the HTTP protocol (HTTP in this talk refers to version 1.x)? It is probably the protocol we use most often in our daily work: we use it every time we open a web page in a browser. So what is the relationship between HTTP and the RPC protocol? At first glance they seem unrelated, but they have one thing in common: both are application layer protocols.

Therefore, the RPC protocol I am going to talk about today is an application layer protocol. Let’s start with HTTP and take a look at its protocol format. Recall what happens when we enter a URL in the browser. Setting DNS resolution aside for now, the browser wraps the URL in a request and sends it to the IP address obtained from DNS resolution. With the help of a packet capture tool, we can capture the request packets, as shown in the following image:
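To make the shape of such a request concrete, here is a minimal sketch that builds an HTTP/1.1 GET request by hand. The host `example.com`, the path, the header set, and the helper name `build_get_request` are placeholders chosen for illustration, not values taken from a real capture.

```python
def build_get_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, target, version
        f"Host: {host}",          # header fields, one per line
        "Accept: */*",
        "Connection: close",
        "",                       # an empty line ends the header section
        "",
    ]
    # Every line is terminated by CRLF ("\r\n").
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("example.com", "/index.html")
print(request.decode("ascii"))
```

Note that everything here is text, and that the end of the header section is marked by an empty line. That marker is one simple form of the message boundary discussed in the next section.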

The Role of Protocols #

After reading about the HTTP protocol, you may have a question: why do we need protocols? Can’t we communicate without them?

We know that only binary data can be transmitted over the network. Therefore, before an RPC request is sent, the caller must convert the method call’s parameters into binary. The binary data is then written to a local socket and sent out onto the network by the network interface card.

However, the binary data of one request is not necessarily delivered to the remote machine as a single unit. Along the way it may be split into several packets, or merged with data from other requests (provided they share the same TCP connection). How the splitting and merging happen depends on system parameter configuration and the TCP window size. The service provider application therefore receives a continuous run of binary data from the TCP channel. So how can it tell which bytes belong to the first request?

It’s like asking you to read an article without punctuation marks. How do you recognize where each sentence ends? It’s simple. We add punctuation marks to complete the sentences.

Similarly, when transmitting data in RPC, in order to accurately “divide sentences,” we must also include “periods” in the data packets sent by the application as boundaries. This allows our receiving application to separate the correct data from the data stream. The period in this data packet is the boundary of the message, used to indicate the end of the request data. For example, the caller sends three messages: AB, CD, EF. Without boundaries, the receiving end may receive messages like ABCDEF or ABC, DEF. This would cause the received semantics to be inconsistent with the sent ones.
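The AB, CD, EF example can be simulated without a real network. The sketch below is only a model: the hypothetical `deliver` function cuts one byte stream into chunks the way successive reads from a TCP connection might, to show that chunk boundaries need not match message boundaries.

```python
def deliver(stream: bytes, chunk_sizes: list[int]) -> list[bytes]:
    """Cut a byte stream into chunks, as successive reads might return them."""
    chunks, offset = [], 0
    for size in chunk_sizes:
        chunks.append(stream[offset:offset + size])
        offset += size
    return chunks


sent = [b"AB", b"CD", b"EF"]
stream = b"".join(sent)  # TCP preserves byte order, not message boundaries

print(deliver(stream, [6]))     # [b'ABCDEF']
print(deliver(stream, [3, 3]))  # [b'ABC', b'DEF']
```

In both cases the receiver holds exactly the bytes that were sent, yet it has no way to recover the three original messages from them.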

Therefore, to avoid inconsistencies in semantics, the sender must mark a boundary when sending requests, and the receiver must split the data at that boundary when receiving them. The agreed-upon rules for marking and interpreting this boundary are what we call a protocol.

How to design a protocol? #

After understanding the role of a protocol, let’s take a look at how protocols are designed in RPC. You might ask, “You mentioned earlier that both HTTP and RPC are application layer protocols. Since HTTP already exists, why not use it directly instead of designing a private protocol for RPC?”

To answer this question, let’s start with the purpose of RPC. Compared to HTTP, RPC is used mainly for communication between applications, so it has higher performance requirements. An HTTP message, however, is often much larger than the request data it carries: it is a text protocol whose headers are spelled out as strings and delimited by carriage returns and line feeds, and much of that content is unnecessary for RPC. Another important reason is that HTTP is a stateless protocol, and HTTP 1.x carries no request identifier, so the client cannot match a response to a request except by its order on the connection. In practice a connection handles one outstanding request at a time, and in the simplest case (the HTTP 1.0 default) a new connection is established for each request and closed once the response is complete. Therefore, for an RPC framework that demands high performance, HTTP 1.x is generally unable to meet the requirements. As a result, RPC usually adopts a more compact private protocol.

So how do we design a private RPC protocol?

Before designing the protocol, let’s first organize what content needs to be included in the protocol for RPC communication.

The first thing to consider is the message boundary we mentioned earlier. The size of each RPC request is not fixed, so our protocol must let the receiving side read variable-length content correctly. One approach is to reserve a fixed-length field (such as 4 bytes) that stores the size of the request data. When receiving, we first read the value at that fixed position, which gives the length of the protocol body, and then read exactly that many bytes as the protocol body. The design of the entire protocol can be like this:
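A length-prefixed layout of this kind can be sketched as follows. The 4-byte big-endian length field and the names `encode` and `FrameDecoder` are assumptions made for this illustration; the talk only fixes the idea of a fixed-size length followed by the body.

```python
import struct

# Assumed layout: a 4-byte big-endian unsigned integer holding the body length.
LENGTH_FIELD = struct.Struct(">I")


def encode(body: bytes) -> bytes:
    """Prefix the body with its length."""
    return LENGTH_FIELD.pack(len(body)) + body


class FrameDecoder:
    """Buffers received bytes and returns every body that is complete."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, data: bytes) -> list[bytes]:
        self._buffer += data
        bodies = []
        while len(self._buffer) >= LENGTH_FIELD.size:
            (length,) = LENGTH_FIELD.unpack_from(self._buffer, 0)
            end = LENGTH_FIELD.size + length
            if len(self._buffer) < end:
                break  # the body has not fully arrived yet
            bodies.append(bytes(self._buffer[LENGTH_FIELD.size:end]))
            del self._buffer[:end]  # drop the consumed frame
        return bodies


stream = encode(b"AB") + encode(b"CD") + encode(b"EF")
decoder = FrameDecoder()
received = []
# Feed the stream in chunks that deliberately ignore message boundaries.
for chunk in (stream[:5], stream[5:7], stream[7:]):
    received.extend(decoder.feed(chunk))
print(received)  # [b'AB', b'CD', b'EF']
```

However the stream is split or merged in transit, the receiver recovers the three original messages, because it always knows how many bytes the current message still needs.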

However, the above protocol only achieves correct sentence division and is not yet suitable for RPC. The service provider does not know which serialization method was used to produce the binary data in the protocol body. If it cannot determine the serialization method used by the caller, then even if it splits the stream into messages correctly, it cannot convert the binary back into objects, and so it cannot complete the invocation. Therefore, we need to store the serialization method separately in its own fixed-length field, just like the protocol length. These fixed-length parameters can be collectively called the “protocol header.” The protocol is thus divided into two parts: the protocol header and the protocol body.

In the protocol header, we will include parameters such as the protocol length, serialization method, protocol identifier, message ID, and message type. The protocol body generally contains the request interface method, the values of the request’s business parameters, and some additional attributes. With these considerations, a complete RPC protocol might look like this:
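One possible encoding of such a header and body is sketched below. The field order and widths (a 16-byte header), the magic value `0xCAFE`, the numeric codes, and the use of JSON as the serialization method are all assumptions made for this illustration; the talk only names which fields belong in the header and the body.

```python
import json
import struct

# Assumed fixed header, big-endian:
# protocol identifier (2 bytes), body length (4), message type (1),
# serialization method (1), message ID (8).
HEADER = struct.Struct(">HIBBQ")
MAGIC = 0xCAFE            # hypothetical protocol identifier
TYPE_REQUEST = 0          # hypothetical message type code
SERIALIZATION_JSON = 1    # hypothetical serialization code


def encode_request(message_id: int, method: str, args: list) -> bytes:
    """Serialize the body, then prepend the fixed-length header."""
    body = json.dumps(
        {"method": method, "args": args, "attachments": {}}
    ).encode("utf-8")
    header = HEADER.pack(
        MAGIC, len(body), TYPE_REQUEST, SERIALIZATION_JSON, message_id
    )
    return header + body


def decode(frame: bytes) -> dict:
    """Read the header first, then use it to interpret the body."""
    magic, body_length, message_type, serialization, message_id = (
        HEADER.unpack_from(frame, 0)
    )
    if magic != MAGIC:
        raise ValueError("not a message of this protocol")
    if serialization != SERIALIZATION_JSON:
        raise ValueError("unsupported serialization method")
    body = json.loads(frame[HEADER.size:HEADER.size + body_length])
    return {"message_id": message_id, "type": message_type, "body": body}


frame = encode_request(7, "UserService.getUser", [42])
print(HEADER.size)  # 16
print(decode(frame))
```

The receiver learns from the header alone how long the body is and how to deserialize it, which is exactly the information the body-only layout was missing.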

Extensible Protocol #

The protocol described above has a fixed-length header, which means new parameters cannot be added to the header later; doing so would cause compatibility problems in production. To give a specific example, let’s say you design an 88-bit protocol header in which the protocol length occupies 32 bits. Later, to support a new feature, you append 2 bits to the end of the header. After the upgrade, upgraded applications send requests in the new format, but applications that have not been upgraded still read the header as 88 bits. They treat the 2 new bits as the first 2 bits of the protocol body and stop reading 2 bits early, so the last 2 bits of the real body are lost from the message. The protocol body they end up with is therefore wrong.
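The same failure can be reproduced in a few lines. For simplicity this sketch works in whole bytes rather than bits, and the two header layouts are hypothetical: an old header with a body length and a serialization code, and a new one that appends a single extra byte.

```python
import struct

OLD_HEADER = struct.Struct(">IB")   # body length, serialization code
NEW_HEADER = struct.Struct(">IBB")  # same fields plus one new byte at the end


def old_receiver(frame: bytes) -> bytes:
    """A receiver that has not been upgraded: it assumes the old header size."""
    body_length, _serialization = OLD_HEADER.unpack_from(frame, 0)
    return frame[OLD_HEADER.size:OLD_HEADER.size + body_length]


body = b"hello"
old_frame = OLD_HEADER.pack(len(body), 1) + body
new_frame = NEW_HEADER.pack(len(body), 1, 0x7F) + body

print(old_receiver(old_frame))  # b'hello'
print(old_receiver(new_frame))  # b'\x7fhell'
```

The new header byte ends up at the front of the body and the final byte of the real body is cut off. On a real connection that leftover byte would also be misread as the start of the next message.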

You may think, “Can’t I just add the parameters in the variable-length protocol body? Besides, you already mentioned that some additional attributes will be placed in the protocol body.”

Yes, new parameters can be added to the protocol body. But here’s a key point: the content in the protocol body has to be serialized. This means that in order to retrieve the value of your parameter, you must deserialize the entire protocol body. However, in some scenarios, this approach can be costly!

For example, suppose the service provider receives an expired request, meaning that the time at which it receives the request is later than the time the caller sent it plus the configured timeout. There is no need to continue processing such a request; it is better to return a timeout straight away. To implement this, the configured timeout has to be passed in the protocol. If the protocol did not previously include a timeout parameter and we add it to the protocol body, the provider has to deserialize the whole body just to discover that the request has already expired. That is rather heavy, and it clearly increases CPU consumption.
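The sketch below shows the cheaper alternative, with the timeout carried in the header. The header layout, the field names, and JSON as the body format are assumptions for this illustration, and it also assumes that the caller's and provider's clocks are reasonably synchronized.

```python
import json
import struct

# Assumed header: body length (4 bytes), send time in ms (8), timeout in ms (4).
HEADER = struct.Struct(">IQI")


def encode(request: dict, sent_at_ms: int, timeout_ms: int) -> bytes:
    body = json.dumps(request).encode("utf-8")
    return HEADER.pack(len(body), sent_at_ms, timeout_ms) + body


def handle(frame: bytes, now_ms: int) -> str:
    body_length, sent_at_ms, timeout_ms = HEADER.unpack_from(frame, 0)
    if now_ms > sent_at_ms + timeout_ms:
        return "timeout"  # rejected without deserializing the body
    request = json.loads(frame[HEADER.size:HEADER.size + body_length])
    return f"invoke {request['method']}"


frame = encode({"method": "getUser", "args": [42]}, sent_at_ms=1_000, timeout_ms=500)
print(handle(frame, now_ms=1_200))  # invoke getUser
print(handle(frame, now_ms=1_600))  # timeout
```

Reading three fixed-width integers costs almost nothing compared with deserializing an arbitrary request body, which is why this kind of parameter belongs in the header.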

Therefore, to let a protocol be upgraded smoothly, we need to design it for extensibility. The key is to make the protocol header extensible, which means the header’s length can no longer be fixed. To read a variable-length header, the receiver needs a fixed location from which to read the header’s length, so the sender must write the header length at that fixed location. The overall protocol then has three sections: a fixed part, the protocol header content, and the protocol body content. We can still collectively refer to the first two parts as the “protocol header”. The specific protocol is as follows:
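A three-section layout along these lines can be sketched as follows. The field widths, the magic value, and the v1/v2 header contents (with a timeout as the newly added field) are assumptions for this illustration. New header fields are only ever appended, so a receiver reads the fields it knows and uses the header length to skip the rest.

```python
import struct

FIXED = struct.Struct(">HIH")      # protocol identifier, total length, header length
HEADER_V1 = struct.Struct(">BQ")   # serialization code, message ID
HEADER_V2 = struct.Struct(">BQI")  # v1 fields plus a new timeout (ms) at the end
MAGIC = 0xCAFE                     # hypothetical protocol identifier


def encode(header: bytes, body: bytes) -> bytes:
    total = FIXED.size + len(header) + len(body)
    return FIXED.pack(MAGIC, total, len(header)) + header + body


def decode_v1(frame: bytes):
    """A receiver that only knows the v1 header fields."""
    magic, total, header_length = FIXED.unpack_from(frame, 0)
    if magic != MAGIC:
        raise ValueError("not a message of this protocol")
    serialization, message_id = HEADER_V1.unpack_from(frame, FIXED.size)
    body_start = FIXED.size + header_length  # skips any fields it does not know
    return serialization, message_id, frame[body_start:total]


def decode_v2(frame: bytes):
    """An upgraded receiver that also accepts frames from v1 senders."""
    magic, total, header_length = FIXED.unpack_from(frame, 0)
    if magic != MAGIC:
        raise ValueError("not a message of this protocol")
    if header_length >= HEADER_V2.size:
        serialization, message_id, timeout_ms = HEADER_V2.unpack_from(frame, FIXED.size)
    else:
        serialization, message_id = HEADER_V1.unpack_from(frame, FIXED.size)
        timeout_ms = None  # the sender did not supply one
    return serialization, message_id, timeout_ms, frame[FIXED.size + header_length:total]


old_frame = encode(HEADER_V1.pack(1, 7), b"hello")
new_frame = encode(HEADER_V2.pack(1, 7, 500), b"hello")

print(decode_v1(new_frame))  # (1, 7, b'hello')
print(decode_v2(old_frame))  # (1, 7, None, b'hello')
print(decode_v2(new_frame))  # (1, 7, 500, b'hello')
```

Unlike the fixed-length header, the body is intact in every combination of old and new sender and receiver, so applications can be upgraded one at a time.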

Finally, I want to say that designing a simple RPC protocol is not difficult; what’s difficult is designing an “upgradeable” protocol. Not only do we need to stay backward compatible when adding new features, but we also need to keep resource consumption as low as possible. Therefore, the structure of our protocol needs to support extending both the protocol body and the protocol header. The design above is based on my years of production experience, and I consider extensibility crucial. I hope this protocol template can help you avoid some pitfalls.

Summary #

One of the main things that differentiates human beings from other animals is our ability to communicate through language and to use writing to preserve civilization. This allows us to stand on the shoulders of giants and grow. However, to ensure that our recorded words can be understood by others, we must use punctuation marks to separate sentences. Otherwise, the text may be misinterpreted, sometimes with laughable results.

In RPC (Remote Procedure Call), the role of the protocol is similar to the punctuation marks in writing. It serves as the boundary for decomposing application request messages, ensuring that the binary data can be correctly reconstructed into semantic meaning after network transmission, avoiding miscommunication between the caller and the callee.

However, when designing a protocol, we should not only meet current functional needs but also take a longer view. Just as with system architecture, we need to ensure that what we design can be extended easily to support additional functionality.

After-class Reflection #

Alright, that’s all for today’s content. Let’s end with a final reflection question. Today we discussed one of the reasons why RPC doesn’t directly use the HTTP protocol: HTTP 1.x offers no way to associate requests with responses on a shared connection, so a connection handles one request at a time, and in the simplest case a new connection is established for every request and closed after the response completes. That is why we design a private protocol. So, how do we establish the association between requests and responses in RPC?

Feel free to share your thoughts with me in the comments, and also feel free to share this article with your friends and invite them to join the learning. See you in the next class!