07 Secret Code: How to Implement Custom Protocol Communication With Netty #

Since it is network programming, communication protocols are essential. Application layer communication requires the implementation of various network protocols. During the project development process, we need to construct an application layer protocol that meets our own business scenarios. In the previous lesson, we introduced how to use a network protocol to solve the low-level issues of TCP packet splitting and packet sticking. In this lesson, we will continue to discuss how to design an efficient, scalable, and easy-to-maintain custom communication protocol, as well as how to use Netty to implement a custom communication protocol.

Communication Protocol Design #

The so-called protocol is a secret code agreed upon by both communication parties in advance. In TCP network programming, the data packet formats sent by the sender and receiver are binary. The sender converts the object into a binary stream and sends it to the receiver. After receiving the binary data, the receiver needs to know how to parse it into an object. Therefore, the protocol is the foundation for normal communication between both parties.

Currently, there are many common protocols on the market, such as HTTP, HTTPS, JSON-RPC, FTP, IMAP, Protobuf, etc. Common protocols have good compatibility, are easy to maintain, and can achieve seamless integration between heterogeneous systems. If the business scenario and performance requirements are met, it is recommended to use a solution that uses a common protocol. Compared to common protocols, custom protocols have the following advantages.

  • Ultimate Performance: Common communication protocols consider many compatibility factors, which inevitably result in a loss of performance.
  • Scalability: Custom protocols are better suited for expansion than common protocols and can better meet business needs.
  • Security: Common protocols are publicly documented, so their known vulnerabilities are widely exploited. A custom protocol raises the bar because an attacker must first work out its format, although obscurity alone is not a substitute for encryption and authentication.

So how do we design a custom communication protocol? The answer to this question varies, but there are established guidelines for designing communication protocols. Based on practical experience, let’s take a look at the basic elements a complete network protocol should possess.

1. Magic Number #

A magic number is a secret code agreed upon by both communication parties, usually represented by a few fixed bytes. The purpose of the magic number is to prevent anyone from randomly sending data to the server port. When the server receives data, it parses the magic number in the first few fixed bytes and performs a correctness check. If it does not match the agreed magic number, the data is considered illegal, and the connection can be closed directly or other measures can be taken to enhance system security. The concept of a magic number is also reflected in compression algorithms, Java Class files, and other scenarios. For example, Class files store the magic number 0xCAFEBABE at the beginning, and the correctness of the magic number is verified when loading Class files.
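The check described above can be sketched with the JDK's ByteBuffer alone. This is only an illustration under assumptions: the class name MagicNumberCheck and the 2-byte value 0xBABE are hypothetical, and a real Netty decoder would read from a ByteBuf instead.

```java
import java.nio.ByteBuffer;

final class MagicNumberCheck {
    // Hypothetical 2-byte magic value, chosen only for this illustration.
    static final short MAGIC = (short) 0xBABE;

    /** Returns true when the readable bytes start with the agreed magic number. */
    static boolean hasValidMagic(ByteBuffer in) {
        if (in.remaining() < Short.BYTES) {
            return false; // too few bytes to contain the magic number
        }
        // Absolute get: peeks at the value without moving the position.
        return in.getShort(in.position()) == MAGIC;
    }

    public static void main(String[] args) {
        ByteBuffer good = ByteBuffer.allocate(4);
        good.putShort(MAGIC).putShort((short) 1);
        good.flip();
        ByteBuffer bad = ByteBuffer.wrap(new byte[] {0x00, 0x01, 0x02, 0x03});
        System.out.println("good: " + hasValidMagic(good));
        System.out.println("bad: " + hasValidMagic(bad));
    }
}
```

A server would typically run such a check before parsing anything else and close the connection when it fails.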

2. Protocol Version Number #

As business requirements change, the protocol may need to be modified in terms of structure or fields, and different versions of the protocol correspond to different parsing methods. Therefore, in production-level projects, it is strongly recommended to reserve a protocol version number field.

3. Serialization Algorithm #

The serialization algorithm field indicates the method that the data sender should use to convert the requested object into binary, as well as how to convert binary back into an object, such as JSON, Hessian, or Java built-in serialization.

4. Message Type #

In different business scenarios, there may be different types of messages. For example, in an RPC framework, there are types of messages such as request, response, and heartbeat. In an IM instant messaging scenario, there are types of messages such as login, create group chat, send message, receive message, and exit group chat.
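Since the message type travels as a single byte, one common way to model it is an enum that maps each type to a byte code. The sketch below uses the RPC types named above; the numeric codes are assumptions made up for this example.

```java
enum MessageType {
    // The byte codes are hypothetical; both sides only need to agree on them.
    REQUEST((byte) 1),
    RESPONSE((byte) 2),
    HEARTBEAT((byte) 3);

    private final byte code;

    MessageType(byte code) {
        this.code = code;
    }

    /** The byte written into the message type field of the header. */
    byte code() {
        return code;
    }

    /** Maps the byte read from the header back to a message type. */
    static MessageType fromCode(byte code) {
        for (MessageType type : values()) {
            if (type.code == code) {
                return type;
            }
        }
        throw new IllegalArgumentException("Unknown message type: " + code);
    }
}
```

Rejecting unknown codes explicitly keeps a peer that speaks a newer or corrupted protocol from being silently misinterpreted.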

5. Length Field #

The length field represents the length of the request data, and the receiving party obtains a complete message based on the length field.

6. Request Data #

The request data is usually the binary stream obtained after serialization, and the content of each type of request data is different.

7. Status #

The status field indicates whether the request was processed normally and is generally set by the callee. For example, if an RPC invocation fails, the callee can set the status field to an exception status.

8. Reserved Field #

The reserved field is optional. To cope with possible protocol upgrades, several bytes of reserved fields can be reserved for future needs.

Through studying the above basic elements of a protocol, we can obtain a more general protocol example:

+--------------------------------------------------------------------------------------------------------------+
| Magic Number 2 bytes | Protocol Version Number 1 byte | Serialization Algorithm 1 byte | Message Type 1 byte |
+--------------------------------------------------------------------------------------------------------------+
| Status 1 byte | Reserved Field 4 bytes | Data Length 4 bytes                                                 |
+--------------------------------------------------------------------------------------------------------------+
| Data Content (variable length)                                                                               |
+--------------------------------------------------------------------------------------------------------------+
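To make the layout concrete, here is a minimal JDK-only sketch that writes and reads a frame in exactly this field order. The class name ProtocolFrame, the magic value, and the choice to fill the reserved field with zeros are assumptions for illustration. ByteBuffer uses big-endian byte order by default, which is also the default for Netty's ByteBuf.

```java
import java.nio.ByteBuffer;

final class ProtocolFrame {
    // 2 (magic) + 1 (version) + 1 (serializer) + 1 (type) + 1 (status) + 4 (reserved) + 4 (length)
    static final int HEADER_LENGTH = 14;
    static final short MAGIC = (short) 0xBABE; // hypothetical magic number

    /** Builds one complete frame: the 14-byte header followed by the body. */
    static byte[] encode(byte version, byte serializer, byte messageType, byte status, byte[] body) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LENGTH + body.length);
        buf.putShort(MAGIC);
        buf.put(version);
        buf.put(serializer);
        buf.put(messageType);
        buf.put(status);
        buf.putInt(0);           // reserved field, unused in this sketch
        buf.putInt(body.length); // data length
        buf.put(body);
        return buf.array();
    }

    /** Validates the header of a complete frame and returns its body. */
    static byte[] decodeBody(byte[] frame) {
        ByteBuffer buf = ByteBuffer.wrap(frame);
        if (buf.remaining() < HEADER_LENGTH) {
            throw new IllegalArgumentException("Incomplete header");
        }
        if (buf.getShort() != MAGIC) {
            throw new IllegalArgumentException("Bad magic number");
        }
        // Skip version, serializer, message type, status (4 bytes) and reserved (4 bytes).
        buf.position(buf.position() + 8);
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("Invalid data length: " + length);
        }
        byte[] body = new byte[length];
        buf.get(body);
        return body;
    }
}
```

For example, ProtocolFrame.encode((byte) 1, (byte) 1, (byte) 1, (byte) 0, payload) yields an array of payload.length + 14 bytes, and decodeBody returns the original payload.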

How to Implement Custom Protocol Communication in Netty #

After learning how to design a protocol, how do we implement custom communication protocols in Netty? In fact, Netty, as an excellent network communication framework, has provided us with a wealth of abstract base classes for encoding and decoding. These abstract base classes make it easier for us to extend and implement custom protocols based on them.

First, let’s take a look at how encoders and decoders are classified in Netty.

Common Encoder Types in Netty:

  • MessageToByteEncoder: Encodes objects into byte streams.
  • MessageToMessageEncoder: Encodes one type of message into another type of message.

Common Decoder Types in Netty:

  • ByteToMessageDecoder/ReplayingDecoder: Decodes byte streams into message objects.

  • MessageToMessageDecoder: Decodes one type of message into another type of message.

Codecs can be divided into one-time (first-pass) and two-time (second-pass) codecs. A one-time decoder solves the TCP packet splitting/sticking problem: it parses the byte stream according to the protocol and yields the bytes of one complete message. If you need to convert those bytes into an object model, you need a two-time decoder. Encoding is the same process in reverse.

  • One-time encoders/decoders: MessageToByteEncoder/ByteToMessageDecoder.

  • Two-time encoders/decoders: MessageToMessageEncoder/MessageToMessageDecoder.

Now let’s have a detailed introduction of the commonly used abstract encoding and decoding classes in Netty.

Abstract Encoding Classes #

[Figure: inheritance diagram of the abstract encoding classes]

From the inheritance diagram of the abstract encoding classes, we can see that they are abstract implementations of ChannelOutboundHandler and operate specifically on outbound data.

  • MessageToByteEncoder

MessageToByteEncoder is used to encode objects into byte streams. It declares a single abstract method, encode, so implementing encode is all that is needed for custom encoding. So when is the encode() method called? Let’s take a look at the core source code fragment of MessageToByteEncoder, as shown below.

@Override
public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) throws Exception {
    ByteBuf buf = null;
    try {
        if (acceptOutboundMessage(msg)) { // 1. Is the message type a match?
            @SuppressWarnings("unchecked")
            I cast = (I) msg;
            buf = allocateBuffer(ctx, cast, preferDirect); // 2. Allocate ByteBuf resources
            try {
                encode(ctx, cast, buf); // 3. Execute the encode method to complete data encoding
            } finally {
                ReferenceCountUtil.release(cast);
            }
            if (buf.isReadable()) {
                ctx.write(buf, promise); // 4. Pass the write event forward
            } else {
                buf.release();
                ctx.write(Unpooled.EMPTY_BUFFER, promise);
            }
            buf = null;
        } else {
            ctx.write(msg, promise);
        }
    } catch (EncoderException e) {
        throw e;
    } catch (Throwable e) {
        throw new EncoderException(e);
    } finally {
        if (buf != null) {
            buf.release();
        }
    }
}

The MessageToByteEncoder class overrides the write() method of ChannelOutboundHandler, and its main logic consists of the following steps:

  1. acceptOutboundMessage checks whether the message type matches. If it does, the encoding process runs. If not, the message is passed directly to the next ChannelOutboundHandler.
  2. A ByteBuf is allocated, using off-heap (direct) memory by default.
  3. The encode method implemented by the subclass is called to encode the data. When encode returns, whether it succeeded or threw, the original message is released by ReferenceCountUtil.release(cast) in the finally block.
  4. If the ByteBuf is readable, data was encoded, and the ByteBuf is written to the ChannelHandlerContext so that it reaches the next node in the pipeline. If the ByteBuf is not readable, it is released and an empty ByteBuf is passed on instead.

The encoder implementation is very simple and does not need to consider the problem of splitting or sticking packets. The following example shows how to write a string into a ByteBuf instance, which will be passed to the next ChannelOutboundHandler in the ChannelPipeline:

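import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.MessageToByteEncoder;
import java.nio.charset.StandardCharsets;

public class StringToByteEncoder extends MessageToByteEncoder<String> {

    @Override
    protected void encode(ChannelHandlerContext channelHandlerContext, String data, ByteBuf byteBuf) throws Exception {
        // Use an explicit charset so both ends agree regardless of the platform default
        byteBuf.writeBytes(data.getBytes(StandardCharsets.UTF_8));
    }
}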

  • MessageToMessageEncoder

MessageToMessageEncoder is similar to MessageToByteEncoder; it also only needs to implement the encode method. The difference between MessageToMessageEncoder and MessageToByteEncoder is that MessageToMessageEncoder converts one format of message to another format of message. The second Message can be any object, and if that object is of type ByteBuf, the implementation principle is basically the same as MessageToByteEncoder. Additionally, the output of MessageToMessageEncoder is a list of objects, and the encoded result belongs to the intermediate object, which will eventually be converted into a ByteBuf for transmission.

Some commonly used implementation subclasses of MessageToMessageEncoder are StringEncoder, LineEncoder, Base64Encoder, etc. Take StringEncoder as an example to see how MessageToMessageEncoder is used. The source code example below converts CharSequence types (such as String, StringBuilder, StringBuffer, etc.) into ByteBuf types. Combined with StringDecoder, it can directly implement the encoding and decoding of string-type data.

@Override
protected void encode(ChannelHandlerContext ctx, CharSequence msg, List<Object> out) throws Exception {
    if (msg.length() == 0) {
        return;
    }
    out.add(ByteBufUtil.encodeString(ctx.alloc(), CharBuffer.wrap(msg), charset));
}

Abstract decoding classes #

Similarly, let’s first look at the inheritance diagram of the abstract decoding classes, which implement ChannelInboundHandler. Decoders operate on inbound data. Decoders are much more difficult to implement than encoders because they need to handle packet splitting and sticking. Since the receiver may not have received a complete message yet, the decoding framework needs to buffer the inbound data until a complete message is available.

[Figure: inheritance diagram of the abstract decoding classes]

Abstract decoding class ByteToMessageDecoder

First, let’s take a look at the abstract methods defined by ByteToMessageDecoder:

public abstract class ByteToMessageDecoder extends ChannelInboundHandlerAdapter {
    protected abstract void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception;

    protected void decodeLast(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception {
        if (in.isReadable()) {
            decodeRemovalReentryProtection(ctx, in, out);
        }
    }
}

decode() is the abstract method that users must implement. It is called with the received data as a ByteBuf and a List to which the decoded messages are added. Because of TCP sticking or splitting, the ByteBuf may contain several valid messages or only part of one. Netty invokes decode() repeatedly until no new complete message is decoded and added to the List, or until the ByteBuf has no more readable data. If the List is not empty at that point, its contents are passed to the next ChannelInboundHandler in the ChannelPipeline.
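The accumulate-and-retry behavior described above can be imitated without Netty. The sketch below is not Netty's implementation. It is a hypothetical FrameAccumulator that assumes a simplified framing (a 4-byte length prefix followed by the body) and shows how leftover bytes are kept until a complete message arrives, and how several messages in one chunk are all extracted.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class FrameAccumulator {
    // Bytes received so far that do not yet form a complete frame (kept in read mode).
    private ByteBuffer cumulation = ByteBuffer.allocate(0);

    /** Feeds one network chunk and returns every frame body completed by it. */
    List<byte[]> feed(byte[] chunk) {
        // Append the new chunk to whatever was left over from earlier reads.
        ByteBuffer merged = ByteBuffer.allocate(cumulation.remaining() + chunk.length);
        merged.put(cumulation).put(chunk);
        merged.flip();
        cumulation = merged;

        List<byte[]> out = new ArrayList<>();
        // Keep decoding while at least a length prefix is available.
        while (cumulation.remaining() >= Integer.BYTES) {
            cumulation.mark();
            int length = cumulation.getInt();
            if (length < 0) {
                throw new IllegalStateException("Negative frame length: " + length);
            }
            if (cumulation.remaining() < length) {
                cumulation.reset(); // incomplete frame: rewind and wait for more data
                break;
            }
            byte[] body = new byte[length];
            cumulation.get(body);
            out.add(body);
        }
        return out;
    }
}
```

Feeding the first half of a frame returns an empty list, and feeding the rest together with a second frame returns both bodies at once, which mirrors the split and stuck packet cases.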

ByteToMessageDecoder also defines the decodeLast() method. Why does the abstract decoder need it? decodeLast() is called one last time when the Channel becomes inactive (for example, after it is closed) and is mainly used to handle the bytes remaining in the ByteBuf. The default implementation in Netty simply calls the decode() method. If there are special business requirements, you can extend the behavior by overriding decodeLast().

ByteToMessageDecoder also has an abstract subclass called ReplayingDecoder, which encapsulates buffer management. When reading from the buffer, you no longer need to check how many bytes are readable: if there are not enough, ReplayingDecoder aborts the current decode attempt and retries it once more data has arrived. ReplayingDecoder is slower than using ByteToMessageDecoder directly, so it is not recommended in most cases.

Abstract decoding class MessageToMessageDecoder

MessageToMessageDecoder is similar to ByteToMessageDecoder in that it decodes one type of message into another. However, unlike ByteToMessageDecoder, MessageToMessageDecoder does not buffer incoming data and is mainly used for message model conversion. It is recommended to use ByteToMessageDecoder to parse the protocol on top of TCP and handle packet splitting and sticking. Once a complete, valid ByteBuf has been parsed out, it is passed to a subsequent MessageToMessageDecoder for conversion into data objects. The specific process is shown in the following diagram.

[Figure: decoding flow from ByteToMessageDecoder to MessageToMessageDecoder]

Communication Protocol in Practice #

In the section on communication protocol design mentioned above, we mentioned the basic elements of a protocol and provided a relatively general protocol example. Below, we will deepen our understanding of the Netty encoding and decoding framework by implementing the decoder for this protocol.

Before implementing the protocol decoder, we need to understand one thing first: how do we determine whether the ByteBuf contains a complete message? The most common approach is to check the message length, dataLength. If the number of readable bytes in the ByteBuf is smaller than dataLength, the ByteBuf does not yet hold a complete message. In this protocol, the fixed header fields (magic number, protocol version, data length, and so on) occupy 14 bytes in total. The fixed header length together with dataLength is enough to determine whether a message is complete. An example implementation of the decoder is as follows:

@Override
public final void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
    // Check if ByteBuf has enough readable bytes
    if (in.readableBytes() < 14) {
        return;
    }

    in.markReaderIndex(); // Marks the reader index of ByteBuf

    in.skipBytes(2); // Skips the magic number

    in.skipBytes(1); // Skips the protocol version

    byte serializeType = in.readByte();

    in.skipBytes(1); // Skips the message type

    in.skipBytes(1); // Skips the status field

    in.skipBytes(4); // Skips the reserved field

    int dataLength = in.readInt();

    if (in.readableBytes() < dataLength) {
        in.resetReaderIndex(); // Resets the reader index of ByteBuf
        return;
    }

    byte[] data = new byte[dataLength];
    in.readBytes(data);

    SerializeService serializeService = getSerializeServiceByType(serializeType);
    Object obj = serializeService.deserialize(data);

    if (obj != null) {
        out.add(obj);
    }
}
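The decoder above relies on SerializeService and getSerializeServiceByType, which are not shown in this lesson. One possible shape is sketched below purely as an assumption: the interface methods, the JdkSerializeService implementation, the SerializeServices holder class, and the type code 1 are all hypothetical. It uses Java built-in serialization, one of the options mentioned earlier, and wraps checked exceptions so that the call in decode() compiles as written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical contract assumed by the decoder example.
interface SerializeService {
    byte[] serialize(Object obj);
    Object deserialize(byte[] data);
}

// Java built-in serialization. Only deserialize data from peers you trust.
final class JdkSerializeService implements SerializeService {
    @Override
    public byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    @Override
    public Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Deserialization failed", e);
        }
    }
}

final class SerializeServices {
    static final byte JDK = 1; // hypothetical value of the serialization algorithm field

    private static final Map<Byte, SerializeService> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put(JDK, new JdkSerializeService());
    }

    /** Looks up the serializer registered for the byte read from the header. */
    static SerializeService getSerializeServiceByType(byte type) {
        SerializeService service = REGISTRY.get(type);
        if (service == null) {
            throw new IllegalArgumentException("Unsupported serialization type: " + type);
        }
        return service;
    }
}
```

Registering further implementations (for example JSON or Hessian) under their own byte codes would let the same header field select the algorithm per message.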

The ByteBuf API used in the above implementation is not elaborated on in this section. We will delve into ByteBuf in the third chapter of this column.

Summary #

In this lesson, we learned the basic elements of protocol design and how to use Netty to implement custom protocols. Netty provides a set of abstract classes implementing ChannelHandler that are easy to extend, which makes writing custom encoders and decoders in project development straightforward. Finally, we deepened our understanding of encoders and decoders by implementing the decoder for a specific protocol.

Of course, Netty’s work on encoding and decoding goes far beyond this. It also provides a rich set of ready-to-use encoders and decoders. In the next lesson, we will explore practical encoding and decoding techniques together.