08 Out of the Box: Which Common Decoders Does Netty Support?


In the previous two lessons, we introduced the TCP packet splitting/merging problem and how to implement custom protocol encoding and decoding with Netty. As you have seen, Netty already encapsulates the low-level network communication for us, so we only need to extend ChannelHandler to implement custom encoding and decoding logic. Even more conveniently, Netty ships with many out-of-the-box decoders that cover the common solutions to TCP packet splitting/merging. In this lesson, we will go through the commonly used decoders in Netty and explore how to use them.

Before we start this lesson, let's first review the mainstream solutions to TCP packet splitting/merging and summarize the corresponding decoder classes provided by Netty.

### Fixed Length Decoder FixedLengthFrameDecoder

FixedLengthFrameDecoder is very simple. The fixed frame length, frameLength, is set through the constructor. No matter how much data the receiver reads at a time, the decoder strictly splits it by frameLength: once frameLength bytes have accumulated, the decoder treats them as one complete message; if fewer than frameLength bytes are available, it keeps waiting for subsequent packets until a complete message arrives. The following example shows how simple fixed-length decoding is with Netty.

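    import io.netty.bootstrap.ServerBootstrap;
    import io.netty.channel.ChannelFuture;
    import io.netty.channel.ChannelInitializer;
    import io.netty.channel.EventLoopGroup;
    import io.netty.channel.nio.NioEventLoopGroup;
    import io.netty.channel.socket.SocketChannel;
    import io.netty.channel.socket.nio.NioServerSocketChannel;
    import io.netty.handler.codec.FixedLengthFrameDecoder;

    public class EchoServer {

        public void startEchoServer(int port) throws Exception {
            EventLoopGroup bossGroup = new NioEventLoopGroup();
            EventLoopGroup workerGroup = new NioEventLoopGroup();
            try {
                ServerBootstrap b = new ServerBootstrap();
                b.group(bossGroup, workerGroup)
                        .channel(NioServerSocketChannel.class)
                        .childHandler(new ChannelInitializer<SocketChannel>() {
                            @Override
                            public void initChannel(SocketChannel ch) {
                                // Split the inbound byte stream into frames of exactly 10 bytes.
                                ch.pipeline().addLast(new FixedLengthFrameDecoder(10));
                                ch.pipeline().addLast(new EchoServerHandler());
                            }
                        });
                // Bind the port and block until the server channel is closed.
                ChannelFuture f = b.bind(port).sync();
                f.channel().closeFuture().sync();
            } finally {
                bossGroup.shutdownGracefully();
                workerGroup.shutdownGracefully();
            }
        }

        public static void main(String[] args) throws Exception {
            new EchoServer().startEchoServer(8088);
        }
    }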

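    import io.netty.buffer.ByteBuf;
    import io.netty.channel.ChannelHandler.Sharable;
    import io.netty.channel.ChannelHandlerContext;
    import io.netty.channel.ChannelInboundHandlerAdapter;
    import io.netty.util.CharsetUtil;
    import io.netty.util.ReferenceCountUtil;

    @Sharable
    public class EchoServerHandler extends ChannelInboundHandlerAdapter {

        @Override
        public void channelRead(ChannelHandlerContext ctx, Object msg) {
            try {
                System.out.println("Receive client : [" + ((ByteBuf) msg).toString(CharsetUtil.UTF_8) + "]");
            } finally {
                // The frame is not passed further down the pipeline, so release it here.
                ReferenceCountUtil.release(msg);
            }
        }
    }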

In the above server code, a 10-byte fixed-length decoder is used, and the result is printed through EchoServerHandler after decoding. We can start the server and send data to the server through the telnet command, and observe the output of the code.

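Client input:

    telnet localhost 8088
    Trying ::1...
    Connected to localhost.
    Escape character is '^]'.
    1234567890123
    456789012

Server output:

    Receive client : [1234567890]
    Receive client : [123
    45678]

The second frame contains the line break that telnet sends after the first input line, because the decoder counts those bytes like any others. The remaining bytes of the second line stay buffered until 10 bytes have accumulated.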
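To see why the server prints exactly these two frames, the splitting can be reproduced in plain Java. This is only a sketch of the fixed-length idea, not Netty's implementation; the class name FixedLengthFramingSketch is hypothetical, and it assumes that telnet terminates each typed line with "\r\n".

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of fixed-length framing (hypothetical helper, not part of Netty). */
class FixedLengthFramingSketch {

    /** Cuts the stream into complete frames of frameLength bytes; a trailing partial frame is not emitted. */
    static List<String> split(byte[] stream, int frameLength) {
        List<String> frames = new ArrayList<>();
        int offset = 0;
        while (stream.length - offset >= frameLength) {
            byte[] frame = Arrays.copyOfRange(stream, offset, offset + frameLength);
            frames.add(new String(frame, StandardCharsets.US_ASCII));
            offset += frameLength;
        }
        return frames;
    }

    public static void main(String[] args) {
        // Assumption: each telnet input line arrives followed by "\r\n".
        byte[] stream = "1234567890123\r\n456789012\r\n".getBytes(StandardCharsets.US_ASCII);
        for (String frame : split(stream, 10)) {
            // Make the line break visible instead of printing it raw.
            System.out.println("[" + frame.replace("\r", "\\r").replace("\n", "\\n") + "]");
        }
    }
}
```

The 26 input bytes yield two complete 10-byte frames, and the last 6 bytes ("9012" plus the line break) are left over, which matches a decoder that keeps waiting for more data.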

### Delimiter Based Decoder DelimiterBasedFrameDecoder

Before using the DelimiterBasedFrameDecoder, we need to understand the effects of the following properties:

  • delimiters

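delimiters specifies the delimiters that mark the end of a frame. It is an array of ByteBuf, so multiple delimiters can be configured at the same time. When more than one of them is found in the buffer, the decoder picks the delimiter that produces the shortest frame.

For example, suppose the receiver gets the data ABC\nDEF\r\n. If both \n and \r\n are configured as delimiters, the decoder splits at \n first and produces two frames, ABC and DEF. If \r\n is the only delimiter, it produces a single frame, ABC\nDEF.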
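The shortest-frame rule can be sketched in plain Java as follows. This is a simplified illustration on strings rather than ByteBuf, and the class name ShortestFrameDelimiterSketch is hypothetical. Delimiters are removed from the output, which mirrors the decoder's default of stripping them.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration of choosing the delimiter that yields the shortest frame (not Netty code). */
class ShortestFrameDelimiterSketch {

    /** Splits the stream into frames; data after the last delimiter is treated as incomplete and not returned. */
    static List<String> split(String stream, String... delimiters) {
        List<String> frames = new ArrayList<>();
        int start = 0;
        while (true) {
            int bestIndex = -1;
            String bestDelimiter = null;
            for (String delimiter : delimiters) {
                int index = stream.indexOf(delimiter, start);
                // Keep the delimiter whose match ends the frame earliest.
                if (index >= 0 && (bestIndex < 0 || index < bestIndex)) {
                    bestIndex = index;
                    bestDelimiter = delimiter;
                }
            }
            if (bestDelimiter == null) {
                return frames; // no complete frame left
            }
            frames.add(stream.substring(start, bestIndex));
            start = bestIndex + bestDelimiter.length();
        }
    }
}
```

Calling split("ABC\nDEF\r\n", "\n", "\r\n") returns the two frames ABC and DEF, while split("ABC\nDEF\r\n", "\r\n") returns the single frame ABC\nDEF.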

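### Length Field Based Decoder LengthFieldBasedFrameDecoder

Attributes specific to the length field decoder:

  • lengthFieldOffset - offset of the length field, i.e. the position where the length field starts
  • lengthFieldLength - number of bytes occupied by the length field
  • lengthAdjustment - adjustment value that is added to the value of the length field
  • initialBytesToStrip - number of bytes to skip, i.e. the bytes stripped from the front of the decoded ByteBuf

Attributes similar to those of other decoders (such as the delimiter based decoder):

  • maxFrameLength - maximum frame length. A frame longer than this causes a TooLongFrameException.
  • failFast - controls when the TooLongFrameException is thrown: as soon as the decoder knows that the frame exceeds maxFrameLength (true), or only after the whole oversized frame has been read (false).
  • byteOrder - byte order of the length field, either BIG_ENDIAN or LITTLE_ENDIAN.
  • Whether the length value includes the length field itself is not a decoder option. The corresponding flag, lengthIncludesLengthFieldLength, belongs to the LengthFieldPrepender encoder; on the decoder side this case is handled with a negative lengthAdjustment.

Let's learn how to use LengthFieldBasedFrameDecoder with a code example.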

    b.group(bossGroup, workerGroup)
        .channel(NioServerSocketChannel.class)
        .childHandler(new ChannelInitializer<SocketChannel>() {
            @Override
            public void initChannel(SocketChannel ch) {
                // maxFrameLength = 1024, lengthFieldOffset = 0, lengthFieldLength = 4,
                // lengthAdjustment = 0, initialBytesToStrip = 4
                ch.pipeline().addLast(new LengthFieldBasedFrameDecoder(1024, 0, 4, 0, 4));
                ch.pipeline().addLast(new EchoServerHandler());
            }
        });

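To test it, the client has to send the length as a 4-byte big-endian binary integer followed by the payload. Typing 0005hello into telnet does not work: the decoder would read the ASCII characters "0005" as 0x30303035 (808464437), which exceeds maxFrameLength and triggers a TooLongFrameException. One way to send the raw bytes is to pipe printf into nc.

Client input:

    printf '\000\000\000\005hello' | nc localhost 8088

Server output:

    Receive client : [hello]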

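In this example, LengthFieldBasedFrameDecoder decodes the message. lengthFieldOffset is 0, so the length field starts at the beginning of the message. lengthFieldLength is 4, so the length field occupies 4 bytes. lengthAdjustment is 0, so the length value needs no adjustment. initialBytesToStrip is 4, so the first 4 bytes are stripped from the decoded frame. The decoder therefore reads the message length from the received bytes and then extracts the complete message content according to that length.

LengthFieldBasedFrameDecoder handles messages with a length field conveniently and suits most length-based framing scenarios.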
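Note that the length field is read as a binary integer, not as text. The following plain-Java sketch (no Netty dependency; the class name LengthPrefixSketch is hypothetical) shows how a client could build a frame for the (0, 4, 0, 4) configuration above, and what a big-endian reader sees when the first four bytes are ASCII digits instead.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Plain-Java illustration of a 4-byte big-endian length prefix (hypothetical helper, not Netty code). */
class LengthPrefixSketch {

    /** Builds a frame: 4-byte big-endian length followed by the payload bytes. */
    static byte[] encode(String payload) {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        // ByteBuffer is big-endian by default.
        return ByteBuffer.allocate(4 + body.length).putInt(body.length).put(body).array();
    }

    /** Reads the first 4 bytes as a big-endian integer, as a length field decoder would. */
    static int readLength(byte[] frame) {
        return ByteBuffer.wrap(frame).getInt();
    }

    /** Strips the 4-byte length prefix and returns the payload. */
    static String decode(byte[] frame) {
        int length = readLength(frame);
        return new String(frame, 4, length, StandardCharsets.UTF_8);
    }
}
```

Here encode("hello") produces 9 bytes whose length field reads as 5, whereas readLength applied to the ASCII bytes of "0005hello" returns 808464437, far beyond a maxFrameLength of 1024.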

    // Offset of the length field, i.e. the position where the length data starts
    private final int lengthFieldOffset;

    // Number of bytes occupied by the length field
    private final int lengthFieldLength;

    /*
     * Length adjustment value.
     *
     * In more complex protocol designs, the value of the length field may not equal the number
     * of bytes that follow it. It may also count other fields such as the version number, data
     * type, or data status. In that case, lengthAdjustment is used to correct it.
     *
     * lengthAdjustment = number of bytes that follow the length field - value of the length field
     */
    private final int lengthAdjustment;

    // Number of initial bytes to strip from the decoded frame, i.e. where the decoded content starts
    private final int initialBytesToStrip;

    // Offset of the end of the length field: lengthFieldEndOffset = lengthFieldOffset + lengthFieldLength
    private final int lengthFieldEndOffset;

* **Attributes similar to fixed length decoders and specific delimiter decoders.**


    private final int maxFrameLength; // Maximum allowed frame length
    private final boolean failFast; // Whether to throw TooLongFrameException immediately; used together with maxFrameLength
    private boolean discardingTooLongFrame; // Whether the decoder is in discarding mode
    private long tooLongFrameLength; // Length of the oversized frame being discarded
    private long bytesToDiscard; // Number of bytes still to be discarded


Now let's walk through the parameter combinations with concrete examples. The Javadoc in the source code of Netty's LengthFieldBasedFrameDecoder is very detailed and gives 7 scenario examples. Once you understand these examples, you will have a solid grasp of how the LengthFieldBasedFrameDecoder parameters work.

**Example 1: Typical decoding based on message length + message content.**

    BEFORE DECODE (14 bytes)         AFTER DECODE (14 bytes)
    +--------+----------------+      +--------+----------------+
    | Length | Actual Content |----->| Length | Actual Content |
    | 0x000C | "HELLO, WORLD" |      | 0x000C | "HELLO, WORLD" |
    +--------+----------------+      +--------+----------------+


The above protocol is the most basic format: the message contains only the Length and Content fields. The Length field occupies 2 bytes, and its value 0x000C (12) means that the Content occupies 12 bytes. The decoder parameter combination for this protocol is as follows:

- lengthFieldOffset = 0, because the Length field is at the start of the message.
- lengthFieldLength = 2, the length is fixed according to the protocol design.
- lengthAdjustment = 0, the Length field only contains the length of the message, so no adjustment is needed.
- initialBytesToStrip = 0, after decoding, the content is still Length + Content, and no initial bytes need to be skipped.


**Example 2: The decoding result needs to be truncated.**

    BEFORE DECODE (14 bytes)         AFTER DECODE (12 bytes)
    +--------+----------------+      +----------------+
    | Length | Actual Content |----->| Actual Content |
    | 0x000C | "HELLO, WORLD" |      | "HELLO, WORLD" |
    +--------+----------------+      +----------------+


Example 2 is different from Example 1 in that the decoding result only contains the message content, and the other parts are unchanged. The decoder parameter combination for this protocol is as follows:

- lengthFieldOffset = 0, because the Length field is at the start of the message.
- lengthFieldLength = 2, the length is fixed according to the protocol design.
- lengthAdjustment = 0, the Length field only contains the length of the message, so no adjustment is needed.
- initialBytesToStrip = 2, skip the length of the Length field. After decoding, the ByteBuf only contains the Content field.
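To make the behavior of Example 2 concrete, here is a minimal plain-Java sketch of the same idea on a byte stream: read a 2-byte length, wait if the content is incomplete, and emit only the content. It is a simplified illustration, not Netty's code, and the class name LengthPrefixedStreamSketch is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration of Example 2: 2-byte Length + Content, Length stripped (not Netty code). */
class LengthPrefixedStreamSketch {

    /** Extracts every complete frame from the buffer and leaves incomplete data unread. */
    static List<String> decode(ByteBuffer in) {
        List<String> messages = new ArrayList<>();
        while (in.remaining() >= 2) {
            in.mark();
            int length = in.getShort() & 0xFFFF; // unsigned 2-byte big-endian Length
            if (in.remaining() < length) {
                in.reset(); // half packet: rewind and wait for more bytes
                break;
            }
            byte[] content = new byte[length];
            in.get(content);
            // The 2 Length bytes were consumed but are not part of the output.
            messages.add(new String(content, StandardCharsets.US_ASCII));
        }
        return messages;
    }
}
```

If the buffer holds two complete "HELLO, WORLD" frames followed by a Length field and only the first 3 content bytes, decode returns two messages and leaves the last 5 bytes in the buffer for the next read.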


**Example 3: The Length value covers both the Length field itself and the Content.**

    BEFORE DECODE (14 bytes)         AFTER DECODE (14 bytes)
    +--------+----------------+      +--------+----------------+
    | Length | Actual Content |----->| Length | Actual Content |
    | 0x000E | "HELLO, WORLD" |      | 0x000E | "HELLO, WORLD" |
    +--------+----------------+      +--------+----------------+


Unlike the previous two examples, the Length field in Example 3 counts both the Length field itself and the Content field, so its value is 0x000E (2 + 12 = 14 bytes). Applying lengthAdjustment (-2) to the Length value (14) yields the real length of the Content field (12 bytes). The corresponding decoder parameter combination is as follows:

- lengthFieldOffset = 0, because the Length field is at the start of the message.
- lengthFieldLength = 2, the length is fixed according to the protocol design.
- lengthAdjustment = -2, the Length value is 14, and subtracting 2 gives the number of bytes that follow the Length field.
- initialBytesToStrip = 0, after decoding, the content is still Length + Content, and no initial bytes need to be skipped.

**Example 4: Decoding based on length field offset.**

    BEFORE DECODE (17 bytes)                      AFTER DECODE (17 bytes)
    +----------+----------+----------------+      +----------+----------+----------------+
    | Header 1 |  Length  | Actual Content |----->| Header 1 |  Length  | Actual Content |
    |  0xCAFE  | 0x00000C | "HELLO, WORLD" |      |  0xCAFE  | 0x00000C | "HELLO, WORLD" |
    +----------+----------+----------------+      +----------+----------+----------------+
    

In Example 4, the Length field is no longer at the beginning of the message. Its value is 0x00000C, indicating that the Content field occupies 12 bytes. The decoder parameter combination for this protocol is as follows:

  * lengthFieldOffset = 2, skip the 2 bytes occupied by Header 1 to get to the start position of the Length field.
  * lengthFieldLength = 3, fixed length specified by the protocol design.
  * lengthAdjustment = 0, the Length field only contains the message length, with no adjustment needed.
  * initialBytesToStrip = 0, the decoded content remains as the complete message, no initial bytes need to be skipped.

**Example 5: Length field and content field are not adjacent.**

    BEFORE DECODE (17 bytes)                      AFTER DECODE (17 bytes)
    +----------+----------+----------------+      +----------+----------+----------------+
    |  Length  | Header 1 | Actual Content |----->|  Length  | Header 1 | Actual Content |
    | 0x00000C |  0xCAFE  | "HELLO, WORLD" |      | 0x00000C |  0xCAFE  | "HELLO, WORLD" |
    +----------+----------+----------------+      +----------+----------+----------------+
    

In Example 5, Header 1 sits between the Length field and the Content field, so the two are no longer adjacent. The Length value (12) counts only the Content, while 14 bytes (Header 1 + Content) follow the Length field, so lengthAdjustment must add the 2 bytes of Header 1. The decoder parameter combination for Example 5 is as follows:

  * lengthFieldOffset = 0, because the Length field is at the beginning of the message.
  * lengthFieldLength = 3, fixed length specified by the protocol design.
  * lengthAdjustment = 2, since Header 1 + Content occupies 2 + 12 = 14 bytes, lengthAdjustment (2 bytes) is added to the Length value (12 bytes) to cover Header 1 + Content (14 bytes).
  * initialBytesToStrip = 0, the decoded content remains the complete message, no initial bytes need to be skipped.

**Example 6: Decoding based on length offset and length adjustment.**

    BEFORE DECODE (16 bytes)                       AFTER DECODE (13 bytes)
    +------+--------+------+----------------+      +------+----------------+
    | HDR1 | Length | HDR2 | Actual Content |----->| HDR2 | Actual Content |
    | 0xCA | 0x000C | 0xFE | "HELLO, WORLD" |      | 0xFE | "HELLO, WORLD" |
    +------+--------+------+----------------+      +------+----------------+
    

In Example 6, the Length field is surrounded by HDR1 and HDR2, each occupying 1 byte. Therefore both lengthFieldOffset and lengthAdjustment have to be set, the latter in the same way as in Example 5. The decoder parameter combination for this protocol is as follows:

  * lengthFieldOffset = 1, skip the 1 byte occupied by HDR1 to get to the start position of the Length field.
  * lengthFieldLength = 2, fixed length specified by the protocol design.
  * lengthAdjustment = 1, since HDR2 + Content occupies 1 + 12 = 13 bytes, lengthAdjustment (1 byte) is added to the Length value (12 bytes) to cover HDR2 + Content (13 bytes).
  * initialBytesToStrip = 3, skip the HDR1 and Length fields after decoding, which together occupy 3 bytes.

**Example 7: Length field includes multiple other fields besides Content.**

    BEFORE DECODE (16 bytes)                       AFTER DECODE (13 bytes)
    +------+--------+------+----------------+      +------+----------------+
    | HDR1 | Length | HDR2 | Actual Content |----->| HDR2 | Actual Content |
    | 0xCA | 0x0010 | 0xFE | "HELLO, WORLD" |      | 0xFE | "HELLO, WORLD" |
    +------+--------+------+----------------+      +------+----------------+
    

Example 7 differs from Example 6 only in that the Length field records the length of the entire message: HDR1, the Length field itself, HDR2, and Content. lengthAdjustment therefore has to be negative so that only HDR2 and Content are counted after the Length field. The following decoder parameter combination can be used:

  * lengthFieldOffset = 1, skip the 1 byte occupied by HDR1 to get to the start position of the Length field.
  * lengthFieldLength = 2, fixed length specified by the protocol design.
  * lengthAdjustment = -3, subtract HDR1 (1 byte) and the length of the Length field itself (2 bytes) from the Length field value (16 bytes) to obtain the HDR2 and Content content (1 + 12 = 13 bytes).
  * initialBytesToStrip = 3, skip HDR1 and Length fields after decoding, both occupying a total of 3 bytes.
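The byte counts in all seven diagrams follow from one piece of arithmetic: the frame length is the end offset of the length field plus the length value plus lengthAdjustment, and the decoded output is that frame minus initialBytesToStrip. The sketch below checks this with plain Java; the class and method names are hypothetical and only the arithmetic is modeled, not the decoder itself.

```java
/** Plain-Java check of the frame-length arithmetic behind the seven examples (not Netty code). */
class LengthFieldMathSketch {

    /** Total size of one frame on the wire, before any bytes are stripped. */
    static int frameLength(int lengthFieldOffset, int lengthFieldLength,
                           int lengthAdjustment, int lengthFieldValue) {
        int lengthFieldEndOffset = lengthFieldOffset + lengthFieldLength;
        return lengthFieldEndOffset + lengthFieldValue + lengthAdjustment;
    }

    /** Size of the decoded frame after initialBytesToStrip bytes are removed from the front. */
    static int decodedLength(int lengthFieldOffset, int lengthFieldLength, int lengthAdjustment,
                             int initialBytesToStrip, int lengthFieldValue) {
        return frameLength(lengthFieldOffset, lengthFieldLength, lengthAdjustment, lengthFieldValue)
                - initialBytesToStrip;
    }

    public static void main(String[] args) {
        // Columns: offset, length, adjustment, strip, value stored in the Length field.
        int[][] examples = {
            {0, 2, 0, 0, 0x0C},  // Example 1
            {0, 2, 0, 2, 0x0C},  // Example 2
            {0, 2, -2, 0, 0x0E}, // Example 3
            {2, 3, 0, 0, 0x0C},  // Example 4
            {0, 3, 2, 0, 0x0C},  // Example 5
            {1, 2, 1, 3, 0x0C},  // Example 6
            {1, 2, -3, 3, 0x10}, // Example 7
        };
        for (int i = 0; i < examples.length; i++) {
            int[] e = examples[i];
            System.out.printf("Example %d: before=%d bytes, after=%d bytes%n", i + 1,
                    frameLength(e[0], e[1], e[2], e[4]),
                    decodedLength(e[0], e[1], e[2], e[3], e[4]));
        }
    }
}
```

For Example 7, for instance, this gives 3 + 16 - 3 = 16 bytes before decoding and 16 - 3 = 13 bytes after, matching the diagram.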

The above 7 examples cover most of the usage scenarios for LengthFieldBasedFrameDecoder. Have you learned how to use it? Finally, here's a small task for you: in the previous lesson, we designed a more general protocol as shown below. How would you use the length field decoder (LengthFieldBasedFrameDecoder) to decode this protocol? Try it out yourself.

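    +-------------------------------------------------------------------------------------------------------+
    | Magic Number 2 bytes | Protocol Version 1 byte | Serialization Algorithm 1 byte | Message Type 1 byte |
    +-------------------------------------------------------------------------------------------------------+
    | State 1 byte | Reserved Fields 4 bytes | Data Length 4 bytes                                          |
    +-------------------------------------------------------------------------------------------------------+
    | Data Content (variable length)                                                                        |
    +-------------------------------------------------------------------------------------------------------+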


### Summary

In this lesson, we introduced three commonly used decoders. We can see the elegance of Netty's design, as it is easy to achieve various functions by adjusting the parameters. Netty also considers robustness comprehensively, with many protective measures against edge cases. Implementing a robust decoder is not easy, as a single parsing error can cause the decoder to continue processing in a corrupted state. If you are using a length-based binary protocol, I recommend using LengthFieldBasedFrameDecoder, as it can meet the requirements of most real-world projects and does not require further custom implementations. I hope you can apply what you have learned in your project development.