17 Dubbo Remoting Layer Core Interface Analysis: This Turns Out to Be a Design Compatible With All NIO Frameworks #

In the second part of this column, we thoroughly introduced the implementation of Dubbo’s registry. Now, let’s move on to the dubbo-remoting module, which provides Dubbo’s client and server communication capabilities. In Dubbo’s overall architectural design, the bottom part highlighted in red in the diagram below is the Remoting layer, which includes the Exchange, Transport, and Serialize sub-layers. The dubbo-remoting module we are about to introduce corresponds mainly to the Exchange and Transport layers.

Drawing 0.png

Dubbo Overall Architectural Design

Rather than implementing a complete networking library, Dubbo uses existing, relatively mature third-party networking libraries such as Netty, Mina, or Grizzly, which are NIO frameworks. Based on the actual scenario and requirements, we can modify the configuration to choose the underlying NIO framework.

The following diagram shows the structure of the dubbo-remoting module, where each submodule corresponds to a third-party NIO framework. For example, the dubbo-remoting-netty4 submodule uses Netty4 to implement remote communication in Dubbo, and the dubbo-remoting-grizzly submodule uses Grizzly.

Drawing 1.png

We have already introduced dubbo-remoting-zookeeper in Lesson 15, when we discussed the implementation of the Zookeeper-based registry center. It uses Apache Curator to interact with Zookeeper.

dubbo-remoting-api Module #

It is worth noting that dubbo-remoting-api is the top-level abstraction of the other dubbo-remoting-* modules. Those submodules implement the abstractions defined in dubbo-remoting-api on top of third-party NIO libraries, as shown in the following diagram:

Drawing 2.png

Let’s first take a look at the abstraction of the entire Remoting layer in the dubbo-remoting-api module. The structure of the dubbo-remoting-api module is shown in the following diagram:

Drawing 3.png

Generally, similar or related classes are placed in the same package, so let’s start by understanding the functions of each package in the dubbo-remoting-api module.

  • The buffer package defines interfaces, abstract classes, and implementations related to buffers. Buffers play an important role in NIO frameworks and each NIO framework has its own buffer implementation. The buffer package abstracts the buffers of various NIO frameworks at a higher level and also provides some basic implementations.
  • The exchange package abstracts the concepts of Request and Response and adds many features to them. This is a very crucial part of the entire remote invocation.
  • The transport package abstracts the network transport layer, but it only handles the abstract transmission of one-way messages. That is, the Client sends the request message and the Server receives it; the Server sends the response message and the Client receives it. Many network libraries can implement the functionality of network transport, such as Netty, Grizzly, etc. The transport package is an abstraction layer on top of these network libraries.
  • Other interfaces such as Endpoint, Channel, Transporter, and Dispatcher are placed in the org.apache.dubbo.remoting package, and these interfaces are the core interfaces of Dubbo Remoting.

Now let’s see how Dubbo abstracts these core interfaces.

Core Interface of the Transport Layer #

In Dubbo, an “Endpoint” concept is abstracted. An Endpoint can be uniquely identified by an IP and port, and a TCP connection is created between two Endpoints to enable bidirectional data transmission. Dubbo abstracts the TCP connections between Endpoints as Channels. The Endpoint initiating the request is abstracted as the Client, and the Endpoint receiving the request is abstracted as the Server. These abstract concepts are also the foundation of the entire dubbo-remoting-api module, which we will introduce one by one.

The Endpoint interface definition in Dubbo is as follows:

Drawing 4.png

As shown in the diagram above, the get*() methods are used to obtain various attributes of the Endpoint, including the local address of the Endpoint, the associated URL information, and the ChannelHandler associated with the underlying Channel. The send() method is responsible for sending data, and we will explain the differences between the two overloaded versions of send() in detail when we discuss the implementation of Endpoint. The two overloaded close() methods and the startClose() method are used to close the underlying Channel, and the isClosed() method is used to check if the underlying Channel has been closed.
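Based on the diagram and the description above, the Endpoint interface can be sketched roughly as follows (a simplified sketch, not a verbatim copy of the source):

public interface Endpoint {

    // getters for the Endpoint's attributes
    URL getUrl();                          // the URL associated with this Endpoint
    ChannelHandler getChannelHandler();    // the ChannelHandler bound to the underlying Channel
    InetSocketAddress getLocalAddress();   // the local address of this Endpoint

    // the two overloaded send() methods for sending data
    void send(Object message) throws RemotingException;
    void send(Object message, boolean sent) throws RemotingException;

    // closing the underlying Channel, immediately, within a timeout, or gracefully
    void close();
    void close(int timeout);
    void startClose();

    // whether the underlying Channel has been closed
    boolean isClosed();
}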

A Channel is an abstraction of the connection between two Endpoints, similar to a conveyor belt connecting two locations, and the messages transmitted by the two Endpoints are like goods on the conveyor belt. The message sender writes messages to the Channel, while the receiver reads messages from the Channel. This is similar to the Channel in Netty mentioned in Lesson 10.

Lark20200922-162359.png

The definition of the Channel interface is shown below, and we can see two points: first, the Channel interface inherits the Endpoint interface, so it can manage its open/closed state and send data; second, key-value (KV) attributes can be attached to a Channel.

Drawing 5.png
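
For reference, a simplified sketch of the Channel interface consistent with the description above might look like this (details omitted):

public interface Channel extends Endpoint {

    InetSocketAddress getRemoteAddress();  // the address of the peer Endpoint
    boolean isConnected();                 // whether the connection is still alive

    // the KV attributes attached to this Channel
    boolean hasAttribute(String key);
    Object getAttribute(String key);
    void setAttribute(String key, Object value);
    void removeAttribute(String key);
}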

A ChannelHandler is a message handler registered on a Channel; Netty has a similar abstraction, which you should already be familiar with. The following diagram shows the definition of the ChannelHandler interface. In a ChannelHandler, you can handle events such as a Channel connection being established or disconnected, handle received data, handle sent data, and handle exceptions. Judging from the method names, which are all in the past tense, the corresponding events have already occurred by the time the handler is invoked.

Drawing 6.png

It should be noted that the ChannelHandler interface is annotated with @SPI, indicating that this interface is an extension point.
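Putting this together, the ChannelHandler interface can be sketched as follows (a simplified sketch consistent with the diagram above):

@SPI
public interface ChannelHandler {

    void connected(Channel channel) throws RemotingException;      // a connection has been established
    void disconnected(Channel channel) throws RemotingException;   // a connection has been closed
    void sent(Channel channel, Object message) throws RemotingException;      // a message has been sent
    void received(Channel channel, Object message) throws RemotingException;  // a message has been received
    void caught(Channel channel, Throwable exception) throws RemotingException; // an exception has occurred
}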

In the earlier lesson introducing Netty, we mentioned a special type of ChannelHandler dedicated to encoding and decoding, that is, converting between raw bytes and meaningful messages. dubbo-remoting-api has a similar abstraction, as shown below:

@SPI
public interface Codec2 {

    @Adaptive({Constants.CODEC_KEY})
    void encode(Channel channel, ChannelBuffer buffer, Object message) throws IOException;

    @Adaptive({Constants.CODEC_KEY})
    Object decode(Channel channel, ChannelBuffer buffer) throws IOException;

    enum DecodeResult {
        NEED_MORE_INPUT, SKIP_SOME_INPUT
    }
}

Here, it is worth noting that the Codec2 interface is annotated with @SPI, indicating that it is an extension interface. In addition, both the encode() and decode() methods are annotated with @Adaptive, which means an adapter class will be generated and the concrete extension implementation will be selected based on the codec value in the URL.

The DecodeResult enum is used to handle packet sticking and splitting during TCP transmission; we also dealt with this issue in the simplified RPC framework built earlier in this column. For example, when the currently readable data is not enough to constitute a complete message, decode() returns the NEED_MORE_INPUT value.
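
As a rough illustration (this is a hypothetical codec, not the actual Dubbo codec implementation), a decode() method might use NEED_MORE_INPUT like this, assuming a fixed-length header for simplicity:

public class FixedHeaderCodec implements Codec2 {   // hypothetical codec, for illustration only

    private static final int HEADER_LENGTH = 16;    // assumed fixed header length

    @Override
    public void encode(Channel channel, ChannelBuffer buffer, Object message) throws IOException {
        // serialize the message into the buffer (omitted in this sketch)
    }

    @Override
    public Object decode(Channel channel, ChannelBuffer buffer) throws IOException {
        if (buffer.readableBytes() < HEADER_LENGTH) {
            // not enough bytes for a complete header yet; tell the caller to wait for more input
            return DecodeResult.NEED_MORE_INPUT;
        }
        // read the header, determine the body length, and perform the same check for the body ...
        return null; // a fully decoded message would be returned here
    }
}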

Next, let’s take a look at the Client and RemotingServer interfaces, which abstract the client and the server respectively. Both of them ultimately inherit the Endpoint and Resetable interfaces, which means both have the ability to send data and to be reset with new configuration.

Drawing 7.png

Both the Client and the Server are essentially Endpoints; they are only distinguished semantically by which side sends requests and which side sends responses. Since both need to send data, both inherit the Endpoint interface. The main difference between them is that a Client is associated with exactly one Channel, whereas a Server accepts multiple Channel connections initiated by Clients. This is why the RemotingServer interface defines methods for querying its Channels, as shown in the diagram below:

Drawing 8.png
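
Based on the description and diagrams above, a simplified sketch of the two interfaces might look like this (some parent interfaces and inherited methods are omitted):

public interface Client extends Endpoint, Channel, Resetable {
    // re-establish the connection to the server
    void reconnect() throws RemotingException;
}

public interface RemotingServer extends Endpoint, Resetable {
    // whether the server is still bound and accepting connections
    boolean isBound();
    // query the Channels created by connected Clients
    Collection<Channel> getChannels();
    Channel getChannel(InetSocketAddress remoteAddress);
}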

Dubbo further encapsulates a layer on top of the Client and Server, called the Transporter interface, which is defined as follows:

@SPI("netty")
public interface Transporter {

    @Adaptive({Constants.SERVER_KEY, Constants.TRANSPORTER_KEY})
    RemotingServer bind(URL url, ChannelHandler handler) throws RemotingException;
@Adaptive({Constants.CLIENT_KEY, Constants.TRANSPORTER_KEY})
public interface Client {

    Client connect(URL url, ChannelHandler handler) throws RemotingException;
}

We can see that the Transporter interface carries the @SPI annotation: it is an extension interface whose default extension name is "netty". The @Adaptive annotations indicate that an adapter class will be generated dynamically; it determines the RemotingServer implementation from the URL's "server" and "transporter" parameters (checked in that order), and the Client implementation from the "client" and "transporter" parameters.
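
As a rough usage sketch (the URL and parameter values here are illustrative, not taken from the source), obtaining the adaptive Transporter and letting it dispatch based on URL parameters looks like this:

// Obtain the dynamically generated Transporter adapter via Dubbo SPI
Transporter transporter = ExtensionLoader.getExtensionLoader(Transporter.class)
        .getAdaptiveExtension();

// The adapter reads the "server"/"transporter" parameters (for bind) or the
// "client"/"transporter" parameters (for connect) from the URL to pick the concrete
// extension; if none are present, it falls back to the @SPI default, "netty".
URL url = URL.valueOf("dubbo://127.0.0.1:20880?transporter=netty4");
ChannelHandler handler = new ChannelHandlerAdapter();   // placeholder handler for illustration
RemotingServer server = transporter.bind(url, handler); // dispatched to the selected extension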

What are the implementations of the Transporter interface? As shown in the figure below, there is one Transporter implementation for each supported NIO library, and they are spread across the various dubbo-remoting-* implementation modules.

Drawing 9.png

Which concrete Client and RemotingServer do these Transporter implementations return? As shown in the figures below, each returns the RemotingServer and Client implementations built on the corresponding NIO library.

Drawing 10.png

Drawing 11.png
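
Taking the Netty-based module as an example, a concrete Transporter implementation is essentially a small factory. A simplified sketch (based on dubbo-remoting-netty4, with constants and details omitted) looks like this:

public class NettyTransporter implements Transporter {

    @Override
    public RemotingServer bind(URL url, ChannelHandler handler) throws RemotingException {
        // create a Netty-based server bound to the address in the URL
        return new NettyServer(url, handler);
    }

    @Override
    public Client connect(URL url, ChannelHandler handler) throws RemotingException {
        // create a Netty-based client connected to the address in the URL
        return new NettyClient(url, handler);
    }
}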

By now, you should have noticed that the interfaces abstracted by the Transporter layer are very similar to Netty’s core interfaces. So why abstract a separate Transporter layer instead of letting the upper layers use Netty directly, as the simplified RPC framework did?

The answer to this question is already apparent. The NIO libraries Netty, Mina, and Grizzly expose different APIs and usage patterns. If the upper layers depended directly on Netty or Grizzly, they would depend on a specific NIO library rather than on an abstraction with transmission capability. Switching to a different implementation later would then require modifying the dependency and the related code, which could easily introduce bugs. That would violate the Open-Closed Principle.

With the Transporter layer, we can modify the specific Transporter extension implementation used through Dubbo SPI, thereby switching to different Client and RemotingServer implementations without modifying any code. Even if a more advanced NIO library appears, we only need to develop the corresponding dubbo-remoting-* implementation module to provide core interfaces such as Transporter, Client, and RemotingServer to integrate it, fully complying with the Open-Closed Principle.

Finally, let’s take a look at a class called Transporters. It is not an interface but a facade class that encapsulates the creation of Transporter objects (through Dubbo SPI) and the handling of ChannelHandler, as shown below:

public class Transporters {

    private Transporters() {
    }

    // other overloaded bind() and connect() methods are omitted

    public static RemotingServer bind(URL url, ChannelHandler... handlers) throws RemotingException {
        ChannelHandler handler;
        if (handlers.length == 1) {
            handler = handlers[0];
        } else {
            handler = new ChannelHandlerDispatcher(handlers);
        }
        return getTransporter().bind(url, handler);
    }

    public static Client connect(URL url, ChannelHandler... handlers) throws RemotingException {
        ChannelHandler handler;
        if (handlers == null || handlers.length == 0) {
            handler = new ChannelHandlerAdapter();
        } else if (handlers.length == 1) {
            handler = handlers[0];
        } else {
            handler = new ChannelHandlerDispatcher(handlers);
        }
        return getTransporter().connect(url, handler);
    }

    public static Transporter getTransporter() {
        // Automatically generate the Transporter adapter and load it
        return ExtensionLoader.getExtensionLoader(Transporter.class)
            .getAdaptiveExtension();
    }
}

When creating Client and RemotingServer, you can specify multiple ChannelHandlers to be bound to the Channel to handle the transmitted data. In the Transporters.connect() and bind() methods, multiple ChannelHandlers are wrapped into a ChannelHandlerDispatcher object.

ChannelHandlerDispatcher is itself an implementation of the ChannelHandler interface. It maintains a CopyOnWriteArraySet of ChannelHandlers, and every method of the ChannelHandler interface it implements simply invokes the corresponding method on each ChannelHandler in that collection. In addition, ChannelHandlerDispatcher provides methods for adding ChannelHandlers to and removing them from the collection.
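
Here is a minimal sketch of the ChannelHandlerDispatcher idea described above (simplified; the actual implementation also logs exceptions thrown by individual handlers and implements every callback):

public class ChannelHandlerDispatcher implements ChannelHandler {

    // the ChannelHandlers that every event will be fanned out to
    private final Collection<ChannelHandler> channelHandlers = new CopyOnWriteArraySet<>();

    public ChannelHandlerDispatcher(ChannelHandler... handlers) {
        if (handlers != null) {
            channelHandlers.addAll(Arrays.asList(handlers));
        }
    }

    // methods for adding and removing handlers
    public ChannelHandlerDispatcher addChannelHandler(ChannelHandler handler) {
        channelHandlers.add(handler);
        return this;
    }

    public ChannelHandlerDispatcher removeChannelHandler(ChannelHandler handler) {
        channelHandlers.remove(handler);
        return this;
    }

    @Override
    public void received(Channel channel, Object message) {
        // invoke the same callback on every registered handler;
        // connected(), disconnected(), sent(), and caught() follow the same pattern
        for (ChannelHandler handler : channelHandlers) {
            try {
                handler.received(channel, message);
            } catch (Throwable t) {
                // swallow here in the sketch; the real implementation logs and moves on
            }
        }
    }

    // ... the other ChannelHandler methods are analogous and omitted here ...
}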

So far, we have completed the introduction of the core interfaces of the Dubbo Transport layer. Here is a summary:

  • The Endpoint interface abstracts the concept of “endpoint”, which is the foundation of all abstract interfaces.
  • The upper layer obtains the specific Transporter extension implementation through the Transporters facade class, and then obtains the corresponding Client and RemotingServer implementations through the Transporter, which can establish (or accept) interaction with the remote end through the Channel.
  • Both the Client and RemotingServer use ChannelHandler to process the transmitted data in the Channel, and the Codec2 interface, responsible for encoding and decoding, is abstracted as well.

The overall architecture is shown in the figure below, which is very similar to the architecture of Netty.

Lark20200922-162354.png

Overall Structure of the Transporter Layer

Summary #

In this lesson, we first introduced the position of the dubbo-remoting module in the Dubbo architecture and the structure of the dubbo-remoting module. Then, we analyzed the dependency relationship between various submodules of the dubbo-remoting module and explained the core functions of each package in the dubbo-remoting-api submodule. Finally, we further analyzed the core interfaces of the entire Transport layer and the Transporter architecture abstracted from these interfaces.

If you have any questions or ideas about this lesson, please feel free to share them with me.