15 Case Analysis: From BIO to NIO and Then to AIO #

Netty’s high-performance architecture is designed based on a network programming design pattern called Reactor. Nowadays, most I/O-related components use the Reactor model, including Tomcat, Redis, and Nginx, which demonstrates the widespread application of Reactor.

Reactor and NIO go hand in hand. Why can NIO achieve higher performance than traditional blocking I/O? Let’s first look at some characteristics of traditional blocking I/O.

Blocking I/O Model #

Drawing 1.png

In the above diagram, we can see a typical BIO model. Whenever a connection arrives, it is handled by a coordinator which then opens a corresponding thread to take over the connection. If there are 1000 connections, it would require 1000 threads.

Threads are expensive resources, as they consume a large amount of memory and CPU scheduling time. Therefore, when there are a large number of connections, the efficiency of BIO becomes very low.

The following code is a simple socket server implemented using ServerSocket, listening on port 8888.

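import java.io.PrintStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Scanner;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BIO {
    // volatile: written by the accept loop, read by the worker threads
    static volatile boolean stop = false;

    public static void main(String[] args) throws Exception {
        int connectionNum = 0;
        int port = 8888;
        ExecutorService service = Executors.newCachedThreadPool();
        ServerSocket serverSocket = new ServerSocket(port);
        // For the demo, accept 10 connections and then stop accepting
        while (connectionNum < 10) {
            // Blocks until a client connects
            Socket socket = serverSocket.accept();
            connectionNum++;
            // One thread per connection, held for the lifetime of the connection
            service.execute(() -> {
                try (Socket s = socket;
                     Scanner scanner = new Scanner(s.getInputStream());
                     PrintStream printStream = new PrintStream(s.getOutputStream())) {
                    // hasNext() blocks for input and returns false once the client disconnects
                    while (!stop && scanner.hasNext()) {
                        String token = scanner.next().trim();
                        printStream.println("PONG:" + token);
                    }
                } catch (Exception ex) {
                    ex.printStackTrace();
                }
            });
        }
        stop = true;
        service.shutdown();
        serverSocket.close();
    }
}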

After starting the server, we can use the nc command to test the connection, and the result is as follows.

$ nc -v localhost 8888
Connection to localhost port 8888 [tcp/ddi-tcp-1] succeeded!
hello
PONG:hello
nice
PONG:nice

Using JMC (Java Mission Control), introduced in “04 | Practical Tools: How to Get Code Performance Data?”, we can start a recording and then open several connections. The recording shows multiple threads running, one for each connection.

Drawing 2.png

We can see that I/O operations in BIO are blocking, and that a thread lives exactly as long as its connection; it cannot be reused for other connections in the meantime.

For a single connection, blocking I/O is no slower than NIO. However, in terms of scheduling and resource utilization across the entire server, NIO has significant advantages and is well suited to high-concurrency scenarios.

Non-blocking I/O Model #

In fact, most of the time in an I/O operation is spent waiting. Establishing a socket connection, for example, takes a long time; during that time the thread uses almost no CPU but can do nothing except block and wait. System resources are therefore poorly utilized.

Java’s NIO (officially “New I/O”, though usually read as non-blocking I/O) uses epoll as its underlying implementation on Linux. epoll is a high-performance I/O multiplexing facility that improves on select and poll. Understanding epoll is an almost unavoidable question in network-programming interviews.

epoll’s data structures are maintained directly by the kernel. With epoll_create and epoll_ctl, an application builds the set of descriptors (fds) and the events it is interested in; epoll_wait then returns the ones that are ready.

Here are two important concepts:

  • fd: every connection or open file corresponds to a file descriptor, a small integer that indexes the process’s table of open files (it is not a port number). The kernel uses the fd to locate the connection.
  • event: when the resource behind an fd changes state or receives data, the kernel updates its epoll entry for that fd (struct epitem in the Linux source). While nothing changes, the thread waiting on epoll is suspended and consumes no CPU. As soon as new events arrive, it is woken up and the events are delivered to the application.
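The register-then-wait cycle described above can be tried from Java without a network client. The following is a minimal sketch, assuming a `java.nio.channels.Pipe` as a stand-in for a socket; the class name `SelectorLifecycle` and the method `demo` are hypothetical. On Linux the JDK's default `Selector` is backed by epoll, so `Selector.open()`, `register(...)` and `select(...)` correspond roughly to epoll_create, epoll_ctl and epoll_wait.

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class SelectorLifecycle {
    /** Returns {ready keys before the write, ready keys after the write}. */
    static int[] demo() throws Exception {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                // Declare interest in read events on this descriptor
                source.register(selector, SelectionKey.OP_READ);

                // Nothing has been written yet, so no key is ready
                int before = selector.selectNow();

                sink.write(ByteBuffer.wrap("ping".getBytes()));
                // Sleeps until an event arrives (or the 1 s timeout expires)
                int after = selector.select(1000);
                return new int[] {before, after};
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] r = demo();
        System.out.println("ready before write: " + r[0] + ", after write: " + r[1]);
    }
}
```

The waiting thread does no work between the two calls; it is the arrival of data on the fd that makes `select` return.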

A common follow-up interview question is: what improvements does epoll have over select?

You can answer this way:

  • Unlike select, epoll does not need to pass the whole fd set from user space to kernel space on every call; the fds are registered once with epoll_ctl.
  • epoll_wait returns only the ready fds, so its cost depends on the number of ready events, while select must scan all n monitored fds, which is O(n).
  • select is limited to FD_SETSIZE fds (typically 1024), while epoll has no fixed limit of its own; it is bounded by the process’s open-file limit and by available memory.
  • select detects ready events by polling every fd, while epoll uses a notification mechanism that places an fd on a ready list when its state changes, which is more efficient.

Let’s take Java’s NIO code as an example again to understand the specific concepts of NIO.

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class NIO {
    static boolean stop = false;

    public static void main(String[] args) throws Exception {
        int connectionNum = 0;
        int port = 8888;

        ServerSocketChannel ssc = ServerSocketChannel.open();
        ssc.configureBlocking(false);
        ssc.socket().bind(new InetSocketAddress("localhost", port));

        Selector selector = Selector.open();
        // For a ServerSocketChannel, validOps() is just OP_ACCEPT
        ssc.register(selector, ssc.validOps());

        while (!stop) {
            if (10 == connectionNum) {
                stop = true;
            }
            // Blocks until at least one registered channel is ready
            int num = selector.select();
            if (num == 0) {
                continue;
            }
            Iterator<SelectionKey> events = selector.selectedKeys().iterator();
            while (events.hasNext()) {
                SelectionKey event = events.next();

                if (event.isAcceptable()) {
                    SocketChannel sc = ssc.accept();
                    if (sc != null) {
                        sc.configureBlocking(false);
                        sc.register(selector, SelectionKey.OP_READ);
                        connectionNum++;
                    }
                } else if (event.isReadable()) {
                    SocketChannel sc = (SocketChannel) event.channel();
                    try {
                        ByteBuffer buf = ByteBuffer.allocate(1024);
                        int size = sc.read(buf);
                        if (size == -1) {
                            // The client closed the connection; closing the channel cancels its key
                            sc.close();
                        } else if (size > 0) {
                            // Decode only the bytes that were actually read
                            String result = new String(buf.array(), 0, size, StandardCharsets.UTF_8).trim();
                            ByteBuffer wrap = ByteBuffer.wrap(("PONG:" + result + "\n").getBytes(StandardCharsets.UTF_8));
                            sc.write(wrap);
                        }
                    } catch (Exception ex) {
                        ex.printStackTrace();
                        sc.close();
                    }
                } else if (event.isWritable()) {
                    // Not reached in this example: OP_WRITE is never registered
                }

                events.remove();
            }
        }
        selector.close();
        ssc.close();
    }
}

The above code is fairly long; it implements the same functionality as the BIO example using NIO. Its API design clearly reflects epoll.

First, we create a server channel ssc and open a new event selector to listen for its OP_ACCEPT event.

ServerSocketChannel ssc = ServerSocketChannel.open();
Selector selector = Selector.open();
ssc.register(selector, ssc.validOps());

There are 4 types of events in total:

  • New connection event (OP_ACCEPT)
  • Connection ready event (OP_CONNECT)
  • Read ready event (OP_READ)
  • Write ready event (OP_WRITE)

Network operations on selectable channels can be abstracted into these four events. (File channels are not selectable and cannot be registered with a Selector.)

Drawing 3.png

Next, in the while loop, we use the select() function to block in the main thread. Blocked means that the operating system no longer allocates CPU time slices to the current thread, so the select() function consumes almost no system resources.

int num = selector.select();

Once a new event arrives, such as a new connection, the main thread can be scheduled and the program can continue execution. At this point, we can continue to receive the subscribed events based on the event notifications. Since there may be multiple connections and events registered with the selector, there are also multiple events. We use a safe iterator loop to process them, and after processing, we remove them.

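Here’s a question: What will happen if the events are not removed or if a certain event’s processing is missed?

Iterator<SelectionKey> events = selector.selectedKeys().iterator();
while (events.hasNext()) {
    SelectionKey event = events.next();
    ...
    events.remove();
}

The Selector never removes keys from the selected-key set by itself. A key that is not removed is handed to the loop again after the next select(), even if its channel is no longer ready; accept() on the server channel would then return null, for example. Conversely, if a ready event is not handled (say, the data is never read), the channel stays ready, select() returns immediately every time, and the loop spins at full CPU.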
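The effect of skipping `remove()` can be observed directly. The sketch below is a hypothetical demo (the class name `StaleKeyDemo` is not from the original), again assuming a `Pipe` in place of a socket: it handles a read event without removing the key, then checks what the selector reports afterwards.

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class StaleKeyDemo {
    /** Returns {keys updated by the second select, size of the selected-key set afterwards}. */
    static int[] demo() throws Exception {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);

                sink.write(ByteBuffer.wrap("ping".getBytes()));
                // The key enters the selected-key set
                selector.select(1000);

                // Handle the event (drain the pipe) but deliberately skip remove()
                source.read(ByteBuffer.allocate(16));

                // Nothing new is ready, yet the old key is still in the set
                int updated = selector.selectNow();
                int stillSelected = selector.selectedKeys().size();
                return new int[] {updated, stillSelected};
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] r = demo();
        System.out.println("updated=" + r[0] + ", still in selected set=" + r[1]);
    }
}
```

A loop that iterates over `selectedKeys()` at this point would process a key whose channel has nothing to read, which is why the example removes every key it has handled.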

When a new connection arrives, we subscribe to more events. For data reading, the corresponding event is OP_READ. Unlike BIO programming, where data is handled in a stream-oriented manner, NIO operates on abstract concepts called Channels and exchanges data through buffers.

SocketChannel sc = ssc.accept();
sc.configureBlocking(false);
sc.register(selector, SelectionKey.OP_READ);

It is worth noting that the server-side and client-side implementations can be different. For example, the server can use NIO while the client can use BIO, as there are no strict requirements.

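Another event that is often asked about in interviews is OP_WRITE. As mentioned above, this event indicates write readiness, and it keeps firing as long as the socket’s send buffer has free space, which is almost always. Registering for it permanently makes select() return immediately on every call and wastes CPU, so we generally do not register for OP_WRITE. It is registered only temporarily, when a write could not be completed because the send buffer was full, and removed again once the remaining data has been sent.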
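One way to apply this rule is to toggle the interest set around each write. The following is a sketch under stated assumptions, not the way any particular framework does it: the class `WriteInterest` and the helper `flush` are hypothetical, and a `Pipe` sink stands in for a `SocketChannel` so that the demo runs without a client. The oversized buffer is simply chosen to be larger than what the pipe can hold.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.WritableByteChannel;

public class WriteInterest {
    /** Writes what the channel accepts now; subscribes to OP_WRITE only while data is left over. */
    static void flush(SelectionKey key, ByteBuffer pending) throws IOException {
        ((WritableByteChannel) key.channel()).write(pending);
        if (pending.hasRemaining()) {
            // Buffer full: ask to be notified when there is room again
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
        } else {
            // Everything was sent: stop listening, or select() would return constantly
            key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
        }
    }

    /** Returns {OP_WRITE registered after a small write, OP_WRITE registered after an oversized write}. */
    static boolean[] demo() throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                sink.configureBlocking(false);
                // Start with an empty interest set
                SelectionKey key = sink.register(selector, 0);

                flush(key, ByteBuffer.wrap("PONG".getBytes()));
                boolean afterSmall = (key.interestOps() & SelectionKey.OP_WRITE) != 0;

                // Far more than the pipe can buffer, so the write is only partial
                flush(key, ByteBuffer.allocate(4 * 1024 * 1024));
                boolean afterLarge = (key.interestOps() & SelectionKey.OP_WRITE) != 0;
                return new boolean[] {afterSmall, afterLarge};
            }
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = demo();
        System.out.println("after small write: " + r[0] + ", after oversized write: " + r[1]);
    }
}
```

In a real event loop, the `isWritable()` branch would call `flush` again with the same pending buffer until nothing remains.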

There is also a detail to note when reading data. Unlike BIO, we don’t use a loop to get the data.

In the following code, we create a 1024-byte buffer for data reading. What happens if the data in the connection is larger than 1024 bytes?

SocketChannel sc = (SocketChannel) event.channel();
ByteBuffer buf = ByteBuffer.allocate(1024);
int size = sc.read(buf);

This involves two event notification mechanisms:

  • Level Triggered (LT): the mode used by Java NIO. As long as unread data remains in the receive buffer, the event keeps firing.
  • Edge Triggered (ET): the event fires only when the state changes, that is, when new data arrives. The application must therefore keep reading until the fd has no more data; data left unread does not trigger another notification by itself.

With level triggering, the bytes beyond the first 1024 are not lost: the next select() reports the channel as readable again. This may wake the thread more often and is less efficient than edge triggering. For this reason, Netty’s native epoll transport uses JNI to work in edge-triggered mode.
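The level-triggered behaviour can be checked with a message that is larger than the read buffer. This is a minimal sketch with a hypothetical class name (`LevelTriggeredRead`), assuming a `Pipe` instead of a socket and a 3000-byte message: each notification is answered with a single 1024-byte read, and the selector keeps reporting the channel until everything has been consumed.

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class LevelTriggeredRead {
    /** Returns {select rounds needed, total bytes read} for a 3000-byte message and a 1024-byte buffer. */
    static int[] demo() throws Exception {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);
                // The whole message is written once, before the first select()
                sink.write(ByteBuffer.allocate(3000));

                int rounds = 0;
                int total = 0;
                ByteBuffer buf = ByteBuffer.allocate(1024);
                while (total < 3000 && selector.select(1000) > 0) {
                    // Same effect as calling remove() on each key
                    selector.selectedKeys().clear();
                    buf.clear();
                    // One read per notification, at most 1024 bytes
                    int n = source.read(buf);
                    if (n > 0) {
                        total += n;
                    }
                    rounds++;
                }
                return new int[] {rounds, total};
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] r = demo();
        System.out.println("rounds=" + r[0] + ", bytes=" + r[1]);
    }
}
```

No new data arrives after the first round, so under edge triggering the later notifications would not occur and the reader would have to loop on `read` itself.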

Reactor Pattern #

With an understanding of BIO and NIO and how they are used, the Reactor pattern becomes apparent.

NIO is event-driven: a Selector blocks until events of interest occur and returns them as a list. Once the list is obtained, a dispatcher hands each event to the code that performs the actual data operation.

Drawing 5.png

This image is from Doug Lea’s “Scalable IO in Java” and illustrates the basic elements of the simplest Reactor model.

If you review the “NIO Code in Java” example mentioned earlier, you will notice that the Reactor model consists of four main elements:

  • Acceptor: Handles the connection of clients and binds specific event handlers.
  • Event: Represents the specific events that occur, such as read, send, etc.
  • Handler: Executes the specific event handling logic, such as handling read/write events.
  • Reactor: Dispatches events to the respective handlers.
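The four elements can be condensed into a few lines of Java. The sketch below is hypothetical (the names `MiniReactor`, `Handler` and `dispatchOnce` are illustrative and are not taken from Doug Lea's slides), and it assumes a `Pipe` source in place of an accepted connection, so the `register` method stands in for the binding step an Acceptor would perform after `accept()`.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class MiniReactor implements AutoCloseable {
    /** Handler: the event-specific logic, attached to the key it serves. */
    interface Handler {
        void handle(SelectionKey key) throws IOException;
    }

    private final Selector selector;

    MiniReactor() throws IOException {
        this.selector = Selector.open();
    }

    /** Binding step: ties a channel and an event type to a handler. */
    void register(SelectableChannel ch, int ops, Handler handler) throws IOException {
        ch.configureBlocking(false);
        ch.register(selector, ops, handler);
    }

    /** Reactor: waits for events and dispatches each one to its handler. */
    int dispatchOnce(long timeoutMillis) throws IOException {
        selector.select(timeoutMillis);
        int dispatched = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            ((Handler) key.attachment()).handle(key);
            dispatched++;
        }
        return dispatched;
    }

    @Override
    public void close() throws IOException {
        selector.close();
    }

    /** Registers a read handler, produces one event, and returns what the handler received. */
    static String demo() throws IOException {
        StringBuilder received = new StringBuilder();
        Pipe pipe = Pipe.open();
        try (MiniReactor reactor = new MiniReactor();
             Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {
            reactor.register(source, SelectionKey.OP_READ, key -> {
                ByteBuffer buf = ByteBuffer.allocate(64);
                int n = ((ReadableByteChannel) key.channel()).read(buf);
                if (n > 0) {
                    received.append(new String(buf.array(), 0, n, StandardCharsets.UTF_8));
                }
            });
            sink.write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
            reactor.dispatchOnce(1000);
        }
        return received.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

Attaching the handler to the `SelectionKey` keeps the dispatch loop free of `if`/`else` chains: the Reactor only needs to know that every key carries something it can call.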

We can further refine this model, as shown in the following diagram, which divides the Reactor into a mainReactor and subReactor.

Drawing 7.png

This image is from Doug Lea’s “Scalable IO in Java.”

  • mainReactor: Listens for and accepts new connections, then hands the subsequent event handling to a subReactor.
  • subReactor: Handles the events of the accepted connections using multiple threads; work is queued as tasks instead of blocking one dedicated thread per connection.
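The hand-off between the two kinds of reactor is the delicate part, because a channel has to be registered with a Selector that another thread may be blocked on. The following is a rough sketch under several assumptions: the class names `SubReactors` and `SubReactor` and the method `adopt` are hypothetical, `Pipe` sources stand in for accepted sockets, and the main reactor is reduced to a round-robin loop. It is not how Netty implements its Boss and Worker threads.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SubReactors {
    /** One sub-reactor: its own thread, its own Selector, and a queue of channels handed to it. */
    static class SubReactor extends Thread {
        final Selector selector;
        final Queue<SelectableChannel> pending = new ConcurrentLinkedQueue<>();
        final AtomicInteger handled = new AtomicInteger();
        final CountDownLatch done;

        SubReactor(CountDownLatch done) throws IOException {
            this.selector = Selector.open();
            this.done = done;
            setDaemon(true);
        }

        /** Called from the main reactor's thread once a connection has been accepted. */
        void adopt(SelectableChannel ch) {
            pending.add(ch);
            selector.wakeup(); // leave select() so that run() can register the channel
        }

        @Override
        public void run() {
            ByteBuffer buf = ByteBuffer.allocate(64);
            try {
                while (!isInterrupted()) {
                    selector.select();
                    // Register channels on this thread, never on the caller's thread
                    SelectableChannel ch;
                    while ((ch = pending.poll()) != null) {
                        ch.configureBlocking(false);
                        ch.register(selector, SelectionKey.OP_READ);
                    }
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        buf.clear();
                        if (((ReadableByteChannel) key.channel()).read(buf) > 0) {
                            handled.incrementAndGet();
                            done.countDown();
                        }
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    /** Distributes 4 "connections" over 2 sub-reactors; returns how many events each one handled. */
    static int[] demo() throws Exception {
        int connections = 4;
        CountDownLatch done = new CountDownLatch(connections);
        SubReactor[] subs = {new SubReactor(done), new SubReactor(done)};
        for (SubReactor s : subs) {
            s.start();
        }
        Pipe[] pipes = new Pipe[connections];
        for (int i = 0; i < connections; i++) {
            pipes[i] = Pipe.open();
            subs[i % subs.length].adopt(pipes[i].source()); // round-robin hand-off
            pipes[i].sink().write(ByteBuffer.wrap(new byte[] {1}));
        }
        boolean allHandled = done.await(5, TimeUnit.SECONDS);
        for (SubReactor s : subs) {
            s.interrupt();
            s.join(1000);
            s.selector.close();
        }
        for (Pipe p : pipes) {
            p.sink().close();
            p.source().close();
        }
        if (!allHandled) {
            throw new IllegalStateException("timed out waiting for the sub-reactors");
        }
        return new int[] {subs[0].handled.get(), subs[1].handled.get()};
    }

    public static void main(String[] args) throws Exception {
        int[] r = demo();
        System.out.println("sub-reactor 0: " + r[0] + ", sub-reactor 1: " + r[1]);
    }
}
```

The queue plus `wakeup()` is the "task queue mode" in miniature: the thread that accepts never touches another reactor's Selector directly, it only leaves work for it.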

Those familiar with Netty can see that this Reactor model forms the basis of Netty’s design. In Netty, the Boss thread corresponds to the handling and dispatching of connections, similar to the mainReactor, while the Worker threads correspond to the subReactor, using multiple threads to handle the dispatch and processing of read/write events.

This model assigns more specific responsibilities to each component, resulting in lower coupling and effectively solving the C10k problem.

AIO #

The concept of NIO is often misunderstood.

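An interviewer might ask you: Why is the socket operation still blocking when using NIO with Channels for reading and writing? What is the main benefit of NIO?

// This call is synchronous: the calling thread performs the copy itself
int size = sc.read(buf);

You can answer that NIO is only responsible for notifying you of events on a file descriptor (fd). Waiting for events no longer ties up one thread per connection, but the operation performed after the notification is still synchronous: the calling thread copies the data itself and can do nothing else until the call returns. In non-blocking mode the call returns quickly because the data is already in the kernel buffer, yet the thread is occupied for the duration of the copy. Handing the events to multiple threads does not change this; each of those threads is still occupied by its own read or write.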

AIO takes it a step further by making these operations asynchronous as well. The following code is a typical example of AIO, which registers a CompletionHandler callback for event handling. The events themselves are hidden. The read function, for example, does not merely signal that the Channel is ready for reading; it also reads the data into a ByteBuffer automatically. When the read completes, the callback is invoked so that you can continue processing.

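import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;

public class AIO {
    public static void main(String[] args) throws Exception {
        int port = 8888;
        AsynchronousServerSocketChannel ssc = AsynchronousServerSocketChannel.open();
        ssc.bind(new InetSocketAddress("localhost", port));
        ssc.accept(null, new CompletionHandler<AsynchronousSocketChannel, Object>() {
            // Issues one asynchronous read; completed() runs after the data is already in the buffer
            void job(final AsynchronousSocketChannel sc) {
                ByteBuffer buffer = ByteBuffer.allocate(1024);
                sc.read(buffer, buffer, new CompletionHandler<Integer, ByteBuffer>() {
                    @Override
                    public void completed(Integer result, ByteBuffer attachment) {
                        int size = result;
                        if (size < 0) {
                            // The client closed the connection
                            try {
                                sc.close();
                            } catch (IOException ignored) {
                            }
                            return;
                        }
                        // Decode only the bytes that were actually read
                        String str = new String(attachment.array(), 0, size, StandardCharsets.UTF_8).trim();
                        ByteBuffer wrap = ByteBuffer.wrap(("PONG:" + str + "\n").getBytes(StandardCharsets.UTF_8));
                        sc.write(wrap, null, new CompletionHandler<Integer, Object>() {
                            @Override
                            public void completed(Integer result, Object attachment) {
                                // Reply sent; wait for the next message on this connection
                                job(sc);
                            }

                            @Override
                            public void failed(Throwable exc, Object attachment) {
                                System.out.println("error");
                            }
                        });
                    }

                    @Override
                    public void failed(Throwable exc, ByteBuffer attachment) {
                        System.out.println("error");
                    }
                });
            }

            @Override
            public void completed(AsynchronousSocketChannel sc, Object attachment) {
                // Accept the next connection, then start serving this one
                ssc.accept(null, this);
                job(sc);
            }

            @Override
            public void failed(Throwable exc, Object attachment) {
                exc.printStackTrace();
                System.out.println("error");
            }
        });
        // The callbacks run on pool threads; keep the main thread alive
        Thread.sleep(Integer.MAX_VALUE);
    }
}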

AIO was introduced in Java 7 (as part of NIO.2) and was expected to provide better performance. However, actual testing has not yielded satisfactory results. AIO mainly automates the read and write operations on data. That work still has to be done somewhere: if not in the framework, then in the kernel, so no step is actually saved and the performance gain is limited. Netty’s NIO model combined with multithreading already handles this well, and its programming model is simpler than AIO’s.

Therefore, AIO has few practical applications, and it should be chosen with caution.

Reactive Programming #

You may have heard of Spring 5.0’s WebFlux, which is an alternative solution to Spring MVC and allows you to write reactive applications. The relationship between the two is illustrated in the following diagram:

image.png

By default, Spring WebFlux runs on Netty, so it is asynchronous and non-blocking. Other similar components include Vert.x, Akka, and RxJava.

WebFlux is a wrapper built on top of Project Reactor, which provides its core functionality. The underlying non-blocking model is provided by Netty.

The non-blocking characteristics are easily understandable, but what does reactive programming actually mean?

Reactive programming is a programming paradigm that is focused on streams of data and propagation of changes. This means that in a programming language, it is convenient to express static or dynamic streams of data, and the related computational models will automatically propagate changing values through the data stream.

This definition may sound obscure. In programming terms it means: express the producer-consumer pattern with a simple API, and handle backpressure automatically.

Backpressure is flow control between producer and consumer: the consumer signals how much data it can accept, so a fast producer cannot overwhelm it. Combined with fully asynchronous operation, this reduces unnecessary waiting and resource consumption.

Java’s lambda expressions keep the programming model simple, and Java 9 added the Reactive Streams interfaces to the JDK as java.util.concurrent.Flow, which makes such code easier to write.
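Backpressure is easiest to see with the JDK's own `java.util.concurrent.Flow` interfaces. The sketch below is a hypothetical example (the class names `BackpressureDemo` and `OneAtATime` are illustrative): the subscriber requests exactly one item at a time, and the JDK's `SubmissionPublisher` delivers an item only after it has been requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

public class BackpressureDemo {
    /** A consumer that asks for exactly one item at a time. */
    static class OneAtATime implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        final CountDownLatch completed = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1); // demand signal: "I can take one item"
        }

        @Override
        public void onNext(Integer item) {
            received.add(item * 2);  // the consumer's work
            subscription.request(1); // ask for the next item only when this one is done
        }

        @Override
        public void onError(Throwable t) {
            completed.countDown();
        }

        @Override
        public void onComplete() {
            completed.countDown();
        }
    }

    /** Publishes 1..5 and returns what the subscriber produced. */
    static List<Integer> demo() throws InterruptedException {
        OneAtATime subscriber = new OneAtATime();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= 5; i++) {
                // submit() blocks when the subscriber's buffer is full
                publisher.submit(i);
            }
        } // close() signals onComplete once the submitted items have been delivered
        subscriber.completed.await(5, TimeUnit.SECONDS);
        return subscriber.received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

Libraries such as Project Reactor hide the `request` calls behind operators, but the same demand signalling happens underneath.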

For example, here is the Fluent API example of Spring Cloud Gateway, and reactive programming APIs are generally similar.

public RouteLocator customerRouteLocator(RouteLocatorBuilder builder) {
    return builder.routes()
            .route(r -> r.path("/market/**")
                    .filters(f -> f.filter(new RequestTimeFilter())
                            .addResponseHeader("X-Response-Default-Foo", "Default-Bar"))
                    .uri("http://localhost:8080/market/list")
                    .order(0)
                    .id("customer_filter_router")
            )
            .build();
}

Moving from the traditional development model to the reactive model has a cost, but it can indeed improve application performance. Whether to adopt it depends on the trade-off between programming difficulty and performance.

Conclusion #

In this lesson, we have learned about concepts such as BIO, NIO, AIO, and the basic programming model Reactor. We have learned that:

  • BIO has a thread model where each connection is assigned a thread, which is very wasteful of resources;
  • NIO avoids blocking while waiting by subscribing to key events and being notified when they occur, but the handling of each event is still blocking;
  • AIO is completely asynchronous and non-blocking, but it is rarely used in practice.

Using Netty’s multi-Reactor (Boss/Worker) model and multithreaded model, we can easily achieve much of what AIO offers. Netty’s native epoll transport uses the efficient ET mode, allowing more connections and better performance.

With Netty as the foundation and features such as lambda expressions, we can build reactive frameworks similar to WebFlux. Reactive programming is a trend: more and more frameworks and underlying databases support it, which makes our applications more responsive.