11 Which IO Methods Are Provided by Java and How Does NIO Implement Multiplexing


IO has always been a core part of software development, and with the growth of massive data volumes and distributed systems, IO scalability has become even more important. Fortunately, the Java platform’s IO mechanisms have been continuously improved. Although they still fall short in some respects, they have proven in practice that they can support highly scalable applications.

The question I want to ask you today is: what IO methods does Java provide, and how does NIO achieve multiplexing?

Typical Answer

Java offers several IO mechanisms, which are most easily distinguished by their IO abstraction model and interaction style.

First, there is the traditional java.io package, which is based on the stream model and provides familiar IO functionality such as the File abstraction and input/output streams. The interaction style is synchronous and blocking: when reading from an input stream or writing to an output stream, the thread is blocked until the read or write completes. Calls therefore execute in a reliable, linear order.

The advantage of the java.io package is that the code is simple and straightforward, but the disadvantage is that it has limitations in terms of IO efficiency and scalability, and it can easily become a performance bottleneck for applications.

Often, some of the network APIs in the java.net package, such as Socket, ServerSocket, and HttpURLConnection, are also classified as synchronous blocking IO libraries, because network communication is IO as well.

Second, the NIO framework (java.nio package) was introduced in Java 1.4, which provides new abstractions such as Channel, Selector, and Buffer. It allows building multiplexed, synchronous and non-blocking IO programs, while providing high-performance data manipulation methods that are closer to the underlying operating system.

Third, in Java 7, NIO was further improved in what is known as NIO 2. It introduced asynchronous non-blocking IO, often referred to as AIO (Asynchronous IO). Asynchronous IO operations are based on event and callback mechanisms. In simple terms, application operations return immediately without blocking, and when the background processing is completed, the operating system notifies the corresponding thread to continue its work.

Analysis of Exam Points

The answers I listed above are based on a common classification method, namely, BIO, NIO, NIO 2 (AIO).

In actual interviews, there is a lot to expand on from traditional IO through NIO and NIO 2, and the examination points cover many aspects, such as:

  • Basic API functions and design, the relationship and differences between InputStream/OutputStream and Reader/Writer.
  • The basic components of NIO and NIO 2.
  • Given a scenario, analyze the design and implementation principles of BIO, NIO, and other models.
  • What principles does NIO use to provide high-performance data operations and how to use them?
  • Alternatively, from the developer’s perspective, what problems do you think NIO’s own implementation has? Do you have any improvement ideas?

There is a lot of content related to IO, and it is difficult to cover it all in one column. IO is not just about multiplexing, and NIO 2 is not just about asynchronous IO, especially in the data operation section, which will be analyzed in detail in the next column.

Knowledge Expansion

First, let’s clarify some basic concepts:

  • Differentiate between synchronous and asynchronous operations. Simply put, synchronous operations are a reliable and orderly execution mechanism. When we perform synchronous operations, the subsequent tasks wait for the current call to return before proceeding to the next step. On the other hand, asynchronous operations are the opposite, where other tasks do not need to wait for the current call to return and usually rely on events, callbacks, etc. to establish the order relationship between tasks.
  • Differentiate between blocking and non-blocking operations. When performing a blocking operation, the current thread is in a blocking state and cannot engage in other tasks. It can only continue when the condition is ready, such as when a new connection is established in ServerSocket or when data reading or writing operations are completed. Non-blocking operations, on the other hand, return directly regardless of whether the IO operation is complete, and the corresponding operation continues to be processed in the background.

We cannot simply conclude that synchronous or blocking operations are inefficient. The actual efficiency depends on the characteristics of the application and the system.
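To make the blocking versus non-blocking distinction concrete, here is a minimal sketch using java.nio.channels.Pipe. The class and method names (NonBlockingReadDemo, readFromEmptyPipe) are made up for this illustration. Nothing is ever written to the pipe, so a blocking read would wait indefinitely, while the non-blocking read returns at once and reports that no bytes were available.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

// Hypothetical demo class, not part of the original examples.
class NonBlockingReadDemo {

    /** Reads from an empty pipe in non-blocking mode and returns the byte count. */
    static int readFromEmptyPipe() throws IOException {
        Pipe pipe = Pipe.open();
        try (Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {
            // Switch the readable end to non-blocking mode.
            source.configureBlocking(false);
            ByteBuffer buffer = ByteBuffer.allocate(16);
            // No data has been written to the sink. In blocking mode this call
            // would wait; in non-blocking mode it returns immediately.
            return source.read(buffer);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes read: " + readFromEmptyPipe()); // prints "bytes read: 0"
    }
}
```

The caller gets control back immediately and has to decide what to do next, which is exactly why non-blocking IO is usually combined with a readiness mechanism such as a Selector.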

Regarding java.io, we are all very familiar with it. Here, I will provide a general summary, and if you need to learn more specific operations, you can refer to tutorials and other resources. In general, I believe you should at least understand the following:

  • IO is not just about file operations; it is also commonly used in network programming, such as Socket communication.
  • InputStream/OutputStream is used for reading or writing bytes, such as operating on image files.
  • Reader/Writer, on the other hand, is used for character operations and adds functionality such as character encoding and decoding. It is suitable for reading or writing text information from files. In essence, computers operate on bytes, whether it is network communication or file reading. Reader/Writer acts as a bridge between application logic and raw data.
  • Buffered implementations such as BufferedOutputStream avoid frequent disk reads and writes and thereby improve IO efficiency. The design uses a buffer so that one operation handles a whole batch of data. However, do not forget to flush when using them.
  • Refer to the class diagram below. Many IO utility classes implement the Closeable interface because resources need to be released. For example, when opening a FileInputStream, it obtains the corresponding file descriptor (FileDescriptor). It is necessary to ensure that FileInputStream is explicitly closed using mechanisms such as try-with-resources or try-finally, otherwise the resource will not be released. Using the Cleaner or finalize mechanism mentioned in the previous section of the column as the last line of defense for resource release is also necessary.
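The points above about Reader/Writer, buffering, and resource release can be sketched as follows. This is only an illustration: the class name StreamBridgeDemo and its helper methods are hypothetical, and in-memory byte arrays stand in for a file or socket so the example runs anywhere.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: a Writer/Reader pair bridging characters and raw bytes.
class StreamBridgeDemo {

    /** Encodes text to UTF-8 bytes through a buffered Writer. */
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // try-with-resources closes the writer, which also flushes its buffer.
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(bytes, StandardCharsets.UTF_8))) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    /** Decodes UTF-8 bytes back into the first line of text through a buffered Reader. */
    static String decode(byte[] data) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encode("h\u00e9llo");      // five characters, one of them non-ASCII
        System.out.println(data.length);         // prints 6: the accented letter takes two bytes
        System.out.println(decode(data).length()); // prints 5
    }
}
```

Naming the charset explicitly, rather than relying on the platform default, keeps the byte representation the same on every machine.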

Below is a simplified class diagram I compiled, explaining the types and structural relationships that are commonly used in daily development applications.

  1. Overview of Java NIO

First, let’s get familiar with the main components of NIO:

  • Buffer: An efficient data container. Except for boolean type, all primitive data types have corresponding Buffer implementations.
  • Channel: Similar to file descriptors seen on operating systems like Linux, it is an abstraction used in NIO to support batch IO operations.

File or Socket is generally considered a higher-level abstraction, while Channel is a lower-level abstraction that interacts with the operating system. This allows NIO to take full advantage of the underlying mechanisms of modern operating systems and obtain performance optimizations for specific scenarios, such as DMA (Direct Memory Access). Different levels of abstractions are related to each other. We can obtain a Channel using a Socket, and vice versa.

  • Selector: It is the foundation for achieving multiplexing in NIO. It provides an efficient mechanism to detect whether any of the channels registered with the Selector are ready, so that a single thread can manage multiple channels efficiently. The Selector implementation also relies on underlying operating system mechanisms, which differ by mode and version. For example, in recent JDK code the Linux implementation relies on epoll, while NIO 2 (AIO) mode on Windows relies on IOCP.
  • Charset: It provides the definition for Unicode strings, and NIO also provides corresponding encoders and decoders. For example, you can convert a string to a ByteBuffer using the following code:

Charset.defaultCharset().encode("Hello world!");
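As a small illustration of how Buffer and Charset work together, the sketch below fills a ByteBuffer, flips it from writing to reading, and round-trips a string through an encoder and decoder. The class and method names are hypothetical, and UTF-8 is chosen here instead of the default charset only to make the result predictable.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class showing position/limit handling in a Buffer.
class BufferDemo {

    /** Returns {position, limit, capacity} after writing three bytes and flipping. */
    static int[] fillAndFlip() {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.put((byte) 1).put((byte) 2).put((byte) 3); // position advances to 3
        buffer.flip(); // limit = old position, position = 0: ready for reading
        return new int[] {buffer.position(), buffer.limit(), buffer.capacity()};
    }

    /** Encodes text to bytes, copies the bytes into a second buffer, and decodes them again. */
    static String roundTrip(String text) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(text); // already positioned for reading
        ByteBuffer copy = ByteBuffer.allocate(encoded.remaining());
        copy.put(encoded); // writing into copy moves its position to the end
        copy.flip();       // switch copy from writing to reading
        CharBuffer decoded = StandardCharsets.UTF_8.decode(copy);
        return decoded.toString();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fillAndFlip())); // prints [0, 3, 8]
        System.out.println(roundTrip("Hello world!"));      // prints Hello world!
    }
}
```

Forgetting flip() before reading is a common mistake: the buffer would then expose the unwritten region between position and limit instead of the data just written.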

  2. What problems can NIO solve?

Now, let’s analyze why we need NIO and multiplexing through a typical scenario. Imagine that we need to implement a server application that can simply handle multiple client requests simultaneously.

Using the synchronous and blocking APIs in java.io and java.net, we can easily achieve this. Here’s a simplified implementation:

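import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class DemoServer extends Thread {
    private final ServerSocket serverSocket;
    // Bind in the constructor so that getPort() is safe to call
    // before the server thread has actually started running
    public DemoServer() throws IOException {
        serverSocket = new ServerSocket(0);
    }
    public int getPort() {
        return serverSocket.getLocalPort();
    }
    @Override
    public void run() {
        try {
            while (true) {
                // Blocks until a client connects
                Socket socket = serverSocket.accept();
                RequestHandler requestHandler = new RequestHandler(socket);
                requestHandler.start();
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                serverSocket.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    public static void main(String[] args) throws IOException {
        DemoServer server = new DemoServer();
        server.start();
        // Simple client: connect, read, and print whatever the server sends
        try (Socket client = new Socket(InetAddress.getLocalHost(), server.getPort())) {
            BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(client.getInputStream()));
            bufferedReader.lines().forEach(s -> System.out.println(s));
        }
    }
}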

// Simple implementation, no reading, just sending a string
class RequestHandler extends Thread {
    private Socket socket;
    RequestHandler(Socket socket) {
        this.socket = socket;
    }
    @Override
    public void run() {
        try (PrintWriter out = new PrintWriter(socket.getOutputStream())) {
            out.println("Hello world!");
            out.flush();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The main points of this implementation are:

  • Start the ServerSocket on the server side, with port 0 indicating that an available port will be automatically bound.
  • Call the accept method, blocking and waiting for a client connection.
  • Simulate a simple client using the Socket, only performing connection, read, and printing.
  • Once the connection is established, start a separate thread to handle client requests.

With these steps, a simple Socket server is implemented.

Now let’s consider the potential problems with scalability in this solution.

As we know, the current Java thread implementation is relatively heavyweight. Starting or destroying a thread incurs significant overhead. Each thread has its own thread stack and other structures, which require a noticeable amount of memory. Therefore, it seems wasteful to start a thread for each client.

To address this issue, we can introduce a thread pool mechanism to avoid waste.

serverSocket = new ServerSocket(0);
executor = Executors.newFixedThreadPool(8);
while (true) {
    Socket socket = serverSocket.accept();
    RequestHandler requestHandler = new RequestHandler(socket);
    executor.execute(requestHandler);
}

By utilizing a fixed-size thread pool, we can manage worker threads and avoid the overhead of creating and destroying threads frequently. This is a typical way to build concurrent services. You can refer to the following diagram to understand this working mode.

If the number of connections is modest (a few hundred for a typical application), this mode usually works well. However, when the number of connections grows sharply, it no longer does: the overhead of thread context switching becomes significant under high concurrency, which is the scalability weakness of the synchronous blocking mode.
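For reference, here is one way the thread pool fragment could be fleshed out into a self-contained program. This is a sketch under stated assumptions rather than the definitive version: the class name PooledServer, the handle and fetchGreeting helpers, and the choice to make the server AutoCloseable are all introduced here for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class: a blocking server whose handlers run on a fixed pool.
class PooledServer implements Runnable, AutoCloseable {
    private final ServerSocket serverSocket;
    private final ExecutorService executor = Executors.newFixedThreadPool(8);

    PooledServer() throws IOException {
        serverSocket = new ServerSocket(0); // port 0: let the OS pick a free port
    }

    int getPort() {
        return serverSocket.getLocalPort();
    }

    @Override
    public void run() {
        try {
            while (true) {
                Socket socket = serverSocket.accept(); // blocks until a client connects
                executor.execute(() -> handle(socket)); // reuse a pooled worker thread
            }
        } catch (IOException e) {
            // accept() fails once close() has been called; treated as shutdown here.
        }
    }

    private static void handle(Socket socket) {
        // Resources close in reverse order: the writer flushes first, then the socket closes.
        try (Socket s = socket; PrintWriter out = new PrintWriter(s.getOutputStream())) {
            out.println("Hello world!");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    @Override
    public void close() throws IOException {
        serverSocket.close();
        executor.shutdown(); // already submitted handlers are allowed to finish
    }

    /** Starts a server, connects one client, and returns the line the server sent. */
    static String fetchGreeting() throws IOException {
        try (PooledServer server = new PooledServer()) {
            Thread acceptor = new Thread(server, "acceptor");
            acceptor.setDaemon(true);
            acceptor.start();
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getPort());
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(client.getInputStream()))) {
                return reader.readLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetchGreeting()); // prints Hello world!
    }
}
```

Note that the pool caps the number of handler threads at eight, but each handler still occupies its thread for the whole lifetime of a connection, so slow clients can exhaust the pool.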

The NIO multiplexing mechanism offers another approach. Please refer to the new version I provided below.

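import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.Charset;
import java.util.Iterator;
import java.util.Set;

public class NIOServer extends Thread {
    @Override
    public void run() {
        try (Selector selector = Selector.open();
             ServerSocketChannel serverSocket = ServerSocketChannel.open()) {
            serverSocket.bind(new InetSocketAddress(InetAddress.getLocalHost(), 8888));
            // A channel must be in non-blocking mode before it can be registered
            serverSocket.configureBlocking(false);
            // Register with the Selector and declare interest in new connections
            serverSocket.register(selector, SelectionKey.OP_ACCEPT);
            while (true) {
                selector.select(); // Blocks until at least one channel is ready
                Set<SelectionKey> selectedKeys = selector.selectedKeys();
                Iterator<SelectionKey> iter = selectedKeys.iterator();
                while (iter.hasNext()) {
                    SelectionKey key = iter.next();
                    // A production system would normally check the ready state of the key first
                    sayHelloWorld((ServerSocketChannel) key.channel());
                    iter.remove();
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    private void sayHelloWorld(ServerSocketChannel server) throws IOException {
        try (SocketChannel client = server.accept()) {
            // In non-blocking mode accept() returns null when no connection is pending
            if (client != null) {
                client.write(Charset.defaultCharset().encode("Hello world!"));
            }
        }
    }
}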

This concise example reveals the essence of NIO multiplexing. Let’s analyze the main steps and elements.

  • First, create a Selector with Selector.open(), which serves as a kind of dispatcher.
  • Then, create a ServerSocketChannel and register it with the Selector, specifying SelectionKey.OP_ACCEPT to inform the dispatcher that it is interested in new connection requests.

Note that we set non-blocking mode explicitly. This is because a channel in blocking mode cannot be registered with a Selector; attempting to do so throws an IllegalBlockingModeException.

  • The Selector blocks at the select operation and will be awakened when a Channel is ready.
  • In the sayHelloWorld method, perform data operations using SocketChannel and Buffer. In this example, a string is sent.

In the previous two examples, IO was done in synchronous blocking mode, so multiple threads were needed to handle multiple tasks. On the other hand, NIO utilizes a single thread that polls for events efficiently. By locating the ready Channels, it determines what to do. Only the select phase is blocking, effectively avoiding the problem of frequent thread switching caused by a large number of client connections, greatly improving the scalability of the application. The following diagram illustrates this implementation approach.
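The server above only reacts to new connections. To illustrate how one thread can also serve reads for many established connections, here is a hedged sketch of a selector-driven echo server. The class EchoSelectorServer and its helper methods are hypothetical and simplified: there is no per-connection error handling, and writes simply retry until the buffer is drained.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical demo class: one thread accepts connections and echoes data for all of them.
class EchoSelectorServer implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    EchoSelectorServer() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        // Port 0 asks the OS for any free port on the loopback interface.
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int getPort() {
        return server.socket().getLocalPort();
    }

    @Override
    public void run() {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        try {
            while (selector.isOpen()) {
                selector.select(); // the only blocking call in the loop
                if (!selector.isOpen()) {
                    break;
                }
                Iterator<SelectionKey> iter = selector.selectedKeys().iterator();
                while (iter.hasNext()) {
                    SelectionKey key = iter.next();
                    iter.remove();
                    if (!key.isValid()) {
                        continue;
                    }
                    if (key.isAcceptable()) {
                        accept();
                    } else if (key.isReadable()) {
                        echo((SocketChannel) key.channel(), buffer);
                    }
                }
            }
        } catch (IOException | IllegalStateException e) {
            // Closed selector, cancelled key, or IO failure: treated as shutdown in this sketch.
        }
    }

    private void accept() throws IOException {
        SocketChannel client = server.accept();
        if (client != null) {
            client.configureBlocking(false);
            client.register(selector, SelectionKey.OP_READ); // now watch this client for data
        }
    }

    private static void echo(SocketChannel client, ByteBuffer buffer) throws IOException {
        buffer.clear();
        if (client.read(buffer) < 0) {
            client.close(); // peer closed; closing the channel also cancels its key
            return;
        }
        buffer.flip();
        while (buffer.hasRemaining()) {
            client.write(buffer); // simplification: spins if the send buffer is full
        }
    }

    @Override
    public void close() throws IOException {
        selector.close(); // wakes up a thread blocked in select()
        server.close();
    }

    /** Starts a server, sends one message from a blocking client, and returns the echo. */
    static String echoOnce(String message) throws IOException {
        try (EchoSelectorServer echo = new EchoSelectorServer()) {
            Thread loop = new Thread(echo, "selector-loop");
            loop.setDaemon(true);
            loop.start();
            try (SocketChannel client = SocketChannel.open(
                    new InetSocketAddress(InetAddress.getLoopbackAddress(), echo.getPort()))) {
                byte[] payload = message.getBytes(StandardCharsets.UTF_8);
                client.write(ByteBuffer.wrap(payload));
                ByteBuffer reply = ByteBuffer.allocate(payload.length);
                while (reply.hasRemaining() && client.read(reply) >= 0) {
                    // keep reading until the whole echo has arrived
                }
                return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
            }
        }
    }
}
```

A more robust version would register interest in OP_WRITE when a write cannot complete, instead of retrying in a loop, and would isolate a failure on one connection from the rest.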

In Java 7, NIO 2 introduced another asynchronous IO mode, which uses events and callbacks to handle operations like accept, read, and write. The AIO implementation looks similar to this:

AsynchronousServerSocketChannel serverSock = AsynchronousServerSocketChannel.open().bind(sockAddr);
serverSock.accept(serverSock, new CompletionHandler<AsynchronousSocketChannel, AsynchronousServerSocketChannel>() {
    @Override
    public void completed(AsynchronousSocketChannel sockChannel, AsynchronousServerSocketChannel serverSock) {
        serverSock.accept(serverSock, this);
        // Another write(sock, CompletionHandler{})
        sayHelloWorld(sockChannel, Charset.defaultCharset().encode("Hello World!"));
    }
    // Other path handling methods omitted...
});

Because we haven’t covered some necessary concepts yet (such as Future and CompletionHandler), I will provide more explanations and introduce topics like Reactor and Proactor patterns in the following sections together with Netty. Here, I provide a conceptual comparison:

  • The basic abstractions are similar. AsynchronousServerSocketChannel corresponds to ServerSocketChannel in the previous example, and AsynchronousSocketChannel corresponds to SocketChannel.
  • The key to the business logic lies in the CompletionHandler callback interface. At critical points such as accept/read/write, the handler is invoked through the event mechanism. This is a significantly different programming approach.
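To show the callback style end to end, here is a minimal runnable sketch in the spirit of the fragment above. The class AioGreetingDemo and its helpers (startServer, fetchGreeting, closeQuietly) are made up for this illustration. The server side uses CompletionHandler callbacks, while the client uses the Future-returning variants of connect and read to keep the demo short.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutionException;

// Hypothetical demo class: callback-driven accept and write with NIO 2.
class AioGreetingDemo {

    static AsynchronousServerSocketChannel startServer() throws IOException {
        AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        server.accept(server,
                new CompletionHandler<AsynchronousSocketChannel, AsynchronousServerSocketChannel>() {
            @Override
            public void completed(AsynchronousSocketChannel client,
                                  AsynchronousServerSocketChannel srv) {
                srv.accept(srv, this); // re-arm so the next connection is accepted too
                ByteBuffer greeting = StandardCharsets.UTF_8.encode("Hello World!");
                client.write(greeting, client, new CompletionHandler<Integer, AsynchronousSocketChannel>() {
                    @Override
                    public void completed(Integer written, AsynchronousSocketChannel c) {
                        if (greeting.hasRemaining()) {
                            c.write(greeting, c, this); // partial write: continue
                        } else {
                            closeQuietly(c);
                        }
                    }
                    @Override
                    public void failed(Throwable exc, AsynchronousSocketChannel c) {
                        closeQuietly(c);
                    }
                });
            }
            @Override
            public void failed(Throwable exc, AsynchronousServerSocketChannel srv) {
                // Reached, for example, when the server is closed while an accept is pending.
            }
        });
        return server;
    }

    private static void closeQuietly(AsynchronousSocketChannel channel) {
        try {
            channel.close();
        } catch (IOException ignored) {
            // nothing useful to do in this sketch
        }
    }

    /** Starts the server, connects one client, and returns everything the server sent. */
    static String fetchGreeting() throws IOException, InterruptedException, ExecutionException {
        try (AsynchronousServerSocketChannel server = startServer();
             AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {
            client.connect(server.getLocalAddress()).get(); // Future-based: wait for the connect
            ByteBuffer buffer = ByteBuffer.allocate(64);
            while (buffer.hasRemaining() && client.read(buffer).get() >= 0) {
                // read until the server closes the connection
            }
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetchGreeting()); // prints Hello World!
    }
}
```

Unlike the Selector version, no application thread waits for readiness here: the callbacks run on threads from the channel group once the operation itself has completed.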

Today, I provided an introduction to Java’s IO mechanisms, briefly analyzed the components of traditional synchronous IO and NIO, and implemented and dissected them based on typical scenarios using different IO modes. In the next lesson, I will continue analyzing the main topics of Java IO.

Practice Exercise

Did you understand the topic we discussed today? Let me give you a question to think about: What are the limitations of NIO multiplexing? Have you encountered any related problems?

Please write your thoughts on this question in the comments section. I will select well-thought-out comments and reward you with a study encouragement bonus. Feel free to discuss with me.
