01 Getting to Know Netty: Why Is Netty So Popular?

Hello, I’m Ruodi. Today we will officially start learning about Netty in this column.

As we all know, Java has a very complete ecosystem, and there are often several products to choose from for the same type of requirement. So why do people recommend Netty for Java network programming instead of Java NIO, Mina, or Grizzly?

In this lesson, let’s take a look at why Netty is so popular, what problems it solves, and its current development status, so that you can have a comprehensive understanding of Netty.

Why Choose Netty? #

Netty is an NIO-based network framework for developing network applications efficiently; it greatly simplifies the development process. The TCP and UDP socket servers we are all familiar with are typical examples of the kind of application Netty simplifies.

Since Netty is a network application framework, it must address the following core concerns:

  • I/O models, thread models, and event handling mechanisms;
  • Easy-to-use API interfaces;
  • Support for data protocol and serialization.

We ultimately choose Netty because it excels at all of these core concerns: its robustness, performance, and scalability are among the best of any framework in the field. Now let’s look at how impressive Netty is from the following three aspects.

High Performance, Low Latency #

We often hear this sentence: “As long as you use the Netty framework for network programming, your program’s performance will not be poor.” Although this statement is quite absolute, it reflects people’s recognition of Netty’s high performance.

To achieve high-performance network applications, we need to address the issue of I/O models. Before we understand the principles of Netty’s high performance, we need to have a basic understanding of I/O models.

I/O requests can be divided into two stages: the calling stage and the execution stage.

  • The first stage is the I/O calling stage, where the user process initiates a system call to the kernel.
  • The second stage is the I/O execution stage. At this point, the kernel waits for the I/O request to be completed and returned. This stage consists of two processes: first, waiting for data to be ready and written to the kernel buffer; then, copying the data from the kernel buffer to the user space buffer.

To facilitate everyone’s understanding, take a look at this diagram:

[Figure: the two stages of an I/O request]

Next, let’s take a look at the five main I/O modes in Linux and the advantages and disadvantages of each I/O mode.

1. Synchronous Blocking I/O (BIO) #

[Figure: synchronous blocking I/O (BIO)]

As shown in the above diagram, the application process issues an I/O request to the kernel, and the calling thread waits until the kernel returns the result. This model, in which the thread blocks for the entire I/O request, is called BIO (Blocking I/O). To handle concurrent requests with BIO, the only option is a multi-threaded model with one thread per request. However, thread resources are limited and valuable, and creating too many threads increases the overhead of thread switching.
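The thread-per-connection pattern described above can be sketched with plain JDK sockets. This is a minimal, illustrative echo server; the class and method names are my own, not from any library:

```java
import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;

// Classic BIO pattern: accept() and read() both block the calling thread,
// so every connection gets a thread of its own.
public class BioEchoServer {
    public static void serve(ServerSocket serverSocket) throws IOException {
        while (true) {
            Socket socket = serverSocket.accept();    // blocks until a client connects
            new Thread(() -> handle(socket)).start(); // one thread per connection
        }
    }

    static void handle(Socket socket) {
        try (socket;
             BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) { // blocks until data arrives
                out.println(line);                   // echo the line back
            }
        } catch (IOException ignored) {
        }
    }
}
```

The cost is visible in the structure itself: ten thousand idle connections mean ten thousand parked threads.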

2. Synchronous Non-blocking I/O (NIO) #

[Figure: synchronous non-blocking I/O (NIO)]

After introducing the network model of BIO, NIO is naturally easy to understand.

As shown in the above diagram, the application process no longer waits synchronously for the result after sending the I/O request to the kernel; instead, the call returns immediately and the process retrieves the result by polling. Although NIO greatly improves performance compared with BIO, polling issues a large number of system calls, and the resulting context switches carry significant overhead. Therefore, non-blocking I/O used alone is not efficient, and as concurrency increases it wastes more and more CPU time.
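A minimal JDK sketch of this polling style (names are illustrative): `accept()` on a channel in non-blocking mode returns immediately, so the caller must keep retrying until a connection is actually pending:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Non-blocking accept: the call never parks the thread; it returns null
// when nothing is pending, so the application polls in a loop.
public class NioPollingAccept {
    public static ServerSocketChannel open() throws IOException {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);   // switch the channel to non-blocking mode
        server.bind(new InetSocketAddress(0));
        return server;
    }

    public static SocketChannel pollAccept(ServerSocketChannel server, int maxAttempts)
            throws IOException, InterruptedException {
        for (int i = 0; i < maxAttempts; i++) {
            SocketChannel client = server.accept(); // null if no connection is pending
            if (client != null) {
                return client;
            }
            Thread.sleep(10); // busy polling: every retry is another system call
        }
        return null;
    }
}
```

Each fruitless `accept()` is a wasted system call, which is exactly the overhead described above.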

3. I/O Multiplexing #

[Figure: I/O multiplexing]

Multiplexing lets one thread handle multiple I/O handles. In “I/O multiplexing”, “multi” refers to multiple data channels, and “plexing” (reuse) means that one or a few fixed threads handle every Socket. select, poll, and epoll are all concrete implementations of I/O multiplexing: with a single select call, a thread can obtain the readiness status of multiple data channels from kernel space. Multiplexing solves the problems of both synchronous blocking I/O and synchronous non-blocking I/O, making it a very efficient I/O model.
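With the JDK’s Selector (the very abstraction Netty builds on), the polling loop collapses into a single blocking `select()` call that watches many channels at once. A minimal sketch with illustrative names:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// One Selector watches many channels; a single select() call tells the
// thread which of them are ready, so one thread can serve many sockets.
public class MultiplexingDemo {
    public static Selector listen(ServerSocketChannel server) throws IOException {
        Selector selector = Selector.open();
        server.configureBlocking(false);
        server.bind(new InetSocketAddress(0));
        server.register(selector, SelectionKey.OP_ACCEPT); // watch the listener for new connections
        return selector;
    }

    public static int serveOnce(Selector selector) throws IOException {
        int ready = selector.select(2000); // blocks until some channel is ready (or timeout)
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            if (key.isAcceptable()) {      // a connection is pending on the listener
                ServerSocketChannel server = (ServerSocketChannel) key.channel();
                SocketChannel client = server.accept();
                client.configureBlocking(false);
                client.register(selector, SelectionKey.OP_READ); // now watch it for reads too
            }
        }
        return ready;
    }
}
```

One `select()` replaces a whole round of per-channel polling calls, which is where the efficiency comes from.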

4. Signal-driven I/O #

[Figure: signal-driven I/O]

Signal-driven I/O is not commonly used; it is a semi-asynchronous I/O model. With signal-driven I/O, the kernel notifies the application process by sending a SIGIO signal when the data is ready, and the application process can then start reading the data.

5. Asynchronous I/O #

[Figure: asynchronous I/O]

The most important point about asynchronous I/O is that even the copy from the kernel buffer to the user-space buffer is done asynchronously by the system; the application process only needs to use the data already placed in the specified buffer. The key difference between asynchronous I/O and the semi-asynchronous signal-driven model is when the notification arrives: signal-driven I/O tells the process when an I/O operation can begin, while asynchronous I/O tells it when the operation has already completed.
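The JDK exposes this model through NIO.2 (AsynchronousSocketChannel). In this minimal sketch (class and method names are my own), `read()` returns at once, and by the time its Future completes the kernel has already copied the data into our buffer:

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// AIO sketch: we are notified of completion, not of readiness; the data is
// already in `buffer` when the Future resolves. A CompletionHandler callback
// works the same way.
public class AioReadDemo {
    public static int readAsync(AsynchronousSocketChannel channel, ByteBuffer buffer)
            throws Exception {
        Future<Integer> pending = channel.read(buffer); // initiate; returns immediately
        return pending.get(5, TimeUnit.SECONDS);        // bytes are already in the buffer on completion
    }
}
```

Compare this with the Selector example: there, readiness is reported and the application still performs the read; here, the read itself has already happened.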

After walking through these five I/O models, let’s look at how Netty implements its own. Netty’s I/O model is based on non-blocking I/O and, at the bottom, relies on the Selector multiplexer of the JDK NIO framework. A Selector can poll multiple Channels simultaneously, and when it is backed by epoll, a single thread polling the Selector can handle thousands or even tens of thousands of clients.

In the I/O multiplexing scenario, once data is ready, an event dispatcher (Event Dispatcher) is needed to dispatch read and write events to the corresponding read and write event handlers (Event Handler). Event dispatchers follow one of two design patterns: Reactor and Proactor. Reactor uses synchronous I/O, while Proactor uses asynchronous I/O.

Reactor is relatively simple to implement and well suited to scenarios where each I/O operation is short, though long-running operations can block it. Proactor offers higher performance, but its implementation logic is very complex. Currently, mainstream event-driven models still rely on select or epoll.

[Figure: master-slave Reactor multi-thread model]

(Adapted from “Scalable IO in Java” by Doug Lea)

The above diagram shows the master-slave Reactor multi-thread model that Netty uses. All I/O events are registered with an I/O multiplexer; when an event is ready, the multiplexer hands it to an event dispatcher, which distributes it to the corresponding event handler. This thread model avoids synchronization problems and the resource overhead of thread switching, truly achieving high performance and low latency.
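As a rough structural illustration of the master-slave idea (not Netty’s actual implementation, and all names here are my own), the following JDK NIO sketch uses a “boss” selector that only accepts connections and a “worker” selector that multiplexes reads, roughly what Netty’s boss and worker EventLoopGroups do in a far more complete way:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Minimal master-slave Reactor: bossLoop() accepts and hands channels off;
// workerLoop() registers them and echoes whatever it reads.
public class MiniReactor {
    final Selector bossSelector = Selector.open();
    final Selector workerSelector = Selector.open();
    final Queue<SocketChannel> handoff = new ConcurrentLinkedQueue<>();
    final ServerSocketChannel server = ServerSocketChannel.open();

    public MiniReactor() throws IOException {
        server.configureBlocking(false);
        server.bind(new InetSocketAddress(0));
        server.register(bossSelector, SelectionKey.OP_ACCEPT);
    }

    public int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    // Main reactor: only accepts, then dispatches the channel to the worker.
    public void bossLoop() {
        try {
            while (bossSelector.isOpen()) {
                bossSelector.select(200);
                Iterator<SelectionKey> it = bossSelector.selectedKeys().iterator();
                while (it.hasNext()) {
                    it.next(); it.remove();
                    SocketChannel client = server.accept();
                    if (client != null) {
                        client.configureBlocking(false);
                        handoff.add(client);     // hand off to the sub reactor
                        workerSelector.wakeup(); // break it out of select()
                    }
                }
            }
        } catch (IOException ignored) { }
    }

    // Sub reactor: registers handed-off channels and services their reads.
    public void workerLoop() {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        try {
            while (workerSelector.isOpen()) {
                workerSelector.select(200);
                SocketChannel pending;
                while ((pending = handoff.poll()) != null) {
                    pending.register(workerSelector, SelectionKey.OP_READ);
                }
                Iterator<SelectionKey> it = workerSelector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next(); it.remove();
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    int n = client.read(buffer);
                    if (n > 0) { buffer.flip(); client.write(buffer); } // echo handler
                    else if (n < 0) { key.cancel(); client.close(); }
                }
            }
        } catch (IOException ignored) { }
    }
}
```

Even this toy version shows the division of labor: the accept path and the read/write path never compete for the same selector, so neither can starve the other.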

Perfectly compensating for the deficiencies of Java NIO #

Before JDK 1.4 was introduced, only the BIO mode was available. The development process was relatively simple. A new thread was created to handle each incoming connection. As the concurrency level increased, BIO quickly encountered performance bottlenecks. Since JDK 1.4, NIO technology has been introduced, supporting select and poll. JDK 1.5 added support for epoll, and JDK 1.7 introduced NIO2, supporting the AIO model. Java has made great progress in the field of networking.

Since the performance of JDK NIO is already very good, why choose Netty? This is because Netty does what JDK should do, but does it more comprehensively. Let’s take a look at the outstanding advantages of Netty compared to JDK NIO.

  • Ease of use. When using JDK NIO programming, you need to understand many complex concepts, such as Channels, Selectors, Sockets, Buffers, etc., which can be very complicated to code. In contrast, Netty provides a higher level of encapsulation on top of NIO, shielding the complexity of NIO. Netty provides a more user-friendly API, and its unified API (blocking/non-blocking) greatly reduces the difficulty for developers to get started. At the same time, Netty provides many out-of-the-box tools, such as commonly used line decoders, length field decoders, etc., which would require you to implement them yourself in JDK NIO.
  • Stability. Netty is more reliable and stable. It fixes or works around many known JDK NIO issues, such as the notorious epoll bug that causes an empty Selector spin and 100% CPU usage, and it handles TCP reconnection, keep-alive detection, and so on.
  • Scalability. Netty’s scalability is reflected in many aspects. Here I mainly list two points: one is the customizable thread model, which allows users to choose the reactor thread model through configuration parameters; the other is the extensible event-driven model, which separates concerns between the framework layer and the business layer. In most cases, developers only need to focus on implementing the business logic of ChannelHandlers.

Lower resource consumption #

As a network communication framework, Netty must process large volumes of network data, which inevitably means creating and destroying large numbers of objects, and that is unfriendly to JVM garbage collection. To reduce GC pressure, Netty mainly adopts two optimizations:

  • Object pool reuse technology. By reusing objects, Netty avoids the overhead of frequent object creation and destruction.
  • Zero-copy technology. In addition to operating system-level zero-copy technology, Netty provides more user-level zero-copy technologies. For example, Netty uses DirectBuffer directly during I/O reading and writing, avoiding the need for data to be copied between heap and non-heap memory.
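The heap-versus-direct distinction Netty exploits is visible in the JDK itself: a direct buffer lives outside the Java heap and has no backing heap array, so socket I/O can use it without an extra heap-to-native copy. A minimal sketch (class name is my own):

```java
import java.nio.ByteBuffer;

// Heap vs direct buffers: direct buffers are allocated in native memory,
// which is the JDK facility underlying Netty's pooled DirectBuffer usage.
public class DirectBufferDemo {
    public static ByteBuffer heap(int capacity)   { return ByteBuffer.allocate(capacity); }
    public static ByteBuffer direct(int capacity) { return ByteBuffer.allocateDirect(capacity); }
}
```

Netty goes further by pooling these direct buffers, amortizing the relatively high cost of allocating native memory.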

Because Netty delivers high performance, low latency, and low resource consumption, while also making up for the deficiencies of Java NIO, it is becoming increasingly popular among developers for network programming.

Choosing a network framework #

Many developers have used Tomcat. As an excellent web server, Tomcat seems to have solved similar problems for us. So what are the differences between Tomcat and Netty?

The biggest difference between Netty and Tomcat lies in communication protocol support. Tomcat is essentially an HTTP server, focused on the HTTP protocol. Netty, in addition to HTTP, supports multiple other protocols such as SSH and TLS/SSL, and even lets you define custom application layer protocols.

Tomcat needs to comply with the Servlet specification. Before Servlet 3.0, it adopted a synchronous blocking model. Since Tomcat 6.x, it has supported NIO, which greatly improved its performance. However, Netty and Tomcat have different focuses, so Netty is not constrained by the Servlet specification and can maximize the advantages of NIO.

If you only need an HTTP server, I recommend using Tomcat. It is more mature and stable in this regard. But if you need to develop TCP-oriented network applications, Netty is your best choice.

In addition, other well-known network frameworks include Mina and Grizzly. Mina is the underlying NIO framework of the Apache Directory server. Since Mina and Netty were both led by Trustin Lee, their design principles are essentially the same; Netty appeared later and can be regarded as an upgraded version of Mina that resolves some of Mina’s design problems. For example, Netty provides an extensible encoding and decoding interface and optimizes the way ByteBuffers are allocated, making them more convenient and safer to use. Grizzly, from Sun Microsystems, is not as elegant as Netty in design; it is little more than a thin encapsulation of Java NIO and has seen relatively limited adoption in industry.

In summary, Netty is a better choice for us.

Current status of Netty #

The success of Netty is inseparable from the careful operation of the community. It has short iteration cycles and relatively complete documentation. If you encounter any problems, you can get very timely responses through issues or emails.

You can learn relevant information from the official community. The following websites can help you learn:

  • Official community.
  • GitHub. As of July 2020, Netty had received over 24,000 stars and was used by over 40,000 projects. The project provides stable 3.x and 4.x release lines, while the once-experimental 5.x branch has been abandoned by the author; no stable 5.x version was ever released. In my work, I have seen business teams adopt Netty 5.x directly for new projects, trusting the community and hoping to avoid a future upgrade. Unfortunately, that shortcut backfired when 5.x was abandoned. The lesson: avoid using unstable versions of any component in a production environment.

If there is no project burden, the current mainstream recommendation is the stable version of Netty 4.x. There have been major changes from Netty 3.x to 4.x, and they are not compatible with each other. Below, let’s briefly understand the changes and new features worth your attention in version 4.x.

  • Project structure: higher modularity; the package name changed from org.jboss.netty to io.netty, as the project no longer belongs to JBoss.
  • Commonly used APIs: Most APIs now support the fluent style. For more new APIs, refer to the following link: https://netty.io/news/2013/06/18/4-0-0-CR5.html.
  • Buffer optimization: buffer-related functionality has been substantially reworked.
    1. ChannelBuffer has been renamed ByteBuf, and the buffer utility classes can now be used independently. Thanks to its user-friendly API design, ByteBuf is a strong replacement for Java’s ByteBuffer.
    2. Buffers can grow dynamically, and buffer capacity can be changed more safely.
    3. A new CompositeByteBuf type has been added, which can be used to reduce data copying.
    4. More GC-friendly: pooled buffers were added, and starting from version 4.1 the jemalloc-style pooled allocator became the default.
    5. Built-in memory leak detection.
  • General utility classes: The io.netty.util.concurrent package provides a number of data structures for asynchronous programming.
  • More rigorous thread model control reduces the mental burden of writing ChannelHandlers; in most cases users do not have to worry about thread safety.
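As a taste of the Buffer pain point that point 1 above addresses: the JDK’s ByteBuffer shares a single position between reads and writes, so you must remember to `flip()` between them (a classic source of bugs), whereas ByteBuf keeps separate reader and writer indexes. A minimal JDK-side illustration (names are my own):

```java
import java.nio.ByteBuffer;

// ByteBuffer's single shared position means every write-then-read sequence
// needs an explicit flip(); ByteBuf's readerIndex/writerIndex remove this.
public class FlipDemo {
    public static byte[] writeThenRead(byte[] data) {
        ByteBuffer buf = ByteBuffer.allocate(data.length);
        buf.put(data);    // position is now at the end of what we wrote
        buf.flip();       // REQUIRED before reading: limit = position, position = 0
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }
}
```

Forgetting the `flip()` here would read zero bytes from the end of the buffer instead of the data just written.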

Netty 4.x brings a lot of improvements, making its performance and robustness more powerful. The spirit of continuous improvement and precise design of Netty is worth learning for everyone. Of course, there are more detailed changes, and interested students can refer to the following link: https://netty.io/wiki/new-and-noteworthy-in-4.0.html. If you are not clear about these concepts now, there is no need to worry. I will explain them in detail in the subsequent content of this column.

Who is using Netty? #

With its strong community influence, Netty is being adopted by more and more companies as their underlying communication framework. The following figure lists some of the companies using Netty, to give you a sense of its popularity.

[Figure: companies using Netty]

Netty has been validated by many famous products in large-scale online applications, and its robustness and stability have been recognized by the industry. Some typical products include:

  • Service governance: Apache Dubbo, gRPC.
  • Big data: HBase, Spark, Flink, Storm.
  • Search engine: Elasticsearch.
  • Message queue: RocketMQ, ActiveMQ.

There are many more excellent products that I won’t list one by one. Interested friends can refer to the following link: https://netty.io/wiki/related-projects.html.

Summary #

As an appetizer before the formal lessons of this column, today I mainly introduced Netty’s advantages and features, and touched on essential knowledge points such as I/O multiplexing, the Reactor design pattern, and zero-copy, to give you a basic understanding of Netty. I believe it has whetted your appetite; in the subsequent chapters, we will step into the world of Netty together.

Finally, I would like to leave you with a question to ponder: What is the internal structure of Netty? Why can Netty become such an excellent tool? I will answer this question for you in the next lesson.