40 Case Study: When Network Request Latency Increases, What Should I Do? #

Hello, I’m Ni Pengfei.

In the previous section, we learned about methods to mitigate Distributed Denial of Service (DDoS) attacks. To quickly recap: a DDoS attack floods the target with a large number of forged requests, so that the victim consumes substantial resources processing these invalid requests and can no longer respond to legitimate users.

Because DDoS attacks are distributed, high-volume, and hard to trace, there is currently no foolproof way to defend against them completely; we can only try to mitigate their impact.

For example, you can purchase professional traffic scrubbing devices and network firewalls to block malicious traffic at the network entrance, allowing normal traffic to enter the servers in the data center.

On Linux servers, you can improve the server's resistance to attacks and reduce the impact of DDoS on normal services through methods such as kernel tuning, DPDK, and XDP. At the application level, you can use multi-level caching, a WAF, a CDN, and similar measures to further reduce the impact of DDoS attacks on the application.

However, please note that if the DDoS traffic has already reached the Linux server, even if various optimizations are implemented in the application layer, the network service latency will generally still be much higher than normal.

Therefore, in practical applications, we usually need to have Linux servers cooperate with professional traffic scrubbing devices and network firewalls to alleviate this issue.

In addition to DDoS attacks causing increased network latency, I’m sure you have encountered many other causes of network latency, such as:

  • Slow network transmission causing latency.

  • Slow packet processing in the Linux kernel protocol stack causing latency.

  • Slow application data processing causing latency, and so on.

So, when facing these causes of latency, what should we do? How do we identify the root cause of network latency? Today, I will use a case study to show you how to deal with these issues.

Network Latency #

When it comes to network latency, its meaning is easy to guess: the time it takes to transmit data over the network. Note, however, that this time can be unidirectional, referring to the one-way time from the source address to the destination address, or bidirectional, referring to the time from the source address to the destination address plus the time for the response to come back.

Usually, we are more accustomed to bidirectional communication latency, such as the result of a ping test, which is the round-trip time (RTT).

In addition to network latency, another commonly used indicator is application latency, which refers to the time it takes for an application to receive a request and respond. Usually, application latency also refers to round-trip latency, which is the sum of network data transmission time and data processing time.
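For a quick client-side look at application latency, curl can print its internal timers. The following is only a simple sketch, and example.com is just a placeholder host:

# %{time_connect}: TCP connect time, %{time_starttransfer}: time to first byte, %{time_total}: total time
$ curl -o /dev/null -s -w 'connect: %{time_connect}s  ttfb: %{time_starttransfer}s  total: %{time_total}s\n' http://example.com/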

In the Linux Network Basics, I mentioned that you can use ping to test network latency. Ping is based on ICMP: it obtains the round-trip delay by computing the time difference between the ICMP echo request and the ICMP echo reply. This process requires no special authentication, so it is often exploited by network attacks, using tools such as the port scanner nmap and the packet crafting tool hping3.
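For example, a basic ICMP round-trip test looks like this (it only works when the target does not filter ICMP):

# -c 3 sends three echo requests
$ ping -c 3 baidu.com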

Therefore, to avoid these problems, many network services disable ICMP, which means we cannot use ping to test the availability and round-trip latency of network services. In this case, you can use traceroute or hping3 in TCP and UDP modes to obtain network latency.

For example, taking baidu.com as an example, you can run the following hping3 command to test the network latency from your machine to the Baidu search server:

# -c 3 sends three requests, -S sets the TCP SYN flag, -p 80 sets the destination port
$ hping3 -c 3 -S -p 80 baidu.com
HPING baidu.com (eth0 123.125.115.110): S set, 40 headers + 0 data bytes
len=46 ip=123.125.115.110 ttl=51 id=47908 sport=80 flags=SA seq=0 win=8192 rtt=20.9 ms
len=46 ip=123.125.115.110 ttl=51 id=6788  sport=80 flags=SA seq=1 win=8192 rtt=20.9 ms
len=46 ip=123.125.115.110 ttl=51 id=37699 sport=80 flags=SA seq=2 win=8192 rtt=20.9 ms

--- baidu.com hping statistic ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 20.9/20.9/20.9 ms

From the hping3 results, you can see that the round-trip delay (RTT) is 20.9ms.

Of course, we can also get similar results using traceroute:

# --tcp indicates using the TCP protocol, -p represents the port number, -n indicates no reverse DNS lookup for IP addresses in the result
$ traceroute --tcp -p 80 -n baidu.com
traceroute to baidu.com (123.125.115.110), 30 hops max, 60 byte packets
 1  * * *
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * * *
13  * * *
14  123.125.115.110  20.684 ms *  20.798 ms

Traceroute sends three packets at each hop of the route and outputs the round-trip delay upon receiving a response. If there is no response or the response times out (default 5s), it will output an asterisk.
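If mtr is available on your machine, it combines per-hop latency and loss statistics in a single report and also supports TCP probes; a rough equivalent of the traceroute command above would be:

# -n: no DNS lookups, -T: TCP probes, -P 80: destination port, -c 3: three probe rounds, --report: print a summary
$ mtr -n -T -P 80 -c 3 --report baidu.com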

Now that we know the methods for testing network service latency based on TCP, let’s proceed to learning the analysis approach when network latency increases through a case study.

Case Preparation #

The following case is still based on Ubuntu 18.04 and is also applicable to other Linux systems. The configuration of the case environment I used is as follows:

  - Machine configuration: 2 CPUs, 8GB memory.

  - Pre-install tools such as docker, hping3, tcpdump, curl, wrk, Wireshark, etc. For example: apt-get install docker.io hping3 tcpdump curl.

You should be familiar with these tools. The installation and usage of wrk have been introduced in How to Evaluate System Network Performance. If you haven’t installed it yet, please execute the following command to install it:

$ git clone https://github.com/wg/wrk
$ cd wrk
$ apt-get install build-essential -y
$ make
$ sudo cp wrk /usr/local/bin/

Since Wireshark requires a graphical interface, if your virtual machine does not have a graphical interface, you can install Wireshark on another machine (such as a Windows laptop).

Two virtual machines are used in this case, and I have drawn a diagram to represent their relationship.

Next, let’s open two terminals, SSH into each of the two machines respectively (the following steps assume that the terminal number is consistent with the VM number in the diagram), and install these mentioned tools. Note that curl and wrk only need to be installed on the client VM (i.e. VM2).

As with previous cases, all the following commands are run as root by default. If you are logged into the system as a normal user, please run the command sudo su root to switch to the root user.

If you encounter any problems during the installation process, it is recommended that you first search for a solution yourself. If you can’t solve it, you can ask me in the comments section. If you have installed them before, you can ignore this point.

Next, let’s move on to the practical operations of the case.

Case Study #

To compare the impact of increased latency, let's first run the simplest case: the official Nginx image. In terminal 1, execute the following command to start the official Nginx container, which will listen on port 80:

$ docker run --network=host --name=good -itd nginx
fb4ed7cb9177d10e270f8320a7fb64717eac3451114c9fab3c50e02be2e88ba2

Continuing in terminal 1, execute the following command to run the example application, which will listen on port 8080:

$ docker run --name nginx --network=host -itd feisky/nginx:latency
b99bd136dcfd907747d9c803fdc0255e578bad6d66f4e9c32b826d75b6812724

Then, in terminal 2, execute the curl command to verify that both containers have been started successfully. If everything is fine, you will see the following output:

# Port 80 is working fine
$ curl http://192.168.0.30
<!DOCTYPE html>
<html>
...
<p><em>Thank you for using nginx.</em></p>
</body>
</html>

# Port 8080 is working fine
$ curl http://192.168.0.30:8080
...
<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Next, let’s use hping3 mentioned above to test the latency of the two ports and see if there is any difference. Still in terminal 2, execute the following commands to test the latency of port 80 and port 8080 of the example machine, respectively:

# Test the latency of port 80
$ hping3 -c 3 -S -p 80 192.168.0.30
HPING 192.168.0.30 (eth0 192.168.0.30): S set, 40 headers + 0 data bytes
len=44 ip=192.168.0.30 ttl=64 DF id=0 sport=80 flags=SA seq=0 win=29200 rtt=7.8 ms
len=44 ip=192.168.0.30 ttl=64 DF id=0 sport=80 flags=SA seq=1 win=29200 rtt=7.7 ms
len=44 ip=192.168.0.30 ttl=64 DF id=0 sport=80 flags=SA seq=2 win=29200 rtt=7.6 ms

--- 192.168.0.30 hping statistic ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 7.6/7.7/7.8 ms



# Test the latency of port 8080
$ hping3 -c 3 -S -p 8080 192.168.0.30
HPING 192.168.0.30 (eth0 192.168.0.30): S set, 40 headers + 0 data bytes
len=44 ip=192.168.0.30 ttl=64 DF id=0 sport=8080 flags=SA seq=0 win=29200 rtt=7.7 ms
len=44 ip=192.168.0.30 ttl=64 DF id=0 sport=8080 flags=SA seq=1 win=29200 rtt=7.6 ms
len=44 ip=192.168.0.30 ttl=64 DF id=0 sport=8080 flags=SA seq=2 win=29200 rtt=7.3 ms

--- 192.168.0.30 hping statistic ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 7.3/7.6/7.7 ms

From this output, you can see that the latency of both ports is similar, around 7ms. However, that was only for individual requests. What happens with concurrent requests? Let's try wrk next. Still in terminal 2, execute the following commands to test the performance of ports 80 and 8080 with 100 concurrent connections:

# Test performance on port 80
$ wrk --latency -c 100 -t 2 --timeout 2 http://192.168.0.30/
Running 10s test @ http://192.168.0.30/
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     9.19ms   12.32ms 319.61ms   97.80%
    Req/Sec     6.20k   426.80     8.25k    85.50%
  Latency Distribution
     50%    7.78ms
     75%    8.22ms
     90%    9.14ms
     99%   50.53ms
  123558 requests in 10.01s, 100.15MB read
Requests/sec:  12340.91
Transfer/sec:     10.00MB


# Test performance on port 8080
$ wrk --latency -c 100 -t 2 --timeout 2 http://192.168.0.30:8080/
Running 10s test @ http://192.168.0.30:8080/
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    43.60ms    6.41ms  56.58ms   97.06%
    Req/Sec     1.15k   120.29     1.92k    88.50%
  Latency Distribution
     50%   44.02ms
     75%   44.33ms
     90%   47.62ms
     99%   48.88ms
  22853 requests in 10.01s, 18.55MB read
Requests/sec:   2283.31
Transfer/sec:      1.85MB

From the above two outputs, we can see that the average latency of the official Nginx (listening on port 80) is 9.19ms, while the average latency of the test case Nginx (listening on port 8080) is 43.6ms. From the latency distribution, we can see that 90% of requests to the official Nginx can be completed within 9ms, while 50% of requests to the test case Nginx have already reached 44ms.

Combining this with the output of hping3 above, we can easily see that the test case Nginx has a much higher latency under concurrent requests. What could be the reason for this?

You probably remember the approach: in a previous lesson, we learned to use tcpdump to capture the packets sent and received and then analyze whether anything goes wrong during the network communication.

Next, in terminal one, execute the following tcpdump command to capture the network packets sent and received on port 8080 and save them to the nginx.pcap file:

$ tcpdump -nn tcp port 8080 -w nginx.pcap
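If you are worried about the capture file growing too large while wrk is running, you can optionally cap the number of captured packets (the 20000 limit here is arbitrary):

$ tcpdump -nn tcp port 8080 -c 20000 -w nginx.pcap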

Then switch to terminal two and re-execute the wrk command:

# Test performance on port 8080
$ wrk --latency -c 100 -t 2 --timeout 2 http://192.168.0.30:8080/

After the wrk command finishes, switch back to terminal one and press Ctrl+C to end the tcpdump command. Then, copy the captured nginx.pcap file to the machine with Wireshark (if VM1 already has a graphical interface, you can skip the copying step) and open it with Wireshark.
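If neither machine has a graphical interface, tshark (the command-line companion of Wireshark) offers a headless alternative; for example, you can list the ACKs whose measured delay exceeds 40ms straight from the capture file:

# tcp.analysis.ack_rtt is a Wireshark analysis field; 0.04 means 40ms
$ tshark -r nginx.pcap -Y 'tcp.analysis.ack_rtt > 0.04'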

Since there are a lot of network packets, we can filter them first. For example, after selecting a packet, you can right-click and select “Follow” -> “TCP Stream”, as shown in the picture below:

Then, close the pop-up dialog and return to the Wireshark main window. At this time, you will find that Wireshark has automatically set a filter expression “tcp.stream eq 24” for you, as shown in the picture below (the source and destination IP addresses are omitted in the picture):

From here, you can see the request and response for each step of this TCP connection, starting from the three-way handshake. However, this may not be intuitive enough. You can go further by clicking "Statistics -> Flow Graph" in the menu bar, checking "Limit to display filter", and setting the Flow type to "TCP Flows":

Note that the left side of this graph is the client, and the right side is the Nginx server. From this graph, we can see that the initial three-way handshake and the first HTTP request and response were relatively fast, but the second HTTP request was slower, especially the client’s ACK response which was sent 40ms after receiving the first server segment (shown by the blue row in the graph).

Seeing the value of 40ms, does it remind you of something? Actually, this is the minimum timeout for TCP delayed acknowledgment (Delayed ACK).

Let me explain delayed acknowledgment. It is an optimization of TCP ACKs: instead of sending an ACK immediately for every segment it receives, TCP waits for a while (for example, 40ms) to see whether the ACK can "hitch a ride" on outgoing data. If there is data to send within that window, the ACK piggybacks on it; if not, a standalone ACK is sent once the timer expires.

Because the occurrence of 40ms is on the client side in this case, we have reason to suspect that the client has enabled the delayed acknowledgment mechanism. And in this case, the client is actually the wrk tool we ran earlier.

If you consult the TCP documentation (run man tcp), you will find that the TCP_QUICKACK option has to be specifically set for TCP sockets to enable fast acknowledgment mode. Otherwise, by default, the delayed acknowledgment is used:

TCP_QUICKACK (since Linux 2.4.4)
Enable quickack mode if set or disable quickack mode if cleared. In quickack mode, acks are sent immediately, rather than delayed if needed in accordance to normal TCP operation. This flag is not permanent, it only enables a switch to or from quickack mode. Subsequent operation of the TCP protocol will once again enter/leave quickack mode depending on internal protocol processing and factors such as delayed ack timeouts occurring and data transfer. This option should not be used in code intended to be portable.
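Besides the per-socket TCP_QUICKACK option, Linux 3.11 and later can also force immediate ACKs per route. The following is only a sketch, assuming a directly connected route for 192.168.0.0/24 on eth0 already exists, and it changes client-side ACK behavior rather than fixing the server:

# Disable delayed ACK for traffic using this route
$ ip route change 192.168.0.0/24 dev eth0 quickack 1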

To verify our assumption and confirm wrk’s behavior, we can use strace to observe which TCP options are set for the socket by wrk.

For example, you can switch to terminal 2 and execute the following command:

$ strace -f wrk --latency -c 100 -t 2 --timeout 2 http://192.168.0.30:8080/
...
setsockopt(52, SOL_TCP, TCP_NODELAY, [1], 4) = 0
...

This way, you can see that wrk only sets the TCP_NODELAY option without setting TCP_QUICKACK. This indicates that wrk is using delayed acknowledgment, which explains the 40ms delay mentioned earlier.
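To cut down on noise, you can optionally restrict strace to setsockopt calls and grep for the two options we care about (a sketch; strace writes to stderr, hence the redirection):

$ strace -f -e trace=setsockopt wrk --latency -c 100 -t 2 --timeout 2 http://192.168.0.30:8080/ 2>&1 | grep -E 'TCP_(NODELAY|QUICKACK)'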

However, don’t forget that this is only the behavior of the client. In theory, the Nginx server should not be affected by this behavior. Did we miss any clues when analyzing the network packets? Let’s go back to Wireshark and take another look.

Carefully observe the Wireshark window. Packet 1173 is the delayed ACK mentioned earlier, and the next line, packet 1175, is the second segment sent by Nginx. Together with packet 697, it forms a complete HTTP response (both segments carry the same ACK number, 85).

The second segment is not sent together with the previous segment (packet 697); instead, it is sent after the client ACKs the first segment (packet 1173). This seems similar to delayed acknowledgment, but this time it is not an ACK, but rather data being sent.

Seeing this, I believe you will recall something - the Nagle algorithm. Before further analyzing the case, let me briefly introduce this algorithm.

The Nagle algorithm is an optimization algorithm used in TCP to reduce the number of small packets sent, aiming to improve the utilization of the actual bandwidth.

For example, when the payload is only 1 byte, and in addition, the TCP and IP headers each occupy 20 bytes, the entire network packet becomes 41 bytes. In this case, the actual bandwidth utilization is only 2.4% (1/41). In a larger network scenario, if the entire bandwidth is occupied by such small packets, the overall network utilization would be very low.

The Nagle algorithm was designed to solve this problem by merging small TCP packets and thus improving bandwidth utilization. It stipulates that on a TCP connection there can be at most one small segment that has been sent but not yet acknowledged; until the ACK for that segment arrives, no further small segments are sent. Instead, the pending small writes are collected and transmitted together in a single segment once the ACK is received.

Obviously, the idea of the Nagle algorithm is good, but once you know about the default delayed acknowledgment mechanism in Linux, you may not feel the same way. When used together, they significantly increase network latency. The diagram below illustrates this:

  • When the server sends the first segment, because the client has delayed acknowledgment enabled, it has to wait for 40ms before replying with an ACK.

  • At the same time, because the server has Nagle enabled, and it has not yet received the ACK for the first segment, the server also waits here.

  • When the 40ms timer expires, the client finally replies with an ACK, and only then does the server continue to send the second segment.

Since this looks like a Nagle problem, how can we determine whether the case Nginx has Nagle enabled?

If we check the TCP documentation again, we find that the Nagle algorithm is disabled only when TCP_NODELAY is set. So we just need to check the tcp_nodelay option of Nginx.

TCP_NODELAY
              If set, disable the Nagle algorithm. This means that segments are always sent as soon as possible, even
              if there is only a small amount of data. When not set, data is buffered until there is a sufficient
              amount to send out, thereby avoiding the frequent sending of small packets, which results in poor
              utilization of the network. This option is overridden by TCP_CORK; however, setting this option forces
              an explicit flush of pending output, even if TCP_CORK is currently set.

Let’s go back to the first terminal and execute the following command to check the configuration of the case Nginx:

$ docker exec nginx cat /etc/nginx/nginx.conf | grep tcp_nodelay
    tcp_nodelay  off;

As expected, you can see that tcp_nodelay is turned off in the case Nginx. To fix this, we need to set it to on.
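If you would rather fix the running container in place than switch to a pre-built image, a rough sketch would be the following (it assumes the option sits in /etc/nginx/nginx.conf exactly as shown above):

# Flip tcp_nodelay to on inside the container and reload Nginx
$ docker exec nginx sed -i 's/tcp_nodelay  off;/tcp_nodelay  on;/' /etc/nginx/nginx.conf
$ docker exec nginx nginx -s reload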

After making the change, do we solve the problem? We need to verify it. I have already packaged the modified application into a Docker image. In the first terminal, execute the following commands to start it:

# Remove the case application
$ docker rm -f nginx

# Start the optimized application
$ docker run --name nginx --network=host -itd feisky/nginx:nodelay

Then, switch to the second terminal and re-execute the wrk command to test the latency:

$ wrk --latency -c 100 -t 2 --timeout 2 http://192.168.0.30:8080/
Running 10s test @ http://192.168.0.30:8080/
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     9.58ms   14.98ms 350.08ms   97.91%
    Req/Sec     6.22k   282.13     6.93k    68.50%
  Latency Distribution
     50%    7.78ms
     75%    8.20ms
     90%    9.02ms
     99%   73.14ms
  123990 requests in 10.01s, 100.50MB read
Requests/sec:  12384.04
Transfer/sec:     10.04MB

Indeed, now the latency has been reduced to 9ms, which is the same as the official Nginx image we tested (Nginx has tcp_nodelay enabled by default).

As a comparison, we can also use tcpdump to capture packets after the optimization (here we are actually capturing port 80, the port the official Nginx listens on). You will get the following result:

From the figure, you can see that Nginx no longer has to wait for an ACK: segments 536 and 540 are sent back to back. On the client side, although delayed acknowledgment is still enabled, it now receives two segments that need acknowledging in quick succession, so it does not have to wait 40ms and can acknowledge both with a single ACK right away.
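If you want to reproduce that comparison capture yourself, a sketch of the commands is below (the file name is arbitrary; keep tcpdump running in terminal 1 while wrk runs in terminal 2):

# Terminal 1: capture the official Nginx traffic on port 80
$ tcpdump -nn tcp port 80 -w nginx-good.pcap

# Terminal 2: generate load against port 80
$ wrk --latency -c 100 -t 2 --timeout 2 http://192.168.0.30/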

Finally, don’t forget to stop these two container applications. In the first terminal, execute the following command to remove the case application:

$ docker rm -f nginx good

Summary #

Today, we learned how to analyze increased network latency. Network latency is a core performance metric of the network. Because of factors such as network transmission and packet processing, some latency is unavoidable, but excessive latency directly affects the user experience.

Therefore, when you discover increased network latency, you can use various tools such as traceroute, hping3, tcpdump, Wireshark, and strace to identify potential issues in the network. For example:

  • Use tools like hping3 and wrk to confirm whether the network latency of single and concurrent requests is normal.
  • Use traceroute to confirm the correctness of the route and check the latency of each hop gateway in the route.
  • Use tcpdump and Wireshark to confirm the normal sending and receiving of network packets.
  • Use tools like strace to observe whether the application’s calls to network sockets are normal.

By systematically troubleshooting from the route, network packet transmission, and application layer, you can locate the root cause of the problem.

Reflection #

Finally, I would like to invite you to discuss your understanding of network latency and how you analyze it when you notice latency going up. You can summarize your thoughts by combining today's content with your own hands-on experience.

Feel free to discuss with me in the comment section and also share this article with your colleagues and friends. Let’s practice in real scenarios and make progress through communication.