15 Analysis How to Analyze TCP Retransmission Issues Efficiently #

Hello, I am Shao Yafang.

In the basic and case study chapters, we have discussed various issues such as RT jitter, packet loss, connection establishment failure, and so on. These issues are often accompanied by TCP retransmissions, so we usually capture TCP retransmission information to help us analyze these problems.

Moreover, TCP retransmission is a signal that we typically use to judge the stability of a system. For example, if a server has a high TCP retransmission rate, it usually indicates a problem affecting that server, and we need to act promptly, otherwise more serious faults may follow.

However, analyzing TCP retransmission rates is not an easy task. For example, if the TCP retransmission rate of a server is high, what business operation is causing it? Many people do not know how to analyze this. Therefore, in this lesson, I will guide you to understand what TCP retransmission is all about and how to efficiently analyze it.

What is TCP Retransmission? #

I mentioned an example of TCP retransmission rate in the “Preface” as shown in the graph below:

This kind of TCP retransmission rate monitoring is common at Internet companies. It is an indicator of server stability: if the rate is too high, like the spikes shown in the graph, the server is often unstable. So what does the TCP retransmission rate actually represent?

The TCP retransmission rate is calculated from the metrics in the /proc/net/snmp file. The two TCP counters in this file that matter here are OutSegs and RetransSegs.

The calculation formula for TCP retransmission rate is as follows:

retrans = (RetransSegs - last RetransSegs) / (OutSegs - last OutSegs) * 100
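The formula above can be sketched in a few lines of Python. This is only a minimal illustration: the function names parse_tcp_counters and retrans_rate are made up for this example, and the code assumes that /proc/net/snmp contains exactly one "Tcp:" header line followed by one "Tcp:" value line.

```python
def parse_tcp_counters(snmp_text):
    """Return the Tcp counters from the text of /proc/net/snmp as a dict."""
    tcp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Tcp:")]
    if len(tcp_lines) != 2:
        raise ValueError("expected one Tcp header line and one Tcp value line")
    # Drop the leading "Tcp:" token, then pair each name with its value.
    names, values = tcp_lines[0][1:], tcp_lines[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def retrans_rate(last, current):
    """Retransmission rate in percent between two counter snapshots."""
    out_delta = current["OutSegs"] - last["OutSegs"]
    retrans_delta = current["RetransSegs"] - last["RetransSegs"]
    if out_delta <= 0:
        # Nothing was sent in this interval, so there is nothing to divide by.
        return 0.0
    return retrans_delta / out_delta * 100
```

To use it, read /proc/net/snmp twice, some seconds apart, pass each text through parse_tcp_counters, and hand the two resulting dicts to retrans_rate.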

In other words, the number of TCP retransmission packets divided by the total number of TCP packets sent in a unit of time gives the TCP retransmission rate. Now let’s take a closer look at the RetransSegs and OutSegs in this formula. I have created two example graphs to demonstrate the changes of these metrics:

From these two example graphs, you can see that after the sender sends a TCP packet, it places the packet in its send queue, also known as the retransmission queue. At this point, OutSegs increases by 1 and the queue length is 1. If the sender receives an ACK from the receiver for this packet, the packet is removed from the queue and the queue length drops to 0.

If the sender does not receive an ACK for this packet, the retransmission mechanism is triggered. The example here shows timeout retransmission: when the sender sends a packet, it starts a retransmission timer, and if no ACK has arrived by the time the retransmission timeout (RTO) expires, it retransmits the packet. OutSegs then increases by 1 again, and RetransSegs also increases by 1.

This is the meaning of OutSegs and RetransSegs: for each TCP packet sent (including retransmitted packets), OutSegs increases by 1; for each retransmitted packet sent, RetransSegs increases by 1. The graph also shows how the retransmission queue changes, which is worth a closer look.
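The counter behaviour described above can be mimicked with a toy model. The SenderModel class below is purely hypothetical and has nothing to do with the real kernel code; it only replays the rules stated in the text (every transmission bumps OutSegs, every retransmission also bumps RetransSegs, an ACK removes the packet from the queue).

```python
class SenderModel:
    """Toy model of the sender-side counters; not the kernel implementation."""

    def __init__(self):
        self.out_segs = 0
        self.retrans_segs = 0
        self.retrans_queue = []  # sequence numbers still waiting for an ACK

    def send(self, seq):
        # A new packet is queued until it is acknowledged.
        self.retrans_queue.append(seq)
        self.out_segs += 1

    def ack(self, seq):
        # An ACK removes the packet from the retransmission queue.
        if seq in self.retrans_queue:
            self.retrans_queue.remove(seq)

    def rto_expired(self, seq):
        # The timer fired before an ACK arrived: send the packet again.
        if seq in self.retrans_queue:
            self.out_segs += 1
            self.retrans_segs += 1
```

Sending one packet that is acknowledged and a second one that times out once leaves out_segs at 3 and retrans_segs at 1, which is the situation the two graphs describe.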

In addition to the timeout retransmission shown in the graph above, there is also a mechanism called fast retransmission. For details about fast retransmission, you can refer to “Lesson 13”, so I won’t describe it in detail here.

Now that we understand how TCP retransmission is defined, let’s continue to look at the situations that can cause TCP retransmission.

The situations that can cause TCP retransmission can be broadly classified into the following two categories:

Packet Loss - TCP packets can be dropped in the network; the receiver itself may discard the packet; ACKs sent by the receiver can be dropped in the network; the receiver may discard a packet that was corrupted in transit… In all of these cases, the sender retransmits the TCP packet.
Congestion - TCP packets may queue at a switch or router in the network, as in the infamous bufferbloat; TCP packets may be reordered by routing changes in the network; ACKs sent by the receiver may queue at a switch or router… Here the packet or its ACK arrives late rather than never, but if the sender does not see the ACK in time, it still retransmits the TCP packet.

In summary, TCP retransmission can serve as a signal of communication quality, and we need to pay attention to it.

So, when we find a high TCP retransmission rate on a host, how do we analyze it?

Common Methods for Analyzing TCP Retransmission #

The most common analysis method is using tcpdump, which allows us to save the packets coming in and out of a specific network interface:

$ tcpdump -s 0 -i eth0 -w tcpdumpfile

Then we can use the tshark tool (the command-line version of Wireshark) to filter out the TCP retransmission packets. Current tshark versions take the display filter with -Y; older versions accepted it with -R:

$ tshark -r tcpdumpfile -Y tcp.analysis.retransmission

If there are any retransmission packets, they will be displayed. Here is an example of a TCP retransmission:

3481  20.277303 10.17.130.20 -> 124.74.250.144 TCP 70 [TCP Retransmission] 35993 > https [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=3231504691 TSecr=0

3659  22.277070 10.17.130.20 -> 124.74.250.144 TCP 70 [TCP Retransmission] 35993 > https [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=3231506691 TSecr=0

8649  46.539393 58.216.21.165 -> 10.17.130.20 TLSv1 113 [TCP Retransmission] Change Cipher Spec, Encrypted Handshake Messag

With tcpdump, we can see detailed information about each TCP retransmission. In the examples above, the first two retransmissions occurred on the TCP connection between 10.17.130.20:35993 and 124.74.250.144:443. The [SYN] flag tells us that they happened during the three-way handshake. With this information, we can go on to analyze why the HTTPS service on the host 124.74.250.144 is unable to establish new connections.
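When the capture contains many retransmissions, it helps to summarize the tshark output per address pair. The following is a rough sketch that assumes the output lines look like the examples above (frame number, time, source, arrow, destination, protocol, length, info). The names LINE_RE and summarize_retransmissions are invented for this illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the tshark examples above. Newer tshark
# versions print a Unicode arrow instead of "->", so both are accepted.
LINE_RE = re.compile(
    r"^\s*(?P<frame>\d+)\s+(?P<time>\d+\.\d+)\s+"
    r"(?P<src>\S+)\s+(?:->|→)\s+(?P<dst>\S+)\s+"
    r"(?P<proto>\S+)\s+(?P<length>\d+)\s+"
    r"\[TCP Retransmission\]\s+(?P<info>.*)$"
)


def summarize_retransmissions(lines):
    """Count retransmissions, and handshake retransmissions, per (src, dst)."""
    total = Counter()
    handshake = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a retransmission line in the expected layout
        pair = (match["src"], match["dst"])
        total[pair] += 1
        # "[SYN" covers both "[SYN]" and "[SYN, ACK]" in the info column.
        if "[SYN" in match["info"]:
            handshake[pair] += 1
    return total, handshake
```

Feeding it the three example lines reports two handshake retransmissions from 10.17.130.20 to 124.74.250.144 and one non-handshake retransmission from 58.216.21.165.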

However, we all know that tcpdump is fairly heavyweight. If we capture packets directly in a production environment, the business will inevitably see some performance impact. So, are there any more lightweight analysis methods available?

How to Analyze TCP Retransmissions Efficiently? #

In fact, just like how an application implements certain functionalities by calling corresponding functions, TCP retransmissions also require calling specific kernel functions. One such kernel function is tcp_retransmit_skb(). You can think of the “skb” in this function name as a network packet that needs to be sent. Therefore, if we want to efficiently track TCP retransmissions, we can directly trace this function.

The most common method to trace kernel functions is by using Kprobes. The general principle of Kprobes is as follows:

You can implement a kernel module that uses Kprobes to insert a probe at the entry point of the tcp_retransmit_skb function and registers a handler for it (the pre_handler in the Kprobes API). When tcp_retransmit_skb is executed, the probe raises an exception and control jumps to the registered handler. In the handler, you can parse the TCP packet (skb) to determine what is being retransmitted.

If you find implementing a kernel module too cumbersome, you can use Kprobes through the ftrace framework instead. tcpretrans, implemented by Brendan Gregg, takes this approach, and you can use it directly to trace TCP retransmissions. The tool has some limitations, though. It works out what is being retransmitted by reading the /proc/net/tcp file, so the information it can report is relatively limited, and if a TCP connection is short-lived, the tool will not be able to report on it.

To use this tool, you also need to make sure that the ftrace tracing feature is enabled in your kernel. In other words, /sys/kernel/debug/tracing/tracing_on must contain 1. On CentOS-6, /sys/kernel/debug/tracing/tracing_enabled must also be set to 1.

$ cat /sys/kernel/debug/tracing/tracing_on 
1

If it is set to 0, you need to enable it, for example:

$ echo 1 > /sys/kernel/debug/tracing/tracing_on

After the tracing is completed, you need to disable it again:

$ echo 0 > /sys/kernel/debug/tracing/tracing_on
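It is easy to forget the second step, so a script can wrap the enable and disable pair. The sketch below is one possible way to do that. The name tracing_switch_on is invented for this example, and the code deviates slightly from the commands above: instead of always writing 0 at the end, it restores whatever value the file held before, so that it does not switch off tracing that somebody else had already enabled. Writing to the real file requires root.

```python
from contextlib import contextmanager

TRACING_ON = "/sys/kernel/debug/tracing/tracing_on"


@contextmanager
def tracing_switch_on(path=TRACING_ON):
    """Write 1 to an ftrace switch file, then restore its previous value."""
    with open(path) as f:
        previous = f.read().strip()
    with open(path, "w") as f:
        f.write("1")
    try:
        yield
    finally:
        # Runs even if the traced section raises, so the switch is not
        # left enabled by accident.
        with open(path, "w") as f:
            f.write(previous or "0")
```

A script would then run its tracing inside "with tracing_switch_on():" and let the context manager handle the cleanup.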

Since Kprobes works by using exceptions, it does have some performance overhead. Therefore, it is recommended to avoid using Kprobes on the fast path of TCP packet transmission. However, since the retransmission path is the slow path, adding Kprobes to the retransmission path does not cause performance concerns.

Because Kprobes is somewhat inconvenient to use, the Linux 4.16 kernel added a dedicated TCP tracepoint for TCP retransmission events to make them easier for Linux users to observe. If you are on an older release such as CentOS-7, this tracepoint is not available. If you are on CentOS-8 or later, you can use the tracepoint directly to trace TCP retransmissions with the following commands:

$ cd /sys/kernel/debug/tracing/events/
$ echo 1 > tcp/tcp_retransmit_skb/enable

Then, you can trace TCP retransmission events:

$ cat /sys/kernel/debug/tracing/trace_pipe
<idle>-0     [007] ..s. 265119.290232: tcp_retransmit_skb: sport=22 dport=62264 saddr=172.23.245.8 daddr=172.30.18.225 saddrv6=::ffff:172.23.245.8 daddrv6=::ffff:172.30.18.225 state=TCP_ESTABLISHED

You can see that when a TCP retransmission occurs, the basic information about the event is printed. As an additional note, the "state=TCP_ESTABLISHED" field was not present in earlier versions. Without this field, we cannot tell whether the retransmission occurred during the three-way handshake. Therefore, I contributed a patch to the kernel that exposes the TCP connection state for problem analysis. You can refer to the commit "tcp: expose sk_state in tcp_retransmit_skb tracepoint".
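Because the event is printed as key=value pairs, it is straightforward to post-process. The following sketch parses one trace_pipe line into a dict. The function names are made up for this example, and treating TCP_SYN_SENT and TCP_SYN_RECV as "handshake" states is an assumption of this sketch rather than something the tracepoint defines.

```python
EVENT_MARKER = "tcp_retransmit_skb:"

# Assumption for this sketch: these state names mean the connection
# had not finished its three-way handshake yet.
HANDSHAKE_STATES = ("TCP_SYN_SENT", "TCP_SYN_RECV")


def parse_retransmit_event(line):
    """Return the key=value fields of a tcp_retransmit_skb trace line."""
    _, marker, payload = line.partition(EVENT_MARKER)
    if not marker:
        return None  # some other event, or not a trace line at all
    # Split on the first "=" only, so values keep any later "=" intact.
    return dict(token.split("=", 1) for token in payload.split()
                if "=" in token)


def is_handshake_retransmission(event):
    """True if the event carries a state that belongs to the handshake."""
    return event.get("state") in HANDSHAKE_STATES
```

For the example line above, the parsed dict holds sport "22", dport "62264" and state "TCP_ESTABLISHED", so the retransmission is not classified as a handshake retransmission. On kernels without the state field, is_handshake_retransmission simply returns False.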

After tracing is completed, you need to disable this tracepoint:

$ echo 0 > tcp/tcp_retransmit_skb/enable

The Tracepoint method is not only more convenient to use but also has lower performance overhead compared to Kprobes. Therefore, we can also use Tracepoint on the fast path.

Now that there is a Tracepoint for TCP retransmission events, the tcpretrans tool has been upgraded as well. It traces TCP retransmission events through this Tracepoint instead of the earlier Kprobes method. For more details, you can refer to bcc tcpretrans. Brendan Gregg also discussed these eBPF-based TCP tracing tools with me before implementing them, so I am familiar with his tool.

That’s all for our discussion on analyzing TCP retransmission events. I hope it will inspire you to develop more efficient tools to analyze the TCP issues or other issues you encounter.

Class Summary #

In this class, we mainly discussed some knowledge about TCP retransmission. Here are a few key points to remember about TCP retransmission:

The TCP retransmission rate can serve as a signal of the TCP communication quality. A high rate indicates an unstable TCP connection.
The main causes of TCP retransmission are packet loss and network congestion.
Specific kernel functions are called during TCP retransmission. We can trace calls to these functions to track TCP retransmission events.
Kprobes is a very versatile tracing mechanism. On older kernel versions, you can use it to trace TCP retransmission events.
Tracepoint is a more lightweight and convenient tool for tracing TCP retransmission. However, your kernel version needs to be 4.16+.
If you want something simpler, you can directly use the tool called tcpretrans.

Homework #

Can the tracepoint observation method we mentioned or the tcpretrans tool be used to track received TCP retransmissions? Why? Feel free to discuss with me in the comments section.

Thank you for reading. If you found this lesson helpful, please feel free to share it with your friends. See you in the next lecture.