16 Series How to Analyze Common TCP Issues #

Hello, I’m Shaoyafang.

For internet services, many problems manifest as network issues, which requires us to start from the network and work back to the root cause. To analyze all kinds of network problems, you must master some analysis methods so that, when a problem occurs, you can identify its cause efficiently. In this lesson, I will walk you through common TCP problems and their corresponding analysis techniques.

Common Tools for Checking Network on Linux #

When server issues arise and we are unsure about the cause, we need to run some tools to check the overall condition of the system. Among them, dstat is a commonly used tool for such inspections:

$ dstat
--total-cpu-usage-- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai stl| read  writ| recv  send|  in   out | int   csw 
  8   1  91   0   0|   0  4096B|7492B 7757B|   0     0 |4029  7399 
  8   1  91   0   0|   0     0 |7245B 7276B|   0     0 |4049  6967 
  8   1  91   0   0|   0   144k|7148B 7386B|   0     0 |3896  6971 
  9   2  89   0   0|   0     0 |7397B 7285B|   0     0 |4611  7426 
  8   1  91   0   0|   0     0 |7294B 7258B|   0     0 |3976  7062 

As shown above, dstat displays the overall usage of four system resources along with two key system indicators. The four resources are CPU, disk I/O, network, and memory (shown here as paging activity); the two indicators are the interrupt count (int) and the context-switch count (csw). Each resource group also outputs a few key metrics that you should pay attention to.
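
By default dstat prints one sample per second. It also accepts an optional delay and count, and averaging over a longer interval smooths out per-second noise. A small usage sketch (5-second intervals, 12 samples, giving a one-minute overview):

$ dstat 5 12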

If you find that the metrics of a particular system resource are relatively high, you need to analyze that resource further. For example, if you see high network throughput, you can keep observing with dstat’s more detailed network options (run dstat -h to see the full list); for TCP specifically, you can use dstat --tcp:

$ dstat --tcp
------tcp-sockets-------
lis  act  syn  tim  clo 
  27   38    0    0    0
  27   38    0    0    0

This displays counts of all TCP sockets in the system, and the columns have the following meanings:

  • lis: sockets in the LISTEN state
  • act: established connections (ESTABLISHED)
  • syn: connections still in the three-way handshake (SYN_SENT / SYN_RECV)
  • tim: sockets in TIME_WAIT
  • clo: sockets in the closing states (FIN_WAIT, CLOSE_WAIT, and so on)

After getting this overall picture of the system’s TCP connections, you can use the ss command to examine each connection in detail:

$ ss -natp
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port    Process
LISTEN  0       100     0.0.0.0:36457        0.0.0.0:*            users:(("test",pid=11307,fd=17))
LISTEN  0       5       0.0.0.0:33811        0.0.0.0:*            users:(("test",pid=11307,fd=19))
ESTAB   0       0       127.0.0.1:57396      127.0.1.1:34751      users:(("test",pid=11307,fd=106))
ESTAB   0       0       127.0.0.1:57384      127.0.1.1:34751      users:(("test",pid=11307,fd=100))

As shown above, we can see the state, receive queue size (Recv-Q), send queue size (Send-Q), local IP and port (Local Address:Port), remote IP and port (Peer Address:Port), and the process information that opened the TCP connection.
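
ss can also narrow the output with state and port filters, which is handy on a machine with thousands of connections. A few usage sketches (port 80 is just an example value):

$ ss -s                                       # summary of socket counts per state
$ ss -nt state established                    # only ESTABLISHED TCP connections
$ ss -nt state time-wait                      # only TIME_WAIT sockets
$ ss -ntp '( sport = :80 or dport = :80 )'    # TCP connections involving port 80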

In addition to the ss command, you can also use the netstat command to view detailed information about all TCP connections:

$ netstat -natp

However, I do not recommend using netstat; it is better to use ss. Netstat is slower and more resource-intensive than ss. Netstat reads the files under /proc/net/ to parse network connection information, while ss uses the netlink method, which is much more efficient.

Netlink relies on some diagnostic modules in the kernel for parsing. For example, parsing TCP information requires the tcp_diag diagnostic module. If the diagnostic module does not exist, ss cannot use the netlink method and will fall back to the same way as netstat, that is, using the /proc/net/ method, but with a corresponding decrease in efficiency.
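
You can see this difference for yourself with strace, and load the diagnostic module manually if your kernel builds it as a module. A quick sketch (assuming strace is installed; modprobe needs root):

$ strace -e trace=socket ss -nat 2>&1 | grep NETLINK          # ss opens a netlink socket
$ strace -e trace=openat netstat -nat 2>&1 | grep /proc/net   # netstat reads files under /proc/net/
$ modprobe tcp_diag                                           # load the TCP diagnostic module if it is missing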

Furthermore, if you look at the netstat manual by typing man netstat, you will find the following sentence: “This program is obsolete. Replacement for netstat is ss.” Therefore, in the future, when analyzing network connection issues, we should try to use ss instead of netstat.

Netstat belongs to the older net-tools toolset, while ss belongs to the iproute2 toolset. Most of the commonly used net-tools commands can be replaced by newer iproute2 commands. For example:

  • netstat → ss
  • ifconfig → ip addr and ip link
  • route → ip route
  • arp → ip neigh
  • netstat -s → nstat

Besides viewing the network connections in the system, we sometimes also need to check the system’s network statistics, for example whether packets are being dropped and why. For this we can use netstat -s or its replacement, nstat:

$ nstat -z | grep -i drop
TcpExtLockDroppedIcmps          0                  0.0
TcpExtListenDrops               0                  0.0
TcpExtTCPBacklogDrop            0                  0.0
TcpExtPFMemallocDrop            0                  0.0
TcpExtTCPMinTTLDrop             0                  0.0
TcpExtTCPDeferAcceptDrop        0                  0.0
TcpExtTCPReqQFullDrop           0                  0.0
TcpExtTCPOFODrop                0                  0.0
TcpExtTCPZeroWindowDrop         0                  0.0
TcpExtTCPRcvQDrop               0                  0.0

The above output lists counters for common packet-drop reasons. Because my host is running stably, all of them are 0.
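
If you suspect drops are happening right now, it helps to watch specific counters over time rather than taking a one-off snapshot. A sketch (nstat prints increments since its previous run by default, -a dumps absolute values, and the counter names below are just examples):

$ nstat -az TcpExtListenDrops TcpExtListenOverflows        # absolute values of selected counters
$ watch -d -n 1 "nstat -z | grep -i -e drop -e retrans"    # per-interval increments, with changes highlighted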

If you have not found any abnormalities through these regular inspection methods, then you need to consider using the essential network analysis tool—tcpdump.

Tools You Must Master to Analyze Network Issues: tcpdump #

There are many tips and tricks for using tcpdump, but here we will explain its working principle so that you can understand what tcpdump actually does and what kind of problems it can analyze.

The basic principle of tcpdump is shown in the following diagram:

tcpdump uses the libpcap mechanism to capture packets. The basic principle is as follows: when sending or receiving packets, if the packet meets the rules set by tcpdump (BPF filter), a copy of the network packet will be placed in tcpdump’s kernel buffer. This part of the memory is then mapped to the user space of tcpdump using PACKET_MMAP. After parsing, tcpdump will output this information.
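
You can look at the BPF program that tcpdump compiles from a filter expression without capturing anything, using -d to dump the compiled packet-matching code (tcp port 80 is just an example filter; opening the interface may require root):

$ tcpdump -d 'tcp port 80'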

From the above diagram, you can also see that if a packet is dropped by the network card, tcpdump cannot capture it on the receive path. Similarly, if a packet is discarded inside the protocol stack on the send path, for example because the send buffer is full, tcpdump cannot capture it either. To summarize tcpdump’s capabilities: it can trace problems that occur inside the host, within the boundary of the network card; for problems beyond the network card (including on the card itself), tcpdump may be powerless, and in that case you need to capture packets with tcpdump on the other end of the connection.

You should also know that tcpdump is relatively resource-intensive, mainly due to the BPF filter. If there are a large number of TCP connections in the system, the filtering process will be time-consuming, so it should be used with caution in a production environment. However, if you really have no idea for troubleshooting network problems, try using tcpdump to capture packets. Perhaps its output will bring you some unexpected surprises.
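
If you do have to capture on a busy production machine, you can keep the cost down by filtering as narrowly as possible, capping the packet count with -c, and writing raw packets to a file with -w for offline analysis. A sketch, where eth0, port 80, and 10.0.0.10 are placeholders for your environment:

$ tcpdump -i eth0 -nn -c 1000 -w /tmp/capture.pcap 'tcp port 80 and host 10.0.0.10'
$ tcpdump -nn -r /tmp/capture.pcap | head     # read the saved capture offline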

If you are running important business in a production environment and dare not use tcpdump to capture packets, then you need to study some lightweight tracing methods. Next, I recommend TCP Tracepoints to you.

Lightweight Analysis Method for TCP Troubleshooting: TCP Tracepoints #

Tracepoints are one of the commonly used methods for troubleshooting, and I usually enable tracepoints related to the problem, save the tracepoint output, and then analyze it offline. Typically, I write Python scripts to analyze the content, as Python is convenient for data analysis.

When it comes to TCP-related issues, I also prefer using TCP tracepoints to analyze the problems. To use these tracepoints, your kernel version needs to be 4.16 or above. The commonly used TCP tracepoints can be found in /sys/kernel/debug/tracing/events/tcp/ and /sys/kernel/debug/tracing/events/sock/. The functions of these tracepoints are shown in the table below:
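
If you want to experiment with one of these tracepoints by hand before reaching for any tooling, you can enable it through the tracefs interface and read the raw output. A minimal sketch using tcp_retransmit_skb (requires root; on some distributions tracefs is mounted at /sys/kernel/tracing instead):

$ cd /sys/kernel/debug/tracing
$ echo 1 > events/tcp/tcp_retransmit_skb/enable
$ cat trace_pipe      # prints one line per retransmission; press Ctrl-C to stop
$ echo 0 > events/tcp/tcp_retransmit_skb/enable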

I would like to mention two things here. The tcp_rcv_space_adjust tracepoint in the table is a contribution I made to the kernel while analyzing RT jitter problems; you can find the details in the commit net: introduce a new tracepoint for tcp_rcv_space_adjust. The inet_sock_set_state tracepoint is also my contribution to the Linux kernel; see net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state tracepoint. To be honest, I am not entirely satisfied with the name inet_sock_set_state. I originally wanted the more concise inet_sk_set_state, but to better match the kernel’s struct inet_sock structure, I settled on the current, somewhat cumbersome name.

Let’s return to the lightweight tracing method of TCP tracepoints. Brendan Gregg has written an excellent article on the subject, TCP Tracepoints, which also gives a detailed introduction to several tools built on these tracepoints. If you find it cumbersome to parse the tracepoint output with Python scripts, you can use the tools recommended in that article directly. Note, however, that these tools are implemented with eBPF, and eBPF has one drawback: loading it incurs some CPU overhead, because the compilation work consumes CPU. So check your system’s CPU usage before running these tools. Once an eBPF program is loaded, its ongoing CPU overhead is minimal, generally below 1%. Stopping an eBPF tool also incurs some CPU overhead, although much less than loading it, so you still need to be careful not to affect your business.
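
For reference, a few of the bcc tools of the kind discussed in that article are shown below; the install path is a common default and varies by distribution, and on older kernels some of them fall back to kprobes rather than the tracepoints listed above:

$ /usr/share/bcc/tools/tcpretrans    # trace TCP retransmissions as they happen
$ /usr/share/bcc/tools/tcplife       # per-connection lifetime, bytes transferred, and duration
$ /usr/share/bcc/tools/tcpdrop       # trace kernel TCP packet drops with stack traces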

Compared with the heavyweight tcpdump, these TCP tracepoints are lightweight and well worth making use of.

That’s all for this lesson.

Lesson Summary #

In this lesson we discussed the common approaches to analyzing TCP problems. Let me emphasize the key points once more:

  • It is recommended to use alternative tools like ss instead of netstat as much as possible, because ss has lower performance overhead and runs faster.
  • When you are at a loss with network problems, you can consider using tcpdump to capture packets. However, when there are a large number of network connections in the system, it will have a noticeable impact on system performance. Therefore, you need to find a way to avoid substantial impacts on business operations.
  • TCP Tracepoints are a relatively lightweight analysis solution. You need to understand them and preferably try using them.

Homework #

Why does tcpdump use PACKET_MMAP to read data from the kernel buffer? Are you familiar with this mechanism, and what are its advantages? Feel free to discuss with me in the comments section.

Thank you for reading. If you found this lesson helpful, please feel free to share it with your friends. See you in the next lecture.