34 What You Must Know About Networking in Linux (Part Two) #

Hello, I am Ni Pengfei.

In the previous section, I took you through the basic principles of Linux networking. Let’s have a quick review: Linux networking builds its network protocol stack based on the TCP/IP model. The TCP/IP model consists of four layers: the application layer, the transport layer, the network layer, and the network interface layer. These are the core components of the Linux network stack.

When an application sends a packet of data through the socket interface, it goes through the network protocol stack from top to bottom for processing before being sent out by the network card. Similarly, when receiving packets, they go through the network stack from bottom to top for processing before being delivered to the application.

Now that you understand the basic principles and the sending/receiving process in Linux networking, you must be eager to know how to observe the performance of the network. Specifically, what metrics can be used to measure the performance of Linux networking?

Performance Metrics #

In practice, we usually measure the performance of a network using metrics such as bandwidth, throughput, latency, and PPS (packets per second).

  • Bandwidth represents the maximum transmission rate of a link, usually measured in b/s (bits per second).

  • Throughput represents the amount of data successfully transmitted per unit of time, usually measured in b/s (bits per second) or B/s (bytes per second). Throughput is limited by the bandwidth, and the ratio of throughput to bandwidth represents the utilization of the network.

  • Latency represents the time delay from when a network request is sent until a response is received from the remote end. In different scenarios, this metric may have different meanings. For example, it can represent the time required to establish a connection (e.g., TCP handshake latency) or the round-trip time of a data packet (e.g., RTT).

  • PPS is short for Packets Per Second, i.e., the transmission rate measured in number of network packets. PPS is commonly used to evaluate forwarding capability. Hardware switches, for example, can usually achieve line-rate forwarding (i.e., their PPS reaches or approaches the theoretical maximum), whereas forwarding based on Linux servers is easily affected by the size of network packets; a rough estimate of the theoretical maximum PPS follows this list.
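
To make the relationship between bandwidth and PPS more concrete, here is a back-of-the-envelope estimate for a Gigabit link carrying minimum-size (64-byte) frames. The preamble and inter-frame gap figures below are standard Ethernet framing overhead, not values taken from any particular system:

# On the wire, each 64-byte frame also carries an 8-byte preamble and a
# 12-byte inter-frame gap, i.e. 84 bytes (672 bits) per packet, so a 1 Gb/s
# link tops out at roughly 1.49 million packets per second:
$ echo $(( 1000000000 / (84 * 8) ))
1488095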

In addition to these metrics, other commonly used performance metrics include network availability (whether the network can communicate properly), concurrent connections (the number of TCP connections), packet loss rate (percentage of lost packets), and retransmission rate (proportion of retransmitted network packets).

Next, please open a terminal, SSH into the server, and let’s explore and observe these performance metrics together.

Networking Configuration #

The first step in analyzing network problems is usually to check the configuration and status of network interfaces. You can use the ifconfig or ip command to view the network configuration. Personally, I recommend using the ip tool because it provides more functionality and a more user-friendly interface.

ifconfig and ip belong to the net-tools and iproute2 packages, respectively. iproute2 is the next generation of net-tools. They are usually installed by default in most distributions. However, if you cannot find the ifconfig or ip command, you can install these two packages.
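
For example, on Debian or Ubuntu you can install them with apt, and on CentOS or RHEL with yum (a sketch; package names may differ slightly across distributions):

# Debian/Ubuntu
$ sudo apt install net-tools iproute2

# CentOS/RHEL (the iproute2 tools are packaged simply as "iproute")
$ sudo yum install net-tools iproute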

Taking the network interface eth0 as an example, you can run the following two commands to view its configuration and status:

$ ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
      inet 10.240.0.30 netmask 255.240.0.0 broadcast 10.255.255.255
      inet6 fe80::20d:3aff:fe07:cf2a prefixlen 64 scopeid 0x20<link>
      ether 78:0d:3a:07:cf:3a txqueuelen 1000 (Ethernet)
      RX packets 40809142 bytes 9542369803 (9.5 GB)
      RX errors 0 dropped 0 overruns 0 frame 0
      TX packets 32637401 bytes 4815573306 (4.8 GB)
      TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

$ ip -s addr show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  link/ether 78:0d:3a:07:cf:3a brd ff:ff:ff:ff:ff:ff
  inet 10.240.0.30/12 brd 10.255.255.255 scope global eth0
      valid_lft forever preferred_lft forever
  inet6 fe80::20d:3aff:fe07:cf2a/64 scope link
      valid_lft forever preferred_lft forever
  RX: bytes packets errors dropped overrun mcast
   9542432350 40809397 0       0       0       193
  TX: bytes packets errors dropped carrier collsns
   4815625265 32637658 0       0       0       0

As you can see, the metrics output by the ifconfig and ip commands are basically the same, just the display format is slightly different. For example, they both include the status flags of the network interface, MTU size, IP, subnet, MAC address, and the statistics of network packet transmission and reception.

The meanings of these specific metrics are detailed in the documentation. However, there are a few metrics related to network performance that you should pay special attention to.

First, the status flags of the network interface. The RUNNING flag in the ifconfig output or the LOWER_UP flag in the ip output indicates that the physical network is connected, meaning that the network card is already connected to a switch or router. If you don’t see them, it usually means that the network cable has been disconnected.
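
If you want to double-check the physical link state, ethtool reports it directly (assuming the same eth0 interface as above):

$ ethtool eth0 | grep "Link detected"
	Link detected: yes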

Second, the size of the MTU. The default MTU size is 1500, but depending on the network architecture (such as whether overlay networks like VXLAN are used), you may need to increase or decrease the value of the MTU.
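
For example, a VXLAN overlay adds about 50 bytes of encapsulation overhead, so interfaces inside such an overlay are often configured with a smaller MTU. A sketch (the exact value depends on your network architecture, and the change is not persistent across reboots unless saved in your network configuration):

# Lower the MTU of eth0 to leave room for VXLAN encapsulation
$ sudo ip link set dev eth0 mtu 1450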

Third, the IP address, subnet, and MAC address of the network interface. These are all necessary for the normal functioning of the network, so you need to ensure that they are configured correctly.

Fourth, the number of bytes and packets transmitted and received, the number of errors, and the number of dropped packets. In particular, when metrics in the TX and RX sections such as errors, dropped, overruns, carrier, and collisions are not zero, it usually indicates network I/O issues. Among them:

  • errors represents the number of packets with errors, such as checksum errors or frame synchronization errors.

  • dropped represents the number of dropped packets, i.e., packets that made it into the Ring Buffer but were dropped afterwards, for example because of insufficient memory.

  • overruns represents the number of overrun packets, i.e., the network I/O was so fast that the Ring Buffer could not be drained in time (the queue was full), resulting in packet loss.

  • carrier represents the number of packets with carrier errors, such as a duplex mismatch or a faulty physical cable.

  • collisions represents the number of collision packets.
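
When these counters are non-zero, you can dig one level deeper. For example, ip with a doubled -s option prints a per-cause error breakdown, and ethtool -S dumps the NIC driver's own statistics (the exact fields depend on the driver; this is just a sketch):

# Detailed per-interface error statistics (CRC, FIFO, missed, etc.)
$ ip -s -s link show dev eth0

# NIC driver statistics; field names vary by driver
$ ethtool -S eth0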

Socket Information #

ifconfig and ip only display statistics about packet transmission and reception on network interfaces. When analyzing actual performance issues, however, we also need information from the network protocol stack, such as sockets, protocol stack statistics, network interfaces, and routing tables. You can use netstat or ss to view this information.

Personally, I recommend using ss to query network connection information because it provides better performance (faster speed) compared to netstat.

For example, you can execute the following command to query socket information:

# Specify -n to display numeric addresses and ports (instead of names)
# Specify -p to display process information
$ netstat -nlp | head -n 3
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      840/systemd-resolve

# Specify -l to display listening sockets only
# Specify -t to display TCP sockets only
# Specify -n to display numeric addresses and ports (instead of names)
# Specify -p to display process information
$ ss -ltnp | head -n 3
State    Recv-Q    Send-Q        Local Address:Port        Peer Address:Port
LISTEN   0         128           127.0.0.53%lo:53               0.0.0.0:*        users:(("systemd-resolve",pid=840,fd=13))
LISTEN   0         128                 0.0.0.0:22               0.0.0.0:*        users:(("sshd",pid=1459,fd=3))

The output of netstat and ss is similar and displays the status of the socket, receive queue, send queue, local address, remote address, process PID, and process name.
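
The same output is also handy for quick aggregation, for example to track the concurrent-connection metric mentioned earlier. A minimal sketch using awk (the state names follow ss conventions such as ESTAB and TIME-WAIT):

# Count TCP sockets grouped by state (the first line of ss output is a header)
$ ss -ant | awk 'NR > 1 {++count[$1]} END {for (s in count) print s, count[s]}'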

Of these fields, pay special attention to the receive queue (Recv-Q) and the send queue (Send-Q), which should normally be 0. If they are not 0, it means network packets are piling up. Note, however, that their meanings differ depending on the socket state.

When the socket is in the “Established” state:

  • Recv-Q represents the number of bytes in the socket buffer that have not been fetched by the application (i.e., the length of the receive queue).
  • Send-Q represents the number of bytes that have been sent but not yet acknowledged by the remote host (i.e., the length of the send queue).

When the socket is in the “Listening” state:

  • Recv-Q represents the length of the complete connection queue.
  • Send-Q represents the maximum length of the complete connection queue.

The term “complete connection” refers to the server receiving an ACK from the client, completing the TCP three-way handshake, and then moving the connection to the complete connection queue. These sockets in the complete connection queue need to be fetched by the accept() system call before the server can truly process client requests.

In addition to the complete connection queue, there is also a half-open connection queue. The term “half-open” refers to connections that have not completed the TCP three-way handshake and are only halfway established. When the server receives a SYN packet from the client, it puts this connection in the half-open connection queue and then sends a SYN+ACK packet to the client.
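
The maximum lengths of these two queues are governed by kernel parameters, and ss can filter sockets by state, which helps you spot a half-open queue that is filling up (for example, during a SYN flood). A sketch:

# Maximum length of the half-open (SYN) queue
$ sysctl net.ipv4.tcp_max_syn_backlog

# Upper bound on the complete connection queue (combined with the listen() backlog)
$ sysctl net.core.somaxconn

# Number of connections currently stuck in the half-open state
$ ss -nt state syn-recv | tail -n +2 | wc -l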

Protocol Stack Statistics #

Similarly, you can use netstat or ss to check the information of the protocol stack:

$ netstat -s
...
Tcp:
    3244906 active connection openings
    23143 passive connection openings
    115732 failed connection attempts
    2964 connection resets received
    1 connections established
    13025010 segments received
    17606946 segments sent out
    44438 segments retransmitted
    42 bad segments received
    5315 resets sent
    InCsumErrors: 42
...

$ ss -s
Total: 186 (kernel 1446)
TCP:   4 (estab 1, closed 0, orphaned 0, synrecv 0, timewait 0/0), ports 0

Transport Total     IP        IPv6
*	    1446      -         -
RAW	    2         1         1
UDP	    2         2         0
TCP	    4         3         1
...

These statistics of the protocol stack are very intuitive. ss only displays a brief summary of the established, closed, orphaned sockets, etc., while netstat provides more detailed information about the network protocol stack.

For example, the netstat output above shows various details about the TCP protocol, such as the number of active and passive connection openings, failed connection attempts, segments received and sent, retransmitted segments, bad segments received, resets sent, and InCsumErrors.
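
If you want the retransmission rate mentioned at the beginning, the same counters are exposed by nstat (also part of iproute2); dividing retransmitted segments by segments sent gives a rough ratio. A sketch (the counter names below are the standard kernel SNMP names, but double-check them on your system):

# Cumulative counts of sent and retransmitted TCP segments since boot;
# retransmission rate ≈ TcpRetransSegs / TcpOutSegs
$ nstat -az TcpOutSegs TcpRetransSegs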

Network Throughput and PPS #

Next, let’s take a look at how to check the current network throughput and PPS (Packets Per Second) of the system. Here, I recommend using our old friend sar, which we have used multiple times in the previous CPU, memory, and I/O modules.

By adding the -n parameter to sar, you can view network statistics information, such as network interfaces (DEV), network interface errors (EDEV), TCP, UDP, ICMP, etc. Execute the following command to obtain network interface statistics information:

# The number 1 indicates that a set of data is output every 1 second
$ sar -n DEV 1
Linux 4.15.0-1035 (ubuntu) 	01/06/19 	_x86_64_	(2 CPU)

13:21:40        IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s   %ifutil
13:21:41         eth0     18.00     20.00      5.79      4.25      0.00      0.00      0.00      0.00
13:21:41      docker0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
13:21:41           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

There are quite a few metrics in this output, so let me briefly explain their meanings.

  • rxpck/s and txpck/s are the received and transmitted PPS (Packets Per Second) respectively, with a unit of packets/second.

  • rxkB/s and txkB/s are the received and transmitted throughput respectively, with a unit of KB/second.

  • rxcmp/s and txcmp/s are the received and transmitted compressed packets respectively, with a unit of packets/second.

  • %ifutil is the utilization of the network interface, which is (rxkB/s+txkB/s)/Bandwidth for half-duplex mode, and max(rxkB/s, txkB/s)/Bandwidth for full-duplex mode.

Among them, Bandwidth can be queried using ethtool, and it is usually expressed in Gb/s or Mb/s. Note that the lowercase letter b stands for bits, not bytes; the "Gigabit" and "10 Gigabit" in the commonly mentioned Gigabit Ethernet and 10 Gigabit Ethernet cards also refer to bits. As you can see below, my eth0 network card is a Gigabit Ethernet card:

$ ethtool eth0 | grep Speed
	Speed: 1000Mb/s
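
If ethtool is not available, the link speed (in Mb/s) can usually also be read from sysfs, though virtual interfaces may not expose this file (a sketch):

$ cat /sys/class/net/eth0/speed
1000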

Connectivity and Latency #

Finally, we usually use ping to test the connectivity and latency to remote hosts, which is based on the ICMP protocol. For example, by executing the following command, you can test the connectivity and latency from your local machine to the IP address 114.114.114.114:

# -c3 means stop after sending three ICMP packets
$ ping -c3 114.114.114.114
PING 114.114.114.114 (114.114.114.114) 56(84) bytes of data.
64 bytes from 114.114.114.114: icmp_seq=1 ttl=54 time=244 ms
64 bytes from 114.114.114.114: icmp_seq=2 ttl=47 time=244 ms
64 bytes from 114.114.114.114: icmp_seq=3 ttl=67 time=244 ms

--- 114.114.114.114 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 244.023/244.070/244.105/0.034 ms

The output of ping can be divided into two parts.

  • The first part consists of information for each ICMP request, including the ICMP sequence number (icmp_seq), the TTL (Time to Live, or number of hops), and the round-trip delay.

  • The second part is a summary of the three ICMP requests.

For example, in the above output, 3 packets were sent and 3 responses were received with no packet loss, which indicates that the tested host is reachable from the local machine. The average round-trip time (RTT) is 244 ms, i.e., it takes 244 ms in total from sending the ICMP echo request to receiving the echo reply from 114.114.114.114.
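
Keep in mind that ping relies on ICMP, which some hosts and firewalls block, so a failed ping does not always mean the host is unreachable. In that case you can probe at the TCP level instead, for example with hping3 if it is installed (the port and address below are just placeholders):

# Send three TCP SYN packets to port 80 and measure the response time
$ hping3 -c 3 -S -p 80 192.168.0.10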

Summary #

We usually use metrics such as bandwidth, throughput, latency, and PPS to measure the performance of a network. Correspondingly, you can use tools like ifconfig, ip, netstat, ss, sar, ping, etc. to view these network performance metrics.

In the next section, I will further explore the working principles of the Linux network by discussing the classic C10K and C1000K problems.

Reflection #

Finally, I would like to discuss with you your understanding of Linux network performance. What metrics do you usually use to measure network performance? And what approach do you take to analyze corresponding performance issues? You can combine the knowledge you have learned today to put forward your own opinions.

Feel free to discuss with me in the comments section, and you are also welcome to share this article with your colleagues and friends. Let’s practice in real-world scenarios and make progress through communication.