13 Kubernetes Network Concepts and Policy Control

13 Kubernetes Network Concepts and Policy Control #

This article mainly discusses the following five topics:

  1. The Basic Network Model of Kubernetes;
  2. Unveiling Netns;
  3. Introduction to Mainstream Network Solutions;
  4. The Use of Network Policy;
  5. Time for Reflection.

Basic Network Model of Kubernetes #

In this section, we will introduce the ideas behind Kubernetes’ network model. As you may know, Kubernetes does not mandate any specific network implementation, nor does it ship a reference one. What Kubernetes does impose are criteria for whether a container network is compliant — its container network model, which can be summarized as “three rules” and “four goals”.

  • The “three rules” are the admission criteria for evaluating or designing a container network: the three conditions a solution must meet to count as a qualified network.
  • The “four goals” mean that when designing the network topology and implementing its concrete functions, we need to think through whether connectivity and the other requirements can actually be achieved.

The Three Rules #

Let’s first look at the three rules:

  • The first rule: any two pods can communicate directly, without explicit NAT — neither address nor port translation — in between.
  • The second rule: nodes can communicate directly with pods, again without explicit address translation.
  • The third rule: the IP a pod sees for itself is the same IP that others use to reach it, with no translation in between.

Later in the text, I will explain, based on my personal understanding, why Kubernetes imposes these seemingly arbitrary models and requirements on container networks.

The Four Goals #

The four goals describe, from a network perspective, how the outside world can connect step by step to the applications inside containers when a Kubernetes system provides services externally.

  • How does the outside world communicate with a service? This refers to how an Internet user, or a user outside the company, consumes a service; “service” here means the Service concept in Kubernetes.
  • How does a service communicate with its backend pods?
  • How do pods communicate with each other?
  • Finally, how do containers within a pod communicate with each other?

The ultimate goal is to let the outside world connect, layer by layer, to the innermost containers and consume the services they provide.

Explaining the Basic Constraints #

The basic constraints can be interpreted as follows: the complexity of container networking stems from the fact that it is effectively parasitic on the host network. From this perspective, container network solutions can be roughly divided into two camps, “Underlay” and “Overlay”:

  • The hallmark of “Underlay” is that it sits on the same layer as the host network. Externally visible signs are whether it uses the same network segment and the same input/output base devices as the host network, and whether the containers’ IP addresses must be coordinated with the host network (allocated from the same central authority or from a unified partition). That is “Underlay.”
  • “Overlay” differs in that it does not need to request IP addresses from the components that manage the host network’s IPs (IPAM). Generally speaking, it only needs to avoid conflicting with the host network, and IPs can be assigned freely.


Why did the community settle on such a simple and seemingly arbitrary model as IP-per-Pod? Personally, I think it brings many benefits for service traceability and performance monitoring later on: because one IP stays consistent end to end, investigating incidents and all sorts of small issues becomes much easier.

Unveiling Netns #

What does Netns actually achieve? #

Let’s briefly cover the kernel basics of what a Network Namespace can do. Strictly speaking, runC container technology does not depend on any hardware; its execution foundation is the kernel. The kernel represents a process as a task, and if a task needs no isolation, it simply uses the host’s namespaces and requires no specially configured namespace isolation data structure (nsproxy, the namespace proxy).


Conversely, if a task has an independent network or mount namespace, its nsproxy must be filled with truly private data, and the data structures it sees are then its own.

Perceptually, an isolated network space has its own network devices, virtual or physical, together with its own IP addresses, routing table, and protocol-stack state. For the TCP/IP stack specifically, this means its own connection status, its own iptables rules, and its own ipvs configuration.

Overall, this is equivalent to having a completely separate network that is isolated from the host network. Of course, the code for the protocol stack is still shared, only the data structures are different.
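The namespace a process belongs to can be observed from user space: every process records its namespace memberships under `/proc/<pid>/ns`, and two processes share a network namespace exactly when their `net` links resolve to the same inode. A minimal, Linux-only sketch:

```python
import os

# The "net" entry under /proc/<pid>/ns identifies the network namespace:
# two processes are in the same netns exactly when this symlink resolves
# to the same inode.
netns = os.readlink("/proc/self/ns/net")
print(netns)  # e.g. net:[4026531992]

# A child created with fork() inherits the parent's netns, so the link
# is identical unless someone calls unshare() or setns().
pid = os.fork()
if pid == 0:
    child = os.readlink("/proc/self/ns/net")
    os._exit(0 if child == netns else 1)
_, status = os.waitpid(pid, 0)
print("child shares parent netns:", os.waitstatus_to_exitcode(status) == 0)
```

This is exactly the mechanism container runtimes build on: `unshare(CLONE_NEWNET)` would give the child a fresh nsproxy with private network data structures.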

The Relationship between Pod and Netns #


The relationship between Netns and pods is as follows: each pod has its own independent network space, and the containers inside the pod share that space. In Kubernetes, containers within a pod are typically recommended to communicate over the loopback interface, while all of them serve external traffic through the pod’s IP. In addition, the root netns on the host can be regarded as a special network space — it is simply the one that Pid 1 belongs to.
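To make this concrete, here is a minimal sketch of a two-container pod; the image names and the port are illustrative, not from the original text. Because both containers share the pod’s netns, the second container reaches the first over loopback without ever touching the pod IP:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-netns-demo
spec:
  containers:
    - name: web
      image: nginx:alpine            # listens on port 80 inside the pod netns
    - name: probe
      image: curlimages/curl:latest
      # Same netns: 127.0.0.1 here is the *pod's* loopback, so this
      # request lands on the nginx container next door.
      command: ["sh", "-c", "sleep 5 && curl -s http://127.0.0.1:80 && sleep 3600"]
```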

Introduction to Mainstream Network Solutions #

Typical Container Network Implementation Solutions #

Next, let’s briefly introduce some typical container network implementations. Container networking is perhaps the most diverse area in K8s, with a wide variety of implementations. Its complexity lies in the need to coordinate with the underlying IaaS network and to trade off performance against flexibility of IP allocation, and many different solutions have emerged for this.


Below are a few of the more prominent solutions — Flannel, Calico, Canal, and WeaveNet — most of which adopt a policy-routing approach similar to Calico’s.

  • Flannel is a fairly comprehensive solution that offers multiple network backends. Different backends implement different topologies, making it suitable for various scenarios.
  • Calico mainly adopts policy routing and uses the BGP protocol to synchronize routes between nodes. Its feature set is quite rich, especially its Network Policy support. A known requirement is that, in its typical mode, nodes must be directly reachable at layer 2 (MAC addresses reachable without crossing a layer-2 domain).
  • Of course, some community members have also combined the advantages of Flannel and Calico; this kind of grafted project is Canal.
  • Finally, there is WeaveNet. If you need to encrypt data in transit, WeaveNet is a good choice; its dynamic scheme provides solid encryption capabilities.

Flannel Solution #


The Flannel solution is currently the most commonly used and illustrates a typical container network design. The first problem it solves is how container packets reach the host, which it does by adding a bridge. The backend is pluggable: how packets leave the host, and what kind of encapsulation is used — or whether encapsulation is needed at all — can be chosen independently.

Now let’s introduce the three main backends:

  • The first is user-space UDP, which was the earliest implementation.
  • Then there is kernel VXLAN; both UDP and VXLAN count as overlay solutions. VXLAN performs better, but it requires a kernel version with support for VXLAN’s specific features.
  • If your cluster is not too large and sits in a single layer-2 domain, you can also choose host-gw. This backend essentially just distributes a set of routing rules to the hosts, and it offers higher performance.
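In a typical upstream Flannel deployment, the backend is selected in the `net-conf.json` key of the `kube-flannel-cfg` ConfigMap, so switching topology is just a matter of changing the `Type` field. A sketch, with the pod subnet value illustrative:

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
data:
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
```

Replacing `"vxlan"` with `"host-gw"` or `"udp"` selects the other two backends described above.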

The Use of Network Policy #

Basic Concepts of Network Policy #

Let’s talk about the concept of Network Policy.


As mentioned earlier, Kubernetes’ basic network model requires all pods to be interconnected, which can itself cause problems: for example, in a cluster shared by two departments, you may want to forbid direct communication between their pods. This is where the policy concept comes in.

The basic idea is this: using selectors (over labels or namespaces), identify a group of pods, or a pair of communication endpoints, then decide their connectivity based on a description of the flow’s characteristics. It can be understood as a whitelist mechanism.

Before using Network Policy, note that the apiserver needs to have the corresponding API switches enabled. Another important point is that your chosen network plugin must support the implementation of Network Policy. Keep in mind that Network Policy is just an object provided by Kubernetes; there is no built-in component that enforces it. Enforcement depends on the support and completeness of the container network solution you choose: if you pick something like Flannel, which does not actually implement this policy, then declaring a Network Policy has no effect.

Configuration Example #


Next, let’s walk through a configuration example: what needs to be decided when designing a Network Policy? I believe there are three key points:

  • The first point is the object to control. Using podSelector or namespaceSelector in spec, you select the specific group of pods to be controlled.
  • The second point is the flow direction: do you want to control inbound traffic (ingress), outbound traffic (egress), or both?
  • The most important is the third part: describing, for the selected direction, which flows are allowed in or out. Using selectors, you decide which peers may be endpoints; with a mechanism like ipBlock, you can specify which IP ranges are allowed; and finally you can specify protocols and ports. Together, these flow characteristics form a tuple, and only streams matching it are accepted.
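The three points above map directly onto the fields of a NetworkPolicy object. A sketch — the name, namespace, labels, CIDR, and port below are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-ingress
  namespace: default
spec:
  # 1) The object to control: which pods this policy applies to
  podSelector:
    matchLabels:
      app: web
  # 2) The flow direction being controlled
  policyTypes:
    - Ingress
  # 3) The whitelist: which flows are allowed in
  ingress:
    - from:
        - podSelector:          # peer pods selected by label
            matchLabels:
              role: frontend
        - ipBlock:              # or peers selected by IP range
            cidr: 10.0.0.0/24
      ports:
        - protocol: TCP
          port: 80
```

Everything not matched by the `ingress` rules is dropped for the selected pods, which is exactly the whitelist semantics described above.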

Summary of this Lesson #

The content of this lesson concludes here, and let’s summarize it briefly:

  • The core concept of the container network in a pod is the IP. The IP is each pod’s basic address for external and internal communication, and it must be consistent end to end, in line with the K8s model.

  • When evaluating a network solution, topology is the most crucial factor affecting container network performance. You need to understand how your packets travel end to end: how they go from the container to the host, whether they are encapsulated or routed by policy routing when leaving the host, and how they are finally decapsulated and delivered at the destination.

  • Container network selection and design come down to trade-offs. If your external network situation is unclear, or you need the most broadly applicable solution, choose Flannel with the VXLAN backend. If you are sure your network is directly reachable at layer 2, you can choose Calico or Flannel with the host-gw backend.

  • Finally, there is Network Policy. It is a powerful tool for precisely controlling inbound and outbound flows in day-to-day operations. We also covered how to use it: you must clearly define whom you want to control and how you define the flows.

Time for Reflection #

Finally, let’s take some time to reflect on the following questions:

  1. Why has the standardization of the CNI interface not led to a standard container network implementation built into Kubernetes?
  2. Why is there no standard controller or implementation for Network Policy either, leaving enforcement to the owner of the container network?
  3. Is it possible to implement a container network without using network devices at all — for example, with solutions such as RDMA that differ from TCP/IP?
  4. Network problems are frequent in operations and hard to diagnose. Would it be worth developing an open-source tool that visualizes each stage of the path in a friendly way — container to host, host to host, encapsulation and decapsulation — and quickly pinpoints the problem? As far as I know, no such tool currently exists.

This concludes my overview of the basic concepts of Kubernetes container networks and an introduction to Network Policy. Thank you all for watching.