03 DevOps Scenario Analysis of Practical Difficulties in Kubernetes (K8s) #

Kubernetes is an open-source system that DevOps teams use to automate the deployment, scaling, and management of containerized applications. It is generally introduced to address pain points in the CI/CD (continuous integration/continuous deployment) process, such as the lack of a unified toolchain and standardized build processes. However, when implementing Kubernetes, DevOps teams often encounter unforeseen problems at various stages, including installation, deployment, networking, storage, and rolling updates of applications. The official reference materials do not provide deterministic solutions to these issues, so many teams patch over problems with whatever experience they have in order to keep iterating quickly. Because every scenario has its own constraints and every team its own demands, this experience is hard to genuinely pass on and share. This article aims to tackle the current pain points of DevOps, identify the core issues at their source, and compile viable methodologies based on industry best practices, so that DevOps teams can handle the challenges of implementing Kubernetes with confidence.

Fragmentation of Kubernetes Knowledge System #

When implementing Kubernetes, many DevOps teams rely on industry experience shared on the Internet in order to minimize pitfalls. However, when facing concrete problems in practice, it is hard to find a well-suited answer among the existing approaches, due to factors such as inconsistent environments and diverse scenario requirements.

An even worse situation is that much of the online material is outdated, which obstructs teams trying to build their knowledge system. Teams can seek guidance from external experts and professional books to gradually fill the gaps, but rather than absorbing Kubernetes knowledge as a scattered bombardment of facts, we should establish a knowledge map that identifies a suitable learning path for the team. This is what enables Kubernetes to support the growth of your business. The following is a reference knowledge map I provide:

(Figure: Kubernetes knowledge map)

With this map, you have a navigation chart that gives you a global view of your team’s Kubernetes capabilities whenever you need it.

The Dilemma of Choosing Container Networking #

Choosing the right container networking solution has always been a pain point for DevOps teams. Kubernetes clusters are designed with a two-layer network structure. The first layer is the service network; the second is the Pod network. The Pod network can be understood as the east-west container network, consistent with the design of common Docker container networks. The service network is how Kubernetes exposes services to the outside world, and it can be loosely understood as the north-south container network. The plugin most commonly deployed with Kubernetes out of the box is Flannel, a simple overlay network whose limited performance generally restricts it to development and testing environments. To solve the networking problem, the community provides many excellent network plugins such as Calico, Contiv, Cilium, and Kube-OVN, which can leave users confused about which to choose.
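
To make the two layers concrete, here is a minimal sketch in which the Deployment’s Pods receive addresses on the Pod (east-west) network while the Service in front of them sits on the service (north-south) network and exposes them outside the cluster. All names and the nginx image are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: web
          image: nginx:1.25   # each Pod gets an IP on the Pod network
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: demo-app
spec:
  type: NodePort              # exposes the service outside the cluster
  selector:
    app: demo-app
  ports:
    - port: 80                # virtual IP on the service network
      targetPort: 80
```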

When enterprises first introduce Kubernetes networking, they typically just add it to the enterprise network as one more system network. Enterprise networks are generally designed as large layer-2 networks, with fixed network planning for each system, and such planning obviously cannot accommodate how Kubernetes networking grows. To understand and solve this problem well, we can first summarize the demands of most users as follows:

  • First, because the number of container instances increases dramatically, planning IP addresses per instance is unreasonable.
  • Second, each container instance should be assigned a fixed IP address, just like a virtual machine instance.
  • Third, container networking performance must not be compromised; it should be at least on par with the physical network, since in these scenarios network performance is a crucial metric.

From recommendations shared online, some preliminary conclusions can be drawn: Calico’s virtual network performance is close to that of a physical network, its configuration is simple, and it supports NetworkPolicy, making it the most versatile option; on a physical network, MacVlan achieves native performance and resolves communication with external networks.

Although such information is shared online as best practice, we still need to verify it by testing in our own network environment. This kind of validation is necessary, but we do not need to test every single solution when making a choice; a few reasonable categorizations can clarify the direction.
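
As a concrete example of the NetworkPolicy support mentioned above, here is a minimal sketch of a standard Kubernetes NetworkPolicy that CNI plugins such as Calico and Cilium can enforce; the namespace, labels, and port are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-only
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: backend          # the Pods this policy protects
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend  # only frontend Pods may connect
      ports:
        - protocol: TCP
          port: 8080
```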

In fact, we should go back to Kubernetes’ original design: it is a data-center-level container cluster system, and we can give it an independent network layer. Following this direction, its network should be a virtual network, and the candidate options are Flannel, Calico, and Cilium.

The other direction is for the Kubernetes network to be an extension of the existing enterprise network. In this design, the Kubernetes network relies on the existing network, and the Kubernetes object system is compressed down to using only the Pod network layer. Many traditional legacy systems face exactly this situation when adopting Kubernetes. Here, Contiv and Kube-OVN can connect containers directly to the native network, allowing many legacy systems to migrate smoothly into cloud-native networking. This is a very good practice.

Introduction of storage solutions #

Pods can mount disks, including local storage, network storage, and object storage. In the early stages of Kubernetes development there were numerous container storage drivers, and bugs in those drivers caused all kinds of issues for running containers. The Kubernetes community has since settled on CSI (Container Storage Interface) as the standard storage solution, and it has reached production readiness. Therefore, when choosing storage drivers, we should select CSI drivers to access storage.

Previously, local storage was mounted directly as host directories. With the CSI approach, local resources are also expected to be requested through PVs (PersistentVolumes) and PVCs (PersistentVolumeClaims) rather than by mounting directories directly.
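
A minimal sketch of requesting local storage through a PV/PVC pair instead of mounting a directory directly; the node name, path, and storage class name are illustrative. The `kubernetes.io/no-provisioner` class with `WaitForFirstConsumer` binding is the standard pattern for local volumes:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner  # local PVs are created manually
volumeBindingMode: WaitForFirstConsumer    # bind only once a Pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-node1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1      # pre-provisioned directory or disk on the node
  nodeAffinity:                # a local PV must be pinned to its node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 10Gi
```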

For network storage, NFS is commonly specified. NFS has the special ability to be shared and mounted by multiple hosts, whereas a PV/PVC pair still expects its target storage to be unique, so it is important to plan the NFS directory structure. Note also that NFS cannot enforce storage size limits; capacity is governed entirely by the underlying NFS system. To manage storage space effectively, the DevOps team can manually create PVs with declared size limits and have users bind to these pre-created PVs through PVCs.
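
A minimal sketch of this manual-PV pattern; the server address, export path, and sizes are illustrative, and the declared capacity documents the plan rather than being enforced by NFS itself:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-team-a
spec:
  capacity:
    storage: 50Gi             # declared size; NFS does not enforce it
  accessModes:
    - ReadWriteMany           # NFS can be mounted by multiple hosts
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.100     # illustrative NFS server address
    path: /exports/team-a     # one planned subdirectory per PV
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-claim-team-a
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""        # bind to a manually created PV, not a provisioner
  resources:
    requests:
      storage: 50Gi
```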

For object storage, the stronger abstraction allows PVs to be created dynamically and PVCs to be mounted automatically, with requested sizes satisfied on demand. This is the most ideal storage solution, but backend storage systems such as Ceph require dedicated maintenance personnel to keep them stable, so resource allocation deserves extra consideration.
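
For comparison, dynamic provisioning only needs a StorageClass backed by a CSI driver, after which PVs are created on demand. This sketch assumes a ceph-csi RBD deployment and omits the several secrets and parameters a real installation requires; all names and the cluster ID are placeholders:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: rbd.csi.ceph.com    # assumes the ceph-csi driver is installed
parameters:
  clusterID: <ceph-cluster-id>   # placeholder, specific to your Ceph cluster
  pool: kubernetes
  # real deployments also configure provisioner/node-stage secrets here
reclaimPolicy: Delete
allowVolumeExpansion: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ceph-rbd     # a matching PV is created automatically
  resources:
    requests:
      storage: 20Gi              # the requested size is satisfied dynamically
```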

Choosing a container engine #

Many people may wonder why there is anything to choose, rather than simply installing Docker. With the evolution of container technology, the engine installed by default alongside official Kubernetes is now the open-source CRI-O engine. Earlier, the Kubernetes community defined a standard interface for container engines, which led to their diversification. To get a clearer sense of where the container engine sits in the stack, a diagram helps:

(Figure: position of the container engine in the stack)

It is clear that containerd has replaced Docker. Since containerd was split out of the Docker codebase, its reliability has been tested and proven over many years, making it the most dependable container engine currently available.
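
For reference, pointing a kubeadm-installed cluster at containerd is a one-line setting. This sketch assumes the v1beta3 kubeadm API (Kubernetes 1.22 and later) and the default containerd socket path:

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  # tell the kubelet to use containerd through its CRI socket
  criSocket: unix:///run/containerd/containerd.sock
```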

Furthermore, when the business requires multi-tenancy and the host environment is not trusted, the container engine needs further isolation. Several options are currently available, such as Kata Containers, Firecracker, and gVisor. These are generally referred to as secure container technologies: they isolate the container environment thoroughly using trimmed-down virtualization components, making each container a true lightweight virtual machine. As an analogy, consider how advanced server specifications have become, with CPUs typically reaching 32 cores and memory up to 256 GB; hosts of such capacity are now common. Running containers for only one user group on a host that size is clearly wasteful. Previously, once DevOps teams had planned the Kubernetes cluster resources, virtualization was used to isolate and allocate resources to the business. When we merge the virtualization layer (such as OpenStack) with Kubernetes, virtualization technology must be brought into Kubernetes itself; this is the significance of secure containers. Of course, configuring this area is still not foolproof and requires professional developers to optimize, so it should be used cautiously.
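
Kubernetes selects such an isolated runtime per workload through a RuntimeClass. A minimal sketch, assuming a `kata` handler has already been configured in the node’s container engine; the names and image are illustrative:

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata               # must match a runtime configured in containerd/CRI-O
---
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-workload
spec:
  runtimeClassName: kata    # this Pod runs inside a lightweight VM
  containers:
    - name: app
      image: nginx:1.25
```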

Cluster Scaling Issue #

Enterprises usually hope to deploy only one system, to reduce management and operation costs. However, Kubernetes is an open-source system that in many scenarios cannot manage across networks, so multiple clusters have to be deployed to segment the business. Multiple clusters mean multiple sets of infrastructure, which puts operational pressure on DevOps teams. We can compare the approaches as follows:

(Figure: comparison of single-cluster and multi-cluster deployment)

I believe we should refer to Google’s internal cluster-management experience: based on data center scale, each data center should run only one cluster. The current headache for domestic enterprises, however, is that their internal networks are divided into several isolated zones by firewalls. Each zone opens only a limited number of whitelisted ports, and the production network may allow operations only through jump servers. Deploying one Kubernetes cluster across such a complex network inevitably requires extensive planning and refinement of parameters and rules, which can take more effort than deploying separate clusters.

To solve this problem effectively, we can take an iterative approach: start in multi-cluster mode, then gradually merge the clusters into one using host labels and namespaces, as sketched below. Of course, Google operates with a single security level; development, testing, and production all share one security strategy, which suits single-cluster planning very well. Many enterprises, however, manage security levels separately and offer different service-quality SLAs. In that case it is reasonable to split into development/testing clusters and production clusters; after all, security is the lifeblood of an enterprise, and only then comes the planning of cluster operating costs.
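
A minimal sketch of the label-and-namespace approach: nodes from a former cluster are labeled (for example with `kubectl label node node1 zone=dev-test`, an illustrative key), and workloads are pinned to them with a nodeSelector inside a dedicated namespace:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dev-test
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: legacy-app
  namespace: dev-test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: legacy-app
  template:
    metadata:
      labels:
        app: legacy-app
    spec:
      nodeSelector:
        zone: dev-test    # pin workloads to the hosts of the former cluster
      containers:
        - name: app
          image: nginx:1.25
```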

Introduction of Security Audit #

Kubernetes is a complex system, and enterprises care deeply about its security. First, for authorizing calls into the cluster there is the native role-based access control (RBAC) model. RBAC works well when there are not too many roles and permissions, but it cannot easily satisfy finer-grained control requirements, for example allowing users to access their own namespace while denying access to the system-level kube-system namespace. The community provides the Open Policy Agent (OPA) tool to solve this problem.
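
For reference, the namespace-scoped, whitelist style of RBAC looks like the following sketch, which grants an illustrative user full access to a `team-a` namespace and nothing else; because no binding exists in kube-system, that namespace stays off-limits:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: team-a-developer
  namespace: team-a            # the Role only exists inside this namespace
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "services", "deployments"]
    verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-developer-binding
  namespace: team-a
subjects:
  - kind: User
    name: alice                # illustrative user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: team-a-developer
  apiGroup: rbac.authorization.k8s.io
```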

(Figure: comparing the RBAC and OPA policy models)

In simple terms, RBAC is a whitelist approach: when there are many user rules, changing a policy requires updating multiple role definitions, which makes maintenance costly. OPA takes a blacklist approach, where a single rule is enough to handle the change.

In addition, the security of enterprise Kubernetes needs to be audited regularly with the help of tools. A well-known reference is the CIS Kubernetes Benchmark, and tools such as kube-bench can check a cluster against it.

Issues with building a business assurance team #

Many DevOps teams have found that, after taking over Kubernetes, operating it is several times harder than operating other systems. Business-stability requirements bring a great deal of unpredictable pressure to DevOps teams. The obvious reason is that Kubernetes raises the bar for personnel capabilities.

Looking at how Google built its SRE teams, I found that this role is relatively scarce in domestic enterprises, and SRE is rarely seen in traditional companies. It resembles a senior operations-and-architecture engineer, but it focuses on keeping the business running from a business-centered perspective. With the success of Alibaba Group’s business assurance system, leading domestic companies have gradually accepted this new role and keep upgrading its capabilities.

For traditional enterprises, the new challenge is how to upgrade the knowledge structure of existing DevOps personnel and shift their mindset toward a business-assurance-centered view of problems. In terms of resources, many enterprises complete their technical capabilities through integration with partners. Traditional enterprises need not completely break their original job planning and copy internet-company practices, which carries many risks: for them security comes first and reliability second, and their data-reliability practices are more mature than those of internet companies, with enough redundant systems to ensure data integrity. In this situation, the existing DevOps team should fully understand the limitations of Kubernetes and lean on technical collaboration with partners to make the Kubernetes implementation more robust.

Cluster installation issues #

Do not underestimate the choice of Kubernetes installation tooling. There are currently many different ways to deploy a Kubernetes cluster, which can confuse enterprise DevOps teams.

Firstly, the core Kubernetes components are released as binaries, so they can be installed at the host level and managed with systemd. Some distributions deploy the components as static Pods, a rather special method that I do not recommend for traditional enterprises. Others deploy the components in containers, which seems convenient at first; in later operations, however, the container isolation can hinder timely access to component state and obstruct troubleshooting of business failures. Therefore, the approach I recommend is to deploy the binary components directly. kubeadm is currently the officially recommended way to install Kubernetes clusters, yet, surprisingly, kubeadm also deploys the core components from images. While this saves users a lot of trouble, it brings many pitfalls for future troubleshooting and maintenance, so use it with caution.

Secondly, since enterprises require high availability, the master nodes are generally deployed as 3 nodes, and the most critical component to maintain is the etcd key-value cluster. One common misconception is that having 3 or more master nodes automatically makes the system highly available; that is not the case. If all 3 masters sit in the same network zone, the service will still fail when that zone becomes unstable. The solution is to place the masters in 3 different network zones to achieve fault tolerance and high availability.
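
With binary deployment, spreading etcd across three network zones is expressed directly in each member’s configuration file. A minimal sketch for one member, with illustrative names and addresses (one member per zone):

```yaml
# /etc/etcd/etcd.conf.yml on the zone-A member (sketch)
name: etcd-zone-a
data-dir: /var/lib/etcd
initial-cluster-state: new
# the three members sit in three different network zones
initial-cluster: etcd-zone-a=https://10.0.1.10:2380,etcd-zone-b=https://10.0.2.10:2380,etcd-zone-c=https://10.0.3.10:2380
initial-advertise-peer-urls: https://10.0.1.10:2380
listen-peer-urls: https://10.0.1.10:2380
listen-client-urls: https://10.0.1.10:2379,https://127.0.0.1:2379
advertise-client-urls: https://10.0.1.10:2379
```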

In addition, many enterprises have limited server capacity, say around 5-6 nodes, and still want to run a Kubernetes cluster. Dedicating 3 of those nodes as masters that run no business workloads feels wasteful. In that case I think a lightweight Kubernetes distribution should be used. Here I recommend K3s, which cleverly compiles all core components into a single binary of only about 40 MB. Although such a cluster runs a single-node control plane, business workloads hosted on the worker nodes remain accessible even if the master fails; we only need to deploy the business across multiple nodes to complement this single-master design. Enterprises can adopt this solution flexibly according to their needs.
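
A minimal sketch of a K3s server configuration, assuming K3s’s `/etc/rancher/k3s/config.yaml` mechanism; all values are illustrative:

```yaml
# /etc/rancher/k3s/config.yaml (sketch)
write-kubeconfig-mode: "0644"   # let non-root users read the kubeconfig
tls-san:
  - k3s.example.internal        # extra name on the API server certificate
node-label:
  - "environment=edge"          # illustrative label applied to this node
```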

In summary, I believe that implementing Kubernetes in an enterprise presents real technical challenges, and DevOps teams need to face them head-on and choose reasonable solutions based on their actual implementation conditions.