
10 Automated Operations: Exploring the Secrets of How Kubernetes Works #

Hello, I’m Chrono.

In the last class, we saw that container technology only solves application packaging and distribution; many difficulties remain when it comes to operations and maintenance. That is why container orchestration technology is needed, and Kubernetes is the undisputed leader in this field, having become the “de facto standard”.

So, why is Kubernetes capable of shouldering such a leadership role? Is it simply because it was developed mainly by Google?

Today, I will take you to explore the internal architecture and working mechanism of Kubernetes, and understand the secrets that make it stand above the rest.

The Operating System in the Era of Cloud Computing #

As I mentioned before, Kubernetes is a production-grade container orchestration platform and cluster management system. It can create and schedule containers, and it can monitor and manage servers.

So, what are containers? Containers are software, applications, and processes. What about servers? Servers are hardware, including CPUs, memory, hard drives, and network cards. In that case, what should we call something that can manage both software and hardware?

You might say: That would be an operating system!

You’re right. From a certain perspective, Kubernetes can be seen as a cluster-level operating system, with its main functions being resource management and job scheduling. However, Kubernetes does not run on a single machine to manage a single computing resource and process. Instead, it runs on multiple servers to manage hundreds or thousands of computing resources and millions of processes, on a much larger scale.

Image

Therefore, you can compare Kubernetes to Linux when learning about it. This new operating system will naturally have a series of new terms and concepts that require a new way of thinking, and you may have to bid farewell to some of your old habits when necessary.

One difference you should note between Kubernetes and Linux is that Linux users are usually divided into two types: Dev and Ops, while Kubernetes only has one type of user: DevOps.

In traditional application delivery, developers and operators have clearly separated responsibilities. After development is completed, detailed documentation has to be written and handed over to operators for deployment and management. There is little overlap between the two roles.

However, in the world of Kubernetes, the boundaries between development and operation become less clear. With the rise of cloud-native concepts, developers must consider ongoing deployment and operation tasks from the beginning, and operators also need to get involved in the early stages of development to ensure proper monitoring and management of applications.

This may cause new Kubernetes users to face challenges in terms of identity transformation, which can be difficult at first. But don’t worry, it is completely normal, and any learning process has an adaptation period. Once you get past the initial stage of understanding the concepts, things will become easier.

Basic Architecture of Kubernetes #

An important function of an operating system is abstraction: distilling simplified concepts from the complex underlying details, and then using those concepts to manage system resources.

Kubernetes is no exception. Its goal is to manage large-scale clusters and applications, so it needs to abstract the system to a high enough level and decompose it into loosely coupled objects in order to simplify the system model and reduce the user’s mental burden.

Therefore, Kubernetes plays the role of a “master-level” system administrator, with rich experience in cluster operation and maintenance. It has developed its own set of working methods and can independently handle many complex management tasks without much external intervention.

Now let’s take a look at the “internal skills” of this experienced administrator.

There is an architecture diagram of Kubernetes on the official website, but I think it is not very clear and the key points are not highlighted, so I found another one (image source). Although this diagram is a bit “old”, it is still suitable for beginners to learn Kubernetes.

Image

Kubernetes adopts the popular “Control Plane / Data Plane” architecture. The computers in the cluster are called “nodes”, which can be physical or virtual machines. A small number of nodes are used as the control plane to execute cluster management and maintenance tasks, and most of the other nodes are assigned to the data plane to run business applications.

The nodes in the control plane are called “Master Nodes” in Kubernetes, generally referred to as “Masters”. They are the most important part of the entire cluster and can be considered as the brain and heart of Kubernetes.

The nodes in the data plane are called “Worker Nodes”, generally referred to as “Workers” or “Nodes”. They are like the hands and feet of Kubernetes, working under the command of the Master.

Together, these many nodes form a resource pool from which Kubernetes allocates resources and onto which it schedules applications. Because resources are “pooled”, management becomes relatively simple, and nodes can be added to or removed from the cluster at will.

In this architecture diagram, we can also see a tool called kubectl, the Kubernetes client used to operate the cluster. However, it runs outside the cluster and, strictly speaking, is not part of it.
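
Since kubectl lives outside the cluster, it needs to know where the apiserver is and how to authenticate. That information is normally kept in a kubeconfig file (by default ~/.kube/config), which minikube writes for you when it creates the cluster. Two standard commands let you check what kubectl is currently pointing at:

kubectl cluster-info
kubectl config view --minify

The first prints the address of the control plane, and the second shows only the configuration of the current context.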

You can use the command kubectl get node to view the status of Kubernetes nodes:

kubectl get node

Image

As you can see, there is currently only one Master in the minikube cluster, but where are the Nodes?

This is because the division between Master and Node is not absolute. When the cluster is small and the workload is low, the Master can also perform the tasks of a Node. Just like the minikube environment we built, it only has one node, which serves as both the Master and the Node.
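
If you want to verify this yourself, one simple check (assuming a reasonably recent Kubernetes version, where control-plane nodes carry a role label) is to look at the node’s labels:

kubectl get nodes --show-labels

In the minikube output you should see a label such as node-role.kubernetes.io/control-plane on the single node, which is exactly what marks it as a Master.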

Structure Inside the Nodes #

The internal structure of a Kubernetes node is also complex, consisting of many modules that can be divided into two categories: components and addons.

Components implement the core functionality of Kubernetes. Without these components, Kubernetes cannot be started. On the other hand, addons are additional features of Kubernetes that are not required for normal operation.

Let’s first talk about the components in the master and node, and then briefly mention the addons. Once you understand their workflows, you will understand why Kubernetes has such powerful automation capabilities.

Components in the Master #

There are 4 components in the master: apiserver, etcd, scheduler, and controller-manager.

Image

The apiserver is the core of the Master node and the single entry point for the entire Kubernetes system. It exposes a set of RESTful APIs with authentication and authorization. All other components must go through the apiserver to communicate, so it can be thought of as the liaison officer of Kubernetes.

Etcd is a highly available distributed key-value database used to persistently store various resource objects and states in the system. It acts as the configuration manager in Kubernetes. Note that it only has direct contact with the apiserver. In other words, any other component that wants to read or write data in etcd must go through the apiserver.

The scheduler is responsible for container orchestration. It checks the resource status of nodes and schedules Pods to run on the most suitable nodes. It acts as the deployment personnel. Since the node status and Pod information are stored in etcd, the scheduler must obtain them through the apiserver.

The controller-manager is responsible for maintaining the state of resources such as containers and nodes. It implements functions such as fault detection, service migration, and application scaling. It acts as the monitoring and operation personnel. Similar to other components, it also needs to obtain information stored in etcd through the apiserver to perform various operations on resources.
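
Because every one of these components talks only to the apiserver, all cluster information ultimately flows through its RESTful API, and you can query that API directly. As a small illustration (the exact output depends on your version):

kubectl get --raw /version       # version info served by the apiserver
kubectl get --raw / | head -n 20 # top-level REST paths it exposes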

All four of these components are containerized and run as Pods inside the cluster. You can use the command kubectl get pod -n kube-system to view their status:

kubectl get pod -n kube-system

Image

Note that the command includes the -n kube-system parameter, which specifies that we want to check the pods in the “kube-system” namespace. We will talk about namespaces later.
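
If you add the -o wide flag to the same command, kubectl also prints a NODE column, so you can confirm that in the minikube environment all of these control-plane Pods run on the single minikube node:

kubectl get pod -n kube-system -o wide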

Components in the Node #

The components in the master, such as the apiserver and scheduler, need to obtain various information from nodes in order to make management decisions. So how do they get this information?

This is where the 3 components in the node come in: kubelet, kube-proxy, and container-runtime.

The kubelet is the agent of the node and is responsible for managing most of the operations related to the node. It is the only component on the node that can communicate with the apiserver, enabling functionalities such as status reporting, command issuance, and container start/stop. It can be considered as a “butler” on the node.

The role of kube-proxy is a bit special. It is the network proxy for the node and is only responsible for managing container network communication. Simply put, it forwards TCP/UDP packets for Pods, acting as a dedicated “postman.”

The third component, container-runtime, is the actual user of containers and images. It creates containers under the command of kubelet and manages the lifecycle of Pods. It is the “workhorse” that actually performs the tasks.

Image

We must note that Kubernetes is a container orchestration platform, and it does not require the container-runtime to be Docker. It can be replaced with any other container runtime that complies with the standards, such as containerd or CRI-O. However, in this case, we are using Docker.
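
You do not even have to log into a node to find out which runtime it uses: the output of kubectl get nodes -o wide includes a CONTAINER-RUNTIME column, showing something like docker://20.10.x or containerd://1.6.x depending on your setup:

kubectl get nodes -o wide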

Among these 3 components, only kube-proxy runs as a container. The kubelet cannot be containerized, because it has to manage the entire node and running it inside a container would limit its capabilities. Therefore, kubelet runs directly on the node as an ordinary process, outside of the container runtime.

After logging into the node using the minikube ssh command, you can see kube-proxy using the docker ps command:

minikube ssh
docker ps |grep kube-proxy

Image

However, kubelet cannot be found with docker ps; you need to use the operating system’s ps command instead:

ps -ef|grep kubelet

Image
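
While you are still logged into the node, you can also take a quick look at the forwarding rules that kube-proxy maintains. Assuming kube-proxy is running in its default iptables mode, the rules it programs are all prefixed with KUBE:

sudo iptables-save | grep KUBE | head -n 10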

Now, let’s put the components in the Node and the components in the Master together to understand the general workflow of Kubernetes:

  • The kubelet on each Node regularly reports the node status to the apiserver, which is then stored in etcd.
  • The kube-proxy on each Node acts as a TCP/UDP reverse proxy, allowing containers to provide stable services externally.
  • The scheduler obtains the current node status through the apiserver, schedules Pods, and then the apiserver sends commands to the kubelet of a specific Node, which in turn calls the container-runtime to start containers.
  • The controller-manager also obtains real-time node status from the apiserver, monitors possible abnormal situations, and uses appropriate means to adjust and recover.

Image
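
If you would like to see this workflow with your own eyes, a small throwaway experiment (using a hypothetical Pod name such as ngx) is to create a Pod and then ask Kubernetes where it was scheduled and what happened along the way:

kubectl run ngx --image=nginx:alpine   # create a throwaway Pod
kubectl get pod ngx -o wide            # see which node the scheduler picked
kubectl describe pod ngx               # read the Events recorded by the scheduler and kubelet
kubectl delete pod ngx                 # clean up

In the describe output, the Events section records the scheduler assigning the Pod to a node and the kubelet pulling the image and starting the container — exactly the chain of actions listed above.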

In fact, this is not much different from the operational workflow before Kubernetes appeared, but the cleverness of Kubernetes lies in abstracting and standardizing these processes.

Therefore, these components are like numerous tireless operations engineers, moving the tedious and inefficient manual work into efficient computers, so that they can discover changes and abnormalities in the cluster at any time, collaborate with each other, and maintain the health of the cluster.

What Are the Plugins (Addons)? #

As long as the nodes in a cluster run core components such as apiserver, etcd, scheduler, controller-manager, kubelet, kube-proxy, and container-runtime, you have a fully functional Kubernetes cluster.

However, just like Linux, the basic functions of the operating system make it merely “usable”; to make it truly “user-friendly”, additional software has to be installed. In Kubernetes, these extras are called plugins (addons).

Due to the flexibility of Kubernetes itself, there are a large number of plugins available to extend and enhance its management capabilities for applications and clusters.

Minikube also supports many plugins, and you can view the list of plugins using the command minikube addons list:

minikube addons list

Image
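
Enabling or disabling a plugin is a single command. For example, to turn on the metrics-server addon from the list above (and turn it off again later):

minikube addons enable metrics-server
minikube addons disable metrics-server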

In my opinion, two of the important plugins are DNS and Dashboard.

You should be quite familiar with DNS, which implements domain name resolution services in a Kubernetes cluster, allowing us to communicate with each other using domain names instead of IP addresses. It is the basis for service discovery and load balancing. Due to its crucial role in microservices, service mesh, and other architectures, it is basically an essential plugin for Kubernetes.
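
A quick way to see cluster DNS at work is to start a throwaway Pod and resolve the built-in kubernetes Service by name (a hedged example using the busybox image; the full form of such names follows the <service>.<namespace>.svc.cluster.local convention):

kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup kubernetes.default

If the DNS plugin is working, the lookup returns the cluster IP of the kubernetes Service in the default namespace.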

The Dashboard is a graphical user interface for Kubernetes, providing a visual and user-friendly operation interface. Although most Kubernetes work is done using the command-line tool kubectl, sometimes it is convenient to view information on the Dashboard.

You only need to execute a simple command in the minikube environment to automatically open the Dashboard page in a browser, and it also supports Chinese:

minikube dashboard

Image
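
If you are working on a remote machine or simply do not want a browser to pop up, the --url flag prints the Dashboard address instead of opening it:

minikube dashboard --url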

Summary #

Alright, today we have studied the internal architecture and working mechanism of Kubernetes together. It can be seen that its functionality is very comprehensive, achieving most of the common operations and maintenance tasks, and it is fully automated, which can save a lot of manpower costs.

Due to the high level of abstraction in Kubernetes, there are many unfamiliar new terms that are not easy to understand, so I have created a mind map for reference to deepen your understanding.

Image

In conclusion, let’s summarize today’s key points:

  1. Kubernetes can manage applications and servers at the cluster level and can be regarded as a kind of cluster operating system. It uses the basic architecture of “control plane/data plane”, with the Master node implementing management and control functions, and the Worker node running specific business operations.
  2. Kubernetes is composed of many modules, which can be divided into core components and optional plugins.
  3. There are four components in the Master, namely apiserver, etcd, scheduler, and controller-manager.
  4. There are three components in the Node, namely kubelet, kube-proxy, and container-runtime.
  5. Typically, essential plugins include DNS and Dashboard.

Homework #

Now it’s time for homework. I have two questions for you to think about:

  1. Do you think Kubernetes can be considered an operating system? What are the differences compared to a real operating system?
  2. Discuss your understanding of the roles of Kubernetes components. Which ones do you think are the most important?

Feel free to leave comments or ask questions. Let’s engage in discussions with other classmates. See you in the next class.