08 Kubernetes (K8s) Cluster Installation Tool kubeadm Practical Implementation #

Kubeadm is a command-line tool maintained by the Kubernetes project that supports near one-command deployment of Kubernetes clusters. Anyone who has used it will be impressed by how easily a cluster comes up with just two operations: kubeadm init and kubeadm join. However, this out-of-the-box convenience cannot be carried directly into production. We also have to consider the high availability layout of the components, as well as long-term sustainability and maintainability. These practical requirements force us to rethink how kubeadm is used in the industry; by borrowing from proven practice, we can use kubeadm correctly.
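
As a quick illustration of that two-step flow, here is a minimal sketch of bootstrapping a plain (non-HA) cluster; the pod CIDR is only an example value, and the join parameters are placeholders that your own kubeadm init run prints out.

# On the first control plane node (the CIDR below is only an example value)
kubeadm init --pod-network-cidr=10.244.0.0/16

# On each worker node, using the token and CA certificate hash printed by kubeadm init
kubeadm join <control-plane-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>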

First, the official community documentation defines the architecture of a classic highly available Kubernetes cluster as follows:

[Figure 7-1: kubeadm Kubernetes high availability architecture (stacked etcd)]

From the architecture diagram we can see that the control plane of the Kubernetes cluster is stacked across 3 nodes, giving the control components a redundant, highly available deployment. Among them, etcd serves as the central store for cluster state data and uses the Raft consensus algorithm to keep read and write operations consistent. Observant readers will notice that each kube-apiserver in the control plane talks to the etcd member on its own host. Assuming the cluster size is fixed, this stacked layout spreads traffic across the etcd members and effectively preserves the read and write performance of the components.

Because the etcd key-value store holds the state data of the entire cluster, it is a critical component of the system. The official documentation also provides a high availability deployment architecture for an external etcd cluster:

[Figure 7-2: high availability architecture with an external etcd cluster]

Kubeadm supports high availability deployment using either of the above architectures. The most obvious difference between the two is that in the external etcd mode, the number of etcd nodes does not need to match the number of control plane nodes: you can provision 3 or 5 etcd nodes according to the cluster size to keep the business highly available. The community special interest group sig-cluster-lifecycle has also released the open-source etcdadm tool to automate the deployment of external etcd clusters, as the sketch below shows.
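
A rough sketch of how etcdadm is typically driven (command names as described in the etcdadm README; the member address is a placeholder):

# On the first etcd node: bootstrap a new etcd cluster
etcdadm init

# On each additional etcd node: join by pointing at an existing member's client URL
etcdadm join https://<first-etcd-node-ip>:2379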

Baseline Checks Before Installation #

The first thing to check on the cluster hosts is the uniqueness of hardware information to prevent conflicts in cluster information. Ensure the uniqueness of MAC addresses and product_uuid on each node. The checking methods are as follows:

  • You can use the command ip link or ifconfig -a to obtain the MAC address of the network interface.
  • You can use the command sudo cat /sys/class/dmi/id/product_uuid to check the product_uuid.

Checking the uniqueness of hardware information mainly guards against duplicated identifiers on virtual machines cloned from the same template; checking up front avoids this issue.

In addition, we also need to ensure that the default network interface can access the Internet, because Kubernetes components communicate with each other over the default route. The sketch below combines these pre-flight checks.
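
A small pre-flight sketch that gathers the hardware identifiers and the default route; run it on every host and compare the output across nodes:

# MAC addresses of all interfaces (must be unique across nodes)
ip link show | awk '/link\/ether/ {print $2}'

# product_uuid (must also be unique across nodes)
sudo cat /sys/class/dmi/id/product_uuid

# Interface and gateway carrying the default route used by the Kubernetes components
ip route show default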

There is also an issue on current mainstream Linux distributions: nftables can now replace iptables as the kernel's packet-filtering subsystem, with the iptables command acting as a compatibility layer on top of it. The nftables backend is not compatible with the current kubeadm: it produces duplicate firewall rules and breaks the operation of kube-proxy. On mainstream systems such as CentOS, this can be worked around by switching the iptables command to legacy mode:

update-alternatives --set iptables /usr/sbin/iptables-legacy
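
After switching, you can verify which backend the iptables command is using; on recent distributions the version string reports either nf_tables or legacy:

# Should report "(legacy)" rather than "(nf_tables)" after the switch
iptables --version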

Check Ports #

Control Plane Nodes #

| Protocol | Direction | Port Range | Purpose | User |
|----------|-----------|------------|---------|------|
| TCP | Inbound | 6443* | Kubernetes API server | All components |
| TCP | Inbound | 2379-2380 | etcd server client API | kube-apiserver, etcd |
| TCP | Inbound | 10250 | Kubelet API | kubelet itself, control plane components |
| TCP | Inbound | 10251 | kube-scheduler | kube-scheduler itself |
| TCP | Inbound | 10252 | kube-controller-manager | kube-controller-manager itself |

* Any port number marked with * can be overridden, so you need to ensure that the customized port is open.

Worker Nodes #

| Protocol | Direction | Port Range | Purpose | User |
|----------|-----------|------------|---------|------|
| TCP | Inbound | 10250 | Kubelet API | kubelet itself, control plane components |
| TCP | Inbound | 30000-32767 | NodePort Services** | All components |

** Default port range for NodePort services.

Note that enterprise deployments usually start by initializing a small cluster, so configuring a separate, external etcd cluster is the exception rather than the rule; for small clusters it is preferable to deploy etcd stacked on the control plane nodes.

In addition, the pod network plugin will open some ports of its own; refer to its documentation to plan for those port requirements.
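
As a quick sanity check of the port plan, you can list which of the well-known ports are already listening on a control plane node and probe the API server port from a worker node; the address below is a placeholder:

# On a control plane node: show listeners on the well-known control plane ports
ss -tlnp | grep -E ':(6443|2379|2380|10250|10251|10252)'

# From a worker node: confirm the API server port is reachable over the network
nc -zv <control-plane-ip> 6443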

Installing the Container Runtime Engine #

Kubernetes cannot start containers by itself, so the cluster nodes need to deploy a unified container runtime engine. Starting with v1.6.0, Kubernetes enabled the Container Runtime Interface (CRI) by default, and starting with v1.14.0, kubeadm automatically detects the container runtime on Linux nodes by probing a list of well-known UNIX domain sockets. The table below shows the detectable runtimes and their socket paths.

| Runtime | Socket Path |
|---------|-------------|
| Docker | /var/run/docker.sock |
| containerd | /run/containerd/containerd.sock |
| CRI-O | /var/run/crio/crio.sock |

If both Docker and containerd are detected, Docker is given priority. The industry is gradually moving away from Docker toward the containerd engine, so pay attention to upgrading the container runtime engine in existing cluster environments.
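
If more than one runtime socket is present on a node, you can check which ones actually exist and tell kubeadm explicitly which runtime to use through its --cri-socket flag; the sketch below assumes containerd, and older kubeadm releases accept the socket path without the unix:// prefix:

# See which runtime sockets exist on this node
ls -l /var/run/docker.sock /run/containerd/containerd.sock /var/run/crio/crio.sock 2>/dev/null

# Point kubeadm at containerd explicitly instead of relying on auto-detection
kubeadm init --cri-socket unix:///run/containerd/containerd.sock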

Installing kubeadm, kubelet, and kubectl #

When installing a cluster with kubeadm, kubeadm does not manage the versions of the kubelet or kubectl for you, so cluster administrators need to keep the version numbers consistent themselves to avoid compatibility issues. Otherwise there is a risk of version skew, which can lead to unexpected errors and problems.

The officially recommended way to install these components is through the operating system's package manager, such as yum. In the actual network environment in China, however, downloads from the default repositories often fail. For a consistent installation experience it is recommended to download the corresponding system packages in advance; if they cannot be obtained, the binary files can be deployed directly.
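
A sketch of the package-manager route on a CentOS host is shown below; the repository URL follows the historical upstream layout and should be replaced with a reachable internal or regional mirror when downloads fail:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF

# Install matching versions of the three components and start the kubelet
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
systemctl enable --now kubelet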

Running setenforce 0 and sed ... sets SELinux to permissive mode, which effectively disables it. This is required to allow containers to access the host filesystem, which the pod network needs, for example. You have to do this until SELinux support is improved in the kubelet.
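
The commands referred to above typically look like the following, as given in the upstream kubeadm installation guide:

# Switch SELinux to permissive mode for the running system
setenforce 0

# Make the change persistent across reboots
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config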

Some RHEL/CentOS 7 users have encountered issues where traffic cannot be routed correctly due to iptables being bypassed. You should ensure that net.bridge.bridge-nf-call-iptables in the sysctl configuration is set to 1.

cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

Make sure that the br_netfilter module is loaded before this step. You can check this by running lsmod | grep br_netfilter; to load it explicitly, run modprobe br_netfilter.
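
To load the module now and keep it loaded across reboots, something like the following works on systemd-based distributions; the file name under /etc/modules-load.d/ is arbitrary:

# Load the module immediately
modprobe br_netfilter

# Have systemd-modules-load reload it on every boot
echo br_netfilter > /etc/modules-load.d/k8s.conf

# Verify that it is loaded
lsmod | grep br_netfilter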

In addition, the kubelet on the cluster nodes also needs attention with respect to the cgroup driver. It uses cgroupfs by default, but a systemd driver is also available. Since mainstream operating systems have standardized on systemd, users of the containerd engine are advised to switch to the systemd driver through configuration.
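
For containerd, the switch is usually made in /etc/containerd/config.toml by setting SystemdCgroup = true in the runc options; the sketch below assumes a recent containerd that already includes the key in its default configuration, and the kubelet must be configured with the matching systemd cgroup driver:

# Generate a default configuration if one does not exist yet
containerd config default > /etc/containerd/config.toml

# Flip the runc cgroup driver to systemd (add the key manually if it is absent)
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml

systemctl restart containerd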

Installing a Highly Available Cluster Using kubeadm #

Creating a Load Balancer for kube-apiserver #

Because worker nodes and control plane nodes synchronize cluster state through the kube-apiserver, the worker nodes need a reverse proxy to distribute their request traffic across the control plane nodes. In typical installations an additional HAProxy plus keepalived pair is used to load balance the requests. Since recent Linux kernels ship with the IPVS module, it is also possible to proxy this traffic at the kernel level with dynamically maintained IPVS rules, and there are industry practices that use this method to reach the apiserver. The specific configuration is shown in the diagram below:

[Figure 7-3: LVS/IPVS load balancing in front of kube-apiserver]
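
With a load-balanced endpoint in front of the API servers, the high availability bootstrap typically looks like the sketch below; the endpoint address and the join credentials are placeholders printed by kubeadm during initialization:

# On the first control plane node, anchor the cluster to the load-balanced endpoint
kubeadm init --control-plane-endpoint "<lb-address>:6443" --upload-certs

# On the remaining control plane nodes, join as additional control plane members
kubeadm join <lb-address>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <certificate-key>

# On the worker nodes, join as usual
kubeadm join <lb-address>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>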

Summary of Practices #

Kubernetes offers many installation solutions, and the diversity of environments has produced a wide range of installation tools, which makes the choice confusing for users. Kubeadm is one of the more prominent options among them. Because it deploys components as containers, its operational complexity is considerably higher than that of the binary approach, and issues such as version inconsistency still occur during installation. The community is continuing to optimize and consolidate the stability of this functionality, and it is foreseeable that in the near future the kubeadm-based approach will become the mainstream installation solution.
