
17 A More Realistic Cloud-Native Practice: Building a Multi-Node Kubernetes Cluster #

Hello, I’m Chrono.

By now, you’ve already made good progress in this series. In the “Getting Started” section, we learned about Docker and container technology; in the “Beginner” section, we mastered the basic objects, principles, and operations of Kubernetes. It has been quite a journey with many gains.

At this point, you should have a preliminary understanding of Kubernetes and container orchestration. Now, let’s continue to delve deeper into other API objects of Kubernetes, which are essential concepts for cloud computing and cluster management but do not exist in Docker.

However, before we can do that, we need a more realistic Kubernetes environment than minikube. It should be a multi-node Kubernetes cluster, which is closer to a production system in the real world and allows us to quickly gain practical cluster usage experience.

So, in today’s lesson, let’s temporarily set minikube aside and use kubeadm (https://kubernetes.io/zh/docs/reference/setup-tools/kubeadm/) to build a new Kubernetes cluster, and take a look at a more realistic cloud-native environment.

What is kubeadm #

In the previous few lessons, we used minikube, which is very easy to use and does not require much configuration work to create a fully functional Kubernetes cluster in a single-machine environment. It has brought great convenience for learning, development, and testing.

However, minikube is still too “mini”. While convenient, it hides many details, and it is quite different from a real production environment running on a multi-machine cluster. After all, many requirements and tasks only come up in a multi-node cluster; by comparison, minikube can only be considered a “toy”.

So, how is a multi-node Kubernetes cluster created from scratch?

In [Lesson 10], I mentioned that Kubernetes is composed of many modules, and the components that implement core functions, such as apiserver, etcd, and scheduler, are essentially executable files. Therefore, similar to other systems, they can be packaged and deployed to servers using tools like Shell scripts or Ansible.

However, the configuration of these components, and the interrelationships between them, are too complex for Shell or Ansible to handle easily. Deploying this way demands considerable operations expertise, and even then the setup process is very cumbersome.

To simplify the deployment of Kubernetes and make it more “down-to-earth”, the community has introduced a tool specifically designed to install Kubernetes in a cluster environment called “kubeadm”, which stands for “Kubernetes Administrator”.


Kubeadm, similar to minikube in principle, also uses containers and images to package various components of Kubernetes. However, its goal is not to deploy on a single machine, but to easily deploy Kubernetes in a cluster environment and make this cluster approach or even reach production-level quality.

While maintaining this high standard, kubeadm is also as easy to use as minikube: with just a few commands, such as init, join, upgrade, and reset, you can complete the management and maintenance of a Kubernetes cluster. This makes it suitable not only for cluster administrators but also for developers and testers.

What does the architecture of the experimental environment look like #

Before using kubeadm to build the experimental environment, let’s take a look at the architecture design of the cluster, which means preparing the necessary hardware facilities for the cluster.

Here I have drawn a system architecture diagram, which includes a total of 3 hosts. Of course, they are all virtual machines created using virtualization software such as VirtualBox or VMware. Let me explain in detail:

(Image: architecture diagram of the experimental environment with one Master, one Worker, and one Console host)

A multi-node cluster requires at least two servers, so to keep things simple we will use that minimum: this Kubernetes cluster has just two hosts, one Master node and one Worker node. Of course, after fully mastering kubeadm, you can add more nodes to the cluster.

The Master node needs to run components like apiserver, etcd, scheduler, and controller-manager to manage the entire cluster, so it has higher configuration requirements, at least 2 CPU cores and 4GB of memory.

(Image: Master node virtual machine configuration)

The Worker node has no management duties and only runs business applications, so its configuration can be lower. To save resources, I allocated 1 CPU core and 1GB of memory to it, which is about the bare minimum.

(Image: Worker node virtual machine configuration)

To better simulate a production environment, there is also a server outside the Kubernetes cluster that plays an auxiliary role.

It is called Console, meaning the control terminal. We need to install the command-line tool kubectl on it, and all management commands for the Kubernetes cluster will be issued from this host. This matches real-world practice: for security reasons, once the hosts in a cluster are deployed, direct logins and operations on them should be minimized.

I want to remind you that Console is only a logical concept; it does not have to be an independent machine. When you actually install and deploy, you can reuse the minikube virtual machine created earlier, or simply use the Master/Worker node as the Console.

These three hosts together form our experimental environment. When configuring them, pay attention to the network settings: they must be on the same subnet. You can review the [Preparation] lesson to make sure they use the same “Host-Only” (VirtualBox) or “Custom” (VMware Fusion) network.

Preparations before installation #

However, after having these hosts in the architecture diagram, we cannot immediately start installing Kubernetes using kubeadm because Kubernetes has some specific requirements for the system. We need to do some preparations on both the Master and Worker nodes.

You can find detailed information about these preparations on the official Kubernetes website, but it is scattered across different documents and can be confusing. Therefore, I have consolidated the steps here: changing the hostname, adjusting the Docker configuration, modifying the network settings, and disabling the swap partition.

First, since Kubernetes uses hostnames to distinguish nodes in the cluster, the hostname of each node must not be the same. You need to modify the file “/etc/hostname” and change it to a recognizable name. For example, the Master node can be named master and the Worker node can be named worker:

sudo vi /etc/hostname
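
On a systemd-based distribution such as Ubuntu, you can also set the hostname in one step with hostnamectl instead of editing the file:

sudo hostnamectl set-hostname master    # on the Master node
sudo hostnamectl set-hostname worker    # on the Worker node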

Second, although Kubernetes currently supports multiple container runtimes, Docker is still the most convenient and easy-to-use one. So we will continue to use Docker as the underlying support for Kubernetes. Install Docker Engine using apt (refer to [Lesson 1]).

After installation, you need to make some modifications to the Docker configuration. In the file “/etc/docker/daemon.json”, change the cgroup driver to systemd, and then restart the Docker daemon. The specific operations are listed below:

cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

sudo systemctl enable docker
sudo systemctl daemon-reload
sudo systemctl restart docker
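
You can confirm the change took effect by checking the driver that Docker reports:

docker info | grep -i 'cgroup driver'    # should print: Cgroup Driver: systemd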

Third, to allow Kubernetes to inspect and forward bridged network traffic, you need to load the “br_netfilter” kernel module and adjust a few iptables-related sysctl settings:

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
# a drop-in file here is better than modifying /etc/sysctl.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

sudo sysctl --system
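
A quick way to verify that the module is loaded and the settings are active:

lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables    # should print: ... = 1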

Fourth, you need to disable the Linux swap partition, which Kubernetes requires for stable, predictable performance, and modify the file “/etc/fstab” so that it stays disabled after a reboot:

# turn swap off immediately
sudo swapoff -a
# comment out swap entries so it stays off after reboot
sudo sed -ri '/\sswap\s/s/^#?/#/' /etc/fstab
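
You can verify that swap is indeed off with free:

free -h    # the Swap line should show 0B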

After completing these steps, it is recommended to restart the system and take a snapshot of the virtual machine as a backup to avoid repeating the steps due to operation mistakes in the future.

Installing kubeadm #

Alright, now we will install kubeadm. This step needs to be done on both the Master and Worker nodes.

kubeadm can be downloaded and installed directly from Google’s own package repository. However, due to unstable network connections in China, it is difficult to download successfully, so we need an alternative package source. Here I have chosen the mirror run by Alibaba Cloud:

sudo apt install -y apt-transport-https ca-certificates curl

curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -

cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF

sudo apt update

After updating the package index, we can use apt install to get the three essential packages: kubeadm, kubelet, and kubectl. By default, apt downloads the latest version, but we can also specify a version number. For example, we can use the same version as minikube, “1.23.3”:

sudo apt install -y kubeadm=1.23.3-00 kubelet=1.23.3-00 kubectl=1.23.3-00

After the installation is completed, you can run kubeadm version and kubectl version --client to verify that the versions are correct:

kubeadm version
kubectl version --client


In addition, as the official Kubernetes documentation recommends, it’s best to lock the versions of these three packages with apt-mark hold, to avoid accidental upgrades that could cause version mismatches:

sudo apt-mark hold kubeadm kubelet kubectl

Downloading Kubernetes Component Images #

As I mentioned before, kubeadm packages components such as apiserver, etcd, and scheduler into images and starts Kubernetes as containers. However, these images are hosted not on Docker Hub but on Google’s own image registry, gcr.io, which is difficult to access from China; pulling them directly is almost impossible.

Therefore, we need to take some alternative measures and download the images to the local machine in advance.

You can use the command kubeadm config images list to view the list of images required to install Kubernetes. You can specify the version number with the --kubernetes-version parameter:

kubeadm config images list --kubernetes-version v1.23.3

k8s.gcr.io/kube-apiserver:v1.23.3
k8s.gcr.io/kube-controller-manager:v1.23.3
k8s.gcr.io/kube-scheduler:v1.23.3
k8s.gcr.io/kube-proxy:v1.23.3
k8s.gcr.io/pause:3.6
k8s.gcr.io/etcd:3.5.1-0
k8s.gcr.io/coredns/coredns:v1.8.6

Once you know the names and tags of the images, there are two methods to easily obtain these images.

The first method is to use minikube. Since minikube also packages the component images of Kubernetes, you can export these images from its nodes and then copy them over.

The process is straightforward. First, start minikube and log in to the virtual node using minikube ssh. Use docker save -o to save the required versions of the images to tar files, and then use minikube cp to copy them to the local machine.
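
Sketched out in commands, that route looks something like this. It is a minimal sketch: I’m assuming your minikube version’s cp command accepts the node:path form for the source, and the tar file path is illustrative:

minikube start --kubernetes-version=v1.23.3
minikube ssh
# now inside the minikube node: save an image to a tar file (repeat for the others)
docker save k8s.gcr.io/kube-apiserver:v1.23.3 -o /home/docker/apiserver.tar
exit
# back on the host: copy the tar file out and load it into the local Docker
minikube cp minikube:/home/docker/apiserver.tar ./apiserver.tar
docker load -i apiserver.tar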


This method is secure and reliable, but it involves more steps. Therefore, there is a second method to download from Chinese mirror websites and then rename the images using docker tag. This can be automated using Shell programming:

repo=registry.aliyuncs.com/google_containers

for name in `kubeadm config images list --kubernetes-version v1.23.3`; do

    # strip the k8s.gcr.io/ prefix (and coredns' extra path) to get the mirror's image name
    src_name=${name#k8s.gcr.io/}
    src_name=${src_name#coredns/}

    # pull from the Chinese mirror, re-tag to the official name, then drop the mirror tag
    docker pull $repo/$src_name

    docker tag $repo/$src_name $name
    docker rmi $repo/$src_name
done

The second method is faster, but it carries some risk: if the mirror site is unavailable, or the images there have been modified, you could be in trouble.

To mitigate these risks, you can combine these two methods. First, use the script to download from the Chinese mirror repository, and then compare it with the images in minikube. As long as the IMAGE ID is the same, it indicates that the images are correct.
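
For example, after running the script you can print the local image IDs like this and compare them with what minikube has:

docker images --format '{{.ID}}  {{.Repository}}:{{.Tag}}' | grep k8s.gcr.io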

The screenshots below show the image lists for Kubernetes 1.23.3 (amd64 and arm64), which you can refer to during installation:

(Image: Kubernetes 1.23.3 image list, amd64)

(Image: Kubernetes 1.23.3 image list, arm64)

Installing the Master Node #

With all the preparations done, you can now proceed to install Kubernetes. Let’s start with the Master node.

The usage of kubeadm is very simple. You only need one command, kubeadm init, to start the components on the Master node. However, there are many parameters available to adjust the cluster configuration. You can use -h to see them all. Here, I will only mention three parameters that we will use in our experimental environment:

  • --pod-network-cidr: Sets the IP address range for Pods in the cluster.
  • --apiserver-advertise-address: Sets the IP address of the apiserver. This is important for servers with multiple network interfaces, such as VirtualBox virtual machines, because it specifies which interface the apiserver uses to provide services externally.
  • --kubernetes-version: Specifies the Kubernetes version number.

In the following installation command, I set the Pod address range to “10.10.0.0/16”, the apiserver service address to “192.168.10.210”, and the Kubernetes version to “1.23.3”:

sudo kubeadm init \
    --pod-network-cidr=10.10.0.0/16 \
    --apiserver-advertise-address=192.168.10.210 \
    --kubernetes-version=v1.23.3

Since we have already downloaded the images locally, kubeadm completes the installation quickly. When it finishes, it prints instructions for the next steps:

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

This means you need to create a “.kube” directory in your home directory and copy the kubectl configuration file into it. Just copy and paste the commands as instructed.

The output also contains another important command, “kubeadm join”, which you must copy and save, because other nodes need the token and CA certificate hash it carries to join the cluster:

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.10.210:6443 --token tv9mkx.tw7it9vphe158e74 \
    --discovery-token-ca-cert-hash sha256:e8721b8630d5b562e23c010c70559a6d3084f629abad6a2920e87855f8fb96f3
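
By the way, if you forget to save this command, or the token expires (by default after 24 hours), you can have the Master print a fresh join command at any time:

kubeadm token create --print-join-command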

After the installation is completed, you can use kubectl version and kubectl get node to check the version of Kubernetes and the status of the cluster’s nodes:

kubectl version
kubectl get node

(Image: kubectl get node output; the Master node shows NotReady)

You will notice that the status of the Master node is “NotReady”. This is because the network plugin is still missing, and the internal network of the cluster is not functioning properly.

Installing Flannel Network Plugin #

Kubernetes defines the CNI standard and there are many network plugins available. Here, I’ll choose the most commonly used Flannel. You can find the relevant documentation on its GitHub repository (https://github.com/flannel-io/flannel/).

Installing Flannel is straightforward. You just need to apply the project’s “kube-flannel.yml” file to Kubernetes. However, because it configures the cluster’s Pod network range, you need to modify the “net-conf.json” field in the file, changing Network to the address range you passed to kubeadm in the --pod-network-cidr parameter.

For example, here you would modify it to “10.10.0.0/16”:

    net-conf.json: |
      {
        "Network": "10.10.0.0/16",
        "Backend": {
          "Type": "vxlan"
        }
      }      
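
If you prefer to script the download and the edit, a sketch like this works, assuming the manifest still lives under Documentation/ in the flannel repository and still defaults to the 10.244.0.0/16 network (both may change between releases):

wget https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
# replace flannel's default Pod network with the range given to kubeadm
sed -i 's|10.244.0.0/16|10.10.0.0/16|' kube-flannel.yml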

Once you’ve made the change, you can use kubectl apply to install the Flannel network:

kubectl apply -f kube-flannel.yml

Wait for a moment while the image is pulled and runs. After that, you can execute kubectl get node to check the node status:

kubectl get node

(Image: kubectl get node output; the Master node shows Ready)

At this point, you should see that the status of the Master node is “Ready”, indicating that the node’s network is functioning properly.

Installing Worker Nodes #

If you have successfully installed the Master node, then the installation of Worker nodes is much simpler. You just need to use the kubeadm join command that was copied earlier, and remember to execute it with sudo:

sudo \
kubeadm join 192.168.10.210:6443 --token tv9mkx.tw7it9vphe158e74 \
    --discovery-token-ca-cert-hash sha256:e8721b8630d5b562e23c010c70559a6d3084f629abad6a2920e87855f8fb96f3

It will connect to the Master node, pull the images, install the network plugin, and finally join the node to the cluster.

During this process, you may encounter image pulling issues. You can follow the same steps as before and pre-download the images to the local Worker node. This will ensure a smooth installation process.

Once the installation of the Worker node is complete, execute kubectl get node, and you will see that both nodes are in the “Ready” state:

(Image: kubectl get node output; both nodes show Ready)

Now let’s use kubectl run to run Nginx and test it:

kubectl run ngx --image=nginx:alpine
kubectl get pod -o wide

(Image: kubectl get pod -o wide output; the Pod runs on the Worker node with IP 10.10.1.2)

You will see that the Pod is running on the Worker node with the IP address “10.10.1.2”, indicating that our Kubernetes cluster has been successfully deployed.
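
As one more sanity check, you can access the Pod directly from a cluster node, using the IP address shown by kubectl get pod -o wide (10.10.1.2 here):

curl 10.10.1.2    # should return the Nginx welcome page HTML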

Summary #

Okay, now that we have installed both the Master node and Worker nodes, our task for today is considered mostly completed.

Deploying the Console host is even simpler: it only requires installing kubectl and copying the “config” file. You can use scp to copy both from the Master node, for example:

scp `which kubectl` [email protected]:~/
# make sure the ~/.kube directory already exists on the Console host
scp ~/.kube/config [email protected]:~/.kube

Today’s process was a bit more involved, and I have listed the key points below:

  1. kubeadm is a convenient and easy-to-use tool that can deploy production-grade Kubernetes clusters.
  2. Before installing Kubernetes, you need to modify the host configurations, including the hostname, Docker configuration, network settings, and swap partition.
  3. Kubernetes component images are stored in gcr.io, which can be a bit tricky to download in China. You can consider getting them from minikube or Chinese mirror websites.
  4. Installing the Master node requires the use of the command kubeadm init, and installing the Worker nodes requires the use of the command kubeadm join. Additionally, you need to deploy network plugins like Flannel for the cluster to work properly.

Since these operations involve various Linux commands, manually typing them all can indeed be cumbersome. Therefore, I have turned these steps into Shell scripts and placed them on GitHub (https://github.com/chronolaw/k8s_study/tree/master/admin). You can download them and run them directly.

Homework #

The final homework is a hands-on exercise. Please spend more time creating cluster nodes using virtual machines and deploying a multi-node Kubernetes environment using kubeadm. In the upcoming “Intermediate” and “Advanced” sections, we will conduct experiments in this Kubernetes cluster.

If you have any questions during the installation and deployment process, feel free to leave a message in the comment section, and I will respond as soon as possible. If you find this helpful, you are welcome to share it with your friends for learning together. See you in the next class.
