26 StatefulSet How to Manage Stateful Applications #

Hello, I am Chrono.

In the intermediate section, we learned about two API objects: Deployment and DaemonSet. They are important tools for deploying applications in a Kubernetes cluster. However, they have a drawback - they can only manage “stateless applications” and cannot handle “stateful applications”.

Managing stateful applications is more complex and involves considering many factors. However, we can still solve these problems by combining the objects we have learned so far, such as Deployment, Service, PersistentVolume, etc.

Today, we will study what “stateful applications” are and why Kubernetes has designed a new object, StatefulSet, specifically for managing them.

What is a Stateful Application #

Let’s start with PersistentVolumes. It brings persistent storage capabilities to Kubernetes, allowing applications to store data on local or remote disks.

So have you ever wondered what persistent storage means for applications?

With persistent storage, applications can persist runtime data to disk, which is like having a “safety net”. If a Pod crashes unexpectedly, it’s just like pressing the pause button. After restarting and mounting the Volume, loading the original data can bring the application back to life, continuing its previous “state”.

Notice anything here? There is a keyword - “state”. The data that the application saves is actually its state at a certain moment.

So from this perspective, theoretically any application is stateful.

It's just that for some applications, the state information is not very important: even if the state is not restored, they can still run normally. These are what we usually call “stateless applications”. A typical example is a web server like Nginx. It only handles HTTP requests and generates no data of its own (logs aside), so it has no state worth saving and provides service equally well no matter what state it restarts in.

There are also some applications whose runtime state information is very important. Losing the state due to a restart is absolutely unacceptable. These applications are known as “stateful applications”.

There are also many examples of “stateful applications”, such as databases like Redis and MySQL. The “state” of these databases is the data generated in memory or on disk, which is the core value of the application. If this data is not saved and restored in time, it would be catastrophic.

Understanding this, let’s combine the knowledge we have learned so far and think about it: Does using Deployment and PersistentVolume in Kubernetes make it easy to manage stateful applications?

Indeed, using Deployment to ensure high availability and PersistentVolume to store data can partially achieve the goal of managing “stateful applications” (you can try to write such YAML yourself).
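As a rough sketch of that idea (the names redis-dep and redis-dep-pvc are hypothetical, and the nfs-client StorageClass is assumed to exist in the cluster), such YAML might look like this:

```yaml
# Hypothetical sketch: a Deployment plus a standalone PVC for data.
# Note that with replicas > 1, all Pods would share this one volume,
# which already hints at why this approach falls short.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-dep-pvc
spec:
  storageClassName: nfs-client
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-dep
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-dep
  template:
    metadata:
      labels:
        app: redis-dep
    spec:
      containers:
      - image: redis:5-alpine
        name: redis
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: redis-dep-pvc
```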

But Kubernetes has a more comprehensive and long-term perspective. It believes that “state” is not just data persistence. In a clustered and distributed environment, there are also dependencies between multiple instances, startup order, and network identification issues that need to be resolved. These are precisely the shortcomings of Deployment.

This is because with a Deployment alone, the multiple instances are unrelated to each other, their startup order is not fixed, and the Pod names, IP addresses, and domain names are all completely random, which is exactly the characteristic of a “stateless application”.

However, for “stateful applications”, there may be dependencies between multiple instances, such as master/slave or active/passive, which need to be started in a particular order to ensure the application runs normally. External clients may also need to use fixed network identifiers to access the instances, and these identifiers must remain unchanged after Pod restarts.

Therefore, based on Deployment, Kubernetes defines a new API object with an easy-to-understand name: StatefulSet. It is dedicated to managing stateful applications.

How to Use YAML to Describe a StatefulSet #

First, let’s use the command kubectl api-resources to view basic information about StatefulSet. We can see that its abbreviation is sts, and the YAML file header information is:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: xxx-sts

Similar to DaemonSet, StatefulSet can also be regarded as a special case of Deployment. kubectl create cannot generate a sample YAML for it directly, but its object description is very similar to Deployment's, so you can modify a Deployment description accordingly to create a StatefulSet object.

Here is an example of a StatefulSet using Redis. Take a look at its differences from Deployment:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-sts

spec:
  serviceName: redis-svc
  replicas: 2
  selector:
    matchLabels:
      app: redis-sts

  template:
    metadata:
      labels:
        app: redis-sts
    spec:
      containers:
      - image: redis:5-alpine
        name: redis
        ports:
        - containerPort: 6379

We can see that in the YAML file, besides the kind which must be “StatefulSet”, there is an additional field “serviceName” in the spec. The rest of the file is exactly the same as a Deployment, such as replicas, selector, template, etc.

These two differences are actually the key distinctions between StatefulSet and Deployment. To truly understand this, we need to analyze the usage of StatefulSet in Kubernetes.

How to use StatefulSet in Kubernetes #

Let’s use kubectl apply to create a StatefulSet object and use kubectl get to see what it looks like:

kubectl apply -f redis-sts.yml
kubectl get sts
kubectl get pod

Image

From the screenshot, you should be able to see that the Pods managed by the StatefulSet are no longer randomly named, but have sequential numbers starting from 0. They are respectively named redis-sts-0 and redis-sts-1. Kubernetes will create them in this order (Pod 0 has a higher AGE than Pod 1), which solves the first problem of a “stateful application”: startup order.

With the startup order determined, how does the application know its own identity to determine the dependencies between them?

The method Kubernetes provides is the hostname, that is, the hostname inside each Pod. Let's log in to the Pod with kubectl exec and take a look:

kubectl exec -it redis-sts-0 -- sh

Image

In the Pod, you can check the environment variable $HOSTNAME or execute the command hostname to get the name of this Pod, which is redis-sts-0.

With this unique name, the application can decide on its dependencies. For example, in this Redis example, the initially started Pod (0) can be the master instance while the subsequently started Pod (1) can be the slave.
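As a sketch of how an application could use this (the logic below is illustrative, not something built into Redis), a startup script might extract the ordinal from the end of the hostname and pick a role accordingly:

```shell
#!/bin/sh
# Illustrative: derive a Pod's role from the StatefulSet ordinal
# at the end of its hostname (e.g. "redis-sts-0" -> master).
role_for() {
  ordinal="${1##*-}"          # strip everything up to the last "-"
  if [ "$ordinal" = "0" ]; then
    echo master
  else
    echo slave                # non-zero ordinals replicate from Pod 0
  fi
}

role_for redis-sts-0   # prints: master
role_for redis-sts-1   # prints: slave
```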

After solving the startup order and dependencies, the third problem that needs to be addressed is network identification, which requires the use of a Service object.

However, there is one oddity here: we cannot use the kubectl expose command to generate a Service for the StatefulSet directly; we can only write the YAML by hand. But after all the exercises so far, writing a Service object should be easy for you.

Since it cannot be automatically generated, you need to be careful when writing the Service object. metadata.name must be the same as the serviceName in the StatefulSet, and the labels in the selector must also be consistent with those in the StatefulSet:

apiVersion: v1
kind: Service
metadata:
  name: redis-svc

spec:
  selector:
    app: redis-sts

  ports:
  - port: 6379
    protocol: TCP
    targetPort: 6379

After writing the Service, use kubectl apply to create the object:

Image

You can see that this Service is nothing special. It also selects the two Pods managed by the StatefulSet using the label selector and finds their IP addresses.

However, the secret of the StatefulSet lies in its domain name.

Do you remember how Service domain names work, as covered in Lecture 20? A Service itself has a domain name of the form “object_name.namespace”, and each Pod also gets a domain name of the form “IP_address.namespace” (with the dots in the IP replaced by dashes). But because IP addresses are unstable, the Pod's domain name is not very practical, so we usually use the stable Service domain name.

When we apply a Service object to a StatefulSet, the situation is different.

The Service realizes that these Pods are not ordinary applications but stateful applications that require stable network identification. As a result, it creates another new domain name for each Pod, with the format “pod_name.service_name.namespace.svc.cluster.local”. Of course, this domain name can also be abbreviated as “pod_name.service_name”.
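A quick sketch of how such a name is assembled (the “default” namespace here is an assumption, and the helper is purely illustrative):

```shell
#!/bin/sh
# Build the stable per-Pod DNS name a StatefulSet Pod gets:
#   pod_name.service_name.namespace.svc.cluster.local
pod_dns() {
  pod="$1"; svc="$2"; ns="${3:-default}"   # namespace defaults to "default"
  echo "$pod.$svc.$ns.svc.cluster.local"
}

pod_dns redis-sts-0 redis-svc   # prints: redis-sts-0.redis-svc.default.svc.cluster.local
```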

Let’s use kubectl exec to enter the Pod and use the ping command to verify:

kubectl exec -it redis-sts-0 -- sh

Image

Obviously, both Pods in the StatefulSet have their own domain names, which provide stable network identification. So, going forward, as long as the external client knows the StatefulSet object, it can access a specific instance using a fixed number. Although the Pod’s IP address may change, this numbered domain name is maintained by the Service object and remains stable.

With the combination of StatefulSet and Service, Kubernetes solves the three problems of startup order, dependencies, and network identification for “stateful applications”. Communication and coordination between the instances, however, is still up to the application itself.

Regarding the Service, there is one more thing worth mentioning.

The original purpose of a Service is load balancing: it should stand in front of the Pods and forward traffic to them. For a StatefulSet, however, this is unnecessary, because the Pods already have stable domain names and external clients should not access them through the Service layer. So, for security and to save system resources, we can add the field clusterIP: None to the Service (making it a “headless Service”), telling Kubernetes not to allocate an IP address for this object.
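Applied to the Service we wrote above, the headless version is only a one-line change (a sketch, with the same names as before):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-svc

spec:
  clusterIP: None      # headless: no ClusterIP is allocated
  selector:
    app: redis-sts

  ports:
  - port: 6379
    protocol: TCP
    targetPort: 6379
```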

I have created an illustration showing the relationship between StatefulSet and Service objects. You can refer to it for the interdependence between the fields:

Image

How to achieve data persistence with StatefulSets #

Now that StatefulSets have fixed names, startup orders, and network identifiers, we can implement management of stateful applications by adding data persistence functionality to them.

We can use the knowledge of PersistentVolume and NFS learned in the previous lesson to define a StorageClass, write a PVC, and mount a Volume to the Pod.

However, to emphasize the one-to-one binding between persistent storage and the StatefulSet, Kubernetes defines a dedicated field for StatefulSets called “volumeClaimTemplates”, which embeds the PVC definition directly in the StatefulSet's YAML. This ensures that when the StatefulSet is created, a PVC is automatically created for each Pod, making StatefulSets easier to use.

The “volumeClaimTemplates” field may be a bit hard to grasp at first. You can compare it with a Pod's template field and a Job's jobTemplate field: it is the same kind of nested object structure, and inside it is just an ordinary PVC that requests a StorageClass.

Let’s modify the Redis StatefulSet object we just created to include the persistence storage functionality:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-pv-sts

spec:
  serviceName: redis-pv-svc

  volumeClaimTemplates:
  - metadata:
      name: redis-100m-pvc
    spec:
      storageClassName: nfs-client
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 100Mi

  replicas: 2
  selector:
    matchLabels:
      app: redis-pv-sts

  template:
    metadata:
      labels:
        app: redis-pv-sts
    spec:
      containers:
      - image: redis:5-alpine
        name: redis
        ports:
        - containerPort: 6379

        volumeMounts:
        - name: redis-100m-pvc
          mountPath: /data

This YAML file is relatively long and has a lot of content, but as long as you are a little patient and analyze it module by module, you will quickly understand it.

First, the StatefulSet object is named “redis-pv-sts”, indicating that it uses PV storage. Then, in the “volumeClaimTemplates” field, we define a PVC named “redis-100m-pvc” that requests 100Mi of NFS storage. In the Pod template, volumeMounts references this PVC and mounts the shared disk at the “/data” directory, which is Redis's data directory.

The following figure shows the complete relationship diagram of this StatefulSet object:

Figure

Finally, use kubectl apply to create these objects, and a stateful application with persistence functionality will be up and running:

kubectl apply -f redis-pv-sts.yml

You can use the command kubectl get pvc to check the status of the storage volumes associated with the StatefulSet:

Figure

Looking at the names of these two PVCs, they are not random but follow a pattern: each is the PVC template name combined with the Pod name. So even if a Pod is destroyed, because its name stays the same, its PVC can still be found, and by reusing it the previously stored data can be accessed again.
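The naming rule can be sketched as follows (an illustrative helper, not a kubectl feature):

```shell
#!/bin/sh
# A StatefulSet's PVCs are named <claim-template-name>-<pod-name>,
# and the Pod name is itself <statefulset-name>-<ordinal>.
pvc_name() {
  template="$1"; sts="$2"; ordinal="$3"
  echo "$template-$sts-$ordinal"
}

pvc_name redis-100m-pvc redis-pv-sts 0   # prints: redis-100m-pvc-redis-pv-sts-0
pvc_name redis-100m-pvc redis-pv-sts 1   # prints: redis-100m-pvc-redis-pv-sts-1
```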

Now let’s verify this. Use kubectl exec to run the Redis client and add some key-value data to it:

kubectl exec -it redis-pv-sts-0 -- redis-cli

Here, I set two values, “a=111” and “b=222”.

Now let’s simulate an unexpected accident and delete this Pod:

kubectl delete pod redis-pv-sts-0

Since a StatefulSet watches over its Pod instances just as a Deployment does, when the number of Pods drops it quickly creates replacements, and their names and network identifiers are exactly the same as before:

Figure

What about the data stored in Redis? Has persistence storage been used and has the data been fully restored?

You can log in to the Redis client again to check:

kubectl exec -it redis-pv-sts-0 -- redis-cli

Figure

Since we mounted the NFS network storage at the Pod's /data directory, Redis periodically snapshots its data to disk there. So when the newly created Pod mounts the directory again, Redis loads the snapshot file and the data in memory is fully restored.

Summary #

Alright, today we learned about the API object StatefulSet, which is specifically designed for deploying “stateful applications”. It is very similar to Deployment, but the difference is that the Pods managed by StatefulSet have fixed names, startup sequences, and network identifiers. These features are important for applications that have relationships such as master-slave or primary-secondary in a cluster.

Let me briefly summarize today’s content:

  1. The YAML description of StatefulSet is almost identical to Deployment, with the addition of a key field serviceName.
  2. To generate stable domain names for Pods in StatefulSet, a Service object needs to be defined. The name of the Service must be consistent with serviceName in StatefulSet.
  3. Accessing StatefulSet should be done using the individual domain names of each Pod, in the format of “PodName.ServiceName”, and the load balancing feature of the Service should not be used.
  4. In StatefulSet, PVC (PersistentVolumeClaim) can be defined directly using the “volumeClaimTemplates” field, allowing Pods to achieve data persistence.

Homework #

Now it’s time for homework. I have two questions for you to ponder:

  1. With the fixed name and startup order provided by StatefulSet, what else does an application need to do to implement a master-slave or other dependency relationship?
  2. Is it possible to define PVC without embedding it in “volumeClaimTemplates”? What consequences could arise from this?

Please feel free to participate in the discussion in the comment section and share your thoughts. See you in the next class.

Image