02 Basic Concepts of Containers #

Containers and Images #

What is a container? #

Before introducing the specific concept of containers, let’s briefly review how an operating system manages processes.

After logging into the operating system, we can see various processes through commands like ps, including system services and user applications. What are the characteristics of these processes?

  • Firstly, these processes can see and communicate with each other;
  • Secondly, they use the same file system and can read and write the same files;
  • Thirdly, they share the same system resources, such as CPU and memory.

What problems do these three characteristics bring?

  • Because these processes can see and communicate with each other, a process with higher privileges can attack other processes;
  • Because they share the same file system, two problems arise: first, a process can manipulate existing data, and a process with higher privileges may delete or corrupt other processes’ data, disrupting their normal operation; second, processes’ dependencies may conflict (for example, two applications requiring different versions of the same library), which puts great pressure on operations and maintenance;
  • Because these processes use the resources of the same host machine, applications may contend for resources. When one application consumes a large amount of CPU or memory, it may disrupt the other applications and prevent them from providing service normally.

How can we provide a separate runtime environment for processes to address these three problems?

  • To address the problems caused by different processes sharing the same file system, Linux and Unix systems can use the chroot system call to change a process’s root directory to a subdirectory, achieving view-level isolation of the file system. With chroot, a process has an independent file system, and creating, reading, updating, or deleting files on it will not affect other processes;
  • To address the problem that processes can see and communicate with each other, namespaces are used to isolate processes at the resource-view level (PIDs, hostname, network, and so on). With chroot and namespaces together, a process can run in an independent environment;
  • However, even in such an isolated environment, processes still share the resources of the same operating system, and some processes may consume them excessively. To reduce the impact between processes, cgroups can be used to limit resource usage, such as CPU time and memory (see the sketch after this list).
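
A rough sketch of these three building blocks, assuming root privileges, a cgroup v2 system, and a prepared minimal root file system at the hypothetical path /path/to/rootfs:

```bash
# 1. chroot: give a process an independent file system view
chroot /path/to/rootfs /bin/sh

# 2. namespaces: give a process independent views of PIDs, hostname,
#    mounts, network, etc. (here via the unshare utility)
unshare --pid --uts --mount --net --fork /bin/sh

# 3. cgroups: limit how much CPU and memory the processes may consume
mkdir /sys/fs/cgroup/demo
echo "200000 1000000" > /sys/fs/cgroup/demo/cpu.max   # at most 20% of one CPU
echo "512M" > /sys/fs/cgroup/demo/memory.max          # at most 512 MiB
echo $$ > /sys/fs/cgroup/demo/cgroup.procs            # move this shell in
```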

So, how should we define such a collection of processes?

In fact, a container is a collection of processes that is isolated at the view level, has restricted resources, and uses an independent file system. “Isolated at the view level” means the container sees only its own subset of processes and has its own hostname, and so on; restricting resources means its memory size and CPU usage can be limited. In short, a container is a collection of processes isolated from the rest of the system, with its own independent resource view.

A container also has an independent file system. Since the container shares the host’s kernel, this file system does not need to contain kernel code or kernel tools; it only needs to provide the binaries, configuration files, and dependencies the application requires. As long as these files are available at runtime, the container can run.
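
This is easy to verify in practice. A sketch, assuming Docker is installed: exporting a container’s file system shows that a minimal image such as alpine contains only binaries, libraries, and configuration files, and no kernel.

```bash
docker create --name tmp alpine     # create (but don't start) a container
docker export tmp | tar -t | head   # list the files in its file system
docker rm tmp
```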

What is an image? #

Based on the above, we call the collection of all the files required at a container’s runtime a container image. So, how are images usually built? Normally, we use a Dockerfile, because it provides a convenient syntax for describing each step of the build process. Each build step modifies the existing file system, and the resulting set of file-system changes is called a changeset. By applying the changesets produced by the build steps, in order, to an empty folder, we obtain a complete image. The layering and reuse of changesets bring several advantages:

  • Firstly, it improves distribution efficiency: a large image split into smaller chunks can be downloaded in parallel.
  • Secondly, since the layer data is shared, when some of it already exists in local storage, only the data that is missing locally needs to be downloaded. For example, if we already have the alpine image in local storage, then when downloading a golang image based on alpine, we only need to download the parts the local alpine image does not cover.
  • Thirdly, because image data is shared, a lot of disk space can be saved. For instance, suppose the alpine image is 5MB and the golang image is 300MB. With both in local storage and no reuse, they would occupy 305MB in total; with reuse, only 300MB is needed.
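
This reuse can be observed directly. A sketch (the image tags are illustrative): if the alpine layers are already present locally, pulling an alpine-based image reports the shared layers as “Already exists” and downloads only the rest.

```bash
docker pull alpine:latest
docker pull golang:alpine       # shared base layers are not downloaded again
docker history golang:alpine    # shows the image as a stack of layers
```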

How to build images? #

The following Dockerfile describes how to build a simple golang application.

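The original figure is not reproduced here; below is a minimal sketch consistent with the five steps described next (the Go version, directory, and binary name are illustrative assumptions):

```dockerfile
FROM golang:1.12-alpine

# All subsequent build steps run in this directory
WORKDIR /go/src/app

# Copy the application sources from the host into the image
COPY . .

# Build the application inside the image's file system
RUN go build -o /usr/local/bin/app .

# Default program to run when a container starts from this image
CMD ["app"]
```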

Walking through it line by line:

  1. The FROM line indicates which image the build is based on; as mentioned earlier, images can be reused.
  2. The WORKDIR line sets the directory in which the following build steps take place. Its function is similar to the cd command in a shell.
  3. The COPY line copies files from the host machine into the container image.
  4. The RUN line represents a command executed within the image’s file system. After these commands run, the built application is present in the image.
  5. The CMD line specifies the default program to run when a container is started from the image.

Once we have the Dockerfile, we can build the desired image with the docker build command. The resulting image is stored locally; typically, image building is done on a build server or in some other isolated environment.

So, how do we run these images in production or testing environments? This requires a hub for central storage, called a Docker registry, which is responsible for storing all the generated image data. We can use the docker push command to push a local image to the registry, so that production or testing environments can download and run it.
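
For example (a sketch; the registry address and image name are illustrative):

```bash
# Build an image from the Dockerfile in the current directory and tag it
docker build -t registry.example.com/demo/app:v1 .

# Push the image to the registry for other environments to pull
docker push registry.example.com/demo/app:v1
```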

How to run containers? #

Running a container generally involves three steps (a combined sketch follows the list):

  • The first step is to download the relevant image from the Docker registry.
  • After the image is downloaded, we can use the docker images command to view the local images, which provides a complete list from which we can select the desired image.
  • Once the image is selected, we can use the docker run command to run this image and obtain the desired container. We can run multiple containers by repeating this process. An image is like a template, and a container is like a specific running instance, which is why images can be built once and run anywhere.
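
Put together (the image name is an illustrative assumption):

```bash
# 1. Download the image from the Docker registry
docker pull registry.example.com/demo/app:v1

# 2. View the local images and pick the one to run
docker images

# 3. Run containers from the image; each run creates a new instance
docker run -d --name app1 registry.example.com/demo/app:v1
docker run -d --name app2 registry.example.com/demo/app:v1
```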

Summary #

To recap, a container is a collection of processes isolated from the rest of the system in terms of process view, network resources, and file system. An image is the collection of all the files a container requires at runtime, and it can be built once and run anywhere.

Lifecycle of Containers #

Lifecycle of Containers at Runtime #

A container is a collection of isolated processes. When running docker run, we choose an image to provide an independent file system and specify a program to run in it. This specified program is called the initial process. When the initial process starts, the container starts; when the initial process exits, the container exits.
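
This binding is easy to observe. A sketch, assuming the alpine image:

```bash
# The initial process is `sleep 30`; the container lives as long as it does
docker run -d --name demo alpine sleep 30
docker ps      # lists the container as running (for roughly 30 seconds)

# Once `sleep` exits, the container exits as well
docker ps -a   # after 30 seconds, the container shows as Exited
```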

Therefore, the lifecycle of a container is consistent with the lifecycle of its initial process. Of course, the initial process is not necessarily the only process in the container: it can create child processes, and operations such as docker exec can start additional processes. All of these are managed by the initial process, and when the initial process exits, all child processes exit as well, to prevent resource leakage.

However, this approach also causes a problem: programs inside a container often have state and may generate important data, and when a container is deleted after exiting, that data is lost, which is unacceptable for applications. Therefore, the important data generated by a container needs to be persisted. A container can persist data directly to a specified directory, which is called a data volume.

Data volumes have a defining characteristic: the lifecycle of a data volume is independent of the lifecycle of the container. In other words, creating, running, stopping, or deleting the container does not affect the data volume, because it is a special directory used for container persistence. In summary, we can mount a data volume into the container so that the container writes data to the corresponding directory, and that data is not lost when the container exits.

In general, there are two main approaches to manage data volumes:

  • The first approach is to bind-mount a host directory directly into the container. This approach is simple, but it carries operational costs because it depends on specific host directories and requires unified management of all host machines.
  • The second approach is to delegate directory management to the runtime engine (see the sketch after this list).
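
A sketch of both approaches (the paths, names, and image are illustrative):

```bash
# Approach 1: bind-mount a host directory into the container
docker run -d --name app1 -v /data/app:/var/lib/app registry.example.com/demo/app:v1

# Approach 2: let the engine manage the volume
docker volume create app-data
docker run -d --name app2 -v app-data:/var/lib/app registry.example.com/demo/app:v1

# Deleting the container does not delete the volume or its data
docker rm -f app2
docker volume ls
```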

Container Project Architecture #

Moby Container Engine Architecture #

Moby is currently the most popular container management engine. The Moby daemon provides management of containers, images, networks, and volumes. The most important component that the Moby daemon relies on is containerd. Containerd is a container runtime management engine that is independent of the Moby daemon. It provides management of containers and images.
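
In fact, containerd can be driven directly, without the Moby daemon, through its own ctr client. A sketch, assuming a running containerd with default settings:

```bash
# Pull an image and run a container straight through containerd
ctr image pull docker.io/library/alpine:latest
ctr run -d docker.io/library/alpine:latest demo sleep 100
ctr task ls    # list running tasks (container processes)
```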

Underneath containerd there is the containerd shim module, a small daemon-like process that sits between containerd and each container. There are several reasons for this design:

  • Firstly, containerd needs to manage the lifecycle of containers that may be created by different container runtimes, so it needs flexible, plugin-style management. A shim is developed per container runtime, so runtimes can be decoupled from containerd and managed as plugins.
  • Secondly, because shims are separate processes, running containers can be dynamically taken over by containerd. Without this capability, if the Moby daemon or the containerd daemon unexpectedly exited, no process would be left managing the containers, and they would be killed, affecting the running applications.
  • Finally, Moby or containerd may need to be upgraded at any time. Without the shim mechanism, in-place upgrades, and upgrades that do not impact running business, would be impossible. The containerd shim is therefore very important, as it enables this dynamic takeover.

This section only provides a rough introduction to Moby, and more detailed information will be covered in subsequent courses.

Container vs. VM #

Differences between Containers and VMs #

VMs utilize hypervisor virtualization technology to emulate hardware resources such as CPU and memory, allowing for the creation of a guest OS on a host machine, commonly referred to as installing a virtual machine.

Each guest OS has an independent kernel, such as Ubuntu, CentOS, or even Windows. Under such a guest OS, applications are isolated from each other, so a VM provides stronger isolation. However, this isolation comes at a cost: part of the computing resources must be spent on virtualization, making it difficult to fully utilize the existing resources. In addition, each guest OS takes a significant amount of disk space; for example, installing Windows requires 10~30GB of disk space, and Ubuntu requires 5~6GB. Launching a VM is also slow.

It is because of these drawbacks of virtual machine technology that container technology emerged. Containers work at the process level, so no guest OS is needed, only an independent file system providing the required collection of files. Isolation happens at the process level, so containers start faster than VMs and require far less disk space. However, process-level isolation is not as strong as one might imagine; it is much weaker than the isolation provided by VMs.

In summary, containers and VMs have their own advantages and disadvantages, and container technology is also evolving towards stronger isolation.

Summary of this section #

  • A container is a collection of processes with its own isolated view of the system.
  • An image is a collection of all the files a container needs, and it features building once and running anywhere.
  • The lifecycle of a container is the same as that of the initial process.
  • Containers and virtual machines (VMs) have their own advantages and disadvantages, and container technology is evolving towards stronger isolation.