01 Introduction to Containers: Beginning Is the Hardest Part #

Hello, I’m Chrono.

In the pre-class preparation, we set up a Linux virtual machine environment using VirtualBox/VMware. With that foundation in place, today we begin our formal learning.

As the saying goes, “the beginning is always the hardest.” This is especially true for a field as vast and unfamiliar as Kubernetes, where taking the first step matters most. So today we will start from the simplest, most basic knowledge and talk about the most popular container technology: Docker. We will first set up an experimental environment and then get hands-on to demystify it.

The Birth of Docker #

Nowadays we are all familiar with terms like container and Kubernetes, but do you know how it all began - with Docker, the starting point of everything?

Nine years ago, on March 15, 2013, PyCon, a conference for the Python developer community, was held in Santa Clara, California. The conference focused on exploring Python development techniques and applications, and had no particular connection to the oft-mentioned “cloud,” “PaaS,” or “SaaS.”

Toward the end of that day’s schedule came a lightning-talk segment, in which one developer took 5 minutes to deliver a talk titled “The future of Linux Containers.” He ran over his time limit, however, and was cut off by the host, causing a slightly awkward moment (you can revisit this historically significant video here).

I believe you have already guessed it: this short 5-minute technical demo was the beginning of the cloud-native revolution now sweeping the entire industry. It was in this talk that Solomon Hykes, founder of dotCloud (the company that later became Docker, Inc.), first demonstrated Docker to the world.

Five minutes is very little time, yet the talk packed in several concepts that are now widely adopted but were quite novel at the time: containers, images, and isolated process execution. The information density of those 5 minutes was remarkable.

After PyCon 2013, many people recognized the value and importance of containers, realizing that they could solve the packaging, deployment, management, and operations problems that had long plagued cloud vendors. Docker quickly took off and became a star project on GitHub. Over the following months it attracted the attention of big companies such as Amazon, Google, and Red Hat, each of which brought its own technical strengths to bear on the container concept and pushed it forward, ultimately giving rise to the supreme ruler we see today - Kubernetes.

The Form of Docker #

Alright, let’s now “recreate the scene”: we will set up a container runtime environment on our Linux virtual machine and replay what Solomon Hykes first showed the world.

Of course, after nine years of development, Docker is far from what it used to be. However, the core concepts and operations have remained consistent without significant changes.

Firstly, we need to understand the form of Docker. Currently, there are basically two options for using Docker: Docker Desktop and Docker Engine.

Docker Desktop is specifically designed for personal use, offering quick installation for Mac and Windows. It has an intuitive graphical interface and integrates many peripheral tools for ease of use.

However, I personally don’t particularly recommend Docker Desktop, for two reasons. First, it is a commercial product and inevitably carries Docker’s own “personal touch,” including some non-standard features that won’t help our subsequent learning of Kubernetes. Second, it is free only for personal use, with restrictions in its license terms that can cause problems in daily work.

In contrast, Docker Engine is completely free, but it can only run on Linux and operates only through the command line. It lacks auxiliary tools and requires us to create the runtime environment on our own. However, when it comes to authenticity, Docker Engine is the true form of Docker from the beginning. It has the purest “lineage” and is the Docker product used by various companies in production environments. After all, 99% of servers in data centers run on Linux.

Therefore, in the following learning process, I recommend using Docker Engine. In this column, unless otherwise specified, the term “Docker” usually refers to Docker Engine.

Installation of Docker #

In the preparation before the class, we have already installed some commonly used software in the Linux virtual machine using the Ubuntu package management tool apt. Therefore, we can still use the same method to install Docker.

Let’s try entering the command docker. We will get a “command not found” prompt and a suggestion on how to install it:

Command 'docker' not found, but can be installed with:
sudo apt install docker.io

So you just need to follow the system’s suggestion and install docker.io. To make things easier, you can add the -y parameter to skip the confirmation prompt:

sudo apt install -y docker.io # Install Docker Engine

Unlike Docker Desktop, Docker Engine is not ready to use straight after installation; it needs a couple of manual adjustments. So after installing, you need to execute the following two commands:

sudo service docker start         # Start the Docker service
sudo usermod -aG docker ${USER}   # Add the current user to the Docker group

The first command, service docker start, starts Docker’s background service; the second, usermod -aG, adds the current user to the docker user group. The reason is that operating Docker requires root privileges, but working directly as root is not safe enough; adding your user to the docker group is the better choice and the one Docker officially recommends. Of course, purely for convenience, you can also switch to the root user to operate Docker.

After executing these three commands, we also need to log out of the system (with the exit command) and log back in so that the usermod group change takes effect.
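
If you don’t want to log out right away, a common workaround is newgrp, which starts a subshell with the new group membership already applied (behavior can vary slightly between shells and sessions, so a full re-login remains the most reliable option):

newgrp docker     # start a subshell where the docker group is already active
docker version    # should now work without sudo inside that subshell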

Now we can verify if Docker has been successfully installed using the commands docker version and docker info.

docker version will output the version information of Docker client and server respectively:

![Image](../images/fa0088c858d63d6b423155f854a1ddf9.png)

Here are the key version numbers and system information that I extracted from it. You can see that I am using Docker Engine 20.10.12, the system is Linux, and the hardware architecture is arm64, which is Apple M1:

Client:
 Version:           20.10.12
 OS/Arch:           linux/arm64
Server:
 Engine:
  Version:          20.10.12
  OS/Arch:          linux/arm64

docker info will display the Docker system-related information, such as CPU, memory, container count, image count, container runtime, storage file system, etc. Here, I also extracted a part of it:

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 8
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
 Cgroup Driver: systemd
 Default Runtime: runc
 Kernel Version: 5.13.0-19-generic
 Operating System: Ubuntu Jammy Jellyfish (development branch)
 OSType: linux
 Architecture: aarch64
 CPUs: 2
 Total Memory: 3.822GiB
 Docker Root Dir: /var/lib/docker

The information displayed by docker info is very useful for understanding Docker’s internal running state. For example, here you can see that there is currently one container in the stopped state, there are 8 images, the storage filesystem is overlay2, the Linux kernel is 5.13, the operating system is Ubuntu 22.04 “Jammy Jellyfish,” the hardware architecture is aarch64, there are two CPUs, and the memory is roughly 4GB.

Using Docker #

Now that we have a working Docker environment, let’s recreate Solomon Hykes’ brief technical demo from 9 years ago.

First, we use the docker ps command, which lists the containers currently running on the system, much like the ps command lists running processes in a Linux system.

Note that all Docker operations follow this format: the command starts with docker, followed by a specific subcommand; the earlier docker version and docker info follow the same rule. You can run docker help, or add --help to any command, to see the command list and more detailed explanations.
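
For example, these two commands show the top-level command list and the detailed help for a single subcommand:

docker --help        # list all available subcommands
docker ps --help     # detailed help for one specific subcommand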

Since we just installed the Docker environment, there are no running containers at this time, so the list is obviously empty.

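On a brand-new installation, docker ps prints nothing but the header row, roughly like this:

CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES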

Next, let’s try another important command, docker pull, to pull a busybox image from an external image repository (Registry). You can think of it as downloading a software package using “apt install” in Ubuntu:

docker pull busybox      # Pull the busybox image

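For reference, the output looks roughly like this (the layer IDs and the digest will differ on your machine):

Using default tag: latest
latest: Pulling from library/busybox
...: Pull complete
Digest: sha256:...
Status: Downloaded newer image for busybox:latest
docker.io/library/busybox:latest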

docker pull will output some strange-looking information, but we don’t need to worry about it for now. The subsequent lessons will explain it in detail.

Let’s now execute the docker images command, which lists all the images stored by Docker:

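The listing looks roughly like this (the image ID and creation date will differ):

REPOSITORY   TAG       IMAGE ID       CREATED       SIZE
busybox      latest    ...            ...           1.41MB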

As you can see, the command displays an image called busybox, with a hexadecimal ID and a size of 1.41MB.

Now, we are going to start a container from this image using the docker run command and execute the echo command to output a string. This is the most exciting part demonstrated by Solomon Hykes at the conference:

docker run busybox echo hello world

This command will output the most famous phrase in the computer world, “hello world,” in our terminal:

hello world

Then, if we run docker ps with the -a parameter, we can also see this container, now in the exited state:

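The output is roughly as follows (the container ID and the auto-generated name will differ):

CONTAINER ID   IMAGE     COMMAND              CREATED   STATUS       PORTS   NAMES
...            busybox   "echo hello world"   ...       Exited (0)           ...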

The above basically covers the content of Solomon Hykes’ lightning talk.

If you are new to containers, these commands may leave you puzzled: nothing here seems particularly magical, and it might look easier to just write a shell script.

You are not alone in thinking so; most of the audience at PyCon 2013 probably had the same question. Don’t worry - we will unravel the mysteries behind these commands step by step in the lessons that follow.

Architecture of Docker #

Here I will briefly explain the architecture of Docker Engine to give you a preliminary understanding and lay the groundwork for further learning.

The following diagram is from the Docker official website (https://docs.docker.com/get-started/overview/). It accurately describes the internal roles and workflow of Docker Engine, providing very insightful guidance for our study and research.

Earlier, when we typed docker on the command line, we were actually running a client that communicates with the Docker daemon, the background service of Docker Engine. Images are stored in a remote repository called a Registry, which the client never accesses directly.

The Docker client can send requests to the Docker daemon by using commands like build, pull, and run. The Docker daemon acts as the “caretaker” of containers and images, responsible for pulling images from remote repositories, storing images locally, creating containers from images, managing containers, and many other functions.

Therefore, in Docker Engine, the real work is actually done by the Docker daemon silently running in the background, and the command-line tool “docker” that we actually operate is just a “loudspeaker”.
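
You can observe this client/server split directly. The client and the daemon are separate programs talking over a socket; by default this is a local Unix socket, but the -H flag (or the DOCKER_HOST environment variable) can point the client at any daemon, even one on another machine. A quick sketch:

ls -l /var/run/docker.sock                   # the Unix socket the client connects to by default
docker -H unix:///var/run/docker.sock info   # the same docker info request, with the endpoint spelled out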

Docker also provides a “hello-world” example to demonstrate the detailed workflow from Docker client to Docker daemon and then to the Registry. You just need to execute the following command:

docker run hello-world

It will first check the local image, pull from the remote repository if not found, then run the container, and finally output the runtime information:

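The message it prints, abridged here, walks through exactly this client/daemon/Registry workflow in its own words:

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from Docker Hub.
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.
...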

Summary #

Alright, today we have gained a preliminary understanding of container technology. Let’s briefly summarize the main points:

  1. Today’s wave of container technology began with Docker, which currently offers two products: Docker Desktop and Docker Engine. In this course we recommend the free Docker Engine, which can be installed on Ubuntu directly with the apt command.
  2. Docker Engine requires command line operations, with the main command being docker, followed by various subcommands.
  3. The commands for viewing basic information about Docker are docker version and docker info. Other commonly used commands include docker ps, docker pull, docker images, and docker run.
  4. Docker Engine follows a typical client/server (C/S) architecture: the command-line tool docker interacts directly with the user, while the Docker daemon and the Registry cooperate behind the scenes to do the real work.

Homework #

Finally, it’s time for homework. I have two questions for you to consider:

  1. After completing this lesson, what is your understanding and impression of container technology and Docker?
  2. Why did Docker Engine adopt a Client/Server (C/S) architecture? What are its benefits?

Feel free to leave your comments in the discussion section. If you find it helpful, please share it with your friends and learn together. See you in the next lesson.