46 How to Make Docker Images

Hello, I am Kong Lingfei.

To implement cloud-native architecture, one key aspect is using containers to deploy our applications. If we want to deploy applications using containers, creating a Docker image for the application is an essential step. Today, I will provide a detailed guide on how to create a Docker image.

In this tutorial, I will first explain the principles and methods of Docker image construction. Then, I will introduce the instructions of Dockerfile and how to write a Dockerfile. Finally, I will discuss some best practices to follow when writing a Dockerfile.

Principles and Methods of Docker Image Building

First, let’s take a look at the principles and methods of building Docker images.

There are two common ways to build a Docker image:

Use the docker commit command to build an image based on an existing container.
Write a Dockerfile and use the docker build command to build the image.

In both of these methods, the underlying principle of image building is the same. The image building process consists of the following three steps:

Start a Docker container based on the original image.
Perform operations in the container, such as executing commands and installing files. Any file changes resulting from these operations are recorded in the container’s storage layer.
Commit the changes in the container’s storage layer to a new image layer and add it to the original image.
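To make these three steps concrete, here is a minimal shell sketch that performs them by hand. The function name build_layer_by_hand and its arguments are hypothetical, and it assumes the base image provides sh. It illustrates the principle rather than what docker build literally runs.

```shell
# build_layer_by_hand BASE_IMAGE NEW_IMAGE COMMAND
# Hypothetical helper that mirrors the three steps of image building.
build_layer_by_hand() {
  base_image="$1"
  new_image="$2"
  cmd="$3"

  # Step 1: create a container from the original image.
  container_id=$(docker create "$base_image" sh -c "$cmd") || return 1

  # Step 2: run the command; file changes land in the container's storage layer.
  docker start -a "$container_id" || { docker rm "$container_id"; return 1; }

  # Step 3: commit the storage layer as a new image layer on top of the base image.
  docker commit "$container_id" "$new_image"
  status=$?

  # Remove the temporary container; the new image keeps the committed layer.
  docker rm "$container_id" >/dev/null
  return "$status"
}
```

For example, build_layer_by_hand centos:centos8 demo:test 'touch /hello' (demo:test is a made-up name) would produce an image that is centos:centos8 plus one extra layer containing /hello.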

Now, let’s explain these two methods of building Docker images in detail.

Building Images Using the `docker commit` Command

We can build an image using the docker commit command, whose format is docker commit [OPTIONS] CONTAINER [REPOSITORY[:TAG]].

In the following example, we build the Docker image ccr.ccs.tencentyun.com/marmotedu/iam-apiserver-amd64:test in four steps:

Execute docker ps to obtain the container ID 48d1dbb89a7f of the container that we want to build an image from.
Execute docker pause 48d1dbb89a7f to pause the container 48d1dbb89a7f so that its state does not change while the image is being built.
Execute docker commit 48d1dbb89a7f ccr.ccs.tencentyun.com/marmotedu/iam-apiserver-amd64:test to build a Docker image based on the container ID 48d1dbb89a7f.
Execute docker images ccr.ccs.tencentyun.com/marmotedu/iam-apiserver-amd64:test to check whether the image is successfully built.
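The steps above can be collected into a small shell helper. This is a minimal sketch: the function name snapshot_container is hypothetical, and the docker unpause step is an addition that is not in the list above, included so that the container resumes afterwards. By default docker commit also pauses the container while it commits, so the explicit pause mainly mirrors the steps as written.

```shell
# snapshot_container CONTAINER_ID IMAGE
# Hypothetical helper: pause a container, commit it to an image, resume it,
# then list the resulting image.
snapshot_container() {
  container_id="$1"
  image="$2"

  docker pause "$container_id" &&
    docker commit "$container_id" "$image"
  status=$?

  # Resume the container whether or not the commit succeeded.
  docker unpause "$container_id"

  # Show the new image only if the commit worked.
  [ "$status" -eq 0 ] && docker images "$image"
}
```

It would be called as snapshot_container 48d1dbb89a7f ccr.ccs.tencentyun.com/marmotedu/iam-apiserver-amd64:test.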

This method of image building is usually used in the following two scenarios:

Building temporary test images.
After a container is compromised, using docker commit to build an image from the compromised container, so that its state is preserved for later investigation.

Apart from these two scenarios, I do not recommend using docker commit to build images for a production environment. The main reasons for this are:

Images built using docker commit contain many useless files generated during compilation, software installation, and program execution, which makes the images bloated.
Images built using docker commit do not record the operations that produced them, so the build process cannot be reproduced, which makes the images hard to maintain.

Next, let’s take a look at how to build images using a Dockerfile.

Building Images Using a `Dockerfile`

In practical development, building images using a Dockerfile is the most common and standard method. A Dockerfile is a text file used by Docker to build images. It contains a series of instructions for building the image.

The docker build command reads the content of the Dockerfile and sends it to the Docker engine. The Docker engine then interprets each instruction in the Dockerfile and builds the required image.

The command format of docker build is docker build [OPTIONS] PATH | URL | -. PATH, URL, and - specify the build context, which contains the Dockerfile and the other files required to build the image. By default, docker build looks for a file named Dockerfile in the root of the context, but you can specify a different file with the -f, --file option. For example:

$ docker build -f Dockerfile -t ccr.ccs.tencentyun.com/marmotedu/iam-apiserver-amd64:test .

Building an image from a Dockerfile essentially means, for each instruction, creating a container from the current image, executing the instruction in that container, stopping the container, and committing the file changes in its storage layer. Compared to building images with docker commit, using a Dockerfile has three advantages:

The Dockerfile contains the complete process of image creation, allowing other developers to understand and reproduce the creation process using the Dockerfile.
Each instruction in the Dockerfile creates a new image layer, and these layers can be cached by the Docker daemon. When the image is built again, Docker tries to reuse the cached layers instead of rebuilding each one from scratch, saving time and disk space.
The build steps recorded from the Dockerfile can be viewed with the docker image history [image name] command, making it easy for developers to review the change history.

Here, we will use an example to explain in detail the process of building an image using a Dockerfile.

First, we need to write a Dockerfile. Below is the Dockerfile for iam-apiserver:

FROM centos:centos8
LABEL maintainer="<lingfei@example.com>"

RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
RUN echo "Asia/Shanghai" > /etc/timezone

WORKDIR /opt/iam
COPY iam-apiserver /opt/iam/bin/

ENTRYPOINT ["/opt/iam/bin/iam-apiserver"]

Here, centos:centos8 is chosen as the base image because it contains basic troubleshooting tools like vi, cat, curl, mkdir, and cp.

Next, execute the docker build command to build the image:

$ docker build -f Dockerfile -t ccr.ccs.tencentyun.com/marmotedu/iam-apiserver-amd64:test .

The build process after executing docker build is as follows:

Step 1: docker build packages the files in the context and sends them to the Docker daemon. If there is a .dockerignore file in the context, it will exclude the files that meet the .dockerignore rules from the upload list.

There is one exception: even if .dockerignore lists .dockerignore or Dockerfile, docker build still sends these two files to the Docker daemon, because the daemon needs them to perform the build. If an image tag is specified, the repository and tag are also validated.
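As an illustration of the exclusion rules in Step 1, a .dockerignore for a Go project might look like the sketch below. The entries (.git, _output/, *.log, keep.log) are assumed names and are not taken from the IAM repository. The syntax itself (one pattern per line, # for comments, ! to re-include a file) is the standard .dockerignore syntax.

```shell
# Write an example .dockerignore into the build context (the current directory).
cat > .dockerignore <<'EOF'
# Version-control metadata is not needed inside the image
.git
# Hypothetical local build output directory
_output/
# Log files directly under the context root
*.log
# Re-include one file that the pattern above would otherwise exclude
!keep.log
EOF
```

Note that running this overwrites any existing .dockerignore in the current directory, so try it in a scratch directory first.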

Step 2: docker build sends an HTTP request to the Docker server to build the image, including the necessary context information.

Step 3: After receiving the build request, the Docker server performs the following process to build the image:

Create a temporary directory and extract the files from the context into the directory.
Read and parse the Dockerfile, iterate through the instructions, and distribute them to different modules based on the command type.
The Docker build engine creates a temporary container for each instruction, executes the instruction in the temporary container, and then commits the container to generate a new image layer.
Finally, merge all the image layers generated by the instructions to form the final build result. The image ID generated by the last commit is the final image ID.

To improve build efficiency, docker build caches existing image layers by default. If a cached image layer is found during the image build, it will be used directly without rebuilding. If you do not want to use the cached image, you can specify the --no-cache=true parameter when executing the docker build command.

Docker's cache matching rules are as follows: starting from a base image that is already in the cache, Docker compares the current instruction with the instruction that built each child image derived from that base image. If no child image was built with exactly the same instruction, the cache does not match. For ADD and COPY instructions, Docker also compares the checksums of the files being added to the image; if the checksums differ, the cache does not match.

Please note that cache matching does not check the files in the container. For example, when the RUN apt-get -y update command is used to update the files in the container, the cache policy does not check these files to determine whether the cache matches.
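One practical consequence is that a RUN apt-get update placed on its own line can keep reusing a cached layer with an old package index. A common way around this, sketched below, is to keep the index update and the package installation in the same RUN instruction, so that they are cached or invalidated together. The base image tag, the package (curl), and the file name Dockerfile.cache-demo are assumptions chosen for illustration.

```shell
# Write a demonstration Dockerfile; the quoted 'EOF' keeps backslashes literal.
cat > Dockerfile.cache-demo <<'EOF'
FROM debian:bookworm-slim

# update and install share one instruction string, so changing the package
# list changes the instruction and forces a fresh "apt-get update".
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
EOF

# Build it with, for example:
#   docker build -f Dockerfile.cache-demo -t cache-demo:test .
```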

Finally, we can use the docker history command to view the build history of the image.
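As a small sketch of how this might be used, the helper below prints the size of each layer next to the instruction that created it. The function name show_layers is hypothetical; --no-trunc and --format are existing docker history options.

```shell
# show_layers IMAGE
# Hypothetical helper: print "<size>: <instruction>" for every layer of IMAGE.
show_layers() {
  docker history --no-trunc --format '{{.Size}}: {{.CreatedBy}}' "$1"
}
```

For example, show_layers ccr.ccs.tencentyun.com/marmotedu/iam-apiserver-amd64:test lists the layers of the image built above, newest first.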

Introduction to Dockerfile Instructions

Above, I introduced some basic knowledge related to Docker image building. In the actual production environment, the standard practice is to build images using a Dockerfile, which requires you to know how to write a Dockerfile. Next, I will explain in detail how to write a Dockerfile.

The basic format of a Dockerfile instruction is as follows:

# Comment
INSTRUCTION arguments

INSTRUCTION is the instruction, which is case-insensitive, but my recommendation is to use uppercase for instructions to differentiate them from arguments. In a Dockerfile, lines starting with # are comments, while # appearing elsewhere will be considered as arguments, for example:

# Comment
RUN echo 'hello world # dockerfile'

A Dockerfile can contain multiple instructions, which can be divided into 5 categories:

Instructions to define the base image: FROM
Instructions to define the maintainer of the image: MAINTAINER (optional, and now deprecated in favor of LABEL maintainer=...)
Instructions for the image building process: COPY, ADD, RUN, USER, WORKDIR, ARG, ENV, VOLUME, ONBUILD
Instructions to define the command to run when the container starts: CMD, ENTRYPOINT
Other instructions: EXPOSE, HEALTHCHECK, STOPSIGNAL

The most commonly used of these instructions deserve your special attention. I have written a detailed explanation of them on GitHub; see the document Dockerfile Instructions Explained.

Here is an example Dockerfile:

# FROM must be the first instruction (only ARG may precede it); it specifies the base image
FROM centos:centos8

# Maintainer information
MAINTAINER Lingfei Kong <lingfei@example.com>

# Image operation instructions
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
RUN echo "Asia/Shanghai" > /etc/timezone
WORKDIR /opt/iam
COPY iam-apiserver /opt/iam/bin/

# Instruction executed when the container starts
ENTRYPOINT ["/opt/iam/bin/iam-apiserver"]

Docker interprets and executes the instructions in the Dockerfile sequentially. The first instruction must be FROM, which specifies the base image (only ARG instructions, comments, and parser directives may come before it). Next, the maintainer information is usually specified. After that come the instructions that operate on the image, and finally the command and parameters for container startup are specified using CMD or ENTRYPOINT.

Dockerfile Best Practices

In the previous section, I introduced the instructions of Dockerfile. However, knowing only these instructions is not enough to write a qualified Dockerfile. We also need to follow some best practices for writing Dockerfile. Here, I have summarized a checklist of best practices for writing Dockerfile that you can refer to.

It is recommended to use uppercase letters for all Dockerfile instructions. This helps to distinguish them from their arguments and from the commands executed inside the image.
When choosing a base image, prefer official images and, among those that meet your requirements, prefer the smaller ones. Roughly speaking, common Linux images compare in size as follows: busybox < debian < centos < ubuntu, although the exact sizes vary by tag and release. It is best to use one base image consistently within the same project. If there are no specific requirements, you can choose debian:jessie or alpine.
When building an image, delete unnecessary files and only install the files that are needed to keep the image clean and lightweight.
Use fewer layers by grouping related commands into a single instruction, splitting long lines with the line-continuation character \. This further reduces the size of the image and makes the image history easier to read.
Avoid changing file permissions in the Dockerfile. A permission change makes Docker copy the affected files into a new layer during the build, resulting in a larger image.
Tag your images. Tags help you understand the version and purpose of an image, for example: docker build -t nginx:3.0-onbuild .
The FROM instruction should include a tag, for example, use FROM debian:jessie instead of FROM debian.
Make good use of caching. The Docker build engine executes the instructions in the Dockerfile sequentially, and once the cache becomes invalid, subsequent commands will not be able to use the cache. In order to make efficient use of caching, try to put all the common parts of the Dockerfile in the front, while putting the different parts at the end.
Prefer the COPY instruction over the ADD instruction. COPY only copies files from the build context, which is usually all you need. ADD additionally unpacks local tar archives and can fetch remote URLs, and this implicit behavior makes a Dockerfile harder to understand and maintain.
It is recommended to use the combination of CMD and ENTRYPOINT instructions. Use the exec format of the ENTRYPOINT command to set a fixed default command and parameters, and then use the CMD instruction to set variable parameters.
Use Dockerfile to share images. By sharing the Dockerfile, developers can have a clear understanding of the Docker image build process, and the Dockerfile can be added to version control for tracking.
Use .dockerignore to exclude unnecessary files when building images. Ignoring unused files can improve build speed.
Use multi-stage builds. Multi-stage builds can greatly reduce the size of the final image. For example, the COPY instruction may include some installation packages, which become obsolete after installation. Here is a simple example of multi-stage build:

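FROM golang:1.11-alpine AS build

# Install dependencies (the alpine image ships without git, which go get needs)
RUN apk add --no-cache git \
    && go get github.com/golang/mock/mockgen

# Copy source code and execute build. A new image layer will be created when the files change.
COPY . /go/src/iam/
WORKDIR /go/src/iam
# CGO_ENABLED=0 produces a static binary that can run in the busybox image
RUN CGO_ENABLED=0 go build -o /bin/iam

# Final stage: only the compiled binary is copied into a small base image
FROM busybox
COPY --from=build /bin/iam /bin/iam
ENTRYPOINT ["/bin/iam"]
CMD ["--help"]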
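The ENTRYPOINT and CMD pair at the end of this Dockerfile follows the checklist item about combining the two: ENTRYPOINT fixes the program, and CMD supplies default arguments that docker run can replace. The shell function below is only a simplified model of that rule, not Docker's implementation; the name effective_command and the --version flag are hypothetical.

```shell
# effective_command ENTRYPOINT DEFAULT_CMD [RUN_ARGS...]
# Simplified model: arguments given to "docker run IMAGE ..." replace CMD,
# while ENTRYPOINT stays fixed.
effective_command() {
  entrypoint="$1"
  default_cmd="$2"
  shift 2
  if [ "$#" -gt 0 ]; then
    echo "$entrypoint $*"
  else
    echo "$entrypoint $default_cmd"
  fi
}

effective_command /bin/iam --help            # prints: /bin/iam --help
effective_command /bin/iam --help --version  # prints: /bin/iam --version
```

In other words, docker run IMAGE would start /bin/iam --help, while docker run IMAGE --version would start /bin/iam --version.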

Summary

If you want to deploy an application using Docker containers, you need to create a Docker image. Today, I introduced how to create a Docker image.

There are two ways to build a Docker image:

Using the docker commit command to create an image based on an existing container.
Writing a Dockerfile and using the docker build command to build the image.

Both methods have the same underlying principle for image construction:

Start a Docker container based on the original image.
Perform some operations within the container, such as executing commands or installing files. Any file changes resulting from these operations will be recorded in the container’s storage layer.
Commit the changes in the container’s storage layer to a new image layer and add it to the original image.

In addition, we can use docker save / docker load to copy images between hosts, and docker export / docker import to turn a container's filesystem into a new image.
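The difference between the two pairs can be sketched with four thin wrappers. The function names are hypothetical; the docker subcommands and the -o / -i options are existing ones. docker save keeps the image's layers, tags, and history, whereas docker export flattens a container's filesystem into a single archive, so an imported image has no build history.

```shell
# save_image IMAGE ARCHIVE: write an image (layers, tags, history) to a tar archive.
save_image() { docker save -o "$2" "$1"; }

# load_image ARCHIVE: restore an image previously written by docker save.
load_image() { docker load -i "$1"; }

# export_container CONTAINER ARCHIVE: write a container's flattened filesystem to a tar archive.
export_container() { docker export -o "$2" "$1"; }

# import_rootfs ARCHIVE IMAGE: create a single-layer image from an exported filesystem.
import_rootfs() { docker import "$1" "$2"; }
```

A typical sequence would be save_image on one host, copying the archive across, and load_image on the other host.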

In actual production environments, the standard practice is to build images with a Dockerfile, which means you need to know how to write one. A Dockerfile supports multiple instructions, which can be divided into 5 categories. You can go back and review the specific instructions above.

Furthermore, when building Docker images, we should also follow some best practices. You can refer to the best practices checklist I summarized for you.

After-class exercises

Think about why, when writing Dockerfiles, “putting related content into one layer and splitting it with the line-continuation character \” can reduce the size of the image.
Give it a try, write a Dockerfile for the application you are developing, and successfully build a Docker image.

Feel free to leave a message in the comment section to discuss with me. See you in the next class.