05 Image Repositories: How to Use Docker Hub Well #

Hello, I’m Chrono.

In our last class, we learned how to use “Dockerfile” and “docker build” to create our own images. But how should we manage these image files? Specifically, how should we store, retrieve, distribute, and share them? Until we answer these questions, we cannot really put our containerized applications into practice.

Today, I will discuss this topic and talk about what an image repository is and how to make the most of it.

What is a Registry? #

Previously, we have used the docker pull command to pull images, and mentioned the concept of a “Registry”. So what exactly is a Registry?

Let’s refer to the official architecture diagram of Docker (it is really important):

Image

The area on the right side of the diagram is the Registry, which is also referred to as the “Registry Center”. This means that all the repositories of the images are registered and stored here, just like a huge archive.

Now let’s take a look at the “docker pull” on the left side. The dotted line shows its workflow: it first goes to the “Docker daemon”, and then to the Registry. Only when the Registry has the image can it be truly downloaded to the local machine.

Of course, pulling images is just the basic function of a Registry. It also provides more features, such as uploading, querying, deleting, and so on. It is a comprehensive image management service site.

You can also compare a Registry to an app store on a mobile phone, where various containerized applications are classified and stored. If you need something, just search for it. With a Registry, we can use images without worrying about the details.

What is Docker Hub #

However, have you noticed that when we fetch images with docker pull, we never say which Registry to use? In that case, Docker falls back to its default Registry, which is the famous “Docker Hub” (https://hub.docker.com/).

Docker Hub is the official registry service built by Docker Inc., established in June 2014, released along with Docker 1.0. It claims to be the largest image repository in the world and has become almost as essential to the container world as GitHub is to code sharing.

Docker Hub not only contains images packaged by Docker itself but is also open to the public for free, allowing anyone to upload their own creations. Over the past 8 years, Docker Hub has evolved from a simple image repository to a rich and thriving container community.

In the screenshot below, you can see a list of the most popular applications, with more than 1 billion downloads, such as Nginx, MongoDB, Node.js, Redis, OpenJDK, and so on. Clearly, by introducing these containerized applications into our own systems, we stand on the shoulders of giants and start at a high level.

Image

However, like GitHub and the App Store, Docker Hub, which is open to everyone, also has an inevitable drawback: the quality of images varies.

If you enter keywords like Nginx or MySQL in the search box on Docker Hub, it will immediately give you several hundred or thousand search results, which can be a bit overwhelming. With so many images, how can you pick the one that suits you best? Let me share some of my experiences in this regard.

How to Choose Images on Docker Hub #

First, you should know the differences between official images, verified publisher images, and unofficial images on Docker Hub.

Official images refer to high-quality images provided by Docker (https://github.com/docker-library/official-images). They have undergone strict vulnerability scanning and security checks, support various hardware architectures such as x86_64 and arm64, and have clear and readable documentation. Generally, official images are the preferred choice for building images and the best examples for writing Dockerfiles.

There are currently over 100 official images, which basically cover various popular technologies. Below is a screenshot of the official Nginx image webpage:

Image

You will see that official images are marked with “Official image”, which means that these images have been certified by Docker and have a dedicated team responsible for auditing, publishing, and updating them. They are absolutely reliable in terms of quality.

The second category is verified publisher images, marked with “Verified publisher”. These are images published by verified publishers such as Bitnami, Rancher, Ubuntu, etc. These companies are large and have capabilities comparable to Docker, so they have created verified accounts on Docker Hub to release their own packaged images. It’s a bit like “verified accounts” on Weibo (a Chinese microblogging website).

Image

These images have endorsements from companies and are also trustworthy. However, they may come with some “imprints” of their respective companies. For example, Bitnami’s images are all based on “minideb” and are slightly less flexible than Docker official images, which may not always meet our specific needs.

Apart from official images and verified publisher images, the rest are unofficial images. However, within this category, we can further categorize them into two types.

The first type is semi-official images. Because becoming a “Verified publisher” requires paying Docker, many companies do not want to spend this “unnecessary money”. So, they create company accounts on Docker Hub without going through the verification process.

Let’s take OpenResty as an example and look at its Docker Hub page. You will see that it is published under the OpenResty account, but the account has not been verified by Docker. Therefore, there is some risk, such as someone impersonating the publisher, and we need to be careful when using such images. Generally speaking, though, these “semi-official” images are also quite reliable.

Image

The second type is purely community images, usually uploaded by individuals to Docker Hub. Due to limitations in conditions, these images may not have complete or even any testing, making it difficult to guarantee their quality. So, caution is needed when downloading them.

In addition to checking whether an image is officially certified, we should also consider other factors to judge the quality of the image. The approach is similar to GitHub, that is, to look at its download count, stars, and update history, in other words, the number of “positive reviews”.

Generally, download count is the most important reference. Well-rated images usually have download counts in the millions (exceeding 1M). On the other hand, some images may be officially certified but lack maintenance, have infrequent updates, and are rarely used by others, resulting in very few stars and downloads. In such cases, it is better to choose the image with the highest download count, which is basically “going with the flow”.

The screenshot below shows the search results for OpenResty on Docker Hub. You can see two images from verified publishers (Bitnami, IBM), but their download counts are very low. There is also one “community” image with more than 1M downloads, but it was last updated 3 years ago. So, without a doubt, we should choose the third image, a “semi-official” one with more than 10M downloads and more than 360 stars.

Image
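Part of this screening can also be done from the command line. Here is a minimal sketch using docker search with its is-official and stars filters. The two function names are my own, made up for illustration. Note that docker search only reports stars and the official flag; download counts and update times still have to be checked on the website.

```shell
# Hypothetical helpers wrapping `docker search`; the names are illustrative.

# List only official images that match a keyword.
search_official() {
  docker search --filter is-official=true "$1"
}

# List at most 10 images that have at least the given number of stars.
search_popular() {
  docker search --filter "stars=$2" --limit 10 "$1"
}

# Usage (needs network access to Docker Hub):
#   search_official nginx
#   search_popular openresty 100
```

Stars are only a rough signal, so treat the result as a shortlist and then open the image pages to compare downloads and update history.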

After looking at so many images on Docker Hub, you must have noticed that the applications have the same names, such as Nginx, Redis, OpenResty. How do we differentiate images created by different authors?

If you are familiar with GitHub, you will find that Docker Hub also uses the same rule, which is the “username/application name” format, like bitnami/nginx, ubuntu/nginx, rancher/nginx, and so on.

Therefore, when using docker pull to download these unofficial images, we must include the username, otherwise it will default to using the official image:

docker pull bitnami/nginx
docker pull ubuntu/nginx
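To make these defaults concrete, here is a minimal sketch in plain shell of how a short name expands into a full one. The helper normalize_ref is my own illustration, not a Docker command, and it simplifies Docker's real rules. It assumes that Docker Hub is addressed as docker.io, that official images live under the library namespace, and that a missing tag means latest.

```shell
# Sketch: expand a short image name roughly the way `docker pull` reads it.
# normalize_ref is a hypothetical helper, not part of the Docker CLI.
normalize_ref() {
  ref=$1
  host=docker.io               # default Registry: Docker Hub
  path=$ref
  case "$ref" in
    */*)
      first=${ref%%/*}         # text before the first "/"
      case "$first" in
        # A "." or ":" (or "localhost") marks an explicit Registry host.
        *.*|*:*|localhost) host=$first; path=${ref#*/} ;;
      esac ;;
  esac
  # On Docker Hub, a name without a username is an official image.
  if [ "$host" = docker.io ]; then
    case "$path" in
      */*) ;;
      *) path=library/$path ;;
    esac
  fi
  # Add the default tag when the last path component has no tag or digest.
  case "${path##*/}" in
    *:*|*@*) ;;
    *) path=$path:latest ;;
  esac
  printf '%s/%s\n' "$host" "$path"
}

normalize_ref nginx            # docker.io/library/nginx:latest
normalize_ref bitnami/nginx    # docker.io/bitnami/nginx:latest
```

In other words, the username is the only thing that separates bitnami/nginx from the official nginx, which is why leaving it out silently gives you the official image.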

What are the naming rules for images on Docker Hub? #

Just determining which image to use is not enough, as there can be many different versions of an image, known as “tags”.

While using the default “latest” tag may be simple and convenient, it is an irresponsible approach in a production environment, as it leads to uncontrollable versions. Therefore, we need to understand the meaning of tags on Docker Hub in order to select the image version that is most suitable for us.

Let’s take the official Redis image as an example to explain what these tags mean.

Image

Generally, the format of an image tag is the application version number followed by the operating system.

You should be familiar with version numbers, which are typically in the form of major version number + minor version number + patch number. Some versions may have a release candidate (rc) version before the official release. On the other hand, the naming convention for operating systems is slightly more complex, as different Linux distributions have different naming styles.

Alpine and CentOS have straightforward version number naming, like alpine3.15 in this example. Ubuntu and Debian use code names. For example, Ubuntu 18.04 is bionic, Ubuntu 20.04 is focal, Debian 9 is stretch, Debian 10 is buster, and Debian 11 is bullseye.

Additionally, some tags may include slim or fat to further indicate whether the image content has been slimmed down or includes more auxiliary tools. Typically, slim images are smaller and more efficient, while fat images are larger and suitable for development and debugging.

Now, let me list a few examples of tags to illustrate this:

  • nginx:1.21.6-alpine represents version 1.21.6 with the latest Alpine as the base image.
  • redis:7.0-rc-bullseye represents the release candidate version 7.0 with Debian 11 as the base image.
  • node:17-buster-slim represents version 17 with a slimmed-down Debian 10 as the base image.
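The same reading rules can be written down as a small shell sketch. The helper describe_tag is hypothetical and deliberately naive: it assumes a short “name:tag” reference with an explicit tag, and it only knows the code names mentioned in this lesson.

```shell
# Sketch: decode a tag of the form "<version>[-rc][-<os>][-slim]".
# describe_tag is a hypothetical helper; it only knows the OS names below.
describe_tag() {
  tag=${1#*:}                  # drop the "name:" prefix
  version=${tag%%-*}           # text before the first "-"
  case "$tag" in
    *-rc*) version=$version-rc ;;   # keep the release candidate marker
  esac
  case "$tag" in
    *bionic*)   os="Ubuntu 18.04" ;;
    *focal*)    os="Ubuntu 20.04" ;;
    *stretch*)  os="Debian 9" ;;
    *buster*)   os="Debian 10" ;;
    *bullseye*) os="Debian 11" ;;
    *alpine*)   os="Alpine" ;;
    *)          os="unspecified" ;;
  esac
  case "$tag" in
    *slim*) size=slim ;;
    *fat*)  size=fat ;;
    *)      size=regular ;;
  esac
  printf 'version=%s os=%s size=%s\n' "$version" "$os" "$size"
}

describe_tag nginx:1.21.6-alpine    # version=1.21.6 os=Alpine size=regular
describe_tag redis:7.0-rc-bullseye  # version=7.0-rc os=Debian 11 size=regular
describe_tag node:17-buster-slim    # version=17 os=Debian 10 size=slim
```

Real tags do not always follow this pattern, so the tag list on the image's Docker Hub page remains the authoritative reference.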

How to Upload Your Own Image #

Now that you have a better understanding of how to select images on Docker Hub, the next question is how to upload an image that you built yourself from a Dockerfile.

It’s actually not difficult at all and can be done in just 4 steps.

First, you need to register an account on Docker Hub. I don’t need to explain this any further.

Second, you need to use the docker login command on your local machine to authenticate with your registered username and password. Here, I used my username “chronolaw”:

Image

The third step is crucial. You need to use the docker tag command to give the image a name that includes your username, indicating that the image belongs to you. Note that docker tag does not really rename anything: it adds a second name that points to the same image. Alternatively, you can simply use docker build -t to give the image this name when creating it.

Here, I’ll take the image “ngx-app” from the previous lesson and give it the new name chronolaw/ngx-app:1.0:

docker tag ngx-app chronolaw/ngx-app:1.0

Image

The fourth step is to use docker push to push the image to Docker Hub. Once we do this, our image is published:

docker push chronolaw/ngx-app:1.0

Image

You can also log in to the Docker Hub website to verify the results of the image publication. It automatically generates a page template for us, which can be further enhanced by adding description information, usage instructions, and more:

Image

Now you can share the name of this image (username/appname:tag) with your colleagues so they can download and deploy it using docker pull.
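Steps two to four can be collected into one small script. This is only a sketch: publish_image is a helper name I made up, and it assumes the account already exists and the local image has already been built.

```shell
# Sketch of steps two to four; publish_image is a hypothetical helper.
# Arguments: Docker Hub username, local image name, version tag.
publish_image() {
  target="$1/$2:$3"               # e.g. chronolaw/ngx-app:1.0
  docker login -u "$1" &&         # step 2: prompts for the password
  docker tag "$2" "$target" &&    # step 3: add the name with the username
  docker push "$target"           # step 4: upload to Docker Hub
}

# Usage:
#   publish_image chronolaw ngx-app 1.0
```

The && chaining stops at the first failure, so a failed login never leads to a push attempt.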

What to do in an offline environment #

Using Docker Hub to manage images is indeed very convenient, but there is one scenario where it cannot be used, which is an offline environment in an enterprise intranet. If you cannot connect to the internet, naturally you cannot use docker push or docker pull to push or pull images.

Is there a solution to this situation?

Of course, there are many methods. The best method is to create your own private Registry service in the intranet environment, simulating Docker Hub. It will manage our images, just like we use GitLab for version control.

There are already many mature solutions for building a self-hosted Registry, such as Docker Registry and CNCF Harbor. However, using them requires some knowledge that has not been covered yet, and the steps are a bit complex. I will introduce them in later courses.

Now, let me explain a “crude” method for storing and distributing images. Although it is relatively “primitive,” it is simple and feasible, and can be used as a temporary emergency measure.

Docker provides two image archiving commands: save and load. They export images to tar archive files (not compressed by default) and import them back into Docker from such files. Archive files are very easy to store and transfer. You can copy them over the network, share them via FTP, or even carry them on a USB drive.

It is worth noting that, by default, save writes to standard output and load reads from standard input (which makes them easy to use in Linux pipelines). To work with files, you therefore generally pass the -o and -i parameters. For example:

docker save ngx-app:latest -o ngx.tar
docker load -i ngx.tar
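Because both commands work on standard streams, they also fit into pipelines. The sketch below compresses the archive with gzip on the way out, which is optional but saves space on a USB drive. The function names and file names are only examples.

```shell
# Hypothetical helpers; function and file names are illustrative.

# Export an image through stdout and compress it on the way.
export_image() {
  docker save "$1" | gzip > "$2"
}

# Import an archive through stdin. docker load accepts
# gzip-compressed archives as well as plain tar files.
import_image() {
  docker load < "$1"
}

# Usage:
#   export_image ngx-app:latest ngx.tar.gz
#   import_image ngx.tar.gz
```

The same pattern works across machines, for example by piping docker save into ssh and running docker load on the other side.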

Summary #

Alright, today we learned about image repositories and how to use Docker Hub. Let’s summarize the key points to help you deepen your understanding:

  1. An image repository (Registry) is a website that provides comprehensive image services, with the basic functionalities of uploading and downloading.
  2. Docker Hub is currently the largest image repository, with many high-quality images. There are numerous criteria for selecting images, such as official certification, download count, star ratings, etc., and a comprehensive evaluation is required.
  3. Images also come in many versions, so it is important to carefully confirm the appropriate tags based on version numbers and operating systems.
  4. After registering on Docker Hub, you can upload your own images, with tags applied using docker tag, followed by pushing them using docker push.
  5. In offline environments, you can set up a private image repository, or use docker save to export the image to a tar archive and then restore it with docker load.

Homework #

Finally, it’s time for homework. I have two questions for you to think about:

  1. Many applications (such as Nginx, Redis, and Go) already have official Docker images. Why do other companies (such as Bitnami and Rancher) still repeat the work and release their own packaged images?

  2. Can you compare GitHub and Docker Hub and discuss their similarities and differences in terms of features, target audience, and impact?

Remember to leave a comment in the discussion area. If I see it, I will reply to you as soon as possible. See you in the next class.