
21 Ingress: Managing Ingress and Egress Traffic of the Cluster #

Hello, I’m Chrono.

In the previous lesson, we learned about the Service object, which is Kubernetes’ built-in load balancing mechanism. It uses static IP addresses to proxy dynamic Pods, supports domain name access and service discovery, and is an essential infrastructure for microservices architecture.

Service is very useful, but it is really only “infrastructure.” Its approach to managing network traffic is still too simple and falls well short of the complex requirements of modern application architectures. That’s why Kubernetes introduces a new concept built on top of Service: Ingress.

Compared to Service, Ingress is much closer to actual business needs, and it is among the most actively developed and discussed features in the community. Today, we will take a look at Ingress and its associated objects, Ingress Controller and Ingress Class.

Why do we need Ingress #

Through the explanation in the previous class, we learned about the functions and operation mechanism of Service. Essentially, it is a layer 4 load balancer controlled by kube-proxy, which forwards traffic on the TCP/IP protocol stack (Service working principle diagram):

Image

However, layer 4 load balancing is still too limited: it can only make simple decisions based on IP address and port number. Most of our applications today run at layer 7, over HTTP/HTTPS, where routing can depend on richer conditions such as hostname, URI, request headers, and certificates; none of these are visible at layer 4 of the TCP/IP protocol stack.

Service also has a drawback: it is best suited to proxying services inside the cluster. To expose a service outside the cluster, you can only use NodePort or LoadBalancer, both of which lack flexibility and are hard to control. The result is that the service has the capability but no good way to exercise it.

How do we solve this problem?

Kubernetes takes the same approach it took with Service: since Service is a layer 4 load balancer, why not introduce a new API object that performs load balancing at layer 7?

However, besides layer 7 load balancing, this object should take on more responsibilities: acting as the overall traffic entry point and managing the cluster’s inbound and outbound data flow, the “fan-in” and “fan-out” (also known as “north-south”) traffic, so that external users can access internal services securely, smoothly, and conveniently (Image source):

Image

Therefore, this API object is appropriately named Ingress, meaning the entry point at the boundary between the inside and the outside of the cluster.

Why do we need Ingress Controller #

Let’s compare Ingress with Service to better understand what Ingress is.

Ingress can be seen as another form of Service at the application layer: it also proxies backend Pods and defines routing rules for how traffic should be distributed and forwarded, but these rules are based on the HTTP/HTTPS protocols.

You should know that Service itself cannot handle requests; it is just a set of iptables rules, and the actual configuration and enforcement of those rules is done by the kube-proxy component on each node. Without kube-proxy, no matter how perfect the Service definition is, it won’t work.
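If you want to see this for yourself, you can peek at the NAT rules on a node. This is only a quick check, assuming kube-proxy runs in its default iptables mode and the ngx-svc Service from the previous lesson exists:

# kube-proxy tags the chains it generates with the Service's namespace/name,
# so grepping for the name shows the KUBE-SERVICES / KUBE-SVC-* rules it created
sudo iptables-save -t nat | grep ngx-svc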

Similarly, Ingress is also just a collection of HTTP routing rules, like a static configuration file. To actually implement and run these rules in the cluster, we need another component called Ingress Controller, which plays a role similar to kube-proxy for Services. It can read and apply Ingress rules, handle and route traffic.

In theory, Kubernetes should have a built-in implementation of Ingress Controller as part of its infrastructure, just like kube-proxy.

However, Ingress Controller has to do a lot of things and is closely tied to upper-layer applications, so Kubernetes handed over the implementation of Ingress Controller to the community. Anyone can develop an Ingress Controller as long as they adhere to the Ingress rules.

This has led to a situation where there are many options for Ingress Controller.

Since the Ingress Controller guards the critical entry point for cluster traffic, controlling it gives a vendor a real say in how cluster applications are managed. Many companies have therefore entered the scene and carefully crafted their own Ingress Controllers, hoping to gain a foothold in Kubernetes traffic management.

Among these implementations, the most famous comes from the well-established reverse proxy and load balancing software Nginx. As the description of Ingress Controller suggests, HTTP-level traffic management and security control are classic reverse-proxy functions, and Nginx is the most stable and highest-performing product in that space, so it has naturally become the most widely used Ingress Controller in Kubernetes.

However, because Nginx is open source, anyone can build on its source code, so there are many variants, such as the community-driven Kubernetes Ingress Controller (https://github.com/kubernetes/ingress-nginx), Nginx’s own Nginx Ingress Controller (https://github.com/nginxinc/kubernetes-ingress), the OpenResty-based Kong Ingress Controller (https://github.com/Kong/kubernetes-ingress-controller), and so on.

According to statistics on Docker Hub, Nginx’s official implementation has the highest download volume among Ingress Controllers. Therefore, in this article, I will use it as an example to explain how to use Ingress and Ingress Controller.

The image below is from the official Nginx website and clearly shows the position of Ingress Controller in a Kubernetes cluster:

Image

Why do we need Ingress Class #

So now that we have Ingress and Ingress Controller, can we perfectly manage the inbound and outbound traffic of the cluster?

In the beginning, Kubernetes also thought so. The idea was to have one Ingress Controller in a cluster and configure it with many different Ingress rules to handle request routing and distribution.

However, as Ingress was widely used in practice, many users found that this approach brought some problems, such as:

  • For various reasons, project teams need to introduce different Ingress Controllers, but Kubernetes does not allow this.
  • There are too many Ingress rules, and handing all of them to a single Ingress Controller would overwhelm it.
  • Multiple Ingress objects have no good logical grouping method, resulting in high management and maintenance costs.
  • There are different tenants in the cluster, and their Ingress requirements vary greatly, even conflicting. They cannot be deployed on the same Ingress Controller.

Therefore, Kubernetes introduced the concept of “Ingress Class” to act as a coordinator between Ingress and Ingress Controller and to break the tight coupling between them.

Now, Kubernetes users can manage Ingress Class to define different business logic groups and simplify the complexity of Ingress rules. For example, we can use Class A to handle blog traffic, Class B to handle short video traffic, and Class C to handle shopping traffic.

Image

Each group of Ingresses and its Ingress Controller are independent of the others and do not conflict, so the problems listed above are easily solved by introducing Ingress Class.

How to Use YAML to Describe Ingress/Ingress Class #

We have spent a lot of time studying Ingress, Ingress Controller, and Ingress Class. It might feel a bit exhausting to learn all the theory. But we have no choice because the reality of business is complex, and this architectural design is the result of extensive discussion in the community and is currently the best solution available to us.

Now that we understand these three concepts, let’s take a look at how to write YAML description files for them.

Just like we learned about the Deployment and Service objects before, we should first use the command kubectl api-resources to check their basic information. The output is as follows:

kubectl api-resources

NAME             SHORTNAMES   APIVERSION              NAMESPACED   KIND
ingresses        ing          networking.k8s.io/v1    true         Ingress
ingressclasses                networking.k8s.io/v1    false        IngressClass

You can see that the apiVersion for both Ingress and Ingress Class is “networking.k8s.io/v1”, and Ingress has a shorthand “ing”. But why can’t we find Ingress Controller?

This is because Ingress Controller is different from the other two objects. It is not just a description file but an application that actually does the work and handles the traffic. Kubernetes already has objects to manage applications, such as Deployment and DaemonSet. So we only need to learn how to use Ingress and Ingress Class.

Let’s start with Ingress.

Like the Service, you can also use the kubectl create command to create a template file for Ingress. It also requires two additional parameters:

  • --class: Specifies the Ingress Class object that the Ingress belongs to.
  • --rule: Specifies a routing rule. The basic format is “URI=Service”, meaning that requests to that HTTP path are forwarded to the corresponding Service object, which then forwards them to the backend Pods.

Now let’s execute the command to see what the Ingress looks like:

export out="--dry-run=client -o yaml"
kubectl create ing ngx-ing --rule="ngx.test/=ngx-svc:80" --class=ngx-ink $out

The resulting Ingress YAML looks like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ngx-ing

spec:

  ingressClassName: ngx-ink

  rules:
  - host: ngx.test
    http:
      paths:
      - path: /
        pathType: Exact
        backend:
          service:
            name: ngx-svc
            port:
              number: 80

In this Ingress YAML, there are two key fields: “ingressClassName” and “rules”, which correspond to the command line parameters, and their meanings are relatively easy to understand.

The “rules” field has a complex nested structure, but if you look closely, you will see that it breaks a routing rule down into a host and HTTP paths. Each path specifies its matching mode via pathType, which can be Exact or Prefix, and then uses the backend field to name the target Service object for forwarding.

However, personally, I think the description in the Ingress YAML is not as intuitive and understandable as the --rule parameter in the kubectl create command line. Moreover, there are too many fields in YAML, making it easy to make mistakes. I recommend that you let kubectl generate the rules automatically and then make slight modifications as needed.
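For instance, if you wanted prefix matching instead of an exact match, you could adjust the path entry as sketched below. The /api path and the api-svc Service here are purely hypothetical, just to show the shape of the rule:

  rules:
  - host: ngx.test
    http:
      paths:
      - path: /api
        pathType: Prefix          # matches /api and any sub-path such as /api/v1
        backend:
          service:
            name: api-svc         # hypothetical backend Service
            port:
              number: 80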

Now that we have the Ingress object, what does the associated Ingress Class look like?

In fact, Ingress Class itself does not have any practical functionality. It only serves as a connection between Ingress and Ingress Controller. So its definition is very simple, and it only has one required field “controller” in the “spec” section, indicating which Ingress Controller to use. The specific name depends on the implementation documentation.

For example, if I want to use an Ingress Controller developed by Nginx, then the name should be “nginx.org/ingress-controller”:

apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: ngx-ink

spec:
  controller: nginx.org/ingress-controller

I have also created a diagram to show the relationship between Ingress, Service, and Ingress Class for your reference:

Image

How to use Ingress/Ingress Class in Kubernetes #

Since Ingress Class is very small, I combined it with Ingress into one YAML file. Let’s use kubectl apply to create these two objects:

kubectl apply -f ingress.yml
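In case you are assembling ingress.yml by hand, it is simply the two manifests shown earlier concatenated with YAML’s standard “---” document separator, roughly like this:

# ingress.yml: IngressClass and Ingress in one file
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: ngx-ink
spec:
  controller: nginx.org/ingress-controller

---

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ngx-ing
spec:
  ingressClassName: ngx-ink
  rules:
    ...                           # the same rules shown earlier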

Then we use kubectl get to check the status of the objects:

kubectl get ingressclass
kubectl get ing

Image

The kubectl describe command can show more detailed information about the Ingress:

kubectl describe ing ngx-ing

Image

As you can see, the Ingress object’s Host/Path routing rule matches the domain “ngx.test” and the path “/” defined in the YAML, and it is already associated with the Service object created in Lecture 20, as well as the two Pods behind that Service.

Also, don’t be alarmed by the “Default backend” error in the output. This field is meant to provide a fallback service when no routing rule matches, but leaving it unset causes no problems, so we usually ignore it.
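If you do want to set it, the Ingress spec has a defaultBackend field for exactly this purpose. Here is a minimal sketch, assuming you have a fallback Service called default-svc listening on port 80 (the name is hypothetical):

spec:
  defaultBackend:
    service:
      name: default-svc       # hypothetical fallback Service for unmatched requests
      port:
        number: 80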

How to use an Ingress Controller in Kubernetes #

After preparing the Ingress and Ingress Class, we need to deploy the Ingress Controller, which actually handles the routing rules.

You can find the Nginx Ingress Controller project on GitHub (https://github.com/nginxinc/kubernetes-ingress). Since it runs in the form of Pods in Kubernetes, it supports both Deployment and DaemonSet deployment methods. Here, I chose Deployment, and the relevant YAML files are also copied in our course project (https://github.com/chronolaw/k8s_study/tree/master/ingress).

The installation of Nginx Ingress Controller is a bit tricky, as there are several YAML files that need to be executed. However, if you only want to do a simple test, you only need to use 4 YAML files:

kubectl apply -f common/ns-and-sa.yaml
kubectl apply -f rbac/rbac.yaml
kubectl apply -f common/nginx-config.yaml
kubectl apply -f common/default-server-secret.yaml

The first two commands create a separate namespace “nginx-ingress” for the Ingress Controller, along with the corresponding account and permissions, which are used to access the apiserver to obtain Service and Endpoint information. The last two commands create a ConfigMap and a Secret used to configure the HTTP/HTTPS services.
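For reference, the namespace-and-ServiceAccount part is conceptually very simple; the sketch below shows roughly what such a manifest contains. Treat it as illustrative only and use the files shipped with the project for real deployments, since the RBAC rules, ConfigMap, and Secret live in the other three files:

apiVersion: v1
kind: Namespace
metadata:
  name: nginx-ingress

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nginx-ingress
  namespace: nginx-ingress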

To deploy the Ingress Controller, we don’t need to write the Deployment from scratch. Nginx has provided example YAML files for us. However, before creating the Deployment, we need to make a few modifications to adapt it to our own application:

  • Change the name in the metadata section to your own name, for example, ngx-kic-dep.
  • Modify spec.selector and template.metadata.labels to use your own name, for example, ngx-kic-dep.
  • Change containers.image to use the Alpine version to speed up the download, for example, nginx/nginx-ingress:2.2-alpine.
  • Add -ingress-class=ngx-ink to the args at the bottom, which is the name of the Ingress Class created earlier. This is crucial for the Ingress Controller to manage the Ingress.

After making these changes, the YAML file for the Ingress Controller would look something like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ngx-kic-dep
  namespace: nginx-ingress

spec:
  replicas: 1
  selector:
    matchLabels:
      app: ngx-kic-dep

  template:
    metadata:
      labels:
        app: ngx-kic-dep
    ...
    spec:
      containers:
      - image: nginx/nginx-ingress:2.2-alpine
        ...
        args:
          - -ingress-class=ngx-ink

With the Ingress Controller, the association of these API objects becomes more complex. You can see how they are associated using the object names in the following diagram:

Diagram

After confirming that the modifications to the Ingress Controller YAML are complete, you can use kubectl apply to create the objects:

kubectl apply -f kic.yml

Note that the Ingress Controller is located in the “nginx-ingress” namespace, so you need to explicitly specify the “-n” parameter when checking the status; otherwise, you will only see Pods in the “default” namespace:

kubectl get deploy -n nginx-ingress
kubectl get pod -n nginx-ingress

Now the Ingress Controller is up and running.

However, there is one final step. Since the Ingress Controller itself is a Pod, it still relies on a Service object to provide external access. Therefore, you need to define a Service for it, using either NodePort or LoadBalancer to expose the ports, in order to connect the cluster’s internal and external traffic. You can complete this task on your own.
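As a starting point, a minimal NodePort Service for it might look like the sketch below. The Service name ngx-kic-svc and the nodePort value are my own choices here; the selector just has to match the labels of the Deployment above:

apiVersion: v1
kind: Service
metadata:
  name: ngx-kic-svc             # hypothetical name
  namespace: nginx-ingress

spec:
  type: NodePort
  selector:
    app: ngx-kic-dep            # must match the Ingress Controller Pods' labels
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080             # any free port in the default 30000-32767 range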

Here, I will use the kubectl port-forward command mentioned in Lecture 15. It can directly map the local port to a Pod in the Kubernetes cluster, which is very convenient for testing and verification.

The following command maps the local port 8080 to port 80 of the Ingress Controller Pod:

kubectl port-forward -n nginx-ingress ngx-kic-dep-8859b7b86-cplgp 8080:80 &

When making test requests with curl, keep in mind that Ingress routing rules are based on the HTTP protocol, so you cannot access the service by IP address alone; you must use the domain name and URI.

You can manually add domain name resolution by modifying /etc/hosts, or you can use the --resolve parameter to specify the domain name resolution rule. For example, here I force the resolution of “ngx.test” to “127.0.0.1”, which is the local address forwarded by kubectl port-forward:

curl --resolve ngx.test:8080:127.0.0.1 http://ngx.test:8080
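If you prefer the /etc/hosts approach instead, add a resolution entry once and then curl the domain directly; 127.0.0.1 is again the local address forwarded by kubectl port-forward:

echo "127.0.0.1 ngx.test" | sudo tee -a /etc/hosts   # add a local resolution entry for the test domain
curl http://ngx.test:8080                            # no --resolve needed now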

Compare this result with the Service mentioned in the previous lecture, and you will find that the final effect is the same: both forward the requests to Pods within the cluster. However, the routing rules of Ingress are no longer based on IP addresses, but on elements such as domain names and URIs in the HTTP protocol.

Summary #

Alright, that’s all for today. We have learned about Kubernetes’ layer 7 reverse proxy and load balancing objects: Ingress, Ingress Controller, and Ingress Class. Together they manage the cluster’s incoming and outgoing traffic, serving as the gateway to the cluster.

To summarize today’s main points:

  1. Service is a layer 4 load balancer with limited capabilities; Ingress defines routing rules based on the HTTP/HTTPS protocols.
  2. Ingress is just a collection of rules and does not have traffic management capabilities by itself. It needs Ingress Controller to apply the Ingress rules and function properly.
  3. Ingress Class decouples Ingress and Ingress Controller. We should use Ingress Class to manage Ingress resources.
  4. The most popular Ingress Controller is the Nginx Ingress Controller, which is based on the classic reverse proxy software Nginx.

Furthermore, the current traffic management functionality in Kubernetes mainly focuses on the Ingress Controller. It goes beyond managing “ingress traffic” and can also handle “egress” or outbound traffic, as well as manage “east-west traffic” between services within the cluster.

In addition, Ingress Controllers usually have many other features, such as TLS termination, network application firewall, rate limiting, traffic splitting, authentication, access control, and more. They can be considered as full-featured reverse proxies or gateways. If you’re interested, you can look for more information on this topic.

Homework #

Finally, it’s time for homework. Here are two questions for you to ponder:

  1. What are the similarities and differences between layer 4 load balancing (Service) and layer 7 load balancing (Ingress)?
  2. What other tasks do you think an Ingress Controller should perform as the traffic entry point for the cluster?

Feel free to leave your thoughts in the comments. Closing the loop on these questions is the first step in reinforcing what you have learned. Progress begins with completion.

In the next class, we will have a practical exercise for this chapter. See you next time.