12 Observability: Monitoring and Logs (Moriyuan) #

This article is mainly divided into four parts:

  1. Background information on monitoring and logging in Kubernetes;
  2. The evolution of monitoring solutions in Kubernetes and the commonly available monitoring solutions;
  3. Some details on log collection and common open-source logging systems;
  4. Course summary, introducing the best practices for monitoring and logging on Alibaba Cloud Container Service.

Background #

Monitoring and logging are important pieces of infrastructure for large-scale distributed systems. Monitoring helps developers see the running state of the system, while logging assists in troubleshooting and diagnosing problems.

In Kubernetes, monitoring and logging are part of the ecosystem rather than core components, so most of their capabilities rely on adaptation by upper-layer cloud vendors. Kubernetes defines the interface standards and specifications for integration, and any component that meets these standards can be integrated quickly.

Monitoring #

Types of Monitoring #

Let’s take a look at monitoring. In terms of monitoring types, there are four different types in K8s:

  • Resource Monitoring

This is the most common type: metrics such as CPU, memory, and network usage, usually expressed as numeric values or percentages. It is also the most familiar form of monitoring and can be handled by conventional monitoring systems such as Zabbix and Telegraf.

  • Performance Monitoring

Performance monitoring refers to APM monitoring, which examines metrics related to application performance. It is usually implemented through hook mechanisms in the virtual machine or bytecode-execution layer (implicit instrumentation), or through explicit injection at the application layer, in order to obtain deeper monitoring metrics; it is generally used for application tuning and diagnostics. Common examples are the JVM and PHP’s Zend Engine: through their hook mechanisms you can obtain metrics such as GC counts, the distribution of the various memory pools, and the number of network connections, and use them to diagnose and optimize application performance.

  • Security Monitoring

Security monitoring mainly focuses on a series of monitoring strategies for security, such as authorization management and security vulnerability scanning.

  • Event Monitoring

Event monitoring is a monitoring method unique to K8s. In a previous lesson we introduced a design concept in K8s: state transitions based on a state machine. A transition from one normal state to another normal state produces a Normal event, while a transition from a normal state to an abnormal state produces a Warning event. Typically we care more about Warning events. Event monitoring ships Normal and Warning events to a data center, where they can be analyzed and alerted on, and the corresponding anomalies can be surfaced through channels such as DingTalk, SMS, or email. This compensates for the blind spots of conventional monitoring.
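To make this concrete, here is a minimal sketch of consuming events programmatically with the official Kubernetes Python client; the kubeconfig-based setup and the choice to keep only Warning events are illustrative assumptions, not part of the lesson itself.

```python
# A minimal sketch: stream Kubernetes events and print only Warning events,
# which are usually what an alerting pipeline (DingTalk, SMS, email) cares about.
from kubernetes import client, config, watch

def stream_warning_events():
    config.load_kube_config()   # use config.load_incluster_config() when running in a pod
    v1 = client.CoreV1Api()
    w = watch.Watch()
    # Each item from the stream wraps an Event object; event.type is "Normal" or "Warning".
    for item in w.stream(v1.list_event_for_all_namespaces):
        event = item["object"]
        if event.type == "Warning":
            obj = event.involved_object
            print(f"[{event.type}] {obj.kind}/{obj.name}: {event.reason} - {event.message}")

if __name__ == "__main__":
    stream_warning_events()
```

A tool such as kube-eventer (introduced later in this article) productizes exactly this pattern, shipping the watched events to external sinks instead of printing them.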

The Evolution of Kubernetes Monitoring #

In the early days, before version 1.10 of K8s, people would use components like Heapster for monitoring and data collection. The design principle of Heapster is relatively simple.

First, each Kubernetes node runs a cAdvisor packaged inside the kubelet, and cAdvisor is responsible for data collection. After cAdvisor collects the data, the kubelet wraps it and exposes it through corresponding APIs. In the early days, there were actually three different APIs:

  1. Summary API
  2. Kubelet API
  3. Prometheus API

All three APIs use cAdvisor as the data source, but the data formats differ. Heapster supports data collection through the Summary API and the Kubelet API. It periodically fetches data from each node, aggregates it in its own memory, and then exposes a service for higher-level consumers. Common consumers in K8s include the Dashboard and the HPA controller, which call this service to obtain monitoring data for autoscaling and for visualizing monitoring data.

This used to be the data consumption chain, and it seemed clear enough, without too many issues. So why did Kubernetes abandon Heapster and move to metrics-server? The main driving force behind this transition was the need to standardize the monitoring data interface. Why is a standard monitoring data interface necessary?

  1. Client requirements vary greatly. For example, Heapster may be doing basic resource collection today, but tomorrow I might want to expose the number of online users in my application, display it in my own interface system, and also have it consumed the way HPA consumes data. Can Heapster support this scenario? It cannot, which shows the limits of Heapster’s own extensibility.
  2. To support exporting data, Heapster provides many sinks, including InfluxDB, SLS, DingTalk, and more. These sinks are mainly used for shipping the collected data out for further processing. Many customers use InfluxDB for storage and connect visualization software such as Grafana to InfluxDB to visualize the monitoring data.

However, the community later found that many of these sinks were not being maintained. This left many bugs in the Heapster project unfixed, which created real problems for the community in terms of the project’s activity and stability.

Based on these two reasons, K8s broke away from Heapster and created a streamlined monitoring collection component called metrics-server.

The figure above shows the internal architecture of Heapster. It can be seen that it is divided into several parts: the core part, the API that is exposed through standard HTTP or HTTPS, the source part which provides different interfaces for data collection, the processor part responsible for data transformation and aggregation, and the sink part responsible for offline data processing. This was the application architecture of early Heapster. Later, as K8s standardized the monitoring interface, Heapster was gradually trimmed down and transformed into metrics-server.

The current structure of metrics-server, as of version 0.3.1, is roughly as shown in the figure above. It is very simple: a core layer, an intermediate source layer, a simple API layer, and an additional API Registration layer. The purpose of this last layer is to register the corresponding data interfaces on the K8s API server. Consumers no longer need to access metrics-server through its own API layer; instead they go through the API server to the API Registration layer and from there to metrics-server. As a result, the true data consumers may not be aware of metrics-server itself; what they see is an API exposed through the API server. This is the biggest change made in metrics-server.

The Standard Monitoring Interfaces in Kubernetes #

There are three different interface standards for monitoring in K8s. They standardize and decouple the consumption of monitoring data and allow integration with the community. Within the community they can be divided into three main categories.

1. Resource Metrics #

The corresponding interface is metrics.k8s.io, and the main implementation is metrics-server. It provides resource monitoring, covering node-level, pod-level, namespace-level, and class-level metrics, all of which can be obtained through the metrics.k8s.io interface.
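As an illustration, the sketch below reads node metrics through the aggregated metrics.k8s.io endpoint using the Kubernetes Python client; it assumes metrics-server (or another implementation of the Resource Metrics API) is installed in the cluster and a kubeconfig is available.

```python
# A minimal sketch: query the Resource Metrics API (metrics.k8s.io) that
# metrics-server serves behind the API server via API Registration.
from kubernetes import client, config

def print_node_metrics():
    config.load_kube_config()
    api = client.CustomObjectsApi()
    # Roughly equivalent to: kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
    node_metrics = api.list_cluster_custom_object(
        group="metrics.k8s.io", version="v1beta1", plural="nodes")
    for item in node_metrics["items"]:
        usage = item["usage"]
        print(f'{item["metadata"]["name"]}: cpu={usage["cpu"]}, memory={usage["memory"]}')

if __name__ == "__main__":
    print_node_metrics()
```

This is the same interface the HPA controller and the Dashboard consume, which is why consumers do not need to know about metrics-server directly.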

2. Custom Metrics #

The corresponding API is custom.metrics.k8s.io, and the main implementation is Prometheus. It provides both resource monitoring and custom monitoring. The resource monitoring overlaps with the resource metrics above; custom monitoring refers to metrics defined by the application itself, such as the number of online users, or slow queries in a database such as MySQL. These custom metrics can be exposed in the application with the standard Prometheus client, collected by Prometheus, and then consumed through the standard custom.metrics.k8s.io interface.

Once these metrics have been collected by Prometheus, they can be consumed through the custom.metrics.k8s.io interface; for example, Horizontal Pod Autoscaling (HPA) can consume the data through this interface to drive scaling decisions.
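For example, a minimal sketch of exposing such a custom metric with the standard Prometheus Python client might look like the following; the metric name online_users and port 8000 are illustrative assumptions. Note that serving the metric through custom.metrics.k8s.io additionally requires an adapter (for example the community prometheus-adapter) sitting in front of Prometheus.

```python
# A minimal sketch: expose an application-level metric in Prometheus text format
# so that Prometheus can scrape it from /metrics.
import random
import time

from prometheus_client import Gauge, start_http_server

ONLINE_USERS = Gauge("online_users", "Number of users currently online")

if __name__ == "__main__":
    start_http_server(8000)              # serves http://0.0.0.0:8000/metrics
    while True:
        # In a real application this value would come from business logic;
        # here we simply simulate a changing user count.
        ONLINE_USERS.set(random.randint(0, 500))
        time.sleep(15)
```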

3. External Metrics #

External Metrics is a relatively special category, because Kubernetes has become an implementation standard for cloud-native interfaces. Very often, when working with cloud services, an application uses a message queue in front and an RDS database behind it, and it also needs to consume some monitoring metrics of these cloud products, such as the number of messages in the message queue, or the number of connections and the number of requests on the SLB (Server Load Balancer), and so on.

How are these metrics consumed? Kubernetes defines a standard for this as well: external.metrics.k8s.io. The main implementations are provided by the various cloud vendors, which can retrieve the monitoring metrics of their cloud resources. On its platform, Alibaba Cloud has implemented the Alibaba Cloud metrics adapter to provide an implementation of this external.metrics.k8s.io standard.

Prometheus - The “Standard” Monitoring Solution in the Open Source Community #

Next, let’s take a look at a popular monitoring solution in the open-source community, Prometheus. Why is Prometheus considered the standard monitoring solution in the open-source community?

  1. First, Prometheus is a graduated project of the CNCF (Cloud Native Computing Foundation), the cloud-native community’s foundation.
  2. Second, an increasing number of open-source projects use Prometheus as the standard monitoring solution. For example, well-known projects such as Spark, TensorFlow, and Flink all have standard Prometheus collection interfaces.
  3. Third, for commonly used databases and middleware, there are corresponding Prometheus client libraries or interfaces. For example, etcd, ZooKeeper, MySQL, and PostgreSQL all have Prometheus interfaces; where such an interface does not exist, the community usually provides a corresponding exporter to implement it.

Now, let’s take a look at the overall structure of Prometheus.

The diagram above shows the data collection links of Prometheus, which can be divided into three different data collection links.

  • The first is the push model, where data is pushed to a pushgateway and Prometheus then pulls it from the pushgateway. This collection method is mainly used for short-lived tasks. Prometheus’s most common collection model is pull, which has a potential problem: if the data’s lifetime is shorter than the collection interval, for example a 30s collection interval and a task that runs for only 15s, some data may be missed. The simplest remedy for this scenario is to push the metrics to the pushgateway first and then let Prometheus pull them from the pushgateway; this ensures that data from short-lived tasks is not lost (see the sketch after this list).
  • The second one is the standard pull mode, where data is directly pulled from the corresponding data task through the pull mode.
  • The third one is Prometheus on Prometheus, which allows data to be synchronized from another Prometheus to the current Prometheus.
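Here is a minimal sketch of the push model for a short-lived batch job, using the standard Prometheus Python client; the pushgateway address, job name, and metric names are illustrative assumptions.

```python
# A minimal sketch: a short-lived job pushes its metrics to a Pushgateway once,
# and Prometheus later pulls them from the Pushgateway on its normal schedule.
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

def do_work():
    pass                                     # placeholder for the actual batch workload

def run_batch_job():
    registry = CollectorRegistry()
    duration = Gauge("batch_job_duration_seconds",
                     "Time the batch job took to complete", registry=registry)
    last_success = Gauge("batch_job_last_success_timestamp_seconds",
                         "Unix time of the last successful run", registry=registry)

    with duration.time():                    # records how long the job body takes
        do_work()

    last_success.set_to_current_time()
    # Push once at the end, so nothing is lost even if the job finishes
    # well within a single scrape interval.
    push_to_gateway("pushgateway:9091", job="example_batch_job", registry=registry)

if __name__ == "__main__":
    run_batch_job()
```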

These are the three data collection methods in Prometheus. In addition to the standard static configuration, Prometheus also supports service discovery. This means that it can dynamically discover collection objects through some service discovery mechanisms. In Kubernetes, a common practice is to use Kubernetes’ dynamic discovery mechanism by configuring annotations, which can automatically configure collection tasks for data collection, making it very convenient.

For alerting, Prometheus provides an external component called Alertmanager, which can send alert messages through email or SMS. For data consumption, data can be consumed through upper-layer API clients, web UI, and Grafana for data visualization and consumption.

In summary, Prometheus has the following five features:

  • The first feature is its simple and powerful integration standard. Developers only need to implement the Prometheus Client interface standard to implement data collection.
  • The second feature is the various data collection and offline modes. Data can be collected and processed through push mode, pull mode, and Prometheus on Prometheus mode.
  • The third feature is its compatibility with Kubernetes.
  • The fourth feature is its rich plugin mechanism and ecosystem.
  • The fifth feature is the assistance provided by Prometheus Operator. Prometheus Operator is probably the most complex operator we have seen so far, but it fully demonstrates the dynamic capabilities of Prometheus. If you are using Prometheus in Kubernetes, it is recommended to use Prometheus Operator for deployment and operation.

kube-eventer - Kubernetes Event Offline Tool #

Finally, we would like to introduce a Kubernetes event offline tool called kube-eventer. kube-eventer is an open-source component released by Alibaba Cloud Container Service. Through the API server’s watch mechanism, it can export various Kubernetes events, such as pod events, node events, core component events, and CRD events, to platforms such as SLS, DingTalk, Kafka, and InfluxDB. The exported events can then be used for auditing, monitoring, and alerting. The project is open source on GitHub; if you are interested, take a look.


The image above shows an alert rendered in DingTalk. You can see a Warning event under the kube-system namespace: a pod failed to restart, the reason being roughly a restart back-off, and the time the event occurred is shown. This information can be used for investigation.

Logs #

Log Scenarios #

Next, let’s introduce a part of logs in K8s. First, let’s take a look at the log scenarios. Logs in K8s can mainly be divided into four major scenarios:

Host Kernel Logs #

  • The first is host kernel logs, which can help developers diagnose common issues, such as network stack anomalies, for example iptables mark problems, where the kernel log shows messages such as conntrack table errors;
  • The second is driver anomalies. In some networking scenarios or GPU-related situations, driver anomalies may occur more frequently and are common errors;
  • The third is file system anomalies. In the early days when Docker was not yet mature, problems often occurred in overlayfs or AUFS. After encountering such issues, developers had no good way to monitor and diagnose them. In this case, you can check for anomalies in the host kernel logs;
  • Lastly, there are anomalies that affect the node, such as kernel panic or OOM, which are also reflected in the host logs.

Runtime Logs #

The second is runtime logs. Docker logs are the most common in this scenario; for example, we can use the Docker daemon logs to troubleshoot issues such as a Pod hanging during deletion.

Core Component Logs #

The third is core component logs. In K8s, core components include external middleware, such as etcd, or built-in components like API server, kube-scheduler, controller-manager, kubelet, and so on. The logs of these components can help us understand the usage of resources in the entire K8s cluster and whether the current running state has any anomalies.

There are also some core middleware, such as Ingress, a networking middleware, which can help us view the traffic in the entire access layer. Through Ingress logs, we can conduct a good analysis of the access layer.
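As an illustration, the sketch below pulls the recent logs of a core component through the API server using the Kubernetes Python client; it assumes the component runs as a pod in the kube-system namespace with a typical component label (node-level daemons such as the kubelet still require journalctl or the node’s log files).

```python
# A minimal sketch: fetch the last lines of a core component's logs from kube-system.
# The label selector "component=kube-apiserver" is a typical example and may differ
# depending on how the cluster was installed.
from kubernetes import client, config

def dump_component_logs(label_selector="component=kube-apiserver"):
    config.load_kube_config()
    v1 = client.CoreV1Api()
    pods = v1.list_namespaced_pod("kube-system", label_selector=label_selector)
    for pod in pods.items:
        text = v1.read_namespaced_pod_log(pod.metadata.name, "kube-system", tail_lines=50)
        print(f"--- {pod.metadata.name} ---")
        print(text)

if __name__ == "__main__":
    dump_component_logs()
```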

Application Deployment Logs #

Finally, there are logs for deploying applications, which can be used to view the status of the business layer. For example, we can see if there are any 500 requests in the business layer, any panics, or any abnormal error accesses. All of these can be viewed through application logs.

Log Collection #

First, let’s take a look at log collection. Based on where the logs are collected from, there are three types:


  • The first is host machine log files. In this scenario, my container writes log files to the host machine using something like volumes. Log rotation is performed on the host machine using a log rotation strategy, and then the logs are collected by an agent on the host machine;
  • The second type is log files within the container. In this case, a common approach is to use a sidecar streaming container to transcribe the logs to stdout, write them to the corresponding log-file through stdout, and then collect them through local log rotation and an external agent;
  • The third type is direct output to stdout, which is a more common strategy. One way is to have an agent collect the output and ship it to a remote backend; the other is to write directly to a remote backend through a standard API such as SLS (Log Service). A minimal stdout logging sketch follows this list.
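As referenced in the third item above, here is a minimal sketch of an application writing structured JSON log lines to stdout so that a node-level agent (Fluentd, an SLS agent, and so on) can pick them up from the container runtime’s log files; the field names are illustrative assumptions.

```python
# A minimal sketch: emit one JSON object per log line on stdout, which is easy
# for a node-level collection agent to parse and forward.
import json
import logging
import sys
import time

class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "ts": time.strftime("%Y-%m-%dT%H:%M:%S%z"),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

log = logging.getLogger("demo-app")
log.info("request handled")
log.warning("slow query detected")
```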

The community generally recommends using the Fluentd log collection solution. Fluentd is an agent that runs on each node, and this agent aggregates data to a Fluentd server. This server can then store the data offline to something like Elasticsearch, and it can be presented through Kibana. Alternatively, the data can be stored offline to InfluxDB and presented through Grafana. This is currently the recommended approach in the community.

Summary #

Finally, let’s summarize today’s lesson and introduce the best practices for monitoring and logging on Alibaba Cloud. At the beginning of the course, we noted that monitoring and logging are not core components of K8s; K8s mostly defines standard interfaces, and upper-layer cloud vendors provide the adaptations.

Alibaba Cloud Container Service Monitoring System #

Introduction to Monitoring System Components #

First, let me introduce the monitoring system in Alibaba Cloud Container Service. This chart is actually an overview of the monitoring system.


The four products on the right are closely related to monitoring and logging:

SLS #

The first one is SLS, which stands for Log Service. As we mentioned earlier, logs in K8s can be divided into different categories, such as core component logs, access layer logs, and application logs. In Alibaba Cloud Container Service, you can collect audit logs through the API server, access layer logs through service mesh or ingress controller, and application logs at the application layer.

The data pipeline alone, however, is not enough: it only ships the data somewhere. We still need to visualize and analyze it. For example, with audit logs, the audit dashboard lets you see how many operations, changes, attacks, and system anomalies occurred today.

ARMS #

The second one is performance monitoring for applications. For performance monitoring, you can use products like ARMS to view performance and diagnose and optimize issues. Currently, ARMS supports Java and PHP, and it can be used for performance diagnosis and optimization of applications.

AHAS #

The third one is called AHAS, which is an architecture-aware monitoring tool. In Kubernetes, many deployments are based on microservice architecture, which leads to a complex topology management due to the large number of components and their replicas. It is difficult to visualize the flow of an application in Kubernetes or troubleshoot traffic anomalies. AHAS addresses this issue by monitoring the network stack and generating a topology diagram of the entire Kubernetes environment. It also provides resource monitoring, network bandwidth monitoring, traffic monitoring, and diagnosis of abnormal events. AHAS provides an alternative monitoring solution by offering architecture-aware monitoring capabilities.

Cloud Monitor #

Lastly, there is Cloud Monitor, the basic cloud monitoring tool. It collects standard resource metrics for monitoring and provides visualizations and alerts for metrics related to nodes, pods, and other resources.

Enhanced Features by Alibaba Cloud #

This section covers the enhancements Alibaba Cloud has made on top of the open-source tools. The first is metrics-server. As described earlier, metrics-server was significantly trimmed down, but from the customer’s perspective this trimming removed some convenient functionality: for example, many customers want to export monitoring data to services such as SLS or InfluxDB, which is not possible with the community version. Alibaba Cloud retains the commonly used, highly maintainable sinks, and this is the first enhancement.

The second enhancement concerns how quickly the Kubernetes ecosystem keeps pace with Kubernetes itself. For example, Dashboard releases are not aligned with the major Kubernetes versions, meaning the Dashboard does not carry the same version number as Kubernetes. This misalignment causes compatibility issues when upgrading from Heapster to metrics-server. Alibaba Cloud has therefore made its metrics-server fully compatible with Heapster, so it can be used from Kubernetes 1.7 up to 1.14 while the components that consume monitoring data remain fully compatible.

There are also enhancements in eventer and npd. The kube-eventer component was described above. On the npd side, Alibaba Cloud has added many extra detection items, such as kernel hang detection, inbound and outbound network monitoring, SNAT detection, and checks on file descriptors (fd). These checks are included in npd, and Alibaba Cloud has made significant improvements to it. Developers can deploy npd to enable node diagnostics and alerting, and then use eventer to send the resulting alerts to Kafka or DingTalk.

Moving up to the Prometheus ecosystem, developers can integrate with Alibaba Cloud’s HiTSDB and InfluxDB for storage. In terms of data collection, Alibaba Cloud provides an optimized node-exporter and exporters for specific scenarios such as Spark, TensorFlow, and Argo. There are also additional enhancements for GPUs, including support for monitoring individual GPU cards and GPU sharing.

On the topic of Prometheus, Alibaba Cloud has launched a managed version of Prometheus in collaboration with the ARMS team. Developers can use pre-packaged Helm charts to experience Prometheus’ monitoring and data collection capabilities without having to deploy Prometheus servers.

Alibaba Cloud Container Service Logging System #

What enhancements has Alibaba Cloud made in terms of logging? First, Alibaba Cloud ensures full compatibility in log collection: it can collect pod logs, core component logs, Docker engine logs, kernel logs, and logs from middleware, and store them in SLS. Once the logs are in SLS, they can be shipped to OSS or MaxCompute for archiving and offline analysis.

Real-time consumption of logs is also supported: logs can be consumed by OpenSearch, E-MapReduce, Flink, and other services for search and further analysis. For log visualization, Alibaba Cloud supports both open-source solutions such as Grafana and proprietary solutions such as DataV, providing a complete data collection and consumption pipeline.

Summary #

Today’s course is coming to an end. Here is a recap of the key points:

  • First, we introduced monitoring, including: common monitoring methods for four different container scenarios; the evolution and interface standards of Kubernetes monitoring; and two commonly used monitoring solutions from the open-source community.
  • For logs, we mainly discussed four different scenarios and introduced a collection solution using Fluentd.
  • Finally, we introduced a best practice for logging and monitoring using Alibaba Cloud.