
44 Technical Progress: Software Architecture and Application Lifecycle #

Hello, I am Kong Lingfei.

The dimensions of applications, system resources, and application lifecycle management constitute all our demands for the cloud. In the previous lecture, I introduced the evolutionary path of virtualization technology from the perspective of system resources. In this lecture, I will discuss the technological evolution of the application dimension and application lifecycle management dimension.

Application software architecture is how we build applications. Different architectures build applications in different ways, and they differ in development efficiency, maintainability, and the performance of the resulting applications. As technology continues to evolve, application software architecture evolves with it. In this lecture, we will explore the main types of application software architecture, their characteristics, and how they have evolved over time.

As for the application lifecycle management dimension, I have already introduced the evolution of application lifecycle management technologies in Lecture 09. In this lecture, I will also supplement some core technologies, such as logging, monitoring, alerting, and distributed tracing.

Next, let’s take a look at the evolutionary path of software architecture.

Evolution of Software Architecture #

The evolution of software architecture technology is shown in the following figure:

Initially, we used the monolithic architecture to build applications, which gradually evolved into the SOA (Service-Oriented Architecture). However, neither the monolithic architecture nor the SOA architecture can meet the demands of rapid iteration in the era of the Internet. Therefore, in the internet age, application software architecture has evolved into the microservices architecture. Currently, we are in the stage of microservices architecture, and many teams are exploring the use of Service Mesh to replace some functions in the microservices architecture.

With the birth of Serverless cloud functions, a new software architecture called FaaS (Function as a Service) has emerged. Here, I will give a brief introduction to it and elaborate on it later. FaaS architecture is currently only applicable to system resource forms such as cloud functions due to its limitations and limited usage scenarios. Personally, I believe it will not become the mainstream software architecture in the future. Also, it should be noted that there is currently no widely recognized software architecture called FaaS in the industry. When people mention FaaS, they generally refer to technologies such as cloud functions. For convenience, we will refer to it this way temporarily.

Next, I will continue to introduce these software architectures from the perspective of technical evolution. Let’s start with the earliest monolithic architecture.

Monolithic Architecture #

In the early days, we used the monolithic architecture for software development. In the monolithic architecture, all the functions of the application are stored in a single code repository, and when released, the entire codebase is deployed.

In a monolithic architecture, the application software generally consists of four layers: presentation layer, business logic layer, data access layer, and database, as shown in the following figure:

Here is a brief introduction to each layer’s functionality:

  • Presentation Layer: It is used for direct user interaction, usually web pages or UI interfaces.
  • Business Logic Layer: It is used for processing business logic. It takes the parameters passed from the presentation layer, processes the business logic, and returns the results to the presentation layer.
  • Data Access Layer: It is used for accessing the database and typically includes CRUD operations on data. For example, querying user information from the database or adding a user record to the database.
  • Database: It is the physical medium for storing data. We access the data in the database through the data access layer.
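The four layers above map directly onto code. Here is a minimal, purely illustrative sketch in Python of a layered monolith (all names are hypothetical, and a dict stands in for a real database); the point is that every layer lives in one codebase and is deployed as a single unit:

```python
# Minimal sketch of a layered monolith. All layers live in one codebase
# and are deployed together. Names are illustrative, not from the lecture.

class Database:
    """Physical storage medium; a dict stands in for a real database."""
    def __init__(self):
        self.users = {}

class DataAccessLayer:
    """CRUD operations against the database."""
    def __init__(self, db):
        self.db = db
    def add_user(self, user_id, name):
        self.db.users[user_id] = {"id": user_id, "name": name}
    def get_user(self, user_id):
        return self.db.users.get(user_id)

class BusinessLogicLayer:
    """Processes parameters from the presentation layer, returns results."""
    def __init__(self, dal):
        self.dal = dal
    def register(self, user_id, name):
        if self.dal.get_user(user_id) is not None:
            raise ValueError("user already exists")
        self.dal.add_user(user_id, name)
        return self.dal.get_user(user_id)

def presentation_layer(logic, user_id, name):
    """Stand-in for a web page / UI: formats the result for the user."""
    user = logic.register(user_id, name)
    return f"registered user {user['name']} (id={user['id']})"

logic = BusinessLogicLayer(DataAccessLayer(Database()))
print(presentation_layer(logic, 1, "alice"))
```

Because the layers are only module boundaries, not deployment boundaries, any change anywhere requires redeploying the whole application, which is exactly the limitation the later architectures address.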

Monolithic architecture has clear advantages: application development is simple, the technology stack is unified, and testing and deployment are relatively straightforward, which makes it well suited to applications with low user traffic. Its disadvantages, however, are just as clear. As the project grows, monolithic architecture brings problems such as low development efficiency, long release cycles, difficult maintenance, poor stability, and lack of scalability. In addition, the technology stack of a monolithic application is hard to change; it can only be optimized incrementally on top of the existing foundation.

Service-Oriented Architecture (SOA) #

To overcome the various problems caused by the growth of code in monolithic architecture, SOA (Service-Oriented Architecture) emerged.

SOA is a service-oriented software architecture. Its core idea is to extract repeated and shared functions as components and provide services to various systems in a service-oriented manner through an ESB (Enterprise Service Bus), as shown in the following figure:

In the SOA architecture, there are two main roles: service providers and service consumers. Service consumers can call services such as purchasing products and applying for after-sales services by sending messages. These messages are transformed by the ESB and sent to the corresponding services to achieve communication between SOA services.
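The ESB's routing role can be sketched with a toy example (names and message shapes are purely illustrative; a real ESB also handles protocol translation, message transformation, and more):

```python
# Toy sketch of an ESB routing messages from consumers to providers.
# Service names and message shapes are illustrative only.

class EnterpriseServiceBus:
    def __init__(self):
        self.services = {}

    def register(self, name, handler):
        """Service providers register themselves on the bus."""
        self.services[name] = handler

    def send(self, message):
        """Route the message to the target service provider."""
        target = message["service"]
        if target not in self.services:
            raise KeyError(f"no provider registered for {target}")
        return self.services[target](message["payload"])

esb = EnterpriseServiceBus()
esb.register("purchase", lambda p: f"purchased {p['product']}")
esb.register("after_sales", lambda p: f"opened ticket for {p['order']}")

# A service consumer calls a service by sending a message through the bus:
print(esb.send({"service": "purchase", "payload": {"product": "book"}}))
```

The sketch also hints at the weakness noted below: every call funnels through one bus object, so the ESB is both a single point of failure and a potential performance bottleneck.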

SOA architecture is mainly suitable for scenarios where large software service enterprises provide services externally. It is not suitable for general business scenarios because defining, registering, and invoking SOA services require tedious coding or configuration. The ESB also tends to cause single point risks in the system and drag down overall performance.

Microservices Architecture #

In the Internet age, more and more companies have launched websites and applications for the general public. These companies do not have the ability or need to build and maintain an ESB. Therefore, based on the SOA architecture, the microservices architecture emerged.

The microservices architecture was introduced by Martin Fowler in 2014. Its idea is to thoroughly modularize the business system and split it into services, forming a collection of services or applications that can be developed, deployed, and maintained independently. Microservices communicate through lightweight mechanisms such as RESTful APIs, which helps them cope with faster demand changes and shorter development and iteration cycles, as shown in the following figure:

The microservices architecture was proposed relatively early but only gained popularity over the past few years. What are the reasons? On the one hand, based on its own characteristics, the microservices architecture can indeed solve some problems in other software architectures. On the other hand, cloud-native technologies such as Docker and Kubernetes have also developed in recent years, which can support the deployment and lifecycle management of microservices effectively.

In general, the microservices architecture has the following characteristics:

  • Microservices follow the single responsibility principle, with each microservice responsible for an independent context boundary.
  • The services provided by the microservices architecture use lightweight protocols such as RESTful, making them lighter-weight compared to ESB.
  • Each service has its own independent business development activities and lifecycle.
  • Microservices are generally deployed using container technology, running in their own separate processes, with the system resources they require allocated individually. This allows developers to formulate optimization plans for each service separately and improves system maintainability.

The microservices architecture has many advantages, but it also brings its own problems. Breaking an application down into many microservices means the sheer number of services introduces complications, such as complex service deployment. The distributed nature of microservices adds further complexity: services need discovery capabilities, request chains are hard to trace, and testing becomes more difficult. Services also form complex dependency chains, so a failure in one service can cascade along the chain, a failure mode often called a service avalanche.

Currently, there are some standard solutions in the industry to address these problems. For example, the complexity of microservice deployment can be resolved with technologies like Kubernetes, Helm, and CI/CD. As for the complexities introduced by the distributed nature of microservices, they can be tackled with the help of microservice development frameworks. Some well-known frameworks in the industry, such as Spring Cloud and Dubbo, have effectively addressed the aforementioned issues. Additionally, cloud-native technologies can also address challenges like complex tracing of request chains and troubleshooting of faults.
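As an illustration of the kind of resilience feature these frameworks provide, here is a minimal circuit-breaker sketch in Python (greatly simplified; real implementations also handle half-open probing, sliding failure windows, fallbacks, and so on):

```python
# Minimal circuit breaker: after `threshold` consecutive failures the
# circuit opens and further calls fail fast, instead of piling more
# requests onto a dead downstream service and triggering an avalanche.

class CircuitBreaker:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.threshold

    def call(self, func, *args):
        if self.open:
            raise RuntimeError("circuit open: failing fast")
        try:
            result = func(*args)
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # any success resets the failure count
        return result

def flaky():
    raise ConnectionError("downstream service unavailable")

breaker = CircuitBreaker(threshold=2)
for _ in range(2):
    try:
        breaker.call(flaky)
    except ConnectionError:
        pass
print("circuit open:", breaker.open)  # prints: circuit open: True
```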

Furthermore, in my day-to-day development experience, I often come across developers who confuse SOA architecture with microservice architecture. So, let me explain the similarities and differences between the two.

Microservice architecture is another implementation approach of the design philosophy behind SOA architecture, and this is where the two share similarities. As for the differences, there are mainly three points. Once you understand these three points, you will find it easier to distinguish between them in development:

  • In SOA, services can only belong to one application, whereas in microservices, services are independent and can be shared by multiple applications.
  • SOA emphasizes maximum sharing, while microservices emphasize minimum sharing.
  • In the SOA architecture, services communicate through ESB (Enterprise Service Bus), whereas in microservices, the communication between services is achieved through lightweight mechanisms like RESTful APIs.

Service Mesh #

When talking about microservices, I mentioned that some of the issues in microservice architecture can be resolved using microservice development frameworks such as Spring Cloud and Dubbo. However, there is another issue: these frameworks are usually invasive, meaning they impose certain language restrictions, such as being limited to Java, and require developers to follow specific development methodologies set by the framework. This contradicts the idea of microservices with independent technology stacks.

In late 2017, the advent of Service Mesh addressed this issue. Service Mesh is a non-invasive technology that provides functionalities such as network communication, traffic control, circuit-breaking, and service monitoring between services. Similar to the TCP/IP protocol, Service Mesh does not require application-level awareness, allowing developers to focus on developing the application itself. Therefore, Service Mesh aims to be an infrastructure layer that solves inter-service communication. It has the following characteristics:

  • Middleware layer for inter-application communication.
  • Lightweight network proxies.
  • Non-invasive, with no impact on application code.
  • Decouples service governance functionalities, such as retry, timeout, monitoring, tracing, service discovery, and the services themselves.

Service Mesh has gained popularity, and there are many excellent open-source Service Mesh projects in the community, such as Istio and Linkerd. Currently, the most popular open-source project in this space is Istio.

Istio is a fully open-source service mesh that inserts itself as a transparent layer into existing distributed applications, providing service governance and other functionality. It is also a platform with APIs that can integrate with any logging, telemetry, or policy system. The basic implementation principle of Istio is that a Sidecar component is injected alongside each service. Communication between services is first routed through the Sidecar, which then forwards the traffic to the other service. Because all traffic passes through a Sidecar, many functions can be implemented in it, such as authentication, rate limiting, and traffic management. Istio also has a control plane, which configures the Sidecars to implement the various service governance functions.

The latest version of Istio is 1.8. The architecture diagram of Istio 1.8 is as follows:

From the diagram, you can see that Istio mainly consists of two planes. One is the data plane, composed of Sidecars with the help of Envoy Proxy. The other is the control plane, mainly composed of three core components: Pilot, Citadel, and Galley. I will now introduce the functions of these three core components separately.

  • Pilot: Mainly used to manage the Envoy proxy instances deployed in the Istio service mesh, providing them with service discovery, traffic management, and resilience features such as A/B testing, canary releases, timeouts, retries, and circuit breaking.
  • Citadel: The core security component of Istio, responsible for managing service keys and digital certificates, providing functions such as automatic generation, distribution, rotation, and revocation of keys and certificates.
  • Galley: Responsible for providing support functions to other components of Istio. It can be understood as the configuration center of Istio. It is used to verify the format and content correctness of incoming network configuration information and provide these configuration information to Pilot.

FaaS Architecture #

In recent years, Serverless technology, represented by cloud functions, has gained popularity. With the development of Serverless technology, a new software development pattern called FaaS architecture has emerged.

The FaaS architecture provides a software architecture pattern that is more fragmented than microservices. Simply put, the FaaS architecture breaks down a complete business into individual functions for deployment, and the underlying functions are executed by triggering events.

Functions may call third-party components, such as databases and message queue services. In Serverless architecture, these third-party components are collectively referred to as BaaS (Backend as a Service). BaaS abstracts backend service capabilities into APIs for users to call, so users do not need to be concerned about the high availability, scaling, and other operational aspects of these backend components; they only need to use them.

The following is an illustration of the FaaS architecture:

From this diagram, you can see that users trigger services such as API gateways, COS (Cloud Object Storage), and CLS (Cloud Log Service) through browsers, mobile devices, mini-programs, and other clients. These trigger services, after receiving requests from users, will trigger the cloud functions they are bound to. The cloud functions will start multiple concurrent instances in real-time based on request volume and other data. When triggering a cloud function, parameters are also passed to the function, which can be used in the function to perform some business logic processing. For example, calling third-party services and saving the processing results in a backend database.
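A cloud function in this model is just an event handler. The sketch below uses the `(event, context)` handler signature common to FaaS platforms; the handler name and the field names inside the event are hypothetical, since each vendor defines its own event shapes:

```python
# Sketch of a FaaS-style handler. The platform invokes this function once
# per triggering event (e.g. an API gateway request) and passes the
# request parameters in `event`; `context` carries runtime metadata.
# Field names below are illustrative, not any vendor's actual schema.

def main_handler(event, context):
    # Business logic: read parameters from the triggering event ...
    name = event.get("queryString", {}).get("name", "world")
    # ... here the function could call BaaS components (database,
    # message queue) and save the processing results ...
    return {"statusCode": 200, "body": f"hello, {name}"}

# Simulate the platform triggering the function with a gateway event:
resp = main_handler({"queryString": {"name": "alice"}}, context={})
print(resp["body"])  # prints: hello, alice
```

Note that the platform, not the developer, decides when and how many instances of this function run; the code itself holds no server or process management logic.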

In my opinion, the FaaS architecture will not become mainstream in the future, but it will be more prevalent in the context of cloud functions. I say this because if the application is broken down into individual functions, the deployment, maintenance, and communication between these functions will pose a significant challenge. From the current perspective, there are no technologies or conditions to solve this challenge yet. In addition, the FaaS architecture is not suitable for carrying heavy business logic, and it is not yet ready for large-scale migration of enterprise application systems.

Application Lifecycle Management Technologies: Monitoring and Alerting, Logging, Tracing #

In Lecture 09 of this course, I detailed the evolution of application lifecycle management technologies. Here’s a quick summary: initially, the application lifecycle was managed primarily through development models, including the waterfall, iterative, and agile models. Then, to address some pain points in those development models, another management technology, CI/CD, emerged. As CI/CD matured, it gave rise to a further, more advanced management technology: DevOps.

If you have forgotten any of the details, you can review Lecture 09 for a refresher. Here, I will supplement some important application lifecycle management technologies that were not covered before, namely the following three:

  • Monitoring and alerting component: Prometheus;
  • Unified logging management framework: EFK (Elasticsearch, Filebeat, Kibana);
  • Tracing component: Jaeger.

It is important to note that these technologies do not have an evolutionary relationship with each other. Instead, they are on the same level and complement each other as application lifecycle management technologies.

Monitoring and Alerting Component: Prometheus #

Monitoring and alerting functionality is essential for applications as it allows developers or operations personnel to promptly detect program abnormalities and initiate timely repairs. In addition, monitoring can collect useful data for subsequent operational analysis. In the cloud-native technology stack, there are many excellent open-source monitoring and alerting projects, such as Zabbix and Prometheus. Among them, Prometheus is the most popular.

Prometheus is an open-source monitoring and alerting system with a built-in time-series database. Currently, Prometheus has become the standard monitoring and alerting system in Kubernetes clusters. It has the following characteristics:

  • Powerful multidimensional data model;
  • Flexible query language across multiple dimensions;
  • Does not depend on distributed storage; single server nodes are autonomous;
  • Collects time-series data through HTTP-based pull mechanism;
  • Supports pushing time-series data via the Push Gateway;
  • Obtains the target servers to be collected through service discovery or static configuration;
  • Supports various visualization charts and dashboards (Grafana).

The architecture of Prometheus is shown in the following diagram:

From the diagram, we can see that the main modules of Prometheus include Prometheus Server, Exporters, Pushgateway, Alertmanager, and the Grafana graphical interface. Some of these modules are optional, while others are mandatory. Most of the components are implemented in Golang. Let me introduce each of them separately.

  • Prometheus Server (mandatory): The core service of Prometheus, which regularly pulls monitoring data from jobs/exporters or the Pushgateway and saves time-series data in a TSDB (time-series database).
  • Client Library (mandatory): The client library for Prometheus allows application developers to easily generate metrics and expose an API interface for the Prometheus server to pull metrics data.
  • Pushgateway (optional): Accepts short-lived jobs’ pushed metrics data and caches it for the Prometheus server to pull periodically.
  • Exporters (optional): Runs as agents on application servers that need to collect monitoring data, collects application monitoring data, and provides an API interface for the Prometheus server to pull metrics data.
  • Alertmanager (optional): The alert component of Prometheus, it receives alerts from the Prometheus server, deduplicates and groups them, and sends alerts to the configured destination.
  • Grafana (optional): Grafana is a cross-platform, open-source visualization tool used for statistical analysis and display of Prometheus monitoring data. It also has built-in alerting capability and is developed using the Go language.

The general workflow of Prometheus is as follows:

  1. The Prometheus server periodically pulls metrics from configured jobs or exporters, or receives metrics from the Pushgateway, or pulls metrics from other Prometheus servers.
  2. The Prometheus server stores the collected metrics locally and runs predefined alert.rules to record new time series or send alerts to the Alertmanager.
  3. The Alertmanager processes the received alerts according to the configuration file and sends the alerts to the configured recipients.
  4. Grafana visualizes the collected data in the graphical user interface.

Prometheus stores all the collected sample data in-memory as time-series data and periodically saves it to the disk. Time-series data is stored in chronological order based on timestamps and values. Each time-series is named based on the metric name and a set of labels, as shown below:

<--------------- metric ---------------------><-timestamp -><-value->
http_request_total{status="200", method="GET"}@1434417560938 => 94355
http_request_total{status="200", method="GET"}@1434417561287 => 94334

http_request_total{status="404", method="GET"}@1434417560938 => 38473
http_request_total{status="404", method="GET"}@1434417561287 => 38544

http_request_total{status="200", method="POST"}@1434417560938 => 4748
http_request_total{status="200", method="POST"}@1434417561287 => 4785

Each point in a time-series is called a sample. A sample consists of the following three parts:

  • Metric: The metric name and a labelset that describes the current sample’s characteristics.
  • Timestamp: A timestamp accurate to milliseconds.
  • Sample value: A float64 floating-point data representing the value of the current sample.
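A sample line like the ones shown above can be decomposed into exactly those three parts. The parser below is a simplified illustration (it handles only the shape shown in this lecture, not the full Prometheus exposition format):

```python
import re

# Parse one sample of the form shown above:
#   metric_name{label="value", ...}@timestamp => value
SAMPLE_RE = re.compile(r'^(\w+)\{([^}]*)\}@(\d+)\s*=>\s*([\d.]+)$')

def parse_sample(line):
    """Split a sample line into metric name, labelset, timestamp, value."""
    m = SAMPLE_RE.match(line.strip())
    name, labels_raw, ts, value = m.groups()
    labels = dict(re.findall(r'(\w+)="([^"]*)"', labels_raw))
    return {"metric": name, "labels": labels,
            "timestamp": int(ts), "value": float(value)}

sample = parse_sample(
    'http_request_total{status="200", method="GET"}@1434417560938 => 94355')
print(sample["metric"], sample["labels"], sample["value"])
```

Here `metric` plus `labels` identify the time series, while `timestamp` and `value` identify one sample within it.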

Unified Logging Management Framework: EFK (Elasticsearch, Filebeat, Kibana) #

Once the monitoring and alerting service has detected a program abnormality, developers or operations personnel need to step in to troubleshoot it. The most effective way to troubleshoot is to check the logs. Therefore, an excellent logging system is also an essential feature of any application.

In a large distributed system, there are many components deployed on different servers. If the system fails, you need to check the logs for troubleshooting. At this time, you may need to log in to different servers and check the logs of different components. This process is very tedious and inefficient, and it also leads to longer troubleshooting time. The longer the downtime, the greater the losses to customers.

Therefore, in a large system, traditional log viewing methods are no longer able to meet our needs. At this time, we need a log solution specifically designed for distributed systems. Currently, there are many mature distributed log solutions in the industry, among which the most commonly used is the EFK log solution. It can even be said that EFK has become the de facto standard for distributed log solutions.

EFK consists of three open-source components: Elasticsearch, Filebeat, and Kibana. Below, I will introduce each of them:

  • Elasticsearch: Short for ES, it is a real-time, distributed search engine commonly used for indexing and searching large-scale log data, and it supports full-text and structured searches.
  • Filebeat: A lightweight data collection component that runs on servers that need to collect logs as an agent. Filebeat collects specified files and reports them to Elasticsearch. If the log volume is large, it can also be reported to Kafka, and other components can consume logs from Kafka and dump them to Elasticsearch.
  • Kibana: Used to visualize the log data stored in Elasticsearch, it supports advanced data analysis and display through charts.

The architecture diagram of EFK is as follows:

Filebeat collects logs of various service components on the servers and uploads them to Kafka. Logstash consumes logs from Kafka, filters them, and reports them to Elasticsearch for storage. Finally, Kibana is used to visualize these logs. Kibana retrieves log data by calling the API provided by Elasticsearch.

When the speed at which Filebeat produces logs does not match the speed at which Logstash consumes them, the Kafka service in between acts as a buffer, smoothing out the peaks and troughs in traffic.

Distributed Tracing Component: Jaeger #

In cloud-native architecture, applications commonly adopt microservices. An application contains multiple microservices that may call each other, which poses great challenges for troubleshooting. For example, when we encounter errors when accessing an application through the frontend, we have no idea which specific service or step has a problem. At this time, the application needs to have distributed tracing capability. Currently, there are also various distributed tracing systems in the industry, with Jaeger being the most widely used.

Jaeger is an open-source distributed tracing system launched by Uber that is compatible with the OpenTracing API. Before going further, let me introduce two concepts:

  • OpenTracing: It is an open-source tracing standard that provides a vendor-neutral, platform-neutral API to support developers in easily adding/changing implementations of tracing systems.
  • Distributed tracing system: Used to record information within the scope of a request; it is a powerful tool for troubleshooting system faults and performance problems. There are many distributed tracing systems, but they all share three core steps: code instrumentation, data storage, and query/display.

The architecture diagram of Jaeger is as follows:

Jaeger has seven key components, and I will introduce them in detail below:

  • instrument: The application code is instrumented with jaeger-client so that the application can report trace data to Jaeger.
  • jaeger-client: The client SDK of Jaeger, responsible for collecting and sending the trace data of the application to the jaeger-agent.
  • jaeger-agent: It receives and aggregates Span data and reports these data to jaeger-collector.
  • jaeger-collector: It collects trace information from the jaeger-agent and processes it through a processing pipeline before writing it to the backend storage. The jaeger-collector is a stateless component that can be horizontally scaled as needed.
  • Data Store: The backend storage component of Jaeger. Currently, it supports Cassandra and Elasticsearch.
  • jaeger-ui: The frontend interface of Jaeger, used to display the trace information and other details.
  • jaeger-query: Used to retrieve traces from the storage and provide them to jaeger-ui.
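To make the code-instrumentation step concrete, here is a toy span model showing how child spans inherit the trace ID of their parent, which is what lets the query side reassemble a whole request chain. This is a conceptual sketch only; a real application would use jaeger-client or another OpenTracing-compatible SDK rather than hand-rolling spans:

```python
import time
import uuid

# Toy span model: just enough to show how a trace is assembled from spans.
class Span:
    def __init__(self, operation, parent=None):
        self.operation = operation
        # All spans of one request share the root span's trace_id.
        self.trace_id = parent.trace_id if parent else uuid.uuid4().hex
        self.span_id = uuid.uuid4().hex
        self.parent_id = parent.span_id if parent else None
        self.start = time.time()
        self.duration = None

    def finish(self):
        self.duration = time.time() - self.start

# One request flowing through two "services":
root = Span("frontend.request")
child = Span("backend.query", parent=root)
child.finish()
root.finish()
print(root.trace_id == child.trace_id, child.parent_id == root.span_id)
```

Grouping spans by `trace_id` and linking them via `parent_id` is, conceptually, how jaeger-query reconstructs the call tree that jaeger-ui displays.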

Next, I will use a tutorial provided by the official Jaeger documentation, the All in One tutorial, to help you better understand Jaeger. It can be divided into two steps.

Step 1: Install the Jaeger service using jaeger-all-in-one:

$ wget https://github.com/jaegertracing/jaeger/releases/download/v1.25.0/jaeger-1.25.0-linux-amd64.tar.gz
$ tar -xvzf jaeger-1.25.0-linux-amd64.tar.gz
$ mv jaeger-1.25.0-linux-amd64/* $HOME/bin
$ jaeger-all-in-one --collector.zipkin.host-port=:9411

Step 2: Start an example application called HotROD to generate traces:

$ example-hotrod all  # the example-hotrod binary was already installed in Step 1

You can access http://$IP:16686/search to search for traces (where IP is the IP address of the server where Jaeger is deployed), as shown in the following screenshot:

After querying the list of traces, you can click on any trace to view its detailed call process, as shown in the following screenshot:

For specific instructions on how to use Jaeger to record traces, you can refer to the hotrod example provided by the Jaeger official documentation.

Summary #

Finally, let’s take an overall look at the evolution of cloud technology through the following diagram:

From this diagram, you can see that each technology does not exist in isolation, but rather they mutually promote each other. In the physical machine stage, we used waterfall development and monolithic architecture. In the virtual machine stage, agile development and SOA architecture were more commonly used. In the container stage, CI/CD development model and microservices architecture were used.

In the Serverless stage, the software architecture still adopts microservices, but in some trigger scenarios, functions of FaaS architecture may be written and deployed on FaaS platforms such as Tencent Cloud Function. The underlying system resources mainly use Serverless containers, combined with Kubernetes resource orchestration technology. Cloud functions may also be used in some trigger scenarios. Third-party services (BaaS) in the application also become more Serverless. Application lifecycle management technologies will evolve into the CI/CD/CO model, with CI/CD becoming more intelligent and automated.

In this diagram, the shaded area represents the stage we are currently in: container technology has been widely adopted, and the industry is actively exploring Serverless technology with fruitful results.

Homework Exercises #

  1. Get familiar with the declarative API mechanism of Kubernetes, and think about how software architecture might evolve after the microservices architecture.
  2. Set up a Prometheus service, generate some data, and configure Grafana to visualize the monitoring data.

Feel free to leave a message in the comment section to discuss and exchange ideas. See you in the next class.