03 Advanced Properties: How to handle production-level applications - skills you need to master #

Hello, I’m Jingyuan.

Through previous learning, we have mastered the basic features and operations of Function Compute, and can develop and maintain some simple Serverless business scenarios by configuring triggers.

However, these operations are far from enough for complex scenarios or applications that need to be deployed in production environments. Take the microservices we are familiar with as an example. From development through deployment and operation, we need to pay attention to code reuse, the flexibility of the development framework, gradual release during deployment, and disaster recovery measures in the production environment. We also need to consider how to handle traffic fluctuations between upstream and downstream services.

So, can we ignore these aspects in Serverless Function Compute? Even though Serverless means we no longer manage servers ourselves, if we want to use a Function Compute platform effectively, or develop our own FaaS-based Serverless platform, we should have a clear understanding of these factors in order to manage Serverless workloads well.

In today’s lesson, we will follow the order of “development-deployment-operation” and refer to the familiar microservices process to see what advanced skills you need to know in Function Compute. Let’s start with the development phase of an application.

How to extract common service capabilities? #

When developing microservices, complex functionality usually requires collaboration among multiple people or the reuse of existing libraries. Does Function Compute have similar features?

Layers were introduced to solve exactly this problem. You can extract the common libraries that a function depends on into layers, which reduces the size of the code package during deployment and updates.

Currently, most cloud providers support layers in various languages, such as Java, Python, Node.js, PHP, etc. For runtimes that support layer functionality, Function Compute adds a specific directory (e.g. /opt) to the search path for runtime language dependencies. You can customize layers or use pre-built common layers provided by cloud providers, which include some commonly used dependencies.

For custom layers, you usually need to package all the content into a ZIP file and upload it to the Function Compute platform. The Function Compute runtime will then unzip the contents of the layer and deploy them to a specific directory, such as the /opt directory.

After using layer functionality, there are three benefits:

  • The function package is smaller, focusing only on the core code, making development easier and more convenient.
  • Using layers can also avoid unknown errors in the process of creating function ZIP packages and dependencies.
  • Since layers can be imported and used in multiple functions, unnecessary waste of storage resources can be reduced.

Of course, different cloud providers have different ways of constructing layers. Let’s take Baidu Cloud as an example and try to build a Node.js runtime layer together.

First, let’s create a directory, e.g. mkdir my_nodejs_layer, and create a directory named “nodejs” inside that directory. The directory structure looks like this:

➜ my_nodejs_layer tree
.
└── nodejs

1 directory, 0 files

Next, assuming that the lodash JavaScript utility library is needed in our function code, we will package lodash into our custom layer. Go into the nodejs directory and execute npm install lodash. Let’s take a look at the directory structure again:

➜ my_nodejs_layer tree -d
.
└── nodejs
    └── node_modules
        └── lodash
            └── fp

4 directories

Finally, package the nodejs directory in the my_nodejs_layer directory:

zip -r nodejs_layer_lodash.zip nodejs

Now, our custom layer dependency is ready. Let’s take a look at how the layer works after being uploaded:

[Image: how an uploaded layer is stored and mounted into the function runtime]

After uploading the layer, Function Compute will upload the ZIP package of the layer to an object storage. When the function is invoked, the layer’s ZIP package will be downloaded from the object storage and extracted to the /opt directory. You only need to access the /opt directory in the code to read the layer’s dependencies and common code.

When a function references multiple layers, pay attention to their order: content from later layers overwrites files at the same paths from earlier layers.

So far, we have built and uploaded our own layers. From now on, when writing functions, we can use them by following the agreed-upon path. Isn’t it convenient? It is somewhat similar to the dependency packages of our microservices. Usually, we also upload custom libraries to a repository for subsequent developers to use more conveniently.

How to develop and deliver quickly? #

Layers indeed solve the problem of sharing common code and tool libraries. But you may still be curious: "In microservice development, as long as I follow the interface protocol between services, I can use whatever language and framework I am good at to build a module that provides a service. Does Serverless Function Compute have a similar capability that lets me build and run a service quickly and flexibly?"

The answer is yes. To let more developers build their own services quickly and flexibly, in addition to the standard runtimes you are already familiar with, Function Compute platforms usually also let you package an application as a custom image and run it.

Let's take Alibaba Cloud's custom images as an example to explain how this works. Before initializing an execution environment instance, the Function Compute system assumes the service role configured for the function, obtains temporary credentials, and pulls the image. After the pull succeeds, it starts the HTTP server you defined according to the specified startup command, arguments, and port. This HTTP server then takes over all requests dispatched by the Function Compute system.

Before developing the specific function logic, we generally need to decide whether we are building an event function or an HTTP function, because the platform backend usually implements these two invocation methods differently.

Now let’s take a closer look at the process of building and using custom images, including five steps.

[Image: the five steps of building and using a custom image]

First, before creating the function, we need to implement an HTTP Server based on our own business logic and package it as an image to upload to the image repository for use with function computing.

Different cloud vendors have different implementation methods, mainly reflected in the protocol adaptation of request and response interfaces and the way services are run. Regarding the implementation of the HTTP Server, there are four points to note:

  1. It is recommended to implement separate interface logic for HTTP-triggered and event-triggered invocations;
  2. Values in the request header and body need to be parsed and processed according to the interface protocol specification;
  3. When uploading the image, you need to specify the container’s startup command (Command), arguments (Args), and listening port (Port);
  4. It is recommended to implement a Healthz interface for health monitoring.

Second, we create a function. At this time, you only need to set some basic properties related to the function runtime, such as the timeout period, memory size, concurrency, etc., and then associate it with the image built in the first step.

Third, based on our own business scenario, select a suitable trigger method to request the function computing platform. For example, we can link an event source by creating and setting an HTTP trigger.

Fourth, make the request through the HTTP trigger. Since the custom image exists in the form of an HTTP Server, we usually set the listening IP, port (such as 0.0.0.0:9000), timeout period, HTTP parameters (such as Keep-alive), etc. It should be noted that if this is the first time starting, the function computing platform will pull the custom image from the image repository and start the container.

Finally, the function computing service controller will schedule the received request to the corresponding HTTP Server, which will take over the subsequent service processing.

In this way, we can achieve the ability to develop and deliver quickly through the custom image functionality, and the cost is also very low for transforming and migrating old services.

How to gracefully switch traffic? #

Clients often ask me how to perform a gray release of new function features, how to switch traffic smoothly, and how to run small-scale experiments before fully rolling out a feature. These are common questions for those new to FaaS. Next, let's take a look at how Function Compute achieves graceful traffic switching.

Here are two concepts: version and alias. Function compute supports creating aliases for published function versions. An alias can be understood as a pointer to a specific version. You can use aliases to easily achieve traffic switching, version rollback, and gray release. The following diagram illustrates this mechanism well.

[Image: using an alias as a pointer to published function versions]

Taking HTTP triggers as an example, without aliases, you would need to manually modify the version number associated with the HTTP trigger every time a new version is released, as shown in the above diagram.

In actual business development, you usually develop based on the code of the latest version, $LATEST. For a developed function, you can publish a version, such as Public Version 1. The function compute platform will record the published version information, and by configuring an alias to point to this version, it can be used to handle external requests.

Moreover, you can continue to develop on the $LATEST version. Similarly, after development is complete, simply publish a new version, Public Version 2.

If, at this point, we need to switch traffic to the newly released version, we only need to change the pointer of the alias, and users will be unaware of this process and there will be no loss of traffic.

If we are not confident in the new functional version, we can also set the traffic ratio between the main version and the new version. For example, we can set 95% of the traffic to the main version and 5% to the latest version. After confirming that everything is working fine, we can switch all traffic to the new version. If any issues arise, we can simply revert by changing the pointer back to the original version or by disabling the test traffic for the new version.
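The 95/5 split above can be sketched as a weighted alias: the alias holds a main version, an optional canary version, and a weight, and each request is routed by drawing a random number. The shape of this structure is an assumption for illustration; real platforms implement it inside their routing layer.

```javascript
// An alias points to a main version and can optionally shift a fraction
// of traffic to a newer version for a gray release.
function createAlias(mainVersion, canaryVersion = null, canaryWeight = 0) {
  return { mainVersion, canaryVersion, canaryWeight };
}

// Pick the version for one request. `rand` is injectable for testing
// and defaults to Math.random.
function routeRequest(alias, rand = Math.random) {
  if (alias.canaryVersion && rand() < alias.canaryWeight) {
    return alias.canaryVersion;
  }
  return alias.mainVersion;
}

// 95% of traffic to version 1, 5% to version 2.
const prod = createAlias('1', '2', 0.05);

// Switching all traffic later is just repointing the alias:
const promoted = createAlias('2');
```

Rollback is equally cheap: repoint the alias to the old version, or set the canary weight back to zero.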

This way, version iteration and upgrade can be gracefully completed. Aliases cannot exist independently from services or versions. When accessing a service or function using an alias, function compute resolves the alias to the version it points to, and callers do not need to know the specific version the alias points to. With version control and aliases, we can perform graceful traffic changes without affecting the online service.

How to smooth traffic and implement disaster recovery? #

Now that we have addressed the issue of smoothness during deployment, what about when functions are running? Have we also considered governance for services, similar to microservices?

Here, let’s focus on the issue of sudden traffic increases and abnormal service handling, which is a common topic in the context of microservices. This problem also needs attention in the Serverless field.

In microservices, the main way to handle traffic peaks is through asynchronous message queues. Of course, this only applies to scenarios that are not latency-sensitive, such as log ETL, batch processing, and event handling. For such scenarios, Serverless Function Compute platforms often introduce a message queue to absorb the traffic surge first, and then dispatch the requests to cloud functions through a backend scheduling system.

This approach not only avoids the situation where the backend resource pool is unable to scale in time due to a surge in traffic, leading to failures, but also allows the function computing platform to quickly respond to the requesting party, avoiding waiting. The function computing platform guarantees the reliable execution of requests.
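The peak-shaving idea in the last two paragraphs can be modeled with a simple in-memory queue: requests are acknowledged immediately, and a dispatcher drains the queue at a rate the backend resource pool can sustain. This is only a toy model of the real queue-plus-scheduler architecture; the class and batch size are illustrative.

```javascript
// Toy model: absorb a burst of events, then dispatch at a bounded rate.
class PeakShaver {
  constructor(handler, batchSize) {
    this.queue = [];
    this.handler = handler;     // the "cloud function" to invoke
    this.batchSize = batchSize; // how many events one scheduling round handles
  }

  // Enqueue returns immediately, like an async invocation acknowledgement.
  submit(event) {
    this.queue.push(event);
    return { accepted: true, pending: this.queue.length };
  }

  // One scheduling round: drain up to batchSize events to the function.
  // Returns how many events were dispatched.
  drainOnce() {
    const batch = this.queue.splice(0, this.batchSize);
    batch.forEach((event) => this.handler(event));
    return batch.length;
  }
}
```

A burst of five submissions with a batch size of two would be worked off over three scheduling rounds, which is exactly the decoupling of acceptance from execution described above.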

Since we mentioned the situation where the resource pool fails to scale up in time or when scheduling issues occur in a resource pool like a public cloud, let’s consider whether the function computing platform has the ability for disaster recovery.

From the platform’s perspective, cloud providers implement synchronous retries and delayed retries for asynchronous invocations. For asynchronous invocations they go further, introducing incremental (back-off) retry mechanisms and asynchronous policies that route failed events to a user-configured disaster recovery queue or disaster recovery function, to further ensure reliable message delivery and processing.

So, how does the platform achieve this? Generally speaking, cloud providers add a traffic forwarding and scheduling layer on top of the function compute engine. If an error is returned by a function instance, the scheduling layer can selectively re-dispatch the request based on user-configured policies.

From the user’s perspective, for scenarios that are not latency-sensitive, it is recommended to invoke functions asynchronously, and at the same time configure a disaster recovery queue or use chained function calls to handle exceptional situations.
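Putting the platform and user perspectives together, a retry loop with a fall-through to a disaster recovery queue might look like the following sketch. The retry count and the shape of the DR queue are assumptions for illustration; a real platform would also insert an increasing delay between attempts.

```javascript
// Invoke `fn` with up to maxRetries retries; on final failure, park the
// event in a disaster recovery (dead-letter) queue for later handling.
function invokeWithRetry(fn, event, maxRetries, drQueue) {
  let lastError = null;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return { ok: true, result: fn(event, attempt) };
    } catch (err) {
      lastError = err;
      // A real platform would wait here with an increasing delay
      // (incremental / exponential back-off) before the next attempt.
    }
  }
  // All attempts failed: hand the event to the disaster recovery queue,
  // where a DR function can inspect and reprocess it later.
  drQueue.push({ event, error: String(lastError) });
  return { ok: false };
}
```

A transient failure is absorbed by the retries, while a persistent one ends up in the DR queue instead of being silently lost.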

With this, we are almost at the end of what we want to discuss today. However, to achieve a mature governance model for microservices, both the Serverless function computing platform and the business still have a long way to go.

As FaaS acts as the "connector" and "glue" of the cloud-native era, it needs to interact with other services within a VPC when handling complex scenarios. Development and operations also require a well-built toolchain and observability support. In future lessons, I will explain these implementation mechanisms and practical strategies one by one.

Conclusion #

Finally, let me summarize today’s content. Although through the previous sections, you have already been able to use Function Compute in some simple scenarios, more advanced properties of Function Compute need to be further mastered for complex scenarios. These properties include layers, custom images, aliases and versions, and the use of asynchronous policies.

By providing layers, the Function Compute platform lets you easily encapsulate common code libraries and tool dependencies, solving the problems of code reuse and dependency management. Considering the needs of traditional developers, custom images not only improve development and delivery efficiency but also allow you to reuse existing CI/CD systems and legacy service modules to a certain extent, further enhancing productivity.

Considering that our services need to be deployed in a production environment, aliases and versions, an inseparable pair, come into play. Similar usage can be found in other cloud products as well, such as aliases in Elasticsearch.

When running services online, it is essential to consider disaster recovery and handling peak pressure. Although Function Compute can scale elastically, considering the scheduling speed of resource pools and recovery from unknown errors, as a "FaaSer" you should also learn to bring mature experience from microservices into the Function Compute and Serverless field.

If your job is limited to developing business code on a Function Compute platform, then after this section you should be familiar with the platform's usage and caveats, and have a deeper understanding of how to build a highly available and scalable function service.

If your role is a cloud platform developer, then this section is a good reference point for stepping into the field of Function Compute or further streamlining and supplementing the capabilities of Serverless platforms in the FaaS form.

Reflection Question #

Alright, this lesson comes to an end here, and I have left you with a reflection question.

In the Serverless Function Compute paradigm, how would you design advanced capabilities to handle long-running task scenarios?

Please feel free to write your thoughts and answers in the comments section. Let’s exchange and discuss together. Thank you for reading, and I welcome you to share this lesson with more friends for further discussion and progress.