
20 UI Optimization (Part 1): Several Key Concepts of UI Rendering #

Before we begin today’s lesson, I wish all the students a happy Chinese New Year, successful work, good health, and happiness for your family. Shao Wen is here to extend New Year greetings to you!

Every Android developer who works on UI must have been a fallen angel in their previous life.

Over the years, a group of long-suffering Android developers has wrestled with fragmentation. They have faced phones of every screen size and resolution, while also dealing with demanding product managers and UI designers. Day after day, year after year, they have poured their youth into UI adaptation and optimization work. Unfortunately, in the past two years this trend has only intensified: notch screens, full-screen displays, and the coming flexible foldable screens will make UI adaptation even more complex.

So what does UI optimization actually refer to? In my opinion, UI optimization should include two aspects: one is efficiency improvement, which means we can efficiently convert UI design into application interfaces and ensure that the UI interfaces are consistent on different sizes and resolutions of phones; the other is performance improvement, which means we need to ensure a smooth user experience while correctly implementing complex and fancy UI designs.

So how can we rescue ourselves from the endless UI adaptation?

Background Knowledge of UI Rendering #

What exactly is UI rendering? Android’s graphics rendering framework is very complex, and there are significant differences between different versions. But no matter what, they are all used to display the views or elements in our code on the screen.

As the hardware directly facing users, the screen is something manufacturers pay great attention to, including factors like thickness, color, power consumption, etc. From the small black and white screens of feature phones to the current large full-screen displays, let’s first take a look at the development history of mobile screens.

1. Screens and Adaptation

As consumers, we usually pay attention to a screen’s size, resolution, and thickness. Android’s fragmentation problem is distressing, and differences in screens are at the core of it: sizes range from 3 inches to 10 inches and resolutions from 320 to 1920 pixels, which makes UI adaptation very difficult.

In addition, the material of the screen is also a crucial factor. Currently, the mainstream screens of smartphones can be divided into two categories: LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).

The latest flagship phones, such as the iPhone XS Max and Huawei Mate 20 Pro, use OLED screens. Compared to LCD, OLED has advantages in color, flexibility, thickness, and power consumption, and it is precisely these advantages that make OLED the material of choice for full screens, curved screens, and future flexible foldable screens. For more details on the differences between OLED and LCD, you can refer to the articles “The Difference Between OLED and LCD” and “The Past and Present of Mobile Screens: More Exciting Than You Might Think”. Foldable screens are certainly the biggest highlight this year, but the unit cost of OLED is still much higher than that of LCD.

For the problem of screen fragmentation, Android recommends using dp as a unit for sizing in order to adapt UI. Therefore, every Android developer should be very clear about concepts like px, dp, dpi, ppi, and density.

Screen density
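To keep these concepts straight, it helps to remember the standard conversion: px = dp × (densityDpi / 160), where DisplayMetrics.density is exactly densityDpi / 160. Below is a minimal helper built only on standard framework APIs (the class and method names are my own):

```java
import android.content.Context;
import android.util.DisplayMetrics;

public final class DimenUtils {
    // Standard conversion: px = dp * (densityDpi / 160).
    // DisplayMetrics.density is precisely densityDpi / 160.
    public static int dpToPx(Context context, float dp) {
        DisplayMetrics metrics = context.getResources().getDisplayMetrics();
        return Math.round(dp * metrics.density);
    }
}
```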

By using dp and adaptive layouts, the problem of screen fragmentation can be basically solved, and it is the recommended screen compatibility adaptation solution for Android. However, it has two major problems:

  • Inconsistency. Because a device’s reported dpi may differ from its physical ppi, controls of the same dp size can end up with different physical sizes, even on phones with the same resolution.

  • Efficiency. Designers’ drafts are specified in px, so developers have to manually convert the values into dp for UI adaptation.

In addition to direct dp adaptation, several other UI adaptation methods are commonly used in the industry; the sketch below illustrates one of them.
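One widely discussed approach, popularized by Toutiao’s engineering team, rescales DisplayMetrics.density so that every device is treated as exactly designWidthDp dp wide, which makes converting a px design draft trivial. A rough sketch of the idea, with designWidthDp standing in for an assumed design-draft width:

```java
import android.app.Activity;
import android.util.DisplayMetrics;

public final class DensityAdapter {
    // Sketch of the "design-width" scaling idea: force the screen to be
    // exactly designWidthDp dp wide by overriding the density values.
    public static void applyDesignWidth(Activity activity, float designWidthDp) {
        DisplayMetrics appDm = activity.getApplication().getResources().getDisplayMetrics();
        DisplayMetrics actDm = activity.getResources().getDisplayMetrics();

        float targetDensity = appDm.widthPixels / designWidthDp;       // px per dp
        float targetScaledDensity =
                targetDensity * (appDm.scaledDensity / appDm.density); // preserve font scale

        actDm.density = targetDensity;
        actDm.scaledDensity = targetScaledDensity;
        actDm.densityDpi = (int) (160 * targetDensity);
    }
}
```

The trade-off is that every dp-based size in the Activity now scales strictly with screen width, which can distort system widgets; real implementations restore the original metrics where needed.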

2. CPU and GPU

In addition to the screen, UI rendering relies on two core hardware components: the CPU and the GPU. Before UI components can be displayed, they must go through rasterization, the process of breaking components such as buttons and text down into individual pixels, and this is a time-consuming operation. The GPU (Graphics Processing Unit) is designed for graphics processing and can accelerate rasterization.

In the software rendering path, Android uses the Skia library, a cross-platform 2D graphics framework that can render high-quality graphics even on low-end devices such as phones. Skia is also used internally by Chrome and Flutter.

3. OpenGL and Vulkan

For hardware rendering, we use the OpenGL ES interface to utilize the GPU for drawing. OpenGL is a cross-platform graphics API that specifies a standard software interface for 2D/3D graphics processing hardware. OpenGL ES is a subset of OpenGL designed specifically for embedded devices.

In the official hardware acceleration documentation, you can see that many APIs have corresponding Android API level limitations.

Why is that? This is mainly due to limitations of the OpenGL ES version and system support. Even in the latest release, Android P, three APIs remain unsupported. For unsupported APIs, we have to fall back to software rendering, which significantly reduces rendering performance.
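When a custom View depends on one of these unsupported operations, the documented escape hatch is to render just that View with the software pipeline via setLayerType(), and Canvas.isHardwareAccelerated() reports which pipeline is actually drawing. A minimal sketch (the class name and drawing logic are illustrative):

```java
import android.content.Context;
import android.graphics.Canvas;
import android.view.View;

// Illustrative custom View that falls back to software rendering because
// it relies on an operation the current API level does not accelerate.
public class FancyPathView extends View {
    public FancyPathView(Context context) {
        super(context);
        // Render only this View with the software pipeline.
        setLayerType(View.LAYER_TYPE_SOFTWARE, null);
    }

    @Override
    protected void onDraw(Canvas canvas) {
        super.onDraw(canvas);
        // Reports which pipeline is drawing this Canvas.
        boolean hardware = canvas.isHardwareAccelerated();
        // ... draw using the otherwise-unsupported operation here ...
    }
}
```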

Android 7.0 upgraded OpenGL ES to the latest 3.2 version and also added support for Vulkan. Vulkan is a low-overhead, cross-platform API for high-performance 3D graphics. Compared to OpenGL ES, Vulkan has significant advantages in improving power consumption, multi-core optimization, and rendering performance enhancement.

In China, the game “King of Glory” was one of the early adopters of Vulkan. Although some compatibility issues remain, the Vulkan version of “King of Glory” is noticeably smoother and more frame-stable; even in the most intense team battles it can hold a steady 55 to 60 fps.

Evolution of Android Rendering #

Just like power consumption, the UI rendering performance of Android is also something that Google has long been focused on. It is a topic that is often covered in Google I/O. Every developer hopes that their applications or games can achieve a silky smooth 60fps rendering, but compared to the iOS system, Android’s rendering performance has always been criticized.

In order to catch up with iOS, the Android system has made a lot of optimizations in every version. Before understanding Android’s rendering, it is necessary to first understand the overall architecture of the Android graphics system and its main components.

I once read a vivid analogy in an article. If we consider the process of application graphics rendering as a painting process, then the role of each graphic component in Android during this painting process is:

  • Brush: Skia or OpenGL. We can use Skia brush to draw 2D graphics, and use OpenGL to draw 2D/3D graphics. As mentioned earlier, the former uses CPU rendering, while the latter uses GPU rendering.

  • Canvas: Surface. All elements are drawn and rendered on this Surface canvas. In Android, a Window is a container for Views, and each window is associated with a Surface. The WindowManager is responsible for managing these windows and passing their data to SurfaceFlinger.

  • Easel: Graphic Buffer. The Graphic Buffer holds the application’s drawing output. Prior to Android 4.1, double buffering was used; from Android 4.1 onward, triple buffering is used.

  • Display: SurfaceFlinger. It takes all the Surfaces provided by the WindowManager, composites them via the Hardware Composer, and outputs the result to the display.

Next, I will use the method of analyzing the evolution of Android rendering to help you further understand Android rendering.

1. Android 4.0: Enabling Hardware Acceleration

Before Android 3.0, or when hardware acceleration is not enabled, the system would use software rendering for UI.

The entire software rendering process works as follows:

  • Surface: Each View is managed by a window, and each window is associated with a Surface.

  • Canvas: We obtain a Canvas through the Surface’s lockCanvas() call. The Canvas can be understood as a wrapper around the underlying Skia interfaces.

  • Graphic Buffer: SurfaceFlinger helps us manage a BufferQueue. We obtain the Graphic Buffer from the BufferQueue, and then use the Canvas and Skia to rasterize the drawing content onto it.

  • SurfaceFlinger: A buffer swap (Swap Buffer) submits the drawn front Graphic Buffer to SurfaceFlinger, and finally the Hardware Composer composites everything and outputs it to the display; a sketch of this cycle follows below.
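Below is a minimal sketch of that lock/draw/post cycle using a SurfaceView’s SurfaceHolder (real framework APIs; the class name, paint, and coordinates are illustrative):

```java
import android.graphics.Canvas;
import android.graphics.Color;
import android.graphics.Paint;
import android.view.SurfaceHolder;

final class SoftwareDrawer {
    // Lock a Graphic Buffer-backed Canvas, rasterize into it via Skia on
    // the CPU, then post it so SurfaceFlinger can composite and display it.
    static void drawFrame(SurfaceHolder holder, Paint paint) {
        Canvas canvas = holder.lockCanvas(); // dequeue a buffer from the BufferQueue
        if (canvas == null) return;          // surface not ready yet
        try {
            canvas.drawColor(Color.WHITE);
            canvas.drawText("Hello", 100f, 100f, paint);
        } finally {
            holder.unlockCanvasAndPost(canvas); // "swap": hand the buffer to SurfaceFlinger
        }
    }
}
```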

Isn’t the entire rendering process simple? But as I mentioned earlier, the CPU is not very efficient at graphics processing, and this process does not take advantage of the GPU’s high performance at all.

Hardware Acceleration Rendering

So starting from Android 3.0, Android began to support hardware acceleration, and it was enabled by default in Android 4.0.
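Hardware acceleration can be controlled at several levels, all of them documented framework switches: the android:hardwareAccelerated manifest attribute at the application and activity levels, a Window flag, and per-View layer types. For example, at the Window level (a minimal sketch; the Activity and its content are illustrative):

```java
import android.app.Activity;
import android.os.Bundle;
import android.view.View;
import android.view.WindowManager;

public class DemoActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // Enable hardware acceleration for this Window; per the docs,
        // this must be set before the window's content is drawn.
        getWindow().setFlags(
                WindowManager.LayoutParams.FLAG_HARDWARE_ACCELERATED,
                WindowManager.LayoutParams.FLAG_HARDWARE_ACCELERATED);
        setContentView(new View(this)); // illustrative content
    }
}
```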

The difference between hardware acceleration rendering and software rendering is very significant. The core is that we use the GPU to complete the rendering of the Graphic Buffer’s content. In addition, hardware rendering introduces the concept of a DisplayList. Each View has its own DisplayList, which is marked as dirty when a View needs to be redrawn.

When a redraw is needed, only the dirty View’s DisplayList has to be rebuilt, instead of invalidating recursively up the view hierarchy as in software rendering. This greatly reduces the number of drawing operations and improves rendering efficiency.

2. Android 4.1: Project Butter

Optimization never ends. At the 2012 Google I/O conference, Google announced Project Butter and officially shipped the mechanism in Android 4.1.

Project Butter mainly consists of two components, VSYNC and Triple Buffering.

VSYNC Signal

When I discussed file I/O and network I/O, I mentioned the concept of interrupts. In Android 4.0, the CPU could be so busy with other work that it failed to get to UI drawing in time.

To solve this problem, Project Butter introduced VSYNC, which is similar to a clock interrupt. When a VSYNC interrupt is received, the CPU immediately prepares the Buffer data. Since most display devices have a refresh rate of 60Hz (refresh 60 times per second), it means that the preparation work for one frame of data needs to be completed within 16ms.

With VSYNC, the application always begins drawing on a VSYNC boundary, and SurfaceFlinger always composites on a VSYNC boundary. This eliminates jank and improves the visual fluidity of graphics.
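Since Android 4.1, these VSYNC ticks are exposed to apps through Choreographer, so we can observe frame boundaries ourselves. A minimal frame-drop monitor sketch (the class name, log tag, and 16ms threshold are my own choices):

```java
import android.util.Log;
import android.view.Choreographer;

// Logs whenever the interval between two VSYNC-driven frames exceeds ~16ms,
// i.e. whenever at least one frame was dropped on a 60Hz display.
public final class FrameMonitor implements Choreographer.FrameCallback {
    private long lastFrameTimeNanos;

    public void start() {
        Choreographer.getInstance().postFrameCallback(this);
    }

    @Override
    public void doFrame(long frameTimeNanos) {
        if (lastFrameTimeNanos != 0) {
            long intervalMs = (frameTimeNanos - lastFrameTimeNanos) / 1_000_000;
            if (intervalMs > 16) {
                Log.w("FrameMonitor", "Janky frame, interval=" + intervalMs + "ms");
            }
        }
        lastFrameTimeNanos = frameTimeNanos;
        Choreographer.getInstance().postFrameCallback(this); // keep observing
    }
}
```

Note that Choreographer must be used from a thread with a Looper, typically the main thread.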

Triple Buffering Mechanism

Before Android 4.1, Android used a double buffering mechanism. How should we understand this? Generally speaking, different Views or Activities share the same Window, which means they share the same Surface.

Each Surface has a BufferQueue, which is managed by SurfaceFlinger and exchanges data with the app’s application layer through the anonymous shared memory mechanism.

The whole process is as follows:

  • Each surface has two graphic buffers internally, one for drawing and one for displaying. We draw the content to an off-screen buffer first, and then when it needs to be displayed, we copy the content of the off-screen buffer to the front graphic buffer through swap buffer.

  • In this way, SurfaceFlinger obtains the content that a certain surface will eventually display. However, at the same time, we may have multiple surfaces. These surfaces may be from different apps or they may belong to the same app, like SurfaceView and TextureView, which all have their own separate surfaces.

  • At this point, SurfaceFlinger hands all the content that needs to be displayed over to the Hardware Composer, which composites it into the final output according to position, Z-Order, and other information. The result is written to the system frame buffer (the frame buffer is very low-level and can be thought of as an abstraction of the physical display).

If you understand double buffering, triple buffering is easy to grasp. Suppose there are only two Graphic Buffers, A and B. If the CPU/GPU drawing takes longer than one VSYNC period, buffer B’s data is not ready in time, so the display can only keep showing buffer A. Now buffer A is held by the display device and buffer B by the GPU, and the CPU has no buffer left in which to prepare the next frame’s data.

If an additional buffer is provided, CPU, GPU, and the display device can work with their own buffers without interfering with each other. In simple terms, the triple buffering mechanism adds another graphic buffer to the double buffering mechanism, which maximizes the utilization of idle time. The downside is that it uses an additional graphic buffer, which occupies memory.

For a more detailed introduction to VSYNC signals and Triple Buffering, you can refer to the article “Android Project Butter Analysis”.

Data Measurement

“To do a good job, one must first sharpen one’s tools.” Project Butter not only optimized UI rendering performance but also helps us better identify UI-related issues.

In Android 4.1, Systrace was introduced as a performance data sampling and analysis tool. We have used Systrace many times in lag and startup optimization, and it can also be used to detect the rendering situation of each frame.
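For reference, a typical invocation looks like the following, run from the SDK’s systrace directory (the package name is a placeholder):

```sh
# Trace graphics, view, scheduling, and CPU-frequency events for 10 seconds,
# scoped to one app, and write the interactive report to trace.html.
python systrace.py -t 10 -o trace.html gfx view sched freq -a com.example.app
```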

Tracer for OpenGL ES is also a tool introduced in Android 4.1. It can record the drawing process of App using OpenGL ES frame by frame and function by function. It provides the time consumed by each OpenGL function call, making it useful for performance analysis. When Traceview and Systrace are not helpful in analyzing rendering problems, this tool comes in handy due to its powerful recording capabilities.

In Android 4.2, the system added a tool for detecting overdraw. For specific usage guidelines, please refer to “Inspect GPU rendering speed and overdraw”.
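Besides the Developer Options toggle, the same overdraw check can be flipped from the command line through a documented debug property:

```sh
# Tint the screen by overdraw level, then turn the visualization off again.
adb shell setprop debug.hwui.overdraw show
adb shell setprop debug.hwui.overdraw false
```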


3. Android 5.0: RenderThread

After Project Butter, Android’s rendering performance improved greatly. But have you noticed a problem? Although we are leveraging the GPU’s graphics computing power, the entire pipeline, from computing the DisplayList to drawing into the Frame Buffer, still runs on the UI (main) thread.


The UI thread ends up playing both mother and father, and the workload is simply too heavy. If any step of the rendering process is slow, user input may go unanswered and the app stutters. The GPU is far better at rendering graphics, so if drawing and rendering are moved off the main thread, the whole process becomes much smoother.

For this reason, Android 5.0 introduced two major changes. One is the RenderNode, which further encapsulates a View’s DisplayList and display properties. The other is the RenderThread, on which all GL commands are executed. The render thread keeps all the information needed to render a frame in RenderNodes and can drive property animations on its own, so these animations stay smooth even when the main thread is busy with time-consuming work.
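A concrete example of an animation the framework can drive on the RenderThread is the circular reveal added in Android 5.0 (a minimal sketch; the method and view names are illustrative):

```java
import android.animation.Animator;
import android.view.View;
import android.view.ViewAnimationUtils;

// Plays a circular reveal on the given view. The returned Animator is one
// the framework can run on the RenderThread, so it keeps playing smoothly
// even if the main thread briefly stalls afterwards.
static void reveal(View targetView) {
    Animator anim = ViewAnimationUtils.createCircularReveal(
            targetView,
            targetView.getWidth() / 2,   // center x
            targetView.getHeight() / 2,  // center y
            0f,                          // start radius
            (float) Math.hypot(targetView.getWidth(), targetView.getHeight())); // end radius
    anim.setDuration(300);
    anim.start();
}
```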

In the official documentation “Inspect GPU rendering speed and overdraw”, we can also enable the Profile GPU Rendering check. From Android 6.0 onward, it breaks down the time spent in each stage of computation and drawing.
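The same data can also be pulled from the command line (documented adb switches; the package name is a placeholder):

```sh
# Draw the per-frame timing bars on screen...
adb shell setprop debug.hwui.profile visual_bars
# ...or dump per-frame stage timings for a single app.
adb shell dumpsys gfxinfo com.example.app
```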


If we convert the above steps into a thread model, we get a pipelined model. After synchronizing data to the GPU, the CPU generally does not block waiting for the GPU to finish rendering; it returns as soon as the commands have been submitted. The RenderThread takes over most of the drawing work, relieving a great deal of pressure on the main thread and improving the responsiveness of the UI thread.


4. The Future

In Android 6.0, the system added more detailed information to dumpsys gfxinfo. Android 7.0 refactored parts of HWUI and added support for Vulkan, and Android P supports Vulkan 1.1. I believe better Vulkan support is an inevitable direction in the near future, perhaps in Android Q.

Overall, the optimization of UI rendering will inevitably proceed in two directions. One is to further squeeze the performance of the hardware to make the UI smoother. The other is to improve or add more analysis tools to help us more easily identify and locate problems.

Conclusion #

Today, through the evolutionary process of Android rendering, we have deepened our understanding of the Android rendering mechanism, which will greatly help us in our UI rendering optimization work.

However, everything has two sides. Although hardware-accelerated rendering greatly improves the display and refresh speed of the Android system, it also brings problems. One is memory consumption: OpenGL API calls and Graphic Buffers occupy at least several MB of memory, and in practice far more. The more serious problem, though, is compatibility. Some drawing operations are simply not supported, and worse still, the hardware-accelerated rendering pipeline itself has bugs. Because every Android version refactors parts of the rendering module, inexplicable problems often surface in certain scenarios.

For example, every application sees a certain number of crashes related to libhwui.so; at one point this crash accounted for more than 20% of our total crashes. We spent a whole month internally, running dozens of grayscale tests and using techniques such as Inline Hook and GOT Hook, before finally pinning the cause down to a bug in the system’s data synchronization between the RenderThread and the main thread, which we then solved with a workaround.

Homework #

People often say that the iOS system is smoother. How much do you know about Android UI rendering? In your daily work, what method do you use for UI adaptation, and what do you think is the biggest pain point in rendering? Feel free to leave a comment and discuss with me and other students.

I’m not very experienced in UI rendering either, so if you have better ideas and thoughts regarding what was mentioned in the article, please leave a comment and share your thoughts.

The Android rendering architecture is quite complex and evolves rapidly. If there are still areas you don’t fully understand, it is worth reading further on the topic.

Feel free to click “Share with friends” to share today’s content with your friends and invite them to study together. Also, do not forget to submit today’s homework in the comments section. I have prepared a generous “study encouragement package” for students who complete the homework seriously. Looking forward to learning and improving together with you!