37 How to Detect and Optimize the Overall Performance of Flutter Apps #

Hello, I am Chen Hang.

In the previous article, I shared with you three basic ways to debug Flutter code: logging, breakpoints, and layout debugging.

By using the debugPrint function, which allows for customizable printing behavior, we can achieve different log output behaviors in production and development environments, ensuring that debugging information printed during development will not be released to the live environment. With the breakpoint debugging options provided by the IDE (Android Studio), we can continuously adjust the code execution steps and pause conditions to narrow down the scope of the problem until we find the root cause. And if we want to find layout rendering bugs in the code, we can use the auxiliary lines and visual information provided by Debug Painting and Flutter Inspector to more accurately locate visual problems.

In addition to logic bugs and visual glitches, another common class of problems in mobile applications is performance: janky scrolling, page stuttering, and dropped frames. These issues may not make the app unusable, but they easily frustrate users, make them question the quality of the app, and can eventually drive them away.

So, if the app does not render smoothly, how can we detect the performance issue, and where should we start to address it?

In Flutter, performance problems can be categorized into GPU thread issues and UI thread (CPU) issues. To identify these issues, we need to perform preliminary analysis using performance layers, and once the problem is confirmed, we can use various analysis tools provided by Flutter to pinpoint the problem.

Therefore, in today’s article, I will guide you through the basic approach and tools for analyzing performance issues in Flutter applications, as well as common optimization methods.

How to use performance layers? #

To solve a problem, we first need to be able to measure it, and performance analysis is no exception. Flutter provides tools for measuring performance that help us quickly narrow down problems in the code. The performance layer is the first of these tools: it helps us determine the scope of the problem.

To use the performance layer, we need to start the app in profile mode. Logic bugs can be found in debug mode on a simulator, but performance issues must be measured on a real device, with a build that is compiled the same way as a release build.

This is because debug mode adds many extra checks (such as assertions) that consume a lot of resources. More importantly, debug mode runs the app with JIT compilation, which executes code less efficiently. As a result, an app running in debug mode does not reflect its real performance.

Simulators, for their part, often use the x86 instruction set, while real devices use the ARM instruction set. The same code compiles to different binaries on the two, so their performance characteristics differ significantly: operations that x86 handles well may run faster on the simulator, while others may run slower. This means we cannot use the simulator to evaluate performance issues that only appear on real devices.

To debug performance issues, the analysis tools need a small amount of tracing information on top of a release-style build. This is what profile mode provides. Apart from this tracing support, a Flutter app in profile mode is compiled and run much like one in release mode; only the startup parameter changes to “profile”. We can start the app in Android Studio by clicking Run -> Profile ‘main.dart’ in the menu bar, or from the command line with flutter run --profile.

Analyzing Rendering Issues #

After the application has started, we can use the rendering analysis tools provided by Flutter, namely Performance Overlay, to analyze rendering issues.

The Performance Overlay draws the execution charts of the GPU and UI threads on top of the running application, using Flutter’s own rendering. Each chart shows the performance of its thread over the last 300 frames. If the UI stutters or skips frames, these charts can help us analyze and find the cause.

The following image shows the appearance of the Performance Overlay. The performance of the GPU thread is shown on the top, and the performance of the UI thread is displayed below. The blue vertical bars represent frames that have already been rendered normally, and the green bar represents the current frame:
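As a minimal sketch, the overlay can also be switched on from code through the showPerformanceOverlay property of MaterialApp. The app name OverlayDemoApp and its placeholder page are assumptions made for illustration; only the flag itself matters here.

```dart
import 'package:flutter/material.dart';

void main() => runApp(const OverlayDemoApp());

// Hypothetical demo app; the only relevant line is showPerformanceOverlay.
class OverlayDemoApp extends StatelessWidget {
  const OverlayDemoApp({super.key});

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      // Draws the GPU and UI thread charts on top of the app.
      showPerformanceOverlay: true,
      home: Scaffold(
        appBar: AppBar(title: const Text('demo')),
        body: const Center(child: Text('Performance overlay enabled')),
      ),
    );
  }
}
```

Run it with flutter run --profile on a real device so that the charts reflect realistic timings.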

Figure 1: Performance Overlay

To maintain a refresh rate of 60Hz, each frame must take less than about 16.7ms (1/60 second) on both the GPU thread and the UI thread. If a frame takes longer than that, the interface lags, and a red vertical bar is displayed on the chart. The following image shows the appearance of the Performance Overlay when there are rendering and drawing delays:

Figure 2: Rendering and Drawing Delays

If the red vertical bar appears on the GPU thread chart, the scene is too complex to be rendered within the frame budget. If it appears on the UI thread chart, the Dart code is taking too long, and we need to reduce its execution time.
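The rule above can be sketched in plain Dart. The helper diagnoseFrame and its parameter names are hypothetical, introduced only to make the 60Hz budget and the GPU/UI distinction concrete; they are not part of Flutter.

```dart
// Frame budget for a 60Hz display, in milliseconds (1000 / 60).
const double frameBudgetMs = 1000 / 60;

/// Hypothetical helper: names the thread(s) that exceeded the frame budget.
String diagnoseFrame({required double uiMs, required double gpuMs}) {
  final uiSlow = uiMs > frameBudgetMs;
  final gpuSlow = gpuMs > frameBudgetMs;
  if (uiSlow && gpuSlow) return 'both';
  if (gpuSlow) return 'gpu'; // scene too expensive to render
  if (uiSlow) return 'ui'; // Dart code too slow
  return 'ok';
}

void main() {
  print(frameBudgetMs.toStringAsFixed(2)); // 16.67
  print(diagnoseFrame(uiMs: 4.0, gpuMs: 25.0)); // gpu
  print(diagnoseFrame(uiMs: 30.0, gpuMs: 6.0)); // ui
  print(diagnoseFrame(uiMs: 5.0, gpuMs: 7.0)); // ok
}
```

In a real app, per-frame numbers of this kind could come from the FrameTiming values (buildDuration and rasterDuration) reported to a callback registered with SchedulerBinding.instance.addTimingsCallback; the overlay itself needs no such code.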

Next, let’s take a look at GPU issue identification.

GPU Problem Localization #

GPU problems mainly concern the time spent in low-level rendering. Sometimes the Widget tree is cheap to build, yet rendering it on the GPU thread is expensive. Operations such as Widget clipping and overlay rendering, or repeatedly redrawing static images because they are not cached, significantly slow down GPU rendering.

We can use two parameters provided by the performance layers to check these two situations, namely checkerboardOffscreenLayers and checkerboardRasterCacheImages.

checkerboardOffscreenLayers #

Overlaying multiple views usually involves the saveLayer method in the Canvas, which is very useful for implementing specific effects (such as semi-transparency). However, because its underlying implementation involves repetitive drawing of multiple layers in GPU rendering, it can cause significant performance issues.

To check the usage of saveLayer, we only need to set the checkerboardOffscreenLayers switch to true in the MaterialApp constructor. The analysis tool will then automatically detect overlaid views: Widgets that use saveLayer are displayed with a checkerboard pattern that flickers as the page refreshes.
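A minimal sketch of turning the switch on is shown below. The half-transparent Opacity widget is an assumption chosen for illustration, since semi-transparency is one of the effects typically implemented through saveLayer; whether a given Widget actually triggers an offscreen layer may depend on the Flutter version and rendering backend.

```dart
import 'package:flutter/material.dart';

void main() {
  runApp(
    const MaterialApp(
      // Checkerboards layers that are rendered offscreen (saveLayer).
      checkerboardOffscreenLayers: true,
      home: Scaffold(
        body: Center(
          // Illustrative candidate for an offscreen layer.
          child: Opacity(
            opacity: 0.5,
            child: FlutterLogo(size: 200),
          ),
        ),
      ),
    ),
  );
}
```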

However, saveLayer is a lower-level drawing method, so we generally do not use it directly, but indirectly through some functional Widgets, in scenarios involving clipping or semi-transparent overlays. So once we encounter such a situation, we need to think about whether it is necessary to do so and whether it can be accomplished through other means.

In the following example, we use CupertinoPageScaffold and CupertinoNavigationBar to implement a dynamic blur effect.

CupertinoPageScaffold(
  navigationBar: CupertinoNavigationBar(), // Dynamic Blur Navigation Bar
  child: ListView.builder(
    itemCount: 100,
    itemBuilder: (context, index)=>TabRowItem(
      index: index,
      lastItem: index == 100 - 1,
      color: colorItems[index], // Set different colors
      colorName: colorNameItems[index],
    )
  ),
);

Figure 3 Dynamic Blur Effect

Because the overlay effect is updated continuously during scrolling, the GPU is under constant rendering pressure. The checkerboardOffscreenLayers detection layer makes this visible: it refreshes and flickers frequently.

Figure 4 Detection of saveLayer usage

If the dynamic blur effect is not a hard requirement, we can solve this performance problem by replacing it with a plain Scaffold and a white AppBar, which provide the same functionality without the blur.

Scaffold(
  appBar: AppBar(
    title: Text(
      'Home',
      style: TextStyle(color: Colors.black),
    ),
    backgroundColor: Colors.white
  ),
  body: ListView.builder(
    itemCount: 100,
    itemBuilder: (context, index)=>TabRowItem(
      index: index,
      lastItem: index == 100 - 1,
      color: colorItems[index], // Set different colors
      colorName: colorNameItems[index],
    )
  ),
);

After running the code, we can see that after removing the dynamic blur effect, the GPU rendering pressure is relieved, and the checkerboardOffscreenLayers detection layer no longer flickers frequently.

Figure 5 Removing the Dynamic Blur Effect

checkerboardRasterCacheImages #

From a resource perspective, another expensive type of operation is image rendering. It involves I/O, GPU memory, and conversion between data formats in different channels, so setting up the rendering pipeline consumes a lot of resources. To relieve the pressure on the GPU, Flutter provides multi-level snapshot caching during rendering, so that static images do not need to be redrawn when the Widget is rebuilt.

Similar to checkerboardOffscreenLayers, which highlights overlaid views, Flutter provides a switch called checkerboardRasterCacheImages for inspecting image caching. With it enabled, cached images are drawn with a checkerboard pattern; an image whose pattern flickers frequently while the interface redraws is being re-cached again and again, which means it is not benefiting from a static cache.

We can place content that should be cached statically inside a RepaintBoundary, which marks a boundary for repainting within the Widget tree. If the content is complex enough, the Flutter engine will cache it automatically to avoid repainting it. Because cache resources are limited, the engine may ignore the RepaintBoundary if it considers the content not complex enough.

The following code shows how to use RepaintBoundary to cache a static composite Widget. RepaintBoundary is used just like any ordinary Widget:
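As with the previous switch, checkerboardRasterCacheImages is set on MaterialApp. The sketch below is only illustrative: the FlutterLogo wrapped in a RepaintBoundary is an assumed stand-in for static content, and the engine decides on its own whether it is worth caching.

```dart
import 'package:flutter/material.dart';

void main() {
  runApp(
    const MaterialApp(
      // Checkerboards images held in the raster cache.
      checkerboardRasterCacheImages: true,
      home: Scaffold(
        body: Center(
          // Hypothetical static content offered as a caching candidate.
          child: RepaintBoundary(
            child: FlutterLogo(size: 200),
          ),
        ),
      ),
    ),
  );
}
```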

RepaintBoundary(
  child: Center(
    child: Container(
      color: Colors.black,
      height: 10.0,
      width: 10.0,
    ),
  ),
);

Troubleshooting UI Thread Issues #

While troubleshooting the GPU thread focuses on the underlying rendering engine, troubleshooting the UI thread is about finding performance bottlenecks in the application itself, such as complex computations in the build method during view construction, or synchronous I/O operations in the main isolate. These issues noticeably increase CPU processing time and slow down the application’s responsiveness.

In such cases, we can use the Performance tool provided by Flutter to record the application’s execution traces. It displays CPU call stacks and execution times on a timeline, so that we can inspect suspicious method calls in the code.

After clicking the “Open DevTools” button in the bottom toolbar of Android Studio, the Dart DevTools webpage will open automatically. Switch the top tab to Performance, and then we can start analyzing performance issues in the code.

Figure 6 Opening the Performance tool

Figure 7 Performance main interface

Next, we will demonstrate the analysis process using an example where MD5 is calculated within a ListView.

Assembling rendering information in the build function is a common operation, so to make the cost easy to see, we intentionally magnify the MD5 calculation by iterating it 10,000 times:

import 'dart:convert';

import 'package:convert/convert.dart'; // hex
import 'package:crypto/crypto.dart'; // md5
import 'package:flutter/material.dart';

class MyHomePage extends StatelessWidget {
  const MyHomePage({super.key});

  // Returns the hex-encoded MD5 digest of [data].
  String generateMd5(String data) {
    final content = utf8.encode(data);
    final digest = md5.convert(content);
    return hex.encode(digest.bytes);
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: const Text('demo')),
      body: ListView.builder(
        itemCount: 30, // Number of list items
        // List item creation method; deliberately slow for this demo.
        itemBuilder: (context, index) {
          // Iteratively calculate MD5 on the UI thread.
          String str = '1234567890abcdefghijklmnopqrstuvwxyz';
          for (int i = 0; i < 10000; i++) {
            str = generateMd5(str);
          }
          return ListTile(title: Text('Index : $index'), subtitle: Text(str));
        },
      ),
    );
  }
}

The performance layer records the application’s execution automatically, but recording code execution traces with the Performance tool has to be triggered manually. After enough samples have been collected, click the “Stop” button to finish recording. At this point, we have the application’s execution information.

The application’s execution recorded by Performance is called a CPU flame chart. It is generated based on the recorded code execution results and displays the CPU call stack, representing the CPU’s busy level.

The y-axis represents the call stack, where each layer corresponds to a function. The deeper the call stack, the higher the flame. The bottom of the chart represents the currently executing function, and the functions above it are its parent functions. The x-axis represents time, and the wider a function extends along the x-axis, the more samples were taken while it was running, indicating a longer execution time.

Therefore, to detect CPU-intensive code, we check which function occupies the greatest width at the bottom of the flame chart. A wide bar with no further calls beneath it (the “flat top” of a conventional flame graph) indicates that the function itself is consuming the time and may have a performance problem. In our example, the flame chart looks like this:
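To make the relationship between samples and width concrete, here is a small sketch in plain Dart. The function frameWidths and the sample data are hypothetical; this is not how the Performance tool is implemented, only an illustration of how a sampling profiler turns call-stack samples into bar widths.

```dart
/// Counts, for each function, how many samples contain it.
/// This count corresponds to the width of its bar in the chart.
Map<String, int> frameWidths(List<List<String>> samples) {
  final widths = <String, int>{};
  for (final stack in samples) {
    // toSet() so that recursion does not count a function twice per sample.
    for (final fn in stack.toSet()) {
      widths[fn] = (widths[fn] ?? 0) + 1;
    }
  }
  return widths;
}

void main() {
  // Hypothetical samples: parent first, currently executing function last.
  final samples = [
    ['build', 'itemBuilder', 'generateMd5'],
    ['build', 'itemBuilder', 'generateMd5'],
    ['build', 'itemBuilder', 'generateMd5'],
    ['build', 'itemBuilder'],
  ];
  final widths = frameWidths(samples);
  print(widths['generateMd5']); // 3
  print(widths['itemBuilder']); // 4
}
```

Here generateMd5 is the executing function in three of four samples, so it would fill about three quarters of the chart width.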

Figure 8 CPU flame chart

As we can see, the execution time of the _MyHomePage.generateMd5 function is the longest, almost filling the entire width of the flame chart, which is consistent with the issue in the code.

After identifying the issue, we can use Isolate (or compute) to move these time-consuming operations outside the main isolate for concurrent execution.
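The idea can be sketched in plain Dart with Isolate.run, assuming a Dart SDK recent enough to provide it (2.19 or later). The function expensiveHash is a hypothetical, dependency-free stand-in for the iterated MD5 calculation; it is not the MD5 algorithm.

```dart
import 'dart:isolate';

// Stand-in for the iterated MD5 work: a deliberately slow, pure function.
// A real app would call md5.convert from package:crypto here instead.
int expensiveHash(String input) {
  var hash = 0;
  for (var round = 0; round < 10000; round++) {
    for (final unit in input.codeUnits) {
      hash = (hash * 31 + unit + round) & 0x7fffffff;
    }
  }
  return hash;
}

Future<void> main() async {
  const input = '1234567890abcdefghijklmnopqrstuvwxyz';
  // Isolate.run spawns a worker isolate, runs the closure there, and
  // completes with its result, so the calling isolate is not blocked.
  final fromWorker = await Isolate.run(() => expensiveHash(input));
  // The same pure function on the calling isolate, for comparison.
  final fromMain = expensiveHash(input);
  print(fromWorker == fromMain); // true
}
```

In a Flutter app, compute(expensiveHash, input) from package:flutter/foundation.dart plays a similar role. Because the result arrives as a Future, the Widget that displays it has to handle the waiting state.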

Conclusion #

Well, that’s all for today’s sharing. Let’s summarize the main points of today’s content.

In Flutter, performance analysis can be divided into locating GPU thread problems and locating UI thread (CPU) problems. Both require launching the application in profile mode on a real device and using the performance layer to roughly identify rendering issues. Once an issue is confirmed, we use the analysis tools provided by Flutter to locate it.

For GPU thread rendering issues, we focus on checking whether multiple views are overlaid or static images are being redrawn repeatedly. For UI thread issues, we analyze code execution time and find the application’s bottleneck by using the CPU flame chart recorded by the Performance tool.

Generally speaking, Flutter adopts a declarative UI design in which rendering is driven by data, and it uses a three-layer structure: Widget->Element->RenderObject. This design filters out unnecessary interface refreshes and ensures that, in most cases, the applications we build perform well. Therefore, after detecting performance issues with the analysis tools, we usually don’t need to do much detailed optimization work. We just need to avoid common pitfalls during development to achieve excellent performance. For example:

  • Control the build method runtime, break down Widgets into smaller pieces, and avoid directly returning a huge Widget. This way, the Widgets will have finer-grained reconstruction and reuse.
  • Try not to use semi-transparent effects for Widgets; consider using images instead. This way, the obscured areas of the Widgets don’t need to be drawn.
  • For lists, use lazy loading instead of creating all the child Widgets at once. This reduces the initialization time of the views.
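The first and third points can be sketched together as follows. All names (ProfilePage, ProfileHeader, ProfileList) are hypothetical and serve only to show the shape of the code: small Widgets with const constructors, and a list built lazily.

```dart
import 'package:flutter/material.dart';

// Hypothetical page split into small Widgets instead of one large build().
class ProfilePage extends StatelessWidget {
  const ProfilePage({super.key});

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: const Text('Profile')),
      body: Column(
        // const children can be reused across rebuilds of ProfilePage.
        children: const [
          ProfileHeader(name: 'Guest'),
          Expanded(child: ProfileList()),
        ],
      ),
    );
  }
}

class ProfileHeader extends StatelessWidget {
  const ProfileHeader({super.key, required this.name});

  final String name;

  @override
  Widget build(BuildContext context) => ListTile(title: Text(name));
}

class ProfileList extends StatelessWidget {
  const ProfileList({super.key});

  @override
  Widget build(BuildContext context) {
    // Lazy loading: item Widgets are built only as they scroll into view.
    return ListView.builder(
      itemCount: 100,
      itemBuilder: (context, index) => ListTile(title: Text('Row $index')),
    );
  }
}
```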

Thought Question #

Finally, I have a thought question for you.

Please modify the example that calculates MD5 in the ListView: move the MD5 calculation into a separate Isolate (or use compute) while preserving the original functionality. Hint: You can use CircularProgressIndicator to display a loading animation while the calculation is running.

Feel free to leave a comment in the comment section and share your thoughts. I’ll be waiting for you in the next article! Thank you for listening, and you’re welcome to share this article with more friends to read together.