
03 Memory Optimization Part 1: Discussing Memory Optimization in the Era of 4GB RAM #

Before writing this article today, I browsed through the “Miscellaneous Thoughts on Android Memory Optimization” I wrote on the WeMobileDev public account three years ago. Looking at it again today, one sentence in particular resonated with me: “We cannot explain every technique used in memory optimization, and with the changing versions of Android, many methods may become outdated.”

Three years have passed, and phones with 4GB of RAM have become mainstream. Does memory optimization become less important now? Which techniques have become outdated, and what skills should we upgrade?

Today, in the era of 4GB RAM, I will discuss the topic of “memory optimization” once again.

Development of Mobile Devices #

Facebook has an open-source library called device-year-class, which buckets a device's performance by the year its hardware specs were flagship-level. For example, a 2008 smartphone had a measly 140MB of RAM, while this year's Huawei Mate 20 Pro has reached 8GB.

Memory seems like a concept we are all familiar with, but how does mobile memory differ from PC memory? Does 8GB of memory necessarily beat 4GB? I suspect many people could not answer these questions correctly.

A mobile phone's random access memory (RAM) is the equivalent of the memory in our PCs: a temporary storage medium for the data used by apps while they run. For size and power-consumption reasons, phones do not use the DDR memory found in PCs; instead they use LPDDR RAM, short for "Low Power Double Data Rate RAM."

Taking LPDDR4 as an example, bandwidth is calculated as clock frequency × memory bus width ÷ 8, which equals 1600 × 64 ÷ 8 = 12.8GB/s. Because DDR memory transfers data on both clock edges (double data rate), the final bandwidth is 12.8 × 2 = 25.6GB/s.
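The arithmetic above can be captured in a small helper. This is a plain Java sketch; the 1600MHz clock and 64-bit bus width are the LPDDR4 figures from the text, and the method name is my own:

```java
public class LpddrBandwidth {
    /**
     * Peak bandwidth in GB/s: clock (MHz) x bus width (bits) / 8 bits-per-byte,
     * then doubled because DDR transfers data on both clock edges.
     */
    static double peakBandwidthGBps(int clockMhz, int busWidthBits) {
        double singleRateGBps = clockMhz * (busWidthBits / 8) / 1000.0; // MB/s -> GB/s
        return singleRateGBps * 2; // double data rate
    }

    public static void main(String[] args) {
        // LPDDR4: 1600MHz clock, 64-bit bus -> 25.6GB/s
        System.out.println(peakBandwidthGBps(1600, 64)); // 25.6
    }
}
```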

Currently, mainstream phones on the market mainly use LPDDR3, LPDDR4, and LPDDR4X as their RAM. As you can see, LPDDR4 delivers double the performance of LPDDR3, while LPDDR4X runs at an even lower operating voltage than LPDDR4, cutting power consumption by 20% to 40%. Of course, these are standard figures; different manufacturers ship low-frequency or high-frequency variants, and the higher-frequency versions perform better.

Is having more RAM in a mobile phone better?

If one phone uses 4GB of LPDDR4X memory and another uses 6GB of LPDDR3 memory, undoubtedly choosing the phone with 4GB of RAM would be more practical.

However, memory is not an isolated concept; it is related to factors such as the operating system and app ecosystem. Even with the same 1GB of memory, using the Android 9.0 system will provide smoother performance compared to the Android 4.0 system. Similarly, using the more closed and standardized iOS system will yield better performance compared to the more “wild” Android system. This year’s released iPhone XR and iPhone XS both use LPDDR4X memory, but they have only 3GB and 4GB of memory, respectively.

Memory Issues #

In the crash analysis mentioned earlier, I mentioned that “memory optimization” is a very important part of crash optimization work. Similar to OOM, many “abnormal exits” are actually caused by memory issues. So what kind of problems can memory cause?

  1. Two problems

The first problem caused by memory is abnormal exits. In the earlier crash analysis I mentioned the "abnormality rate," which covers crashes such as OOM and memory allocation failures, as well as the application being killed or the device being rebooted because overall memory is insufficient. I don't know whether you pay attention to this at work, but try splitting your users' devices into two groups, below 2GB of RAM and above 2GB, and computing their abnormality or crash rates separately; see how big the difference is.

The second problem caused by memory is jitter. A shortage of Java memory leads to frequent GC, which is especially noticeable on the Dalvik VM. The ART VM has made many optimizations to memory management and collection strategy, improving allocation and GC efficiency by a factor of 5 to 10. If you want to measure GC performance more precisely, such as pause times, total time consumed, and GC throughput, you can obtain the ANR log by sending the process a SIGQUIT signal.

adb shell kill -s QUIT <PID>
adb pull /data/anr/traces.txt

It contains some ANR dump information and detailed performance information about GC.

sticky concurrent mark sweep paused:    Sum: 5.491ms 99% C.I. 1.464ms-2.133ms Avg: 1.830ms Max: 2.133ms     // GC pause time

Total time spent in GC: 502.251ms     // Total time consumed by GC
Mean GC size throughput: 92MB/s       // GC throughput
Mean GC object throughput: 1.54702e+06 objects/s
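If you want to collect these figures automatically rather than reading traces.txt by hand, a simple regex pass over the dump is enough. A minimal sketch, assuming the ART log line formats shown above; the class and method names are my own:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcTraceParser {
    // Matches lines like "Total time spent in GC: 502.251ms"
    private static final Pattern TOTAL_GC =
            Pattern.compile("Total time spent in GC: ([0-9.]+)ms");

    /** Returns the total GC time in milliseconds, or -1 if the line is absent. */
    static double totalGcMs(String traces) {
        Matcher m = TOTAL_GC.matcher(traces);
        return m.find() ? Double.parseDouble(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String sample = "Total time spent in GC: 502.251ms\n"
                + "Mean GC size throughput: 92MB/s\n";
        System.out.println(totalGcMs(sample)); // 502.251
    }
}
```

The same approach extends to the pause-time and throughput lines; in an automated pipeline you would run the parser over dumps pulled from test devices after each run.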

In addition, we can also use systrace to observe the performance time of GC. This part will be explained in detail in the column later.

Besides frequent GC, there is another cause of jitter: when physical memory runs low, the system triggers the low memory killer mechanism, and the resulting high system load makes the whole device stutter.

  2. Two misconceptions

In addition to the abnormalities and jitters caused by memory, when doing memory optimization and architecture design in daily work, many students are also prone to two misconceptions.

Misconception 1: The less memory used, the better

Shortages of VSS, PSS, or Java heap memory can all cause abnormal exits and jitter. Some students treat memory as a monster and believe that the less an application uses, the better it performs. It is easy to take this view too far.

Whether an application uses too much memory depends on the device, the system, and the current conditions, not on an absolute number like 300MB or 400MB. When the system has memory to spare, we can use more of it to buy better performance; when the system is short of memory, we want to "allocate when needed, release promptly": as soon as system memory comes under pressure, quickly drop the various caches to relieve it.
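On Android, "release promptly" is usually wired to the framework's memory-pressure callback (ComponentCallbacks2#onTrimMemory). The sketch below mimics that contract in plain Java so it runs off-device; the trim-level constants mirror Android's values but are declared locally, and the cache policy is deliberately simplistic:

```java
import java.util.HashMap;
import java.util.Map;

public class TrimAwareCache {
    // Values mirror Android's ComponentCallbacks2.TRIM_MEMORY_* constants,
    // declared locally so this sketch compiles off-device.
    static final int TRIM_MEMORY_UI_HIDDEN = 20;
    static final int TRIM_MEMORY_COMPLETE = 80;

    private final Map<String, byte[]> cache = new HashMap<>();

    void put(String key, byte[] value) { cache.put(key, value); }
    int size() { return cache.size(); }

    /** Called when the system reports memory pressure; higher level = more pressure. */
    void onTrimMemory(int level) {
        if (level >= TRIM_MEMORY_UI_HIDDEN) {
            cache.clear(); // UI hidden or worse: release the whole cache at once
        }
    }

    public static void main(String[] args) {
        TrimAwareCache c = new TrimAwareCache();
        c.put("thumb", new byte[1024]);
        c.onTrimMemory(TRIM_MEMORY_UI_HIDDEN);
        System.out.println(c.size()); // 0
    }
}
```

A real implementation would distinguish more levels (for example, only evicting part of the cache below TRIM_MEMORY_COMPLETE), but the shape is the same: memory you can rebuild cheaply should be the first thing you give back.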

Nowadays, phones with 6GB and 8GB of memory have appeared, and the Android system also hopes to improve the utilization of memory, so it is necessary to briefly review the changes in Android Bitmap memory allocation.

  • Before Android 3.0, the Bitmap object lived on the Java heap while its pixel data lived in Native memory. If recycle was not called manually, reclaiming the Native memory relied entirely on the finalize callback, and as anyone familiar with Java knows, that timing is not very controllable.

  • Android 3.0 through Android 7.0 placed both the Bitmap object and its pixel data on the Java heap, so even without calling recycle, the Bitmap memory is freed along with the object. However, Bitmap is a memory-intensive resource, and keeping its pixels on the Java heap is not ideal. Even on the latest Huawei Mate 20, the maximum Java heap limit is only 512MB, so an OutOfMemoryError can occur when the Java heap runs out even with 5GB of physical memory free. Keeping Bitmaps on the Java heap also triggers a large number of garbage collections and fails to make full use of system memory.

  • Is there a way to keep Bitmap memory in Native, release it promptly along with the Java object, and still have it accounted for during garbage collection so it cannot be abused? NativeAllocationRegistry satisfies all three requirements at once. Android 8.0 uses this mechanism to help reclaim Native memory and stores pixel data in Native memory. Android 8.0 also introduced Hardware Bitmaps, which reduce image memory consumption and improve drawing efficiency.
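NativeAllocationRegistry itself is an Android-only API, but its contract is close to java.lang.ref.Cleaner in standard Java: a Java object is registered together with a release action for the native memory, and the action runs once the object becomes unreachable. A minimal desktop-Java analog; the freedBytes counter stands in for an actual native free() call, and clean() is invoked explicitly here only so the behavior is deterministic:

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicLong;

public class NativeRegistrySketch {
    private static final Cleaner CLEANER = Cleaner.create();
    // Stand-in for freed native bytes; a real registry would call a native free().
    static final AtomicLong freedBytes = new AtomicLong();

    static Cleaner.Cleanable register(Object owner, long nativeSize) {
        // The cleanup action must not capture 'owner', or it would never become unreachable.
        return CLEANER.register(owner, () -> freedBytes.addAndGet(nativeSize));
    }

    public static void main(String[] args) {
        Object bitmapLike = new Object();
        Cleaner.Cleanable cleanable = register(bitmapLike, 4096);
        cleanable.clean(); // normally triggered by GC; invoked explicitly here
        System.out.println(freedBytes.get()); // 4096
    }
}
```

The key property, which NativeAllocationRegistry also provides, is that the release action is decoupled from finalize and runs even if the developer never calls an explicit recycle.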

Misconception 2: Native memory doesn’t need to be managed

Although Android 8.0 moved Bitmap memory back to Native, does that mean we can freely use images?

The answer is definitely no. As mentioned earlier, when physical memory runs low, the Low Memory Killer (LMK) starts killing processes in priority order: background processes first, then the launcher and services, and finally foreground processes, with a full reboot as the worst case. Picture the scene: every process queued up by priority, waiting to be killed.

The design of the Low Memory Killer assumes that everyone follows the Android specifications, but it did not account for the realities of the Chinese app ecosystem. Many applications in China are like indestructible cockroaches: kill one and five more spring up. Frequent killing and relaunching of processes can also cause system_server to freeze. Of course, keeping apps alive has become much harder since Android 8.0, but there are still ways around it.

Since we are talking about storing image memory in Native: the familiar Fresco image library already does this on Dalvik, placing pixel data in Native memory. The same effect can in fact be achieved on Android 5.0 through Android 7.0, although the process is somewhat more involved.

Step 1: By directly calling the Bitmap constructor in libandroid_runtime.so, an empty Bitmap object can be obtained, with its memory stored in the Native heap. However, the implementation differs slightly between different Android versions, so adaptation is required.

Step 2: Create a normal Java Bitmap using system methods.

Step 3: Draw the content of the Java Bitmap onto the previously allocated empty Native Bitmap.

Step 4: Release the Java Bitmap, leaving only the Native-backed copy behind; in effect, a sleight-of-hand swap.

// Step 1: Allocate an empty Native Bitmap
Bitmap nativeBitmap = nativeCreateBitmap(dstWidth, dstHeight, nativeConfig, 22);

// Step 2: Allocate a normal Java Bitmap
Bitmap srcBitmap = BitmapFactory.decodeResource(res, id);

// Step 3: Use the Java Bitmap to draw the content onto the Native Bitmap
mNativeCanvas.setBitmap(nativeBitmap);
mNativeCanvas.drawBitmap(srcBitmap, mSrcRect, mDstRect, mPaint);

// Step 4: Release the Java Bitmap memory
srcBitmap.recycle();
srcBitmap = null;

Although the image memory does end up in Native memory, this "black technology" has two main issues: compatibility problems across Android versions, and the frequent allocation and release of Java Bitmaps, which easily causes memory churn.

Measurement Method #

In daily development we sometimes need to troubleshoot an application's memory problems. For the usage of system and application memory, you can refer to "Investigate RAM" on the Android Developers site. (link: Investigate RAM)

adb shell dumpsys meminfo <package_name|pid> [-d]
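The dumpsys meminfo output is also easy to consume programmatically, which helps when tracking memory across automated test runs. A sketch that pulls the total PSS out of a dump; the summary-line format assumed here is the "TOTAL" row where the first number is PSS in kB, and the class name is my own:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MeminfoParser {
    // Matches the summary row, e.g. "TOTAL   184738   151904 ..."; first number is PSS in kB
    private static final Pattern TOTAL = Pattern.compile("TOTAL\\s+(\\d+)");

    /** Returns total PSS in kB, or -1 if the summary row is absent. */
    static long totalPssKb(String dump) {
        Matcher m = TOTAL.matcher(dump);
        return m.find() ? Long.parseLong(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String sample = "                 Pss  Private\n    TOTAL   184738   151904\n";
        System.out.println(totalPssKb(sample)); // 184738
    }
}
```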

1. Java Memory Allocation

Sometimes we want to track the usage of Java heap memory, and the most commonly used tools for this are Allocation Tracker and MAT.

In my previous article “Android Memory Allocation Analysis” (link: Android Memory Allocation Analysis), I mentioned three disadvantages of Allocation Tracker:

  • The information it captures is too scattered and mixed with a lot of noise; much of it does not come from the application's own allocations, so locating a specific problem can take a lot of digging.

  • Like Traceview, it cannot be automated: a developer must start and stop it manually each time, which is inconvenient for some problems and makes batch analysis difficult.

  • Although Allocation Tracker has little performance impact while running, stopping it often freezes the process completely until the data has been dumped; if that takes too long, it can even trigger an ANR.

Therefore, we would like a custom "Allocation Tracker" that does not depend on Android Studio and can automate the analysis of object allocations. Such a tool can capture the size, type, and allocation stack of every object, and identify which objects occupy large amounts of memory over a given period.
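The real tool hooks the allocation-tracking entry points inside Dalvik/ART, but the bookkeeping it needs can be sketched in plain Java: record (type, size, stack) per allocation, then aggregate by type to surface the biggest consumers. All names here are hypothetical, and onAlloc is called manually in this sketch where the VM hook would normally drive it:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AllocTrackerSketch {
    static final class Record {
        final String type; final long bytes; final StackTraceElement[] stack;
        Record(String type, long bytes) {
            this.type = type; this.bytes = bytes;
            this.stack = Thread.currentThread().getStackTrace(); // allocation site
        }
    }

    private final List<Record> records = new ArrayList<>();

    /** In the real tool this would be driven by the VM's allocation hook. */
    void onAlloc(String type, long bytes) { records.add(new Record(type, bytes)); }

    /** Aggregates bytes per type, largest first. */
    List<Map.Entry<String, Long>> topConsumers() {
        Map<String, Long> byType = new HashMap<>();
        for (Record r : records) byType.merge(r.type, r.bytes, Long::sum);
        List<Map.Entry<String, Long>> out = new ArrayList<>(byType.entrySet());
        out.sort(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()));
        return out;
    }

    public static void main(String[] args) {
        AllocTrackerSketch tracker = new AllocTrackerSketch();
        tracker.onAlloc("android.graphics.Bitmap", 8_000_000);
        tracker.onAlloc("java.lang.String", 48);
        tracker.onAlloc("android.graphics.Bitmap", 2_000_000);
        System.out.println(tracker.topConsumers().get(0).getKey()); // android.graphics.Bitmap
    }
}
```

An online version would additionally need sampling or buffering to bound the overhead of recording stacks, which is exactly the compatibility and performance work discussed below.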

However, this method needs to consider compatibility issues. There are significant differences in the processing flow of Allocation Tracker between Dalvik and ART. The following are the ways to enable Allocation Tracker in Dalvik and ART:

// dalvik
bool dvmEnableAllocTracker()
// art
void setAllocTrackingEnabled()

We can use a custom "Allocation Tracker" to monitor Java memory, and it can be extended into real-time monitoring of Java memory leaks. If you have little experience in this area, don't worry: today's "homework" provides a custom "Allocation Tracker" for your reference. Note that a tool that only needs to support automated offline testing is relatively simple to implement; porting it for online use demands much more attention to compatibility, stability, and performance, and the effort is far higher than for a laboratory solution.

In the “homework,” we will provide a simple example. Once you are familiar with the implementation principles of various tools in Android Studio Profiler, you can make various custom modifications. There will also be many examples for your reference and practice in future articles.

2. Native Memory Allocation

Native memory analysis on Android has always been poorly served by tooling, but in recent versions Google has put a lot of effort into making the whole process easier.

First, Google deprecated Valgrind and recommends Chromium's AddressSanitizer instead. True to the principle that "whoever feels the pain does the optimizing," Chromium has produced a great many native tools. Android's earlier support for AddressSanitizer was poor, requiring root access and a lot of setup, but starting from Android 8.0 we can use AddressSanitizer by following the official guide. At present, AddressSanitizer's memory leak detection only supports x86_64 Linux and OS X hosts, but I believe Google will soon support detection directly on Android.

So is there a native-memory counterpart to Allocation Tracker? Android's current support here is not great, but Android Developers has recently added some relevant documentation; see "Debugging Native Memory Use." (link: Debugging Native Memory Use). It describes two approaches to native memory issues: "Malloc Debug" and "Malloc Hooks."

"Malloc Debug" helps us debug native memory issues such as heap corruption, memory leaks, and invalid addresses. Starting from Android 8.0 it works on non-rooted devices, but, like AddressSanitizer, it requires wrapping the app through wrap.sh.

adb shell setprop wrap.<APP> '"LIBC_DEBUG_MALLOC_OPTIONS=backtrace logwrapper"'

"Malloc Hooks" is a mechanism introduced in Android P that lets Android's libc intercept all allocation and free calls during program execution, so we can build custom memory-detection tools on top of it.

adb shell setprop wrap.<APP> '"LIBC_HOOKS_ENABLE=1"'

However, with "Malloc Debug" enabled the entire app becomes sluggish and sometimes even ANRs. How to automate the analysis of native memory allocations and leaks on Android is something I have been working on recently. From what I understand, WeChat has also experimented with native memory leak monitoring in the past few months, and I will cover it in detail in the next article.

Summary #

LPDDR5 will enter the mass production stage next year. Mobile memory has been evolving towards larger capacity, lower power consumption, and higher bandwidth. Along with the development of memory, the challenges and solutions for memory optimization are constantly changing. Memory optimization is an important part of performance optimization. Today I talked about many anomalies and stuttering caused by insufficient memory, and finally discussed how to analyze and measure memory usage in daily development.

A good developer is not satisfied with just completing the requirements. When designing a solution, we also need to consider how much memory to use and how to manage it. After completing the requirements, we should also check the memory usage to see if there are any improper usages or memory leaks.

Homework #

Memory optimization is a very "ancient" topic, and everyone is likely to run into all kinds of memory-related problems at work. Today's homework is to share the memory problems you have encountered in your work, and to summarize what you learned from practicing with the Sample.

In today’s article, I mentioned that I hope to implement a custom Allocation Tracker independent of Android Studio, so that it can be used in automated analysis. This issue’s Sample provides an example of implementing a custom Allocation Tracker, which is currently compatible with Android 8.1. You can use it to practice automated memory analysis, identify objects that consume a large amount of memory, and understand how they affect garbage collection, etc.

Feel free to click on “Share with Friends” to share today’s content with your friends and invite them to learn together. Finally, don’t forget to submit today’s homework in the comments section. I have also prepared a generous “Study Booster Pack” for students who complete the homework seriously. Looking forward to learning and making progress together with you.