23 Package Size Optimization Part2 Progressive Practices in Resource Optimization

In the previous article, we discussed the optimization of Dex and Native Libraries. But perhaps you still feel there is more to learn about optimizing the installation package. So, what other areas can be optimized?

Optimization of Resources

Please take a look at the image above. Assets, Resources, and signature metadata are all part of the “resources” section of the installation package. Today, let’s explore how to further optimize the volume of these resources.

AndResGuard Tool #

In an article titled “Optimization Practices for Shrinking Android App Packages” on Meituan’s website, many optimization methods related to resources were discussed, such as WebP and SVG, R files, unused resources, resource obfuscation, and language compression.

In our installation package, there are several resource-related files that need to be optimized. They are shown below.

To use the AndResGuard tool effectively, a deep understanding of the installation package format and the principle of Android resource compilation is required. The tool mainly has two functions: resource obfuscation and extreme compression of resources.

Let’s first review the core implementation of this tool and then consider further optimizations.

1. Resource Obfuscation

ProGuard has three main functions: Shrink, Optimize, and Obfuscate. When I was developing AndResGuard, my goal was to bring ProGuard's Obfuscate function to resources.

The idea behind resource obfuscation is quite simple: rename resources and their file paths to short names and short paths:

Proguard          -> Resource Proguard
R.string.name     -> R.string.a   
res/drawable/icon -> res/s/a

Which resource files can benefit from this obfuscation implementation?

  • resources.arsc. Since the resource index file, resources.arsc, needs to record the names and paths of resource files, using obfuscated short paths like res/s/a can reduce the size of the entire file.

  • Metadata signature files. The signature files MF and SF need to record the paths and hash values of all files. Using short paths can reduce the size of these two files.

  • ZIP file index. The ZIP file format also needs to record the path, compression algorithm, CRC, and file size of each file entry. Using short paths can reduce the size of the strings that record file paths.

The defining characteristic of resource files is their sheer number. For example, the WeChat 7.0 installation package contains over 7,000 resource files. With that many entries, shortening every path noticeably reduces the size of resources.arsc, the signature files, and the ZIP index.
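The short-path idea can be sketched in plain Java. This is only an illustration under stated assumptions, not AndResGuard's actual implementation: the class `ShortPathMapper` and its methods are hypothetical, every path is assumed to start with `res/`, and the characters saved are counted for a single copy of each path (resources.arsc, MF/SF, and the ZIP index each store their own copy).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: maps resource paths to short obfuscated paths. */
final class ShortPathMapper {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    /** Returns the index-th short name in the sequence a, b, ..., z, aa, ab, ... */
    static String shortName(int index) {
        StringBuilder sb = new StringBuilder();
        int n = index;
        do {
            sb.append(ALPHABET.charAt(n % 26));
            n = n / 26 - 1;
        } while (n >= 0);
        return sb.reverse().toString();
    }

    /** Gives every directory and every file inside it a short name; extensions are kept. */
    static Map<String, String> obfuscate(List<String> paths) {
        Map<String, String> shortDirs = new LinkedHashMap<>();
        Map<String, Integer> filesPerDir = new LinkedHashMap<>();
        Map<String, String> result = new LinkedHashMap<>();
        for (String path : paths) {
            int slash = path.lastIndexOf('/');
            String dir = path.substring(0, slash);
            String file = path.substring(slash + 1);
            int dot = file.indexOf('.');               // first dot, so ".9.png" survives
            String ext = dot >= 0 ? file.substring(dot) : "";
            String shortDir = shortDirs.get(dir);
            if (shortDir == null) {
                shortDir = shortName(shortDirs.size());
                shortDirs.put(dir, shortDir);
            }
            int fileIndex = filesPerDir.merge(dir, 1, Integer::sum) - 1;
            result.put(path, "res/" + shortDir + "/" + shortName(fileIndex) + ext);
        }
        return result;
    }

    /** Characters saved for one stored copy of every path. */
    static int savedChars(Map<String, String> mapping) {
        int saved = 0;
        for (Map.Entry<String, String> e : mapping.entrySet()) {
            saved += e.getKey().length() - e.getValue().length();
        }
        return saved;
    }

    public static void main(String[] args) {
        Map<String, String> mapping = obfuscate(Arrays.asList(
                "res/drawable/icon.png", "res/drawable/bg.9.png", "res/layout/main.xml"));
        mapping.forEach((from, to) -> System.out.println(from + " -> " + to));
        System.out.println("saved chars per copy: " + savedChars(mapping));
    }
}
```

With several thousand files, a saving of roughly ten characters per path is repeated in every place the path is stored, which is why the effect adds up.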

Mobile optimization has reached the “deep water” area. Just as with Dex and Library optimization, we need a very deep understanding of the formats and their characteristics in order to find optimization ideas. The same goes for resource optimization: we need to understand resources.arsc, the signature files, and the ZIP format thoroughly.

2. Extreme Compression

Another optimization provided by AndResGuard is extreme compression, which has the following two aspects:

  • Higher compression ratio. Although the archive still uses the standard ZIP compression algorithm, leveraging the large dictionary optimization of 7-Zip increases the overall compression ratio of the APK by about 3%.

  • Compress more files. During the Android compilation process, files in the following formats are designated as not to be compressed. In AndResGuard, we support forced compression for resources.arsc, PNG, JPG, and GIF files.

/* these formats are already compressed, or don't compress well */
static const char* kNoCompressExt[] = {
    ".jpg", ".jpeg", ".png", ".gif",
    ".wav", ".mp2", ".mp3", ".ogg", ".aac",
    ".mpg", ".mpeg", ".mid", ".midi", ".smf", ".jet",
    ".rtttl", ".imy", ".xmf", ".mp4", ".m4a",
    ".m4v", ".3gp", ".3gpp", ".3g2", ".3gpp2",
    ".amr", ".awb", ".wma", ".wmv", ".webm", ".mkv"
};

Here, you might wonder why the Android system specifically chooses not to compress these files.

  • Compression effect is not significant. Most of these file formats have already been compressed, so re-compressing them with the Zip algorithm does not yield a significant effect. For example, re-compressing PNG and JPG formats only yields a 3% to 5% improvement, which is not very significant.

  • Consideration of reading time and memory. If a file is not compressed, the system can directly read it using mmap, without the need to decompress it all at once and store it in memory.
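The first point can be checked with a small experiment using the JDK's `java.util.zip.Deflater`. This is a rough sketch under assumptions: the class `CompressionGain` is hypothetical, and seeded random bytes merely stand in for the payload of an already-compressed file, so the numbers will not match real PNG or JPG files exactly.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

/** Hypothetical sketch: compares Deflate gains on text-like and already-compressed-like data. */
final class CompressionGain {

    /** Returns the number of bytes produced by Deflate at the highest compression level. */
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    /** Highly redundant data, similar in spirit to XML or resources.arsc string pools. */
    static byte[] repetitiveText(int size) {
        byte[] pattern = "<item name=\"title\">Hello</item>\n".getBytes(StandardCharsets.US_ASCII);
        byte[] data = new byte[size];
        for (int i = 0; i < size; i++) {
            data[i] = pattern[i % pattern.length];
        }
        return data;
    }

    /** Random bytes: an assumption standing in for an already-compressed payload. */
    static byte[] randomBytes(int size, long seed) {
        byte[] data = new byte[size];
        new Random(seed).nextBytes(data);
        return data;
    }

    public static void main(String[] args) {
        int size = 100_000;
        System.out.println("text-like: " + size + " -> " + deflatedSize(repetitiveText(size)));
        System.out.println("compressed-like: " + size + " -> " + deflatedSize(randomBytes(size, 42L)));
    }
}
```

The text-like input shrinks to a small fraction of its size, while the random input does not shrink at all. This is the reason the build tools skip those extensions by default.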

Starting from Android 6.0, the AndroidManifest supports leaving library files uncompressed. With the following setting, the library files do not need to be extracted when the APK is installed, and the system can directly use mmap to access the library files inside the installation package.

android:extractNativeLibs="false"

In simple terms, we have made a trade-off between startup performance, memory, and installation package size. As I mentioned in the previous article, for Dex and Library, the most effective method is to use XZ or 7-Zip compression. The same goes for resources. For some larger resource files, we can also consider using XZ compression, but they need to be decompressed during the first launch.

Advanced Optimization Methods #

After learning about the implementation principles of obfuscation and compression in the AndResGuard tool, we can deepen our understanding of the package format and the principles of Android resource compilation.

However, AndResGuard is a product from a few years ago. So what are the new advanced optimization methods now?

1. Resource Merge

In the resource obfuscation scheme, we found that the path of the resource files has an impact on resources.arsc, signature information, and ZIP file information. Moreover, due to the large number of resource files, this part has a significant size.

So, can we merge all the resource files into a single large file? This approach will definitely be more effective than the resource obfuscation scheme.
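To make the idea concrete, here is a minimal in-memory sketch of such a merged file. Everything in it is an assumption made for illustration: the class `BigResourceFile`, the offset/length index layout, and the use of a plain `ByteBuffer` (a real implementation would more likely use a `MappedByteBuffer` over a file on disk).

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: many small resources packed into one blob plus an offset index. */
final class BigResourceFile {
    private final ByteBuffer blob;            // stand-in for an mmap-ed MappedByteBuffer
    private final Map<Integer, int[]> index;  // resource id -> {offset, length}

    private BigResourceFile(ByteBuffer blob, Map<Integer, int[]> index) {
        this.blob = blob;
        this.index = index;
    }

    /** Concatenates all resources and records where each one starts and how long it is. */
    static BigResourceFile pack(Map<Integer, byte[]> resources) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Map<Integer, int[]> index = new LinkedHashMap<>();
        for (Map.Entry<Integer, byte[]> e : resources.entrySet()) {
            byte[] data = e.getValue();
            index.put(e.getKey(), new int[] {out.size(), data.length});
            out.write(data, 0, data.length);
        }
        return new BigResourceFile(ByteBuffer.wrap(out.toByteArray()).asReadOnlyBuffer(), index);
    }

    /** Reads one resource on demand without touching the rest of the blob. */
    byte[] read(int resId) {
        int[] entry = index.get(resId);
        if (entry == null) {
            throw new IllegalArgumentException("Unknown resource id: 0x" + Integer.toHexString(resId));
        }
        byte[] data = new byte[entry[1]];
        ByteBuffer view = blob.duplicate();   // independent position, shared content
        view.position(entry[0]);
        view.get(data);
        return data;
    }

    int entryCount() {
        return index.size();
    }

    public static void main(String[] args) {
        Map<Integer, byte[]> resources = new LinkedHashMap<>();
        resources.put(0x7f020000, new byte[] {1, 2, 3});
        resources.put(0x7f020001, new byte[] {4, 5});
        BigResourceFile file = pack(resources);
        System.out.println("entries: " + file.entryCount()
                + ", second resource length: " + file.read(0x7f020001).length);
    }
}
```

From the point of view of the APK there is now a single ZIP entry and a single signed file, regardless of how many resources the blob holds.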

In fact, most skin-changing schemes also use this approach. The large resource file is equivalent to a set of skins. Therefore, we can promote this scheme, but there are still many problems to solve in the implementation.

  • Resource Parsing. We need to simulate the system to implement the parsing of resource files, such as converting PNG, JPG, and XML files into Bitmap or Drawable. In this way, the method of obtaining resources needs to be changed to our custom method.

    // The system default way
    Drawable drawable = getResources().getDrawable(R.drawable.loading);

    // The new way
    Drawable drawable = CustomResManager.getDrawable(R.drawable.loading);

Why don’t we directly put all these parsed Drawables into the system cache like SVG does? In this way, the code does not need to be modified too much. The reason for not doing this is mainly considering the impact on memory. If we parse all the resource files and put them into the system cache at once, it will occupy a very large amount of memory.

  • Resource Management. Considering memory and startup time, all resources are loaded on demand. We only need to use mmap to load the “Big resource File”. At the same time, we need to implement our own resource cache pool ResourceCache to release unused resource files. You can refer to the implementation of similar libraries like Glide for this part.
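An on-demand cache of this kind can be sketched with the JDK's access-ordered `LinkedHashMap`. This is an assumption-laden illustration rather than the article's or Glide's implementation: the generic `ResourceCache` below evicts by entry count only, whereas a production cache would more likely evict by memory size.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: a small LRU cache that loads resources on first use. */
final class ResourceCache<V> {
    private final int maxEntries;
    private final LinkedHashMap<Integer, V> entries;

    ResourceCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder = true keeps the least recently used entry at the head of the map.
        this.entries = new LinkedHashMap<Integer, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, V> eldest) {
                return size() > ResourceCache.this.maxEntries;
            }
        };
    }

    /** Returns the cached value, or loads it (for example from the big resource file). */
    synchronized V get(int resId, Function<Integer, V> loader) {
        V value = entries.get(resId);
        if (value == null) {
            value = loader.apply(resId);
            entries.put(resId, value);
        }
        return value;
    }

    synchronized boolean isCached(int resId) {
        return entries.containsKey(resId);
    }

    synchronized int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        ResourceCache<String> cache = new ResourceCache<>(2);
        cache.get(1, id -> "drawable#" + id);
        cache.get(2, id -> "drawable#" + id);
        cache.get(1, id -> "drawable#" + id);   // touch 1 so that 2 becomes the eldest
        cache.get(3, id -> "drawable#" + id);   // evicts 2
        System.out.println("1 cached: " + cache.isCached(1) + ", 2 cached: " + cache.isCached(2));
    }
}
```

The loader is passed in per call so that the cache stays independent of how resources are parsed, which keeps the parsing and management concerns from the two bullets above separate.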

When I was reverse-engineering Facebook’s app, I also found that their resource and multilingual management processes were completely based on their own workflow. In the section on “UI optimization”, I mentioned that we tried many optimizations under the framework of the system. However, we gradually realized that this approach still depends on various constraints of the system. At this time, we need to consider breaking through the limitations of the system and taking over all processes.

Of course, we also need to find a balance between performance and efficiency, depending on whether we currently focus more on performance improvement or development efficiency.

2. Unused Resources

The resource obfuscation implemented in AndResGuard mirrors ProGuard’s Obfuscate. Can we also implement resource shrinking, the counterpart of ProGuard’s Shrink? Over long-term iteration, an application always accumulates some unused resources. They are never used at runtime, yet they still take up space in the installation package.

In fact, the Android team considered this situation long ago. Let’s take a look at how the solution for unused resources has evolved.

Phase 1: Lint

Since the Eclipse era, we have been using the static code scanning tool Lint. It supports scanning for unused resources.

Then we can directly select “Remove All Unused Resources” to easily delete all unused resources. Since it is the solution of the first phase, what are the specific shortcomings of the Lint solution?

As a static scanning tool, Lint’s biggest problem is that it does not take ProGuard’s code shrinking into account. ProGuard removes a large amount of unused code, but Lint still treats the resources referenced by that code as used, so it cannot report them as unused.

Phase 2: shrinkResources

Therefore, in the second phase, Android added the “shrinkResources” resource shrinking feature, which must be used together with the “minifyEnabled” code shrinking feature of ProGuard.

If ProGuard removes some unused code, the resources referenced only by that code are marked as unused. The resource shrinking feature can then remove them.

android {
    ...
    buildTypes {
        release {
            shrinkResources true
            minifyEnabled true
        }
    }
}
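The difference between the Lint phase and the shrinkResources phase comes down to which classes count as live when resource references are collected. The sketch below is a simplified model, not the real ResourceUsageAnalyzer: the class `UnusedResourceFinder`, the string-based resource names, and the precomputed reference map are all assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: resources are unused if no retained class references them. */
final class UnusedResourceFinder {

    static Set<String> findUnused(Set<String> allResources,
                                  Map<String, Set<String>> refsByClass,
                                  Set<String> retainedClasses) {
        Set<String> used = new HashSet<>();
        for (String cls : retainedClasses) {
            used.addAll(refsByClass.getOrDefault(cls, Collections.<String>emptySet()));
        }
        Set<String> unused = new TreeSet<>(allResources);
        unused.removeAll(used);
        return unused;
    }

    public static void main(String[] args) {
        Set<String> all = new HashSet<>(Arrays.asList("layout/main", "drawable/logo", "drawable/legacy"));
        Map<String, Set<String>> refs = new HashMap<>();
        refs.put("MainActivity", new HashSet<>(Arrays.asList("layout/main", "drawable/logo")));
        refs.put("LegacyActivity", new HashSet<>(Arrays.asList("drawable/legacy")));

        // Lint-style view: every class in the source tree counts as live.
        System.out.println("before shrinking: " + findUnused(all, refs, refs.keySet()));
        // After code shrinking removed LegacyActivity, only MainActivity is retained.
        System.out.println("after shrinking:  "
                + findUnused(all, refs, Collections.singleton("MainActivity")));
    }
}
```

In this model the resource referenced only by the removed class shows up as unused in the second call but not in the first, which is exactly the gap that Lint leaves open.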

Doesn’t it look perfect? But there are a few defects in the current implementation of shrinkResources.

  • resources.arsc file is not handled. This means that a large number of unused resources such as String, ID, Attr, Dimen, etc. are not being deleted.

  • Resource files are not really deleted. For unused resources like Drawable, Layout, shrinkResources does not actually delete them, but replaces them with an empty file. Why can’t they be deleted? The main reason is that the paths for these files are still in resources.arsc. You can check this issue for more details.

So even though our app has a lot of unused resources, the system’s current approach does not really reduce the file count. As a result, the biggest contributors, namely resources.arsc, the signature information, and the ZIP file information, remain unchanged.

But why doesn’t Studio really delete these resources? In fact, Android is aware of this issue, as stated in the comments of its core implementation, ResourceUsageAnalyzer, and it tries to address this problem by providing two approaches.

To answer why the system cannot directly delete these resources, we need to revisit the Android build process.

  • Since Java code requires the R.java file for resources, we need to prepare R.java in advance.

  • During the compilation of Java code, the references to resources in the code are directly replaced with constants based on the R.java file. For example, R.string.sample is replaced with 0x7f0c0003.

  • In parallel, the resources are compiled into the .ap_ resource file, which includes processing resources.arsc, XML files, and so on.

If we forcibly delete unused resource files during this process, the resource IDs in resources.arsc and R.java will change, because IDs are assigned contiguously by default. The constant 0x7f0c0003 that has already been inlined into the code will then point to the wrong resource or to no resource at all.

Therefore, to avoid this situation, the system uses a compromise and does not perform a secondary processing of resources.arsc file. It only replaces unused Drawable and Layout files with empty files.
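The ID shift behind this compromise is easy to reproduce. A resource ID is a 32-bit value laid out as 0xPPTTEEEE: package ID, type ID, and entry index. The sketch below assigns entry indexes contiguously; the class `ContiguousIds` and the resource names are hypothetical, and the type ID 0x0c is simply the one from the example above, since type IDs are assigned per build.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: contiguous resource IDs shift when an entry is deleted. */
final class ContiguousIds {

    /** Builds an ID with the layout 0xPPTTEEEE (package, type, entry). */
    static int makeId(int packageId, int typeId, int entry) {
        return (packageId << 24) | (typeId << 16) | entry;
    }

    /** Assigns entry indexes 0, 1, 2, ... in list order, as the default build does. */
    static Map<String, Integer> assign(int packageId, int typeId, List<String> names) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            ids.put(names.get(i), makeId(packageId, typeId, i));
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, Integer> before =
                assign(0x7f, 0x0c, Arrays.asList("app_name", "legacy_hint", "ok", "sample"));
        // "legacy_hint" is unused and gets deleted before IDs are assigned again.
        Map<String, Integer> after =
                assign(0x7f, 0x0c, Arrays.asList("app_name", "ok", "sample"));
        System.out.println("sample before: 0x" + Integer.toHexString(before.get("sample")));
        System.out.println("sample after:  0x" + Integer.toHexString(after.get("sample")));
    }
}
```

Code compiled against the first table still carries 0x7f0c0003 for the sample string, but after the deletion that string lives at 0x7f0c0002 and 0x7f0c0003 no longer exists.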

Phase 3: realShrinkResources

How can we truly achieve the deletion of unused resources? The comments in ResourceUsageAnalyzer provide an idea. We can use the mechanism of Public IDs in resources.arsc to achieve non-contiguous resource IDs.

In simple terms, we keep the IDs of the retained resources to ensure that the compiled code can find the corresponding resources.

However, rewriting resources.arsc is more complex than resource obfuscation. We not only need to erase all information related to unused resources from this file, but also keep the IDs of all retained resources, which is equivalent to rewriting the entire file.
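At the level of the ID table, the idea reduces to dropping unused entries while leaving every surviving ID untouched, so that gaps are allowed. The sketch below models only that bookkeeping. It does not parse or write resources.arsc, and the class `StableIdShrinker` and the sample table are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: removes unused entries but keeps the IDs of the survivors. */
final class StableIdShrinker {

    /** Returns a new id -> name table without the unused names; no ID is renumbered. */
    static Map<Integer, String> shrink(Map<Integer, String> table, Set<String> unusedNames) {
        Map<Integer, String> kept = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> e : table.entrySet()) {
            if (!unusedNames.contains(e.getValue())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<Integer, String> table = new LinkedHashMap<>();
        table.put(0x7f0c0000, "app_name");
        table.put(0x7f0c0001, "legacy_hint");
        table.put(0x7f0c0002, "ok");
        table.put(0x7f0c0003, "sample");

        Map<Integer, String> kept = shrink(table, Collections.singleton("legacy_hint"));
        // The constant already inlined into compiled code still resolves correctly.
        System.out.println("0x7f0c0003 -> " + kept.get(0x7f0c0003));
        System.out.println("0x7f0c0001 present: " + kept.containsKey(0x7f0c0001));
    }
}
```

The hard part that this sketch leaves out is the one described above: emitting a valid resources.arsc in which those gaps are represented correctly.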

Because of its complexity, Android has not yet provided a complete implementation of this scheme. I am currently working on implementing this scheme based on this idea, and I hope to release it as open source as soon as it is completed.

Summary #

Today we reviewed the implementation principles of the AndResGuard tool and learned two advanced methods of resource optimization. The optimization of unused resources in particular shows that even the all-powerful Google has not reached the best solution and still has to make some compromises.

In fact, there are many imperfections like this, and it is precisely because of these imperfections that various excellent open source solutions emerge. It is also because of this that we constantly think about how to break through the limitations of the system and achieve more and lower-level optimizations.

Homework #

Do you still have any areas that you don’t understand about the Android compilation process? What other good optimization solutions do you have for resources in installation packages? Feel free to leave a message to discuss with me and other classmates.

You may have noticed that the “third phase” solution for removing unused resources is still not the ultimate solution, because it does not consider unused files in assets.

Assets can be referenced from code in many different ways, so it is not easy to identify unused assets accurately. In Matrix, we attempted to provide a simple implementation; you can refer to UnusedAssetsTask.

I hope you will think further about how to identify unused assets, and what problems we may encounter in the process.
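As a starting point for the homework, one simple heuristic is to collect the string constants that appear in the compiled code and treat an asset as used if any constant mentions its file name. The sketch below is only that heuristic under stated assumptions. It is not claimed to be how UnusedAssetsTask works, and the class `UnusedAssetsFinder` and its inputs are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: flags assets whose file name appears in no string constant. */
final class UnusedAssetsFinder {

    static Set<String> findUnused(Set<String> assetPaths, Set<String> stringConstants) {
        Set<String> unused = new TreeSet<>();
        for (String asset : assetPaths) {
            String fileName = asset.substring(asset.lastIndexOf('/') + 1);
            boolean referenced = false;
            for (String constant : stringConstants) {
                if (constant.contains(fileName)) {
                    referenced = true;
                    break;
                }
            }
            if (!referenced) {
                unused.add(asset);
            }
        }
        return unused;
    }

    public static void main(String[] args) {
        Set<String> assets = new HashSet<>(Arrays.asList(
                "assets/fonts/title.ttf", "assets/html/help.html", "assets/data/level_3.json"));
        Set<String> constants = new HashSet<>(Arrays.asList(
                "fonts/title.ttf", "file:///android_asset/html/help.html", "data/level_"));
        // level_3.json is loaded as "data/level_" + n + ".json", so it is wrongly reported.
        System.out.println(findUnused(assets, constants));
    }
}
```

The example in `main` already shows one of the problems: an asset whose name is assembled at runtime is reported as unused even though the app loads it, so any real tool needs a whitelist or a smarter matching rule.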
