01 How to Learn Linux Performance Optimization

Hello, I am Ni Pengfei.

Have you ever, like me, read many books and learned many Linux performance tools, yet still felt helpless when faced with a real Linux performance problem? Performance analysis and optimization have long been a pain point for most software engineers. But does that mean there is really nothing we can do about it?

Admittedly, the complexity of performance issues increases the difficulty of learning. But this should not become a “roadblock” on our path to advancement. In my opinion, most people “give up” on performance issues for only two reasons.

One reason is that you haven’t found an effective way to learn the principles. When you hear words like “system” and “underlying”, you get scared and feel that the subject is too difficult. You assume you won’t be able to learn it, so you never dig deeper or build a comprehensive understanding of performance.

Another reason is that the root cause of the performance problem seems too complex. You don’t know how to analyze it and can’t identify the bottleneck.

You may think that when a program has a problem, you can simply search online, copy other people’s methods, try them a few times, and hope the problem goes away. As a result, you never investigate why those methods work, and you cannot explain why many of them work in other environments but not in yours.

As a result, the same mistakes are repeated, and the same situations continue to occur.

In fact, performance issues are not as difficult as you imagine. Once you understand a few basic principles of applications and systems, practice a lot, and build a global view of overall performance, most performance optimization problems become manageable.

I have seen many engineers analyze the performance of third-party components used by their application without being familiar with the programming languages those components are written in. Even so, they are able to find the root causes of production problems and fix them, for example by changing the application’s calling logic or adjusting the components’ configuration options.

Again, you do not need to understand every implementation detail of each component. As long as you understand their basic working principles and how they cooperate with each other, you can do the same.

What are performance metrics? #

The first step in learning performance optimization is to understand the concept of “performance metrics”.

When you hear “performance metrics”, what do you think of first? I believe “high concurrency” and “quick response” are the first two phrases that come to mind. They correspond to the two core metrics of performance optimization: “throughput” and “latency”. These two metrics examine performance from the perspective of application load and directly affect the user experience of the product. Their counterparts from the perspective of system resources are metrics such as resource utilization and saturation.

We know that as application load increases, the use of system resources also increases and may even reach its limit. The essence of a performance problem is that some system resource has hit a bottleneck while requests are still not being processed fast enough to support additional load.
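To make the two application-side metrics concrete, here is a minimal sketch in Python. It assumes you already have per-request latencies collected over a known time window. The function name `summarize`, the sample values, and the choice of the nearest-rank percentile are illustrative assumptions, not something prescribed by this column.

```python
import math


def summarize(latencies_ms, window_s):
    """Return throughput (requests/s) and latency percentiles for one window."""
    if not latencies_ms or window_s <= 0:
        raise ValueError("need at least one sample and a positive window")
    ordered = sorted(latencies_ms)

    def percentile(p):
        # Nearest-rank percentile: the smallest sample such that at least
        # p percent of all samples are less than or equal to it.
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "throughput_rps": len(ordered) / window_s,
        "p50_ms": percentile(50),
        "p99_ms": percentile(99),
    }


# Hypothetical data: 10 requests completed within a 2-second window.
sample = [12, 15, 11, 14, 13, 250, 12, 16, 13, 12]
print(summarize(sample, window_s=2.0))
```

Note how a single slow request (250 ms) barely moves the median but dominates the 99th percentile, which is why latency is usually reported as percentiles rather than as an average.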

Performance analysis is really about identifying bottlenecks in applications or systems and finding ways to avoid or mitigate them, so that system resources are used more efficiently to handle more requests. This involves a series of steps, typically the following six:

  • Select metrics to evaluate the performance of applications and systems.
  • Set performance goals for applications and systems.
  • Perform performance benchmarking.
  • Perform performance analysis to locate bottlenecks.
  • Optimize systems and applications.
  • Perform performance monitoring and alerting.
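The first three steps can be sketched in a few lines of Python: pick a metric, set a goal for it, and run a benchmark to see whether the goal is met. This is only an illustration under stated assumptions. The helper names `benchmark` and `check_goals`, the placeholder workload, and the goal value are all hypothetical, and a real benchmark would need warm-up runs and a realistic load.

```python
import time


def benchmark(workload, runs=200):
    """Call workload() repeatedly and return each call's latency in ms."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


def check_goals(measured, goals):
    """Return {metric: (measured, limit)} for every metric above its limit."""
    return {
        name: (measured[name], limit)
        for name, limit in goals.items()
        if measured[name] > limit
    }


# Hypothetical workload and goal, for illustration only.
latencies = benchmark(lambda: sum(range(10_000)), runs=50)
measured = {"max_ms": max(latencies)}
goals = {"max_ms": 100.0}  # assumed upper bound on the worst-case latency
print("goals missed:", check_goals(measured, goals))
```

Any metric reported as missed is where the fourth step, locating the bottleneck, begins.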

After understanding these basic performance-related metrics and core steps, how do you learn? Next, I will talk about several important questions for learning Linux performance optimization.

What prerequisites are required for learning this column? #

Firstly, it is important to understand that the core of this column is performance analysis and optimization, rather than the basic usage of the Linux operating system.

Therefore, it is preferable that you have experience with Ubuntu or other Linux operating systems, and possess some programming basics, such as:

  • Familiarity with commonly used Linux commands;

  • Knowledge of how to install and manage software packages;

  • Understanding how to develop applications using programming languages, etc.

This way, when I talk about performance, you will have a better understanding of the principles behind it, especially after combining it with the case studies in this column, allowing you to gain a more intuitive understanding of performance analysis.

Unlike a textbook, this column will not cover every detail of operating systems, algorithms, network protocols, and programming languages. However, some important system principles are still indispensable, and I will explain them. I will also use practical case studies to guide you step by step through the various components, from applications down to the operating system.

What is the focus of learning? #

To study performance analysis and optimization effectively, the core task is to build a global view of overall system performance. To do that, you need to:

  • Understand the basic principles of a few key areas of the system;

  • Master the necessary performance tools;

  • Go through practical scenarios that involve different components.

These three points are the most important aspects of our learning. In each article of the column, I will explain these three aspects to you, focusing on different scenarios. You must also spend time and effort to digest them.

Speaking of performance tools, we have to mention Brendan Gregg, a master of the performance field. He is well known for his work with the dynamic tracing tool DTrace, including the DTraceToolkit and a book on DTrace, and he has also developed many other performance tools. I believe you have seen his Linux performance tools diagram:

(Images from brendangregg.com)

This figure is one of the most important reference materials for Linux performance analysis. It tells you what tools to use to observe and analyze when performance issues occur in different subsystems of Linux.

For example, when encountering I/O performance issues, you can refer to the I/O subsystem at the bottom of the image, and use iostat, iotop, blktrace, and other tools to analyze disk I/O bottlenecks. You can save this figure and refer to it when needed.

In addition, I would like to emphasize the selection of performance tools. There is a saying that goes, “A correct choice is better than a thousand efforts.” Although it may be exaggerated, choosing the right performance tool can greatly simplify the whole performance optimization process. Understanding which tools to choose in which scenarios, and how to learn to choose appropriate tools, are things I want to teach you.

However, remember not to rely solely on performance tools. Tools are just a means to solve problems, and the key lies in how you use them. Only by truly understanding the principles behind them, and combining them with specific scenarios to understand the different components of the system, can you truly master them.

Finally, to give you a comprehensive view of performance, I have created a mind map that covers most of the knowledge involved in performance analysis and optimization, all of which will be covered in this column. You can save or print it, mark each part as you learn it, and use it to track your learning progress.

How to study more effectively? #

Earlier, I told you about the key points for learning Linux performance optimization. Now, let me share with you a few study tips that can make your learning process easier.

Tip 1: While understanding the system’s principles is important, do not try to grasp all the implementation details right from the beginning.

Getting too deep into the internal implementation of the system may cause you to lose sight of the main concepts. Moreover, the complexity of the implementation logic may dampen your enthusiasm for learning. Therefore, my advice is to go only as deep as you need to.

You can start by learning the system principles I have explained to you, but do not go into the details of how the Linux kernel achieves them. Instead, focus on how to observe and apply these principles, such as:

  • What metrics can be used to measure performance?

  • What performance tools should be used to observe these metrics?

  • What factors cause these metrics to change?
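As a small example of observing a metric without reading kernel source, the sketch below derives CPU utilization from two snapshots of the aggregate `cpu` line in `/proc/stat`, which is the same kind of data that tools such as top report. The function names and the sample counter values are made up for illustration, and counting iowait as idle time is one common convention rather than the only one.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into (busy, total) ticks."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Counters: user nice system idle iowait irq softirq steal.
    # Guest time is already included in user/nice, so it is not added again.
    values = [int(v) for v in fields[1:9]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    total = sum(values)
    return total - idle, total


def cpu_utilization(before, after):
    """Return the busy percentage between two snapshots of the 'cpu' line."""
    busy0, total0 = parse_cpu_line(before)
    busy1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        raise ValueError("snapshots must be taken at different times")
    return 100.0 * (busy1 - busy0) / elapsed


def sample_live(interval_s=1.0, path="/proc/stat"):
    """Take two real snapshots on a Linux machine (not called here)."""
    with open(path) as f:
        first = f.readline()
    time.sleep(interval_s)
    with open(path) as f:
        second = f.readline()
    return cpu_utilization(first, second)


# Made-up snapshots so the example runs anywhere.
before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
after = "cpu  1300 0 600 8500 600 0 0 0 0 0"
print(f"{cpu_utilization(before, after):.1f}% busy")
```

On a Linux machine you can call `sample_live()` while running a CPU-heavy command and watch how the value responds, which speaks to the third question in the list above.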

Tip 2: Learn by doing. Master Linux performance analysis and optimization by working through plenty of case studies.

Only by practicing on a machine and going through the knowledge and case studies that I have provided, can you truly make them your own. I have carefully designed these case studies to help you better understand and experience the concepts and operations.

Therefore, I strongly recommend that you run and analyze these case studies or analyze your own system using the knowledge you have acquired. This way, you will have a more intuitive understanding and achieve better learning outcomes.

Tip 3: Think diligently, reflect frequently, summarize effectively, and ask “why” more often.

The best way to truly understand a subject is to ask questions. When you can ask good questions, it indicates that you have gained a deep understanding of the topic.

Feel free to leave me a message in the comments section at any time, where you can write down your questions, thoughts, and summaries. We can discuss and learn together with other learners. You can also share your experiences with performance issues, document your analysis steps, and share your optimization ideas. Let’s interact and explore together.

Preparing for Learning #

This course includes a lot of hands-on case studies, and in each article I will use one or two Ubuntu 18.04 virtual machines as the environment for running and analyzing them. If you only listen to the audio explanations without actually practicing, your learning will be far less effective.

So, can you prepare a Linux machine for practicing the course cases? Any virtual machine or physical machine will do, and it is not limited to the Ubuntu system.

Reflection #

Today’s content is a warm-up preparation for our subsequent study. Starting from the next article, we will officially delve into Linux performance analysis and optimization. So, I’d like to chat with you about the difficulties or doubts you have encountered before when solving Linux performance problems, or any questions you have while studying Linux performance optimization. Based on the content I discussed today, how do you plan to approach learning this column?

Feel free to share in the comments.