19 Directional Metrics, How to Establish a Complete DevOps Metrics System #

Hello, I am Shi Xuefeng. Up to this point, these lectures have given you a comprehensive look at DevOps engineering practices, covering their many different aspects. However, as the classic saying goes, "Don't just keep your head down on the road; remember to look up at the sky." After putting so much effort into implementing engineering practices, do the results actually meet our expectations?

Therefore, in the last two lectures on engineering practices, I would like to discuss the topics of metrics and continuous improvement. Today, let’s take a look at the metrics system of DevOps.

I believe that for every company, metrics are an essential practice, and the one that management values most. When implementing metrics, many people take the saying "If you can't measure it, you can't manage it," attributed to management guru W. Edwards Deming, as their guiding principle.

However, let’s step back and think, how many metrics are measured just for the sake of measurement? Will people actually look at the data that has been painstakingly measured? What exactly is the problem that metrics are trying to solve?

Therefore, metrics are not the goal, but the means. In other words, the goal of metrics is to “do the right thing,” and the means of metrics is to “do things right.”

So, what is the right thing to do in the field of metrics? To understand what metrics in DevOps should look like, it is crucial to go back to the core demands of software delivery in DevOps.

In short, for IT delivery, DevOps aims to achieve continuous, rapid, and high-quality value delivery. Value can be a functional feature, an improvement in user experience, or the resolution of a blocking defect for users.

With this in mind, the goals that DevOps metrics aim to achieve are clear: to prove that, as a result of a series of improvement efforts, the team’s delivery speed has increased and delivery quality has improved compared to the past. If the results of metrics cannot lead to these two core goals, then obviously, the direction is wrong, and the desired results cannot be achieved.

If we only have a general direction, we often don’t know specifically what to do. At this time, we need to break down the goals and directions into a series of metrics. So, how do we define good metrics and bad metrics?

How to Define Metrics? #

A while back, I spent some time working as an assembly-line worker in a warehouse, and that experience made me deeply appreciate the huge differences between industrial manufacturing and the software industry.

If you ask me now what determines the speed of an industrial assembly line, I can tell you that the answer is the line itself. Because the conveyor belt moves at a constant speed, the output rate of the production line can be quantified directly and intuitively.

However, software development is nothing like industrial manufacturing. The development process cannot be seen or touched. It includes not only the time engineers spend writing code, but also time for brainstorming, design, and testing, as well as time spent working through various processes, and it is punctuated by context switches and interruptions from parallel work. It is therefore impossible to measure developer efficiency the way an industrial assembly line is measured.

Therefore, in order to achieve quantification, many metrics are artificially designed.

Take the on-time test submission rate as an example. This metric uses a 100-point scale: submitting for testing on time earns 100 points, a one-day delay earns 90, a two-day delay earns 70, and so on, down to 0 points for a delay of five days or more. The metric looks objective and fair enough, but think about it: there is no essential difference between being 1 day and 1 hour late and being 1 day and 23 hours late, so the resulting score, high or low, does not reflect the real situation.
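To see how arbitrary such a rule is, here is a minimal sketch of the scoring function described above; the cut-off for three- and four-day delays is an assumption, since only a few of the point values are listed.

```python
def on_time_submission_score(delay_days: float) -> int:
    """Hypothetical 100-point rule for the on-time test submission metric."""
    if delay_days <= 0:
        return 100
    if delay_days <= 1:
        return 90
    if delay_days <= 2:
        return 70
    if delay_days < 5:
        return 40  # assumed value for 3-4 day delays; not given in the text
    return 0

# A two-hour difference around the one-day boundary swings the score by 20 points,
# while a 22-hour difference inside the same bucket changes nothing.
print(on_time_submission_score(23 / 24))      # 90 (23 hours late)
print(on_time_submission_score(1 + 1 / 24))   # 70 (1 day 1 hour late)
print(on_time_submission_score(1 + 23 / 24))  # 70 (1 day 23 hours late)
```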

Similar artificially constructed metrics are everywhere in the measurement systems of various companies. Bad metrics come in all shapes and sizes, but good metrics mostly share a few typical characteristics.

1. Clearly defined target audience.

Metrics cannot exist independently of their target audience. When defining a metric, you must also define who it is for, that is, the audience the metric is intended to serve.

Different people have different focuses. Even if the metric itself is not problematic, it is hard for it to generate the expected value when shown to the wrong audience. For example, presenting unit test coverage to a non-technical boss is of little significance.

2. Directly address the problem.

In the NBA, excellent players always come with a system. The so-called system is a set of tactics and strategies built around that player's core strengths which solves the team's actual problems. As a result, the player's performance becomes the "barometer" of the whole team.

Good metrics should likewise directly address the problem. The moment you see such a metric, you should recognize where the problem lies and naturally start improving, rather than glancing at it and moving on without knowing what to do.

For example, if the build failure rate is high, the team will realize that the quality of code submissions is a problem and that pre-submission verification needs to be strengthened.

3. Quantify trends.

According to the SMART principle, good metrics should be measurable and verifiable through objective data.

For example, metrics like user satisfaction may seem good, but it is difficult to quantify them with data. Another example is the metric of project completion rate. If it is only filled in manually, it will not carry much persuasiveness.

At the same time, good metrics should be able to demonstrate trends. It means that after a period of accumulation, it should be clear whether the metrics have improved or worsened and whether the target is getting closer or further away.

4. Full of tension.

Metrics should not exist in isolation; they should be interconnected to form a whole. Good metrics have a certain tension: they roll up to business outcomes and, at the same time, break down into concrete details. By slicing the data along different dimensions, they can serve users with different perspectives.

For example, simply measuring the number of delivered requirements is not very meaningful, because requirement granularity directly affects the count. If a requirement is split in two just to create the appearance of doubled delivery speed, the measurement loses its meaning.

What are the principles for defining metrics? #

Having understood the typical characteristics of good measurement metrics, let’s now look at the five principles for defining DevOps metrics:

  1. Global metrics are superior to local metrics: Excessive local optimization may be meaningless for overall output, deviating from the core of measurement, which is to improve delivery speed and delivery quality.
  2. Composite metrics are superior to single metrics: Looking at a single dimension easily leads to seeing the trees but missing the forest. Composite metrics are more objective, so addressing a problem calls for a set of metrics that together provide objective guidance.
  3. Outcome metrics are superior to process metrics: Outcome metrics should come first, with results as the guide and processes as the means. All process metrics should be summarized into outcome metrics.
  4. Team metrics are superior to individual metrics: Team metrics should be prioritized over individual metrics. Shared metrics within a team help to form internal cohesion and reduce internal division.
  5. Flexible metrics are superior to rigid metrics: Metrics are established in order to implement targeted improvements. The differences and improvement directions of the business itself need to be considered, rather than a simple and rough “one-size-fits-all” approach. As team capabilities improve, metrics also need to be adjusted appropriately, constantly challenging the team’s abilities.

Which metrics are most important? #

Based on the metric characteristics and guiding principles above, combined with practices from industry leaders, I recommend the following DevOps measurement system.

Although measurement systems differ from company to company, I believe this framework, organized into the three groups of metrics below, is sufficient for most scenarios:

1. Delivery Efficiency

  • Lead Time: The time from when a requirement is proposed to when the whole development and delivery process is complete and the change is finally released. For both the business and users, this is the most objective indicator of a team's delivery speed. It can be further broken down into a business side (the time from requirement proposal, analysis, design, and review to readiness) and a development side (the time for development scheduling, development, testing, acceptance, and release). From the perspective of value stream analysis, it represents the duration of the complete value stream.
  • Development Lead Time: The time from when a requirement is scheduled for development to its final release. It reflects the development team's delivery capability, i.e., how long it takes to complete the entire development process once a requirement is taken on (a sketch of how both durations might be computed follows this list).
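As a rough illustration of how these two durations might be computed from requirement records, here is a minimal sketch; the timestamp field names (`proposed_at`, `dev_started_at`, `released_at`) are hypothetical and would map to whatever your requirement-tracking system actually records.

```python
from datetime import datetime
from statistics import median

# Hypothetical requirement records exported from a tracking system.
requirements = [
    {
        "id": "REQ-101",
        "proposed_at": datetime(2024, 3, 1, 10, 0),    # requirement proposed
        "dev_started_at": datetime(2024, 3, 8, 9, 0),  # scheduled into development
        "released_at": datetime(2024, 3, 20, 18, 0),   # released to production
    },
    # ... more records
]

def days_between(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 86400

# Lead time: proposal -> release; development lead time: development start -> release.
lead_times = [days_between(r["proposed_at"], r["released_at"]) for r in requirements]
dev_lead_times = [days_between(r["dev_started_at"], r["released_at"]) for r in requirements]

print(f"median lead time: {median(lead_times):.1f} days")
print(f"median development lead time: {median(dev_lead_times):.1f} days")
```

Using a median rather than an average keeps one unusually long requirement from masking the typical experience, which is a common choice when reporting duration metrics.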

2. Delivery Capability

  • Release Frequency: The number of system releases per unit of time. In principle, the higher the release frequency, the stronger the delivery capability. It depends on the architecture, team autonomy, and the ability to release independently, so that each team can release at its own pace without being blocked by system dependencies and release windows.
  • Release Lead Time: The time from code submission by the development team to the final release. It is the most intuitive indicator of a team's continuous delivery engineering capability and relies on an end-to-end automated pipeline and automated testing. It is also one of the core metrics in the State of DevOps reports.
  • Delivery Throughput: The number of requirement points delivered per unit of time, that is, the number of requirements weighted by their granularity; under a standardized requirement granularity it reflects the team's delivery capability (a counting sketch follows this list).
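Below is a minimal sketch of counting release frequency and delivery throughput per ISO week from a release log; the record layout and the point values are made-up examples rather than the output of any particular tool.

```python
from collections import Counter
from datetime import date

# Hypothetical release log: (release date, requirement points shipped in that release).
releases = [
    (date(2024, 3, 4), 3),
    (date(2024, 3, 6), 2),
    (date(2024, 3, 12), 5),
    (date(2024, 3, 14), 1),
]

# Group by ISO week: releases per week and requirement points delivered per week.
releases_per_week = Counter(d.isocalendar()[1] for d, _ in releases)
points_per_week = Counter()
for d, points in releases:
    points_per_week[d.isocalendar()[1]] += points

for week in sorted(releases_per_week):
    print(f"week {week}: {releases_per_week[week]} releases, "
          f"{points_per_week[week]} requirement points delivered")
```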

3. Delivery Quality

  • Defect Density: The ratio of defects to requirements within a unit of time, i.e., the average number of defects per delivered requirement. The more defects there are, the poorer the quality of requirement delivery.
  • Defect Distribution: The proportion of critical and fatal defects among all defects. The higher this proportion, the more severe the defects, which reflects how well quality is kept under control overall.
  • Fault Fixing Time: The time from identifying a valid defect to completing the fix and releasing it. On one hand, this metric examines how quickly faults are located and fixed; on the other hand, it examines release lead time, because only by getting through the release and deployment process faster can problems be fixed more quickly. (A sketch of how these three metrics could be derived from defect records follows this list.)
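Here is a minimal sketch of deriving the three quality metrics from defect records over one period; the field names, severity labels, and numbers are assumptions for illustration only.

```python
from datetime import datetime

# Hypothetical defect records for one iteration.
defects = [
    {"severity": "critical", "found_at": datetime(2024, 3, 5, 11, 0), "fix_released_at": datetime(2024, 3, 5, 17, 0)},
    {"severity": "minor",    "found_at": datetime(2024, 3, 7, 9, 0),  "fix_released_at": datetime(2024, 3, 8, 10, 0)},
    {"severity": "major",    "found_at": datetime(2024, 3, 9, 14, 0), "fix_released_at": datetime(2024, 3, 10, 9, 0)},
]
requirements_delivered = 6  # requirements delivered in the same period

# Defect density: average number of defects per delivered requirement.
defect_density = len(defects) / requirements_delivered

# Defect distribution: share of critical/fatal defects among all defects.
severe = sum(1 for d in defects if d["severity"] in ("critical", "fatal"))
severe_ratio = severe / len(defects)

# Fault fixing time: hours from identifying a defect to releasing its fix.
fix_hours = [(d["fix_released_at"] - d["found_at"]).total_seconds() / 3600 for d in defects]

print(f"defect density: {defect_density:.2f} defects per requirement")
print(f"critical/fatal share: {severe_ratio:.0%}")
print(f"average fix time: {sum(fix_hours) / len(fix_hours):.1f} hours")
```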

These three groups of eight metrics reflect the team's delivery efficiency, delivery capability, and delivery quality, examining the key outcome indicators from a comprehensive perspective. They can be used to demonstrate the effect and value of a team's DevOps improvements. However, defining metrics is only a small step in DevOps measurement; the metrics become meaningful only when they are put to use and generate value.

How to Start Measurement? #

The process of starting measurement within a company can be divided into four steps.

Step 1: Refine metrics.

To have a complete metric, apart from defining it, you also need to clarify the metric name, description, level (team/organization), type, applicable scenarios and target users, data collection method, and standard reference values.

Taking delivery metrics as an example, I have consolidated the details of a refined metric in the table below. In fact, not only core outcome metrics, but all metrics defined within the measurement system need to be refined.

Regarding reference values for metrics, they may vary for different business forms. For example, the reference values for unit test coverage may be significantly different between autonomous vehicles and regular internet businesses.

Therefore, the selection of reference values needs to be analyzed based on the actual business and reach a consensus. Additionally, the measurement metrics themselves also need to establish a mechanism for regular updates to adapt to the team’s capabilities.
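To make the idea of a refined metric concrete, the attributes listed above could be captured as structured data along the lines of the sketch below; every field value here is a hypothetical example rather than a recommended reference value.

```python
# A hypothetical refined definition for one metric, covering the attributes
# discussed in Step 1: name, description, level, type, audience, data source,
# collection method, reference value, and a review cycle for regular updates.
development_lead_time = {
    "name": "Development Lead Time",
    "description": "Time from a requirement being scheduled for development "
                   "to its release to production",
    "level": "team",                     # team- or organization-level metric
    "type": "outcome",                   # outcome vs. process metric
    "audience": ["engineering manager", "team lead"],
    "data_source": "requirement tracking system + release records",
    "collection": "automated, recalculated daily",
    "reference_value_days": 14,          # example only; agree on this per business
    "review_cycle": "quarterly",         # the metric itself is revisited regularly
}
```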

Step 2: Collect measurement data.

Measurement metrics require objective data to support them, and that data often comes from different platforms. So when defining metrics, you need to assess whether there is enough objective data to support measuring them.

During the initial stage of collecting measurement data, the biggest challenges are not only the number of systems involved and the inconsistent data definitions across them, but also the accuracy of the data itself.

For example, development lead time is usually calculated as the length of time from the start of development to the requirement going live. However, if developers are slow to move the requirement to "resolved" or "waiting for testing," the calculated lead time will be badly distorted and will not reflect the objective, real situation.

This needs to be addressed from both the process side and the platform side. At the process level, establish development working conventions so that every developer knows when to change the status of a requirement card; at the platform level, build capabilities that make status updates easy, or even move requirement statuses forward automatically.
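On the platform side, one simple way to catch the distortion described above is a data-sanity check that flags requirements whose status was clearly updated late; the sketch below assumes hypothetical exported status timestamps and a one-day threshold.

```python
from datetime import datetime, timedelta

# Hypothetical status-transition timestamps exported from the requirement platform.
requirements = [
    {"id": "REQ-201", "ready_for_test_at": datetime(2024, 3, 18, 16, 0), "released_at": datetime(2024, 3, 18, 19, 0)},
    {"id": "REQ-202", "ready_for_test_at": datetime(2024, 3, 10, 11, 0), "released_at": datetime(2024, 3, 15, 20, 0)},
]

# If a card only moved to "waiting for testing" shortly before release, the status
# was probably updated late, so lead-time data for that requirement is suspect.
SUSPICIOUS_GAP = timedelta(days=1)

for r in requirements:
    if r["released_at"] - r["ready_for_test_at"] < SUSPICIOUS_GAP:
        print(f"{r['id']}: status updated suspiciously late; "
              f"lead-time data may be distorted")
```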

Step 3: Build a visualization platform.

After all, measurement metrics are meant to be seen, and measurement data needs to be collected and computed in one place, which relies on building a visualization platform. As for how to build a measurement platform that supports multi-dimensional views, integrates data from multiple systems, and is flexible and configurable, I will share a case study in the tools section to help you work through the key issues in building such a platform.

Step 4: Identify bottlenecks and continuously improve.

Once the data becomes trustworthy and visible, the problems and bottlenecks faced by the team will naturally emerge. How to use metrics to lead and drive the team to implement improvements will be the core content of the next lecture.

I have also compiled some commonly used measurement metrics and their definitions for you, which you can access through the cloud drive link with the extraction code c7F3. Note that metrics should be few rather than many, and precise rather than redundant. This is also the most common problem in enterprise DevOps measurement: defining a pile of metrics without knowing what to do with them.

Only by refining the definition of metrics, reaching a consensus within the team, carefully vetting the completeness and effectiveness of the data, and achieving visualization from different perspectives, can you have the basis to drive team improvement. Please remember this point.

Conclusion #

To summarize, the goal of DevOps metrics is to prove that the team has achieved faster delivery speed and higher delivery quality through a series of improvement work compared to the past. Therefore, delivery efficiency and delivery quality are the two core objectives. Only a measurement system built around these two objectives can avoid going in the wrong direction.

Good metrics generally share four characteristics: a clearly defined audience, directly addressing the problem, quantifiable trends, and built-in tension. Combining these characteristics and guiding principles with practices from industry leaders, I introduced three groups of eight core outcome metrics covering delivery efficiency, delivery capability, and delivery quality. Finally, I walked through the four steps for establishing a measurement system, hoping to help you build the foundation for continuous improvement step by step.

Measurement is a double-edged sword. Done poorly, it can hurt team morale; if the focus is misplaced and measurement results are tied to individual performance, measurement easily changes in nature. Many large companies rebuild their measurement systems again and again because the previous system was gamed into a numbers game, lost its original purpose, and had to be started over.

As I said before, measurement is just a means, not an end. Ultimately, the true purpose of measurement is to improve team efficiency and achieve business success. Only when measurement sparks the team's own initiative to improve, and boosts its creativity and enthusiasm for improvement, can we achieve so-called "positive measurement." This is the idea I most want to convey to you.

Thought-provoking Question #

Is the company you are in also building a DevOps measurement system? Do you think that these metrics have had a positive impact on improving current work?

Feel free to share your thoughts and answers in the comments section. Let’s discuss together and learn from each other. If you find this article helpful, you are also welcome to share it with your friends.