05 Continuous Integration: Integration is a part of coding itself #

Hello, I’m Zheng Ye.

In the previous lesson, we discussed the “completion” of requirements. You now know how to judge whether a requirement is complete: it is complete when it meets its acceptance criteria. If there are no acceptance criteria, you need to establish them first. This is crucial for every programmer.

In today’s lesson, let’s assume that the acceptance criteria for the requirements have been clearly defined. As an excellent programmer, you are now ready to roll up your sleeves and start coding.

However, I have a question for you: “Once the code is written, is the job finished?” You may be puzzled. Isn’t it? Then let me ask another way: “Is code the deliverable of the technical team?”

Have you noticed that something is not quite right? Nobody needs a pile of text; what people really need is runnable software. Writing code is the programmer’s duty, but we have a greater obligation: to deliver runnable software.

Delivering a runnable software solution is usually not something that can be achieved by individual programmers alone; it is the result of teamwork. Most of us work in a team, so can our code naturally work together with others’ code? Obviously, it’s not that simple.

If we want to effectively combine the code written by each programmer, there is one thing we must do: integration.

But who should be responsible for integration and how should it be done? I don’t know if you have thought about this question. Before we dive into this topic, let me tell you a story.

“Disaster” of Integration #

In 2009, I worked as a consultant in a large company. The department I worked with had many teams, all jointly developing one project. Their workflow was to develop separately for a month, and when the development phase was over, the project manager would gather the best members from each team for integration. Integration was a big deal for them and very difficult, which is why they needed their top talent to handle it.

The project was written in C, so the first step of integration was compiling and linking. The modules developed by each team were compiled together, and if a module had problems, the elite member from that team would step in to solve them.

If all the modules compiled and linked on the first day, they counted themselves lucky. Only after that would they enter a formal process called “system integration testing.”

The goal of system integration testing was to make sure the basic flow of the project ran end to end; only then was the integration considered complete. But for them, this stage was more like a “disaster.”

Why was it a disaster? Just think about it: a large department with several teams, each developing code for the same project in a one-month cycle. Over that month, every team’s changes piled up, so the volume of code to be consolidated at the end was massive. With that much code, many errors surfaced and the fixes required were extensive. As a result, integration and system integration testing took a very long time.

Even if they mobilized the top talents from each team, it would still take at least 2-3 days to complete a project integration. If the amount of changes was substantial, it could take up to a week. Although I don’t know the current situation in the company you work for, it is highly likely that you have encountered similar scenarios in your professional career. So how do you solve this problem?

Towards Continuous Integration #

As a sharp observer, you must be wondering, “Why did they wait until a full month of development was over before integrating? Why couldn’t they integrate once a week, or even more often?”

This is a common pain point in the industry, and people have long been trying to improve on it. The breakthrough came with the concept of the “Daily Build.”

In 1996, Steve McConnell published a book called “Rapid Development”. In it, he put forward an excellent practice for solving integration problems: the Daily Build, which means integrating once a day.

At the time, people were amazed by this practice. As in the story above, the common belief was that integration was hard: it required elite participation and took a long time. Integrating every day seemed unimaginable.

In fact, the logic behind the daily build is simple: the changes that accumulate over a long period are too many to integrate easily, whereas the changes that accumulate in a single day are far fewer, so integrating them is far less difficult.

You can see that both end-of-cycle integration and the daily build respond to the same relationship between the amount of change and the integration time. One pushes in the “long” direction, and the other aims for the “short” direction. In the end, the “long” approach became a vicious cycle, while the “short” approach became a best practice.
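To make that logic concrete, here is a toy model, not something taken from the story above. It assumes that the trouble in an integration comes from pairs of not-yet-integrated changes meeting each other for the first time, and it ignores everything else (such as conflicts with code that is already integrated). The figures of 200 changes and 20 working days are made up for illustration.

```python
def pairwise_interactions(changes_in_batch: int) -> int:
    """Count the distinct pairs of changes that first meet in one integration."""
    return changes_in_batch * (changes_in_batch - 1) // 2


def total_interactions(total_changes: int, batches: int) -> int:
    """Split the changes as evenly as possible into batches and sum the pairs per batch."""
    per_batch, remainder = divmod(total_changes, batches)
    sizes = [per_batch + 1] * remainder + [per_batch] * (batches - remainder)
    return sum(pairwise_interactions(size) for size in sizes)


# Hypothetical workload: 200 changes in a month of 20 working days.
monthly = total_interactions(total_changes=200, batches=1)   # one big-bang integration
daily = total_interactions(total_changes=200, batches=20)    # one integration per day
print(monthly, daily)  # prints "19900 900"
```

Under these assumptions, the same 200 changes produce roughly 22 times fewer first-time collisions when they are integrated in daily batches. The exact ratio depends entirely on the model, but the direction matches the argument: smaller batches, easier integration.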

[Figure: Daily Build]

We agree, then, that integrating more frequently means fewer changes in each integration, which makes each integration less difficult.

But here comes the question. How long after development should we do integration? Should it be half a day, two hours, or one hour? If this idea is taken to the extreme, does it mean that we should integrate whenever there is a code submission?

Yes. Following this idea, people tried doing development and integration at the same time, which gave birth to a new practice: Continuous Integration.

The key breakthrough of continuous integration is to combine the previously separate stages of development and integration into one, allowing for simultaneous development and integration.

The idea of continuous integration is good, but does it require someone to keep an eye on everyone’s work, so that whenever someone submits code, this person has to do the integration? Obviously, this is not feasible in real work.

Since it was programmers who came up with this idea, the natural solution was to automate the process. So someone wrote a script that periodically checked the source code server and, whenever there were updates, pulled the code and ran a build automatically.

Later, people found that this script was not specific to any particular project. So they refined it further and released it, and it gradually evolved into the continuous integration server we know today.
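A script of that kind can be sketched in a few lines. What follows is a minimal illustration, not a reconstruction of the original script: it assumes the code lives in Git, that the script runs inside a working copy, and that `make` stands in for the real build command. The names `remote_head`, `run_build`, `poll_once`, and `watch` are made up for this example.

```python
import subprocess
import time
from typing import Callable, Optional


def remote_head(repo_url: str) -> str:
    """Ask the remote repository which commit HEAD currently points to."""
    result = subprocess.run(
        ["git", "ls-remote", repo_url, "HEAD"],
        capture_output=True, text=True, check=True,
    )
    # Output looks like "<commit-id>\tHEAD"; keep only the commit id.
    return result.stdout.split()[0]


def run_build(revision: str) -> bool:
    """Update the working copy and build it (assumption: `make` is the build command)."""
    subprocess.run(["git", "pull", "--ff-only"], check=True)
    return subprocess.run(["make"]).returncode == 0


def poll_once(
    get_head: Callable[[], str],
    build: Callable[[str], bool],
    last_built: Optional[str],
) -> Optional[str]:
    """Build only if the head revision differs from the last one built.

    Returns the revision to remember as "last built".
    """
    head = get_head()
    if head == last_built:
        return last_built  # nothing new, so skip the build
    ok = build(head)
    print(f"build of {head}: {'PASSED' if ok else 'FAILED'}")
    return head


def watch(get_head: Callable[[], str], build: Callable[[str], bool],
          interval_seconds: float = 60.0) -> None:
    """Poll forever, sleeping between checks (the interval is an arbitrary choice)."""
    last_built: Optional[str] = None
    while True:
        last_built = poll_once(get_head, build, last_built)
        time.sleep(interval_seconds)


# Example wiring, with a hypothetical repository URL:
#   watch(lambda: remote_head("https://example.com/team/project.git"), run_build)
```

Keeping the decision in `poll_once` and passing the "get head" and "build" steps in as functions means the loop knows nothing about any particular project, which is the property that let such scripts grow into general-purpose servers.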

In 2000, “the most insightful person in the software industry,” Martin Fowler, published a heavyweight article called “Continuous Integration”.

The following year, ThoughtWorks, the company where Martin Fowler works, released the first continuous integration server on the market, CruiseControl. CruiseControl can be considered the pioneer of continuous integration servers, and most later servers were improvements on its ideas.

Martin Fowler’s influential article and the release of the first continuous integration server prompted the software industry to dig deeper into continuous integration. Understanding of the practice reached new heights, the continuous integration server became the handiest tool a development team had for integration, and a set of guidelines for continuous integration gradually took shape.

By 2006, so much had been learned that Martin Fowler rewrote the article “Continuous Integration”. From there, people extended the idea of continuous integration into Continuous Delivery.

People gravitate toward tools. The arrival of continuous integration servers gradually turned continuous integration from a niche practice into today’s de facto industry standard.

Continuous Integration on the Ground #

However, even though continuous integration has been around for many years, the industry has not adopted it evenly. Interestingly, some companies that cannot manage continuous integration can nevertheless manage daily builds, precisely because they have a continuous integration server.

This is not difficult to understand. Although the concept of daily builds was proposed early on, few companies at that time truly practiced it, mainly because there was no tool support. Yet the only essential difference between a daily build and continuous integration is when the build runs, and that is just a configuration option on a continuous integration server.

Of course, some companies in the industry are already proficient in continuous integration, while a considerable number are still struggling with integration, like the team in the consulting project I mentioned earlier.

I took part in that project in 2009. By then, 9 years had passed since Martin Fowler first wrote “Continuous Integration,” 3 years since the updated version of the article was published, and 13 years since McConnell proposed “daily builds.”
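The point that build timing is only a configuration choice can be illustrated with a small sketch. It does not reflect any real server’s configuration format: the `Trigger` class, the two mode names, and the 02:00 schedule are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass(frozen=True)
class Trigger:
    """Hypothetical build-trigger setting: the one knob that differs."""
    mode: str                   # "on_commit" (continuous integration) or "daily"
    daily_at: time = time(2, 0)  # only used when mode == "daily"


def should_build(trigger: Trigger, now: datetime,
                 last_build: Optional[datetime], has_new_commits: bool) -> bool:
    """Decide whether a build should start now under the given trigger."""
    if not has_new_commits:
        return False  # nothing to integrate in either mode
    if trigger.mode == "on_commit":
        return True   # integrate as soon as new code arrives
    if trigger.mode == "daily":
        scheduled = datetime.combine(now.date(), trigger.daily_at)
        # Build once per day, after today's scheduled time has passed.
        return now >= scheduled and (last_build is None or last_build < scheduled)
    raise ValueError(f"unknown trigger mode: {trigger.mode}")


ci = Trigger(mode="on_commit")
nightly = Trigger(mode="daily", daily_at=time(2, 0))
noon = datetime(2009, 6, 1, 12, 0)
built_this_morning = datetime(2009, 6, 1, 2, 5)

print(should_build(ci, noon, built_this_morning, has_new_commits=True))       # True
print(should_build(nightly, noon, built_this_morning, has_new_commits=True))  # False
```

Everything else (fetching the code, building it, reporting the result) is identical in both modes, which is why a team that owns such a server gets daily builds almost for free.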

Even from the perspective of that time period, the integration practices of this project were at least 10 years behind the industry. Yes, they were still far from even achieving daily builds.

Today, continuous integration is about as mature as a practice can be. However, to my knowledge, many companies are still at the stage where integration depends on “heroes.”

Although we all write code in the same era, different teams seem to live in different eras when it comes to technical practices. This is exactly why we need to keep learning.

Perhaps the current level of continuous integration practice in China is still relatively primitive, which is the bad news. The good news is that by learning more and gaining a sufficient understanding of integration, we can jump straight to the most advanced state.

We don’t need to stay in the era of elite-centered integration, nor do we need to stop at daily builds. I hope you adopt the integration perspective of this era and go straight to continuous integration.

With the integration perspective of continuous integration, how should we view development? Development and integration are no longer two separate processes, but are combined into one.

Based on this understanding, we can no longer say that the code is complete and only integration is left, because that is not the completion of development. A good practice is to integrate the code with existing code as early as possible, rather than waiting for all the code to be developed before making the submission.

How do you integrate as early as possible? You need to understand task decomposition, which we will discuss in the “Task Decomposition” topic later.

Summary #

In software development, writing code is an important part of the work, but a programmer’s deliverable should not be just code; it should be working software. When we work in a team, the process of putting different people’s code together to make working software is called integration.

For a long time, integration has been a challenge in the software industry, because the amount of change and the time needed for integration drive each other. Fortunately, different people tried pushing in different directions: those who let both the amount of change and the integration time grow got stuck, while those who shrank both saw the light.

Daily builds were proposed as an early “best practice,” but they were not widely adopted, because without tool support they remained mostly a principle. As people “tuned” these parameters further, a more extreme practice emerged: continuous integration, which means integrating the code every time it is submitted.

What truly made continuous integration an industry best practice was Martin Fowler’s article and the continuous integration server. The mindset of continuous integration made us realize that development and integration can be combined. We should consider development complete only when the code has been integrated, and as individuals, we should submit our code and start integrating as early as possible.

If there is one thing you can remember from today’s content, please remember: submit your code for integration as early as possible.

Finally, I would like to ask you to share your thoughts on the difficulties you have encountered in your work due to integration. Please feel free to write your thoughts in the comments section.

Thank you for reading, and if you found this article helpful, please feel free to share it with your friends.