27 Standing on the Shoulders of Giants: Those Open-Source Tools You Cannot Ignore #

Hello, I’m Shi Xuefeng.

For companies, building a tool platform in-house is a costly, resource-intensive endeavor that also demands highly skilled technical staff. Very few companies can afford to put a team of nearly a hundred people on internal tooling the way the BAT companies do; after all, without that kind of scale, the return on such a platform would be fairly limited.

There are also a few companies that, like some industry leaders, spend heavily to purchase mature commercial tools outright, or partner with third parties to build tools jointly.

Both approaches require sustained, long-term investment and are not well suited to small and medium-sized enterprises. So, is there a low-cost solution that can be implemented quickly?

In fact, open-source tools are already very mature, and with a little familiarity you can quickly assemble a complete development and delivery toolchain platform on top of them.

A few years ago, some friends and I built such an end-to-end pipeline solution in our spare time. I vaguely remember that we completed the architecture diagram for this solution on the high-speed train from Beijing to Shanghai. Currently, this solution is widely circulated in the industry and has become a reference material for many companies to build their own internal toolchain platforms. The architecture diagram for this system is as follows:

Today, I will use this solution as the basis to walk you through, step by step, the techniques for using the tools in the code commit stage, the integration testing stage, and the deployment and release stage. The tool selection focuses mainly on mainstream open-source solutions, with commercial tools as secondary options, and covers Jira, GitLab, Jenkins, SonarQube, Kubernetes, and more. I hope this helps you quickly build a complete continuous delivery platform.

For a continuous delivery toolchain, connecting the tools is the core concern. I will therefore not spend much time explaining how to install the tools themselves; there are already plenty of resources for that, and you can also refer to each tool's official documentation. Besides, many tools now offer containerized deployment options, which further reduces the cost of setting them up.

Requirements Management - Jira #

In a prominent spot on the Jira official website there is a line that reads: the first choice in agile development tools. In my opinion, Atlassian has the confidence to make such a claim because Jira really is excellent; together with Confluence, it has almost become a standard combination in many enterprises. That is also why I did not choose the open-source tool Redmine or SaaS services such as Teambition.

Of course, in recent years major vendors have also been actively productizing their in-house R&D tooling, and agile collaboration tools represented by Tencent's TAPD are widely used. That said, the ideas behind these products are much the same: once you have mastered Jira, the other tools will feel familiar.

As an agile collaboration tool, Jira allows you to choose between the Scrum and Kanban methodologies when setting up a new project according to your team’s development mode. In the 8th and 9th articles of this column, I introduced you to Lean Kanban, and you can customize your team’s visual board in Jira.

The configuration process for the visual board is not complicated. I have summarized it in a document that you can download from the cloud drive link (extraction code: mrtd). One reminder: don't forget to add WIP (work-in-progress) limits, or your Lean Kanban will degenerate into nothing more than a visual board.

Requirements are the starting point of all development work and an important thread running through the entire development process. For Jira, the focus is on integrating with the version control system and with developer tools. Let's look at how to achieve each of these.

If you are also using the feature branch development mode, you should know that each feature corresponds to a task in Jira. You can create feature branches through tasks and bind all commits on the branches to specific tasks to establish a clear association between features and code. I recommend two implementation methods.

The first method is to use a native Jira plugin such as Git Integration for Jira. Configuring it is very simple: you only need to add the address of the version control system and the authentication method. After that, you can view commit information, compare differences, create branches, and create merge requests from within Jira. Note, however, that this is a paid plugin: you can try it for free for 30 days, after which you need to renew it to keep using it.

The second method is to connect Jira with GitLab through Webhooks.

First, you need to find the Jira option under "Settings - Integrations" in the GitLab project and add the corresponding configuration as shown in the figure below. Once the configuration is complete, you only need to include the Jira task ID in the commit message (for example, DEMO-123: add the login page, where DEMO-123 is the task ID) to associate the commit with the Jira task. These associations show up in the "Issue links" section of the Jira task.

In addition, you can have Jira task statuses transition automatically, without manually moving task cards. I have provided a configuration guide for reference.

However, this alone still does not let you create branches automatically from Jira tasks, so the next step is to configure Jira webhooks. In Jira's system administration interface, find the "Advanced Settings - Webhooks" option. After adding a webhook, you can bind it to the various events the system provides, such as task creation and task update, which covers the vast majority of scenarios.

Suppose our system needs to automatically create a branch from the mainline in GitLab whenever a Jira task is created. In that case, you can fill in the branch-creation API provided by GitLab as the webhook URL triggered by Jira. A reference example is shown below:

https://replace-this-with-your-GitLab-service-address/repository/branches?branch=${issue.key}&ref=master&private_token=[replace-this-with-your-account-token]

With this, the integration between Jira and GitLab is completed. Let’s summarize the functionalities that have been achieved:

  1. GitLab synchronizes every code change state to the Jira task and automatically associates (Issue links) Jira tasks with the code.
  2. You can add keywords such as Fixes/Resolves/Closes together with the Jira task number in a merge request to automatically transition the task status in Jira.
  3. Every time you create a task in Jira, a feature branch will be automatically created.

I have also shared the steps for integrating Jira with developer tools; you can download them from the cloud drive link (extraction code: kf3t). Many tool platforms are now developer-oriented, so the IDE, the tool closest to developers, has become the new battleground for improving efficiency. Cloud IDEs and IDE plugins alike aim to let developers complete all of their daily work, including managing branches and Jira tasks, without leaving the IDE.

Code Management - GitLab #

What is the development process in this example project? Let’s take a look together.

Step 1: Create tasks on the requirements management platform. These tasks are generally deliverable features. Do you remember? We have already automated the creation of feature branches through the previous steps.

Step 2: Developers develop and test locally on the feature branch. After completing the development, they push the code to the feature branch and trigger the commit stage pipeline. This pipeline is used to quickly verify the basic quality of the committed code.

Step 3: After the commit stage pipeline passes, developers create merge requests to merge the feature branch into the main codebase.

Step 4: Code reviewers review the merge requests. If there are any issues, they will point them out in the merge request. Finally, they accept the merge request and merge the feature code into the main branch.

Step 5: After the code is merged into the main branch, the integration stage pipeline is triggered immediately. The tasks in this stage are more comprehensive, and testing personnel can manually deploy the testing environment and verify the new features.

Step 6: The feature goes through the testing environment, pre-release environment, and is finally deployed to the production environment through the deployment pipeline.

In Lecture 12 of this column, I mentioned that the concept of continuous integration is to establish a rapid feedback loop for code quality by integrating code as early and as frequently as possible. Therefore, version control systems and continuous integration systems also need to be tightly integrated.

This integration is bi-directional: the version control system triggers the continuous integration system, and the results of continuous integration are reported back to the version control system.

Next, let’s see how it is implemented specifically.

Triggering Continuous Integration by Code Commit #

First, you need to install the GitLab plugin in Jenkins. The plugin exposes many GitLab environment variables for retrieving information from GitLab. For example, the gitlabSourceBranch parameter is very useful: it captures the branch that triggered the webhook for this commit. After all, only GitLab knows this information, and only by passing it along to Jenkins can the correct branch be checked out for continuous integration.
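
As an illustration, here is a minimal pipeline sketch showing how that variable might be used in a checkout step. It assumes the GitLab plugin is installed and the job is triggered by a GitLab push webhook; the repository URL and credentials ID are placeholders for your own values.

// A minimal sketch: check out the branch that triggered the GitLab webhook
pipeline {
    agent any
    stages {
        stage('Checkout') {
            steps {
                // gitlabSourceBranch is injected by the GitLab plugin on webhook-triggered builds;
                // fall back to master for manual runs
                git branch: "${env.gitlabSourceBranch ?: 'master'}",
                    credentialsId: 'replace-with-your-credentials-id',
                    url: 'http://replace-with-your-gitlab/your-group/your-repo.git'
            }
        }
    }
}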

When GitLab detects a code change event, it automatically calls the webhook address provided by this plugin, which parses the webhook data and triggers the corresponding Jenkins task.

In fact, when building our own pipeline platform, we can borrow this approach: have the backend call GitLab's API to register webhooks automatically, so that tasks are executed automatically in response to code change events.

After the GitLab plugin is installed, you can find a new option in the Build Triggers of Jenkins tasks. Check this option to activate the GitLab automatic triggering configuration. In the image below, I marked the two important pieces of information in red boxes:

  • The link above is the webhook address, which is unique for each Jenkins task;
  • The one below is the authentication token for this webhook.

You need to add these two pieces of information to the GitLab integration configuration. Open the “Settings - Integrations” option of the GitLab repository to see the GitLab webhook configuration page. Copy the address and token information generated by the Jenkins plugin to the configuration options, and select the corresponding trigger options.

GitLab provides multiple trigger options by default. In the screenshot below, only the Push event is selected, which means the webhook fires only when a Git push is detected. Of course, you can also configure it to listen only to specific branches, so that the associated Jenkins task runs only for feature branches. After finishing the configuration in GitLab, you can see the newly added webhook and click "Test" to verify that it executes properly. If everything is normal, it will report "200 OK".

Updating the Code Status in Continuous Integration #

Open the Jenkins system management page, find the GitLab configuration, and add the GitLab server address and authentication method. Note that for the Credentials you need to select the GitLab API Token type. The corresponding token can be generated under GitLab's "User - Settings - Access Tokens". The token is shown only once, at generation time, and can never be viewed again, so store it somewhere safe after generating it.


So, how do you update the commit status in GitLab? This requires the use of the Build status configuration command provided by the plugin.

For Jenkins tasks of the freestyle type, you can add the Post-build Actions step - Publish build status to GitLab. It will automatically update the queued task to “Pending”, the running task to “Running”, and the completed task to “Success” or “Failed” based on the result.

For pipeline-type tasks, the official plugin documentation also provides corresponding example code; you just need to add it to your Jenkinsfile.

updateGitlabCommitStatus name: 'build', state: 'success'
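
Beyond that one-liner, the plugin also offers a gitlabCommitStatus wrapper step, which marks the named status as running when the enclosed steps start and as success or failed when they finish. Here is a minimal sketch; the stage name and build command are illustrative placeholders.

pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                // Report the "build" status to GitLab as running, then success or failed
                gitlabCommitStatus(name: 'build') {
                    sh './build.sh'   // placeholder build command
                }
            }
        }
    }
}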

With this setup, the results of each pipeline triggered by code commits will also be displayed in the commit status in GitLab, which can be used as a reference when reviewing merge requests. Some companies may be more direct: if the pipeline status is not successful, the merge request will be automatically closed. Regardless of the method used, the intention is to prompt developers to fix continuous integration issues as soon as possible.

Let’s summarize the functionalities that have been implemented:

  1. Every code commit on GitLab can trigger the corresponding Jenkins task through a webhook. The specific task triggered depends on which Jenkins task address you add to the GitLab webhook configuration.
  2. After each Jenkins task is executed, the execution result will be written in the GitLab commit log. You can view the execution status and decide whether to accept the merge request.

Code Quality - SonarQube #

SonarQube is a popular open-source code quality platform. It performs static code scanning to find defects and vulnerabilities, provides basic security checking capabilities, and also collects metrics such as unit test coverage and code duplication.

For companies just starting to pay attention to code quality and technical debt, SonarQube is a relatively approachable choice. Technical debt was explained thoroughly in the 15th lesson of this column; if you don't remember it, go back and review it.

Routine tasks like code quality checks are well suited to automation, and the best way to automate them is to build them into the pipeline, which requires integrating SonarQube with Jenkins. Let me briefly explain the execution logic, which should help you better understand the configuration process.

The SonarQube platform actually consists of two parts:

  • One is the server-side, used to collect and display code quality data, which is the most commonly used feature.
  • The other is the client side, namely the SonarQube Scanner. It runs on the client side, that is, in the same environment as the code, and handles analysis, data collection, and reporting. It is easy to overlook because it is configured in the Jenkins backend: you can find it in Jenkins' global tool configuration, and if the tool is not present on a node, Jenkins downloads it automatically.

SonarQube

Once you understand the execution logic of the quality scan, you can see that the integration between SonarQube and Jenkins only needs to work in one direction: it is enough for the Scanner running on the Jenkins side to collect the data correctly and report it to the SonarQube server.

The configuration is also very simple: just add the SonarQube server address in Jenkins' global configuration. Be sure to check the first option so that the SonarQube server configuration is automatically injected into the pipeline's environment variables.

Jenkins SonarQube configuration

When executing Jenkins tasks, the way results are reported differs between freestyle tasks and pipeline tasks. For the details, you can refer to the official SonarQube documentation; I won't go into them here.
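
For reference, here is a minimal pipeline-style sketch of a scan stage that you could drop into a declarative pipeline. It assumes the SonarQube Scanner plugin is installed, a scanner tool named SonarQubeScanner is defined in the global tool configuration, and the server configured in Jenkins is named DevOpsSonar (use whatever names you entered in your own configuration); the optional quality gate check additionally requires a webhook from SonarQube back to Jenkins.

stage('Static Scan') {
    steps {
        script {
            // Locate the scanner defined in Jenkins' global tool configuration
            def scannerHome = tool 'SonarQubeScanner'
            // Inject the server address and token configured under the name 'DevOpsSonar'
            withSonarQubeEnv('DevOpsSonar') {
                sh "${scannerHome}/bin/sonar-scanner"
            }
            // Optionally fail the pipeline if the SonarQube quality gate does not pass
            timeout(time: 10, unit: 'MINUTES') {
                def qg = waitForQualityGate()
                if (qg.status != 'OK') {
                    error "Quality gate failed: ${qg.status}"
                }
            }
        }
    }
}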

So far, we have successfully integrated GitLab, Jenkins, and SonarQube. I will share a system relationship diagram with you, hoping to help you better understand the meaning and process of system integration.

System Integration

Environment Management - Kubernetes #

Finally, let's look at environment management. As the operating system of the cloud-native era, Kubernetes has become the de facto standard for container orchestration. For DevOps engineers, Kubernetes is a must-learn, must-have skill, and this trend is already very clear.

In the sample project, we also use Kubernetes as the underlying environment, and all the environments for Jenkins tasks are dynamically provisioned through Kubernetes.

There are many benefits to this approach. On one hand, it standardizes the environment: all environment configuration is written as code in Dockerfiles, so environments are uniform and controllable. On the other hand, it greatly improves resource utilization: you no longer depend on the configuration and capacity of a particular host machine; you simply tell Kubernetes how many resources you need, and it finds a suitable physical node to run the container. Resource scheduling and allocation are handled uniformly by Kubernetes, which further improves the efficiency of resource usage. For small and medium-sized systems, it takes only minutes to initialize a complete environment. I will discuss this in more detail when I talk about platform construction for cloud-native applications.

To achieve dynamic initialization of the environment, you need to integrate Jenkins and Kubernetes. Fortunately, Jenkins provides an official Kubernetes plugin to achieve this functionality. You can add a cloud - Kubernetes in the Jenkins system configuration and then configure it according to the attached image.

It is important to note that the Jenkins address must be correctly configured (System Configuration - Jenkins Location), otherwise new containers will not be able to connect to Jenkins.

When generating dynamic nodes, you need to use the JNLP protocol, and I recommend using the official Jenkins-provided image.

JNLP stands for Java Network Launch Protocol, a general-purpose protocol for launching Java applications remotely. The typical usage is that the build node (also known as the agent or slave node) initiates the connection and actively attaches itself to the Jenkins master, which then schedules work onto it. Unlike a long-lived SSH connection, this connection model is particularly well suited to dynamic nodes such as those in Kubernetes. The image configuration is shown in the attached image.

When configuring dynamic nodes, there are a few key points that you need to pay special attention to.

  1. Static directory mounting. Since a brand-new container environment is created each time, you need to mount static data, such as the code cache (e.g., the .git directory), dependency caches (.m2, .gradle, .npm), and external tools, into the container through volumes; otherwise, re-downloading them every time will slow down execution (see the sketch after this list).
  2. If your Jenkins itself also runs in Kubernetes, remember to configure Jenkins' JNLP port through the environment variable JENKINS_SLAVE_AGENT_PORT; otherwise, the port number configured in the system settings will not take effect.
  3. Since initializing a container each time has a certain time cost, you can configure an idle retention time. The environment is then kept alive for a while after a task finishes, and if a new task arrives within that window, it reuses the existing container instead of spinning up a new one.
  4. If network conditions are poor, you can increase the container creation timeout, which defaults to 100 seconds. If the container cannot be created within that time, Jenkins kills the creation process and tries again.
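
To make the volume mounting in point 1 and the JNLP image recommendation above more concrete, here is a minimal scripted-pipeline sketch using the Kubernetes plugin's podTemplate step. The cloud name, node label, agent image tag, and host paths are placeholders for your own values, and the same settings can also be filled in through the cloud configuration UI mentioned earlier.

// A minimal sketch: dynamically provision a build pod with a JNLP agent container
// and mounted dependency caches (all names and paths are placeholders)
podTemplate(
    cloud: 'kubernetes',
    label: 'pipeline-slave',
    containers: [
        // The jnlp container connects back to the Jenkins master over the JNLP protocol
        containerTemplate(name: 'jnlp', image: 'jenkins/inbound-agent:latest',
            args: '${computer.jnlpmac} ${computer.name}')
    ],
    volumes: [
        // Mount dependency caches from the host so each new pod does not re-download them
        hostPathVolume(hostPath: '/nfs/.m2', mountPath: '/root/.m2'),
        hostPathVolume(hostPath: '/nfs/.npm', mountPath: '/root/.npm')
    ]
) {
    node('pipeline-slave') {
        stage('Build') {
            sh 'echo "building inside a dynamically provisioned pod"'
        }
    }
}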

If everything goes well, the dynamic Kubernetes environment is now ready to use, and we can run a complete pipeline. When designing the pipeline, pay attention to how it is layered; the specific steps of each pipeline are laid out in the system architecture diagram. For example, the commit stage pipeline needs to complete four steps: fetching the code, building and packaging, unit testing, and code quality analysis, as shown below:

// pipeline 2.0 - Commit stage - front-end
pipeline {
    agent {
        // Kubernetes node label
        label 'pipeline-slave'
    }
    environment {
        // Image repository address
        HARBOR_HOST= '123.207.154.16'
        IMAGE_NAME = "front-end"
        REPO = 'front-end'
        HOST_CODE_DIR = "/home/jenkins-slave/workspace/${JOB_NAME}"
        GROUP = 'weaveworksdemos'
        COMMIT = "${currentBuild.id}"
        TAG = "${currentBuild.id}"
        TEST_ENV_NAME = 'test'
        STAGE_ENV_NAME = 'staging'
        PROD_ENV_NAME = 'prod'
        BUILD_USER = "${BUILD_USER_ID}"
        // Static data to be mounted in the container
        COMMON_VOLUME = ' -v /nfs/.m2:/root/.m2  -v /nfs/.sonar:/root/.sonar -v /nfs/.npm:/root/.npm '
    }
    stages {
        stage('Checkout') {
            steps {
                git branch: 'xxx', credentialsId: '707ff66e-1bac-4918-9cb7-fb9c0c3a0946', url: 'http://1.1.1.1/shixuefeng/front-end.git'
            }
        }
        stage('Prepare Test') {
            steps {
                sh '''
                docker build -t ${IMAGE_NAME} -f test/Dockerfile .
                docker run --rm -v ${HOST_CODE_DIR}:/usr/src/app ${IMAGE_NAME} /usr/local/bin/cnpm install
                '''
            }
        }
        stage('Code Quality') {
            parallel {
                stage('Unit Test') {
                    steps {
                        sh '''
                        docker run --rm -v ${HOST_CODE_DIR}:/usr/src/app ${IMAGE_NAME} /usr/local/bin/cnpm test
                        '''
                    }
                }
                stage('Static Scan') {
                    steps {
                        sh 'echo "sonar.exclusions=node_modules/**" >> sonar-project.properties'
                        script {
                            def scannerHome = tool 'SonarQubeScanner';
                            withSonarQubeEnv('DevOpsSonar') {
                                sh "${scannerHome}/bin/sonar-scanner"
                                updateGitlabCommitStatus name: 'build', state: 'success'
                            }
                        }
                    }
                }
            }
        }       
    }
}

If you follow the steps I just described, you will end up with a complete, working pipeline like this:

Combined with Jenkins' manual approval capability, this enables a mix of automated and manual deployments across multiple environments, forming a true end-to-end continuous delivery pipeline.
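
For reference, a manual approval gate in a Jenkins pipeline is typically implemented with the input step. Here is a minimal sketch; the stage name and deployment command are illustrative placeholders.

stage('Deploy to Production') {
    steps {
        // Pause the pipeline until a person confirms the production deployment
        input message: 'Deploy to production?', ok: 'Deploy'
        sh './deploy.sh prod'   // placeholder deployment script
    }
}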

Summary #

In today's lesson, I used an open-source pipeline solution to show you how to build a continuous delivery pipeline platform based on open-source tools. You may have realized that, for DevOps, the real challenge lies not in the tools themselves but in how to connect and combine them around the whole development process so that their strengths are maximized. The same ideas apply to self-built platforms, and you need to practice them to apply them proficiently.

Discussion Questions #

Do you have any questions about the overall toolchain, configuration, or design approach of this open-source pipeline solution? What problems did you run into during implementation that you could not get past?

Please feel free to share your thoughts and answers in the comments section. Let’s discuss and learn together. If you find this article helpful, you are also welcome to share it with your friends.