User Stories: How Operations and Software Engineers Communicate

The main content of this column has wrapped up. After nearly five months of learning, what stories do you have to tell? This issue presents user stories: we invited several of our most active engineer students to share their learning stories and experiences.

Ninuxer #

Let me start with my background. I am based in Shanghai and currently do operations work at a startup. I have been in this field for about 5 years. In many people’s eyes, operations is a support or service role whose practitioners are colloquially known as “firefighters,” responsible for keeping the online business stable.

For the past few years, my way of working was basically this: if there were no problems, everyone was happy; if there were problems, I relied on instinct (experience) to solve them; and if I couldn’t solve them, I Googled. Because I had no complete system for analyzing and solving problems, I always used superficial fixes that treated only the symptoms, without digging into the root cause, whether it lay in the hardware, the system, the middleware, or bugs in our own programs.

By chance, a technical public account that I follow recommended the Geek Time column “Linux Performance Optimization in Practice.” When I saw the words “performance optimization in practice,” I instinctively clicked through to learn more and discovered that it was written by a familiar expert. I had previously studied some of the author’s blog posts about k8s and knew how knowledgeable they were. So, without hesitation, I subscribed to the column and recommended it to some friends so that they could join in too!

From what I can see, the author has put a lot of thought into the design of the column: each section includes cases, summaries, methodologies, and Q&A. For me, the biggest gains are the problem-solving approach, which works from the surface of a problem down to its root cause, and the use of analysis tools.

The tools section deserves special praise. Each knowledge module has two tables, one for performance metrics and one for tools. Even if I forget the details, I can quickly find the right tool by consulting these tables.

Now let me talk about how I study. Since I am quite busy at work, I mainly read the articles during my commute. On weekends, I find time to work through the case studies in the articles and then review the content in light of the results. After each module finishes, I review it over the weekend to deepen my understanding.

Finally, I want to express my sincere gratitude to the author, whose hard work has given me a comprehensive understanding of Linux system issues. At the same time, I am well aware that system optimization is a gradual process that cannot be mastered through a single article, column, or book. It requires long-term learning and practice.

May we all stay true to our original aspirations and forge ahead together!

Jia #

I am Yong Jia, and I have been working since 2007. Early on, I mainly did C/C++ development for video monitoring and video processing, and later worked on low-level Linux distributed file systems and FUSE. I am particularly interested in application containerization and Kubernetes-related work. I currently work for Tuputech, an AI Internet company that provides image and video content moderation and business intelligence. I am mainly responsible for building the company’s Kubernetes system and containerizing its applications.

Since last year, I have been researching and building a Kubernetes system for the company. Along the way, I had to choose a Kubernetes version and to test and select versions of Docker and the Linux kernel. None of this was easy, and I ran into quite a few challenges. Feeling the pressure, I decided to study systematically. Around the same time, I happened to see someone recommending this column in my friends’ circle, so I bought it out of curiosity and started learning.

At the beginning of my learning journey, I actually had a lot of questions in mind.

  • The company’s training programs often occupy 10 to 40 CPU cores. Where exactly is that CPU time being spent?

  • When running training jobs on Docker/k8s, containers often cannot be deleted. What should I do?

  • How do I deal with zombie processes that cannot be cleaned up?

These questions have been answered in the column. For example, in the CPU section of the first module, I came to understand the concept of load average, and I learned how to find the application with the highest CPU usage and how to use perf to locate the problem inside that application.

The teacher has taught many methods, and I need not only to understand them but also to learn how to apply them in my own analysis. I also think the column works well as a way to learn. Because I can read other people’s comments below each article, I get to see the different pitfalls others have run into and the different solutions they found, which has benefited me a lot.
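To make the load-average idea concrete, here is a minimal Python sketch (not taken from the column) that normalizes the load average by the number of logical CPUs. The function names and the 0.7 threshold are assumptions chosen for illustration; `os.getloadavg()` and `os.cpu_count()` are standard-library calls.

```python
import os


def load_per_cpu(load: float, cpu_count: int) -> float:
    """Normalize a load average by the number of logical CPUs."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load / cpu_count


def is_overloaded(load: float, cpu_count: int, threshold: float = 0.7) -> bool:
    """Flag load above threshold * CPU count.

    The 0.7 default is a hypothetical rule of thumb, not a fixed limit.
    """
    return load_per_cpu(load, cpu_count) > threshold


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages
    # (available on Unix-like systems).
    one, five, fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"load averages: {one:.2f} {five:.2f} {fifteen:.2f} on {cpus} CPUs")
    print("1-minute load looks high" if is_overloaded(one, cpus) else "1-minute load looks fine")
```

A load of 4.0 means something quite different on a 4-core machine than on a 40-core one, which is why the sketch divides by the CPU count before comparing against a threshold. A high value only says that something is wrong, so the next step would still be to find the busiest process and profile it with a tool such as perf.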

Finally, I would like to thank the author for updating the column during the Spring Festival and patiently answering everyone’s questions. I hope the author can produce more courses, such as content about k8s. I also hope that other learners can learn from each other and progress together.

The Brightest Star in the Night Sky #

When “Linux Performance Optimization in Practice” launched on Geek Time, I subscribed immediately. The reason is simple: I am an operations engineer, and I really need this course. Operations work regularly calls for some optimization, but my knowledge in this area was neither deep nor systematic. I urgently needed to learn and improve!

“Linux Performance Optimization in Practice” has truly improved my work skills. What the teacher covers can be applied to real work right after learning it, which is very satisfying and has greatly increased my interest in learning. With the course as a foundation, when I go deeper into optimization I will have a clear plan and will not be intimidated.

Every part of the course is fascinating, and my favorite is the strategy section. Having a strategy is like a soldier having a trusty weapon instead of fighting barehanded. My operations skills have moved up a level, and when I run into performance-related issues I no longer feel helpless.

I also have to say that the charts the teacher put together are really handy. I saved them to my phone’s photo album so that I can consult them anytime, anywhere. They can also be printed and posted on the office wall, where seeing them day after day helps improve my efficiency.

Every topic in the course has been rewarding for me. The one that impressed me most is the section on Buffer and Cache. Through this article, I gained a clear understanding of the differences and connections between the two, and it felt like a moment of sudden clarity. It is rare for learning to be so exciting and enjoyable.

I would like to thank the teacher for the hard work, as well as the Geek Time editors and the entire Geek Time team. It is no exaggeration to say that I am now an “addicted” user of Geek Time. I listen to it when I’m off work and when I’m at work, and it has become a normal part of my life. I’m looking forward to a second season of the teacher’s content and to Geek Time producing more and better knowledge products. Thank you, and let’s continue the Geek Time journey together.
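As a small illustration of where these two numbers can be seen on a Linux system, the sketch below (not from the column) parses text in the format of /proc/meminfo and pulls out the `Buffers` and `Cached` fields. According to the proc(5) manual page, `Buffers` is temporary storage for raw disk blocks, while `Cached` is the page cache for files read from disk. The function names and the sample values are made up for illustration.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Map each /proc/meminfo field name to its integer value.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts with no unit.
    """
    values: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def buffers_and_cached_kb(text: str) -> Dict[str, int]:
    """Return only the Buffers and Cached fields, in kB (0 if absent)."""
    info = parse_meminfo(text)
    return {"Buffers": info.get("Buffers", 0), "Cached": info.get("Cached", 0)}


# Illustrative sample in /proc/meminfo format; the numbers are invented.
SAMPLE = """\
MemTotal:        8169348 kB
MemFree:         6271436 kB
Buffers:            2496 kB
Cached:           615504 kB
"""

if __name__ == "__main__":
    print(buffers_and_cached_kb(SAMPLE))
```

On a real machine, the same functions could be fed the contents of /proc/meminfo, for example `buffers_and_cached_kb(open("/proc/meminfo").read())`.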