How I Improve the Development Process

Written by Kalan

This post is translated by ChatGPT and originally written in Mandarin, so there may be some inaccuracies or mistakes.

A few months ago, we launched several large features. Before the next project started, there wasn’t much development going on, mainly minor bug fixes and improvements to existing features. Since the pace was less intense, this period gave me more time to focus on process improvements.

Background

Let’s first discuss the challenges faced during development. Each team and organization encounters different situations, so understanding the context and the issues to improve is quite important. Several issues became apparent during the development phase:

Many People Involved in the Organization

Currently, I am involved in finance-related development, which includes products like FX (foreign exchange), stocks, investment trusts, etc. Larger features may be split off and handled separately (for example, FX may get its own app), but most features are implemented on the same website and managed by the development team responsible for each project. This leads to:

  • Every time a shared module is touched, multiple teams must be involved, resulting in a high cognitive load. There’s a fear that modifying the code might break something, which leads to a preference for workarounds rather than addressing the problem directly.
  • When multiple projects are developed in parallel, conflicts can easily arise. For instance, if a desktop version is implemented in Project A, but Project B, which is ongoing at the same time, hasn’t done so yet, merging the two projects can lead to conflicts.

Teams Cannot Manage All Resources

For example, during QA, we must align with the schedules of all projects, and since QA resources are limited, there are times when a feature is ready but we have to wait one or two months before QA can be conducted. This incurs significant costs:

  • Developers often forget implementation details after one or two months and need time to recall them.
  • When QA raises issues that need fixing, the entire iteration process becomes prolonged.
  • Some APIs or implementations require other teams to make corrections, and without control over their schedules, this can stall the development progress of our team.

Complicated QA Environment

Ideally, there should only be one QA environment.

However, in the context of parallel project development, limited testing environments, and high costs for creating new testing environments (due to architectural constraints), it is likely that QA will occur in two environments at the same time. In situations where testing environments are insufficient, sometimes development and QA have to share an environment, leading to several issues:

  • QA may encounter issues but cannot determine whether they are genuine bugs or due to developers testing new features.
  • The communication costs resulting from insufficient QA environments are quite high. With a large team, there are many channels to coordinate.
  • The process of preparing test accounts is cumbersome, resulting in various environments having faulty or unusable accounts.

Incomplete Sprints

Although the team is currently using Scrum for development, I’ve noticed that since leaving my previous project, Scrum has become quite ineffective. Possible reasons include:

  • Some engineers within the team are responsible for different projects, rendering the Sprint Goal almost meaningless.
  • There are no QA resources available for scheduling, making testing impossible.
  • Progress tracking is not rigorous enough, so a colleague often ends up working on task B while task A has the higher priority.
  • There are no clear goals or iterations.

From the above situations, it seems the team does not meet the conditions needed for a successful Sprint. Given that many resources are tied up in other areas, Sprints may not even be the best fit for us.

Improvement Process

While most of the issues cannot be resolved solely on the development side, I believe there are some improvements we can make in the development environment.

When working on improvements, it’s essential to first think about the motivation and objectives we want to achieve. So, as the first step, I prepared a document summarizing the current problems and possible solutions.

The main objectives are as follows:

  • Developers can freely switch between different branches of code. This matters because the members of our development team work on different projects, often several at once, which leads to requests like, "Hey, can I borrow the environment just to test this? I'll return it afterwards."
  • QA and development teams do not share the same environment. This way, we avoid situations where an issue arises and its cause cannot be determined.

With this document, I began discussions within the team, which had several purposes:

  • Confirm that everyone is facing similar issues, ensuring that our understanding of the situation is aligned.
  • Learn what considerations my colleagues have, since these are key to whether the subsequent improvements can be implemented successfully.
  • Brainstorming together might reveal flaws in my ideas, and gathering everyone’s thoughts could lead to better solutions.

Although this post focuses on the overall improvement process rather than the implementation, I’ll briefly describe the approaches I first considered:

  • On the frontend, since we are running a SPA, we can simply upload the built JavaScript to switch between different branches of code. This allows development to avoid sharing environments with QA, enabling different development teams to switch environments freely.
  • Use query parameters to switch between different environments, but maintaining these query parameters can be cumbersome, especially since the app has some redirect logic (like redirecting after payment).
  • Implement a feature similar to deploy previews (like Netlify), where after submitting a PR, the development team can click a URL to view the demo directly.
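
To make the maintenance cost of the query-parameter approach concrete, here is a minimal sketch of what every redirect in the app would need to do. The parameter name `env` and the helper `carry_env` are hypothetical and only for illustration; they are not taken from the actual codebase.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

ENV_PARAM = "env"  # hypothetical parameter name, assumed for this sketch


def carry_env(current_url: str, redirect_url: str) -> str:
    """Copy the environment selector from the current URL onto a redirect target."""
    current_query = dict(parse_qsl(urlsplit(current_url).query))
    env = current_query.get(ENV_PARAM)
    if env is None:
        return redirect_url  # no override requested, leave the target untouched

    target = urlsplit(redirect_url)
    # Drop any selector already on the target, then append the current one.
    query = [
        (key, value)
        for key, value in parse_qsl(target.query, keep_blank_values=True)
        if key != ENV_PARAM
    ]
    query.append((ENV_PARAM, env))
    return urlunsplit(target._replace(query=urlencode(query)))
```

Any redirect that skips a helper like this, such as the return from a payment page, silently drops the user back into the default environment, which is why this approach is tedious to maintain.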

After clarifying these points, I found that there were a few issues with these approaches:

  • In deployment, the frontend also needs a Node.js server that acts as a simple validation server and provides a few convenience APIs. So it’s not purely frontend: the file-upload approach can lead to misunderstandings, and more of the codebase would need to be modified, making it more likely to meet pushback.
  • Deploy previews often require dynamic DNS support, but our company uses a private cloud, and it’s unclear if there are APIs available to achieve this. Additionally, features like login that redirect to other pages for SSO have domain restrictions, causing the domain names generated by deploy previews to be rejected, resulting in login failures. While this approach works for features that don’t require login, it proves to be inefficient and not cost-effective.
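
The SSO problem can be illustrated with a small sketch. It assumes the login service checks the redirect target against an exact list of registered hosts; the real SSO configuration is not described here, so the allowlist, the host names, and the function `is_redirect_allowed` are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: only hosts registered in advance may receive the login redirect.
ALLOWED_REDIRECT_HOSTS = {"www.example.test"}


def is_redirect_allowed(redirect_url: str) -> bool:
    """Return True only when the redirect target's host is registered exactly."""
    host = urlsplit(redirect_url).hostname
    return host in ALLOWED_REDIRECT_HOSTS
```

Under this assumption, a generated preview host such as `pr-123.preview.example.test` is never in the list, so login fails for every preview unless each new domain is registered by hand.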

During this process, another engineer who had run into similar issues proactively joined the improvement effort (I initially thought I would be going at it alone), and we began to sort through the problems together. I first explained the concept of Deploy Preview to him, and once he understood it, he proposed a new idea that built on the existing setup.

“The cost of dynamically adjusting DNS is high; why don’t we use nginx for redirection?”

This means creating a new machine using the current deployment method (which takes just a few clicks in our internal private cloud), and then adding redirection logic to nginx on the existing machine.

If a certain cookie value is present, it redirects to the development machine; otherwise, it remains unchanged. This way, not only does the original codebase not require any modifications (just minor changes to the Jenkins script and adjustments to the Ansible script), but the existing deployment method can also be reused, significantly reducing the development costs.
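
The routing decision itself is small. The real logic lives in the nginx configuration, which is not shown here; the sketch below only models the same decision in Python. The cookie name `dev_env`, the host names, and the mapping from cookie values to machines are assumptions made for illustration.

```python
from http.cookies import SimpleCookie

ROUTING_COOKIE = "dev_env"  # hypothetical cookie name
DEFAULT_UPSTREAM = "qa.internal.example.test"  # hypothetical existing machine
DEV_UPSTREAMS = {  # hypothetical cookie value -> development machine
    "feature-a": "dev-a.internal.example.test",
    "feature-b": "dev-b.internal.example.test",
}


def choose_upstream(cookie_header: str) -> str:
    """Pick the machine that should serve a request, based on its Cookie header."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    morsel = cookies.get(ROUTING_COOKIE)
    if morsel is None:
        return DEFAULT_UPSTREAM  # no cookie: behavior stays unchanged
    # Unknown values fall back to the default instead of failing the request.
    return DEV_UPSTREAMS.get(morsel.value, DEFAULT_UPSTREAM)
```

Because the choice depends only on a cookie, the domain never changes. That sidesteps both the dynamic DNS requirement and the SSO domain restrictions that made deploy previews impractical.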

Once the plan was finalized, I consulted with other teams for their opinions, and generally, everyone was positive. Finally, I checked with the SRE team, and there were no issues. The entire implementation was thus concluded.

After the Improvements

Honestly, it’s hard to feel the tangible results just from written documentation, even if a demo is presented. The true impact is felt only after actual use. Once everyone began utilizing the new environment-switching mechanism, the feedback was overwhelmingly positive, indicating a genuine improvement in the process.

Insights

The improvement process took a little over three months, largely because it involved several teams. At the beginning, it’s crucial not to be impatient; otherwise, team members may not be able to keep up with what is changing and why. I learned quite a few things throughout this process.

1. Write Documentation First

Creating documentation is the first step in building team consensus. When writing documentation, focus less on the implementation details and more on how to solve the problem and what possible solutions exist, allowing the team to brainstorm better ideas. This approach also has the added benefit of making the team feel that this is not just your idea but something collaboratively discussed.

2. Start from the Whole, Not the Individual

One reason this improvement was well-received is that it wasn’t just my personal issue; many people in the team shared similar pain points. When making improvements, one cannot simply justify actions based on "this is more comfortable to write" or "I am more familiar with this technology." Instead, the focus should be on whether these improvements genuinely bring benefits, as that’s what will garner support and allies.

3. Don’t Be Overly Attached to Technology

To be honest, this time we didn’t really use any technical innovations; it was just adding a conditional statement. However, the overall benefits were substantial. Many engineers become overly fixated on the technology itself, often neglecting the problems that need to be solved. Technology is meant to serve humanity; if we overlook this fact, then the technology itself loses its meaning.

4. Don’t Fear Other Teams

The need for assistance from other teams can often be daunting, as most people find it troublesome and prefer to handle only their own responsibilities. However, as long as you can accurately articulate the problems and solutions, addressing their pain points, many times people will be on your side. Sometimes, what might seem like hurdles from others are not worth overthinking.

5. Find Like-Minded Partners

I believe a significant factor in the success of this initiative was the interest shown by another engineer in the matter. Coincidentally, he was a backend engineer with more architectural insight, which led to the nginx solution. However, finding like-minded partners is often a matter of chance, so when you do find one, be sure to cherish that collaboration.
