Kalan's Blog

Software Engineer / Taiwanese / Life in Fukuoka

Please note that most posts are currently translated automatically by AI and may contain confusing passages. I'll gradually translate the posts properly as soon as I can.

Setting up a server in a company is not as easy as I imagined.

Motivation

In this project, the previously static Landing Page needed to be updated dynamically with data from API calls. As interactivity on the page increased, the existing pug + webpack + jQuery setup was no longer sufficient, so we decided to introduce next.js for the new version.

Let's first talk about the original architecture of the Landing Page. During CI, the static HTML pages were packaged and uploaded to the CDN along with CSS, JavaScript, and images. The HTML files were directly hosted on the server, with an nginx reverse proxy in front.

Wait, if everything is already on the CDN, why are the HTML files still hosted on the server?

This is due to historical reasons. The company uses a self-built CDN, which has limited support for custom domains. If we were to upload the HTML files to the CDN, the domain name would also change, which would not be ideal for the planning team.

Another issue is that besides the Landing Page developed by frontend engineers, sometimes the planning team uses a template Landing Page generator to create pages, and the domain for this is fixed and cannot be changed. Therefore, in order to accommodate both scenarios (template Landing Page and Landing Page developed by the team), using nginx as a proxy was a compromise.

This is where the problem arises: next.js is not compatible with the original development architecture. Rewriting every existing Landing Page in React seemed impractical, so we decided to add a new server dedicated to Landing Pages built with next.js, and to route requests to it through nginx so that the domain stays the same.
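To make the idea concrete, here is a minimal sketch of path-based routing behind a single domain. It is a TypeScript model of the decision, not our actual nginx configuration; the upstream names and path prefixes are hypothetical, since the real ones are not important here. The longest matching prefix wins, which is similar to how nginx chooses among prefix location blocks.

```typescript
// Hypothetical upstream names, for illustration only.
type Upstream = "template-generator" | "static-landing" | "nextjs-landing";

interface Route {
  prefix: string;
  upstream: Upstream;
}

// Example routing table (the path prefixes are assumptions, not real paths).
const routes: Route[] = [
  { prefix: "/lp/", upstream: "static-landing" },
  { prefix: "/lp/template/", upstream: "template-generator" },
  { prefix: "/lp/next/", upstream: "nextjs-landing" },
];

// Pick the upstream whose prefix is the longest match for the request path.
// Returns null when no prefix matches.
function pickUpstream(path: string, table: Route[]): Upstream | null {
  let best: Route | null = null;
  for (const route of table) {
    const matches = path.startsWith(route.prefix);
    if (matches && (best === null || route.prefix.length > best.prefix.length)) {
      best = route;
    }
  }
  return best === null ? null : best.upstream;
}
```

With this table, a request for /lp/next/campaign would go to the next.js server while /lp/old-page would still be served by the static server, and visitors would see the same domain either way.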

Issues

Before we continue with this article, let's talk about the problems with the original static server and the improvements that come with the new server.

1. SEO

The main consideration is SEO. The original static server could serve SEO-friendly pages, but only for content that is known at build time. As mentioned earlier, once a page needs data from API calls, or content such as blog articles, that data would have to be fetched with ajax or fetch in frontend JavaScript, which is not SEO-friendly. Another consideration is CORS, since the API server and the Landing Page are on different domains.

2. Performance

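Although the Landing Page does not use much data, rendering on the server (SSR) means the browser receives HTML that already contains the data instead of waiting for an extra client-side request, which should improve perceived performance.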

3. Benefits of the Server

Having a node.js backend server gives us more flexibility in meeting the planning team's requirements. Next.js itself supports several rendering methods, such as pre-rendering pages into static files at build time with getStaticProps, and rendering on each request (SSR) with getServerSideProps. Having a node.js backend also makes it easier to implement caching or database access.
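As a rough sketch of what request-time rendering with a small cache could look like: the types below are local stand-ins rather than imports from next, and Article, TtlCache, and makeGetServerSideProps are hypothetical names for illustration. In a real page you would export getServerSideProps directly; the factory here only exists so the loader and clock can be injected.

```typescript
// Local stand-in for the { props } object that getServerSideProps returns.
type PropsResult<P> = { props: P };

// Hypothetical article shape, for illustration only.
interface Article {
  id: number;
  title: string;
}

// Tiny in-memory cache with a time-to-live; the clock is injectable for testing.
class TtlCache<T> {
  private entry: { value: T; expiresAt: number } | null = null;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is fresh; otherwise call load() and store the result.
  getOrLoad(load: () => T): T {
    const current = this.now();
    if (this.entry !== null && current < this.entry.expiresAt) {
      return this.entry.value;
    }
    const value = load();
    this.entry = { value, expiresAt: current + this.ttlMs };
    return value;
  }
}

// Build a getServerSideProps-style function around an injected loader.
// The cache stores the promise itself, so concurrent requests share one API call.
// Limitation: a rejected promise also stays cached until the TTL expires.
function makeGetServerSideProps(
  loadArticles: () => Promise<Article[]>,
  cache: TtlCache<Promise<Article[]>>,
) {
  return async function getServerSideProps(): Promise<PropsResult<{ articles: Article[] }>> {
    const articles = await cache.getOrLoad(loadArticles);
    return { props: { articles } };
  };
}
```

Because the data is fetched on the server, the HTML sent to the browser (and to crawlers) already contains the articles, and the call to the API server never runs into CORS.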

Based on the above reasons, we decided to set up a new server. However, little did we know that the troubles were just beginning.

Struggles

Coordinating with SRE

To set up the server, we had to coordinate with the SRE team and explain the issues. The SRE team is cautious by nature and had many concerns, and they were also responsible for other projects, so communication took a lot of time. Still, our motivation was clear and I had laid out the entire architecture in advance, so the approval itself came quickly.

Our cloud is a private cloud, and in the alpha environment each developer can create their own machines, so the alpha setup was completed quickly. The beta and production environments, however, required relatively complex procedures, such as applying for ACLs and having SRE set up the infrastructure. Setting up the machines was easy, but installing the various packages (monitoring tools, node.js, nginx, etc.) was troublesome. Fortunately, with SRE's assistance, the installation went smoothly.

Nginx Configuration with a Rich History

Setting up the server was straightforward, but what troubled me was the long-standing nginx configuration file. It contained page redirects from various sources, maintenance-mode handling, and special handling for specific paths.

Deployment through Ansible Playbook

SRE helped with installing the environment, but deployment still had to be done, and we had to figure this part out ourselves. The company mostly uses ansible playbooks + AWX for deployment (AWX provides a web GUI and API for running playbooks). Luckily, I had collaborated with colleagues on other projects, so I had some experience with playbooks. Otherwise, the large chunk of YAML would have been overwhelming.

Docker

For convenience, the recent deployment practice is to set up a machine, install Docker, and run a Docker image on it as a systemd service (managed with systemctl).

I thought packaging would be easy, but once I started I ran into various issues, which again trace back to the project's history. We originally adopted a lerna monorepo, where each subfolder references files in the root, making it difficult to package a single subfolder on its own.

Lerna has two main features:

  • When packages are installed, dependencies shared across subprojects are hoisted to the root node_modules directory.
  • In subprojects, you can use import a from 'sub-project' to reference functions from other subprojects (achieved by adding a symbolic link in node_modules).

The resulting image was over 1GB, which made me realize the horror of node_modules. I also spent a long time debugging a symbolic link issue: if the symbolic link was not created, npm would try to fetch the sub-project package from the registry, where it most likely does not exist.
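Why does copying only one subfolder break things? The sketch below is a simplified model of how Node.js looks up a bare import such as 'sub-project': it checks node_modules in the importing directory, then in each parent directory. The directory paths are hypothetical, and the model ignores details of the real algorithm (for example, it does not skip directories that are themselves named node_modules).

```typescript
// List the node_modules locations that would be checked for a package,
// starting at fromDir and walking up to the filesystem root.
function candidateDirs(fromDir: string, name: string): string[] {
  const parts = fromDir.split("/").filter((p) => p.length > 0);
  const candidates: string[] = [];
  for (let i = parts.length; i >= 0; i--) {
    const base = "/" + parts.slice(0, i).join("/");
    const prefix = base === "/" ? "" : base;
    candidates.push(`${prefix}/node_modules/${name}`);
  }
  return candidates;
}

// Return the first candidate that exists in the given set of paths, or null.
function resolvePackage(
  fromDir: string,
  name: string,
  existing: Set<string>,
): string | null {
  for (const candidate of candidateDirs(fromDir, name)) {
    if (existing.has(candidate)) {
      return candidate;
    }
  }
  return null;
}

// Inside the monorepo (hypothetical layout), the hoisted package and the
// symbolic link both live in the root node_modules.
const monorepo = new Set<string>([
  "/repo/node_modules/react",
  "/repo/node_modules/sub-project",
]);

// Inside an image that contains only the copied subfolder, nothing was hoisted.
const imageWithSubfolderOnly = new Set<string>();

const inRepo = resolvePackage("/repo/packages/landing", "sub-project", monorepo);
const inImage = resolvePackage("/app", "sub-project", imageWithSubfolderOnly);
```

Here inRepo resolves to the root node_modules, while inImage is null because the subfolder was copied without the root. That is roughly why the image ended up needing the root node_modules as well, and with it the extra size.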

Jenkins Integration

The company uses Jenkins for integration, so the Docker image is also built and uploaded to the internal Docker registry through Jenkins. However, Jenkins sometimes failed with unknown errors during packaging, which went away after rerunning the job a few times.

Risk Assessment

Because this was a new server, company procedure required a risk assessment.

Security Check

Although there is always a security check for major projects, setting up a new server also requires the Security Team to inspect for security issues and make necessary fixes.

Underestimation of the Timeline

The introduction of next.js was not initiated by our team but by colleagues from another office, in a different project. I initially thought their server setup was already complete, and that when developing the Landing Page we would just need to follow suit. In reality, however, that project only used the SSG (Static Site Generation) feature of next.js and deployed the generated static files to the original Landing Page server.

Although I was a bit surprised at the time, there was still over a month until QA testing, and we were only applying the server functionality of next.js and deploying the generated static files, so it should have been more than enough. However, some features had already entered QA testing by then, and development of the Landing Page itself had not yet begun. Although the page itself was finished in just one week, handling QA, communicating with other departments, designing the architecture, and the steps described above took a month before I knew it. On top of that, a new project was under evaluation and the launch date was approaching, leaving me busy and without anyone to help.

During this period, I experienced significant stress and even had to work overtime on holidays, barely able to cope.

Reflection

Unexpected situations like this often call for more preparation time. Honestly, going through all of this just to improve SEO is not very appealing, and it probably won't yield much benefit. Ah, as I grow older, I should also reconsider my work style.

1. Becoming the Bottleneck

This server setup involved a lot of knowledge unrelated to frontend work, and it required familiarity with the company's internal tools. I learned many things step by step by reading documentation, and because so little of it was relevant to frontend work, it was difficult for team members to help. I keep wondering whether there is a better approach, but that's just how things are: if I gave up and stopped, the proposal would simply go nowhere. Still, this is not a good situation.

2. Underestimating the Complexity of Server Setup

It is not so much that setting up a server is difficult; rather, the cross-department communication and preparation work involved inevitably lengthened the entire development cycle. Honestly, you wouldn't know until you go through it. The best thing to do next is to write this experience up as documentation so that the next attempt is easier.

3. Mainly Japanese-speaking Departments

Apart from our team, most departments are primarily composed of Japanese-speaking members. Insufficient Japanese ability easily leads to communication breakdowns, so I sometimes become the bridge for communication and end up involved in many things. This is not a good situation either, and I'm still thinking about how to improve it.

4. Insufficient Preparations

I did not realize at the time that there was no server yet. That was due to insufficient research on my part beforehand, and it is something I need to reflect on. If a similar server setup is needed in the future, it would be better to allocate more time to guard against unforeseen circumstances.

5. Multiple Timelines Colliding

In addition to the server setup, there were many timelines colliding at that time, such as fixing existing QA issues, developing the Landing Page, code review, communicating with SRE, discussing solutions for changing requirements, etc. Constant context switching compressed the time available for focusing on server setup.
