Motivation
In this project, the previously static Landing Page needed to show data fetched through API calls, and as interactivity on the page increased, the existing pug+webpack+jQuery setup was no longer sufficient. We therefore decided to introduce next.js for the new version.
Let's first talk about the original architecture of the Landing Page. During CI, the static HTML pages were packaged and uploaded to the CDN along with CSS, JavaScript, and images. The HTML files were directly hosted on the server, with an nginx reverse proxy in front.
Wait, if everything is already on the CDN, why are the HTML files still hosted on the server?
This is due to historical reasons. The company uses a self-built CDN, which has limited support for custom domains. If we were to upload the HTML files to the CDN, the domain name would also change, which would not be ideal for the planning team.
Another issue is that besides the Landing Page developed by frontend engineers, sometimes the planning team uses a template Landing Page generator to create pages, and the domain for this is fixed and cannot be changed. Therefore, in order to accommodate both scenarios (template Landing Page and Landing Page developed by the team), using nginx as a proxy was a compromise.
This is where the problem arises: next.js is not compatible with the original development architecture. Rewriting the entire Landing Page in React seemed impractical, so we decided to add a new server dedicated to Landing Pages developed with next.js, and to forward requests to it through nginx so that the domain stays the same.
Issues
Before we continue with this article, let's talk about the problems with the original static server and the improvements that come with the new server.
1. SEO
The main consideration is SEO. The original static server could serve SEO-friendly pages, but only for content that already exists at build time. As mentioned earlier, once we need to fetch data through API calls, or support requirements like blog articles, that data would have to be loaded by frontend JavaScript through ajax or fetch, which is not SEO-friendly. Another consideration is CORS, since the API server and the Landing Page are on different domains.
2. Performance
Although the Landing Page does not use much data, SSR should still improve performance, since the page arrives with its data already rendered instead of waiting for a client-side request.
3. Benefits of the Server
Having a node.js backend server allows for more flexibility in meeting the planning team's requirements. Next.js itself supports various build methods, such as pre-packaging into static files with getStaticProps, and using SSR with getServerSideProps. Having a node.js backend also facilitates the implementation of caching or database access requirements.
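As a rough illustration of the two data-fetching modes mentioned above, here is a minimal TypeScript sketch. It assumes the Next.js Pages Router; the Article type, the loadArticles helper, and the renderedAt field are hypothetical names invented for this example, and the return types are declared locally so the block stands alone. In a real project each function would live in its own page file, since a single page cannot export both.

```typescript
// Sketch only. In a real project the function types would come from
// `import type { GetStaticProps, GetServerSideProps } from "next"`.

interface Article {
  id: number;
  title: string;
}

interface PageProps {
  articles: Article[];
  renderedAt: "build" | "request";
}

// Hypothetical data loader. A real page would call the API server here,
// for example with fetch(). Because this code runs in Node.js and not in
// the browser, the browser's CORS checks do not apply to the request.
async function loadArticles(): Promise<Article[]> {
  return [
    { id: 1, title: "Hello" },
    { id: 2, title: "World" },
  ];
}

// Build-time variant (SSG): Next.js calls this during `next build`
// and bakes the returned props into static HTML.
export async function getStaticProps(): Promise<{ props: PageProps }> {
  const articles = await loadArticles();
  return { props: { articles, renderedAt: "build" } };
}

// Per-request variant (SSR): Next.js calls this on the Node.js server
// for every incoming request, so the HTML always contains fresh data.
export async function getServerSideProps(): Promise<{ props: PageProps }> {
  const articles = await loadArticles();
  return { props: { articles, renderedAt: "request" } };
}
```

In both cases the article data is already present in the HTML that crawlers receive, which is the SEO benefit discussed in point 1.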
Based on the above reasons, we decided to set up a new server. However, little did we know that the troubles were just beginning.
Struggles
Coordinating with SRE
To set up the server, we had to coordinate with the SRE team and explain the issues. SRE is cautious by nature and had many concerns, and they were also responsible for other projects, so communicating with them took a lot of time. However, our motivation was clear, and I had already worked out the entire architecture in advance, so the permission itself was granted quickly.
Our cloud is a private cloud, and in the alpha environment, each developer can create their own machines. Therefore, the alpha setup was quickly completed. However, the beta and production environments required relatively complex procedures, such as applying for ACL and having SRE set up the infrastructure. Setting up the machines was easy, but installing the various packages (monitoring tools, node.js, nginx, and so on) was troublesome. Fortunately, with the assistance of SRE, the installation went smoothly.
Nginx Configuration with a Rich History
Setting up the server was straightforward, but what troubled me was the long-standing nginx configuration file. It contained page redirects from various sources, maintenance-mode handling, and special handling for specific paths.
Deployment through Ansible Playbook
SRE helped with the environment installation, but deployment still had to be done, and we had to figure this part out ourselves. The company mostly uses ansible playbook + awx for deployment (AWX provides a web UI and API for running playbooks). Luckily, I had collaborated with colleagues on other projects, so I had some experience with playbooks. Otherwise, the large chunk of YAML would have been overwhelming.
Docker
For convenience, the recent deployment practice is to set up a machine, install Docker, and run the Docker image on it as a service managed through systemctl.
I thought packaging would be easy, but after implementation, I encountered various issues. This also involved the history of our project. We initially used the lerna+monorepo architecture, where each subfolder had references to root files, making it difficult to package only a specific subfolder.
Lerna has two main features:
- Package installations hoist dependencies used in other projects to the root node_modules directory.
- In subprojects, you can use import a from 'sub-project' to reference functions from other subprojects (achieved by adding a symbolic link in node_modules).
The resulting packaged image was over 1GB, making me realize the horror of node_modules. Additionally, I spent a long time debugging the symbolic link issue. If the symbolic link was not created, npm would try to download the sub-project package from the registry, where it most likely does not exist.
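One way to catch the missing-link case early is a small pre-build check. The following is only a sketch under assumptions: the isLinkedPackage helper is hypothetical, the layout (a node_modules folder directly under the project root) is assumed, and it relies only on the built-in fs and path modules of Node.js. It reports whether node_modules/<name> is a symbolic link, which is how subprojects are exposed in the setup described above.

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical helper: returns true when node_modules/<packageName> under
// projectRoot is a symbolic link, i.e. a locally linked subproject.
function isLinkedPackage(projectRoot: string, packageName: string): boolean {
  const target = path.join(projectRoot, "node_modules", packageName);
  try {
    // lstatSync inspects the link itself instead of following it.
    return fs.lstatSync(target).isSymbolicLink();
  } catch {
    // The path does not exist, so the link was never created.
    return false;
  }
}

// Example: warn before packaging if the link is missing.
// "sub-project" is the placeholder package name used in the text above.
if (!isLinkedPackage(process.cwd(), "sub-project")) {
  console.warn("sub-project is not linked; npm may try to fetch it from the registry");
}
```

Running a check like this at the start of the image build would surface the problem as a clear message instead of a confusing "package not found" error from npm.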
Jenkins Integration
The company uses Jenkins for integration, so the Docker image is also packaged and uploaded to the internal Docker hub through Jenkins. However, we discovered that Jenkins sometimes hit unknown errors during packaging; after rerunning the job several times, the build would succeed again.
Risk Assessment
Since this was a new server, company procedure required a risk assessment.
Security Check
Although there is always a security check for major projects, setting up a new server also requires the Security Team to inspect for security issues and make necessary fixes.
Underestimation of the Timeline
The introduction of next.js was not initiated by our team but by colleagues from other offices in another project. I initially thought the server setup was already completed, and that when developing the Landing Page, we just needed to follow suit. In reality, however, next.js had only been used for its SSG (Static Site Generation) feature, with the generated static files deployed to the original Landing Page server.
Although I was a bit surprised at the time, there was still over a month until QA testing, and the plan was only to apply the server functionality of next.js and deploy the generated static files, so the time should have been more than enough. However, some features had already started QA testing by then, and the development of the Landing Page itself had not yet begun. Although the page itself was completed in just one week, handling QA, communicating with other departments, designing the architecture, and the steps described above took a month before I knew it. Additionally, a new project was also under evaluation, and the launch date was approaching, leaving me busy and without anyone to help.
During this period, I experienced significant stress and even had to work overtime on holidays, barely able to cope.
Reflection
In unexpected situations like this, more time for preparation is often necessary. Actually, improving SEO for this purpose is not very appealing... and probably won't yield much benefit. Ah, as I grow older, I should also consider my work style.
1. Becoming the Bottleneck
This server setup involved too many areas of knowledge unrelated to frontend work, and required familiarity with the company's internal tools. Many things had to be learned step by step by reading documentation, which made it difficult for team members to assist, since little of it was relevant to frontend work. I still wonder whether there is a better approach, but this is how things are: if I give up and stop pushing, the proposal simply stalls. Still, this is not a good situation.
2. Underestimating the Complexity of Server Setup
Rather than saying that setting up a server is difficult, it is more accurate to say that the cross-department communication and the preparation work involved inevitably lengthened the entire development cycle. Honestly, you wouldn't know until you go through it. The best thing to do now is to write this experience up as documentation, making the next attempt easier.
3. Mainly Japanese-speaking Departments
Apart from our team, most departments are primarily composed of Japanese-speaking members. Insufficient Japanese language ability easily leads to communication breakdowns, so I sometimes end up as the bridge for communication and have to be involved in many things. This is not a good situation either. I'm still thinking about how to improve it.
4. Insufficient Preparations
That I did not realize at the time that no server existed yet was due to insufficient research on my part, and this is something I need to reflect on. In the future, if there is a similar server setup requirement, it would be better to allocate more time to guard against unforeseen circumstances.
5. Multiple Timelines Colliding
In addition to the server setup, there were many timelines colliding at that time, such as fixing existing QA issues, developing the Landing Page, code review, communicating with SRE, discussing solutions for changing requirements, etc. Constant context switching compressed the time available for focusing on server setup.