Introduction
I have been writing blog posts since around 2015. I started with platforms like Pixnet and Logdown, then customized the style using Hexo. Later, I switched to Medium, and in 2019, I joined the trend of using Gatsby. I have been using Gatsby for over three years now.
I want to emphasize that if you want to start writing blog articles, it's important to find a convenient platform to use. The most important thing is to focus on writing articles. I have seen too many blog articles by software engineers that start with "Building a blog using xxx," and then they never update it again. Or they spend a lot of time setting up the website, but the articles only contain "hello world" or "test." I am glad that I have maintained the habit of writing and sharing articles. Although the traffic is not high, I have accumulated a certain number of readers.
Why I stopped using Gatsby
Let's get to the point. I found a few troublesome aspects of using Gatsby:
- Static Generation: Every time I finish writing an article, I have to rebuild the entire site. (Even though CI is running, it is still inconvenient.)
- Categories: Currently, categorization is done by matching metadata, but over time it gets messy: it is easy to forget the correct category or to be too lazy to look it up.
- Quantity: After accumulating articles for several years, I now have over 140 posts. It's time to find a different way to manage and modify them.
- Fun: Next.js has evolved a lot over the years and supports many features. It can be easily deployed to Vercel, which is convenient for someone like me who doesn't want to manage servers.
Although Gatsby offers rich functionality, the final output is still static files, and all the articles need to be managed locally. Although Gatsby provides content sources other than the filesystem, configuring them is a bit cumbersome, and they need to adhere to Gatsby's specific format to be usable. If customization is required, one needs to keep searching for packages or write their own. These accumulated costs become a significant overhead.
For these reasons, I eventually chose to customize with Next.js.
Requirements
I have organized the following requirements for my blog:
- Generate HTML from Markdown: This is a must for me since all my articles are in Markdown format, and I use mathematical syntax, footnotes, etc.
- Code syntax highlighting
- Database: Convenient storage and management of articles, with the ability to create tables and indexes.
- RSS support: I highly value RSS support. The blog itself can be simple, but RSS is a must. Many readers receive new article notifications through RSS.
- Image and video uploads: Currently, the images for articles are hosted on a CDN, and I hope to have an easy-to-use editor with upload functionality.
- Multi-language support (i18n): Some articles may require translation for communication with foreign communities.
- Customizable page layout
Technology Choices
- Frontend and Backend: Next.js (implementation details are discussed later)
- Multi-language support: Implemented using Next.js's i18n functionality (details are covered in later sections)
- Markdown and code syntax highlighting: Implemented using remark and shiki
- Database: PostgreSQL, built using Google Cloud SQL (databases are more expensive than expected 😱)
- RSS Feed: Generated periodically using Cloud Functions and Cloud Scheduler
- Static file storage: S3, with CDN provided by CloudFront
- Deployment: Integrated with Vercel and GitHub, automatically deployed after pushing new commits
At this point, some may wonder why static files are stored on Amazon while the rest of the cloud services are on Google Cloud.
I have used RDS and Lambda before and found that Google's interface and configuration are more user-friendly, so I switched to Google Cloud. However, since I have been using S3 to store static files since the beginning of my blog, I just stuck with it. If I have free time in the future, I might consider migrating the static files to another service.
Implementation
Moving articles to the database (Cloud SQL)
Since my past articles were written in individual Markdown files, the first step is to store the contents of these files in a database. I designed two tables: "posts" and "categories." To support i18n, I used JSON data types for fields like title and summary. Although searching might be a bit more complicated, it becomes more convenient when adding support for other languages. The next step is to write SQL statements to insert the articles into the corresponding fields. It's worth mentioning that all table definitions and database operations are stored in separate .sql files for easy rollback and re-creation.
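As a rough illustration of the JSON-per-locale idea, here is a minimal TypeScript sketch. The `PostRow` shape, its column names, the sample values, and the `pickLocalized` helper are all assumptions made for illustration, not the actual schema.

```typescript
// One JSON object per translatable field, keyed by locale.
type LocalizedText = { [locale: string]: string };

// Hypothetical shape of a row in the "posts" table; the real columns may differ.
type PostRow = {
  id: number;
  slug: string;
  categoryId: number;
  title: LocalizedText; // stored in a JSON column
  summary: LocalizedText; // stored in a JSON column
};

// Return the text for the requested locale, fall back to a default locale,
// and finally to an empty string when neither translation exists.
function pickLocalized(
  field: LocalizedText,
  locale: string,
  fallbackLocale: string = "zh"
): string {
  return field[locale] ?? field[fallbackLocale] ?? "";
}

// Sample row with made-up content.
const row: PostRow = {
  id: 1,
  slug: "moving-to-nextjs",
  categoryId: 3,
  title: { zh: "搬家到 Next.js", en: "Moving to Next.js" },
  summary: { zh: "為什麼我離開 Gatsby" },
};
```

Adding another language then only means adding a key to each JSON value; no new columns are needed, at the cost of slightly more awkward queries.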
For many experienced engineers, this is common knowledge, but I have found that many people rely heavily on the functionality provided by frameworks or use command-line interfaces. When it comes to making modifications, they often feel lost.
Next.js
Let me explain why I chose Next.js. Although major frontend frameworks have corresponding SSR frameworks, in my experience, Next.js offers the most feature-rich and well-integrated solution. It requires minimal configuration to start development. Here are some details to share:
SSR (Server Side Rendering) and ISR (Incremental Static Regeneration)
Next.js provides three rendering modes: static generation (SSG), incremental static regeneration (ISR), and server-side rendering (SSR). Static generation creates pure static pages at build time. Incremental static regeneration lets you pre-generate selected paths at build time, render paths that were not pre-generated on their first request, and regenerate pages in the background after a configurable interval. Server-side rendering runs getServerSideProps on every request and returns the rendered HTML, which is then hydrated on the client, where JavaScript logic such as attaching event listeners runs.
For my blog requirements, I decided to use ISR for the homepage and the second and third pages, as they are frequently accessed and may require updates when I publish new articles. The content of the articles also uses ISR, with the first 50 articles being pre-generated during build time, and older articles being generated on request. Pages like /contact are implemented as pure static pages.
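The article-page setup described above could look roughly like the following. This is a minimal sketch under stated assumptions: the `Post` type, the in-memory `allPosts` array (standing in for the database), the `buildStaticPaths` helper, and the 60-second revalidation interval are all hypothetical. Only `getStaticPaths`, `getStaticProps`, `fallback`, `revalidate`, and `notFound` are actual Next.js names.

```typescript
type Post = { slug: string; title: string; content: string };

// Hypothetical in-memory stand-in for the database, newest post first.
const allPosts: Post[] = Array.from({ length: 120 }, (_, i) => ({
  slug: `post-${120 - i}`,
  title: `Post ${120 - i}`,
  content: "...",
}));

// Pre-render only the first `limit` posts at build time.
function buildStaticPaths(posts: Post[], limit: number) {
  return {
    paths: posts.slice(0, limit).map((p) => ({ params: { slug: p.slug } })),
    // "blocking": paths not listed here are rendered on first request, then cached.
    fallback: "blocking" as const,
  };
}

export async function getStaticPaths() {
  return buildStaticPaths(allPosts, 50);
}

export async function getStaticProps({ params }: { params: { slug: string } }) {
  const post = allPosts.find((p) => p.slug === params.slug);
  if (!post) return { notFound: true as const };
  // Regenerate in the background at most once every 60 seconds (assumed value).
  return { props: { post }, revalidate: 60 };
}
```

A purely static page such as /contact simply omits `revalidate` (or omits `getStaticProps` entirely).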
In practice, static generation proves to be powerful. However, server-side rendering can be slower due to the server's need to connect to the database.
i18n
Next.js has built-in i18n support at the routing level. You can set up different languages to redirect to different domains or paths. For example:
- /zh/posts: sets the locale to zh
- /en/posts: sets the locale to en
Next.js also supports automatic locale detection based on the Accept-Language request header. Both the server side and the client side can easily access the current locale:
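    // Server-side
    export const getServerSideProps = async ({ locale }) => {
      // `locale` holds the current locale, e.g. "zh" or "en"
      return { props: { locale } };
    };

    // Client-side
    import { useRouter } from 'next/router';

    const Component = () => {
      // The router exposes the same locale on the client
      const { locale } = useRouter();
      return <p>Current locale: {locale}</p>;
    };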
While many developers use libraries like react-intl or react-i18next to simplify i18n operations, I found them to be overkill for my needs. I don't need third-party services or handling for failure scenarios, and the number of keys is too small to justify loading an extra library. Therefore, I implemented a simple solution myself.
    import React, { createContext, useCallback, useContext } from "react";
    import { en } from "./en";
    import { ja } from "./ja";
    import { zh } from "./zh";

    // Shape of the i18n payload handed to the provider.
    type I18nData = {
      _i18n: {
        locales: string[];
        data: {
          [key: string]: {
            [key: string]: string;
          };
        };
      };
    };

    // Bundle the messages of every locale into one object.
    export default function createI18n() {
      const datas = {
        en,
        ja,
        zh
      };
      return {
        _i18n: {
          locales: ["en", "zh", "ja"],
          data: datas
        }
      };
    }

    const I18nContext = createContext<{
      locale: string;
      _i18n: I18nData["_i18n"];
    }>(null);

    export const I18nProvider: React.FC<{
      _i18n: I18nData["_i18n"];
      locale: string;
      children: React.ReactElement;
    }> = ({ _i18n, locale, children }) => {
      return (
        <I18nContext.Provider value={{ _i18n, locale }}>
          {children}
        </I18nContext.Provider>
      );
    };

    // Returns `t`, which looks up a key in the current locale and
    // falls back to the key itself when no translation exists.
    export const useTrans = () => {
      const { _i18n, locale } = useContext(I18nContext);
      const t = useCallback(
        (key: string) => {
          const data = _i18n.data[locale];
          if (data) {
            return data[key] || key;
          }
          return key;
        },
        [_i18n.data, locale]
      );
      return { t };
    };
Although it's basic, it gets the job done. Since this blog is likely to be developed solely by me, I can fix any issues that arise. Although storing directly in JS may slightly increase the bundle size, considering that the static text files are not large and the homepage and frequently accessed pages are implemented using ISR, it's not a big problem. This implementation is sufficient for now.
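The lookup inside `useTrans` can also be written as a plain function, which is easier to test outside React. The sketch below adds a fallback locale; that fallback, the `translate` name, and the sample message tables are assumptions for illustration and are not part of the implementation above.

```typescript
type Messages = { [locale: string]: { [key: string]: string } };

// Look up `key` for `locale`; fall back to `fallbackLocale`, then to the key itself.
function translate(
  messages: Messages,
  locale: string,
  key: string,
  fallbackLocale: string = "en"
): string {
  const primary = messages[locale];
  if (primary && primary[key]) return primary[key];
  const fallback = messages[fallbackLocale];
  if (fallback && fallback[key]) return fallback[key];
  return key;
}

// Hypothetical message tables.
const messages: Messages = {
  en: { "nav.home": "Home", "nav.about": "About" },
  zh: { "nav.home": "首頁" },
};
```

With these tables, `translate(messages, "zh", "nav.about")` returns the English string because the zh table lacks that key, and an unknown key is returned unchanged so missing translations stay visible on the page.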
Image Processing
Next.js provides next/image specifically for handling image loading and optimization. Without it, the same image is typically served to every device, mobile or desktop, which leads to a poor user experience: large screens may need higher-resolution images, but sending high-resolution images to small devices has no visible benefit and wastes bandwidth.
Another issue is that if images are not defined with specific dimensions, they won't occupy any height until loaded. However, once loaded, they suddenly take up height, causing layout shifts. This can be annoying, especially with multiple images and network delays.
Therefore, I highly recommend using next/image in Next.js to handle images. It is essential to define the width and height or the aspect ratio explicitly. Additionally, when using next/image, the images are processed by the server before being returned. For example, if the image URL is https://cdn.example.com/images/avatar.jpeg, the actual request will be /_next/image?url=${URL}&w=128&q=75. This means that all requests through next/image will be processed by the server, rather than directly requesting the original URL.
The benefit of this approach is that it can return appropriate formats and sizes based on device requirements. However, it also means an increased server load. Since I use Vercel for deployment, Vercel provides default image optimization quotas, so I don't need to worry too much. However, if self-hosting, special attention should be paid to this aspect.
If I don't want to handle image optimization on my own server, I can configure a loader to define how Next.js handles images (e.g., sending them to a CDN).
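A custom loader is just a function that receives `src`, `width`, and an optional `quality`, and returns the URL the browser should request. The sketch below is a minimal example under one big assumption: that the CDN in front of the images can resize on the fly via `w` and `q` query parameters. Plain S3 plus CloudFront does not do this by itself, so `CDN_BASE`, the parameter names, and `cdnLoader` are hypothetical.

```typescript
// Shape of the argument next/image passes to a custom loader.
type LoaderArgs = { src: string; width: number; quality?: number };

// Hypothetical CDN base URL.
const CDN_BASE = "https://cdn.example.com";

// Build the image URL on the CDN instead of going through /_next/image.
function cdnLoader({ src, width, quality }: LoaderArgs): string {
  const path = src.startsWith("/") ? src : `/${src}`;
  // Default quality of 75 mirrors the value seen in the /_next/image URL above.
  return `${CDN_BASE}${path}?w=${width}&q=${quality ?? 75}`;
}

// Usage in a component (JSX):
// <Image loader={cdnLoader} src="/images/avatar.jpeg" width={128} height={128} alt="avatar" />
```

With a loader in place, the image requests bypass the Next.js server entirely, so the optimization quota and server load mentioned above no longer apply to those images.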
Vercel Edge Functions
Next.js 13 introduced an experimental feature called Edge Functions, which can be used in conjunction with Vercel's Edge Functions without additional configuration.
Unlike regular Serverless Functions, Vercel's Edge Functions run in a lightweight runtime built on the V8 engine rather than in a full Node.js environment. Thanks to this small runtime, they have faster cold starts and higher scalability than regular Serverless Functions.
This feature is incredibly cool. Following the example in the official documentation, I deployed an API that dynamically generates Open Graph (OG) images from the article title; the style can be defined using HTML syntax. If an article does not have an OG image, it falls back to this API. There are some limitations, however. Because the function runs in a restricted runtime rather than a full Node.js environment, it cannot use the usual Node.js database drivers directly. HTTP requests, on the other hand, are available through the standard fetch API.
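The fallback logic itself is tiny. Here is a minimal sketch, assuming the OG image API lives at `/api/og` and takes the title as a `title` query parameter. The route, the parameter name, the `PostMeta` type, and `resolveOgImage` are all hypothetical.

```typescript
type PostMeta = { title: string; ogImage?: string | null };

// Hypothetical route of the Edge Function that renders an OG image from a title.
const OG_API_PATH = "/api/og";

// Prefer the article's own OG image; otherwise point at the generator API.
function resolveOgImage(post: PostMeta, siteUrl: string): string {
  if (post.ogImage) return post.ogImage;
  return `${siteUrl}${OG_API_PATH}?title=${encodeURIComponent(post.title)}`;
}
```

The returned URL would go into the page's `og:image` meta tag, so crawlers only hit the Edge Function for articles that lack their own image.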
Serverless Functions
By default, when deploying Next.js to Vercel, if a page is non-static and uses SSR or ISR, Vercel automatically generates a Serverless Function.
This means that after deploying a Next.js application to Vercel, it is not running on a single machine but split into multiple Serverless Functions. There might be a delay due to cold starts.
Therefore, some considerations need to be taken into account during implementation:
- Avoid using global variables for in-memory caching unless you can accept inconsistency: each page may run in a separate function instance, so globals are not shared between them and can be lost on a cold start.
- Understand Vercel's limits on response and request, as outlined in their documentation.
- Since my application requires a database connection, I adjusted the idle timeout so that idle connections are released promptly when many Serverless Functions run concurrently. I also raised the maximum number of connections to keep things stable.
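For the database point, the relevant knobs might be set as in the sketch below. It assumes the node-postgres (`pg`) driver, whose `Pool` accepts `max`, `idleTimeoutMillis`, and `connectionTimeoutMillis`. The environment variable names and default values are hypothetical, and the source does not say which driver or values are actually used.

```typescript
// Subset of the options accepted by node-postgres' `Pool` constructor.
type PoolOptions = {
  connectionString: string;
  max: number; // maximum clients held by this pool
  idleTimeoutMillis: number; // close a client after it has been idle this long
  connectionTimeoutMillis: number; // give up waiting for a connection after this long
};

// Read pool settings from the environment, with conservative defaults:
// one connection per function instance and a short idle timeout.
function poolOptionsFromEnv(env: { [key: string]: string | undefined }): PoolOptions {
  return {
    connectionString: env.DATABASE_URL ?? "",
    max: Number(env.PG_POOL_MAX ?? 1),
    idleTimeoutMillis: Number(env.PG_IDLE_TIMEOUT_MS ?? 5000),
    connectionTimeoutMillis: Number(env.PG_CONNECT_TIMEOUT_MS ?? 10000),
  };
}

// In the application (requires the `pg` package):
// const pool = new Pool(poolOptionsFromEnv(process.env));
```

Keeping the per-instance pool small matters because every concurrently running function instance opens its own pool against the same database.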
Cloud Functions and Cloud Scheduler
The RSS feed is generated periodically, and I implemented it using Cloud Functions and Cloud Scheduler. It took me quite a while to understand the relationship between Cloud Functions and Cloud Run. Cloud Functions can only be written in languages supported by Google, while Cloud Run runs on Docker and can use any image. Cloud Functions now have a second generation and seem to be running on Cloud Run behind the scenes.
Cloud Functions and Cloud Scheduler are suitable for tasks that take some time to run but are relatively simple to implement. Cloud Scheduler costs me nothing, since a single job falls within its free tier. Additionally, Cloud Functions do not incur additional costs for traffic if they are not externally accessible. This makes them convenient for personal projects.
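The core of such a scheduled job is turning a list of posts into an RSS 2.0 document. Below is a minimal sketch of that step only. The `FeedItem` shape and the `buildRss` and `escapeXml` names are assumptions, and a real job would additionally query the database and upload the result to wherever the feed is served from.

```typescript
type FeedItem = { title: string; url: string; summary: string; publishedAt: Date };

// Escape the five characters that are special in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Render a minimal RSS 2.0 feed; items are emitted in the order given.
function buildRss(siteTitle: string, siteUrl: string, items: FeedItem[]): string {
  const entries = items
    .map((item) =>
      [
        "    <item>",
        `      <title>${escapeXml(item.title)}</title>`,
        `      <link>${escapeXml(item.url)}</link>`,
        `      <guid>${escapeXml(item.url)}</guid>`,
        `      <description>${escapeXml(item.summary)}</description>`,
        `      <pubDate>${item.publishedAt.toUTCString()}</pubDate>`,
        "    </item>",
      ].join("\n")
    )
    .join("\n");

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    "  <channel>",
    `    <title>${escapeXml(siteTitle)}</title>`,
    `    <link>${escapeXml(siteUrl)}</link>`,
    `    <description>${escapeXml(siteTitle)}</description>`,
    entries,
    "  </channel>",
    "</rss>",
  ].join("\n");
}
```

Cloud Scheduler would then trigger the function on a fixed schedule, so the feed is refreshed without touching the Next.js deployment.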
AWS S3 and CDN Configuration
I have been using AWS S3, along with CloudFront, to store and distribute static files. Originally, I used the default domain name provided by S3. Later, I discovered that I could use a custom domain and obtain an SSL certificate through AWS Certificate Manager. By setting up the CNAME correctly, Amazon automatically handles the rest.
If static resources are accessed through a CDN, it is recommended to disable public access to S3 to ensure that all traffic goes through CloudFront. This enhances security, and if accidentally subjected to a DDoS attack, CloudFront provides an additional layer of protection, allowing you to block IP addresses using Web ACL.
The detailed configuration process is as follows:
- Create a distribution in the CloudFront interface.
- If the origin is an S3 bucket, select "Origin Access Identity" so that the bucket policy is added automatically. In the S3 console, I also enabled Block Public Access so that any public ACLs are ignored.
- Request an SSL certificate from AWS Certificate Manager. If using DNS validation, after creating the certificate, AWS provides a CNAME name and value. These need to be added to the DNS record to prove ownership of the domain.
- Add a custom domain in the CloudFront settings, and that's it! The public certificate from AWS Certificate Manager is free. Now, static resources can be accessed through https://cdn.example.com, which looks much better.
Slate Editor
(Under development)
I wanted to create a Markdown editor that meets my specific needs. I needed a highly customizable package, so I chose Slate. Slate provides great flexibility, but almost all features need to be implemented manually, and it requires understanding the underlying node structure. Therefore, building it to meet my requirements takes more time.
Currently, I have implemented some features I have always wanted:
- Uploading images through drag and drop or copy and paste, which automatically uploads them to a CDN and returns the image URL.
- Calling a translation API directly when a specific shortcut key is pressed to translate text into other languages. This way, simple paragraphs can be translated automatically, while complex paragraphs require manual translation.
I will share more details once I have a complete result.
Features and Optimization
At this point, I have completed a usable blog system. However, there are still some areas I want to improve and optimize:
- Add tests, including E2E tests. Checking each article one by one is cumbersome given the large number of articles.
- Reduce DB queries by caching article content and serving it from the CDN.
- Reduce JavaScript bundle size.
- Implement full-text search.
- Directly insert uploaded files into the database.
- Export and import articles.
- To be continued.