If you have any questions or feedback, please fill out this form
This post is translated by ChatGPT and originally written in Mandarin, so there may be some inaccuracies or mistakes.
Introduction
I started blogging around 2015, moving from platforms like Pixnet and Logdown to a customized Hexo setup, and later trying out Medium. In 2019, I jumped on the Gatsby bandwagon when it was quite popular, and I've now been on Gatsby for over three years.
I want to reiterate that if you're looking to start writing blog posts, just find a user-friendly platform that works for you. The most important thing is to write. I've seen too many software engineers write blog articles about "self-hosting a blog with xxx" and then never update them again, or they spend a lot of time setting up a site only to have posts like "hello world" or "test." I'm grateful that I persisted in the habit of sharing articles, even though my traffic isn't huge; I've still accumulated a decent number of readers.
Why I'm No Longer Using Gatsby
Back to the point, there are a few reasons I found it troublesome:
- Static Generation: Every time I finish writing an article, I have to rebuild the entire site. (CI is running, but it's still a hassle.)
- Categorization: Currently, categories simply match metadata, and over time, it's easy to forget and become too lazy to search, leading to a messy categorization.
- Volume: After several years, I've accumulated over 140 articles, so it's time to figure out a better way to save and modify them.
- Fun: Next.js has evolved significantly over the years, supporting many features. Plus, it can be deployed to Vercel seamlessly, which is great for someone like me who is not keen on setting up servers.
While Gatsby offers a rich set of features, the end result is still static files, and all articles need to be managed locally. Although Gatsby provides support for content sources beyond the filesystem, setting it up is still a bit cumbersome and requires adhering to Gatsby's specified formats. For customized features, one constantly needs to search for packages or write one from scratch. These costs accumulate, leading to considerable overhead.
For these reasons, I ultimately chose the more customizable Next.js.
Requirements
I summarized my blog requirements as follows:
- Generate HTML from Markdown: This is essential for me since all my articles are in Markdown format, and I also use mathematical syntax, footnotes, etc.
- Code syntax highlighting
- Database: Convenient for storing and managing articles, ideally with the ability to create tables and indexes.
- Support for RSS: I'm very particular about RSS support; the blog can be simple, but RSS is a must. Many readers rely on RSS to receive notifications for new articles.
- Upload images and videos: Currently, all article images are hosted on a CDN, so I hope for an editor that supports upload functionality.
- Multilingual support (i18n): Some articles may need translations for interaction with foreign communities.
- Ability to customize pages
Technology Choices
- Frontend and Backend: Next.js; implementation details will be discussed later.
- Multilingual Support: implemented via Next.js's i18n feature; data fetching details will be covered in later sections.
- Markdown and Code Syntax Highlighting: achieved using remark and shiki.
- Database: PostgreSQL, built using Google Cloud SQL. (Databases are more expensive than I expected 😱)
- RSS Feed: Regularly generated via Cloud Functions and Cloud Scheduler.
- Static File Storage: S3, with CloudFront used for CDN.
- Deployment: Integrated with Vercel and GitHub; new commits are deployed directly.
You might wonder why static storage is on Amazon while the rest of the cloud services are on Google Cloud.
From my experience with RDS and Lambda, I found Google's interface and settings to be more user-friendly, so I switched to Google Cloud. However, since static files have always been stored on S3 since I started the blog, I'm sticking with it for now. If I have some spare time later on, I might consider migrating the static files.
Implementation
Migrating Articles to the Database (Cloud SQL)
Since past articles were written as individual Markdown files, the first step was to move all file contents into the database. My design includes two tables: `posts` and `categories`. To support i18n, I used JSON types for fields like title and summary. Although searching can be a bit tricky this way, it makes adding other languages much easier. Next, I wrote SQL to insert articles into the corresponding fields. Notably, all database schema operations are stored in a separate `.sql` file for easy rollback in case of issues, making it convenient to rebuild.
This might be common sense for many experienced engineers, but I've noticed that quite a few rely on the functionalities provided by frameworks or just run CLI commands, leaving them at a loss when modifications are needed.
Next.js
Let's talk about why I chose Next.js. Although many front-end frameworks have corresponding SSR frameworks, in my experience, the one with the richest features and the best integration is Next.js. You can start development with very little configuration. Here are some development details:
SSR (Server Side Rendering) and ISR (Incremental Static Regeneration)
Next.js offers three types of page rendering:
- Static: generates purely static pages at build time, using `getStaticProps` to inject props and render HTML.
- ISR: generates static pages for defined paths at build time; if a requested path wasn't pre-generated, it is rendered server-side on the first request and then cached.
- SSR: renders the page on each request, executing `getServerSideProps` and returning the rendered HTML before the frontend JavaScript logic (such as adding event listeners) runs.
For my blog's needs, I expected the homepage and subsequent pages to be accessed frequently and updated whenever I post, so I opted for ISR. Article content also uses ISR: the 50 most recent articles are statically generated at build time, while older articles are generated upon request. Pages like `/contact` are implemented purely statically.
In practice, static generation has proven unbeatable, since server-side rendering requires a database connection on every request, making it inherently slower.
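A minimal sketch of that split for the article pages, assuming a hypothetical list of slugs ordered newest-first (the real version would query the database):

```typescript
// Sketch of the getStaticPaths logic described above: prebuild only the
// newest `limit` posts; older posts fall back to on-demand generation.
// `allSlugs` is a hypothetical input, ordered newest-first.
function pathsToPrebuild(allSlugs: string[], limit = 50) {
  return {
    paths: allSlugs.slice(0, limit).map((slug) => ({ params: { slug } })),
    // "blocking": a missing path is rendered server-side on the first
    // request, then cached like a statically generated page.
    fallback: "blocking" as const,
  };
}
```

The returned object is the shape `getStaticPaths` expects, so in a real page it would be wrapped as `export async function getStaticPaths() { ... }`.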
i18n
Next.js has built-in i18n functionality (at the routing level), allowing different languages to redirect to different domains or paths. For example:
- `/zh/posts`: sets the locale to `zh`
- `/en/posts`: sets the locale to `en`

It also supports automatic language detection (via headers or `navigator.languages`). You can easily retrieve the current locale value on both the server and client sides:

```js
// Server side
export const getServerSideProps = ({ locale }) => {
  // current locale
}

// Client side
import { useRouter } from 'next/router'

const Component = () => {
  const { locale } = useRouter()
}
```
When implementing i18n, libraries like `react-intl` or `react-i18next` are typically used to simplify things. However, I found their documentation quite overwhelming, and since my requirements don't involve third-party services or failure scenarios, there weren't enough keys to justify the extra payload.
Consequently, I wrote a simple implementation myself:
```tsx
import React, { createContext, useCallback, useContext } from "react";
import { en } from "./en";
import { ja } from "./ja";
import { zh } from "./zh";

type I18nData = {
  _i18n: {
    locales: string[];
    data: {
      [key: string]: {
        [key: string]: string;
      };
    };
  };
};

export default function createI18n() {
  const datas = {
    en,
    ja,
    zh
  };
  return {
    _i18n: {
      locales: ["en", "zh", "ja"],
      data: datas
    }
  };
}

const I18nContext = createContext<{
  locale: string;
  _i18n: I18nData["_i18n"];
} | null>(null);

export const I18nProvider: React.FC<{
  _i18n: I18nData["_i18n"];
  locale: string;
  children: React.ReactElement;
}> = ({ _i18n, locale, children }) => {
  return (
    <I18nContext.Provider value={{ _i18n, locale }}>
      {children}
    </I18nContext.Provider>
  );
};

export const useTrans = () => {
  const { _i18n, locale } = useContext(I18nContext)!;
  const t = useCallback(
    (key: string) => {
      // Fall back to the default locale when the current locale
      // has no translation table.
      const data = _i18n.data[locale] || _i18n.data["en"];
      if (data) {
        return data[key] || key;
      }
      return key;
    },
    [_i18n.data, locale]
  );
  return { t };
};
```
While it's basic, it gets the job done. Given that this blog will likely only be developed by me, I can address issues as they arise. Although storing directly in JS may slightly increase the bundle size, considering that the current static text files are small and the homepage and frequently accessed pages use ISR, it won't be a significant problem to migrate to CDN later if needed. For now, this setup works well.
Image Handling
Next.js provides `next/image`, which is specifically designed to handle image loading and optimization. Without special handling, the same image is served to all devices (mobile, desktop). This leads to a poor experience: larger devices may want higher-resolution images, while smaller devices waste bandwidth on high-res images whose benefits they can't show.
Additionally, if images aren't defined with sizes, they won't occupy height until loaded, causing sudden layout shifts after loading. If multiple images are involved, combined with network delays, this can be quite annoying.
Therefore, it's highly recommended to use `next/image` for images in Next.js; width and height must be explicitly defined, or an aspect ratio set. Also, when using `next/image`, images are processed through the server before being returned.

For example, if there's an image on the CDN at `https://cdn.kalan.dev/images/avatar.jpeg`, the actual request becomes `/_next/image?url=${URL}&w=128&q=75`. That is, requests through `next/image` are first processed by the server rather than going directly to the original URL.
The advantage of this approach is that it can return suitable formats and sizes based on device needs. However, this also increases the server's workload. Since I'm deploying on Vercel, they provide default limits for image optimization, so I don't need to worry excessively. However, if you're self-hosting, you should pay special attention to this.
Given this, could someone send other people's images to this API for optimization? No: in Next.js you must configure the allowed image domains, and requests for non-allowlisted domains are rejected.
If you don't want image optimization to run on your own server, you can set a loader to define how Next.js handles images (for example, sending them to a CDN).
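A loader is just a function mapping `(src, width, quality)` to a URL. Here's a minimal sketch; the `w`/`q` query parameters are an assumption about what a CDN's resize API might accept, not a real service contract:

```typescript
// A custom next/image loader: Next.js calls this instead of requesting
// /_next/image, so optimization work moves off your server.
// The `w`/`q` query parameters are hypothetical CDN resize parameters.
function cdnLoader({
  src,
  width,
  quality,
}: {
  src: string;
  width: number;
  quality?: number;
}): string {
  const url = new URL(src, "https://cdn.kalan.dev");
  url.searchParams.set("w", String(width));
  url.searchParams.set("q", String(quality ?? 75));
  return url.toString();
}
```

This would be passed as the `loader` prop of `next/image`, or configured globally via `images.loader` in `next.config.js`.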
Remark and Shiki
To convert Markdown to HTML, I'm using the currently popular remark. First, `unified` transforms Markdown into a syntax tree, then various plugins convert it to HTML. For code syntax highlighting, I chose shiki, mainly because I like its theme and grammar definitions.

During development, I ran into an issue: shiki loads its theme and grammar configuration files at runtime. When deployed to Vercel, however, the code runs as a serverless function, and `fs.readFile` doesn't correctly resolve files inside `node_modules`. The solution was to copy the theme and grammar configuration files into the project:
```typescript
import path from "path";
import * as shiki from "shiki";
// Assumed location of the theme JSON copied into the project:
import theme from "./nord.json";

const languagesPath = path.join(
  process.cwd(),
  "src",
  "utils",
  "shiki",
  "languages"
);

// Point every bundled grammar at the copy inside the project instead of
// node_modules, which isn't readable on Vercel's serverless runtime.
const langs = shiki.BUNDLED_LANGUAGES.map((lang) => ({
  ...lang,
  path: path.join(languagesPath, lang.path || "")
}));

export default async function convertMdToHTML() {
  const nordTheme = shiki.toShikiTheme(theme as any);
  const highlighter = await shiki.getHighlighter({ theme: nordTheme, langs });
  // ...
}
```
Vercel Edge Functions
Next.js 13 introduced experimental Edge Functions that can be used alongside Vercel's Edge Functions without additional setup.
Unlike typical Serverless Functions, Vercel's Edge Functions run in a lightweight V8 runtime rather than a full Node.js environment. Because the runtime is so small, startup is significantly faster, and the functions are deployed to nodes around the world, with each request routed to the node closest to the user, reducing API response time.
By taking advantage of this small runtime, Edge Functions can have faster cold boots and higher scalability than Serverless Functions.
This feature is incredibly cool. Following an official example, I deployed an API that dynamically generates OG Images based on titles, with styles defined via HTML syntax. If an article lacks an OG Image, it falls back to this API. However, there are some limitations: since it runs in an isolated V8 environment, direct database operations aren't possible, and it's unclear to me whether HTTP requests can be made.
Serverless Functions
By default, when deploying Next.js to Vercel, for non-static pages utilizing SSR or ISR, Vercel automatically generates a Serverless Function.
This means that a Next.js application deployed to Vercel does not run on a single machine, but is split into several Serverless Functions, which may have cold start delays.
Therefore, some implementation details should be noted:
- Unless you can accept inconsistency, avoid using global variables for in-memory caching, as they are not shared across function instances.
- Understand Vercel's limits regarding response and request handling.
- In my application, a database connection is established, so to avoid a surge in connection counts due to multiple Serverless Functions executing simultaneously, I lowered the idle timeout to ensure database connections aren't held for too long, while also slightly increasing the connection count to avoid errors.
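With node-postgres, for example, the knobs mentioned above live on the pool configuration. The numbers below are illustrative, not what I actually run in production:

```typescript
// Hypothetical pool settings for a serverless environment. This object
// would be passed to `new Pool(poolConfig)` from the "pg" package.
const poolConfig = {
  max: 3,                          // per-instance cap: many instances may run at once
  idleTimeoutMillis: 10_000,       // release idle connections quickly
  connectionTimeoutMillis: 5_000,  // fail fast instead of piling up waiters
};
```

Keeping the idle timeout low matters most here: each Serverless Function instance holds its own pool, so idle connections multiply across instances.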
Cloud Functions and Cloud Scheduler
RSS Feeds will be generated on a schedule, implemented using Cloud Functions and Cloud Scheduler. It took quite a while to clarify the relationship between Cloud Functions and Cloud Run. Cloud Functions can only be written in languages supported by Google, while Cloud Run runs on Docker, allowing any image to be used. Cloud Functions has a second generation that seems to be utilizing Cloud Run as well.
For tasks like running RSS Feed scripts that require some time but are relatively simple, Cloud Functions paired with Scheduler are very suitable. The only thing to note is to adjust the Cloud Function's memory size if it requires more memory.
Users can trigger Cloud Functions through various events, such as Pub/Sub or direct HTTP calls. Here, I implemented it using Pub/Sub, triggering the function when it receives the `generate-rss` event.
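A minimal sketch of such a handler. Pub/Sub delivers the payload as a base64-encoded `data` field; the feed generation itself is stubbed out and injected, and the `generate-rss` payload check mirrors the event described above:

```typescript
// Sketch of a Pub/Sub-triggered handler: decode the base64 `data` field
// and regenerate the feed only when the expected message arrives.
function handleMessage(
  message: { data?: string },
  generateFeed: () => void // injected so the sketch stays testable
): boolean {
  const payload = message.data
    ? Buffer.from(message.data, "base64").toString("utf8")
    : "";
  if (payload !== "generate-rss") return false; // ignore unrelated messages
  generateFeed();
  return true;
}
```

In a real Cloud Function, this logic would live inside the entry point registered for the Pub/Sub trigger.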
Since Cloud Scheduler has only one job, it stays free, and the Cloud Function isn't exposed externally, so traffic remains within the free tier, making it very convenient for personal projects.
AWS S3 and CDN Configuration
I've been using AWS S3 since the blog's inception, along with CloudFront. Initially, I used the default domain provided, but later I discovered that I could actually attach a custom domain, and SSL certificates can be obtained through AWS Certificate Manager. Once the CNAME is set up correctly, Amazon will handle the rest automatically.
If using a CDN to serve static resources externally, it's advisable to disable S3 access, ensuring that all traffic comes solely from CloudFront. This is not only more secure, but also provides a layer of protection against DDoS attacks, allowing IP addresses to be blocked via Web ACL.
The detailed setup process is as follows:
- Create a distribution in the CloudFront interface.
- If the origin is AWS S3, selecting "Origin Access Control Settings" will automatically apply the bucket policy. In the S3 console, I enabled Block Public Access so that only CloudFront can read from the bucket.
- Request an SSL certificate from AWS Certificate Manager. If using DNS validation, after creating the certificate, you'll receive CNAME name and value, which you'll input into the DNS Record to prove domain ownership.
- Add the custom domain in CloudFront settings, and you're all set! Public SSL certificates from AWS are free. From now on, static resources can be accessed via `https://cdn.kalan.dev`, which looks much nicer.
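For reference, the bucket policy that Origin Access Control applies looks roughly like this (bucket name, account ID, and distribution ID are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "cloudfront.amazonaws.com" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::YOUR_BUCKET/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::ACCOUNT_ID:distribution/DISTRIBUTION_ID"
        }
      }
    }
  ]
}
```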
Slate Editor
(In Development)
I want to create a Markdown editor that meets my specific needs, requiring a higher degree of customization. Thus, I chose Slate. Slate is designed to be highly flexible, but almost every feature needs to be implemented from scratch, and understanding the underlying Node structure takes time to shape it as I envision.
Currently, I've implemented some features I've long wanted:
- Uploading images to the CDN by triggering uploads via drag-and-drop or copy-paste, returning links.
- After pressing specific shortcut keys, the Translation API is called to translate text into other languages, allowing for quick translations of simple paragraphs, while more complex ones can be handled manually.
I'll share more once I have complete results.
Features and Optimization
At this point, a usable blog system is finally complete, but there are still some areas I want to improve and optimize:
- Add tests and E2E testing; currently, with so many articles, checking them one by one is cumbersome.
- Reduce DB queries by caching article content and placing it on the CDN.
- Minimize JavaScript bundle size.
- Implement full-text search.
- Directly insert uploaded files into the database.
- Export and import articles.
- To be continued...
If you found this article helpful, please consider buying me a coffee ☕ It'll make my ordinary day shine ✨
☕Buy me a coffee