If you have any questions or feedback, please fill out this form
This post is translated by ChatGPT and originally written in Mandarin, so there may be some inaccuracies or mistakes.
Introduction
The form element is quite common in web applications, serving not only to transmit plain text but also to facilitate file uploads. However, because forms behave differently from other transmission methods, confusion and misunderstandings can sometimes arise.
This article digs into the relevant specifications to explain what happens behind the scenes when a form is submitted, and how Form Data differs from other transmission methods. Finally, we will discuss the functionality provided by the HTML <form/> tag.
The main points covered are:
- What is multipart/form-data and why do we need it?
- How to understand request formats
- What problems are solved by form-data
Why Do We Need Form Data?
For data transmission, both parties need to have a shared understanding of the data format. In the world of the internet, we use protocols to standardize how data is transmitted. Through the HTTP Content-Type header, we can identify the content of a request and interpret the data accordingly.
MIME Type defines the types of transmission formats:
- Content-Type: application/json indicates that the request content is JSON
- Content-Type: image/png indicates that the request content is an image file
Among these, multipart/form-data is one of the possible Content-Type values.
Generally, a Content-Type describes only one type of data at a time. However, in web applications we may also want to upload files, images, or videos through forms alongside text fields, which led to the multipart/form-data specification (RFC 7578).
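To make the role of the Content-Type header concrete, here is a minimal sketch of how a receiver might split the header value into its media type and parameters. The helper name parseContentType is hypothetical, and the parsing is deliberately simplified (it assumes parameter values do not contain semicolons).

```javascript
// Hypothetical helper: split a Content-Type header value into its media type
// and parameters. Simplified sketch: quoted values containing ';' are not handled.
function parseContentType(headerValue) {
  const [mediaType, ...rawParams] = headerValue.split(';');
  const params = {};
  for (const raw of rawParams) {
    const index = raw.indexOf('=');
    if (index === -1) continue; // skip malformed parameters
    const key = raw.slice(0, index).trim().toLowerCase();
    // Strip optional surrounding double quotes from the value
    const value = raw.slice(index + 1).trim().replace(/^"(.*)"$/, '$1');
    params[key] = value;
  }
  return { type: mediaType.trim().toLowerCase(), params };
}

const parsed = parseContentType(
  'multipart/form-data; boundary=----WebKitFormBoundaryFYGn56LlBDLnAkfd'
);
console.log(parsed.type);            // multipart/form-data
console.log(parsed.params.boundary); // ----WebKitFormBoundaryFYGn56LlBDLnAkfd
```

A receiver would typically branch on the returned type, for example parsing the body as JSON for application/json, or splitting it on the boundary parameter for multipart/form-data.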
Parsing Form Data Requests
The primary utility of multipart/form-data is that it allows multiple data formats to be sent in a single request. It is mainly used by HTML forms and for implementing file uploads.
Next, let's take a look at what the multipart/form-data format looks like. To send a request with a Content-Type of multipart/form-data, we can use the HTML form tag (or JavaScript's FormData):
<form enctype="multipart/form-data" action="/upload" method="POST">
  <input type="text" name="name" />
  <input type="file" name="file" />
  <button>Submit</button>
</form>
When the Submit button is clicked, the browser sends a POST request:
POST /upload HTTP/1.1
Host: localhost:3000
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryFYGn56LlBDLnAkfd
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36

------WebKitFormBoundaryFYGn56LlBDLnAkfd
Content-Disposition: form-data; name="name"

Test
------WebKitFormBoundaryFYGn56LlBDLnAkfd
Content-Disposition: form-data; name="file"; filename="text.txt"
Content-Type: text/plain

Hello World
------WebKitFormBoundaryFYGn56LlBDLnAkfd--
Since web requests are based on HTTP, a multipart/form-data request is also an ordinary HTTP request, with its body format specified in the RFC.
To understand a multipart/form-data request, two key points need to be noted:
- The role of the boundary
- The meaning of each format
The Role of Boundary
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryFYGn56LlBDLnAkfd
In the Content-Type, we can see that the boundary is followed by a strange string. What is the purpose of this boundary?
As mentioned earlier, the purpose of multipart/form-data is to allow different formats of data to be sent through a single request. Therefore, there needs to be a way to determine where each piece of data begins and ends. For example, in the query string a=b&c=d, the & serves as a delimiter that tells the computer where to split the data. The boundary plays the same role: each time the parser encounters it, it knows that the current field has been fully read and that the next piece of data begins.
The specification does not dictate what the boundary string must look like, but it does constrain its length and characters:
- In the request body, each delimiter line begins with two hyphens followed by the boundary value
- The boundary value is at most 70 characters long (not counting the two leading hyphens)
- Only 7-bit ASCII characters are allowed
Thus, a string like helloworldboundary is also a completely valid boundary.
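The rules above can be checked mechanically. The sketch below is one possible way to do it, not an official validator: the helper names isValidBoundary and createBoundary are hypothetical, and the allowed character set is the "bchars" set from RFC 2046, which is stricter than plain 7-bit ASCII and forbids a trailing space.

```javascript
const crypto = require('crypto');

// Characters permitted in a boundary by RFC 2046 ("bchars").
// A space may appear anywhere except as the last character.
// Total length: 1 to 70 characters.
const BOUNDARY_PATTERN = /^[0-9A-Za-z'()+_,\-.\/:=? ]{0,69}[0-9A-Za-z'()+_,\-.\/:=?]$/;

// Hypothetical helper: returns true when the boundary value is well-formed.
function isValidBoundary(boundary) {
  return BOUNDARY_PATTERN.test(boundary);
}

// Hypothetical helper: generate a random boundary that is unlikely to
// appear inside the data being sent (4 + 14 + 24 = 42 characters).
function createBoundary() {
  return `----SketchBoundary${crypto.randomBytes(12).toString('hex')}`;
}

console.log(isValidBoundary('helloworldboundary')); // true
console.log(isValidBoundary('a'.repeat(71)));       // false (too long)
console.log(isValidBoundary('ends with space '));   // false
console.log(isValidBoundary(createBoundary()));     // true
```

Browsers generate a random boundary for the same reason createBoundary does: the delimiter must not occur inside any of the parts.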
Content-Disposition
In multipart/form-data, the Content-Disposition header describes each part: which form field it belongs to and, for files, the file name.
Content-Disposition: form-data; name="name"
This indicates that this is a field in the Form Data with the name "name".
If it is a file, the filename will also be appended, and the next line will include a Content-Type header to describe the file type:
Content-Disposition: form-data; name="file"; filename="text.txt"
Content-Type: text/plain
A blank line follows before the data content:
------WebKitFormBoundaryFYGn56LlBDLnAkfd
Content-Disposition: form-data; name="name"

Test
------WebKitFormBoundaryFYGn56LlBDLnAkfd
Content-Disposition: form-data; name="file"; filename="text.txt"
Content-Type: text/plain

Hello World
------WebKitFormBoundaryFYGn56LlBDLnAkfd--
In this example, I uploaded a plain text file. If it were an image or another file type, the part would contain the file's raw bytes, which look like gibberish when displayed as text:
Content-Disposition: form-data; name="file"; filename="image.png"
Content-Type: image/png
PNG
IHDR¤@¬
ÃiCCPICC ProfileHTSÙϽétBoô*%ôÐ{³@B!!ØPGp,¨2 cd,(¶A±a :l¨¼<ÂÌ{ë½·Þ¿ÖY÷»;ûì½ÏYçܵÏ
(omitted)
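The Content-Disposition header described in this section can be read with a small helper. This is a minimal sketch under simplifying assumptions: parseContentDisposition is a hypothetical name, parameter values are assumed to be double-quoted, and escaped quotes or encoded file names are not handled.

```javascript
// Hypothetical helper: extract the disposition type, field name, and file name
// from a part's Content-Disposition header value.
// Simplified sketch: assumes double-quoted values without escaped quotes.
function parseContentDisposition(value) {
  const type = value.split(';', 1)[0].trim().toLowerCase();
  const params = {};
  for (const [, key, val] of value.matchAll(/;\s*([\w-]+)="([^"]*)"/g)) {
    params[key.toLowerCase()] = val;
  }
  return { type, name: params.name, filename: params.filename };
}

const field = parseContentDisposition('form-data; name="name"');
const file = parseContentDisposition('form-data; name="file"; filename="text.txt"');

console.log(field.name, field.filename); // name undefined
console.log(file.name, file.filename);   // file text.txt
```

A part with a filename is treated as an uploaded file; a part without one is an ordinary field.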
Implementing a multipart/form-data Request
Now that we understand the request format for multipart/form-data, we can try to implement one ourselves. Here, we'll use Node.js as an example:
const http = require('http');
const fs = require('fs');

// File to upload. It is interpolated into a string below,
// which is only safe for text content such as text.txt.
const fileContent = fs.readFileSync('./text.txt');
const formData = {
  name: 'Kalan',
  file: fileContent,
};

const boundary = 'helloworld';
let payload = '';

Object.keys(formData).forEach((k) => {
  let part;
  if (k === 'file') {
    // File part: Content-Disposition carries the filename,
    // followed by the Content-Type of the file
    part = [
      `--${boundary}`,
      `\r\nContent-Disposition: form-data; name="${k}"; filename="text.txt"`,
      `\r\nContent-Type: text/plain`,
      `\r\n`,
      `\r\n${formData[k]}`,
      `\r\n`,
    ].join('');
  } else {
    // Plain field: header, a blank line, then the value
    part = [
      `--${boundary}`,
      `\r\nContent-Disposition: form-data; name="${k}"`,
      `\r\n`,
      `\r\n${formData[k]}`,
      `\r\n`,
    ].join('');
  }
  payload += part;
});

// The closing delimiter ends with two extra hyphens
payload += `--${boundary}--\r\n`;

const options = {
  host: 'localhost',
  port: 3000,
  path: '/upload',
  protocol: 'http:',
  method: 'POST',
  headers: {
    'Content-Type': `multipart/form-data; boundary=${boundary}`,
    'Content-Length': Buffer.byteLength(payload),
  },
};

const req = http.request(options, (res) => {
  res.resume(); // discard the response body so the socket is released
});
req.on('error', (err) => console.error(err));
req.write(payload);
req.end();
Implementing it is straightforward; it merely involves filling in the request body with the defined format. The key point to note is that each boundary begins with two hyphens, and the last boundary ends with two hyphens as well.
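One limitation of building the payload as a string is that binary file content is converted to text along the way and may be corrupted. The following is a minimal sketch of a binary-safe variant that assembles the body from Buffers instead. The helper buildMultipartBody and the shape of its files argument are assumptions made for illustration, and field names and file names are not escaped.

```javascript
// Hypothetical helper: build a multipart/form-data body as a Buffer so that
// file bytes are copied as-is instead of being converted to a string.
// Sketch only: names and file names containing quotes are not escaped.
function buildMultipartBody(boundary, fields, files) {
  const chunks = [];

  // Plain fields: header, blank line, value
  for (const [name, value] of Object.entries(fields)) {
    chunks.push(Buffer.from(
      `--${boundary}\r\n` +
      `Content-Disposition: form-data; name="${name}"\r\n\r\n` +
      `${value}\r\n`
    ));
  }

  // Files: headers, blank line, raw bytes
  for (const file of files) {
    chunks.push(Buffer.from(
      `--${boundary}\r\n` +
      `Content-Disposition: form-data; name="${file.field}"; filename="${file.filename}"\r\n` +
      `Content-Type: ${file.contentType}\r\n\r\n`
    ));
    chunks.push(file.data); // raw bytes, untouched
    chunks.push(Buffer.from('\r\n'));
  }

  // Closing delimiter
  chunks.push(Buffer.from(`--${boundary}--\r\n`));
  return Buffer.concat(chunks);
}

// The first four bytes of a PNG file stand in for real image data here
const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47]);
const body = buildMultipartBody('helloworld', { name: 'Kalan' }, [
  { field: 'file', filename: 'image.png', contentType: 'image/png', data: pngSignature },
]);

console.log(body.includes(pngSignature)); // true
```

The resulting Buffer can be passed to req.write() directly, with body.length used as the Content-Length.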
Next, we can use Wireshark to observe whether the packet content is parsed correctly:
We can see that the encapsulated multipart part, which includes name=Kalan and the file content, has been parsed correctly. This indicates several things:
- multipart/form-data is also a type of HTTP request
- Requests can be sent without a browser as long as they conform to the format
- File content must be parsed on the server-side (the request only transmits a chunk of binary data)
The last point is often overlooked by beginners: sending a request with multipart/form-data doesn't mean that the backend can directly access the file. The body must first be parsed to extract the file content into a format that is easier to work with. For instance, in Node.js, a popular package for handling file uploads is multer, which parses the request body for us.
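To illustrate what such server-side parsing involves, here is a rough sketch that splits an already received body into parts. It is not how multer works internally; parseMultipart is a hypothetical helper that assumes the whole body is in memory and well-formed, whereas real parsers work on streams and handle many more edge cases.

```javascript
// Hypothetical helper: split a complete multipart body (a Buffer) into parts.
// Each part is returned as { headers, data }, where data is a Buffer.
function parseMultipart(body, boundary) {
  const first = Buffer.from(`--${boundary}`);
  const delimiter = Buffer.from(`\r\n--${boundary}`);
  const headerEnd = Buffer.from('\r\n\r\n');
  const parts = [];

  let pos = body.indexOf(first);
  if (pos === -1) return parts;
  pos += first.length;

  while (true) {
    // "--" right after the boundary marks the closing delimiter
    if (body.subarray(pos, pos + 2).toString('latin1') === '--') break;

    const lineEnd = body.indexOf('\r\n', pos); // end of the delimiter line
    if (lineEnd === -1) break;
    const next = body.indexOf(delimiter, lineEnd);
    if (next === -1) break; // malformed body: no further delimiter

    const part = body.subarray(lineEnd + 2, next);
    const split = part.indexOf(headerEnd); // blank line between headers and data
    if (split !== -1) {
      const headers = {};
      for (const line of part.subarray(0, split).toString('utf8').split('\r\n')) {
        const colon = line.indexOf(':');
        if (colon === -1) continue;
        headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
      }
      parts.push({ headers, data: part.subarray(split + headerEnd.length) });
    }
    pos = next + delimiter.length;
  }
  return parts;
}

const sample = Buffer.from([
  '--helloworld',
  'Content-Disposition: form-data; name="name"',
  '',
  'Kalan',
  '--helloworld',
  'Content-Disposition: form-data; name="file"; filename="text.txt"',
  'Content-Type: text/plain',
  '',
  'Hello World',
  '--helloworld--',
  '',
].join('\r\n'));

const parts = parseMultipart(sample, 'helloworld');
console.log(parts.map((p) => p.data.toString())); // [ 'Kalan', 'Hello World' ]
```

In practice it is safer to rely on an established parser such as multer, which also handles streaming and size limits.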
application/x-www-form-urlencoded
If you use the GET method to submit a form, all form contents are sent as a URL-encoded query string. For example, submitting the following HTML form navigates to /upload?name=Kalan&file=filename when the Submit button is clicked, even though the enctype specifies multipart/form-data. Only the file's name is sent, not its content.
<form enctype="multipart/form-data" action="/upload" method="GET">
  <input type="text" name="name" />
  <input type="file" name="file" />
  <button>Submit</button>
</form>
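The URL-encoded format produced by such a form can be reproduced with the built-in URLSearchParams class, which is available in both browsers and Node.js. This is a small sketch; the file name "my file.txt" is a made-up value chosen to show how a space is encoded.

```javascript
// Build the same kind of query string a GET form submission produces
const params = new URLSearchParams();
params.append('name', 'Kalan');
params.append('file', 'my file.txt'); // only the file name is sent, not its content

const encoded = params.toString();
console.log(encoded);              // name=Kalan&file=my+file.txt
console.log(`/upload?${encoded}`); // /upload?name=Kalan&file=my+file.txt

// Decoding reverses the process
const decoded = new URLSearchParams(encoded);
console.log(decoded.get('file')); // my file.txt
```

Because every value is flattened into text like this, the format has no way to carry a file's bytes alongside its name, which is the gap multipart/form-data fills.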
Conclusion
This article aimed to understand multipart/form-data through its specifications, explore the problems Form Data solves, and build a compliant multipart/form-data request by hand for a deeper understanding of this uniquely structured HTTP request.
multipart/form-data offers several benefits for web applications:
- Different formats of data can be sent in a single request
- It meets users' needs for file transmission
- Browsers have a standardized specification to implement
For developers, understanding multipart/form-data serves several purposes:
- Knowing the principles for achieving file uploads on the web
- Understanding how HTTP requests standardize the transmission of different format data
- Mastery of these principles can accelerate development speed
The next article will focus on the <form> tag, exploring how browsers handle this HTML element and what developers should pay attention to.
If you found this article helpful, please consider buying me a coffee ☕ It'll make my ordinary day shine ✨