feat: implement 0 dependency streaming multipart/form-data parser #1851

Closed
wants to merge 36 commits into from
36 commits
c3c9cf3
feat: incomplete parser
KhafraDev Jan 2, 2023
438b7a3
feat: add in basic body parsing algorithm
KhafraDev Jan 4, 2023
3c6f6f8
fix: remove unused body var
KhafraDev Jan 4, 2023
4866b9a
fix: cleanup
KhafraDev Jan 4, 2023
acb3fef
feat: parse headers and emit file event
KhafraDev Jan 6, 2023
2d4decd
fix: differentiate between file and field and parse header attributes…
KhafraDev Jan 7, 2023
ae7c177
run busboy test & support charset options
KhafraDev Jan 7, 2023
1d665b9
fix: byte-by-byte receiving
KhafraDev Jan 7, 2023
6ec1fd3
fix: byte-by-byte receiving
KhafraDev Jan 7, 2023
119e172
fix: error on _final if parsing isn't complete
KhafraDev Jan 7, 2023
b902ef1
feat: implement fileSize and fieldSize limits
KhafraDev Jan 7, 2023
5629f71
feat: implement files limit
KhafraDev Jan 7, 2023
084a5e3
fix: parse filename base and fix filename w/ backslashes
KhafraDev Jan 7, 2023
771fa9d
fix: only parse path if preservePath is false
KhafraDev Jan 7, 2023
8301ce8
fix: filename parsing and filename* parsing
KhafraDev Jan 7, 2023
4c40486
fix: actively parse headers, rather than lazily
KhafraDev Jan 8, 2023
6c1b276
fix(finally): validate header name and fix parser issues
KhafraDev Jan 8, 2023
601a4ea
fix: send leftover buffers to filestream on destroy, cleanup
KhafraDev Jan 9, 2023
cd6dd4b
fix: content-type headers cannot contain ;
KhafraDev Jan 9, 2023
52b4b0a
enable test
KhafraDev Jan 9, 2023
d33ac7e
fix: content-type header with leading whitespace
KhafraDev Jan 9, 2023
a2bcfd6
fix: don't emit field event w/o listeners
KhafraDev Jan 9, 2023
82ea7a3
fix: don't emit field event w/o listeners
KhafraDev Jan 9, 2023
0d99054
feat: implement parts limit
KhafraDev Jan 9, 2023
50669a6
feat: implement fields limit
KhafraDev Jan 9, 2023
5dd8b45
fix: limit header length to 16 KiB
KhafraDev Jan 9, 2023
13f3cb5
omg every test passes
KhafraDev Jan 9, 2023
ed92a82
fix: use path.win32.basename
KhafraDev Jan 10, 2023
dd01bd3
fix: add types & highwatermark options
KhafraDev Jan 10, 2023
787b679
fix: speedup header parsing & bug fixes & nyc coverage
KhafraDev Jan 11, 2023
7b15177
fix: add main test file
KhafraDev Jan 11, 2023
ac81520
whoops, skip tests on < v18
KhafraDev Jan 11, 2023
e9c8f93
apply suggestions
KhafraDev Jan 11, 2023
7642b60
cleanup FileStream
KhafraDev Jan 11, 2023
588d62d
fix: replace busboy in fetch
KhafraDev Jan 11, 2023
e21de08
perf: use collectASequenceOfCodePointsFast
KhafraDev Feb 26, 2023
1 change: 1 addition & 0 deletions index.d.ts
@@ -21,6 +21,7 @@ export * from './types/fetch'
export * from './types/file'
export * from './types/filereader'
export * from './types/formdata'
+export * from './types/formdataparser'
export * from './types/diagnostics-channel'
export * from './types/websocket'
export * from './types/content-type'
4 changes: 4 additions & 0 deletions index.js
@@ -141,6 +141,10 @@ if (util.nodeMajor >= 18 && hasCrypto) {
const { WebSocket } = require('./lib/websocket/websocket')

module.exports.WebSocket = WebSocket

+const { FormDataParser } = require('./lib/formdata/parser')
+
+module.exports.FormDataParser = FormDataParser
}

module.exports.request = makeDispatcher(api.request)
22 changes: 11 additions & 11 deletions lib/fetch/body.js
@@ -1,6 +1,5 @@
'use strict'

-const Busboy = require('busboy')
const util = require('../core/util')
const {
ReadableStreamFrom,
@@ -21,6 +20,7 @@ const { isErrored } = require('../core/util')
const { isUint8Array, isArrayBuffer } = require('util/types')
const { File: UndiciFile } = require('./file')
const { parseMIMEType, serializeAMimeType } = require('./dataURL')
+const { FormDataParser } = require('../formdata/parser')

let ReadableStream = globalThis.ReadableStream

@@ -374,21 +374,21 @@ function bodyMixinMethods (instance) {

const responseFormData = new FormData()

-let busboy
+let parser

try {
-busboy = Busboy({
+parser = new FormDataParser({
headers,
defParamCharset: 'utf8'
})
} catch (err) {
throw new DOMException(`${err}`, 'AbortError')
}

-busboy.on('field', (name, value) => {
+parser.on('field', (name, value) => {
responseFormData.append(name, value)
})
-busboy.on('file', (name, value, info) => {
+parser.on('file', (name, value, info) => {
const { filename, encoding, mimeType } = info
const chunks = []

@@ -417,14 +417,14 @@ function bodyMixinMethods (instance) {
}
})

-const busboyResolve = new Promise((resolve, reject) => {
-busboy.on('finish', resolve)
-busboy.on('error', (err) => reject(new TypeError(err)))
+const promise = new Promise((resolve, reject) => {
+parser.on('close', () => resolve())
+parser.on('error', (err) => reject(new TypeError(err)))
})

-if (this.body !== null) for await (const chunk of consumeBody(this[kState].body)) busboy.write(chunk)
-busboy.end()
-await busboyResolve
+if (this.body !== null) for await (const chunk of consumeBody(this[kState].body)) parser.write(chunk)
+parser.end()
+await promise

return responseFormData
} else if (/application\/x-www-form-urlencoded/.test(contentType)) {
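The hunk above swaps busboy's `finish` event for the new parser's `close` event but keeps the same overall shape: register listeners, write every body chunk, call `end()`, then await a promise that settles on completion or error. The following is a minimal sketch of that wiring only. `ByteCounter` and `drainInto` are hypothetical names that do not appear in the PR; the stand-in stream merely counts bytes and parses nothing, and the sketch assumes the parser behaves like an ordinary Node.js `Writable` that emits `close` once it has finished.

```javascript
'use strict'

const { Writable } = require('stream')

// Hypothetical stand-in for the PR's FormDataParser: a Writable that only
// counts the bytes it receives. The event wiring is the point, not parsing.
class ByteCounter extends Writable {
  constructor () {
    super()
    this.bytes = 0
  }

  _write (chunk, encoding, callback) {
    this.bytes += chunk.length
    callback()
  }
}

// Write every chunk, end the stream, then wait for 'close' or 'error'.
// Like the code in the diff, this ignores the return value of write().
async function drainInto (stream, chunks) {
  const done = new Promise((resolve, reject) => {
    stream.on('close', () => resolve())
    stream.on('error', (err) => reject(new TypeError(err)))
  })

  for await (const chunk of chunks) stream.write(chunk)
  stream.end()
  await done
}

const counter = new ByteCounter()
const finished = drainInto(counter, [Buffer.from('ab'), Buffer.from('cde')])
  .then(() => counter.bytes)

finished.then((bytes) => console.log(bytes)) // 5
```

Creating the promise before the first `write()` means an error raised while chunks are still being written already has a listener attached.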
22 changes: 18 additions & 4 deletions lib/fetch/dataURL.js
@@ -132,33 +132,47 @@ function URLSerializer (url, excludeFragment = false) {

// https://infra.spec.whatwg.org/#collect-a-sequence-of-code-points
/**
- * @param {(char: string) => boolean} condition
- * @param {string} input
+ * @template {string|Buffer} T
+ * @param {(char: T[number]) => boolean} condition
+ * @param {T} input
* @param {{ position: number }} position
+ * @returns {T}
*/
function collectASequenceOfCodePoints (condition, input, position) {
+const inputIsString = typeof input === 'string'
+const start = position.position

// 1. Let result be the empty string.
let result = ''

// 2. While position doesn’t point past the end of input and the
// code point at position within input meets the condition condition:
while (position.position < input.length && condition(input[position.position])) {
// 1. Append that code point to the end of result.
-result += input[position.position]
+if (inputIsString) {
+result += input[position.position]

Member

If inputIsString, you might skip the while loop and provide a fast path for this function.

Member Author

We can't skip the loop entirely: we need to advance the index and check whether each character matches the condition.

+}

// 2. Advance position by 1.
position.position++
}

// 3. Return result.

+if (!inputIsString) {
+return input.subarray(start, position.position)
+}

return result
}

/**
* A faster collectASequenceOfCodePoints that only works when comparing a single character.
+ * @template {string|Buffer} T
* @param {string} char
- * @param {string} input
+ * @param {T} input
* @param {{ position: number }} position
+ * @returns {T}
*/
function collectASequenceOfCodePointsFast (char, input, position) {
const idx = input.indexOf(char, position.position)
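To make the new string-or-Buffer behaviour of `collectASequenceOfCodePoints` concrete, the sketch below re-implements the function as it appears in this diff and runs it on both input types. The copy is for illustration only and the sample inputs are made up. Note that for a Buffer the condition receives byte values (numbers) rather than one-character strings.

```javascript
'use strict'

// Illustrative copy of the patched helper from this diff.
function collectASequenceOfCodePoints (condition, input, position) {
  const inputIsString = typeof input === 'string'
  const start = position.position
  let result = ''

  while (position.position < input.length && condition(input[position.position])) {
    if (inputIsString) {
      result += input[position.position]
    }
    position.position++
  }

  // Buffers come back as a view over the matched range (no bytes are copied);
  // strings are accumulated character by character.
  return inputIsString ? result : input.subarray(start, position.position)
}

// String input: the condition sees single-character strings.
const strPos = { position: 0 }
const type = collectASequenceOfCodePoints((c) => c !== '/', 'text/plain', strPos)

// Buffer input: the condition sees byte values.
const bufPos = { position: 0 }
const headerName = collectASequenceOfCodePoints(
  (byte) => byte !== 0x3A, // ':'
  Buffer.from('content-type: text/plain'),
  bufPos
)

console.log(type, strPos.position) // text 4
console.log(headerName.toString(), bufPos.position) // content-type 12
```

In both cases `position` is left pointing at the first element that failed the condition, so a caller can continue parsing from there.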
41 changes: 41 additions & 0 deletions lib/formdata/constants.js
@@ -0,0 +1,41 @@
+'use strict'
+
+const states = {
+  INITIAL: 0,
+  BOUNDARY: 1,
+  READ_HEADERS: 2,
+  READ_BODY: 3
+}
+
+const headerStates = {
+  DEFAULT: -1, // no match
+  FIRST: 0,
+  SECOND: 1,
+  THIRD: 2
+}
+
+const chars = {
+  '-': 0x2D,
+  cr: 0x0D,
+  lf: 0x0A,
+  ':': 0x3A,
+  ' ': 0x20,
+  ';': 0x3B,
+  '=': 0x3D,
+  '"': 0x22
+}
+
+const emptyBuffer = Buffer.alloc(0)
+
+const crlfBuffer = Buffer.from([0x0D, 0x0A]) // \r\n
+
+const maxHeaderLength = 16 * 1024
+
+module.exports = {
+  states,
+  chars,
+  headerStates,
+  emptyBuffer,
+  maxHeaderLength,
+  crlfBuffer
+}
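As a rough illustration of how byte-level constants like these can be used, the sketch below reads one `name: value` header line from a Buffer by comparing raw byte values instead of decoding to a string first. `readHeaderLine` is a hypothetical helper that is not part of the PR, and the error messages are invented; only `chars`, `crlfBuffer` and `maxHeaderLength` are taken from the file above.

```javascript
'use strict'

// Values copied from lib/formdata/constants.js in this diff.
const chars = { ':': 0x3A, ' ': 0x20 }
const crlfBuffer = Buffer.from([0x0D, 0x0A]) // \r\n
const maxHeaderLength = 16 * 1024

/**
 * Hypothetical helper (not from the PR): reads one header line from the
 * start of `buffer`. Returns null when no complete line is buffered yet.
 */
function readHeaderLine (buffer) {
  const end = buffer.indexOf(crlfBuffer)
  const length = end === -1 ? buffer.length : end

  // Refuse to keep buffering a header that is already over the limit.
  if (length > maxHeaderLength) {
    throw new Error('header exceeds maximum length')
  }

  if (end === -1) return null

  const colon = buffer.indexOf(chars[':'])
  if (colon === -1 || colon > end) {
    throw new Error('malformed header')
  }

  // Skip spaces between the colon and the value.
  let valueStart = colon + 1
  while (valueStart < end && buffer[valueStart] === chars[' ']) valueStart++

  return {
    name: buffer.subarray(0, colon).toString('latin1').toLowerCase(),
    value: buffer.subarray(valueStart, end).toString('latin1'),
    rest: buffer.subarray(end + crlfBuffer.length)
  }
}

const line = readHeaderLine(Buffer.from('Content-Type: text/plain\r\nhello'))
console.log(line.name, line.value, line.rest.toString()) // content-type text/plain hello
```

Returning `null` for an incomplete line lets a streaming caller hold on to the partial bytes and retry once the next chunk arrives.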
11 changes: 11 additions & 0 deletions lib/formdata/filestream.js
@@ -0,0 +1,11 @@
+'use strict'
+
+const { Readable } = require('stream')
+
+class FileStream extends Readable {
+  _read () {}
+}
+
+module.exports = {
+  FileStream
+}
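`FileStream` is a `Readable` whose `_read()` does nothing, which suits a producer that pushes data in from outside instead of being pulled from. The sketch below shows that pattern with a copy of the class. The pushed chunks are made up, and the assumption that the parser feeds the stream via `push()` is inferred from the class shape, not confirmed by this diff.

```javascript
'use strict'

const { Readable } = require('stream')

// Same shape as the FileStream in this diff: _read() is a no-op because the
// producer decides when data becomes available.
class FileStream extends Readable {
  _read () {}
}

const file = new FileStream()
const received = []

// Attaching a 'data' listener switches the stream into flowing mode.
file.on('data', (chunk) => received.push(chunk))

const ended = new Promise((resolve) => {
  file.on('end', () => resolve(Buffer.concat(received).toString()))
})

// Assumed producer behaviour: push body bytes as they arrive, then push(null)
// to signal the end of the file.
file.push(Buffer.from('hello '))
file.push(Buffer.from('world'))
file.push(null)

ended.then((text) => console.log(text)) // hello world
```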