Incorrect validation of non-utf8 characters in setHeader function #50213
Comments
For reference: nodejs/undici#2020, #47479
@marco-ippolito However, when we create a buffer from a string with latin1 encoding and then call toString() without explicitly stating the encoding, the character information is lost in the end. Minimal reproducible code below
Explanation of how it ended up this way: The latin1 character Ç (U+00C7) is stored in the buffer as the single byte 0xC7. Decoding that buffer with the default UTF-8 encoding turns the invalid byte into the replacement character � (U+FFFD). When sending HTTP response headers, Node.js converts the string into latin1 again by keeping only the low byte of � (0xFD), which is ý in the latin1 set.
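The round trip described above can be reproduced with Buffer alone, without an HTTP server. This is only a sketch of the encoding steps, not the reproduction code from the original comment; the variable names are made up for illustration, and the input is the file name from the issue written with \u escapes so the example does not depend on the source file's encoding.

```javascript
// 'ÇÕÑêÿ Island', written with escapes.
const original = '\u00C7\u00D5\u00D1\u00EA\u00FF Island';

// Every character fits in a single latin1 byte (Ç is 0xC7).
const latin1Bytes = Buffer.from(original, 'latin1');

// toString() defaults to UTF-8. The bytes 0xC7, 0xD5, 0xD1, 0xEA and 0xFF
// do not form valid UTF-8 sequences here, so each decodes to U+FFFD.
const decoded = latin1Bytes.toString();

// Encoding as latin1 keeps only the low byte of each UTF-16 code unit,
// so U+FFFD becomes 0xFD, which is ý in latin1.
const onTheWire = Buffer.from(decoded, 'latin1');
const seenByClient = onTheWire.toString('latin1');

console.log(seenByClient); // prints "ýýýýý Island"
```

Passing 'latin1' explicitly to toString() in the second step returns the original string and avoids the loss.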
The current workaround I found is to either convert the string into a binary string before calling
or Line 602 in badba8c
cc @ShogunPanda
Version
>=18.16.0
Platform
All
Subsystem
_http_common.js
What steps will reproduce the bug?
Node.js actively prevents characters in the extended ASCII table from being passed in as a string.
Evidence: When using writeHead() with the headers parameter, it throws an ERR_INVALID_CHAR error. However, when we use setHeader(), it does not validate the characters passed in.
How often does it reproduce? Is there a required condition?
It always occurs when using the setHeader() function with latin1 (extended ASCII table) characters.
What is the expected behavior? Why is that the expected behavior?
The expected behaviour would be that the setHeader() function rejects non-ASCII strings.
Source: According to this PR, http: unify header treatment by marco-ippolito · Pull Request #46528 · nodejs/node, header value validation always rejects non-ASCII characters. To support latin1 in the Content-Disposition HTTP header, the server has to parse the string as a binary string (within the ASCII range), just like what's written in the test https://github.com/nodejs/node/blob/main/test/parallel/test-http-server-non-utf8-header.js
What do you see instead?
setHeader() does not reject non-ASCII strings. The Content-Disposition header gets parsed incorrectly.
For example: An input of attachment; filename="ÇÕÑêÿ Island" becomes attachment; filename="ýýýýý Island"
Additional information
Through debugging, I found it really strange that checkInvalidHeaderChar() behaves differently with the header value passed in from writeHead() and from setHeader(). writeHead() passes a string buffer which gets regex'ed/rejected correctly, but setHeader() passes a plain string and it does not trigger the regex.