Undici fails to read bodies from responses with implicit lengths #1414
Comments
My reading of the spec is different.
That is, the transfer-encoding header still needs to be present. Hence your example actually falls under point 4, the invalid response clause. |
I don't think that paragraph applies. It starts with "If a transfer-encoding header field is present" and in this case there is no transfer-encoding header. I'm referring to the final case, point 7:
Also, this seems very unlikely to be an invalid response as that would imply that the HTTP stacks of Chrome, Firefox and Node.js are all doing the wrong thing here. Point 4 regarding invalid response bodies says:
Nobody else does so - everybody except Undici happily accepts this response as legit. |
PR welcome |
I'd love to, but it looks like this is going to be deep in llhttp, which doesn't seem easy to dive into. I've skimmed the README from https://github.com/nodejs/llhttp and the llparse intro (https://llparse.org/) and I get the gist, but that's pretty spare and it's quite hard to immediately work out how the details work in practice. Is there a contributing guide for llhttp, or any more documentation for llparse anywhere? I'm not clear how to write test cases or debug it, and it's not clear exactly where Undici interacts with it to check inputs & outputs at the entrypoints. Looks like the trail starts here: Lines 80 to 83 in 360e5d1
That seems intended to start handling this exact situation to me, but something's going wrong in the parsing for that case at a later step. Or maybe parsing is working correctly, but then Undici is failing due to the connection close somewhere before it realises the response is complete & OK? |
Doing some digging, I've just run into #1412. Pretty sure that's caused by this underlying issue (and the HTTP/2 mention there is an unrelated red herring). I can reproduce the exact same failure with:
fetch('https://www.dailymail.co.uk/').then(res => res.text()).then(console.log)
Logging the headers too, they do include
That suggests this failing edge case is a lot more common in reality than I'd expected, since this is the landing page of one of the top 100 most visited websites in the world (https://www.semrush.com/website/dailymail.co.uk/ says it's 73rd - they're terrible but sadly popular) and other headers suggest this response is being served by Akamai. |
No, it's not in llhttp. Node.js uses llhttp as well and it implements this. |
Yep, I'm getting the same error trying to fetch the GitHub GraphQL API at https://api.github.com/graphql |
A PR to fix this would be nice. |
I have figured out the problem and I am currently figuring out how best to fix it. I might not get the PR up before dinner, but hopefully in the next couple of hours. |
@evanderkoogh did we fix this? |
@ronag I'm going to close this because the original problem no longer exists. There are other issues being reported as dupes of this, but I think that's not correct and might be confusing things. Both the examples I provided above that originally failed do now pass successfully, and have for a while. The first standalone example passes in Undici since v5.12.0, and in Node v18.12.0+ and all v20 versions. The second example (with the Daily Mail) passes in Undici since v5.13.0, and in Node v18.13.0+ and all v20 versions. Any other related issues are due to something else, or a much more specific version of this issue, and need investigating separately. |
Bug Description
Paraphrasing the HTTP spec (RFC 7230 3.3.3), the message length for a normal response body can be defined in a few ways:
- A content-length header with a fixed length (point 5 in the RFC list)
- A transfer-encoding header, with a transfer coding chunk that explicitly ends the body (point 3 in the list)
- Neither header, in which case the body is everything sent until the server closes the connection (point 7 in the list)
Undici handles the last case incorrectly, making it impossible to read the HTTP response body for this simple "respond and then close the connection" case.
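To make the framing rules concrete, here is a rough sketch of the RFC 7230 3.3.3 decision order as a single function. It is a simplification for illustration only, not how Undici or llhttp implement it: the name responseBodyFraming and its return labels are made up here, headers are assumed to be a plain object with lower-case names, and repeated content-length values are not handled.

```javascript
// Classify how a response body is delimited, loosely following the ordered
// rules of RFC 7230 section 3.3.3. Hypothetical helper for illustration:
// `headers` is assumed to be a plain object with lower-case header names.
function responseBodyFraming(method, statusCode, headers) {
  // Point 1: responses to HEAD, and 1xx / 204 / 304 responses, have no body.
  if (
    method === 'HEAD' ||
    (statusCode >= 100 && statusCode < 200) ||
    statusCode === 204 ||
    statusCode === 304
  ) {
    return 'none';
  }

  // Point 2: a 2xx response to CONNECT turns the connection into a tunnel.
  if (method === 'CONNECT' && statusCode >= 200 && statusCode < 300) {
    return 'tunnel';
  }

  // Point 3: transfer-encoding is present. If chunked is the final coding,
  // the chunked framing ends the body; otherwise read until close.
  const transferEncoding = headers['transfer-encoding'];
  if (transferEncoding !== undefined) {
    const codings = transferEncoding
      .split(',')
      .map((coding) => coding.trim().toLowerCase());
    return codings[codings.length - 1] === 'chunked' ? 'chunked' : 'until-close';
  }

  // Points 4 and 5 (simplified): a content-length must be a decimal number,
  // otherwise the framing is invalid.
  const contentLength = headers['content-length'];
  if (contentLength !== undefined) {
    return /^\d+$/.test(contentLength) ? 'content-length' : 'invalid';
  }

  // Point 7: neither header, so the body runs until the server closes
  // the connection. This is the case discussed in this issue.
  return 'until-close';
}

// The response from this report: only date and connection: close headers.
console.log(responseBodyFraming('GET', 200, { connection: 'close' })); // 'until-close'
```

Point 6 of the RFC list only concerns requests, so it does not appear in this response-side sketch.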
This doesn't come up much for big fancy servers, which tend to use keep-alive etc and explicit framing wherever they can, but it is common on quick simple HTTP server implementations, which avoid state and complexity by just streaming responses and closing the connection when they're done. It's a legitimate way to send responses according to the HTTP spec, and it is supported correctly by browsers and Node's http module, but not by Undici.
Reproducible By
Running in Node 18.0.0:
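A minimal sketch of this kind of reproduction, under some assumptions: it writes a raw response from a plain TCP server built on Node's net module rather than using an http server, so no default date header is sent, and it relies on the global fetch available in Node 18+. The names RAW_RESPONSE, startServer, fetchText and demo are illustrative only.

```javascript
const net = require('net');

// A raw HTTP/1.1 response with neither content-length nor transfer-encoding,
// so the body is delimited only by the server closing the connection.
const RAW_RESPONSE =
  'HTTP/1.1 200 OK\r\n' +
  'connection: close\r\n' +
  '\r\n' +
  'Hello from a close-delimited body';

// Start a TCP server that answers any request with RAW_RESPONSE and then
// closes the socket. Port 0 asks the OS for any free port.
function startServer(port = 0) {
  return new Promise((resolve, reject) => {
    const server = net.createServer((socket) => {
      socket.on('error', () => {}); // ignore client-side resets
      // Wait for the request bytes, reply, then end the connection.
      socket.once('data', () => socket.end(RAW_RESPONSE));
    });
    server.once('error', reject);
    server.listen(port, '127.0.0.1', () => resolve(server));
  });
}

// Fetch a URL and return its status and body text.
async function fetchText(url) {
  const res = await fetch(url);
  return { status: res.status, body: await res.text() };
}

async function demo() {
  const server = await startServer();
  const { port } = server.address();
  try {
    console.log(await fetchText(`http://127.0.0.1:${port}/`));
  } catch (err) {
    // Affected Undici versions are expected to reject here
    // instead of returning the body.
    console.error('fetch failed:', err);
  } finally {
    server.close();
  }
}

demo();
```

This sketch shuts the server down once the request finishes. To keep it available for a browser check, startServer(8008) can be called on its own without the server.close() step.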
The server returns a simple response, just containing the default date header and a connection: close header, then sends the body, then closes the connection.
The connection: close header is for clarity - this should work equally well without that, just using res.socket.end() explicitly instead.
Expected Behavior
The above should print the status, headers, and then the response body, which is everything after the headers until the connection is closed.
This does work using fetch in browsers. To test this, run the script above (which will print the fetch error, but then keep running) then load localhost:8008 in your browser - the string loads successfully.
With that page loaded (for CORS) you can also run the equivalent command in the browser dev console, which also works and prints the response body correctly:
This also works using Node's built-in HTTP module:
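As a rough sketch of that check, assuming a server like the one described is listening on localhost:8008, the body can be collected with http.get until the response ends. The helper name readBodyWithHttp is illustrative only.

```javascript
const http = require('http');

// Read a full response with Node's built-in http client. For a response
// with no content-length or transfer-encoding, 'end' fires once the
// server closes the connection.
function readBodyWithHttp(url) {
  return new Promise((resolve, reject) => {
    http
      .get(url, (res) => {
        const chunks = [];
        res.on('data', (chunk) => chunks.push(chunk));
        res.on('error', reject);
        res.on('end', () => {
          resolve({
            status: res.statusCode,
            headers: res.headers,
            body: Buffer.concat(chunks).toString('utf8'),
          });
        });
      })
      .on('error', reject);
  });
}

// Assumes the reproduction server is running on port 8008; if it is not,
// the connection error is logged instead.
readBodyWithHttp('http://localhost:8008/').then(console.log, console.error);
```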
Logs & Screenshots
Undici's fetch does not print the response body, instead it throws an error:
Environment
Ubuntu