I was working on a bookmarklet that, among other things, form-posts the title of whatever page you're on to my server running Express, and I'm seeing Connect's body parser choke on some pages from Amazon.
Run the test-case website locally (the gist is linked below), drag the bookmarklet to your toolbar, and click it on any of the provided Amazon links. You should see an error message like this one:
URIError: URI malformed
    at decodeURIComponent (native)
    at /usr/local/lib/node/.npm/qs/0.1.0/package/lib/querystring.js:28:18
    at Array.reduce (native)
    at /usr/local/lib/node/.npm/qs/0.1.0/package/lib/querystring.js:27:6
    at IncomingMessage.<anonymous> (/usr/local/lib/node/.npm/connect/1.3.0/package/lib/middleware/bodyParser.js:74:15)
    at IncomingMessage.emit (events.js:61:17)
    at HTTPParser.onMessageComplete (http.js:132:23)
    at Socket.ondata (http.js:1007:22)
    at Socket._onReadable (net.js:677:27)
    at IOWatcher.onReadable [as callback] (net.js:177:10)
This happens on Amazon pages where the title contains non-ASCII characters, like é or ü. You can change the title of an Amazon page (e.g. by setting document.title in the console) to just é, for example, and it will trigger the bug.
I've done some investigating and can give you more info, but at a high level, it seems that the browser in this case encodes the form differently than encodeURIComponent() does, which causes decodeURIComponent(), which Connect's body parser uses, to choke.
For example, calling encodeURIComponent() on that é yields %C3%A9 everywhere, but what the server receives in the form body from these Amazon pages is %E9. Calling decodeURIComponent() on %E9 causes this error.
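The mismatch can be reproduced in plain Node without a browser or Express. This is only a minimal sketch of the two calls described above; the variable names are made up for illustration.

```javascript
// '\u00e9' is é, written as an escape so the file's own encoding doesn't matter.
const eAcute = '\u00e9';

// encodeURIComponent() always emits the UTF-8 bytes of the character.
const utf8Encoded = encodeURIComponent(eAcute);
console.log(utf8Encoded); // %C3%A9

// Decoding the UTF-8 form round-trips cleanly.
console.log(decodeURIComponent(utf8Encoded) === eAcute); // true

// %E9 on its own is not a valid UTF-8 sequence, so decoding throws.
let errorName = null;
try {
  decodeURIComponent('%E9');
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // URIError
```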
I tried making a sample page for this, but its form post matched the encodeURIComponent() output. I'm guessing the behavior on Amazon is related to the page's character encoding, but I haven't been able to confirm that, maybe because Express sends a Content-Type header that specifies utf-8.
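One way to check the encoding guess is to percent-encode the same character under two byte encodings and compare the results with what the server received. The sketch below assumes the Amazon pages are served in ISO-8859-1 (Latin-1), which this report does not confirm. percentEncodeAs is a hypothetical helper, not part of Connect or qs.

```javascript
// Hypothetical helper: convert `text` to bytes in `encoding`, then
// percent-encode every byte. (A browser would leave unreserved ASCII
// characters alone; this sketch escapes everything for simplicity.)
function percentEncodeAs(text, encoding) {
  const bytes = Buffer.from(text, encoding);
  let out = '';
  for (const byte of bytes) {
    out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
  }
  return out;
}

const sample = '\u00e9'; // é

const asUtf8 = percentEncodeAs(sample, 'utf8');
const asLatin1 = percentEncodeAs(sample, 'latin1');

console.log(asUtf8);   // %C3%A9 (matches encodeURIComponent)
console.log(asLatin1); // %E9 (matches what the server received)
```

If the assumption holds, a sample page served with a Latin-1 charset rather than utf-8 should show the same %E9 in its form post.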
All said, it seems that Connect's body parser shouldn't break on these encodings. Hope this info helps. Thanks!
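For illustration, here is one shape a more tolerant decoder could take. It is a sketch only, not how Connect or qs behave: decodeLenient is a hypothetical name, and the Latin-1 fallback is an assumption about what the sender meant.

```javascript
// Hypothetical lenient decoder for one form-encoded value.
function decodeLenient(value) {
  // In form bodies, '+' stands for a space.
  const text = value.replace(/\+/g, ' ');
  try {
    return decodeURIComponent(text);
  } catch (err) {
    if (!(err instanceof URIError)) throw err;
    // Fallback (an assumption): read each %XX as one ISO-8859-1 character.
    // Anything that is not a complete %XX escape is left untouched.
    // Caveat: a value mixing valid UTF-8 escapes with stray bytes will have
    // its UTF-8 parts misread by this fallback.
    return text.replace(/%([0-9A-Fa-f]{2})/g, (match, hex) =>
      String.fromCharCode(parseInt(hex, 16)));
  }
}

console.log(decodeLenient('%C3%A9'));         // é (normal UTF-8 path)
console.log(decodeLenient('%E9'));            // é (Latin-1 fallback)
console.log(decodeLenient('caf%E9+au+lait')); // café au lait
```

An even simpler option would be to catch the URIError and pass the raw string through, which at least keeps the request from failing.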
Here's a super simple test case:
https://gist.github.com/947895