actix-web returns 400 bad request for http requests emitted by many user agents #3102

Open
lovasoa opened this issue Aug 13, 2023 · 3 comments

lovasoa commented Aug 13, 2023

Hello, and first of all, thank you for this great library!

Recently, I published a blog post titled I’m sorry I forked you. In the title, the second character is a curly apostrophe ( U+2019 Right Single Quotation Mark).

I shared it online and started getting hits from a lot of different browsers. A significant portion of the hits (I don't know from which browsers exactly) did not percent-encode the apostrophe (as %E2%80%99), but included the raw character directly in the HTTP query.

There are two layers between the web and my actix service:

  • Cloudflare, which parsed and understood the HTTP query perfectly well, and forwarded it with the curly apostrophe
  • nginx, which also parsed and forwarded the query without issue.

But when it got to actix-web, it failed to parse the query, and returned a 400 back without even invoking my code.
The very confusing error message I got was: [ERROR actix_http::h1::dispatcher] stream error: Request parse error: Invalid Header provided (confusing because the message did not state what the problem was exactly, and said it came from the headers instead of the query string).

See: https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier
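
For reference, here is a minimal sketch of the encoding step that these clients skipped: each byte of the character's UTF-8 representation is written as `%XX`. The helper name `encode_non_ascii` is hypothetical and is not part of actix-web or any other crate. It uses only the standard library and leaves every ASCII byte untouched, so it is not a complete URI encoder.

```rust
/// Percent-encode every non-ASCII byte of `input`, leaving ASCII bytes as-is.
/// Hypothetical helper for illustration only; not an actix-web API.
fn encode_non_ascii(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for &byte in input.as_bytes() {
        if byte.is_ascii() {
            // ASCII bytes are copied unchanged (no reserved-character handling here).
            out.push(byte as char);
        } else {
            // Each byte of a multi-byte UTF-8 sequence becomes one %XX triplet.
            out.push_str(&format!("%{:02X}", byte));
        }
    }
    out
}
```

Calling `encode_non_ascii("/\u{2019}")` gives `/%E2%80%99`, which is the form of the path that actix-web does accept.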

Expected Behavior

Since clients in the real world emit HTTP requests with raw Unicode characters, I think actix-web should accept them, and just invoke the user code with the Unicode query string.

And when it encounters a real issue with the query string, it should say it comes from the query string, not from the headers, and give more details than just Request parse error.

Current Behavior

logs [ERROR actix_http::h1::dispatcher] stream error: Request parse error: Invalid Header provided

and returns an HTTP 400 bad request response to the client.

Steps to Reproduce (for bugs)

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    actix_web::HttpServer::new(|| actix_web::App::new())
    .bind(("127.0.0.1", 8080))?
    .run()
    .await
}

❯ curl -v 'localhost:8080/’'
*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /’ HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.81.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 400 Bad Request
< content-length: 0
< connection: close
< date: Sun, 13 Aug 2023 20:01:26 GMT
< 
* Closing connection 0
@robjtede added the labels needs-investigation and A-http (project: actix-http) on Aug 13, 2023

lovasoa commented Aug 13, 2023

I dug through the logs; here is a list of some of the user agents that sent requests with raw Unicode characters, with the number of requests from each:

      1 DuckDuckGo/5 (com.duckduckgo.mobile.android; Android API 28)
      1 Embed PHP library
      1 Hatena::Fetcher/0.01 (master) Furl/3.13
      1 Mediatoolkitbot ([email protected])
      1 Mozilla/5.0 (Android 13; Mobile; rv:109.0) Gecko/116.0 Firefox/116.0
      1 Mozilla/5.0 (compatible; heritrix/3.3.0-SNAPSHOT-20150302-2206 +http://127.0.0.1)
      1 Mozilla/5.0 (compatible;PetalBot;+https://webmaster.petalsearch.com/site/petalbot)
      1 Mozilla/5.0 (compatible; Qwantify-dev/1.0; +https://help.qwant.com/bot/)
      1 Mozilla/5.0 (compatible; SemrushBot; +http://www.semrush.com/bot.html)
      1 Mozilla/5.0 (compatible; Yeti/1.1; +https://naver.me/spd)
      1 Mozilla/5.0 (iPhone; CPU iPhone OS 16_1_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Mobile/15E148 DuckDuckGo/7 Safari/605.1.15
      1 Mozilla/5.0 (iPhone; CPU iPhone OS 16_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.5 Mobile/15E148 DuckDuckGo/7 Safari/605.1.15
      1 Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 DuckDuckGo/7 Safari/605.1.15
      1 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.1.25 (KHTML, like Gecko) Version/8.0 Safari/600.1.25
      1 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36
      1 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)
      1 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36
      1 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36
      1 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/116.0
      1 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36
      1 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36
      1 Mozilla/5.0 (Windows; U; Windows NT 5.1; ja; rv:1.8.0.9) Gecko/20061206 Firefox/53.0
      1 Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:94.0) Gecko/20100101 Firefox/95.0
      1 Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36
      1 Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b3pre) Gecko/2008010415 Firefox/52.7.0
      1 Mozilla/5.0 (X11; U; Linux i686; fr; rv:1.9.0.1) Gecko/2008070206 Firefox/51.0
      1 okhttp/4.10.0
      1 python-requests/2.25.1
      1 SerendeputyBot/0.8.6 (http://serendeputy.com/about/serendeputy-bot)
      1 Twitterbot/1.0
      2 com.apple.WebKit.Networking/18615.3.12.11.2 CFNetwork/1410.0.3 Darwin/22.6.0
      2 com.apple.WebKit.Networking/8614.2.9.0.11 CFNetwork/1399 Darwin/22.1.0
      2 com.apple.WebKit.Networking/8614.3.7.0.6 CFNetwork/1402.0.8 Darwin/22.2.0
      2 com.apple.WebKit.Networking/8615.1.26.100.1 CFNetwork/1406.0.4 Darwin/22.4.0
      2 com.apple.WebKit.Networking/8616.1.14.10.12 CFNetwork/1458.2.2 Darwin/23.0.0
      2 magpie-crawler/1.1 (robots-txt-checker; +http://www.brandwatch.net)
      2 Mozilla/5.0 (compatible)
      2 Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)
      2 Mozilla/5.0 (compatible) SemanticScholarBot (+https://www.semanticscholar.org/crawler)
      2 Mozilla/5.0 (compatible; SeznamBot/4.0; +http://napoveda.seznam.cz/seznambot-intro/)
      2 Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
      2 Mozilla/5.0 (iPhone; CPU iPhone OS 10_0 like Mac OS X) AppleWebKit/602.1.38 (KHTML, like Gecko) Version/10.0 Mobile/14A5297c Safari/602.1
      2 Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.3
      2 Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; [email protected])
      2 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36
      2 Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/114.0
      2 Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0
      2 node-fetch
      2 omgili/0.5 +http://omgili.com
      2 Tiny Tiny RSS/21.05-326850845 (http://tt-rss.org/)
      3 DuckDuckGo/5 (com.duckduckgo.mobile.android; Android API 32)
      3 Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/103.0.5060.134 Safari/537.36
      3 Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
      3 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36
      3 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/116.0
      4 com.apple.WebKit.Networking/8615.2.9.10.4 CFNetwork/1408.0.4 Darwin/22.5.0
      4 com.apple.WebKit.Networking/8616.1.24.10.2 CFNetwork/1469 Darwin/23.0.0
      4 curl/7.81.0
      4 facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
      4 MobileSafari/8615.2.9.10.3 CFNetwork/1408.0.4 Darwin/22.5.0
      4 Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
      4 Mozilla/5.0 (Linux; arm_64; Android 12; Pixel 3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 YaBrowser/23.7.2.98.00 SA/3 Mobile Safari/537.36
      4 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36
      4 Safari/18615.2.9.11.4 CFNetwork/1408.0.4 Darwin/22.5.0
      4 Safari/19616.1.24.11.3 CFNetwork/1469 Darwin/23.0.0
      4 Twingly Recon
      5 DuckDuckGo/5 (com.duckduckgo.mobile.android; Android API 29)
      5 DuckDuckGo/5 (com.duckduckgo.mobile.android; Android API 30)
      5 LinkPreview/1.6 (https://www.linkpreview.net)
      5 Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36
      6 DuckDuckGo/5 (com.duckduckgo.mobile.android; Android API 31)
      6 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 (KHTML, like Gecko) Version/9.0.1 Safari/601.2.4 facebookexternalhit/1.1 Facebot Twitterbot/1.0
      6 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0
      6 Safari/18615.1.26.110.1 CFNetwork/1406.0.4 Darwin/22.4.0
      8 com.apple.WebKit.Networking/8614.2.9.0.10 CFNetwork/1399 Darwin/22.1.0
      8 Mozilla/5.0 (compatible; Twingly Recon; twingly.com)
      8 Safari/19616.1.26.11.3 CFNetwork/1474 Darwin/23.0.0
     10 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36 Edg/115.0.1901.203
     12 com.apple.WebKit.Networking/8616.1.26.10.2 CFNetwork/1474 Darwin/23.0.0
     12 Mozilla/5.0 (iPhone; CPU iPhone OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 DuckDuckGo/7 Safari/605.1.15
     12 Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko
     13 Mozilla/5.0 (compatible; Qwantify-prod/1.0; +https://help.qwant.com/bot/)
     14 Mozilla/5.0 (Windows; U; Windows NT 6.1; ru; rv:1.9.2b5) Gecko/20091204 Firefox/3.6b5
     16 MobileSafari/8614.2.9.0.10 CFNetwork/1399 Darwin/22.1.0
     16 MobileSafari/8616.1.26.10.2 CFNetwork/1474 Darwin/23.0.0
     18 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36
     21 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36
     22 DuckDuckGo/5 (com.duckduckgo.mobile.android; Android API 33)
     29 MobileSafari/8615.2.9.10.4 CFNetwork/1408.0.4 Darwin/22.5.0
     60 Safari/18615.2.9.11.10 CFNetwork/1408.0.4 Darwin/22.5.0
     62 Safari/18615.3.12.11.2 CFNetwork/1410.0.3 Darwin/22.6.0
     63 Go-http-client/2.0
     96 com.apple.WebKit.Networking/8615.3.12.10.2 CFNetwork/1410.0.3 Darwin/22.6.0
    112 MobileSafari/8615.3.12.10.2 CFNetwork/1410.0.3 Darwin/22.6.0
    158 com.apple.WebKit.Networking/8615.2.9.10.6 CFNetwork/1408.0.4 Darwin/22.5.0
  10972 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36 AppEngine-Google; (+http://code.google.com/appengine; appid: s~feedly-nikon3)

@lovasoa changed the title from "actix-web returns 400 bad request for htp requests emitted by many user agents" to "actix-web returns 400 bad request for http requests emitted by many user agents" on Aug 14, 2023
@rustrust

Does h2spec not test for this?

@joelwurtz
Contributor

FYI:

I have made 2 pull requests in order to make it work in actix http

With both of these changes it works fine (so no change is needed in the actix http crate)
