Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent HTTP/2 requests to different hosts sharing the same IP address return the same response for all requests #110

Closed
stefan-guggisberg opened this issue Jun 15, 2020 · 5 comments
Labels

Comments

@stefan-guggisberg
Copy link
Contributor

Concurrent requests to subdomains resolving to the same IP address, e.g.

return the same response for all requests.

The problem only occurs when using HTTP/2.

To reproduce:

const crypto = require('crypto');
const { fetch } = require('fetch-h2');

async function request(url) {
  const res = await fetch(url);
  const data = await res.text();
  const md5 = crypto.createHash('md5').update(data).digest().toString('hex');
  console.log(`HTTP Version: ${res.httpVersion}, hash of ${url} response: ${md5}`);
}

async function run() {
  await Promise.all([
    request('https://already--grantila.hlx.page/README.html'),
    request('https://fetch-h2--grantila.hlx.page/README.html'),
  ]);
}

run().catch(console.error);

Running the above code will output identical md5 hash values for both responses.

Workarounds:

  • force HTTP1(.1) protocol:
    const { fetch } = require('fetch-h2').context({ httpsProtocols: ['http1'] });
  • explicitly set Host header:
    const res = await fetch(url, {
      headers: {
        host: new URL(url).host,
      },
      allowForbiddenHeaders: true,
    });
@stefan-guggisberg stefan-guggisberg changed the title Concurrent HTTP/2 requests to subdomains return the same response for all requests Concurrent HTTP/2 requests to different hosts sharing the same IP address return the same response for all requests Jun 15, 2020
@grantila
Copy link
Owner

Thanks for a great bug report 👌

However, I don't think this is a bug in fetch-h2 but rather in the server.
In HTTP/2 you're supposed to re-use one socket/session if possible, and that's controlled e.g. by ssl certs' subjectAltName field. If I curl, I see:

*  subjectAltName: host "already--grantila.hlx.page" matched cert's "*.hlx.page"

so every concurrent call to *.hlx.page should re-use the same socket.

However, that's not the problem here, the problem is that the server looks at the Host header to know what to do, but that's not really how to do it in HTTP/2, then the :authority pseudo-header should be looked at. From the HTTP/2 spec under 8.1.2.3 Request Pseudo-Header Fields:

The :authority pseudo-header field includes the authority portion of the target URI ([RFC3986], Section 3.2). The authority MUST NOT include the deprecated userinfo subcomponent for http or https schemed URIs.

To ensure that the HTTP/1.1 request line can be reproduced accurately, this pseudo-header field MUST be omitted when translating from an HTTP/1.1 request that has a request target in origin or asterisk form (see [RFC7230], Section 5.3). Clients that generate HTTP/2 requests directly SHOULD use the :authority pseudo-header field instead of the Host header field. An intermediary that converts an HTTP/2 request to HTTP/1.1 MUST create a Host header field if one is not present in a request by copying the value of the :authority pseudo-header field.

This part; Clients that generate HTTP/2 requests directly SHOULD use the :authority pseudo-header field instead of the Host header field.

This means that a server which understands HTTP/2, and which uses the hostname to figure out what to do, must look at the :authority if it's HTTP/2, and Host if it's HTTP/1.1. Since your example works with the Host set, I assume the server doesn't look at :authority.

hlx.page doesn't allow fetch calls from the browser, complains on CORS. It would be interesting to see if that exact code you gave, worked in the browser (and in which browsers).

I'm closing this because I'm relatively sure fetch-h2 is doing the right thing here (since #9), but if this appears as a problem for many servers, perhaps also sending a Host field is an option, although I'd like to know how other clients handle this first.

@stefan-guggisberg
Copy link
Contributor Author

I am pretty sure it's not a server issue. The server is Fastly (CDN).

Since your example works with the Host set, I assume the server doesn't look at :authority.

No, the :authority is set (via the Host header, see #9).

I believe the problem is that fetch-h2 does not set the :authority pseudo header when using the HTTP/2 protocol.

There might also be a problem with subdomains sharing the same IP address. It's common that different subdomains share the same IP address (e.g. server1.acme.com and server2.acme.com). They are still separate hosts.

Maybe sessions are internally identified by IP address & port instead of :authority?

Please consider reopening this issue.

@grantila
Copy link
Owner

I just wrote a test against github.io (github pages) and you seem to be right. That's certainly not good!

@grantila grantila reopened this Aug 20, 2020
@grantila
Copy link
Owner

I think I've fixed this now, but testing against hlx.page is really slow (takes 3-4 seconds to read the text), and concurrent requests stall. I'm not sure why this happens, it works well for concurrent {lang}.wikipedia.org requests (which also uses a wildcard SAN of *.wikipedia.org), so session/socket re-use and :authority header should now be set correctly.

@github-actions
Copy link

🎉 This issue has been resolved in version 2.5.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants