-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
http2: improve perf of passing headers to C++ #14723
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonderful work!
src/node_http2.cc
Outdated
|
||
CHECK_EQ(StringBytes::Write(isolate, | ||
header_contents, header_string_len, | ||
header_string, ASCII), header_string_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using V8's String API should suffice (and be preferred) here, and it doesn't have the overhead StringBytes has.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes – done!
src/node_http2.h
Outdated
} | ||
|
||
private: | ||
MaybeStackBuffer<nghttp2_nv> headers_; | ||
size_t count_; | ||
MaybeStackBuffer<char> buf_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be appropriate to enlarge the default allocated stack size, as the size is determined by the type and a constant. nghttp2_nv
is obviously bigger than a char
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this buffer was really oversized before. I don’t know what the typical number of headers and their sizes would be, but I’m pretty sure 1024 headers were excessive :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this buffer is expected to contain the full string of headers, it may need to be quite a bit larger. There are real world cases of applications sending headers with > 1 MB of data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but yes, 1024 header pairs is.... excessive ;-) (I'd just never got around to tuning the size of it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about alignment? A compound type like nghttp2_nv
has more strict alignment requirements than char
. It will probably work out alright in practice but by accident, not by design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this buffer is expected to contain the full string of headers, it may need to be quite a bit larger. There are real world cases of applications sending headers with > 1 MB of data.
It is expected to be able hold the full string of headers often enough. I’ve bumped this to 3000 bytes by default, that ought to be enough for everyone ;) Even if not, the worst things that happens is a single malloc()
+ free()
combination.
What about alignment? A compound type like
nghttp2_nv
has more strict alignment requirements thanchar
.
Yeah, I guess it would always work, but I’ve added code to account for that concern now anyway.
if (n >= count_) { | ||
// This can happen if a passed header contained a null byte. In that | ||
// case, just provide nghttp2 with an invalid header to make it reject | ||
// the headers list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to have a test case against this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There already is :) test/parallel/test-http2-client-unescaped-path.js
, if you want to have a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather inserting a fake header, can we simply abort and set an invalid
flag on the class that we can check for before using the Headers
instance? That way we can return a proper error code back to the js layer without having to attempt sending the headers at all. Should be faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasnell Yes, we could do that, but is that really a case that we want to optimize for? Plus, it would create inconsistency with how we handle other invalid headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nah, you're right, this works.
@@ -415,16 +417,18 @@ function mapToHeaders(map, | |||
if (isArray) { | |||
for (var k = 0; k < value.length; k++) { | |||
val = String(value[k]); | |||
ret.push([key, val]); | |||
ret += `${key}\0${val}\0`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This presupposes that neither the key nor the val contain null characters. That may not be a safe assumption to make
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, saw that as I read further :-)
There's one consideration we need to make on this before going this route. The HTTP/2 header compression mechanism allows the sender to decide, on a header-by-header basis, whether or not the header pair should be stored in the stateful header compression table (it's one of the flags that can be set on the |
} else { | ||
val = String(value); | ||
ret.push([key, val]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A big part of the reason this was an array is because there will be a third value that needs to be passed in at some point... specifically, a flag that indicates whether or not the header pair should be stored in the header compression table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we can add another NUL-delimited column if we were to implement that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could even just add a single character/byte per header to the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
single byte (bit-flag) per header should work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, why don't we go ahead and add that extra byte now as a reserved field in the structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One idea for the API... right now headers are set as an object.. e.g.
{
':status': 200,
'content-type': 'text/plain',
'foo': 'bar'
}
We could (not saying should) allow the value to be wrapped somehow... e.g.
{
':status': 200,
'content-type': http2.flaggedHeader('text/plain', { store: false }),
'foo': 'bar'
}
This feels a bit awkward but it optimizes for the common case and API familiarity and only adds the weirdness when someone really does want to use the more advanced feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, alternatively... (just brainstorming here ;-) ...)
{
':status': 200,
'foo': 'bar',
[http2.noStore]: {
'content-type': 'text/plain'
}
}
Where 'http2.noStore
is a Symbol
. This still feels weird, but it eliminates the potential polymorphism in the default case.
The other option is to change the headers API entirely and use a Headers
object... e.g.
const headers = new http2.Headers();
headers.set(':status', 200);
headers.set('content-type', 'text/plain', { store: false });
headers.set('foo', 'bar');
This has it's whole own set of issues, including being slower than using an object literal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasnell I do like the second approach, for the reason you mentioned. :) Do you think we might ever want to add more options than NGHTTP2_NV_FLAG_NO_INDEX
? I don’t the NO_COPY
flags would be set by userland, so we’re good on that, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The NO_INDEX
flag is likely the only one we're going to need. I would be quite surprised if any new flag was added here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasnell Prototyped this out at addaleax/node@bb2fb6c, but I’d say it’s different enough to leave it for another PR.
mapToHeaders(headers), | ||
[ ':path_abc_:status_200_abc_1_xyz_1_xyz_2_xyz_3_xyz_4_bar_1_' | ||
.replace(/_/g, '\0'), | ||
8 ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using mapToHeaders(headers).split('\0')
and keeping the second argument as an array (albeit not an array of arrays) may be easier to read
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Array#join was a bit easier, but yes, good idea – done!
By passing a single string rather than many small ones and a single block allocation for passing headers, save expensive interactions with JS values and memory allocations.
cbc6729
to
a9d9f6d
Compare
If you have suggestions for how that might be implemented in the API, I volunteer. :) |
lib/internal/http2/util.js
Outdated
} | ||
} | ||
} | ||
|
||
return ret; | ||
return [ ret, count ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should go for consistent styling here (no spaces). By the way, I would actually like to enable linting against these with array-bracket-spacing: [error, never]
(only 7 errors in lib/*
, but quite some churn in test
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, done. I’d like to point out that, while I know arrays and objects are different things, we just enabled the opposite lint rule for objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, but as far as I know the house style is still [a, b, c]
. To be honest, I personally prefer [ a, b, c ]
over [a, b, c]
, but enforcing spacing within arrays would cause over 7400 errors within our code. But as the majority seems to prefer arrays without spaces anyway, I would go for consistency, even though @refack warned me that a PR might not be too successful, depending on the churn.
Another something to keep in mind, currently, we take the strings and copy them into the nghttp2_nv structure, pass those off to nghttp2, then free the structure allocations. Because nghttp2 may not immediately send those headers, by default it will copy the name and value into it's own internal buffer that is freed when it actually sends the header pairs. That means for every header pair, there are two copies made. We can eliminate the need for the extra copy by managing that memory ourselves and telling nghttp2 not to copy the name and value using a flag on the nghttp2_nv struct. We would need to add a callback to know when the headers are sent so we know when to free, but it could help shave off some time. It's worth looking into. |
Heh... that's more difficult. It would completely change the way headers are currently set and make it inconsistent with the way they are handled in the http/1 API. I'd played around with this a bit early on and decided to back off for the time being. Essentially, if we want to support this, then we'll need to go with a new API design. |
// Make sure the start address is aligned appropriately for an nghttp2_nv*. | ||
char* start = reinterpret_cast<char*>( | ||
ROUND_UP(reinterpret_cast<uintptr_t>(*buf_), alignof(nghttp2_nv))); | ||
char* header_contents = start + (count_ * sizeof(nghttp2_nv)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parens aren't necessary here. (They don't hurt either, of course.)
@jasnell Did some reading up & thinking about
I’m landing this PR soon, I’ll be working a bit on the above, and we’ll see where we get. |
Agreed on all points. The key issue will be the ergonomics of the api we expose. |
@addaleax can you link your research? I'm unconvinced that we should not store cookies in the compressed headers. I do not see an attack scenario where storying cookies would couse less security that not storying them (or using HTTP1). There are little best practices around HTTP2, so we can expect all of this to change. IMHO we should not make any decisions for our users on both layers, but provide an API (this is going to be hard, I agree), to help them help themselves. |
@mcollina That would be mostly going by https://http2.github.io/http2-spec/compression.html#compression.based.attacks, which does explicitly mention
I think this expectation makes support for the flag a requirement for a full-featured HTTP/2 API.
I am not convinced either but would prefer to err on the side of caution in security issues.
I think for the compat API that’s literally impossible because we’re always adding a bit of information that is not present in the HTTP/1 API, no matter how implicitly or explicitly we do that. ;)
I don’t think anybody is disagreeing with you on this :) |
I don’t mind continuing this discussion here, but to say something on-topic for a change: Landed in 348dd66 ;) |
By passing a single string rather than many small ones and a single block allocation for passing headers, save expensive interactions with JS values and memory allocations. PR-URL: #14723 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: James M Snell <[email protected]>
By passing a single string rather than many small ones and a single block allocation for passing headers, save expensive interactions with JS values and memory allocations. PR-URL: nodejs#14723 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: James M Snell <[email protected]>
By passing a single string rather than many small ones and a single block allocation for passing headers, save expensive interactions with JS values and memory allocations. PR-URL: #14723 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: James M Snell <[email protected]>
By passing a single string rather than many small ones and a single block allocation for passing headers, save expensive interactions with JS values and memory allocations.
(Note: The benchmark has a lot of overhead since it does full request/response cycles. I don’t see any way around that right now, but the actual improvement of the changed code should be significantly higher than what’s displayed here.)
This is blocked by #14688 because the implementation for trailers isn’t changed here, but in that PR the two are unified so we won’t have to bother about that after that lands.
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes (after src,http2: DRY header/trailer handling code up #14688)Affected core subsystem(s)
http2
/cc @nodejs/http2