-
Notifications
You must be signed in to change notification settings - Fork 239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
G1: Add proposal for true binary metadata #19
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
True Binary Metadata | ||
---- | ||
* Author(s): ctiller | ||
* Approver: ejona | ||
* Status: Draft | ||
* Implemented in: n/a | ||
* Last updated: 3/29/17 | ||
* Discussion at: https://groups.google.com/forum/#!topic/grpc-io/Ww8waYz_Nes | ||
|
||
## Abstract | ||
|
||
Add a HTTP2 extension to allow binary metadata (those suffixed with -bin) to be | ||
sent without the base64 encode/decode step. | ||
|
||
## Background | ||
|
||
gRPC allows binary metadata to be sent by applications. These metadata elements | ||
are keyed with a -bin suffix. When transmitted on the wire, since HTTP2 does not | ||
allow binary headers, we base64 encode these elements and Huffman compress them. | ||
On receipt, we transparently reverse the transformation. | ||
|
||
This transformation is costly in terms of CPU, and there exist use cases where | ||
this transformation can become the CPU bottleneck for gRPC. | ||
|
||
### Related Proposals: | ||
|
||
n/a | ||
|
||
## Proposal | ||
|
||
### New setting | ||
|
||
Expose a custom setting in our HTTP2 settings exchange: | ||
GRPC_ALLOW_TRUE_BINARY_METADATA = 0xfe03. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This doesn't seem gRPC-specific enough to justify There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That it only applies to -bin headers (which are a gRPC thing) makes it gRPC-specific enough (IMO) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why only There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So gRPC uses -bin to specify that a header is binary, and says that other metadata must be ascii (per http2). That -bin has semantic meaning (foo-bin is a different header than foo). We use that as a signal to base64 encode things right now. Under this proposal we'd just switch to an alternate encoding. Since we can't universally make all of the HTTP2 ecosystem take binary metadata, a mechanism to allow transmitting anything and dropping the -bin isn't possible for us (we need to keep interoperating - so proxies that support this would potentially need to do the base64 conversion if one side didn't support this mechanism). If there were an extension that allowed sending binary values in metadata, we'd still end up needing a disambiguating byte for backwards compatibility (for the gRPC -bin suffixed headers), and we'd still maintain the requirement that non -bin headers must be ascii. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are you aware of any HTTP/2 proxies that would reject requests because of "true binary" encoding of the header field? Unless it's a well-known header that must be processed by the proxy, most (all?) will just pass-through headers without ever looking at the content of the header fields. Don't get me wrong, I'm all in for allowing binary header fields, but in order get this accepted into HTTP/2 ecosystem, this proposal should be a little more than gRPC implementation detail, that's why I believe that standardizing on " There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To answer this myself, RFC7540, sec. 10.3. Intermediary Encapsulation Attacks says:
I also did some digging on the http-wg mailing list and it seems that sending binary blobs in header fields was discussed during HTTP/2 standardization, but they eventually decided against it, because apparently base64 in HPACK is "good enough". |
||
|
||
This setting is randomly chosen (to avoid conflicts with other extensions), and | ||
within the experimental range of HTTP extensions (see https://tools.ietf.org/html/rfc7540#section-11.3). | ||
|
||
The setting can have the values 0 (default) or 1. If the setting is 1, then | ||
peers MAY use a 'true binary' encoding (described below), instead of the current | ||
base64 encoding for -bin metadata. | ||
|
||
Implementations SHOULD transmit this setting only once, and as part of the first | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's right. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ... or they can just not advertise support for this feature. |
||
settings frame. | ||
|
||
### 'True binary' encoding | ||
|
||
When transmitting metadata on a connection where the peer has specified | ||
GRPC_ALLOW_TRUE_BINARY_METADATA, instead of encoding using base64, an | ||
implementation MAY instead prefix a NUL byte to the metadata and transmit the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. s/metadata/metadata value/. Or maybe "header field value". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
metadata value in binary form. | ||
|
||
Since this is a HTTP2 extension and other extensions might alias this extension | ||
id, it's possible that this becomes misconfigured. In that case, peers are | ||
required to RST_STREAM with http error PROTOCOL_ERROR (as required by https://tools.ietf.org/html/rfc7540#section-10.3). | ||
If a binary encoding was attempted and such a RST_STREAM is received without any other headers, | ||
implementations SHOULD retry the request with base64 encoding, and disable | ||
binary encoding for future requests. Verbosely logging this condition is | ||
encouraged. | ||
|
||
### Examples | ||
|
||
Suppose we wanted to send metadata element 'foo-bin: 0x01' (ie a single byte | ||
containing '1'). | ||
|
||
Under base64, we'd send a http header 'foo-bin: AQ'. | ||
Under binary, we'd send 'foo-bin: 0x00 0x01' (ie prefixing a NUL byte and then | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do we need I actually spent some time on Wednesday looking into this (for different reasons), and I couldn't find anything in the spec that forbids binary header values. This was a limitation in SPDY, which used There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. https://tools.ietf.org/html/rfc7540#section-8.1 says that https://tools.ietf.org/html/rfc7230#section-3.2 is normative, which does not allow binary headers. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ...which means that My point is that nothing in HPACK prevents it from transmitting binary data, so we could add generic But I guess There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. -bin is not going away so long as there are clients, servers, and intermediaries that do not support binary headers. |
||
sending the binary metadata value.) | ||
|
||
## Rationale | ||
|
||
Binary metadata transmission performance is critical for a number of | ||
applications. | ||
|
||
Various workarounds were considered: | ||
1. Switching to base16 - this would require a backwards incompatible protocol | ||
change for gRPC, and has the disadvantage of bloating wire size, which | ||
additionally interacts badly with hpack. | ||
2. Adding a new suffix (-raw): this leaks further implementation details to | ||
application developers, and likely would still need a base64 workaround | ||
|
||
## Implementation | ||
|
||
A trial implementation in gRPC C core is underway and expected to be ready in | ||
coming weeks. | ||
|
||
## Open issues (if applicable) | ||
|
||
n/a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some data here (i.e. metadata to payload ratio, metadata size, how much CPU does base64 encoding, base64 decoding, Huffman encoding and Huffman decoding each take)? It makes a lot of difference whether we're talking about 2% or 50% of total CPU processing time