Implement HTTP/1.1 support. #75

Lukasa · 2014-08-29T13:48:12Z

Over time it has become increasingly clear that httplib/http.client are a liability to requests/urllib3. Given that we'll need an HTTP/1.1 stack anyway, we should look into writing a new one rather than building on httplib.

This issue is a long-term goal, but is open to track the desired features from such a rewrite. This should be a list of mistakes that httplib has made that we should not make.

Initial list:

Configurable file-like-object streaming size. httplib streams file objects much slower than cURL does because it uses quite small chunks. We should make this faster.
Support for 1XX responses.
Better API.
Support for not ruining file-objects when connections fall apart.
No support for HTTP/1.0 or earlier.
A clean mechanism for Upgrade:, such that the socket is provided in a known-good state along with information about the connection.
Better separation of socket and parser logic.
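
As an illustration of the first item, here is a minimal sketch of streaming a file-like object with a caller-chosen chunk size. The names `iter_file_chunks` and `send_body`, and the 64 KiB default, are assumptions made for this example; they are not part of httplib or hyper.

```python
import io


def iter_file_chunks(fileobj, chunk_size=64 * 1024):
    """Yield successive chunks from a file-like object until EOF."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


def send_body(sock_like, fileobj, chunk_size=64 * 1024):
    """Send a file-like body through sock_like.sendall; return bytes sent.

    sock_like is anything with a sendall(bytes) method (hypothetical
    stand-in for a real socket).
    """
    total = 0
    for chunk in iter_file_chunks(fileobj, chunk_size):
        sock_like.sendall(chunk)
        total += len(chunk)
    return total
```

Exposing `chunk_size` as a parameter lets callers trade memory for throughput instead of being stuck with a hard-coded small value.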

/cc @sigmavirus24 @shazow for more ideas.

dimaqq · 2014-08-29T14:22:45Z

IIRC requests uses urllib3 for that very purpose. For example, httplib.HTTPConnection reads the response header one byte at a time, while requests (via urllib3) reads 8K at a time.

In other words, why not use urllib3?

Lukasa · 2014-08-29T14:26:47Z

@dimaqq You have the abstraction layer backwards.

The hierarchy is supposed to be: requests -> urllib3 -> hyper. urllib3 should build on top of us, not the other way around.

Note that because of its reliance on httplib urllib3 is subject to all the limitations I mentioned above.

dstufft · 2014-08-29T15:47:23Z

Here's a thing: httplib is poorly factored. There is basically zero reason why an HTTP library should have its connection management/socket code entwined with its HTTP parser.
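
To make the separation concrete, here is a rough sketch of a response-head parser that only consumes bytes and never touches a socket. The class name `ResponseHeadParser` and its `feed`/`leftover` interface are hypothetical, chosen for illustration, and the parsing is deliberately simplified (no header folding, no validation).

```python
class ResponseHeadParser:
    """Parses a status line and headers from bytes; performs no I/O."""

    def __init__(self):
        self._buffer = b""
        self.leftover = b""  # bytes received after the end of the head

    def feed(self, data):
        """Add bytes. Return (status, reason, headers) once the head is
        complete, or None if more data is needed."""
        self._buffer += data
        head, sep, rest = self._buffer.partition(b"\r\n\r\n")
        if not sep:
            return None  # blank line terminating the head not seen yet
        self.leftover = rest
        self._buffer = b""
        lines = head.split(b"\r\n")
        parts = lines[0].split(b" ", 2)
        if len(parts) < 2:
            raise ValueError("malformed status line")
        status = int(parts[1])
        reason = parts[2] if len(parts) == 3 else b""
        headers = []
        for line in lines[1:]:
            name, _, value = line.partition(b":")
            headers.append((name.strip().lower(), value.strip()))
        return status, reason, headers
```

Because the parser is handed bytes by the caller, the same code can sit behind a blocking socket, an event loop, or a test that feeds it literal byte strings.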

piotr-dobrogost · 2014-09-01T08:37:09Z

Linking with urllib3/urllib3/issues/58 as closely related.

Lukasa · 2015-02-18T14:26:00Z

Ok, here's a proposed basic design principle.

HTTP/1.1 can be thought of as a special case of HTTP/2, with the following limitations:

Max concurrent streams forced to 1.
Header frames 'compress' into linewise output.
No frame headers.

This means we can conceptually implement HTTP/1.1 by having a special-case frame renderer. That allows the middle and top layers to be protocol-version agnostic, thinking in terms of streams, while the bottom layer simply changes how the data is rendered out. There are some requirements at the higher level to enforce certain behaviours (max concurrent streams etc.), but this represents probably the cleanest way to support both versions in the codebase, while allowing for transparent protocol version change.
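
A minimal sketch of what such a special-case renderer could look like for a request's header block, assuming the upper layers hand down HTTP/2-style (name, value) pairs including the `:method`, `:path` and `:authority` pseudo-headers. The function name `render_h1_request_head` is hypothetical and this is not the code that ended up in hyper.

```python
def render_h1_request_head(headers):
    """Render HTTP/2-style header pairs as an HTTP/1.1 request head.

    Pseudo-headers (names starting with ':') become the request line and
    the Host header; everything else is written one header per line.
    """
    pseudo = {}
    regular = []
    for name, value in headers:
        if name.startswith(":"):
            pseudo[name] = value
        else:
            regular.append((name, value))

    lines = ["{} {} HTTP/1.1".format(pseudo[":method"], pseudo[":path"])]
    if ":authority" in pseudo:
        lines.append("host: {}".format(pseudo[":authority"]))
    lines.extend("{}: {}".format(n, v) for n, v in regular)
    # The head ends with an empty line; there is no frame header at all.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

The layers above only ever build the list of pairs; swapping this renderer for an HTTP/2 HEADERS-frame encoder would change the bytes on the wire without changing the caller.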

Lukasa · 2015-03-01T10:06:42Z

@shazow Do you have any thoughts about what an ideal httplib replacement API would be?

shazow · 2015-03-01T18:25:36Z

Aside from how you build/call requests, one big pain point of httplib is the lack of clear, granular state for a given connection/request at any given time, along with poorly structured errors.

But yeah, I agree with @dstufft: it would be best if there was a way to give it some socket-like object and be like "ok treat this as HTTP v1.1, make request X to here", then have it let go of the object once the request is done.

Lukasa · 2015-03-01T18:27:40Z

What's the rationale for having the socket object not be owned by some kind of 'HTTP connection' object?

shazow · 2015-03-01T18:30:48Z

It can be owned by whatever if it makes sense for it to be in that specific context, but if you're managing your own sockets (e.g. urllib3) then all you want is something that knows the protocol you want to speak (http v1.1 or whatever).

The layers you have:

1. Socket setup and configuration (TLS wrapping, etc.)
2. Connection pooling
3. Retrying/timeouts/error handling
4. Request building and response reading
5. HTTP protocol

If in order to use 5 you need to relinquish 1 (and maybe others), then it makes the stuff in between very hard.

Also bonus points if it works with things other than sockets, for testing and other novel usecases.

Lukasa · 2015-03-01T18:44:32Z

Ok, so here's some notes.

Socket setup and configuration needs to be HTTP specific because of HTTP/2. Socket setup for TLS connections determines which type of HTTP can be spoken over that connection ahead of time. 1 and 5 are therefore tightly matched. All hyper's connection objects should be able to take socket objects provided from elsewhere, but they will always prefer to create them themselves.

I feel like the right abstraction here is for higher layers to manage connections, not sockets. A connection in this case is the local end of an HTTP state machine and its underlying transport (whatever that is). The state machine itself should know relatively little about the underlying transport: at most, it should believe it has send and recv methods (additional abstraction layers can be inserted if necessary).

What matters here, I think, is that hyper have a very clear semantic of what the transport layer should be for any connection (in terms of its API). Currently, that is fairly well defined: it needs a method called send, a method called recv, and a method called readline. This looks sockety, but is actually trivially defined in terms of other things (file wrappers, in-memory buffers, etc.), with the trickiest part being readline, which is really necessary so that the upper layers don't have to buffer data in order to work out where the hell a header line ends.
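
For example, the three-method transport described above can be satisfied by an in-memory buffer. This is a sketch only: the class name `BufferTransport` and the helper `read_header_lines` are made up for illustration, and only the method names send, recv and readline come from the description above.

```python
import io


class BufferTransport:
    """In-memory stand-in for a socket: send, recv and readline only."""

    def __init__(self, incoming=b""):
        self._incoming = io.BytesIO(incoming)  # bytes the "peer" sent us
        self.sent = bytearray()                # bytes we wrote out

    def send(self, data):
        self.sent += data
        return len(data)

    def recv(self, amt):
        return self._incoming.read(amt)

    def readline(self):
        # Returns one line including its trailing newline, or b"" at EOF.
        return self._incoming.readline()


def read_header_lines(transport):
    """Collect raw header lines up to the blank line that ends the head."""
    lines = []
    while True:
        line = transport.readline()
        if line in (b"\r\n", b"\n", b""):
            return lines
        lines.append(line)
```

With readline on the transport, the caller never has to buffer partial data to find where a header line ends, and tests can run without opening a socket.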

This would allow libraries like urllib3 to override the socket if necessary. However, once a socket has been handed to a hyper connection it really does need to own it from that point onward, because there is state inextricably tied up with it. What we should do is signal unambiguously whether a connection can safely be re-used or not, to make it easier to pool connections.
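
One possible shape for that reuse signal, sketched with hypothetical names (`ConnState`, `PooledConnection`, `release`): the connection tracks an explicit state, and a pool checks a single `reusable` property before taking the connection back.

```python
import enum


class ConnState(enum.Enum):
    IDLE = "idle"      # no exchange in progress; safe to reuse
    ACTIVE = "active"  # a request/response exchange is in flight
    CLOSED = "closed"  # transport is gone or in an unknown state


class PooledConnection:
    """Tracks just enough state to say whether reuse is safe."""

    def __init__(self):
        self.state = ConnState.IDLE

    def start_request(self):
        if self.state is not ConnState.IDLE:
            raise RuntimeError("connection is not idle")
        self.state = ConnState.ACTIVE

    def finish_response(self, keep_alive):
        # A fully read response on a persistent connection returns to idle.
        self.state = ConnState.IDLE if keep_alive else ConnState.CLOSED

    def fail(self):
        # Any error mid-exchange leaves the transport state unknown.
        self.state = ConnState.CLOSED

    @property
    def reusable(self):
        return self.state is ConnState.IDLE


def release(pool, conn):
    """Return conn to the pool only if it signals that reuse is safe."""
    if conn.reusable:
        pool.append(conn)
        return True
    return False
```

An explicit state like this also speaks to the earlier complaint about httplib lacking clear, granular connection state.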

Lukasa · 2015-03-10T20:35:42Z

Alright, see #92.

Lukasa · 2015-04-03T18:20:12Z

Merged! \o/

Lukasa added the Long-Term Goal label Aug 29, 2014

Lukasa closed this as completed Apr 3, 2015
