-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Quadratic performance of h2.connection.H2Connection._open_streams #1184
Comments
I've actually implemented a fix for this in #1185 |
I do plan to get to this, but I need to find a bit more time given I don't understand it yet. Thank you for looking into this and the fix though. |
For sure, let me know if there's anything I can help to clarify |
@pgjones anything that I can do to help push this along? Currently using my fork in production, which is making tens of millions of http2 requests/day, but would prefer switching to use upstream |
I'm seeing this too. |
First let me say this is an excellent library! Thanks for taking the time to maintain/update it.
While writing an http2 client library for tornado (I can hopefully share it soon ;) ), I ran into very poor performance while benchmarking the client with many concurrently open streams (5000+). Here's a cProfile output for starting/finishing 5000 concurrent requests:
As you can see, almost half of the run time is spent in
_open_streams
Here's the function code for reference:
The function loops through all streams, and returns the count of all open remote/local streams. As self.max_outbound_streams is called on each stream open in
send_headers
, the performance of this function is quadratic in the number of concurrently open streams over the lifetime of the program.And to check the numbers,
stream.py's
open()
is called per element inself.streams
. It was called11,113,314
times. The benchmark was run with 5000 concurrent streams, and sinceopen()
is called on opening a stream, the runtime isO(1 + 2 + ... + n) =~ (n * (n-1)) / 2
in the worst case if all streams remain open during this time. If we insert 5000 into that formula we get12,497,500
which is within 10% of the function calls we got from the cProfile output.Proposal: Update H2Connection to incrementally update the count of open local/remote streams, to make the function call O(1).
The logic here being that we are processing all events, so we know when we're processing the event if the stream should be closed/open, and can update counts incrementally when there is a state change. I'm not super familiar with the code base, but we should also be able to clean up
closed
streams on state change as well, or else make some other process for cleaning up close streams fromself.streams
.I'm happy to implement the change, just let me know if this is a path you'd be willing to go down, or if there is anything that makes this infeasible (totally possible! Started using this library two days ago :) )
The text was updated successfully, but these errors were encountered: