
SOLR-17240: Fix Http2SolrClient maxConnectionsPerDestination usage #2410

Open · wants to merge 2 commits into main
Conversation

@HoustonPutman
Contributor

@markrmiller
Member

Oh wow, I thought it was just defaulting to 4 and you couldn't necessarily change that everywhere in internal use, because I saw setMaxConnectionsPerHost in the builder. I didn't realize that it is a complete no-op in http2 mode. That seems like a silly choice in any scenario.

@magibney
Contributor

The Javadoc on HttpSolrClientBuilderBase.withMaxConnectionsPerHost(int) should be updated to explain the new difference in how this is handled for http1 vs http2.

I wonder, to approximate similar high-level behavior for http1 vs http2, could the http2 client, under the hood, interpret this setting as ((maxConnectionsPerDest - 1) / 300) + 1, or something? I don't like having that 300 constant hardcoded there, as presumably the multiplexing factor in jetty could change ... but idk, something along those lines might address Sanjay's concern.
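The mapping suggested above could look something like the following rough sketch. The class and method names (`Http2ConnectionMath`, `deriveMaxConnections`) are hypothetical, and 300 is just the placeholder constant from the comment, standing in for whatever multiplexing factor Jetty actually uses:

```java
// Hypothetical helper sketching the suggested mapping: given an HTTP/1-style
// maxConnectionsPerHost, derive how many multiplexed HTTP/2 connections would
// provide roughly the same total concurrency.
public final class Http2ConnectionMath {
    public static int deriveMaxConnections(int maxConnectionsPerHost, int streamsPerConnection) {
        // Ceiling division with a floor of 1: ((n - 1) / k) + 1
        return ((maxConnectionsPerHost - 1) / streamsPerConnection) + 1;
    }
}
```

With a multiplexing factor of 300, anything up to 300 requested connections maps to a single HTTP/2 connection, 301 through 600 maps to two, and so on.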

@HoustonPutman
Contributor Author

OK, I've updated the docs and used maxConcurrentPushedStreams to get the multiplexing factor.

This should probably be good to go?
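For context, the Jetty knobs in play look roughly like this. This is a sketch assuming Jetty's 9.4/10 client API (HTTP2Client, HttpClientTransportOverHTTP2), not the actual SolrJ change in this PR:

```java
import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.http2.client.HTTP2Client;
import org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2;

public class Http2ClientConfigSketch {
    public static HttpClient build(int maxConnectionsPerDestination) throws Exception {
        HTTP2Client http2Client = new HTTP2Client();
        HttpClient httpClient = new HttpClient(new HttpClientTransportOverHTTP2(http2Client));
        // With HTTP/2, each connection multiplexes many streams, so this limit
        // has a very different effect than the HTTP/1 maxConnectionsPerHost.
        httpClient.setMaxConnectionsPerDestination(maxConnectionsPerDestination);
        httpClient.start();
        return httpClient;
    }
}
```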

@markrmiller
Member

> under the hood, interpret this setting as ((maxConnectionsPerDest - 1) / 300) + 1, or something? I don't like having that 300 constant hardcoded there, as presumably the multiplexing factor in jetty could change ... but idk, something along those lines might address Sanjay's concern.

I don't think you want to do that; this should do what it says and set the max connections, not relate to multiplexing. Multiplexing should be its own setter if you want it to be configurable, but trying to tie the two together really wouldn't do the user any favors.

@HoustonPutman
Contributor Author

> > under the hood, interpret this setting as ((maxConnectionsPerDest - 1) / 300) + 1, or something? I don't like having that 300 constant hardcoded there, as presumably the multiplexing factor in jetty could change ... but idk, something along those lines might address Sanjay's concern.
>
> I don't think you want to do that; this should do what it says and set the max connections, not relate to multiplexing. Multiplexing should be its own setter if you want it to be configurable, but trying to tie the two together really wouldn't do the user any favors.

Fair point, Mark. I'll make these things independent, add documentation, and make the defaults across Solr sensible for both http1 and http2.

@epugh
Contributor

epugh commented Jun 19, 2024

Anything I can do to help this move forward? (Testing wise!).

@iamsanjay
Contributor

I came across an article where they tested various combinations of concurrent streams and number of HTTP connections.

https://blog.vespa.ai/http2/

> To compare the throughput of HTTP/1.1 vs HTTP/2 in the Vespa container, we measured the rate of HTTP requests (QPS) we could obtain on a single node with 48 CPU cores and 256GB RAM, using h2load as a benchmarking tool. As expected, single connection throughput increased significantly with HTTP/2. HTTP/2 with 256 concurrent streams gave a throughput of 115 000 requests per seconds compared to 6 500 for HTTP/1.1. Increasing the number of connections from 1 to 128 increased throughput to 115 000 for HTTP/1.1, while HTTP/2 gave 125 000 with the same setup (without request multiplexing). HTTP/2 was also more efficient, with lower CPU utilization—a consequence of its compact protocol representation and header compression. The highest throughput for HTTP/2 was observed with 4 clients and 256 concurrent streams. This configuration resulted in 225 000 requests per seconds—roughly double the best case for HTTP/1 (128 connections). The h2load tool was CPU constrained for the single connection benchmark as it could only utilize a single CPU core per connection. Having 4 connections removed the CPU bottleneck. Increasing beyond 4 connections resulted in gradually more overhead and degraded throughput.

For them, 4 clients with 256 concurrent streams gave the highest throughput.

We should test different configurations to find the one that works well for us. Also, I believe most of the recovery code in the main branch now uses HTTP/2, so we could start by testing that part.

NOTE: maxConcurrentStreams is a server-level configuration.
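Since maxConcurrentStreams is advertised by the server, configuring it lives on the server side. As a sketch assuming Jetty's HTTP2ServerConnectionFactory API (a plain Jetty server here, not Solr's actual startup path):

```java
import org.eclipse.jetty.http2.server.HTTP2CServerConnectionFactory;
import org.eclipse.jetty.server.HttpConfiguration;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;

public class Http2ServerConfigSketch {
    public static Server build(int port, int maxConcurrentStreams) {
        Server server = new Server();
        HTTP2CServerConnectionFactory h2c =
            new HTTP2CServerConnectionFactory(new HttpConfiguration());
        // The server advertises this limit via HTTP/2 SETTINGS; clients cannot
        // open more concurrent streams per connection than this.
        h2c.setMaxConcurrentStreams(maxConcurrentStreams);
        ServerConnector connector = new ServerConnector(server, h2c);
        connector.setPort(port);
        server.addConnector(connector);
        return server;
    }
}
```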


This PR has had no activity for 60 days and is now labeled as stale. Any new activity or converting it to draft will remove the stale label. To attract more reviewers, please tag people who might be familiar with the code area and/or notify the [email protected] mailing list. Thank you for your contribution!

@github-actions github-actions bot added the stale PR not updated in 60 days label Aug 20, 2024
@epugh
Contributor

epugh commented Aug 20, 2024

@iamsanjay did you get a chance to look at this one from a testing perspective? Seems like such a great ticket...

@github-actions github-actions bot removed the stale PR not updated in 60 days label Aug 23, 2024
@dsmiley
Contributor

dsmiley commented Sep 6, 2024

Curious if this change has been tested/deployed in any way?


This PR has had no activity for 60 days and is now labeled as stale. Any new activity will remove the stale label. To attract more reviewers, please tag people who might be familiar with the code area and/or notify the [email protected] mailing list. To exempt this PR from being marked as stale, make it a draft PR or add the label "exempt-stale". If left unattended, this PR will be closed after another 60 days of inactivity. Thank you for your contribution!

@github-actions github-actions bot added the stale PR not updated in 60 days label Nov 19, 2024
Labels: client:solrj, stale (PR not updated in 60 days)

9 participants