Reduce impact of backendRequests on latency #2530

joe-elliott · 2023-06-01T14:05:10Z

What this PR does:
Uses a channel to return jobs instead of returning them as a slice from backendRequests. This nicely improves performance for queries that create a huge number of jobs.
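
A minimal sketch of the difference between the two shapes, not the actual Tempo code. The `job` type, the function names, the `done` channel, and the buffer size of 1 are all assumptions made for illustration; the real change sends `*backendReqMsg` values on `reqCh` from inside `backendRequests`.

```go
package main

// job is a hypothetical stand-in for a single backend sub-request.
type job struct {
	blockID int
	page    int
}

// buildJobsSlice models the old shape: every job is created and appended
// before the caller sees any of them.
func buildJobsSlice(blocks, pagesPerBlock int) []job {
	jobs := make([]job, 0, blocks*pagesPerBlock)
	for b := 0; b < blocks; b++ {
		for p := 0; p < pagesPerBlock; p++ {
			jobs = append(jobs, job{blockID: b, page: p})
		}
	}
	return jobs
}

// streamJobs models the new shape: jobs are sent on a channel as they are
// created, so a consumer can start work on the first job right away.
// Closing done (an assumed cancellation signal) stops the producer early.
func streamJobs(done <-chan struct{}, blocks, pagesPerBlock int) <-chan job {
	ch := make(chan job, 1)
	go func() {
		defer close(ch)
		for b := 0; b < blocks; b++ {
			for p := 0; p < pagesPerBlock; p++ {
				select {
				case ch <- job{blockID: b, page: p}:
				case <-done:
					return
				}
			}
		}
	}()
	return ch
}
```

A consumer ranges over the channel returned by `streamJobs` and can dispatch the first job while later ones are still being built, whereas `buildJobsSlice` has to finish before anything can be dispatched.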

Other Changes

Fixes a bug in searchProgress.internalShouldQuit() where we needed 1 more than the limit to quit
Switches totalBlockBytes to be a uint64 throughout
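
The internalShouldQuit() fix listed above is an off-by-one in the limit check. The sketch below only illustrates that class of bug; the function names and the exact comparisons are assumptions, not the code from searchProgress.

```go
package main

// shouldQuitOld is a hypothetical model of the buggy check: it only
// reports true once the result count exceeds the limit, so one extra
// result is needed before the search stops.
func shouldQuitOld(found, limit int) bool {
	return found > limit
}

// shouldQuitFixed is a hypothetical model of the corrected check: it
// reports true as soon as the result count reaches the limit.
func shouldQuitFixed(found, limit int) bool {
	return found >= limit
}
```

With a limit of 10 and exactly 10 results, `shouldQuitOld` returns false while `shouldQuitFixed` returns true.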

Benchmarks:

name                           old time/op    new time/op    delta
SearchSharderRoundTrip5-8         181ms ± 1%       1ms ± 9%  -99.54%  (p=0.008 n=5+5)
SearchSharderRoundTrip500-8       184ms ± 1%      12ms ± 1%  -93.27%  (p=0.008 n=5+5)
SearchSharderRoundTrip50000-8     318ms ± 4%     472ms ± 2%  +48.65%  (p=0.008 n=5+5)

name                           old alloc/op   new alloc/op   delta
SearchSharderRoundTrip5-8         118MB ± 0%       0MB ± 0%  -99.62%  (p=0.008 n=5+5)
SearchSharderRoundTrip500-8       120MB ± 0%       5MB ± 0%  -95.99%  (p=0.008 n=5+5)
SearchSharderRoundTrip50000-8     176MB ± 0%     176MB ± 0%   -0.10%  (p=0.008 n=5+5)

name                           old allocs/op  new allocs/op  delta
SearchSharderRoundTrip5-8         1.15M ± 0%     0.00M ± 2%  -99.88%  (p=0.008 n=5+5)
SearchSharderRoundTrip500-8       1.17M ± 0%     0.06M ± 0%  -95.12%  (p=0.008 n=5+5)
SearchSharderRoundTrip50000-8     2.24M ± 0%     2.26M ± 0%   +0.89%  (p=0.008 n=5+5)

Impact on exhaustive search with 100k jobs:

Which issue(s) this PR fixes:
Fixes #2469

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Joe Elliott <[email protected]>

mdisibio · 2023-06-01T16:06:18Z

modules/frontend/searchsharding.go

-			reqs = append(reqs, subR)
+
+			select {
+			case reqCh <- &backendReqMsg{req: subR}:


SearchSharderRoundTrip50000-8 318ms ± 4% 472ms ± 2% +48.65%

One thought on this: we could reduce the channel overhead by sending requests in batches instead of individually. Looking at the code, the easiest split is probably to send all jobs for a block in one channel send here.
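
A rough sketch of what per-block batching could look like, assuming a hypothetical `blockJob` type, a `done` cancellation channel, and a buffer size of 1. None of these names come from the PR.

```go
package main

// blockJob is a hypothetical stand-in for a single backend sub-request.
type blockJob struct {
	blockID int
	page    int
}

// streamBatches sends one slice of jobs per block, so the number of
// channel sends equals the number of blocks rather than the number of
// jobs. Closing done (an assumed cancellation signal) stops the producer.
func streamBatches(done <-chan struct{}, blocks, pagesPerBlock int) <-chan []blockJob {
	ch := make(chan []blockJob, 1)
	go func() {
		defer close(ch)
		for b := 0; b < blocks; b++ {
			// Each batch needs its own slice because ownership passes
			// to the receiver once it is sent.
			batch := make([]blockJob, 0, pagesPerBlock)
			for p := 0; p < pagesPerBlock; p++ {
				batch = append(batch, blockJob{blockID: b, page: p})
			}
			select {
			case ch <- batch:
			case <-done:
				return
			}
		}
	}()
	return ch
}
```

The trade-off in this sketch is fewer channel operations in exchange for one extra slice allocation per block, and the first job is only visible to the consumer once its whole block has been built.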

Naive attempt was worse. before = this PR, after = this PR with batching as suggested:

name                           old time/op    new time/op    delta
SearchSharderRoundTrip5-8         659µs ± 1%    1108µs ±83%   +68.15%  (p=0.016 n=4+5)
SearchSharderRoundTrip500-8      12.2ms ± 4%    12.9ms ± 6%      ~     (p=0.056 n=5+5)
SearchSharderRoundTrip50000-8     474ms ±11%     540ms ± 8%   +13.91%  (p=0.032 n=5+5)

name                           old alloc/op   new alloc/op   delta
SearchSharderRoundTrip5-8         451kB ± 0%     588kB ±54%   +30.19%  (p=0.008 n=5+5)
SearchSharderRoundTrip500-8      4.80MB ± 0%    4.96MB ± 0%    +3.43%  (p=0.008 n=5+5)
SearchSharderRoundTrip50000-8     176MB ± 0%     181MB ± 0%    +3.03%  (p=0.016 n=5+4)

name                           old allocs/op  new allocs/op  delta
SearchSharderRoundTrip5-8         1.33k ± 2%    2.98k ±133%  +123.52%  (p=0.008 n=5+5)
SearchSharderRoundTrip500-8       57.2k ± 0%     58.0k ± 0%    +1.24%  (p=0.008 n=5+5)
SearchSharderRoundTrip50000-8     2.26M ± 0%     2.28M ± 0%    +0.88%  (p=0.008 n=5+5)

I think the additional memory management offsets it. Personally, I'm not concerned about that roughly +49% on SearchSharderRoundTrip50000. Even in that case, the overall performance is going to be significantly better because we're getting jobs to queriers faster.

That first benchmark, SearchSharderRoundTrip5, is the most interesting because it roughly represents "time to first job", which is the real improvement here.

zalegrala

This looks good to me. Nice changes. I think @mdisibio has an interesting idea.

joe-elliott added 4 commits June 1, 2023 08:43

cleanup

1e468d2

Signed-off-by: Joe Elliott <[email protected]>

first pass. metrics broken

a7a856f

Signed-off-by: Joe Elliott <[email protected]>

tests and benches

b167a7d

Signed-off-by: Joe Elliott <[email protected]>

restore stats

0e2fe7c

Signed-off-by: Joe Elliott <[email protected]>

joe-elliott requested review from annanay25, mdisibio, mapno, yvrhdn, zalegrala, electron0zero, ie-pham and stoewer as code owners June 1, 2023 14:05

joe-elliott added 2 commits June 1, 2023 10:09

changelog

f45315d

Signed-off-by: Joe Elliott <[email protected]>

tests + uint64 totalBlockBytes

23fbd89

Signed-off-by: Joe Elliott <[email protected]>

mdisibio reviewed Jun 1, 2023

zalegrala reviewed Jun 1, 2023

Merge remote-tracking branch 'upstream/main' into jackie-channels

224fe4b

mdisibio approved these changes Jun 2, 2023

joe-elliott merged commit 17c141f into grafana:main Jun 2, 2023

joe-elliott mentioned this pull request Jun 22, 2023

Fix range calculation on recent searches #2581

Merged

3 tasks

joe-elliott mentioned this pull request Jul 25, 2023

[Search Perf] Improve Query Frontend -> Querier Job Throughput #2464

Closed
