Replace CombineTraceProtos with new Combiner #1291

mdisibio · 2022-02-17T16:52:18Z

What this PR does:
This PR introduces a new Combiner which is similar to CombineTraceProtos but more efficient when combining more than two inputs. The previous pairwise usage of CombineTraceProtos had a couple inefficiencies when combining more than two inputs: (a) the intermediate result was sorted every time (b) the hash of span tokens was rebuilt every time. Combiner is stateful and improves this, which leads to significant reduction in cpu and memory. Performance is identical when combining just 2 inputs.

Additionally, this PR changes the span/token hashing to 64-bit to reduce the collision rate. Experimentally the collision rate of fnv32 approached 1 in 10,000 spans, which is significant because any collision results in a dropped span. 64-bit has no collisions up to the tested limit of a trace with 1M spans. Performance is still good.

Feedback
In order to maintain identical performance against 2 segments, Combiner must not save the span tokens for the second input, like how CombineTraceProtos did not. This can be generalized in that we never need to save the span tokens for the last input. These savings are significant enough to where it's worth accounting for, and there are many cases where we do know the length. I would like feedback on the chosen ergonomics/naming/style and see if there is a better pattern. For example:

// Unknown number of inputs
c := NewCombiner()
c.Consume(tr)
c.Consume(tr)
    ...
c.Result()

// Known number of inputs
c := NewCombiner()
for i, tr := range traces {
    c.ConsumeWithFinal(tr, i == len(traces)-1)
}
c.Result()

Benchmarks
This benchmark combines trace 2 to 8 segments of 100K spans each. Improvements are greater as more segments are combined.

BenchmarkCombine/2/CombineTraceProtos-12         	      27	  44842636 ns/op	 6316197 B/op	  115510 allocs/op
BenchmarkCombine/2/Combiner-12                   	      25	  44691228 ns/op	 6319784 B/op	  115615 allocs/op

BenchmarkCombine/3/CombineTraceProtos-12         	       9	 125415228 ns/op	17512674 B/op	  291023 allocs/op
BenchmarkCombine/3/Combiner-12                   	      15	  72986714 ns/op	11202546 B/op	  175560 allocs/op

BenchmarkCombine/4/CombineTraceProtos-12         	       5	 241490759 ns/op	35993108 B/op	  524033 allocs/op
BenchmarkCombine/4/Combiner-12                   	      10	 108902778 ns/op	18498868 B/op	  233082 allocs/op

BenchmarkCombine/8/CombineTraceProtos-12         	       1	1119084275 ns/op	164185168 B/op	 2061117 allocs/op
BenchmarkCombine/8/Combiner-12                   	       5	 242659052 ns/op	 37625163 B/op	  469882 allocs/op

Which issue(s) this PR fixes:
Should help #976

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

… trace deduping

modules/frontend/tracebyidsharding.go

mdisibio · 2022-02-17T16:58:19Z

@tanner-bruce Would appreciate your feedback too as you've also done of a lot of investigation in this area.

joe-elliott

Drooling over some of these benchmarks. We should definitely see some improvements in resource usage during both compaction and querying. A couple of baby nits and one real question.

Nice!

pkg/model/trace/combine_test.go

pkg/model/trace/combine.go

modules/frontend/tracebyidsharding.go

pkg/model/trace/combine.go

mdisibio · 2022-02-23T21:54:49Z

Pushed an optimization to alloc the span map using the first input size, which saves a few more MB per call. Thanks @tanner-bruce !

BenchmarkCombine/2/Combiner-12         	      27	  41954908 ns/op	 4650348 B/op	  113443 allocs/op
BenchmarkCombine/3/Combiner-12         	      15	  72924713 ns/op	 9526314 B/op	  173208 allocs/op
BenchmarkCombine/4/Combiner-12         	      10	 105436694 ns/op	16823280 B/op	  230750 allocs/op
BenchmarkCombine/8/Combiner-12         	       5	 237391203 ns/op	35946148 B/op	  467689 allocs/op

joe-elliott

Looks great. Thanks for the help on this one @tanner-bruce!

mdisibio added 2 commits February 8, 2022 16:27

Create trace.Combiner which has better performance on multiple/larger…

a4659c8

… trace deduping

Migrate all CombineTraceProtos to Combiner

2a2bda5

mdisibio requested review from joe-elliott, annanay25, dgzlopes, mapno, yvrhdn and zalegrala as code owners February 17, 2022 16:52

mdisibio commented Feb 17, 2022

View reviewed changes

modules/frontend/tracebyidsharding.go Show resolved Hide resolved

lint

6e9e085

joe-elliott reviewed Feb 17, 2022

View reviewed changes

pkg/model/trace/combine_test.go Outdated Show resolved Hide resolved

pkg/model/trace/combine.go Outdated Show resolved Hide resolved

modules/frontend/tracebyidsharding.go Show resolved Hide resolved

pkg/model/trace/combine.go Outdated Show resolved Hide resolved

mdisibio added 2 commits February 18, 2022 15:07

Review feedback, delete CombineTraceProtos

2458b91

Wait until first call to alloc span map, using actual input size

1695a50

mdisibio requested a review from joe-elliott February 24, 2022 12:45

joe-elliott approved these changes Feb 24, 2022

View reviewed changes

mdisibio added 2 commits February 24, 2022 10:28

changelog

e4cf0cf

Merge branch 'main' into combiner

b51cda5

mdisibio enabled auto-merge (squash) February 24, 2022 17:55

mdisibio merged commit 7d26ee2 into grafana:main Feb 24, 2022

mdisibio deleted the combiner branch April 25, 2023 18:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace CombineTraceProtos with new Combiner #1291

Replace CombineTraceProtos with new Combiner #1291

mdisibio commented Feb 17, 2022

mdisibio commented Feb 17, 2022

joe-elliott left a comment •

edited

Loading

mdisibio commented Feb 23, 2022

joe-elliott left a comment

Replace CombineTraceProtos with new Combiner #1291

Replace CombineTraceProtos with new Combiner #1291

Conversation

mdisibio commented Feb 17, 2022

mdisibio commented Feb 17, 2022

joe-elliott left a comment • edited Loading

Choose a reason for hiding this comment

mdisibio commented Feb 23, 2022

joe-elliott left a comment

Choose a reason for hiding this comment

joe-elliott left a comment •

edited

Loading