Replace CombineTraceProtos with new Combiner #1291
@tanner-bruce Would appreciate your feedback too, as you've also done a lot of investigation in this area.
Drooling over some of these benchmarks. We should definitely see some improvements in resource usage during both compaction and querying. A couple of baby nits and one real question.
Nice!
Pushed an optimization to alloc the span map using the first input size, which saves a few more MB per call. Thanks @tanner-bruce !
Looks great. Thanks for the help on this one @tanner-bruce!
What this PR does:
This PR introduces a new Combiner, which is similar to CombineTraceProtos but more efficient when combining more than two inputs. The previous pairwise usage of CombineTraceProtos had a couple of inefficiencies in that case: (a) the intermediate result was sorted every time, and (b) the hash of span tokens was rebuilt every time. Combiner is stateful and avoids both, which leads to a significant reduction in CPU and memory. Performance is identical when combining just 2 inputs.
Additionally, this PR changes the span/token hashing to 64-bit to reduce the collision rate. Experimentally, the collision rate of fnv32 approached 1 in 10,000 spans, which is significant because any collision results in a dropped span. The 64-bit hash has no collisions up to the tested limit of a trace with 1M spans. Performance is still good.
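To illustrate why widening the hash helps, here is a minimal sketch, not the code from this PR. The helper names `token32`, `token64`, and `expectedCollisions` are hypothetical, and hashing the span ID followed by the span kind is an assumption about what goes into a token. `expectedCollisions` is a back-of-the-envelope birthday estimate that assumes a uniform hash: for 1M spans it gives about 116 colliding pairs at 32 bits, the same order of magnitude as the observed 1 in 10,000, and far below one pair at 64 bits.

```go
package main

import (
	"hash/fnv"
	"math"
)

// kindBytes encodes the span kind as 4 little-endian bytes.
func kindBytes(kind int32) []byte {
	return []byte{byte(kind), byte(kind >> 8), byte(kind >> 16), byte(kind >> 24)}
}

// token32 is a hypothetical 32-bit token: FNV-1 over span ID and kind.
func token32(spanID []byte, kind int32) uint32 {
	h := fnv.New32()
	h.Write(spanID)
	h.Write(kindBytes(kind))
	return h.Sum32()
}

// token64 is the same construction with a 64-bit FNV-1 hash.
func token64(spanID []byte, kind int32) uint64 {
	h := fnv.New64()
	h.Write(spanID)
	h.Write(kindBytes(kind))
	return h.Sum64()
}

// expectedCollisions estimates the number of colliding pairs among n
// uniformly hashed items for a hash of the given bit width:
// n*(n-1)/2 pairs, each colliding with probability 1/2^bits.
func expectedCollisions(n int, bits int) float64 {
	pairs := float64(n) * float64(n-1) / 2
	return pairs / math.Ldexp(1, bits)
}
```

Since any collision drops a span, the token width directly bounds how many spans a large trace can lose. The estimate is only a rough guide, because real span IDs and hash behavior need not be uniform.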
Feedback
To maintain identical performance with 2 segments, Combiner must not save the span tokens for the second input, just as CombineTraceProtos did not. This generalizes: we never need to save the span tokens for the last input. The savings are significant enough to be worth accounting for, and there are many cases where we do know the number of inputs up front. I would like feedback on the chosen ergonomics/naming/style, and on whether there is a better pattern. For example:
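A minimal sketch of how a stateful combiner with a "final input" hint could look. It uses a simplified `Span` type rather than the real trace protos, and every name in it (`Span`, `NewCombiner`, `Consume`, `ConsumeWithFinal`, `Result`) is a hypothetical choice for illustration, not necessarily the API chosen in this PR. It shows the three ideas described above: the token set is kept across inputs, tokens are not saved for the last input, and sorting happens once at the end.

```go
package main

import (
	"hash/fnv"
	"sort"
)

// Span is a simplified stand-in for a trace span (assumption).
type Span struct {
	ID    []byte
	Kind  int32
	Start uint64
}

// Combiner keeps the set of seen span tokens across inputs.
type Combiner struct {
	seen   map[uint64]struct{}
	result []Span
}

func NewCombiner() *Combiner { return &Combiner{} }

// tokenFor hashes span ID and kind into a 64-bit token.
func tokenFor(s Span) uint64 {
	h := fnv.New64()
	h.Write(s.ID)
	h.Write([]byte{byte(s.Kind), byte(s.Kind >> 8), byte(s.Kind >> 16), byte(s.Kind >> 24)})
	return h.Sum64()
}

// Consume adds an input that is not known to be the last one.
func (c *Combiner) Consume(in []Span) { c.ConsumeWithFinal(in, false) }

// ConsumeWithFinal adds an input. When final is true the tokens are not
// saved, because no later input will be checked against them. Duplicates
// inside the final input itself are therefore not removed.
func (c *Combiner) ConsumeWithFinal(in []Span, final bool) {
	if c.seen == nil {
		// Size the map from the first input to avoid repeated growth.
		c.seen = make(map[uint64]struct{}, len(in))
	}
	for _, s := range in {
		t := tokenFor(s)
		if _, dup := c.seen[t]; dup {
			continue
		}
		c.result = append(c.result, s)
		if !final {
			c.seen[t] = struct{}{}
		}
	}
}

// Result sorts once, by start time, and returns the combined spans.
func (c *Combiner) Result() []Span {
	sort.Slice(c.result, func(i, j int) bool { return c.result[i].Start < c.result[j].Start })
	return c.result
}
```

With this shape, a caller that knows the number of inputs calls `Consume` for all but the last and `ConsumeWithFinal(last, true)` for the last one. A caller that does not know can call `Consume` throughout and only gives up the memory saving.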
Benchmarks
This benchmark combines 2 to 8 trace segments of 100K spans each. The improvement grows as more segments are combined.
Which issue(s) this PR fixes:
Should help #976
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]