Speedup DistinctValue collector and exit early for ingesters #4104
some simple thoughts. this looks really good
	d.values[v] = struct{}{}
	d.currLen += valueLen

	return false
}

// Values returns the final list of distinct values collected and sorted.
func (d *DistinctValue[T]) Values() []T {
	ss := make([]T, 0, len(d.values))
@joe-elliott I think we have the same subtle bug here that we had in Diff: we are checking the len of a member var without locking. I think we got lucky here because we usually call Values after we are done collecting.
Locking this as well.
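To illustrate the locking concern above, here is a minimal sketch, not the code from this PR. It assumes a simplified, hypothetical collector (`distinctSketch`, string values only, a `sync.Mutex` field named `mtx`) and shows `Values` taking the same lock as `Collect` before it reads `len(d.values)`.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// distinctSketch is a hypothetical, simplified stand-in for the collector
// discussed above. The type, field, and constructor names are assumptions.
type distinctSketch struct {
	mtx    sync.Mutex
	values map[string]struct{}
}

func newDistinctSketch() *distinctSketch {
	return &distinctSketch{values: make(map[string]struct{})}
}

// Collect records v while holding the lock.
func (d *distinctSketch) Collect(v string) {
	d.mtx.Lock()
	defer d.mtx.Unlock()
	d.values[v] = struct{}{}
}

// Values takes the same lock before reading len(d.values) and iterating,
// so a concurrent Collect cannot race with the read of the map.
func (d *distinctSketch) Values() []string {
	d.mtx.Lock()
	defer d.mtx.Unlock()

	ss := make([]string, 0, len(d.values))
	for v := range d.values {
		ss = append(ss, v)
	}
	sort.Strings(ss)
	return ss
}

func main() {
	d := newDistinctSketch()

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			d.Collect(fmt.Sprintf("v%d", i%2))
		}(i)
	}
	wg.Wait()

	fmt.Println(d.Values()) // prints "[v0 v1]"
}
```

Running a sketch like this under `go run -race` is one way to check that no unlocked read of the map remains.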
…#4104)
* make the collector go fast...
* fixup usage and log lines
* exit early when we hit the limits of collector
* cleanup
* CHANGELOG.md
* fix lint
* break with goto
* locked and loaded
What this PR does:
With early stop, we now bail out and return results as soon as the limit is hit, instead of continuing to collect values after hitting it.
How fast? The collector is 47% faster when used without any limits, 60% - 80% faster when used with a limit that gets hit, and around 90-100% faster when we stop early once the limit is hit.
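The early-stop idea can be sketched roughly as follows. This is an illustrative example under stated assumptions, not the PR's implementation: `limitedSet`, `maxLen`, `limExceeded`, and `collectAll` are hypothetical names, values are plain strings, and the limit is the total byte length of the distinct values. `Collect` reports whether the caller should stop, and the caller breaks out of its loop instead of feeding the remaining input.

```go
package main

import "fmt"

// limitedSet is a hypothetical collector with a size limit (assumed names).
type limitedSet struct {
	values      map[string]struct{}
	maxLen      int // maximum total length of stored values; 0 means no limit
	currLen     int
	limExceeded bool
}

// Collect stores v if it is new and fits within the limit.
// It returns true when the limit has been exceeded and the caller should stop.
func (s *limitedSet) Collect(v string) (stop bool) {
	if s.limExceeded {
		return true
	}
	if _, ok := s.values[v]; ok {
		return false // already seen, nothing to add
	}

	valueLen := len(v)
	if s.maxLen > 0 && s.currLen+valueLen > s.maxLen {
		s.limExceeded = true
		return true
	}

	s.values[v] = struct{}{}
	s.currLen += valueLen

	return false
}

// collectAll feeds input into s and exits early once Collect asks to stop.
// It returns how many inputs were visited.
func collectAll(s *limitedSet, input []string) (visited int) {
	for _, v := range input {
		visited++
		if s.Collect(v) {
			break // limit hit: return what we have instead of scanning the rest
		}
	}
	return visited
}

func main() {
	s := &limitedSet{values: map[string]struct{}{}, maxLen: 6}
	input := []string{"foo", "bar", "foo", "bazz", "qux", "quux"}

	visited := collectAll(s, input)
	fmt.Println(visited, len(s.values), s.currLen) // prints "4 2 6"
}
```

In this sketch the last two inputs are never visited, which is where the savings come from when the limit is reached early in a large input.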
Benchmarks
main (no early stop) vs fast collector (with early stop)
main (with early stop) vs fast collector (with early stop)
main (no early stop) vs fast collector (no early stop)
main (with early stop) vs fast collector (no early stop)
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]