-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added new versions of methods to reduce allocations #14
Conversation
These errors were always nil - keep things clean
Added FindTagsAppend(), which reduces allocations by accepting a slice to use for the return value.
The `Find` methods now accept a slice to put results into. `Delete` now accepts a slice to use as a buffer.
Benchmark results: cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz OLD: BenchmarkFindTags-5 3721257 294.3 ns/op 160 B/op 6 allocs/op NEW: BenchmarkFindTags-5 41273214 28.12 ns/op 0 B/op 0 allocs/op
FYI - I'm going to rename these new methods, and put back the old names so the caller can choose between allocating and reusing, or just not worrying about it. |
renamed the following to better describe their function: - FindTags -> FindTagsAppend - FindDeepestTags -> FindDeepestTagsAppend - Delete -> DeleteWithBuffer Created new methods with the original names, that allocate slices and pass them into the *Append/*WithBuffer methods. This provides easier migration to new code, and simplifies use cases that can't reuse buffers.
Unit tests for FindDeepestTags, FindTags, FindTagsWithFilter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense and looks good to me but I had a question around the logic changes for FindTagsWithFilter
(now actually FindTagsWithFilterAppend
) and how that might negatively impact performance/if it matters before signing off.
template/tree_v4.go
Outdated
if address.Length == 0 { | ||
// caller just looking for root tags | ||
return ret, nil | ||
if len(ret) == retPos || filterFunc == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding this comment primarily for myself, but may just be useful for anyone else reviewing. This logic is essentially sayin:
- "if we didn't find any new tags, return what we have"
- "if we have no filtering function, return whatever we found (regardless of finding new tags)"
template/tree_v4.go
Outdated
nodeIndex = node.Left | ||
} else { | ||
nodeIndex = node.Right | ||
// filter in place |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this new filter in place approach (after finding all matching values) over the previous filter while searching for matches I think it definitely cleans up the code and makes it easier to follow/read. However, this comes at the cost of potentially needing to allocate more capacity for the slice up front (which may not be necessary if the values just get filtered out anyway). If we end up allocating extra capacity up front that isn't actually used, how do you think this would impact the overall performance since it is theoretically creating more garbage to cleanup for later? Will this really matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still allocates less than before, where a new slice was allocated for the pre-filtered results, then thrown away after the filtered results were accepted. If you're reusing the same buffer over and over, the allocations will happen once, and then reused. But yeah, I think it's worth running the filter func earlier in the stack. I'll take a look, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented this in 94f2b4b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, my main concern was if by passing around the filter function and doing these extra checks it would slow things down. Based on offline discussion it doesn't seem like that is the case, and the new changes look good to me.
The methods that take a filter no longer allocate first, then filter out. We now only add them to the slice if they match. Did this by putting the main logic in the *Filter methods, then having the non-Filter methods call them, with a nil filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice.
template/tree_v4.go
Outdated
func (t *TreeV4) Delete(address patricia.IPv4Address, matchFunc MatchesFunc, matchVal GeneratedType) (int, error) { | ||
// - use DeleteWithBuffer if you can reuse slices, to cut down on allocations | ||
func (t *TreeV4) Delete(address patricia.IPv4Address, matchFunc MatchesFunc, matchVal GeneratedType) int { | ||
return t.DeleteWithBuffer(make([]GeneratedType, 0), address, matchFunc, matchVal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It really doesn't matter much, but I think it works to just pass nil
in these cases instead of making a zero-length slice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ya good call - updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also be done in the other non-buffering methods, but I worry the caller will be expecting non-nil, so I'll keep it as is for now�. If you're interested in reducing allocations, you can use the buffering versions
This incorporates the great work from https://github.com/pforemski (long enough ago that I'm embarrassed to say), here: #12, then takes it a bit further. The method signatures have changed, so this isn't backwards compatible - the always-nil errors were removed.
Changes:
FindTagsAppend
,FindTagsWithFilterAppend
,FindDeepestTagsAppend
DeleteWithBuffer
now accepts a slice to use as a bufferBenchmarks between the self-allocating
FindTags
and the slice-reusingFindTagsAppend
:Of course, we're just moving the slice allocation outside the method, and these gains are only realized if you can reuse it for successive calls. For applications that can, this should make quite an improvement in performance.
For those unfamiliar with this repo, look in the
template
directory. All of the*_tree
directories are generated from that code withmake generate
.