Added new versions of methods to reduce allocations #14

wblakecaldwell · 2021-08-31T04:07:11Z

This incorporates the great work by https://github.com/pforemski in #12 (from long enough ago that I'm embarrassed to say), then takes it a bit further. The method signatures have changed, so this isn't backwards compatible: the always-nil error return values were removed.

Changes:

- Removed always-nil error return values
- New methods that take slices to append to: FindTagsAppend, FindTagsWithFilterAppend, FindDeepestTagsAppend
- New DeleteWithBuffer method that accepts a slice to use as a buffer

Benchmarks between the self-allocating FindTags and the slice-reusing FindTagsAppend:

cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
BenchmarkFindTags-5                     	 4138029	       246.7 ns/op	     112 B/op	       3 allocs/op
BenchmarkFindTagsAppend-5               	32779479	        32.25 ns/op	       0 B/op	       0 allocs/op

Of course, we're just moving the slice allocation outside the method, and these gains are only realized if you can reuse it for successive calls. For applications that can, this should make quite an improvement in performance.
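As a rough illustration of the reuse pattern described above, here is a minimal sketch. The `tagStore` type, its string keys, and the parameter order are assumptions made for the example (the buffer-first order is modeled on the `DeleteWithBuffer` call in the diff); this is not the generated tree code itself.

```go
package main

import "fmt"

// tagStore is a hypothetical stand-in for a generated tree: it maps a key to tags.
type tagStore struct {
	tags map[string][]string
}

// FindTags allocates a fresh result slice on every call.
func (s *tagStore) FindTags(key string) []string {
	ret := make([]string, 0, len(s.tags[key]))
	return append(ret, s.tags[key]...)
}

// FindTagsAppend appends matches to ret and returns the extended slice,
// so a caller-owned buffer can absorb the allocation.
func (s *tagStore) FindTagsAppend(ret []string, key string) []string {
	return append(ret, s.tags[key]...)
}

func main() {
	s := &tagStore{tags: map[string][]string{
		"10.0.0.0/8":     {"corp", "internal"},
		"192.168.0.0/16": {"lab"},
	}}

	// Allocate the buffer once, then truncate it to length 0 before each call.
	// The backing array is reused as long as its capacity is sufficient.
	buf := make([]string, 0, 8)
	for _, key := range []string{"10.0.0.0/8", "192.168.0.0/16"} {
		buf = s.FindTagsAppend(buf[:0], key)
		fmt.Println(key, buf)
	}
}
```

The caller has to assign the returned slice back to `buf`, because `append` may move the data to a larger backing array when capacity runs out.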

For those unfamiliar with this repo, look in the `template` directory. All of the `*_tree` directories are generated from that code with `make generate`.

These errors were always nil - keep things clean

Added FindTagsAppend(), which reduces allocations by accepting a slice to use for the return value.

The `Find` methods now accept a slice to put results into. `Delete` now accepts a slice to use as a buffer.

Benchmark results:

cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
OLD: BenchmarkFindTags-5    3721257    294.3 ns/op    160 B/op    6 allocs/op
NEW: BenchmarkFindTags-5   41273214    28.12 ns/op      0 B/op    0 allocs/op

wblakecaldwell · 2021-08-31T15:39:12Z

FYI - I'm going to rename these new methods, and put back the old names so the caller can choose between allocating and reusing, or just not worrying about it.

Renamed the following to better describe their function:

- FindTags -> FindTagsAppend
- FindDeepestTags -> FindDeepestTagsAppend
- Delete -> DeleteWithBuffer

Created new methods with the original names that allocate slices and pass them into the *Append/*WithBuffer methods. This provides easier migration to new code, and simplifies use cases that can't reuse buffers.

Unit tests for FindDeepestTags, FindTags, FindTagsWithFilter

aaronfuj

It makes sense and looks good to me, but before signing off I had a question about the logic changes in FindTagsWithFilter (now actually FindTagsWithFilterAppend): could they negatively impact performance, and does it matter?

aaronfuj · 2021-08-31T23:28:07Z

template/tree_v4.go

-	if address.Length == 0 {
-		// caller just looking for root tags
-		return ret, nil
+	if len(ret) == retPos || filterFunc == nil {


Adding this comment primarily for myself, but it may be useful for anyone else reviewing. This logic is essentially saying:

"if we didn't find any new tags, return what we have"

"if we have no filtering function, return whatever we found (regardless of finding new tags)"

aaronfuj · 2021-08-31T23:45:30Z

template/tree_v4.go

-			nodeIndex = node.Left
-		} else {
-			nodeIndex = node.Right
+	// filter in place


I think this new filter-in-place approach (filtering after finding all matching values), compared with the previous filter-while-searching approach, definitely cleans up the code and makes it easier to follow. However, it comes at the cost of potentially allocating more slice capacity up front, which may be wasted if the values just get filtered out anyway. If we allocate extra capacity that isn't actually used, how do you think that would affect overall performance, since it theoretically creates more garbage to clean up later? Will this really matter?

This still allocates less than before, where a new slice was allocated for the pre-filtered results and then thrown away once the filtered results were accepted. If you're reusing the same buffer over and over, the allocation happens once and the memory is then reused. But yeah, I think it's worth running the filter func earlier in the stack. I'll take a look, thanks.

Implemented this in 94f2b4b

Yeah, my main concern was whether passing around the filter function and doing these extra checks would slow things down. Based on offline discussion, that doesn't seem to be the case, and the new changes look good to me.

The methods that take a filter no longer allocate first and then filter out. Tags are now only added to the slice if they match. This was done by putting the main logic in the *Filter methods, then having the non-Filter methods call them with a nil filter.
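The structure described here could be sketched as follows. The `node` type and the `filterFunc` signature are hypothetical simplifications; the real methods also take an address and walk the tree.

```go
package main

import "fmt"

// filterFunc reports whether a tag should be included in the results.
// The signature is an assumption for this sketch.
type filterFunc func(tag string) bool

// node is a hypothetical stand-in holding the tags a lookup would visit.
type node struct {
	tags []string
}

// FindTagsWithFilterAppend holds the main logic: a tag is appended only if
// there is no filter or the filter accepts it, so rejected tags never
// consume slice capacity.
func (n *node) FindTagsWithFilterAppend(ret []string, filter filterFunc) []string {
	for _, tag := range n.tags {
		if filter == nil || filter(tag) {
			ret = append(ret, tag)
		}
	}
	return ret
}

// FindTagsAppend is the non-filter variant; it delegates with a nil filter.
func (n *node) FindTagsAppend(ret []string) []string {
	return n.FindTagsWithFilterAppend(ret, nil)
}

func main() {
	n := &node{tags: []string{"keep-1", "drop", "keep-2"}}
	isKeep := func(tag string) bool { return len(tag) > 4 }

	fmt.Println(n.FindTagsAppend(nil))                   // [keep-1 drop keep-2]
	fmt.Println(n.FindTagsWithFilterAppend(nil, isKeep)) // [keep-1 keep-2]
}
```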

aaronfuj

lgtm!

alistairking

Very nice.

alistairking · 2021-09-03T18:17:23Z

template/tree_v4.go

-func (t *TreeV4) Delete(address patricia.IPv4Address, matchFunc MatchesFunc, matchVal GeneratedType) (int, error) {
+// - use DeleteWithBuffer if you can reuse slices, to cut down on allocations
+func (t *TreeV4) Delete(address patricia.IPv4Address, matchFunc MatchesFunc, matchVal GeneratedType) int {
+	return t.DeleteWithBuffer(make([]GeneratedType, 0), address, matchFunc, matchVal)


It really doesn't matter much, but I think it works to just pass nil in these cases instead of making a zero-length slice?

ya good call - updated

Could also be done in the other non-buffering methods, but I worry the caller will be expecting non-nil, so I'll keep it as is for now. If you're interested in reducing allocations, you can use the buffering versions.
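For anyone weighing the same trade-off, this small standalone sketch shows the Go behavior behind both comments: `append` works on a nil slice, but a nil result and an empty result differ under a `== nil` check. The `collect` helper is hypothetical.

```go
package main

import "fmt"

// collect is a hypothetical helper that appends vals to buf, the way the
// buffering methods append to a caller-supplied slice.
func collect(buf []int, vals ...int) []int {
	return append(buf, vals...)
}

func main() {
	// Appending to nil behaves the same as appending to an empty slice.
	fromNil := collect(nil, 1, 2)
	fromEmpty := collect(make([]int, 0), 1, 2)
	fmt.Println(fromNil, fromEmpty) // [1 2] [1 2]

	// With nothing to append, the nil input comes back as nil, which is
	// what a caller expecting a non-nil result would trip over.
	nilResult := collect(nil)
	emptyResult := collect(make([]int, 0))
	fmt.Println(nilResult == nil, emptyResult == nil) // true false
}
```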

pforemski and others added 10 commits September 11, 2020 14:35

add FindTagsAppend()

f70628e

generate the type trees

2c31226

Removed unnecessary error return values

97e9497

These errors were always nil - keep things clean

Updated tests for no error returns

bef95ac

Code gen

9c3118a

Updated unit tests to handle no error returns

e2dbd6a

Merge branch https://github.com/pforemski/patricia

b0db03a

Added FindTagsAppend(), which reduces allocations by accepting a slice to use for the return value.

Changed method signatures to reduce allocations

0e456c6

The `Find` methods now accept a slice to put results into. `Delete` now accepts a slice to use as a buffer.

Code gen

fc6e472

Updated BenchmarkFindTags to use the buffer properly

057bd24

Benchmark results:

cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
OLD: BenchmarkFindTags-5    3721257    294.3 ns/op    160 B/op    6 allocs/op
NEW: BenchmarkFindTags-5   41273214    28.12 ns/op      0 B/op    0 allocs/op

wblakecaldwell mentioned this pull request Aug 31, 2021

add FindTagsAppend() #12

Merged

wblakecaldwell added 2 commits August 31, 2021 16:48

code gen

822d2f8

wblakecaldwell changed the title ~~Changed method signatures to reduce allocations~~ Added new versions of methods to reduce allocations Aug 31, 2021

wblakecaldwell added 3 commits August 31, 2021 17:35

Added tests to ensure that the *Append methods append, not just replace

84f9891

Added unit tests for self-allocating methods

2ab2476

Unit tests for FindDeepestTags, FindTags, FindTagsWithFilter

Added unit tests for self-allocating Delete method

7bf755c

aaronfuj reviewed Sep 1, 2021


wblakecaldwell added 3 commits September 1, 2021 20:48

removed unnecessary counter

b60eacf

code gen

5588ed1

aaronfuj approved these changes Sep 2, 2021


alistairking approved these changes Sep 3, 2021


wblakecaldwell added 2 commits September 9, 2021 16:44

removed unnecessary allocation in Delete()

60674e0

codegen

ec449a9

wblakecaldwell merged commit 2160333 into master Sep 9, 2021

ingwarsw deleted the wbc-fewer-allocations branch March 23, 2022 10:48
