Reduce memory and CPU costs due to SegmentedList usage #75661
Currently, the SegmentedList class suffers from two inefficiencies:
1) Upon growth, it doubles the SegmentedArray size. Doubling is necessary for normal List-like collections to get amortized constant-time growth, but isn't necessary (or desirable) for segmented data structures.
2) Upon growth, it reallocates and copies over the existing pages.
Instead, if we allocate only the modified/new pages and the array holding the pages, we can save significant CPU and allocation costs.
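To make point 2) concrete, here is a minimal sketch of growth that reuses existing pages. It is not the Roslyn implementation: `SegmentGrowthSketch`, `GrowReusingSegments`, the jagged `int[][]` layout, and the segment size of 4 are all hypothetical, chosen only for illustration, and the sketch assumes `newCapacity` is at least the current capacity.

```csharp
using System;

// Hypothetical, simplified stand-in for a segmented array; NOT Roslyn's SegmentedArray<T>.
public static class SegmentGrowthSketch
{
    // Assumed segment (page) size, for illustration only.
    public const int SegmentSize = 4;

    // Grows the page table to hold newCapacity elements.
    // Full pages are shared by reference; only a partial or new page is allocated.
    public static int[][] GrowReusingSegments(int[][] segments, int newCapacity)
    {
        int newSegmentCount = (newCapacity + SegmentSize - 1) / SegmentSize;
        var result = new int[newSegmentCount][];

        for (int i = 0; i < newSegmentCount; i++)
        {
            if (i < segments.Length && segments[i].Length == SegmentSize)
            {
                // Existing full page: reuse it, no element copy.
                result[i] = segments[i];
            }
            else
            {
                // New page, or a page that was previously partial: allocate it
                // and copy over whatever elements the old page held.
                int length = Math.Min(SegmentSize, newCapacity - i * SegmentSize);
                result[i] = new int[length];
                if (i < segments.Length)
                {
                    Array.Copy(segments[i], result[i], Math.Min(segments[i].Length, length));
                }
            }
        }

        return result;
    }
}
```

Because full pages are shared rather than copied, the old and new page tables alias the same arrays; the sketch assumes the caller drops the old table after growing.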
Doubling makes sense when you only have a single backing store, and exceeding its limits causes another allocation and full copy. Given that we're already segmented, and growth shouldn't cause copies of prior segments, this seems like a sensible change to me. Would like perf numbers if possible though.
Oh. I didn't scroll down far enough.
I'm approving the concept. Haven't looked deeply at the code change yet.
Code looks ok. I think dedicated tests would be good.
src/Tools/IdeCoreBenchmarks/SegmentedListBenchmarks_InsertRange.cs
src/Dependencies/Collections/SegmentedArray`1+PrivateMarshal.cs
@@ -502,7 +529,17 @@ internal void Grow(int capacity)
// If the computed capacity is still less than specified, set to the original argument.
Requested changes:
- Document the algorithm for initialization of `newCapacity`
- For initial length greater than or equal to half the segment size but less than one full segment size, set `newCapacity` to one segment size.
- For any initial length where the final segment is not a full segment, set `newCapacity` to the length it would be with a full-size final segment. This guarantees that the outer array will not be reallocated, and also guarantees that the single inner array allocation performed during the resize will not need to be performed a second time during the next resize. (Can be modified and treated as a generalization of the preceding point)
- If the calculated `newCapacity` ends up being less than `capacity` and `capacity` is greater than the segment size, apply a final ceiling operation so the final segment is full size.
These rules are all relevant regardless of whether we increase by doubling or by a page at a time. I am still reviewing the choice of expansion size.
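One possible reading of the requested rules, as a sketch only. `CapacitySketch`, `ChooseNewCapacity`, `RoundUpToSegment`, and the explicit `segmentSize` parameter are hypothetical names, not Roslyn APIs. The fallbacks are assumptions: doubling with a minimum of 4 for lengths below half a segment, and doubling for lengths that already end on a full segment. Overflow handling is omitted.

```csharp
using System;

// Hypothetical sketch of the reviewer's capacity rules; NOT the code in this PR.
public static class CapacitySketch
{
    // Rounds n up to the next multiple of segmentSize.
    public static int RoundUpToSegment(int n, int segmentSize)
        => (n + segmentSize - 1) / segmentSize * segmentSize;

    public static int ChooseNewCapacity(int currentLength, int capacity, int segmentSize)
    {
        int newCapacity;

        if (currentLength < segmentSize / 2)
        {
            // Assumption: small lists keep doubling, with a minimum of 4.
            newCapacity = Math.Max(currentLength * 2, 4);
        }
        else if (currentLength < segmentSize)
        {
            // At least half a segment but less than one: jump to one full segment.
            newCapacity = segmentSize;
        }
        else if (currentLength % segmentSize != 0)
        {
            // Final segment is partial: grow it to full size.
            newCapacity = RoundUpToSegment(currentLength, segmentSize);
        }
        else
        {
            // Assumption: lengths ending on a full segment still double.
            newCapacity = currentLength * 2;
        }

        if (newCapacity < capacity)
        {
            // Requested capacity wins; apply the ceiling only past one segment.
            newCapacity = capacity > segmentSize
                ? RoundUpToSegment(capacity, segmentSize)
                : capacity;
        }

        return newCapacity;
    }
}
```

With a segment size of 8, a list of length 10 that needs room for 11 elements would get capacity 16 under this reading, so the next resize does not have to reallocate the final segment.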
➡️ Changing the capacity selection algorithm for resize has been deferred to a later pull request.
src/Dependencies/Collections/SegmentedArray`1+PrivateMarshal.cs
…move need for a computation of the length
Reviewed compiler test code
* Reuse segments during SegmentedDictionary growth. Mimics changes done in #75661 to enable segment reuse in SegmentedDictionary.
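Currently, the SegmentedList class suffers from two potential inefficiencies:
1) Upon growth, it doubles the SegmentedArray size.
2) Upon growth, it reallocates and copies over the existing pages.
This PR addresses 2).
Instead, if we reuse existing segments when growing the list, we can save significant CPU and allocation costs.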
*** Benchmark.NET data ***
OLD:
NEW: