ImmutableDictionary has slowly been regressing and in version 5.0 is about 10X slower than Dictionary #47812
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
Tagging subscribers to this area: @eiriktsarpalis
|
Dictionary uses a hash map and lookup is O(1). ImmutableDictionary uses a tree and lookup is O(log N).
You mean the same operation with the same inputs has been getting slower and slower with each version of the nuget package? |
cc: @adamsitnik |
I understand, but let's compare the Dictionary and the tree (ImmutableDictionary) with one value loaded in each, so the tree doesn't have to look beyond the root. The lookup still measured 6 ns vs 48 ns respectively.
Yes, I picked version 1.5 to run the same benchmarks, and the values were quite a bit lower. Below is a perf trace comparison of the two versions with similar data: Version 5.0:
Version 1.5
|
Admission time: I only caught the fact that 1.5 was faster by accident, because BenchmarkDotNet defaulted to it and the results did not match the CPS numbers. I then switched to version 5, which matches internal usage, and the 10x differences above are based on that. :) |
It's not going to be possible to have a meaningful improvement to the read access performance of |
Thanks @sharwell. It looks like there is some overlap with my FastImmutableDictionary implementation. My goal was to cover three options: (a) continue using ImmutableDictionary, but try to speed it up with bug fixes like this one; (b) an ImmutableDictionary that should not accept writes after it has been built (ImmutableSegmentedDictionary seems to achieve the same goal); (c) replace it with a read-only collection, since immutability is not really leveraged (I believe ImmutableArrayDictionary was your recommendation for this). Do you happen to have benchmarks for your collections that I can use to compare against my implementation and choose based on usage? In the above case, the reason I believe there is still some room for improvement in spite of the AVL tree backing is that the slowness of the immutable Get seems to come from many layers of method delegation, compared to Dictionary's FindEntry. In the Immutable version, get_Item calls into TryGetValue, which then delegates to a static version of TryGetValue. The top two layers before the static call have very little logic, but they took more than 25% of the CPU time. The static TryGetValue alone then took more than 25% of the CPU time, while the overall time inside the two TryGetValue methods it calls is less than 15%. The majority of the overhead is in the function calls themselves, so removing the extra delegations or asking the compiler/JIT to do more method inlining may be a viable alternative that would not involve any changes to the AVL tree itself. |
@arkalyanms, I took a look at your benchmark, and noted you're running against netcoreapp2.1... that's old news 😄 Can you try something newer? Here's what I see for GetValue / 1.
|
Ah nice! So there is a lot of goodness I am missing out on. I’ll update and circle back. Thanks @stephentoub |
Alright, here are the net5.0 numbers. The delta is not as wide anymore (about 3-4x for Get, and AddRange varies drastically based on data). For the cases that have data subject to change, I am inclined to switch over to a more custom ImmutableDictionary implementation I have or to the TreeDictionary, and to Dictionary if immutability is not required. I am not sure if we still want to hold onto this bug to investigate some smaller oddities like the GetValue at size 1 (the increase in times with size shows there is some level of tree traversal cost, but there is also an additional overhead cost in ImmutableDictionary; it is much lower than in netcoreapp2.1, but it's still there). GetValue:
CreateRange (Dictionary with constructor/DictionaryCreateRange and iterate over range and add/DictionaryWithCreateRange)
|
Honestly these numbers don't seem particularly surprising to me. There have been proposals like the one in #14477 for replacing the AVL tree implementation with something faster. |
I'm going to close this issue, since it is not very actionable. I would recommend continuing the conversation in #14477 on potential perf improvements to the internal implementation of immutable hashtables. |
I am on the CPS team in Visual Studio and am benchmarking bottlenecks that affect solution load. A big contributor to the slowdown is Immutable collections. Instead of switching out the collection, I started benchmarking across versions of ImmutableDictionary, for example, and there is a steady regression going all the way back to version 1.5 of the dictionary.
Yes, I understand Immutable collections will be slower, but there are clear indications of a perf bug here, especially when you look at the GetValue numbers.
Some benchmark numbers below:
GetValue:
AddRange (SetRange is pretty similar)
Benchmark repo here - https://devdiv.visualstudio.com/DefaultCollection/Personal/_git/arkalyan
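For readers without access to the repo above, the following is a minimal sketch of what a comparable GetValue benchmark might look like. It is not the benchmark used for the numbers in this issue. It assumes the BenchmarkDotNet and System.Collections.Immutable NuGet packages are referenced; the class name `LookupBenchmarks`, the sizes, and the choice of lookup key are illustrative assumptions.

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical benchmark class; names and sizes are assumptions for illustration.
public class LookupBenchmarks
{
    // Size 1 keeps the immutable tree at a single node, larger sizes add traversal.
    [Params(1, 100, 10_000)]
    public int Size { get; set; }

    private Dictionary<int, int> _dictionary = new Dictionary<int, int>();
    private ImmutableDictionary<int, int> _immutable = ImmutableDictionary<int, int>.Empty;
    private int _key;

    [GlobalSetup]
    public void Setup()
    {
        // Both collections hold the same keys 0..Size-1, each mapped to itself.
        _dictionary = Enumerable.Range(0, Size).ToDictionary(i => i, i => i);
        _immutable = _dictionary.ToImmutableDictionary();
        _key = Size / 2; // always a key that exists
    }

    [Benchmark(Baseline = true)]
    public int DictionaryGet() => _dictionary[_key];

    [Benchmark]
    public int ImmutableGet() => _immutable[_key];
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<LookupBenchmarks>();
}
```

BenchmarkDotNet expects an optimized build, so a run would typically use `dotnet run -c Release`. The target framework and the System.Collections.Immutable package version both affect the results, so it is worth pinning them explicitly when comparing versions.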
Adding more thoughts from folks who have observed a similar slowness:
The slowness of the immutable Get actually comes from many layers of method delegation.
In Dictionary<TKey, TValue>, get_Item is almost a single method (it only calls FindEntry); TryGetValue has a similar but separate implementation.
On the other side, in the Immutable version, get_Item calls into TryGetValue, which then delegates to a static version of TryGetValue. The top two layers have very little logic, but they took more than 25% of the CPU time.
Then, the static TryGetValue alone took more than 25% of the CPU time, while the overall time inside the two TryGetValue methods it calls is less than 15%. I think that because the core of this logic is so small, the majority of the overhead is the function calls themselves. Maybe one thing to try is to inline more of these functions, either by removing some of the extra delegation or by asking the compiler/JIT to do more method inlining?
http://index/?leftProject=System.Collections.Immutable&leftSymbol=hghocbfkyomt&file=System%5CCollections%5CImmutable%5CImmutableDictionary_2.cs
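To illustrate the delegation pattern described above, here is a minimal sketch of an indexer that forwards to an instance TryGetValue, which in turn forwards to a static TryGetValue. `LayeredLookup` is a hypothetical type invented for this example and is not the ImmutableDictionary source; it is backed by a plain Dictionary only to stay short. The sketch marks the two thin layers with `MethodImplOptions.AggressiveInlining`, which is a hint that the JIT may or may not honor.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical type for illustration only; not the real ImmutableDictionary code.
public sealed class LayeredLookup<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _store;

    public LayeredLookup(IDictionary<TKey, TValue> items)
    {
        _store = new Dictionary<TKey, TValue>(items);
    }

    // Layer 1: the indexer has almost no logic of its own.
    public TValue this[TKey key]
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get
        {
            if (TryGetValue(key, out TValue value))
            {
                return value;
            }
            throw new KeyNotFoundException();
        }
    }

    // Layer 2: the instance method only forwards to the static helper.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public bool TryGetValue(TKey key, out TValue value)
    {
        return TryGetValue(key, _store, out value);
    }

    // Layer 3: the static helper performs the actual lookup.
    private static bool TryGetValue(TKey key, Dictionary<TKey, TValue> store, out TValue value)
    {
        return store.TryGetValue(key, out value);
    }
}

public static class Demo
{
    public static void Main()
    {
        var lookup = new LayeredLookup<string, int>(
            new Dictionary<string, int> { ["a"] = 1 });
        Console.WriteLine(lookup["a"]); // prints 1
    }
}
```

Whether the hint changes anything in practice would need to be measured, for example by benchmarking the type with and without the attributes on the target runtime.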