Implement Hash for HashSet #21182
Comments
I would think that an implementation should be on HashMap and let HashSet derive it. Does this seem reasonable?
It's impossible to implement Hash for HashSet, as far as I know. You would need some way to deterministically order all the elements in the HashSet for hashing, in a hasher-state and insertion-order independent way. Since HashMap only requires Eq and Hash, and hashers have random state, there is no such ordering.
I suppose you could have the impl be dependent on also happening to have Ord, and either take O(n^2) time and O(1) space, or O(n log n) time and O(n) space, but both are pretty unsavoury options!
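For illustration, the O(n log n) time, O(n) space variant could look roughly like the sketch below. The helper names `hash_set_sorted` and `digest_sorted` are hypothetical, not part of std, and `DefaultHasher` is used only to produce a value that can be compared.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical helper (not a std API): feeds the elements of `set`
/// into `state` in sorted order, so the result does not depend on
/// the set's internal iteration order.
fn hash_set_sorted<T: Ord + Hash, H: Hasher>(set: &HashSet<T>, state: &mut H) {
    // O(n) extra space for the temporary vector of references.
    let mut items: Vec<&T> = set.iter().collect();
    // O(n log n) time; requires the extra `Ord` bound.
    items.sort_unstable();
    // Include the length so differently sized sets are kept apart.
    state.write_usize(items.len());
    for item in items {
        item.hash(state);
    }
}

/// Hypothetical convenience wrapper returning a single u64.
fn digest_sorted<T: Ord + Hash>(set: &HashSet<T>) -> u64 {
    let mut hasher = DefaultHasher::new();
    hash_set_sorted(set, &mut hasher);
    hasher.finish()
}
```

Two sets holding the same elements then produce the same digest regardless of insertion order, at the cost of the `Ord` bound and a temporary allocation per call.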
I suggest using EnumSet or TrieSet in collect-rs if you need some order-dependent information like this.
Note for nitpickers: You can synthesize a Hash using your favourite commutative function (+, *, XOR), but this would substantially undermine the security properties provided by SipHash. We obtain security through SipHash by mixing values through the hasher, which is fundamentally order-dependent. This ensures that e.g. "tac" and "cat" don't hash to the same thing. However, of course, for a HashSet you would want [...] If you do this through a commutative function [...] Now, of course, this is a less fundamental flaw than this flaw existing for all keys, especially since a HashMap or HashSet as a key is a really niche thing. So perhaps this is an acceptable problem. Finally, because we use Robin Hood hashing, we're extra vulnerable to bad hashing, because even "clumpy" distributions are a performance hazard.
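A minimal sketch of the commutative approach, assuming wrapping addition as the combining function and a fresh `DefaultHasher` per element. The name `hash_set_commutative` is hypothetical, and the security caveat in the comment above applies to this sketch in full.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical helper (not a std API): hashes each element on its
/// own, then combines the per-element hashes with wrapping addition.
/// Addition is commutative, so iteration order cannot affect the result.
fn hash_set_commutative<T: Hash>(set: &HashSet<T>) -> u64 {
    let mut acc: u64 = 0;
    for item in set {
        // Each element gets a fresh hasher, so elements are never
        // mixed with one another through the hasher state.
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        acc = acc.wrapping_add(hasher.finish());
    }
    acc
}
```

This needs only `Hash` on the elements and no extra allocation, but the final value is just a sum of independent element hashes, which is the weakening the comment describes.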
While the elements don't necessarily have an ordering, their hashes do. This could be implemented by obtaining the hashes of all elements, sorting them, and then iterating through them to compute a hash for the set.
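The sort-the-hashes idea might be sketched as follows. The name `hash_set_by_sorted_hashes` is hypothetical, and using `DefaultHasher` for both the per-element and the final hash is an assumption made to keep the example self-contained.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical helper (not a std API): hashes every element
/// separately, sorts the resulting u64 values, then hashes the
/// sorted list. Only `Hash` is required on the elements, not `Ord`.
fn hash_set_by_sorted_hashes<T: Hash>(set: &HashSet<T>) -> u64 {
    let mut hashes: Vec<u64> = set
        .iter()
        .map(|item| {
            let mut hasher = DefaultHasher::new();
            item.hash(&mut hasher);
            hasher.finish()
        })
        .collect();
    // Sorting the hashes yields a canonical order that is independent
    // of how the set happens to iterate.
    hashes.sort_unstable();
    let mut out = DefaultHasher::new();
    // Hashing a Vec writes its length followed by each element.
    hashes.hash(&mut out);
    out.finish()
}
```

Unlike a commutative combine, the sorted hashes are mixed through the final hasher in sequence, but the approach still costs O(n log n) time and O(n) space per call.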
#91837 implemented it for a rustc-internal hash set by using commutative operations.
This doesn't work but should be possible somehow. @chris-morgan @huonw @Aatch and others found a method to implement one: IRC log. Exact solution link here