Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

impls of PartialEq<str> etc. violate Hash consistency #230

Closed
mzabaluev opened this issue Oct 21, 2018 · 12 comments
Closed

impls of PartialEq<str> etc. violate Hash consistency #230

mzabaluev opened this issue Oct 21, 2018 · 12 comments
Milestone

Comments

@mzabaluev
Copy link
Contributor

This test currently fails:

    use bytes::Bytes;

    use std::{
        collections::hash_map::DefaultHasher,
        hash::{Hash, Hasher},
    };

    fn hash<T: Hash>(v: &T) -> u64 {
        let mut sh = DefaultHasher::new();
        v.hash(&mut sh);
        sh.finish()
    }

    #[test]
    fn hash_fail() {
        let s = "test string";
        let b = Bytes::from(s);
        assert_eq!(b, s);
        assert!(
            hash(&b) == hash(&s),
            "Hashes differ even though values compare as equal",
        );
    }

For things to be proper, this test should not even compile due to lack of an equality impl.
The PartialEq implementations that make the equality operators work between a Bytes value and a string, while convenient, allow two logically equal values to have different hashes. This violates the contract of Hash and therefore is a no-no. The standard library avoids pairing strings and byte slices in equality and comparison operators, and even burns extra CPU to make a string's Hash output pointedly different from that of an equivalent byte slice.

This issue does not affect HashTable because there is no Borrow impl to cross between the key types. But the inconsistency is lurking and can bite even in generic code using Hash by the book.

@mzabaluev mzabaluev changed the title impls of PartialEq<str> etc. violate Hash integrity impls of PartialEq<str> etc. violate Hash consistency Oct 21, 2018
@seanmonstar
Copy link
Member

I think you may be referring to the relationship of Eq and Hash, which is different than PartialEq. It's fine to implement that for different types, as a convenience, but if the Hash needed to match for all those, it wouldn't be nearly as useful.

@mzabaluev
Copy link
Contributor Author

Still, it's notable that the standard library does not mix str and [u8] in PartialEq and PartialOrd impls. Most (all?) cross-type comparison trait implementations there have the types share the same Borrow target, and besides conceptual separation, it may be also important for future reimplementation of the binary operator traits with blanket default impls abstracting over Borrow, to enable auto-implementation for any foreign types Borrowing to the same type.

@carllerche
Copy link
Member

@mzabaluev what do you propose should change?

@mzabaluev
Copy link
Contributor Author

@carllerche I think the PartialEq/PartialOrd impls with string operand types should be deprecated in bytes 0.5, to be removed in the future.

@carllerche carllerche added this to the v0.5 milestone Nov 26, 2018
@carllerche
Copy link
Member

This seems reasonable to me. I flagged this issue as part of the v0.5 milestone.

@mzabaluev
Copy link
Contributor Author

I think the PartialEq/PartialOrd impls with string operand types should be deprecated

Unfortunately, deprecation of an impl does not work in current Rust. So it should either be a hard break in 0.5, or some shim methods to show a deprecation warning meant for the impls.

@seanmonstar
Copy link
Member

I personally would prefer not to remove the impls, they are useful and I don't believe they cause an issue in practice.

@carllerche
Copy link
Member

@seanmonstar should we not stick w/ what std does?

@mzabaluev
Copy link
Contributor Author

mzabaluev commented Nov 26, 2018

For full disclosure, I have a pending RFC that, if accepted, will discourage binary operator impls across types not sharing the Borrow target.

@mzabaluev
Copy link
Contributor Author

The convenience provided by the impls in question is, basically, in not having to add an .as_bytes() call to the string argument.

@seanmonstar
Copy link
Member

I've found these impls provide a nicer experience than when I'm working with Vec<u8>. When just using vectors, I find I forgot to call other_str.as_bytes(), get the compiler error, and grumble while I add what I feel is busywork. When using Bytes, that annoyance never happens.

I'm not willing to die on this hill, I just feel like it improves my experience and would be sad to see it go.

@carllerche
Copy link
Member

After discussing this, the plan is to stick w/ std. PartialEq does not require hashes to match. This requirement comes with Eq.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants