Add fast, byte-oriented `Hash` derive #2075

jswrenn · 2024-11-18T19:36:18Z

For types that are IntoBytes, we could derive optimized implementations of Hash.

Adding derive(Hash) to a struct generates a recursive descent into the fields of the struct. Instead, for a type that is IntoBytes, we could relatively easily have derive(zerocopy::Hash) expand to something like this:

impl core::hash::Hash for UserTy
where
    UserTy: ::zerocopy::IntoBytes,
{
    fn hash<H: core::hash::Hasher>(&self, state: &mut H) {
        core::hash::Hasher::write(
            state,
            ::zerocopy::IntoBytes::as_bytes(self)
        )
    }

    fn hash_slice<H: core::hash::Hasher>(data: &[Self], state: &mut H)
    where
        Self: Sized,
    {
        core::hash::Hasher::write(
            state,
            ::zerocopy::IntoBytes::as_bytes(data)
        )
    }
}

Alternatively, if we run with the assumption that Hashers benefit from writing the greatest number of bytes at a time, we can leverage the known size and well-alignedness of the user type to emit calls to state.write_u128, state.write_u64, state.write_u32, and so on. A trailing DST suffix, if any, can be serialized with state.write(trailing.as_bytes()).

We should benchmark these approaches. I don't have a strong intuition for the performance characteristics of Rust's leading Hasher implementations. My guess is that we couldn't do better in the more complicated approach than a well optimized Hasher::write could do, so we're best off doing the simple thing.

The text was updated successfully, but these errors were encountered:

jswrenn · 2024-11-19T15:57:06Z

We should benchmark these approaches. I don't have a strong intuition for the performance characteristics of Rust's leading Hasher implementations. My guess is that we couldn't do better in the more complicated approach than a well optimized Hasher::write could do, so we're best off doing the simple thing.

On further investigation, I don't think there's any benefit to the more complex approach. We should assume that Hasher::write is well-optimized, as it is for:

Closes google#2075

The standard library's derive for `Hash` generates a recursive descent into the fields of the type it is applied to. This commit adds an alternative derive that generates an optimized, byte-oriented `Hash` implementation for types that implement `IntoBytes`. Instead of a recursive descent, the generated implementation makes a single call to `Hasher::write()` in both `Hash::hash()` and `Hash::hash_slice()`, feeding the hasher the bytes of the type or slice all at once. Resolves google#2075

The standard library's derive for `Hash` generates a recursive descent into the fields of the type it is applied to. This commit adds a `ByteHash` derive that generates an optimized, byte-oriented `Hash` implementation for types that implement `IntoBytes`. Instead of a recursive descent, the generated implementation makes a single call to `Hasher::write()` in both `Hash::hash()` and `Hash::hash_slice()`, feeding the hasher the bytes of the type or slice all at once. Resolves google#2075

The standard library's derive for `Hash` generates a recursive descent into the fields of the type it is applied to. This commit adds a `ByteHash` derive that generates an optimized, byte-oriented `Hash` implementation for types that implement `IntoBytes`. Instead of a recursive descent, the generated implementation makes a single call to `Hasher::write()` in both `Hash::hash()` and `Hash::hash_slice()`, feeding the hasher the bytes of the type or slice all at once. Resolves #2075

jswrenn · 2025-01-03T16:49:05Z

Partially completed in #2159; still needs a port to 0.9.

joshlf · 2025-01-30T20:11:41Z

Given that #2159 is labeled needs-backport and given that our 0.9 release roadmap lists needs-backport as a blocker, I think we can close this issue. Reopen if you disagree.

jswrenn added the enhancement New feature or request label Nov 18, 2024

joshlf assigned jswrenn Nov 18, 2024

jswrenn assigned GJason88 Nov 18, 2024

max-heller added a commit to max-heller/zerocopy that referenced this issue Dec 14, 2024

Optimized derive(Hash) for IntoBytes types

cf08fc0

Closes google#2075

max-heller mentioned this issue Dec 14, 2024

[derive] Implement a IntoBytes-based Hash derive #2159

Merged

joshlf closed this as completed Jan 30, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add fast, byte-oriented `Hash` derive #2075

Add fast, byte-oriented `Hash` derive #2075

jswrenn commented Nov 18, 2024 •

edited

Loading

jswrenn commented Nov 19, 2024

jswrenn commented Jan 3, 2025

joshlf commented Jan 30, 2025

Add fast, byte-oriented Hash derive #2075

Add fast, byte-oriented Hash derive #2075

Comments

jswrenn commented Nov 18, 2024 • edited Loading

jswrenn commented Nov 19, 2024

jswrenn commented Jan 3, 2025

joshlf commented Jan 30, 2025

Add fast, byte-oriented `Hash` derive #2075

Add fast, byte-oriented `Hash` derive #2075

jswrenn commented Nov 18, 2024 •

edited

Loading