Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[derive] Implement a IntoBytes-based Hash derive #2159

Merged
merged 1 commit into from
Dec 24, 2024

Conversation

max-heller
Copy link

The standard library's derive for Hash generates a recursive descent into the fields of the type it is applied to. This commit adds an alternative derive that generates an optimized, byte-oriented Hash implementation for types that implement IntoBytes. Instead of a recursive descent, the generated implementation makes a single call to Hasher::write() in both Hash::hash() and Hash::hash_slice(), feeding the hasher the bytes of the type or slice all at once.

Resolves #2075

@codecov-commenter
Copy link

codecov-commenter commented Dec 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.42%. Comparing base (2c8ef74) to head (05f4718).
Report is 1 commits behind head on v0.8.x.

Additional details and impacted files
@@           Coverage Diff           @@
##           v0.8.x    #2159   +/-   ##
=======================================
  Coverage   87.42%   87.42%           
=======================================
  Files          16       16           
  Lines        6115     6115           
=======================================
  Hits         5346     5346           
  Misses        769      769           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jswrenn
Copy link
Collaborator

jswrenn commented Dec 15, 2024

Thank you for the contribution! I'm in awe you managed to pass our CI on your first commit! At a glance, this looks great; I'll give a more thorough review tomorrow.

I think the right choice is to advocate for the usage pattern of derive(zerocopy::Hash), and thus preserve the property that derive(Trait) derives Trait. However, we need to be mindful that this could conflict with other imports of core::hash::Hash if the user writes use zerocopy_derive::*.

The Hash macro (but not trait) is in the prelude. Would a subsequent glob import from zerocopy_derive overwrite it?

@jswrenn
Copy link
Collaborator

jswrenn commented Dec 18, 2024

Unfortunately, this can lead to errors. This code, for example:

use zerocopy::*;

#[derive(Hash)]
struct Foo;

...produces this error:

error[E0659]: `Hash` is ambiguous
   --> examples/foo.rs:3:10
    |
3   | #[derive(Hash)]
    |          ^^^^ ambiguous name
    |
    = note: ambiguous because of a conflict between a name from a glob import and an outer scope during import or macro resolution
note: `Hash` could refer to the derive macro imported here
   --> examples/foo.rs:1:5
    |
1   | use zerocopy::*;
    |     ^^^^^^^^^^^
    = help: consider adding an explicit import of `Hash` to disambiguate
    = help: or use `crate::Hash` to refer to this derive macro unambiguously
note: `Hash` could also refer to the derive macro defined here
   --> /home/ubuntu/.rustup/toolchains/nightly-2024-11-06-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/prelude/mod.rs:151:13
    |
151 |     pub use super::v1::*;

Unfortunately, we encourage users to glob-import from zerocopy_derive. Given this, we should probably follow bytemuck's lead and call it something like ByteHash.

@max-heller max-heller changed the title [derive] Implement a IntoBytes-based derive(Hash) [derive] Implement a IntoBytes-based Hash derive Dec 23, 2024
@max-heller
Copy link
Author

max-heller commented Dec 23, 2024

Unfortunately, this can lead to errors. This code, for example:

use zerocopy::*;

#[derive(Hash)]
struct Foo;

...produces this error:

error[E0659]: `Hash` is ambiguous
   --> examples/foo.rs:3:10
    |
3   | #[derive(Hash)]
    |          ^^^^ ambiguous name
    |
    = note: ambiguous because of a conflict between a name from a glob import and an outer scope during import or macro resolution
note: `Hash` could refer to the derive macro imported here
   --> examples/foo.rs:1:5
    |
1   | use zerocopy::*;
    |     ^^^^^^^^^^^
    = help: consider adding an explicit import of `Hash` to disambiguate
    = help: or use `crate::Hash` to refer to this derive macro unambiguously
note: `Hash` could also refer to the derive macro defined here
   --> /home/ubuntu/.rustup/toolchains/nightly-2024-11-06-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/prelude/mod.rs:151:13
    |
151 |     pub use super::v1::*;

Unfortunately, we encourage users to glob-import from zerocopy_derive. Given this, we should probably follow bytemuck's lead and call it something like ByteHash.

That's unfortunate, I do like to preserve derive(Trait) -> impl Trait as you mentioned. I renamed it to ByteHash to be consistent with bytemuck. I could also see a case for HashBytes, I'm not sure if there's much other precedent for naming these specialized derives.

let type_ident = &ast.ident;
let (impl_generics, ty_generics, where_clause) = ast.generics.split_for_impl();
let where_predicates = where_clause.map(|clause| &clause.predicates);
Ok(quote! {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last (I think) nit: Could you leave a comment why deferring to impl_block isn't appropriate here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

The standard library's derive for `Hash` generates a recursive descent
into the fields of the type it is applied to. This commit adds a
`ByteHash` derive that generates an optimized, byte-oriented `Hash`
implementation for types that implement `IntoBytes`. Instead of a
recursive descent, the generated implementation makes a single call
to `Hasher::write()` in both `Hash::hash()` and `Hash::hash_slice()`,
feeding the hasher the bytes of the type or slice all at once.

Resolves google#2075
Copy link
Collaborator

@jswrenn jswrenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me! Thank you for the contribution!

@jswrenn jswrenn enabled auto-merge December 24, 2024 16:02
@jswrenn jswrenn added this pull request to the merge queue Dec 24, 2024
@jswrenn jswrenn removed this pull request from the merge queue due to a manual request Dec 24, 2024
@jswrenn jswrenn added this pull request to the merge queue Dec 24, 2024
Merged via the queue into google:v0.8.x with commit 681561c Dec 24, 2024
87 checks passed
@jswrenn jswrenn mentioned this pull request Jan 29, 2025
@max-heller max-heller deleted the derive-hash branch February 1, 2025 16:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants