feature(SourceId): use stable hash from rustc-stable-hash #14917

Merged
merged 2 commits into from
Dec 11, 2024

weihanglo
Member

@weihanglo weihanglo commented Dec 10, 2024

What does this PR try to resolve?

This helps -Ztrim-paths build a stable cross-platform path for the
registry and git sources. Source files can then be found at the same
path when debugging.
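
To illustrate what "stable" means here, below is a minimal sketch of a hasher whose output does not depend on the host's pointer width or byte order. It is not the algorithm from rustc-stable-hash: FNV-1a is used only as a simple stand-in, and the names `PortableHasher` and `portable_hash` are hypothetical. The point is the normalization step: the default `Hasher` integer methods feed native-endian bytes, and `usize` has a platform-dependent width, so both are overridden.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in hasher (not rustc-stable-hash): FNV-1a, 64-bit,
/// with every integer normalized to a fixed width and little-endian order.
struct PortableHasher(u64);

impl PortableHasher {
    fn new() -> Self {
        PortableHasher(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for PortableHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV-1a 64-bit prime
        }
    }

    // The default implementations use native-endian bytes; force little-endian.
    fn write_u16(&mut self, i: u16) {
        self.write(&i.to_le_bytes());
    }
    fn write_u32(&mut self, i: u32) {
        self.write(&i.to_le_bytes());
    }
    fn write_u64(&mut self, i: u64) {
        self.write(&i.to_le_bytes());
    }
    fn write_u128(&mut self, i: u128) {
        self.write(&i.to_le_bytes());
    }

    // `usize` is 4 bytes on 32-bit targets and 8 bytes on 64-bit targets;
    // always widen to 64 bits so both feed the same bytes.
    fn write_usize(&mut self, i: usize) {
        self.write_u64(i as u64);
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hashes any `Hash` value with the portable hasher above.
fn portable_hash<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = PortableHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Placeholder URL, used only for illustration.
    let url = "https://example.com/index";
    println!("{:016x}", portable_hash(url));
}
```

The signed integer methods need no override because their default implementations delegate to the unsigned ones.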

It also helps cache the registry index once for all platforms,
for example the use case in #14795
(although they should use cargo vendor instead, IMO).

Some caveats:

  • Newer cargo will need to re-download files for global caches
    (index files, git/registry sources).
    The old cache is still kept and used when running with older versions of cargo.
  • Absolute paths on Windows are not really covered by the "cross-platform" hash,
    because path prefix components like C: are always there.
    That means hashes of some source kinds,
    like local registry and local path,
    are not going to be truly stable across platforms.
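
The second caveat can be seen with the standard library alone. The sketch below (the helper `has_prefix` and the example paths are hypothetical) checks whether a path begins with a prefix component. On Windows an absolute path such as `C:\registries\local` starts with `Component::Prefix`, while on other targets std never produces that component, so the hashed input differs between platforms no matter how stable the hash function is.

```rust
use std::path::{Component, Path};

/// Returns true if the path starts with a Windows-style prefix such as `C:`.
/// On non-Windows targets std never yields `Component::Prefix`.
fn has_prefix(path: &Path) -> bool {
    matches!(path.components().next(), Some(Component::Prefix(_)))
}

fn main() {
    // Hypothetical local registry locations, one per platform family.
    let local_registry = if cfg!(windows) {
        r"C:\registries\local"
    } else {
        "/registries/local"
    };
    println!(
        "{} starts with a prefix component: {}",
        local_registry,
        has_prefix(Path::new(local_registry))
    );
}
```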

Security concern

There might be hash collisions if you have two registries under the same
domain. This won't happen to crates.io, as the infra would have to
intentionally put another registry on index.crates.io to collide.
We don't consider this an actual threat model, so we are not going to
use any cryptographically secure hash algorithm like BLAKE3.

The current unstable SipHash is no better in this respect.
We might switch to a cryptographically secure one if needed.
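
As a rough sketch of why registries under the same domain matter, assume (this naming scheme is an assumption for illustration, not taken from this PR) that a cache directory is named `<host>-<hash>`, with the hash computed over the full source URL. Two registries on the same host then share the host part and are told apart only by the 64-bit hash. `cache_dir_name` is a hypothetical helper, and std's `DefaultHasher` is just a placeholder hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical naming scheme: `<host>-<16 hex digits of a 64-bit hash>`.
/// `DefaultHasher` is a placeholder, not the hash this PR adopts.
fn cache_dir_name(host: &str, full_url: &str) -> String {
    let mut hasher = DefaultHasher::new();
    full_url.hash(&mut hasher);
    format!("{}-{:016x}", host, hasher.finish())
}

fn main() {
    // Two made-up registries served from the same host.
    let a = cache_dir_name("example.com", "https://example.com/registry-a");
    let b = cache_dir_name("example.com", "https://example.com/registry-b");
    // Both names start with "example.com-"; only the hash suffix separates them,
    // so a hash collision would map both registries to the same directory.
    println!("{a}\n{b}");
}
```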

See also #13171 (comment)

How should we test and review this PR?

We have an FCP in #14795 (comment).

This PR implements the proposal.
The path-length concern in #14795 (comment) is automatically addressed
because we don't need a cryptographically secure hash for now.

Additional information

See more information and benchmark results in #14116.

@rustbot
Collaborator

rustbot commented Dec 10, 2024

r? @ehuss

rustbot has assigned @ehuss.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-cache-messages Area: caching of compiler messages A-layout Area: target output directory layout, naming, and organization A-rebuild-detection Area: rebuild detection and fingerprinting S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 10, 2024
@weihanglo weihanglo changed the title from "refactor: make room for adapting rustc-stable-hasher" to "feature(SourceId): use stable hash from rustc-stable-hash" Dec 10, 2024
Contributor

@epage epage left a comment

FCP in #14795. We likely don't need to wait out the 10-day waiting period. If something comes up, we have time to revert before the next release.

@epage epage added this pull request to the merge queue Dec 11, 2024
@weihanglo weihanglo added the A-caching Area: caching of dependencies, repositories, and build artifacts label Dec 11, 2024
Merged via the queue into rust-lang:master with commit 94d274d Dec 11, 2024
22 checks passed
@weihanglo weihanglo deleted the stable-hash branch December 11, 2024 20:07
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 14, 2024
Update cargo

18 commits in 20a443231846b81c7b909691ec3f15eb173f2b18..7847c03965260b5dcc8d93218d6af295a717abb6
2024-12-06 21:56:56 +0000 to 2024-12-13 18:06:39 +0000
- fix(base): Support bases in patches in virtual manifests  (rust-lang/cargo#14931)
- fix(resolver): Report invalid index entries  (rust-lang/cargo#14927)
- feat: Implement `--depth workspace` for `cargo tree` command (rust-lang/cargo#14928)
- fix(resolver): In errors, show rejected versions over alt versions (rust-lang/cargo#14923)
- fix: emit_serialized_unit_graph uses the configured shell (rust-lang/cargo#14926)
- fix(script): Don't override the release profile (rust-lang/cargo#14925)
- feature(SourceId): use stable hash from rustc-stable-hash (rust-lang/cargo#14917)
- fix(resolver): Don't report all versions as rejected  (rust-lang/cargo#14921)
- fix(resolver): Report unmatched versions, rather than saying no package (rust-lang/cargo#14897)
- fix(build-rs): Implicitly report rerun-if-env-changed for input (rust-lang/cargo#14911)
- a faster hash for ActivationsKey (rust-lang/cargo#14915)
- feat(build-script): Pass CARGO_CFG_FEATURE  (rust-lang/cargo#14902)
- fix(build-rs): Correctly refer to the item in assert (rust-lang/cargo#14913)
- chore: update auto-label to include build-rs crate (rust-lang/cargo#14912)
- refactor: use Path::push to construct remap-path-prefix (rust-lang/cargo#14908)
- feat(build-rs): Add the 'error' directive (rust-lang/cargo#14910)
- fix(build-std): determine root crates by target spec `std:bool` (rust-lang/cargo#14899)
- SemVer: Add section on RPIT capturing (rust-lang/cargo#14849)
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 14, 2024
Update cargo

19 commits in 20a443231846b81c7b909691ec3f15eb173f2b18..769f622e12db0001431d8ae36d1093fb8727c5d9
2024-12-06 21:56:56 +0000 to 2024-12-14 04:27:35 +0000
- test(build-std): dont require rustup (rust-lang/cargo#14933)
- fix(base): Support bases in patches in virtual manifests  (rust-lang/cargo#14931)
- fix(resolver): Report invalid index entries  (rust-lang/cargo#14927)
- feat: Implement `--depth workspace` for `cargo tree` command (rust-lang/cargo#14928)
- fix(resolver): In errors, show rejected versions over alt versions (rust-lang/cargo#14923)
- fix: emit_serialized_unit_graph uses the configured shell (rust-lang/cargo#14926)
- fix(script): Don't override the release profile (rust-lang/cargo#14925)
- feature(SourceId): use stable hash from rustc-stable-hash (rust-lang/cargo#14917)
- fix(resolver): Don't report all versions as rejected  (rust-lang/cargo#14921)
- fix(resolver): Report unmatched versions, rather than saying no package (rust-lang/cargo#14897)
- fix(build-rs): Implicitly report rerun-if-env-changed for input (rust-lang/cargo#14911)
- a faster hash for ActivationsKey (rust-lang/cargo#14915)
- feat(build-script): Pass CARGO_CFG_FEATURE  (rust-lang/cargo#14902)
- fix(build-rs): Correctly refer to the item in assert (rust-lang/cargo#14913)
- chore: update auto-label to include build-rs crate (rust-lang/cargo#14912)
- refactor: use Path::push to construct remap-path-prefix (rust-lang/cargo#14908)
- feat(build-rs): Add the 'error' directive (rust-lang/cargo#14910)
- fix(build-std): determine root crates by target spec `std:bool` (rust-lang/cargo#14899)
- SemVer: Add section on RPIT capturing (rust-lang/cargo#14849)
@rustbot rustbot added this to the 1.85.0 milestone Dec 14, 2024
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Dec 15, 2024
Update cargo

Labels
A-cache-messages Area: caching of compiler messages A-caching Area: caching of dependencies, repositories, and build artifacts A-layout Area: target output directory layout, naming, and organization A-rebuild-detection Area: rebuild detection and fingerprinting S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.