-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stable metadata hashes across workspaces #3611
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
It seems the appveyor failure is intermittent. |
☔ The latest upstream changes (presumably #3609) made this pull request unmergeable. Please resolve the merge conflicts. |
Thanks for the PR! I think that this would also solve #3273 as well, right? My only worry here would be that we forget to use the stable hash and instead accidentally use the |
Patch looks great to me! Could you add new test cases though instead of extending existing ones? Just to be explicit about what's being tested. |
I had to modify As for #3273, I think this may be part of the solution but as of now it will still rebuild when moving a workspace. The generated files all have the same names now, but the .fingerprint directory still has absolute path references to the old directory. Thanks for the review! |
Oh right yeah that makes sense, we can just leave #3273 open. Changing that test case though does worry me. This breaks any cases like that which are sharing directories with Could we perhaps only apply this change for workspace crates? |
Yeah, it will recompile if you individually build a dependency and then build a dependent crate, sharing the same The main case I'm interested in is the top-level crate which may be at its own implicit workspace root. I don't think we should require setting a workspace to get stable output out of a single crate. I think this workspace-relative hashing will be required as part of #3273 anyway, since moving the project root moves both the crate source and the default target dir. This just takes care of the crate source component. |
Yeah this use case was definitely what workspaces were intended to solve, but I worry about projects and/or systems which may not have had a chance to update yet. I'm not really sure of the best way to gauge what sort of breakage this would cause though. We could try landing it for awhile and see if anyone complains, but that still carries some risk with it. We could also try posting on internals and ask around if anyone's still relying on this to gauge. |
cc rust-lang/rust#39786, I think this PR would solve that |
☔ The latest upstream changes (presumably #3818) made this pull request unmergeable. Please resolve the merge conflicts. |
@tylerwhall any update about this PR ? |
Switching to workspace-relative metadata hashing means that output can change based on what is considered to be the top-level crate. This test is affected by this change, but in general hashes should be more stable than before. This test verifies that a crate will not rebuild when built as a dependency of two different crates but with the same CARGO_TARGET_DIR. This currently relies on the absolute path to the crate being the same across runs. Set an explicit workspace so the hashes will be the same when using workspace-relative hashing. Setting a workspace would cause the target dir to be the same in both invocations, defeating the intent of this test, so move the target dir between runs to verify that the contents of the target dir are not dependent on its path.
Currently a crate from a path source will have its metadata hash incorporate its absolute path on the system where it is built. This always impacts top-level crates, which means that compiling the same source with the same dependencies and compiler version will generate libraries with symbol names that vary depending on where the workspace resides on the machine. For paths inside the Cargo workspace, hash their SourceId relative to the root of the workspace. Paths outside of a workspace are still hashed as absolute. This stability is important for reproducible builds as part of a larger build system that employs caching of artifacts, such as OpenEmbedded. OpenEmbedded tightly controls all inputs to a build and its caching assumes that an equivalent artifact will always result from the same set of inputs. The workspace path is not considered to be an influential input, however. For example, if Cargo is used to compile libstd shared objects which downstream crates link to dynamically, it must be possible to rebuild libstd given the same inputs and produce a library that is at least link-compatible with the original. If the build system happens to cache the downstream crates but needs to rebuild libstd and the user happens to be building in a different workspace path, currently Cargo will generate a library incompatible with the original and the downstream executables will fail at runtime on the target.
f22560b
to
eb0e4c0
Compare
I rebased the PR. @alexcrichton, I think we should try to move forward with this slight change in behavior. I don't know how to fix #3273, rust-lang/rust#39786 without removing the absolute path from the hash. |
☔ The latest upstream changes (presumably #4214) made this pull request unmergeable. Please resolve the merge conflicts. |
the travis fails was a timeout on Darwin, and now there is another rebase needed :-/ |
Egad I'm sorry I let this sit so long @tylerwhall! The recent changes look good to me, though, so let's land this and see how it goes. @bors: r+ |
📌 Commit eb0e4c0 has been approved by |
Stable metadata hashes across workspaces Currently a crate from a path source will have its metadata hash incorporate its absolute path on the system where it is built. This always impacts top-level crates, which means that compiling the same source with the same dependencies and compiler version will generate libraries with symbol names that vary depending on where the workspace resides on the machine. This is hopefully a general solution to the hack we've used in meta-rust to make dynamic linking reliable. meta-rust/meta-rust@0e6cf94 For paths inside the Cargo workspace, hash their SourceId relative to the root of the workspace. Paths outside of a workspace are still hashed as absolute. This stability is important for reproducible builds as part of a larger build system that employs caching of artifacts, such as OpenEmbedded. OpenEmbedded tightly controls all inputs to a build and its caching assumes that an equivalent artifact will always result from the same set of inputs. The workspace path is not considered to be an influential input, however. For example, if Cargo is used to compile libstd shared objects which downstream crates link to dynamically, it must be possible to rebuild libstd given the same inputs and produce a library that is at least link-compatible with the original. If the build system happens to cache the downstream crates but needs to rebuild libstd and the user happens to be building in a different workspace path, currently Cargo will generate a library incompatible with the original and the downstream executables will fail at runtime on the target.
☀️ Test successful - status-appveyor, status-travis |
Currently a crate from a path source will have its metadata hash incorporate its absolute path on the system where it is built. This always impacts top-level crates, which means that compiling the same source with the same dependencies and compiler version will generate libraries with symbol names that vary depending on where the workspace resides on the machine.
This is hopefully a general solution to the hack we've used in meta-rust to make dynamic linking reliable.
meta-rust/meta-rust@0e6cf94
For paths inside the Cargo workspace, hash their SourceId relative to the root of the workspace. Paths outside of a workspace are still hashed as absolute.
This stability is important for reproducible builds as part of a larger build system that employs caching of artifacts, such as OpenEmbedded.
OpenEmbedded tightly controls all inputs to a build and its caching assumes that an equivalent artifact will always result from the same set of inputs. The workspace path is not considered to be an influential input, however.
For example, if Cargo is used to compile libstd shared objects which downstream crates link to dynamically, it must be possible to rebuild libstd given the same inputs and produce a library that is at least link-compatible with the original. If the build system happens to cache the downstream crates but needs to rebuild libstd and the user happens to be building in a different workspace path, currently Cargo will generate a library incompatible with the original and the downstream executables will fail at runtime on the target.