-
Notifications
You must be signed in to change notification settings - Fork 442
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support third party crates with bzlmod #1910
Conversation
7d87bcc
to
eecdbdd
Compare
Interesting! I attempted turning cc @fmeum who might be interested in this since |
Thanks for the mention. I will have to read up on cargo a bit to understand in which aspects it is similar to Go. There is an ongoing effort to simplify isolation of extension usages which you may be able to take advantage of: bazelbuild/bazel#18529 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really want this kind of feature to get into rules_rust, but I'm not convinced this is the way to do it. The proposed API that end users would use in their MODULE.bazel files seems unintuitive to me.
IMO this PR is too large. I'd like to suggest splitting everything that isn't directly related to bzlmod off into separate smaller PRs that are easier to review. There is too much room for bugs if this many things are changed at once.
bzlmod/private/BUILD.bazel
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dead file.
.bazelci/presubmit.yml
Outdated
platform: ubuntu2004 | ||
working_directory: examples/bzlmod/hello_world | ||
build_targets: | ||
- "//..." | ||
ubuntu2004_examples_bzlmod_crate_universe: | ||
name: Bzlmod crate universe examples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name: Bzlmod crate universe examples | |
name: Bzlmod Crate Universe examples |
def add_annotation(k, v): | ||
if k not in annotations: | ||
annotations[k] = [] | ||
annotations[k].append(v) | ||
|
||
for file in annotation_files: | ||
for name, annotations_for_crate in json.decode(module_ctx.read(file)).items(): | ||
for annotation_for_crate in annotations_for_crate: | ||
add_annotation(name, _crate_universe_crate.annotation(**annotation_for_crate)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really wrong. Starlark doesn't have the same linters available as Python, but it's still probably a good practice to attempt to reuse as much from those as possible. In this case:
- Don't use inline
def
s. Factor those out into standalone private functions. - Avoid deep nesting. It increases cognitive complexity by a lot.
- Almost always prefer list/dict comprehensions when working with data objects. Regular
for
loops should almost exclusively be used for control flow where it's unavoidable. - Keep in mind that
dict
s have many methods to simplify operation on them.
implementation = _crate_impl, | ||
tag_classes = dict( | ||
from_cargo = _from_cargo, | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc
field unspecified.
ctx, # @unused | ||
name, | ||
data): | ||
path = tag_path.get_child(name) # buildifier: disable=uninitialized |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This warning shouldn't be disabled. Instead the function should be moved out and path
or tag_path
should be an argument into it. That requires changes to write_config_file
though which should be handled in separate reviews.
There must be a better solution to this. It's very confusing that this write_data_file
has a similar intention but a different behavior than the one in crates_vendor.bzl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to write_crates_config_file
should be factored out into a separate commit and have separate review.
version = crate["version"] | ||
|
||
# "+" isn't valid in a repo name. | ||
crate_repo_name = "%s__%s-%s" % (repo_name, name, version.replace("+", "-")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"{}-{}".format(a, b)
or "{a}-{b}".format(a, b)
. Formatting with %
reduces accessibility. It's not always possible (e.g. in some ctx.args methods) but in this case it is.
It also makes sense to follow https://bazel.build/external/extension#put_each_extension_in_a_separate_file and have a dedicated file per extension. |
@aaronmondal PTAL at #2021 I've followed your recommendations re naming, but I'm specifically not applying your suggestions about the toolchain, since although I agree with you, it's out of scope of this PR |
Hey all, Thanks for this @matts1 , I tried your PR and I have few things to share :
I perfectly understand that is a challenge to make it works with minimum impacts : for brand new project, it's very ok to use it like it is, but for existing projects, i prefer limiting the change for now, even if the implementation is not "optimized", with just your PR #2021 and without using the new Olivier |
Re your first comment, that's correct, I hadn't implemented package. I'm not entirely sure I understand what you're saying about breaking the "old" repo in BUILD file, but I suspect that this is what I alluded to earlier with Re the all_crate_deps, you're correct that I don't implement it, but there's nothing stopping it from being implemented in a future CL - it'd be relatively trivial. This PR is pretty big already, and I don't want to make it bigger by adding anything that isn't strictly required. |
Hey @matts1 Great :), so for now i can't migrate and I keep what I have, please let me know when Thanks a lot! Olivier |
Hey @matts1 I've pushed changes from alchemy_v0.20.1 to fetch back the deps.bzl from the generated files in order to use It's working by adding crates in a I unfortunately needed to declare those repos because I try to patch the |
If you can make it |
Unfortunatley, using the canonical repo as proposed don't make the difference.
IMHO:
|
Hey, |
I've pulled out some of the code from #1910 into a seperate PR, to simplify the review process. --------- Co-authored-by: scentini <[email protected]>
This commit creates the "generate" prefix for each of these functions, which generates the content of these files in a way that doesn't touch the ctx variable. This allows us to use these functions in module extensions, which is required for bazelbuild#1910.
This commit creates the "generate" prefix for each of these functions, which generates the content of these files in a way that doesn't touch the ctx variable. This allows us to use these functions in module extensions, which is required for bazelbuild#1910.
This commit creates the "generate" prefix for each of these functions, which generates the content of these files in a way that doesn't touch the ctx variable. This allows us to use these functions in module extensions, which is required for bazelbuild#1910.
This commit creates the "generate" prefix for each of these functions, which generates the content of these files in a way that doesn't touch the ctx variable. This allows us to use these functions in module extensions, which is required for #1910. --------- Co-authored-by: UebelAndre <[email protected]>
bazelbuild#1910 has a workaround for this because changing the timestamp may invalidate the repository rule. Instead, just don't do the write if the contents is identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally looks good, thanks!
I've left comments for a handful of design discussions, let's work those out an rebase, and hopefully we can get this merged soon.
Thanks again for all the work you've been putting in here!
crate.from_cargo( | ||
# This is optional, but if you need crate.annotation, then this is the | ||
# bzlmod equivalent. | ||
annotation_files = ["//:annotations.json"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some interesting trade-offs here... Is it possible to make this be an inline function call rather than an external file? (I'm not sure which we'd want to support if so, but I want to understand whether we're making API choices here or being forced by bzlmod's limitations).
The external file is nice in large repos given the lack of load
support in a MODULE.bazel
because it allows folks to spread this definition out of the MODULE.bazel
file and allow e.g. separate code ownership of annotations from the root MODULE.bazel
file.
In smaller repos, having this stuff in-line (and not needing to write a whole JSON file) is kind of nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll just remove annotations from this PR, since it's not actually required.
crate_universe/extension.bzl
Outdated
return annotations | ||
|
||
def _generate_hub_and_spokes(module_ctx, cargo_bazel, mod_path, cfg): | ||
annotations = _generate_annotations(module_ctx, cfg.annotation_files) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it makes more sense to do this processing of annotations in starlark here, or to just pass the JSON files to cargo-bazel and let it deal with them?
It feels like the starlark approach basically gives us stardoc at the expense of needing to duplicate the definitions across two languages and keep them in sync? And starlark gives us potentially earlier errors, but also potentially less control over the exact contents of the errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing annotations from this PR, will do them in a follow-up CL to minimize size of this PR
cargo_lockfile = module_ctx.path(cfg.cargo_lockfile) | ||
|
||
rendering_config = json.decode(render_config( | ||
regen_command = "Run 'cargo update [--workspace]'", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the story with this regen_command? I don't thinkcargo update
is generally something we should be recommending people run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've simplified everything and removed all unnecessary complexity. In the old mechanism, you would have a Cargo.toml, a Cargo.lock, and a bazel index.
I've made it so that the Cargo.lock is the source of truth. The bazel index is based on the Cargo.lock, and is automatically kept up to date. There's no need to repin, because repinning can be done via "cargo update".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made it so that the Cargo.lock is the source of truth. The bazel index is based on the Cargo.lock, and is automatically kept up to date.
I'm definitely excited by the idea of being able to get rid of the cargo-bazel lockfile, but I'm pretty sure there were reasons we hadn't already done this... @UebelAndre do you remember?
Some possible things that come to mind, though I'm not 100% confident in any of them:
- The
crate.spec
(via extra crates a versions) andcrate.annotation
(via features) fields can change the resolution, which means if you're not going via our logic you'll miss those parts of the process when youcargo update
. - Splicing is expensive (it involves doing a whole resolve, even if it's offline, as well as potentially running
cargo tree
many times to resolve features for platforms) - the cargo-bazel lockfile includes the results of splicing (which is why we can skip it entirely if not repinning), whereas Cargo.lock doesn't so will necessitate re-doing this work. PossiblyModule.bazel.lock
mitigates this as long as it's checked in? But if so, can we get rid of the cargo-bazel lock completely in bzlmod mode?
There's no need to repin, because repinning can be done via "cargo update".
Even if it's literally just proxying to a call to cargo update
via the fetched local toolchain, it's important that we support repinning via bazel - we have users who don't themselves have cargo
installed at all but who may need to trigger a repin.
tag_path, | ||
"--metadata", | ||
splicing_output_dir.get_child("metadata.json"), | ||
"--repin", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it unconditionally repins? This is quite different from how the non-bzlmod logic works (
rules_rust/crate_universe/private/crates_repository.bzl
Lines 44 to 86 in 8637941
# Determine whether or not to repin depednencies | |
repin = determine_repin( | |
repository_ctx = repository_ctx, | |
generator = generator, | |
lockfile_path = lockfiles.bazel, | |
config = config_path, | |
splicing_manifest = splicing_manifest, | |
cargo = cargo_path, | |
rustc = rustc_path, | |
) | |
# If re-pinning is enabled, gather additional inputs for the generator | |
kwargs = dict() | |
if repin: | |
# Generate a top level Cargo workspace and manifest for use in generation | |
metadata_path = splice_workspace_manifest( | |
repository_ctx = repository_ctx, | |
generator = generator, | |
cargo_lockfile = lockfiles.cargo, | |
splicing_manifest = splicing_manifest, | |
config_path = config_path, | |
cargo = cargo_path, | |
rustc = rustc_path, | |
) | |
kwargs.update({ | |
"metadata": metadata_path, | |
}) | |
# Run the generator | |
execute_generator( | |
repository_ctx = repository_ctx, | |
generator = generator, | |
config = config_path, | |
splicing_manifest = splicing_manifest, | |
lockfile_path = lockfiles.bazel, | |
cargo_lockfile_path = lockfiles.cargo, | |
repository_dir = repository_ctx.path("."), | |
cargo = cargo_path, | |
rustc = rustc_path, | |
# sysroot = tools.sysroot, | |
**kwargs | |
) |
What's the story with the difference here? Is this an intermediate step to making pinning conditional, or is there some reason bzlmod needs to do pinning differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the easiest way to explain this is the different workflow.
If I wanted to update anyhow
from 1.0.77
to 1.0.78
, the workflow here is:
- Update the entry in
Cargo.toml
from1.0.77
to1.0.78
- Run
cargo update --workspace
to updateCargo.lock
Since everything is transient and automatically generated, you can then just build and it works. I'm not personally a big fan of the whole CARGO_BAZEL_REPIN
method, but if people do care about it, I think it's fine to add it in a follow-up PR, but it definitely isn't needed for the MVP.
Note that if you try and build between step 1 and 2 you'll get an error saying that we need to update the lockfile, but that --locked
was passed, so it'll refuse to compile.
#1910 has a workaround for this because changing the timestamp may invalidate the repository rule. Instead, just don't do the write if the contents is identical.
Change-Id: I5dee8352f9a014b51cd36937475a5cc4765c4f66
Also separate out the sh_test into two independent tests rather than combining them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to merge this, it's a useful stand-alone step on our journey - we can discuss what the next steps are :) (and shout if you want to try to arrange a call at any point, I'm sure we can make time zones work if needed :))
This PR adds bzlmod support for third party crates via rules_rust.
It only adds some subset of crate_universe's features. Specifically, it supports:
Out of scope of this PR is: