Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support third party crates with bzlmod #1910

Merged
merged 6 commits into from
Jan 3, 2024
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
191 changes: 191 additions & 0 deletions crate_universe/extension.bzl
Original file line number Diff line number Diff line change
@@ -0,0 +1,191 @@
"""Module extension for generating third-party crates for use in bazel."""

load("@bazel_tools//tools/build_defs/repo:git.bzl", "new_git_repository")
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
load("//crate_universe/private:crates_vendor.bzl", "CRATES_VENDOR_ATTRS", "generate_config_file", "generate_splicing_manifest")
load("//crate_universe/private:generate_utils.bzl", "render_config")
load("//crate_universe/private/module_extensions:cargo_bazel_bootstrap.bzl", "get_cargo_bazel_runner")

def _generate_repo_impl(repo_ctx):
for path, contents in repo_ctx.attr.contents.items():
repo_ctx.file(path, contents)

_generate_repo = repository_rule(
implementation = _generate_repo_impl,
attrs = dict(
contents = attr.string_dict(mandatory = True),
),
)

def _generate_hub_and_spokes(module_ctx, cargo_bazel, cfg):
cargo_lockfile = module_ctx.path(cfg.cargo_lockfile)
tag_path = module_ctx.path(cfg.name)

rendering_config = json.decode(render_config(
regen_command = "Run 'cargo update [--workspace]'",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the story with this regen_command? I don't thinkcargo update is generally something we should be recommending people run?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've simplified everything and removed all unnecessary complexity. In the old mechanism, you would have a Cargo.toml, a Cargo.lock, and a bazel index.

I've made it so that the Cargo.lock is the source of truth. The bazel index is based on the Cargo.lock, and is automatically kept up to date. There's no need to repin, because repinning can be done via "cargo update".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made it so that the Cargo.lock is the source of truth. The bazel index is based on the Cargo.lock, and is automatically kept up to date.

I'm definitely excited by the idea of being able to get rid of the cargo-bazel lockfile, but I'm pretty sure there were reasons we hadn't already done this... @UebelAndre do you remember?

Some possible things that come to mind, though I'm not 100% confident in any of them:

  • The crate.spec (via extra crates a versions) and crate.annotation (via features) fields can change the resolution, which means if you're not going via our logic you'll miss those parts of the process when you cargo update.
  • Splicing is expensive (it involves doing a whole resolve, even if it's offline, as well as potentially running cargo tree many times to resolve features for platforms) - the cargo-bazel lockfile includes the results of splicing (which is why we can skip it entirely if not repinning), whereas Cargo.lock doesn't so will necessitate re-doing this work. Possibly Module.bazel.lock mitigates this as long as it's checked in? But if so, can we get rid of the cargo-bazel lock completely in bzlmod mode?

There's no need to repin, because repinning can be done via "cargo update".

Even if it's literally just proxying to a call to cargo update via the fetched local toolchain, it's important that we support repinning via bazel - we have users who don't themselves have cargo installed at all but who may need to trigger a repin.

))
config_file = tag_path.get_child("config.json")
module_ctx.file(
config_file,
executable = False,
content = generate_config_file(
module_ctx,
mode = "remote",
annotations = {},
generate_build_scripts = cfg.generate_build_scripts,
supported_platform_triples = cfg.supported_platform_triples,
generate_target_compatible_with = True,
repository_name = cfg.name,
output_pkg = cfg.name,
workspace_name = cfg.name,
generate_binaries = cfg.generate_binaries,
render_config = rendering_config,
),
)

manifests = {module_ctx.path(m): m for m in cfg.manifests}
splicing_manifest = tag_path.get_child("splicing_manifest.json")
module_ctx.file(
splicing_manifest,
executable = False,
content = generate_splicing_manifest(
packages = {},
splicing_config = "",
cargo_config = cfg.cargo_config,
manifests = {str(k): str(v) for k, v in manifests.items()},
manifest_to_path = module_ctx.path,
),
)

splicing_output_dir = tag_path.get_child("splicing-output")
cargo_bazel([
"splice",
"--output-dir",
splicing_output_dir,
"--config",
config_file,
"--splicing-manifest",
splicing_manifest,
"--cargo-lockfile",
cargo_lockfile,
])

# Create a lockfile, since we need to parse it to generate spoke
# repos.
lockfile_path = tag_path.get_child("lockfile.json")
module_ctx.file(lockfile_path, "")

cargo_bazel([
"generate",
"--cargo-lockfile",
cargo_lockfile,
"--config",
config_file,
"--splicing-manifest",
splicing_manifest,
"--repository-dir",
tag_path,
"--metadata",
splicing_output_dir.get_child("metadata.json"),
"--repin",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it unconditionally repins? This is quite different from how the non-bzlmod logic works (

# Determine whether or not to repin depednencies
repin = determine_repin(
repository_ctx = repository_ctx,
generator = generator,
lockfile_path = lockfiles.bazel,
config = config_path,
splicing_manifest = splicing_manifest,
cargo = cargo_path,
rustc = rustc_path,
)
# If re-pinning is enabled, gather additional inputs for the generator
kwargs = dict()
if repin:
# Generate a top level Cargo workspace and manifest for use in generation
metadata_path = splice_workspace_manifest(
repository_ctx = repository_ctx,
generator = generator,
cargo_lockfile = lockfiles.cargo,
splicing_manifest = splicing_manifest,
config_path = config_path,
cargo = cargo_path,
rustc = rustc_path,
)
kwargs.update({
"metadata": metadata_path,
})
# Run the generator
execute_generator(
repository_ctx = repository_ctx,
generator = generator,
config = config_path,
splicing_manifest = splicing_manifest,
lockfile_path = lockfiles.bazel,
cargo_lockfile_path = lockfiles.cargo,
repository_dir = repository_ctx.path("."),
cargo = cargo_path,
rustc = rustc_path,
# sysroot = tools.sysroot,
**kwargs
)
)

What's the story with the difference here? Is this an intermediate step to making pinning conditional, or is there some reason bzlmod needs to do pinning differently?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the easiest way to explain this is the different workflow.

If I wanted to update anyhow from 1.0.77 to 1.0.78, the workflow here is:

  1. Update the entry in Cargo.toml from 1.0.77 to 1.0.78
  2. Run cargo update --workspace to update Cargo.lock

Since everything is transient and automatically generated, you can then just build and it works. I'm not personally a big fan of the whole CARGO_BAZEL_REPIN method, but if people do care about it, I think it's fine to add it in a follow-up PR, but it definitely isn't needed for the MVP.

Note that if you try and build between step 1 and 2 you'll get an error saying that we need to update the lockfile, but that --locked was passed, so it'll refuse to compile.

"--lockfile",
lockfile_path,
])

crates_dir = tag_path.get_child(cfg.name)
_generate_repo(
name = cfg.name,
contents = {
"BUILD.bazel": module_ctx.read(crates_dir.get_child("BUILD.bazel")),
},
)

contents = json.decode(module_ctx.read(lockfile_path))

for crate in contents["crates"].values():
repo = crate["repository"]
if repo == None:
continue
name = crate["name"]
version = crate["version"]

# "+" isn't valid in a repo name.
crate_repo_name = "{repo_name}__{name}-{version}".format(
repo_name = cfg.name,
name = name,
version = version.replace("+", "-"),
)

build_file_content = module_ctx.read(crates_dir.get_child("BUILD.%s-%s.bazel" % (name, version)))
if "Http" in repo:
# Replicates functionality in repo_http.j2.
repo = repo["Http"]
http_archive(
name = crate_repo_name,
patch_args = repo.get("patch_args", None),
patch_tool = repo.get("patch_tool", None),
patches = repo.get("patches", None),
remote_patch_strip = 1,
sha256 = repo.get("sha256", None),
type = "tar.gz",
urls = [repo["url"]],
strip_prefix = "%s-%s" % (crate["name"], crate["version"]),
build_file_content = build_file_content,
)
elif "Git" in repo:
# Replicates functionality in repo_git.j2
repo = repo["Git"]
kwargs = {}
for k, v in repo["commitish"].items():
if k == "Rev":
kwargs["commit"] = v
else:
kwargs[k.lower()] = v
new_git_repository(
name = crate_repo_name,
init_submodules = True,
patch_args = repo.get("patch_args", None),
patch_tool = repo.get("patch_tool", None),
patches = repo.get("patches", None),
shallow_since = repo.get("shallow_since", None),
remote = repo["remote"],
build_file_content = build_file_content,
strip_prefix = repo.get("strip_prefix", None),
**kwargs
)
else:
fail("Invalid repo: expected Http or Git to exist for crate %s-%s, got %s" % (name, version, repo))

def _crate_impl(module_ctx):
cargo_bazel = get_cargo_bazel_runner(module_ctx)
all_repos = []
for mod in module_ctx.modules:
local_repos = []
for cfg in mod.tags.from_cargo:
if cfg.name in local_repos:
fail("Defined two crate universes with the same name in the same MODULE.bazel file. Use the name tag to give them different names.")
elif cfg.name in all_repos:
fail("Defined two crate universes with the same name in different MODULE.bazel files. Either give one a different name, or use use_extension(isolate=True)")
_generate_hub_and_spokes(module_ctx, cargo_bazel, cfg)
all_repos.append(cfg.name)
local_repos.append(cfg.name)

_from_cargo = tag_class(
doc = "Generates a repo @crates from a Cargo.toml / Cargo.lock pair",
attrs = dict(
name = attr.string(doc = "The name of the repo to generate", default = "crates"),
cargo_lockfile = CRATES_VENDOR_ATTRS["cargo_lockfile"],
manifests = CRATES_VENDOR_ATTRS["manifests"],
cargo_config = CRATES_VENDOR_ATTRS["cargo_config"],
generate_binaries = CRATES_VENDOR_ATTRS["generate_binaries"],
generate_build_scripts = CRATES_VENDOR_ATTRS["generate_build_scripts"],
supported_platform_triples = CRATES_VENDOR_ATTRS["supported_platform_triples"],
),
)

crate = module_extension(
implementation = _crate_impl,
tag_classes = dict(
from_cargo = _from_cargo,
),
)
48 changes: 31 additions & 17 deletions examples/bzlmod/hello_world/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -3,28 +3,42 @@ load("@rules_rust//rust:defs.bzl", "rust_binary", "rust_doc")
package(default_visibility = ["//visibility:public"])

rust_binary(
name = "hello_world",
name = "hello_world_transient",
srcs = ["src/main.rs"],
deps = [
"//third-party/crates:anyhow",
"@crates//:anyhow",
],
)

rust_doc(
name = "hello_world_doc",
crate = ":hello_world",
)

sh_test(
name = "hello_world_test",
srcs = ["hello_world_test.sh"],
args = [
"$(rlocationpath :hello_world)",
],
data = [
":hello_world",
],
rust_binary(
name = "hello_world_vendored",
srcs = ["src/main.rs"],
deps = [
"@bazel_tools//tools/bash/runfiles",
"//third-party/crates:anyhow",
],
)

[
rust_doc(
name = "hello_world_{}_doc".format(target),
crate = ":hello_world_{}".format(target),
)
for target in ["transient", "vendored"]
]

[
sh_test(
name = "hello_world_{}_test".format(target),
srcs = ["hello_world_test.sh"],
args = [
"$(rlocationpath :hello_world_{})".format(target),
],
data = [
":hello_world_{}".format(target),
],
deps = [
"@bazel_tools//tools/bash/runfiles",
],
)
for target in ["transient", "vendored"]
]
32 changes: 30 additions & 2 deletions examples/bzlmod/hello_world/MODULE.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -5,21 +5,49 @@ module(
version = "0.0.0",
)

bazel_dep(name = "bazel_skylib", version = "1.5.0")
bazel_dep(name = "rules_rust", version = "0.0.0")
bazel_dep(
name = "bazel_skylib",
version = "1.5.0",
)

bazel_dep(
name = "rules_rust",
version = "0.0.0",
)

local_path_override(
module_name = "rules_rust",
path = "../../..",
)

rust = use_extension("@rules_rust//rust:extensions.bzl", "rust")

rust.toolchain(edition = "2021")

use_repo(
rust,
"rust_toolchains",
)

register_toolchains("@rust_toolchains//:all")

# To do third party dependencies, you have multiple options:

# Option 1: Fully transient (Cargo.toml / Cargo.lock as source of truth).
crate = use_extension(
"@rules_rust//crate_universe:extension.bzl",
"crate",
)

crate.from_cargo(
name = "crates",
cargo_lockfile = "//third-party:Cargo.lock",
manifests = ["//third-party:Cargo.toml"],
)

use_repo(crate, "crates")
illicitonion marked this conversation as resolved.
Show resolved Hide resolved

# Option 2: Vendored crates
crate_repositories = use_extension("//third-party:extension.bzl", "crate_repositories")

use_repo(crate_repositories, "vendor__anyhow-1.0.77")
Loading