feat: implement RFC 3553 to add SBOM support #13709

Open
wants to merge 1 commit into
base: master
from

Conversation

justahero
Contributor

@justahero justahero commented Apr 5, 2024

What does this PR try to resolve?

This PR implements RFC 3553, adding support for generating SBOM precursor files for compiled artifacts in Cargo.

How should we test and review this PR?

RFC 3553 adds a new Cargo option to emit SBOM precursor files. A project can enable it either through the new Cargo config field sbom:

# .cargo/config.toml
[build]
sbom = true

or using the environment variable CARGO_BUILD_SBOM=true. The sbom option is an unstable feature and requires the -Zsbom flag to enable it.
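The interaction between the config field, the environment variable, and the unstable flag can be sketched as follows. This is only an illustration: `sbom_requested` and its parameters are hypothetical names, not Cargo's API. It assumes Cargo's usual rule that an environment variable takes precedence over the config file, and that the option is ignored without `-Zsbom`.

```rust
/// Hypothetical helper (not part of Cargo): decides whether SBOM precursor
/// files would be emitted.
/// - `unstable_flag` stands for `-Zsbom` on the command line
/// - `config_value` stands for `build.sbom` in `.cargo/config.toml`
/// - `env_value` stands for the value of `CARGO_BUILD_SBOM`
fn sbom_requested(
    unstable_flag: bool,
    config_value: Option<bool>,
    env_value: Option<&str>,
) -> bool {
    // Assumption: the environment variable overrides the config file value.
    let from_env = env_value.and_then(|v| v.parse::<bool>().ok());
    let requested = from_env.or(config_value).unwrap_or(false);
    // Without the unstable flag the option has no effect.
    unstable_flag && requested
}
```

With this sketch, `sbom_requested(false, Some(true), None)` is `false`, matching the requirement that `-Zsbom` must be passed.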

Check out this branch & compile Cargo. Pick a Cargo project to test it on, then run:

CARGO_BUILD_SBOM=true <path/to/compiled/cargo>/target/debug/cargo build -Zsbom

All generated *.cargo-sbom.json files are located in the target folder alongside their artifacts. To list all generated files use:

find ./target -name "*.cargo-sbom.json"

then check their content. To see the current output format, see these examples.
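For readers who prefer to do the same search from Rust, for example inside a test, here is a minimal sketch using only the standard library. The function name `find_sbom_files` is an assumption made for illustration; it is not something the PR provides.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every file below `dir` whose file name ends in
/// `.cargo-sbom.json`. Hypothetical helper, equivalent in spirit to the
/// `find` command above.
fn find_sbom_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Descend into subdirectories such as `debug/examples`.
            found.extend(find_sbom_files(&path)?);
        } else if path
            .file_name()
            .and_then(|name| name.to_str())
            .map_or(false, |name| name.ends_with(".cargo-sbom.json"))
        {
            found.push(path);
        }
    }
    // Sort so the result does not depend on directory iteration order.
    found.sort();
    Ok(found)
}
```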

What does the PR not solve?

The PR leaves some tasks open that are either out of scope or should be done in follow-up PRs.

Additional information

There are a few things that I would like to get feedback on; in particular, the generated JSON format is not final. Currently it holds the information listed in RFC 3553, but it could be further enriched with information only available during builds.

During the implementation a number of questions arose:

  • Should the graph be packages or crates?
    • The unit graph that the SBOM is based on consists of units, but the current algorithm combines units within the same package.
    • Artifact dependencies may impact this.
  • Which outputs should get SBOM files?
    • Currently: executables (including examples and tests), dylib, cdylib, staticlib
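The current selection rule can be sketched in isolation. The enums below are simplified stand-ins for Cargo's internal `TargetKind` and `CrateType` (the real types carry more data), and `gets_sbom` is a hypothetical name; the logic mirrors the `match` on `unit.target.kind()` shown later in the review.

```rust
/// Simplified stand-in for Cargo's `CrateType` (assumption: only the
/// variants needed for this illustration).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum CrateType {
    Lib,
    Rlib,
    Dylib,
    Cdylib,
    Staticlib,
    ProcMacro,
}

/// Simplified stand-in for Cargo's `TargetKind`.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum TargetKind {
    Lib(Vec<CrateType>),
    Bin,
    Test,
    Bench,
    ExampleLib(Vec<CrateType>),
    ExampleBin,
    CustomBuild,
}

/// Returns true when a target of this kind would get an SBOM file.
fn gets_sbom(kind: &TargetKind) -> bool {
    match kind {
        // Libraries only qualify when they produce a linkable final artifact.
        TargetKind::Lib(crate_types) | TargetKind::ExampleLib(crate_types) => {
            crate_types.iter().any(|crate_type| {
                matches!(
                    crate_type,
                    CrateType::Cdylib | CrateType::Dylib | CrateType::Staticlib
                )
            })
        }
        // Executables, including tests, benches, and example binaries.
        TargetKind::Bin | TargetKind::Test | TargetKind::Bench | TargetKind::ExampleBin => true,
        // Build scripts never get their own SBOM.
        TargetKind::CustomBuild => false,
    }
}
```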

Thanks @arlosi, @RobJellinghaus and @lfrancke for initial guidance & feedback.

@rustbot
Collaborator

rustbot commented Apr 5, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @ehuss (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, to keep review lag to a minimum, PR authors and assigned reviewers should keep the review labels (S-waiting-on-review and S-waiting-on-author) up to date, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added A-build-execution Area: anything dealing with executing the compiler A-configuration Area: cargo config files and env vars A-unstable Area: nightly unstable support S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 5, 2024
@justahero justahero force-pushed the rfc3553/cargo-sbom-support branch from 74dafa0 to 190682e Compare April 6, 2024 14:31
@heisen-li
Contributor

Much respect for your contribution.

As a friendly reminder, it seems appropriate to update the documentation of the corresponding sections, e.g. Configuration and Environment Variables.

@weihanglo
Member

Thanks for the reminder, @heisen-li. Would love to see a doc update, though we should probably focus on the design discussion first, as the location of the configuration is not yet decided. (See rust-lang/rfcs#3553 (comment)).

@epage
Contributor

epage commented Apr 9, 2024

One approach for the docs (if this is looking to be merged) is to put the env and config documentation fragments in the Unstable docs.

@justahero justahero force-pushed the rfc3553/cargo-sbom-support branch from 190682e to ae0881c Compare May 2, 2024 19:54
Member

@weihanglo weihanglo left a comment

Just a note that I reviewed this as-is and didn't think much about the design itself. Thank you for working on this!

@bors
Contributor

bors commented May 3, 2024

☔ The latest upstream changes (presumably #13571) made this pull request unmergeable. Please resolve the merge conflicts.

@justahero justahero force-pushed the rfc3553/cargo-sbom-support branch 4 times, most recently from 1cfd71a to 376fe1e Compare May 6, 2024 13:42
@rustbot rustbot added the A-documenting-cargo-itself Area: Cargo's documentation label May 6, 2024
@justahero justahero force-pushed the rfc3553/cargo-sbom-support branch 4 times, most recently from 67332d6 to 0aa10e9 Compare May 7, 2024 11:13
@justahero justahero marked this pull request as ready for review May 7, 2024 11:53
Member

@weihanglo weihanglo left a comment

Now I like the idea of having this PR to explore SBOM format. I'll post back issues we've found so far to the RFC. Thank you :)

@justahero justahero force-pushed the rfc3553/cargo-sbom-support branch from c8e1bc8 to 8d5fa4d Compare May 13, 2024 12:33
@bors
Contributor

bors commented Oct 4, 2024

☔ The latest upstream changes (presumably #14576) made this pull request unmergeable. Please resolve the merge conflicts.

Comment on lines 21 to 28
#[derive(Serialize, Clone, Debug, Copy)]
#[serde(rename_all = "kebab-case")]
enum SbomBuildType {
    /// A package dependency
    Normal,
    /// A build script dependency
    Build,
}
Contributor

Should we be consistent with cargo metadata with respect to the schema for this?

Contributor

I've renamed this to SbomDependencyKind. It still has Normal and Build options, but they have somewhat different meanings than in cargo metadata. In the current iteration I'm about to push, build means a build-time dependency (proc-macro or build script).

The metadata format has kind: ['custom-build'] and kind: null. It's possible we could use the same format here, but it feels a bit awkward.

Contributor

As for the use of the name normal, we use it today in

It seems like there is enough precedent for this.

We should probably make sure this information gets reflected in the RFC
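The build-time meaning described in this thread can be sketched without serde. The enum name follows the rename mentioned above; `as_str` and `classify` are hypothetical helpers added for illustration, and the strings are what `#[serde(rename_all = "kebab-case")]` would be expected to produce for these two variants.

```rust
/// Kind of a dependency edge in the SBOM, as discussed in the review.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SbomDependencyKind {
    /// A dependency that ends up in the final artifact.
    Normal,
    /// A build-time dependency (proc-macro or build script).
    Build,
}

impl SbomDependencyKind {
    /// Hand-written serialization used here to stay dependency-free.
    fn as_str(self) -> &'static str {
        match self {
            SbomDependencyKind::Normal => "normal",
            SbomDependencyKind::Build => "build",
        }
    }
}

/// Hypothetical classifier: anything that only runs at build time is `Build`.
fn classify(is_proc_macro: bool, is_build_script: bool) -> SbomDependencyKind {
    if is_proc_macro || is_build_script {
        SbomDependencyKind::Build
    } else {
        SbomDependencyKind::Normal
    }
}
```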

Comment on lines 46 to 30
#[derive(Serialize, Clone, Debug)]
struct SbomProfile {
Contributor

No case is put on this. Is snake_case intentional? Looks like that's what we use for cargo metadata.

Contributor

I'll explicitly mark them all as snake_case

Contributor

When I wrote up our schema docs, I said to be kebab-case. I don't remember what all led to that (focusing on Cargo.toml?)

However, looking at our code, we never explicitly state snake_case, only kebab-case.

Contributor

I did snake_case for consistency with cargo_metadata, but if there's a reason for switching to snake-case for some of these, let me know.

Member

it's called kebab-case btw 😛

Contributor

Oops, typo!
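To make the difference between the two conventions concrete, the following sketch approximates what a kebab-case rename does to a snake_case struct field name. `snake_to_kebab` is a hypothetical helper, not serde's implementation, and it only covers field names that are already lowercase with underscores.

```rust
/// Approximates the effect of a kebab-case rename on a snake_case field
/// name: underscores become hyphens, everything else is unchanged.
fn snake_to_kebab(field: &str) -> String {
    field.replace('_', "-")
}
```

For example, a field named `opt_level` would be emitted as `opt_level` under snake_case and as `opt-level` under kebab-case, which is the spelling used in `Cargo.toml` profiles.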


/// Describes a package dependency
#[derive(Serialize, Clone, Debug)]
struct SbomPackage {
Contributor

Did we decide whether the sbom will track packages instead of crates?

Contributor

Currently packages, since the crates within the package should have the same dependency set & profile if compiled in the same invocation. The existing algorithm is merging units within the same package.

If there's a reason to move to crates instead and have more nodes in the graph, I'm open to that, but I currently don't see one.

Contributor

I suspect artifact dependencies could make things interesting

Note that we do call the field in the format crates

Contributor

I haven't tried artifact deps. I can add a test and we can revisit.
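The idea of merging units that belong to the same package can be sketched as below. `Unit` here is a two-field stand-in for a node in Cargo's unit graph and `merge_units_by_package` is a hypothetical name; the PR's actual algorithm works on the real unit graph and is more involved.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a unit: one compilation of one target of a package.
struct Unit {
    package: &'static str,
    target: &'static str,
}

/// Groups units by package so that the graph has one node per package
/// rather than one node per crate.
fn merge_units_by_package(units: &[Unit]) -> BTreeMap<&'static str, Vec<&'static str>> {
    let mut packages: BTreeMap<&'static str, Vec<&'static str>> = BTreeMap::new();
    for unit in units {
        packages.entry(unit.package).or_default().push(unit.target);
    }
    packages
}
```

A package with both a library and a binary target therefore collapses into a single node, which is why a crate-level graph would have more nodes.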

Contributor

@arlosi arlosi left a comment

I've updated the PR to address most of the feedback from @epage. Thanks!

I re-wrote the algorithm for building the sbom graph and made changes to the format. There's also now documentation for the format in unstable.md that should help reviewers understand what it's looking like.
 
Before merging, I still want to add more test coverage for additional cases.

Comment on lines 1299 to 1309
"sbom" => self.sbom = parse_empty(k, v)?,
"script" => self.script = parse_empty(k, v)?,
"separate-nightlies" => self.separate_nightlies = parse_empty(k, v)?,
"checksum-freshness" => self.checksum_freshness = parse_empty(k, v)?,
"skip-rustdoc-fingerprint" => self.skip_rustdoc_fingerprint = parse_empty(k, v)?,
"script" => self.script = parse_empty(k, v)?,
"target-applies-to-host" => self.target_applies_to_host = parse_empty(k, v)?,
Contributor

Why was this moved? I believe we generally try to be sorted here

Contributor

It was inadvertently moved in the 1st commit when merging. The 2nd commit fixes it.

If you prefer, I can squash the two commits together. I was hoping it would be easier to review only the changes since last time, so I added a commit rather than rewriting the 1st one.

Contributor

I've done enough of a pass; let's squash

Comment on lines +298 to +302
let sbom = super::build_sbom(&mut self, unit)?;
for sbom_output_file in self.sbom_output_files(unit)? {
    let outfile = BufWriter::new(paths::create(sbom_output_file)?);
    serde_json::to_writer(outfile, &sbom)?;
}
Contributor

We are outputting the same sbom for each output.

If we're package focused, that makes sense. If we're crate focused, then that doesn't quite work.

Is there anything specific about each root artifact that we'd want to call out?
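The behavior under discussion, one SBOM written unchanged next to every output of a unit, can be sketched with the standard library alone. `write_sbom_files` is a hypothetical name, and the SBOM is passed as an already serialized string, whereas the PR serializes with `serde_json::to_writer`.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::PathBuf;

/// Writes the same serialized SBOM to every output path of a unit.
fn write_sbom_files(sbom_json: &str, outputs: &[PathBuf]) -> io::Result<()> {
    for path in outputs {
        let mut out = BufWriter::new(File::create(path)?);
        out.write_all(sbom_json.as_bytes())?;
        // Flush explicitly so write errors are reported instead of being
        // silently dropped when the BufWriter goes out of scope.
        out.flush()?;
    }
    Ok(())
}
```

Because every file receives identical content, nothing in this shape distinguishes one root artifact from another, which is the point raised in the comment above.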


if gctx.cli_unstable().sbom && build_runner.bcx.build_config.sbom {
    let file_list = std::env::join_paths(build_runner.sbom_output_files(unit)?)?;
    base.env("CARGO_SBOM_PATH", file_list);
Contributor

Can you document this in the unstable docs like it will be when moved to the stable docs?

I want to make sure we call out that it can be multiple files
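Since `CARGO_SBOM_PATH` is built with `std::env::join_paths`, a consumer should expect a platform-specific path list rather than a single path. The sketch below shows both sides of that round trip; the function names `encode_sbom_paths` and `decode_sbom_paths` are assumptions made for illustration.

```rust
use std::env;
use std::ffi::OsString;
use std::path::PathBuf;

/// Producer side: joins several SBOM paths into one value using the
/// platform's path-list separator (`:` on Unix, `;` on Windows).
fn encode_sbom_paths(files: &[PathBuf]) -> Result<OsString, env::JoinPathsError> {
    env::join_paths(files)
}

/// Consumer side: splits the value of `CARGO_SBOM_PATH` back into paths.
fn decode_sbom_paths(value: &OsString) -> Vec<PathBuf> {
    env::split_paths(value).collect()
}
```

A tool reading the variable would typically call `std::env::var_os("CARGO_SBOM_PATH")` and pass the result to something like `decode_sbom_paths`.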

/// Returns the list of SBOM output file paths for a given [`Unit`].
///
/// Only call this function when `sbom` is active.
pub fn sbom_output_files(&self, unit: &Unit) -> CargoResult<Vec<PathBuf>> {
Contributor

Do these need to be calculated in prepare_units and added to files?

Comment on lines +472 to +491
let sbom_enabled = match unit.target.kind() {
    TargetKind::Lib(crate_types) | TargetKind::ExampleLib(crate_types) => {
        crate_types.iter().any(|crate_type| {
            matches!(
                crate_type,
                CrateType::Cdylib | CrateType::Dylib | CrateType::Staticlib
            )
        })
    }
    TargetKind::Bin | TargetKind::Test | TargetKind::Bench | TargetKind::ExampleBin => true,
    TargetKind::CustomBuild => false,
};
if !sbom_enabled {
    return Ok(Vec::new());
}

let files = self
    .outputs(unit)?
    .iter()
    .filter(|o| matches!(o.flavor, FileFlavor::Normal | FileFlavor::Linkable))
Contributor

Do we have a place to track specific points we want to have called out in the RFC? For me, when we do this is one of them

Similar to the generation of `depinfo` files, a function named
`output_sbom` is called to generate the SBOM precursor file. It takes the
`BuildRunner` & the current `Unit`. The `sbom` flag can be specified as
a cargo build option, but it's currently not fully configured. To
test the generation, the flag is set to `true`.

* use SBOM types to serialize data

Output source, profile & dependencies

Trying to fetch all dependencies

This ignores dependencies for custom build scripts. The output should be
similar to what `cargo tree` reports.

Output package dependencies

This is similar to what the `cargo metadata` command outputs.

Extract logic to fetch sbom output files

This extracts the logic to get the list of SBOM output file paths into
its own function in `BuildRunner` for a given Unit.

Add test file to check sbom output

* add test to check project with bin & lib
* extract sbom config into helper function

Add build type to dependency

Add test to read JSON

Still needs to check output.

Guard sbom logic behind unstable feature

Add test with custom build script

Integrate review feedback

* disable `sbom` config when `-Zsbom` is not passed as unstable option
* refactor tests
* add test

Expand end-to-end tests

This expands the tests to reflect end-to-end tests by comparing the
generated JSON output files with expected strings.

* add test helper to compare actual & expected JSON content
* refactor setup of packages in test

Add 'sbom' section to unstable features doc

Append SBOM file suffix instead of replacing

Instead of replacing the file extension, the `.cargo-sbom.json` suffix
is appended to the output file. This is to keep existing file extensions
in place.

* refactor logic to set `sbom` property from build config
* expand build script related test to check JSON output
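The difference between appending the suffix and replacing the extension can be sketched with `std::path`. `sbom_path_for` and `SBOM_SUFFIX` are hypothetical names used only for this illustration.

```rust
use std::path::{Path, PathBuf};

/// Suffix appended to the artifact's file name.
const SBOM_SUFFIX: &str = ".cargo-sbom.json";

/// Appends the SBOM suffix to the full artifact path, keeping any existing
/// extension such as `.so` or `.exe` in place.
fn sbom_path_for(artifact: &Path) -> PathBuf {
    let mut name = artifact.as_os_str().to_os_string();
    name.push(SBOM_SUFFIX);
    PathBuf::from(name)
}
```

By contrast, `Path::with_extension("cargo-sbom.json")` would turn `libfoo.so` into `libfoo.cargo-sbom.json`, so `libfoo.so` and `libfoo.a` would map to the same SBOM file name.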

Integrate review feedback

* use `PackageIdSpec` instead of only `PackageId` in SBOM output
* change `version` of a dependency to `Option<Version>`
* output `Vec<CrateType>` instead of only the first found crate type
* output rustc workspace wrapper
* update 'warning' string in test using `[WARNING]`
* use `serde_json::to_writer` to serialize SBOM
* set sbom suffix in tests explicitly, instead of using `with_extension`

Output additional fields to JSON

If a unit's profile differs from the profile information at the root
level, it's added to the package information in the JSON output.

The verbose output for `rustc -vV` is also written to the `rustc` field
in the SBOM.

* rename `fetch_packages` to `collect_packages`
* update JSON in tests to include profile information

Add test to check multiple crate types

Add test to check artifact name conflict

Use SbomProfile to wrap Profile type

This adds the `SbomProfile` type, into which the existing `Profile` is
converted to expose relevant fields. For now it removes the `strip` field,
while serializing all other fields. This should keep the output consistent,
even when fields in the `Profile` change, e.g. when a new field is added.

Document package profile

* only export `profile` field in case it differs from root profile
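The rule "only export `profile` when it differs from the root profile" can be sketched as an `Option`. The `SbomProfile` fields below are a reduced, assumed subset chosen for illustration, and `package_profile` is a hypothetical helper.

```rust
/// Reduced stand-in for the PR's `SbomProfile` (assumed fields).
#[derive(Clone, Debug, PartialEq)]
struct SbomProfile {
    name: String,
    opt_level: String,
    debug_assertions: bool,
}

/// Returns the unit's profile only when it differs from the root profile,
/// so a serializer can skip the field for packages that match the root.
fn package_profile(root: &SbomProfile, unit: &SbomProfile) -> Option<SbomProfile> {
    if unit == root {
        None
    } else {
        Some(unit.clone())
    }
}
```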

Add test to check different features

The added test uses a crate with multiple features. The main crate uses
the dependency in the normal build & the custom build script with
different features.

Refactor storing of package dependencies

All dependencies for a package are indices into the `packages` list now.
This sets the correct association between a dependency & its associated
package.

* remove `SbomDependency` struct
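Storing dependencies as indices into the `packages` list can be sketched as follows. The struct is a two-field stand-in for the PR's `SbomPackage`, and `build_packages` with its string-based inputs is an assumption made to keep the example self-contained.

```rust
/// Reduced stand-in for `SbomPackage`: dependencies are positions in the
/// surrounding `packages` list instead of nested package descriptions.
struct SbomPackage {
    id: String,
    dependencies: Vec<usize>,
}

/// Builds the package list from package ids and (from, to) dependency edges.
/// Edges that mention an unknown id are ignored in this sketch.
fn build_packages(ids: &[&str], edges: &[(&str, &str)]) -> Vec<SbomPackage> {
    let mut packages: Vec<SbomPackage> = ids
        .iter()
        .map(|id| SbomPackage {
            id: id.to_string(),
            dependencies: Vec::new(),
        })
        .collect();
    for (from, to) in edges {
        let from_idx = ids.iter().position(|id| id == from);
        let to_idx = ids.iter().position(|id| id == to);
        if let (Some(f), Some(t)) = (from_idx, to_idx) {
            packages[f].dependencies.push(t);
        }
    }
    packages
}
```

Resolving a dependency is then a plain lookup such as `packages[index]`, and a package shared by several dependents appears only once in the list.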

Refactor tests to use snapbox
@arlosi arlosi force-pushed the rfc3553/cargo-sbom-support branch from e93e3b0 to 7266454 Compare February 8, 2025 17:40
Labels
A-build-execution Area: anything dealing with executing the compiler A-configuration Area: cargo config files and env vars A-documenting-cargo-itself Area: Cargo's documentation A-testing-cargo-itself Area: cargo's tests A-unstable Area: nightly unstable support S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

9 participants