[6.0 regression] Crash on macOS ARM64 when using target_compatible_with = ["@platforms//:incompatible"] #17561

Closed
EdSchouten opened this issue Feb 23, 2023 · 5 comments
Labels: P1 (I'll work on this now; assignee required) · team-Configurability (platforms, toolchains, cquery, select(), config transitions) · type: bug

Comments

@EdSchouten
Contributor

EdSchouten commented Feb 23, 2023

Description of the bug:

We have observed Bazel 6.0 and later crashing on macOS with ARM64 hardware when --auto_cpu_environment_group is used. This is a regression caused by 72787a1.

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Create a repository with the following contents:

WORKSPACE:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "platforms",
    urls = [
        "https://mirror.bazel.build/github.com/bazelbuild/platforms/releases/download/0.0.6/platforms-0.0.6.tar.gz",
        "https://github.com/bazelbuild/platforms/releases/download/0.0.6/platforms-0.0.6.tar.gz",
    ],
    sha256 = "5308fc1d8865406a49427ba24a9ab53087f17f5266a7aabbfc28823f3916e1ca",
)

BUILD.bazel:

environment(name = "darwin_arm64")

environment_group(
    name = "cpus",
    defaults = [
        ":darwin_arm64",
    ],
    environments = [
        ":darwin_arm64",
    ],
)

sh_binary(
    name = "testcrash",
    srcs = ["testcrash.sh"],
    target_compatible_with = ["@platforms//:incompatible"],
)

And an empty file named testcrash.sh.
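For convenience, the three files above can be generated with a short script. This is only a sketch: the directory name issue-17561-repro and the REPRO_DIR variable are assumptions made for illustration and are not part of the original report. The file contents are copied from the listings above.

```shell
#!/usr/bin/env bash
# Scaffold the minimal reproduction workspace described in this report.
# REPRO_DIR and its default value are hypothetical names chosen for this sketch.
set -euo pipefail

REPRO_DIR="${REPRO_DIR:-issue-17561-repro}"
mkdir -p "$REPRO_DIR"

# WORKSPACE: fetch the platforms repository (release 0.0.6, as in the report).
cat > "$REPRO_DIR/WORKSPACE" <<'EOF'
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "platforms",
    urls = [
        "https://mirror.bazel.build/github.com/bazelbuild/platforms/releases/download/0.0.6/platforms-0.0.6.tar.gz",
        "https://github.com/bazelbuild/platforms/releases/download/0.0.6/platforms-0.0.6.tar.gz",
    ],
    sha256 = "5308fc1d8865406a49427ba24a9ab53087f17f5266a7aabbfc28823f3916e1ca",
)
EOF

# BUILD.bazel: one environment group plus a target marked as incompatible.
cat > "$REPRO_DIR/BUILD.bazel" <<'EOF'
environment(name = "darwin_arm64")

environment_group(
    name = "cpus",
    defaults = [
        ":darwin_arm64",
    ],
    environments = [
        ":darwin_arm64",
    ],
)

sh_binary(
    name = "testcrash",
    srcs = ["testcrash.sh"],
    target_compatible_with = ["@platforms//:incompatible"],
)
EOF

# testcrash.sh: intentionally empty.
: > "$REPRO_DIR/testcrash.sh"

echo "Reproduction workspace written to $REPRO_DIR"
```

The build command from this report should then be run from inside that directory.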

Now run the following command:

./bazel-6.0.0-darwin-arm64 build --auto_cpu_environment_group=//:cpus //...

Bazel will then crash as follows:

FATAL: bazel crashed due to an internal error. Printing stack trace:
com.google.common.base.VerifyException: expected a non-null reference
	at com.google.common.base.Verify.verifyNotNull(Verify.java:503)
	at com.google.common.base.Verify.verifyNotNull(Verify.java:479)
	at com.google.devtools.build.lib.analysis.constraints.TopLevelConstraintSemantics.getMissingEnvironments(TopLevelConstraintSemantics.java:493)
	at com.google.devtools.build.lib.analysis.constraints.TopLevelConstraintSemantics.compatibilityWithTargetEnvironment(TopLevelConstraintSemantics.java:213)
	at com.google.devtools.build.lib.analysis.constraints.TopLevelConstraintSemantics.checkTargetEnvironmentRestrictions(TopLevelConstraintSemantics.java:376)
	at com.google.devtools.build.lib.analysis.BuildView.update(BuildView.java:492)
	at com.google.devtools.build.lib.buildtool.AnalysisPhaseRunner.runAnalysisPhase(AnalysisPhaseRunner.java:233)
	at com.google.devtools.build.lib.buildtool.AnalysisPhaseRunner.execute(AnalysisPhaseRunner.java:139)
	at com.google.devtools.build.lib.buildtool.BuildTool.buildTargets(BuildTool.java:180)
	at com.google.devtools.build.lib.buildtool.BuildTool.processRequest(BuildTool.java:494)
	at com.google.devtools.build.lib.buildtool.BuildTool.processRequest(BuildTool.java:462)
	at com.google.devtools.build.lib.runtime.commands.BuildCommand.exec(BuildCommand.java:103)
	at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.execExclusively(BlazeCommandDispatcher.java:608)
	at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.exec(BlazeCommandDispatcher.java:233)
	at com.google.devtools.build.lib.server.GrpcServerImpl.executeCommand(GrpcServerImpl.java:550)
	at com.google.devtools.build.lib.server.GrpcServerImpl.lambda$run$1(GrpcServerImpl.java:614)
	at io.grpc.Context$1.run(Context.java:566)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

Which operating system are you running Bazel on?

macOS Ventura on ARM64 hardware

What is the output of bazel info release?

6.0-release

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@sgowroji sgowroji added type: bug team-Configurability platforms, toolchains, cquery, select(), config transitions untriaged labels Feb 23, 2023
@EdSchouten
Contributor Author

Cc @philsc.
Would this be a blocker for Bazel 6.1 (#17212)?

@meisterT meisterT added the potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone label Feb 23, 2023
@EdSchouten
Contributor Author

It looks like a change like this prevents the crash:
https://github.com/bazelbuild/bazel/compare/master...EdSchouten:bazel:eschouten/20230223-issue-17561?w=1
When applied, the output of Bazel master is identical to that of Bazel 5:

INFO: Analyzed target //:testcrash (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //:testcrash was skipped
INFO: Elapsed time: 0.101s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action

That said, I have no idea whether that change is actually correct. I'd rather leave that to someone who has more knowledge on this.

@keertk
Member

keertk commented Feb 23, 2023

@bazel-io fork 6.1.0

@bazel-io bazel-io removed the potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone label Feb 23, 2023
@gregestren
Contributor

gregestren commented Feb 23, 2023

What are you doing with --auto_cpu_environment_group? It isn't specifically designed to mix with target_compatible_with = ["@platforms//:incompatible"]. Ideally you could rely exclusively on the latter. We generally consider environment and environment_group outdated substitutes for what incompatible target skipping does.

I think your change would work, and it would automatically produce an error for any incompatible target combined with --auto_cpu_environment_group? Another approach is for the incompatible target generation logic to produce an instance of that provider:

// Create dummy instances of the necessary data for a configured target. None of this data will
// actually be used because actions associated with incompatible targets must not be evaluated.
NestedSet<Artifact> filesToBuild = NestedSetBuilder.emptySet(Order.STABLE_ORDER);
FileProvider fileProvider = new FileProvider(filesToBuild);
FilesToRunProvider filesToRunProvider = new FilesToRunProvider(filesToBuild, null, null);
TransitiveInfoProviderMapBuilder providerBuilder =
    new TransitiveInfoProviderMapBuilder()
        .put(incompatiblePlatformProvider)
        .add(RunfilesProvider.simple(Runfiles.EMPTY))
        .add(fileProvider)
        .add(filesToRunProvider);

@ma-oli
Contributor

ma-oli commented Feb 23, 2023

We've been using --auto_cpu_environment_group in conjunction with restricted_to for a while now to skip incompatible targets. We're also using target_compatible_with for finer-grained target skipping in some cases.
You're right that at some point the former will be deprecated, and all users of CPU environment groups will have to transition to target_compatible_with only.
Still, the feature is there and usable today, and we probably don't want Bazel to crash either way.

@meteorcloudy meteorcloudy added the P1 I'll work on this now. (Assignee required) label Feb 24, 2023
gregestren added a commit to gregestren/bazel that referenced this issue Feb 24, 2023
Stop crashes when incompatible target skipping mixes with
--auto_cpu_environment_group.

Fixes bazelbuild#17561.

PiperOrigin-RevId: 512125121
Change-Id: If5960a6abb08f8fe4f2643af6249c8528b7a2c51
meteorcloudy pushed a commit that referenced this issue Feb 28, 2023
Stop crashes when incompatible target skipping mixes with --auto_cpu_environment_group.

Fixes #17561.

PiperOrigin-RevId: 512125121
Change-Id: If5960a6abb08f8fe4f2643af6249c8528b7a2c51

Closes #17590.

Change-Id: If5960a6abb08f8fe4f2643af6249c8528b7a2c51
PiperOrigin-RevId: 512820070

Co-authored-by: Googler <[email protected]>
fweikert pushed a commit to fweikert/bazel that referenced this issue May 25, 2023
Stop crashes when incompatible target skipping mixes with --auto_cpu_environment_group.

Fixes bazelbuild#17561.

PiperOrigin-RevId: 512125121
Change-Id: If5960a6abb08f8fe4f2643af6249c8528b7a2c51

Closes bazelbuild#17590.

Change-Id: If5960a6abb08f8fe4f2643af6249c8528b7a2c51
PiperOrigin-RevId: 512820070