Huge heap usage with large number of files and RE #16876
Labels: P2, team-Remote-Exec, type: bug
Description of the bug:
Heap usage becomes very large when running a large number of tests with remote execution, where each test has a large number of input files and its own runfiles tree created in Starlark using ctx.runfiles. With the same number of tests and the same number of input files, but with a single runfiles tree instead of individual ones, memory usage stays low.
Looking at the Bazel code, I could not find a reason why memory usage should be duplicated in one case but not in the other, so this seems like a bug.
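To make the distinction concrete, the two patterns could look roughly like the sketch below. It is not copied from the repro repository: the rule names, the `data` and `shared` attributes, and the no-op test script are all hypothetical, and the real repro may be structured differently.

```starlark
# Hypothetical sketch: rule and attribute names are illustrative only and
# are not taken from the repro repository.

def _write_noop_test(ctx):
    # Trivial test executable so that the sketch is complete.
    script = ctx.actions.declare_file(ctx.label.name + ".sh")
    ctx.actions.write(script, "#!/bin/bash\nexit 0\n", is_executable = True)
    return script

# Pattern A: every test target builds its own runfiles tree.
def _individual_runfiles_test_impl(ctx):
    script = _write_noop_test(ctx)
    runfiles = ctx.runfiles(files = ctx.files.data)
    return [DefaultInfo(executable = script, runfiles = runfiles)]

individual_runfiles_test = rule(
    implementation = _individual_runfiles_test_impl,
    attrs = {"data": attr.label_list(allow_files = True)},
    test = True,
)

# Pattern B: one target builds the runfiles tree once ...
def _shared_runfiles_impl(ctx):
    return [DefaultInfo(runfiles = ctx.runfiles(files = ctx.files.data))]

shared_runfiles = rule(
    implementation = _shared_runfiles_impl,
    attrs = {"data": attr.label_list(allow_files = True)},
)

# ... and every test reuses it through a dependency.
def _shared_runfiles_test_impl(ctx):
    script = _write_noop_test(ctx)
    shared = ctx.attr.shared[DefaultInfo].default_runfiles
    runfiles = ctx.runfiles().merge(shared)
    return [DefaultInfo(executable = script, runfiles = runfiles)]

shared_runfiles_test = rule(
    implementation = _shared_runfiles_test_impl,
    attrs = {"shared": attr.label()},
    test = True,
)
```

In this sketch, many `individual_runfiles_test` targets over the same files would correspond to the high-memory case, and many `shared_runfiles_test` targets pointing at one `shared_runfiles` target to the low-memory case.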
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
I have a repro for the issue at https://github.com/exoson/oom_repro. To reproduce it, you will additionally need a .bazelrc file that includes configuration for executing actions remotely. I have used
--config=remote_exec
to configure remote execution in my setup. Inside this repository, running
bazel test --jobs=1000 --nocache_test_results --config=remote_exec //no_oom/...
works fine and heap usage stays low. Running
bazel test --jobs=1000 --nocache_test_results --config=remote_exec //oom/...
makes the heap grow to my maximum of 16 GB almost instantly when Bazel starts preparing to execute the tests. Without remote execution, both invocations have low memory usage.
Which operating system are you running Bazel on?
linux
What is the output of bazel info release?
release 5.3.2
If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.
N/A
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response