-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Breaking] Remove ResumableFileSlot and rely on high ulimits #1582
Conversation
b74b3fc
to
43044cc
Compare
43044cc
to
d7a3655
Compare
d7a3655
to
d81c645
Compare
d81c645
to
f2a2385
Compare
f2a2385
to
c8c51ba
Compare
c8c51ba
to
a9302dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and 0 of 21 files reviewed, and pending CI: Web Platform Deployment / ubuntu-24.04, pre-commit-checks, and 1 discussions need to be resolved (waiting on @aaronmondal)
a discussion (no related file):
+@aaronmondal
nativelink-worker/src/running_actions_manager.rs
line 995 at r1 (raw file):
"Could not kill process", ); } else {
fyi: This branch is the success case and the error was wrong, and it shouldn't be an ERROR.
nativelink-store/tests/filesystem_store_test.rs
line 1313 at r1 (raw file):
#[serial] #[nativelink_test] async fn update_with_whole_file_slow_path_when_low_file_descriptors() -> Result<(), Error> {
fyi: Removed this as it is an outlier and we mostly rely on ulimit now.
nativelink-worker/tests/running_actions_manager_test.rs
line 458 at r1 (raw file):
arguments: vec!["touch".to_string(), "./some/path/test.txt".to_string()], output_files: vec!["some/path/test.txt".to_string()], environment_variables: vec![EnvironmentVariable {
fyi: sneaking these in on this PR.
nativelink-worker/tests/running_actions_manager_test.rs
line 1478 at r1 (raw file):
assert_eq!(poll!(&mut run_action_fut), Poll::Pending); tokio::task::yield_now().await; match fs::metadata(&process_started_file).await {
fyi: This should fix this flaky test by waiting for the process to start. The test was fine before, but gave exit code 127 or 9 depending on the state of the process during the kill event.
nativelink-util/src/buf_channel.rs
line 375 at r1 (raw file):
.recv() .await .err_tip(|| "During next read of buf_channel::take()")?;
fyi: Fixed typo.
a9302dc
to
3c8d578
Compare
a73bcab
to
a47c20e
Compare
a47c20e
to
c2298d4
Compare
c2298d4
to
3e2df54
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 17 of 21 files at r1, 1 of 1 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status: 1 of 1 LGTMs obtained, and all files reviewed, and 2 discussions need to be resolved
nativelink-worker/tests/running_actions_manager_test.rs
line 3207 at r3 (raw file):
#[serial] #[nativelink_test] #[cfg(target_family = "unix")]
nit: Let's leave a todo to handle the windows case or document why we're not testing windows here.
ResumableFileSlot became to difficult to manage, instead of managing resources this way, we use much higher DEFAULT_OPEN_FILE_PERMITS, set ulimit (when unix) and warn user if the limits are likely too low. Breaking: Removed idle_file_descriptor_timeout_millis from config. closes: TraceMachina#1288, TraceMachina#513, TraceMachina#1298, TraceMachina#527
3e2df54
to
56f427d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 1 LGTMs obtained, and 20 of 22 files reviewed, and pending CI: Web Platform Deployment / ubuntu-24.04, pre-commit-checks, and 1 discussions need to be resolved
nativelink-worker/tests/running_actions_manager_test.rs
line 3207 at r3 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
nit: Let's leave a todo to handle the windows case or document why we're not testing windows here.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: 1 of 1 LGTMs obtained, and all files reviewed, and pending CI: Bazel Dev / macos-13, Bazel Dev / macos-14, Bazel Dev / ubuntu-24.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Coverage, Installation / macos-13, Installation / macos-14, Installation / ubuntu-22.04, Local / lre-cc / ubuntu-24.04, Local / lre-rs / macos-14, Local / lre-rs / ubuntu-24.04, NativeLink.com Cloud / Remote Cache / macos-14, NativeLink.com Cloud / Remote Cache / ubuntu-24.04, Publish image, Publish nativelink-worker-init, Remote / lre-cc / large-ubuntu-22.04, Remote / lre-rs / large-ubuntu-22.04, Web Platform Deployment / ubuntu-24.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (22.04), macos-13, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 1 LGTMs obtained, and all files reviewed
ResumableFileSlot became to difficult to manage, instead of managing resources this way, we use much higher
DEFAULT_OPEN_FILE_PERMITS, set ulimit (when unix) and warn user if the limits are likely too low.
Breaking: Removed idle_file_descriptor_timeout_millis from config.
closes: #1288, #513, #1298, #527
This change is![Reviewable](https://camo.githubusercontent.com/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)