Optimization: Vendor jobserver impl and rm thread spawning in parallel compile_objects (#889)

* Impl vendored jobserver implementation

It supports non-blocking `try_acquire` and is much simpler than the one
provided by `jobserver`

Signed-off-by: Jiahao XU <[email protected]>

* Convert parallel `compile_objects` to use future instead of threads

Also fixed compilation errors in mod `job_token`

Signed-off-by: Jiahao XU <[email protected]>

* Optimize parallel `compile_objects`

Remove use of mpsc since the future is executed on one single thread
only.

Signed-off-by: Jiahao XU <[email protected]>

* Fix `job_token`: Remove mpsc and make sure tokens are released

The mpsc is stored in a global variable and Rust never calls
`Drop::drop` on global variables, so the tokens it holds are never
released.

To fix this, this commit replaces the mpsc with an `AtomicBool` for
the implicit token, which also dramatically simplifies the code.

Signed-off-by: Jiahao XU <[email protected]>

* Optimize `job_token`: Make `JobToken` zero-sized

Signed-off-by: Jiahao XU <[email protected]>

* Fix `windows::JobServerClient::try_acquire` impl

Return `Ok(None)` instead of `Err()` if no token is ready.

Signed-off-by: Jiahao XU <[email protected]>

* Fix `unix::JobServerClient::from_pipe`: Accept more fd access modes

`O_RDWR` is a valid access mode for both read and write end of the pipe.

Signed-off-by: Jiahao XU <[email protected]>

* Rm unnecessary `'static` bound in parameter of `job_token`

Signed-off-by: Jiahao XU <[email protected]>

* Optimize parallel `compile_objects`: Sleep/yield if no progress is made

Signed-off-by: Jiahao XU <[email protected]>

* Fix windows implementation: Match all return values explicitly

Signed-off-by: Jiahao XU <[email protected]>

* Use Result::ok() in job_token.rs

Co-authored-by: Piotr Osiewicz <[email protected]>

* Fix grammar in comments

Co-authored-by: Piotr Osiewicz <[email protected]>

* simplify job_token impl

Co-authored-by: Piotr Osiewicz <[email protected]>

* Add more comment explaining the design choice

Signed-off-by: Jiahao XU <[email protected]>

* Refactor: Extract new mod `async_executor`

Signed-off-by: Jiahao XU <[email protected]>

* Update src/job_token/unix.rs

Co-authored-by: Thom Chiovoloni <[email protected]>

* Remove outdated comment

Signed-off-by: Jiahao XU <[email protected]>

* Do not check for `--jobserver-fds` on windows

The manual specifies that only `--jobserver-auth` will be used, and
windows does not have the concept of fds anyway.

Signed-off-by: Jiahao XU <[email protected]>

* Accept ASCII only in windows `JobServerClient::open` impl

Signed-off-by: Jiahao XU <[email protected]>

* Use acquire and release ordering for atomic operation in `JobServer`

Signed-off-by: Jiahao XU <[email protected]>

* Add a TODO for use of `NUM_JOBS`

Signed-off-by: Jiahao XU <[email protected]>

* Simplify windows jobserver `WAIT_ABANDONED` errmsg

Signed-off-by: Jiahao XU <[email protected]>

---------

Signed-off-by: Jiahao XU <[email protected]>
Co-authored-by: Piotr Osiewicz <[email protected]>
Co-authored-by: Thom Chiovoloni <[email protected]>
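
The "Fix `job_token`" entry above describes replacing a channel held in a global with an `AtomicBool` for the implicit token. The diff for `job_token` is not shown on this page, so the following is only a minimal sketch of that idea under stated assumptions: the names `ImplicitToken` and `ImplicitTokenGuard` are hypothetical, and the guard here borrows its owner, whereas the commit says the real `JobToken` is zero-sized. It illustrates why release has to happen in the guard's `Drop` rather than relying on the global ever being dropped.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for the implicit jobserver token.
/// `true` means the token is currently available.
pub struct ImplicitToken {
    available: AtomicBool,
}

/// Hypothetical guard: holding it means holding the token.
pub struct ImplicitTokenGuard<'a> {
    owner: &'a ImplicitToken,
}

impl ImplicitToken {
    /// `const fn` so the value can live in a `static`.
    pub const fn new() -> Self {
        Self {
            available: AtomicBool::new(true),
        }
    }

    /// Non-blocking acquire: returns `None` if the token is already taken.
    pub fn try_acquire(&self) -> Option<ImplicitTokenGuard<'_>> {
        // `swap` returns the previous value, so exactly one caller sees `true`.
        if self.available.swap(false, Ordering::Acquire) {
            Some(ImplicitTokenGuard { owner: self })
        } else {
            None
        }
    }
}

impl Drop for ImplicitTokenGuard<'_> {
    fn drop(&mut self) {
        // Release happens when the guard is dropped; the global itself is
        // never dropped, so nothing may depend on its destructor.
        self.owner.available.store(true, Ordering::Release);
    }
}

/// A global instance, as an assumption about how such a token would be stored.
pub static IMPLICIT_TOKEN: ImplicitToken = ImplicitToken::new();
```

With this shape, a second `try_acquire` fails while a guard is alive and succeeds again once the guard goes out of scope.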
3 people authored Nov 11, 2023
1 parent bd25128 commit fcedb00
Showing 8 changed files with 644 additions and 206 deletions.
5 changes: 1 addition & 4 deletions Cargo.toml
@@ -18,16 +18,13 @@ exclude = ["/.github"]
 edition = "2018"
 rust-version = "1.53"
 
-[dependencies]
-jobserver = { version = "0.1.16", optional = true }
-
 [target.'cfg(unix)'.dependencies]
 # Don't turn on the feature "std" for this, see https://github.com/rust-lang/cargo/issues/4866
 # which is still an issue with `resolver = "1"`.
 libc = { version = "0.2.62", default-features = false }
 
 [features]
-parallel = ["jobserver"]
+parallel = []
 
 [dev-dependencies]
 tempfile = "3"
13 changes: 13 additions & 0 deletions gen-windows-sys-binding/windows_sys.list
@@ -6,6 +6,12 @@ Windows.Win32.Foundation.SysFreeString
 Windows.Win32.Foundation.SysStringLen
 Windows.Win32.Foundation.S_FALSE
 Windows.Win32.Foundation.S_OK
+Windows.Win32.Foundation.FALSE
+Windows.Win32.Foundation.HANDLE
+Windows.Win32.Foundation.WAIT_OBJECT_0
+Windows.Win32.Foundation.WAIT_TIMEOUT
+Windows.Win32.Foundation.WAIT_FAILED
+Windows.Win32.Foundation.WAIT_ABANDONED
 
 Windows.Win32.System.Com.SAFEARRAY
 Windows.Win32.System.Com.SAFEARRAYBOUND
@@ -25,3 +31,10 @@ Windows.Win32.System.Registry.HKEY_LOCAL_MACHINE
 Windows.Win32.System.Registry.KEY_READ
 Windows.Win32.System.Registry.KEY_WOW64_32KEY
 Windows.Win32.System.Registry.REG_SZ
+
+Windows.Win32.System.Threading.ReleaseSemaphore
+Windows.Win32.System.Threading.WaitForSingleObject
+Windows.Win32.System.Threading.SEMAPHORE_MODIFY_STATE
+Windows.Win32.System.Threading.THREAD_SYNCHRONIZE
+
+Windows.Win32.System.WindowsProgramming.OpenSemaphoreA
118 changes: 118 additions & 0 deletions src/async_executor.rs
@@ -0,0 +1,118 @@
+use std::{
+    cell::Cell,
+    future::Future,
+    pin::Pin,
+    ptr,
+    task::{Context, Poll, RawWaker, RawWakerVTable, Waker},
+    thread,
+    time::Duration,
+};
+
+use crate::Error;
+
+const NOOP_WAKER_VTABLE: RawWakerVTable = RawWakerVTable::new(
+    // Cloning just returns a new no-op raw waker
+    |_| NOOP_RAW_WAKER,
+    // `wake` does nothing
+    |_| {},
+    // `wake_by_ref` does nothing
+    |_| {},
+    // Dropping does nothing as we don't allocate anything
+    |_| {},
+);
+const NOOP_RAW_WAKER: RawWaker = RawWaker::new(ptr::null(), &NOOP_WAKER_VTABLE);
+
+#[derive(Default)]
+pub(super) struct YieldOnce(bool);
+
+impl Future for YieldOnce {
+    type Output = ();
+
+    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
+        let flag = &mut std::pin::Pin::into_inner(self).0;
+        if !*flag {
+            *flag = true;
+            Poll::Pending
+        } else {
+            Poll::Ready(())
+        }
+    }
+}
+
+/// Execute the futures and return when they are all done.
+///
+/// Here we use our own homebrew async executor since cc is used in the build
+/// script of many popular projects, pulling in additional dependencies would
+/// significantly slow down its compilation.
+pub(super) fn block_on<Fut1, Fut2>(
+    mut fut1: Fut1,
+    mut fut2: Fut2,
+    has_made_progress: &Cell<bool>,
+) -> Result<(), Error>
+where
+    Fut1: Future<Output = Result<(), Error>>,
+    Fut2: Future<Output = Result<(), Error>>,
+{
+    // Shadows the future so that it can never be moved and is guaranteed
+    // to be pinned.
+    //
+    // The same trick used in `pin!` macro.
+    //
+    // TODO: Once MSRV is bumped to 1.68, replace this with `std::pin::pin!`
+    let mut fut1 = Some(unsafe { Pin::new_unchecked(&mut fut1) });
+    let mut fut2 = Some(unsafe { Pin::new_unchecked(&mut fut2) });
+
+    // TODO: Once `Waker::noop` is stabilized and our MSRV is bumped to the
+    // version in which it is stabilized, replace this with `Waker::noop`.
+    let waker = unsafe { Waker::from_raw(NOOP_RAW_WAKER) };
+    let mut context = Context::from_waker(&waker);
+
+    let mut backoff_cnt = 0;
+
+    loop {
+        has_made_progress.set(false);
+
+        if let Some(fut) = fut2.as_mut() {
+            if let Poll::Ready(res) = fut.as_mut().poll(&mut context) {
+                fut2 = None;
+                res?;
+            }
+        }
+
+        if let Some(fut) = fut1.as_mut() {
+            if let Poll::Ready(res) = fut.as_mut().poll(&mut context) {
+                fut1 = None;
+                res?;
+            }
+        }
+
+        if fut1.is_none() && fut2.is_none() {
+            return Ok(());
+        }
+
+        if !has_made_progress.get() {
+            if backoff_cnt > 3 {
+                // We have yielded at least three times without making
+                // any progress, so we will sleep for a while.
+                let duration = Duration::from_millis(100 * (backoff_cnt - 3).min(10));
+                thread::sleep(duration);
+            } else {
+                // Given that we spawned a lot of compilation tasks, it is unlikely
+                // that the OS cannot find another ready task to execute.
+                //
+                // If all of them are done, then we will yield them and spawn more,
+                // or simply return.
+                //
+                // Thus this will not be turned into a busy-wait loop and it will not
+                // waste CPU resources.
+                thread::yield_now();
+            }
+        }
+
+        backoff_cnt = if has_made_progress.get() {
+            0
+        } else {
+            backoff_cnt + 1
+        };
+    }
+}
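
To see how the pieces of `src/async_executor.rs` behave in isolation, here is a small standalone sketch. It copies the no-op waker and `YieldOnce` from the diff, and adds two helpers that are not part of the commit and are named here only for illustration: `poll_to_completion`, a single-future poll loop that counts polls, and `backoff_delay`, which restates the yield-then-sleep schedule from `block_on` as a pure function (`None` meaning "yield", `Some(d)` meaning "sleep for `d`"). It uses `Box::pin` instead of `Pin::new_unchecked` to avoid extra `unsafe`, which is a simplification and not what the commit does.

```rust
use std::{
    future::Future,
    pin::Pin,
    ptr,
    task::{Context, Poll, RawWaker, RawWakerVTable, Waker},
    time::Duration,
};

// No-op waker, as in the diff: every vtable entry ignores the data pointer.
const NOOP_WAKER_VTABLE: RawWakerVTable = RawWakerVTable::new(
    |_| NOOP_RAW_WAKER,
    |_| {},
    |_| {},
    |_| {},
);
const NOOP_RAW_WAKER: RawWaker = RawWaker::new(ptr::null(), &NOOP_WAKER_VTABLE);

/// Same behavior as `YieldOnce` in the diff: pending on the first poll,
/// ready on every later poll.
#[derive(Default)]
pub struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        let flag = &mut Pin::into_inner(self).0;
        if *flag {
            Poll::Ready(())
        } else {
            *flag = true;
            Poll::Pending
        }
    }
}

/// Hypothetical helper (not in the commit): polls one future until it is
/// ready and returns how many polls that took, plus the output.
pub fn poll_to_completion<F: Future>(fut: F) -> (usize, F::Output) {
    let mut fut = Box::pin(fut);
    // SAFETY: the vtable functions never dereference the null data pointer.
    let waker = unsafe { Waker::from_raw(NOOP_RAW_WAKER) };
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (polls, out);
        }
    }
}

/// Hypothetical helper (not in the commit): the backoff schedule used by
/// `block_on`, where `backoff_cnt` counts consecutive iterations without
/// progress. `None` means yield the thread; `Some(d)` means sleep for `d`.
pub fn backoff_delay(backoff_cnt: u64) -> Option<Duration> {
    if backoff_cnt > 3 {
        // 100 ms per step past the third, capped at 10 steps (1 second).
        Some(Duration::from_millis(100 * (backoff_cnt - 3).min(10)))
    } else {
        None
    }
}
```

Under this schedule the first four idle iterations (counts 0 through 3) only yield, the fifth sleeps 100 ms, and the sleep grows by 100 ms per idle iteration until it is capped at one second. Because the waker is a no-op, nothing ever signals the loop, so repolling after a yield or sleep is what drives the futures forward.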