Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve rustc path early to bypass rustup wrapper #10998

Closed
wants to merge 3 commits into from

Conversation

CAD97
Copy link
Contributor

@CAD97 CAD97 commented Aug 17, 2022

This is the cargo side of rust#100681.

What does this PR try to resolve?

Fixes #10986, obsoletes(?) rustup#3035

The goal is to bypass the rustup wrapper when invoking rustc multiple times. See #10986 for more context.

How should we test and review this PR?

Relies on rust-lang/rust#100681.

A local rustup run stage1 ($env.CARGO_TARGET_DIR | path join release/cargo.exe) build12 succeeded in building, though I'm terribly unsure how to go about testing that this is because --print=rustc-path is being respected, especially since I allow --print=rustc-path to fail and stick to the old behavior.

Additional information

cc @davidlattimore, @bjorn3

Footnotes

  1. nushell shell syntax

  2. Actually, the test run was without the if output.status.success() check, which I added afterwards.

@rust-highfive
Copy link

r? @epage

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 17, 2022
@bjorn3
Copy link
Member

bjorn3 commented Aug 17, 2022

This increases the amount of rustc invocations before building anything from 3 to 4:

rustc --print rustc-path # <--- new
rustc -vV
$RUSTC_WRAPPER rustc - --crate-name ___ --print=file-names --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro -Csplit-debuginfo=packed
$RUSTC_WRAPPER rustc - --crate-name ___ --print=file-names --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro --print=sysroot --print=cfg

It would be nice if this could be reduced a bit. Maybe allowing --print rustc-path and -vV at the same time?

@CAD97
Copy link
Contributor Author

CAD97 commented Aug 17, 2022

I agree that this would be ideal1; however, rustc --print is currently handled after rustc --version, so e.g. rustc --print=sysroot --version just gives rustc --version, so allowing mixing "early --print" and --version would be a bigger change to rustc2.

Footnotes

  1. An enticing alternative to rustc --print=rustc-path would be to add the full path to rustc -vV. rustc -vV already includes a binary: rustc line; perhaps we could change this to be binary: /path/to/rustc?

  2. Although, the way of doing so is fairly straightforward in theory: just make --version an alias for e.g. --print version and handle the version print along with other early --prints. However, it should be noted that --version is handled even earlier (i.e. during argument parsing), before interface::run_compiler is even started, whereas --print happens inside the compiler session. Perhaps that's an argument for not using --print=rustc-path, and putting it in -vV instead?

@bjorn3
Copy link
Member

bjorn3 commented Aug 17, 2022

perhaps we could change this to be binary: /path/to/rustc?

I think that will hurt reproducible builds as I think the entire rustc -vV output gets hashed into the -Cmetadata argument.

@Muscraft
Copy link
Member

It would be nice to see perf results on these changes. Going from 3 -> 4 rustc invocations could be a larger hit than using rustup wrapper.

src/cargo/util/rustc.rs Outdated Show resolved Hide resolved
Comment on lines 49 to 53
// In order to avoid calling through rustup multiple times, we first ask
// rustc to give us the "resolved" rustc path, and use that instead. If
// this doesn't give us a path, then we just use the original path such
// that the following logic can handle any resulting errors normally.
let mut cmd = ProcessBuilder::new(&path);
Copy link
Contributor

@epage epage Aug 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also like sign off from rustup folks on this being a viable alternative to rust-lang/rustup#3035 (ie they like it enough not to continue down that path, making this obsolete)

@CAD97
Copy link
Contributor Author

CAD97 commented Aug 17, 2022

It would be nice to see perf results on these changes.

Reproducing the flamegraph from #10986 would be the ideal way to see the impact. Unfortunately I'm on Windows so I don't think I can do that.

A completely unscientific N=4 test that likely loses any signal in noise (e6bd7a6):
simple-raytracer〉for $i in 1..4 { cargo clean; benchmark { rustup run stage1 cargo check -q } }
╭───┬────────────────────────╮
 0  8sec 584ms 527µs 600ns 
 1  8sec 658ms 529µs 200ns 
 2  8sec 611ms 257µs 700ns 
 3  8sec 614ms 52µs 200ns  
╰───┴────────────────────────╯
simple-raytracer〉for $i in 1..4 { cargo clean; benchmark { RUSTC=D:\.rust\rustup\toolchains\stage1\bin\rustc.exe rustup run stage1 cargo check -q } }
╭───┬────────────────────────╮
 0  8sec 220ms 422µs 400ns 
 1  9sec 88ms 127µs 800ns  
 2  9sec 119ms 748µs 500ns 
 3  9sec 97ms 148µs 700ns  
╰───┴────────────────────────╯
simple-raytracer〉for $i in 1..4 { cargo clean; benchmark { rustup run stage1 ./cargo.exe check -q } }
╭───┬────────────────────────╮
 0  8sec 868ms 90µs 600ns  
 1  8sec 722ms 460µs 200ns 
 2  8sec 544ms 25µs 600ns  
 3  8sec 912ms 617µs 400ns 
╰───┴────────────────────────╯
strategy average time delta to as-is
as-is 8.617091675s -
$RUSTC 8.88136185s +0.264270175s
--print 8.76179845s +0.144706775s

Yeah, likely just noise. A second try:

simple-raytracer〉open .\bench.nu
rustup override set stage1
rustup show

let iter_count = 4

for _ in 1..$iter_count { cargo clean; benchmark { cargo check -q } } | math avg
with-env { RUSTC: $"(rustup which rustc | str trim)" } {
    for _ in 1..$iter_count { cargo clean; benchmark { cargo check -q } } | math avg
}
for _ in 1..$iter_count { cargo clean; benchmark { ./cargo.exe check -q } } | math avg

rustup override remove
simple-raytracer〉source .\bench.nu
info: override toolchain for 'D:\git\ebobby\simple-raytracer' set to 'stage1'
Default host: x86_64-pc-windows-msvc
rustup home:  D:\.rust\rustup

installed toolchains
--------------------

stable-x86_64-pc-windows-msvc (default)
nightly-2022-06-21-x86_64-pc-windows-msvc
nightly-x86_64-pc-windows-msvc
1.61-x86_64-pc-windows-msvc
stage0
stage1
1.46.0-x86_64-pc-windows-msvc
1.49.0-x86_64-pc-windows-msvc
1.53.0-x86_64-pc-windows-msvc
1.56.1-x86_64-pc-windows-msvc

active toolchain
----------------

stage1 (directory override for 'D:\git\ebobby\simple-raytracer')
rustc 1.65.0-dev

8sec 812ms 985µs 825ns
8sec 814ms 123µs 525ns
8sec 848ms 928µs 275ns
strategy average time delta to as-is
as-is 8.812985825s -
$RUSTC 8.814123525s +0.001137700s
--print 8.848928275s +0.035942450s

I'm clearly just measuring noise.


Slightly better, N=128

This is using locally compiled versions of 9809f8f (main) and e6bd7a6 (patched), so as fair a comparison as I can possibly get.

strategy average time delta
as-is 8.140742420s -
$RUSTC 8.040192590s -0.10054983s
--print 8.248244145s +0.107501725s
$RUSTC && --print 8.240557033s +0.099814613s

Something's wrong with the implementation to see opposite duration deltas from setting $RUSTC and from using --print.

I'm hoping @davidlattimore can reproduce their benchmark from #10986 for us.

(However, I've run enough tests to pretty clearly see that how this is currently implemented isn't accomplishing the stated goal. I spotted the antimalware service executable in Task Manager causing a significant portion of the CPU load since this is the Windows filesystem I'm working on...)

@CAD97
Copy link
Contributor Author

CAD97 commented Aug 17, 2022

With a further adjustment to run rustc --print=rustc-path through the cache as well, I got performance numbers more in line with my expectations:

N=10, check -p simple-raytracer:

strategy duration delta
current 7.969578393s -
$RUSTC 7.747807531s -2.8%
--print 7.858727781s -1.4%

N=64, check -p empty_bin

strategy duration delta
current 0.303947129s -
$RUSTC 0.185393026s -39%
--print 0.278397067s -8.4%

This seems to show that now this is a consistent win, and the overhead of the extra rustc call is about 100ms. A super simple ad-hoc test puts the overhead of the current rustup proxy at around 30ms, so the cost of the initial call roughly pays for itself by speeding up the next three in the startup. (Of course, this is not the case when rustup is not used.)

@CAD97
Copy link
Contributor Author

CAD97 commented Aug 17, 2022

Maybe allowing --print rustc-path and -vV at the same time?

I have just made a very interesting observation:

~〉rustup run stable rustc --version --print=not-a-valid-print -Cnot-a-codegen
rustc 1.63.0 (4b91a6ea7 2022-08-08)

i.e. --version is so processed before anything else that it overrides anything else, including nearly all argument validation beyond "is a recognized flag" (-Z is still gated). How much is this behavior guaranteed? 🤷

I did try a local patch to use --print=verbose-version --print=rustc-path, but this ended up with a runtime of ~370ms for the empty bin and ~7.90s for simple-raytracer, and I'm not super interested in investigating where the extra time is lost, since it's adding extra complexity over what I have here.

patches

cargo

diff --git a/src/cargo/util/rustc.rs b/src/cargo/util/rustc.rs
index 46c763108..3fc88d1dc 100644
--- a/src/cargo/util/rustc.rs
+++ b/src/cargo/util/rustc.rs
@@ -54,28 +54,27 @@ impl Rustc {
             cache_location,
         );
 
-        // In order to avoid calling through rustup multiple times, we first ask
-        // rustc to give us the "resolved" rustc path, and use that instead. If
-        // this doesn't give us a path, then we just use the original path such
-        // that the following logic can handle any resulting errors normally.
         let mut cmd = ProcessBuilder::new(&rustc);
-        cmd.arg("--print=rustc-path");
-        if let Ok((stdout, _)) = cache.cached_output(&cmd, 0) {
-            let resolved = PathBuf::from(stdout.trim());
-            if resolved.exists() {
-                rustc = resolved;
-                cache.reset(
-                    wrapper.as_deref(),
-                    workspace_wrapper.as_deref(),
-                    &rustc,
-                    rustup_rustc,
-                );
-            }
-        }
+        cmd.arg("-Zunstable-options")
+            .arg("--print=verbose-version")
+            .arg("--print=rustc-path");
+        let mut identifying = cache.cached_output(&cmd, 0)?.0;
 
-        let mut cmd = ProcessBuilder::new(&rustc);
-        cmd.arg("-vV");
-        let verbose_version = cache.cached_output(&cmd, 0)?.0;
+        // In order to avoid calling through rustup multiple times, we asked
+        // rustc to give us the "resolved" rustc path, and use that from now on.
+        let resolved_path = identifying.lines().next_back().unwrap();
+        rustc = PathBuf::from(resolved_path);
+        // We need to also update the cache with the resolved path.
+        cache.reset(
+            wrapper.as_deref(),
+            workspace_wrapper.as_deref(),
+            &rustc,
+            rustup_rustc,
+        );
+        // Truncate the identifying string to remove the `--print=rustc-path`,
+        // leaving just the verbose version as if we had called `rustc -vV`.
+        identifying.truncate(identifying.len() - (1 + resolved_path.len()));
+        let verbose_version = identifying;
 
         let extract = |field: &str| -> CargoResult<&str> {
             verbose_version

rustc

diff --git a/compiler/rustc_driver/src/lib.rs b/compiler/rustc_driver/src/lib.rs
index 2e678cd54fd..df4990d9390 100644
--- a/compiler/rustc_driver/src/lib.rs
+++ b/compiler/rustc_driver/src/lib.rs
@@ -729,6 +729,10 @@ fn print_crate_info(
                 Ok(exe) => println!("{}", exe.display()),
                 Err(_) => early_error(ErrorOutputType::default(), "failed to get rustc path"),
             },
+            RustcVersion(verbose) => {
+                let verbose_codegen_backend = if verbose { Some(codegen_backend) } else { None };
+                print_version_info("rustc", verbose_codegen_backend)
+            }
             // Any output here interferes with Cargo's parsing of other printed output
             NativeStaticLibs => {}
             LinkArgs => {}
@@ -740,10 +744,23 @@ fn print_crate_info(
 /// Prints version information
 pub fn version(binary: &str, matches: &getopts::Matches) {
     let verbose = matches.opt_present("verbose");
+    let codegen_backend;
+    let verbose_codegen_backend = if verbose {
+        let debug_flags = matches.opt_strs("Z");
+        let backend_name = debug_flags.iter().find_map(|x| x.strip_prefix("codegen-backend="));
+        codegen_backend = get_codegen_backend(&None, backend_name);
+        Some(&*codegen_backend)
+    } else {
+        None
+    };
 
+    print_version_info(binary, verbose_codegen_backend);
+}
+
+pub fn print_version_info(binary: &str, verbose_codegen_backend: Option<&dyn CodegenBackend>) {
     println!("{} {}", binary, util::version_str().unwrap_or("unknown version"));
 
-    if verbose {
+    if let Some(codegen_backend) = verbose_codegen_backend {
         fn unw(x: Option<&str>) -> &str {
             x.unwrap_or("unknown")
         }
@@ -753,9 +770,7 @@ fn unw(x: Option<&str>) -> &str {
         println!("host: {}", config::host_triple());
         println!("release: {}", unw(util::release_str()));
 
-        let debug_flags = matches.opt_strs("Z");
-        let backend_name = debug_flags.iter().find_map(|x| x.strip_prefix("codegen-backend="));
-        get_codegen_backend(&None, backend_name).print_version();
+        codegen_backend.print_version();
     }
 }
 
diff --git a/compiler/rustc_session/src/config.rs b/compiler/rustc_session/src/config.rs
index 8b77fb7be29..ee830063417 100644
--- a/compiler/rustc_session/src/config.rs
+++ b/compiler/rustc_session/src/config.rs
@@ -546,6 +546,7 @@ pub enum PrintRequest {
     StackProtectorStrategies,
     LinkArgs,
     RustcPath,
+    RustcVersion(bool),
 }
 
 pub enum Input {
@@ -1769,6 +1770,7 @@ fn collect_print_requests(
     error_format: ErrorOutputType,
 ) -> Vec<PrintRequest> {
     let mut prints = Vec::<PrintRequest>::new();
+
     if cg.target_cpu.as_ref().map_or(false, |s| s == "help") {
         prints.push(PrintRequest::TargetCPUs);
         cg.target_cpu = None;
@@ -1778,6 +1780,20 @@ fn collect_print_requests(
         cg.target_feature = String::new();
     }
 
+    let gated = |req, opt| {
+        if unstable_opts.unstable_options {
+            req
+        } else {
+            early_error(
+                error_format,
+                &format!(
+                    "the `-Z unstable-options` flag must also be passed to \
+                     enable the {opt} print option"
+                ),
+            )
+        }
+    };
+
     prints.extend(matches.opt_strs("print").into_iter().map(|s| match &*s {
         "crate-name" => PrintRequest::CrateName,
         "file-names" => PrintRequest::FileNames,
@@ -1792,19 +1808,11 @@ fn collect_print_requests(
         "tls-models" => PrintRequest::TlsModels,
         "native-static-libs" => PrintRequest::NativeStaticLibs,
         "stack-protector-strategies" => PrintRequest::StackProtectorStrategies,
-        "target-spec-json" => {
-            if unstable_opts.unstable_options {
-                PrintRequest::TargetSpec
-            } else {
-                early_error(
-                    error_format,
-                    "the `-Z unstable-options` flag must also be passed to \
-                     enable the target-spec-json print option",
-                );
-            }
-        }
+        "target-spec-json" => gated(PrintRequest::TargetSpec, "target-spec-json"),
         "link-args" => PrintRequest::LinkArgs,
-        "rustc-path" => PrintRequest::RustcPath,
+        "rustc-path" => gated(PrintRequest::RustcPath, "rustc-path"),
+        "version" => gated(PrintRequest::RustcVersion(false), "version"),
+        "verbose-version" => gated(PrintRequest::RustcVersion(true), "verbose-version"),
         req => early_error(error_format, &format!("unknown print request `{req}`")),
     }));
 

ChrisDenton added a commit to ChrisDenton/rust that referenced this pull request Aug 20, 2022
…enkov

Add `rustc --print rustc-path`

Related:

- rust-lang/cargo#10986
- rust-lang/rustup#3035

Goal:

Like the original rust-lang/rustup#2958, the goal is to enable `cargo` to call `rustc` directly, rather than through the `rustup` shim.

Solution:

`cargo` asks `rustc` to tell it what executable to run. Sometime early in compilation, `cargo` will run `$(RUSTC_WRAPPER) $(RUSTC ?: rustc) --print rustc-path`[^1]. Further calls to `rustc` to do execution will use the resolved printed executable path rather than continuing to call the "input `rustc` path," which will allow it to bypass the `rustup` shim.

The cargo side is rust-lang/cargo#10998.

[^1]: This can potentially be combined with other `--print`s, as well!

This is a tiny patch, so I've implemented it as insta-stable; this will need signoff probably from both the compiler and cargo teams. I'm working on the cargo side of the patch currently, but wanted to get this up ASAP.

cc `@davidlattimore` `@bjorn3` `@rust-lang/cargo` `@rust-lang/compiler` (sorry for the big ping list 😅)
@CAD97 CAD97 force-pushed the rustc-print-rustc branch from e76de76 to 30acc36 Compare August 21, 2022 02:00
@CAD97 CAD97 marked this pull request as draft August 21, 2022 02:00
@CAD97
Copy link
Contributor Author

CAD97 commented Aug 21, 2022

(Rebase was unnecessary, I just saw the GitHub button for it and pressed it while trying to remember how to access the convert to draft functionality.)

@ehuss
Copy link
Contributor

ehuss commented Aug 21, 2022

Per my comment at #10986 (comment), I would prefer to see this resolved on the rustup side, and pass the information to cargo. There were some concerns raised about running cargo directly without rustup, but I do not think that should be a common enough occurrence to worry about. Rustup already knows the answer for the toolchain directory, and it should be relatively easy to pass to any tool. Is that something that can be pursued instead?

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Aug 21, 2022
…enkov

Add `rustc --print rustc-path`

Related:

- rust-lang/cargo#10986
- rust-lang/rustup#3035

Goal:

Like the original rust-lang/rustup#2958, the goal is to enable `cargo` to call `rustc` directly, rather than through the `rustup` shim.

Solution:

`cargo` asks `rustc` to tell it what executable to run. Sometime early in compilation, `cargo` will run `$(RUSTC_WRAPPER) $(RUSTC ?: rustc) --print rustc-path`[^1]. Further calls to `rustc` to do execution will use the resolved printed executable path rather than continuing to call the "input `rustc` path," which will allow it to bypass the `rustup` shim.

The cargo side is rust-lang/cargo#10998.

[^1]: This can potentially be combined with other `--print`s, as well!

This is a tiny patch, so I've implemented it as insta-stable; this will need signoff probably from both the compiler and cargo teams. I'm working on the cargo side of the patch currently, but wanted to get this up ASAP.

cc `@davidlattimore` `@bjorn3` `@rust-lang/cargo` `@rust-lang/compiler` (sorry for the big ping list 😅)
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Aug 21, 2022
…enkov

Add `rustc --print rustc-path`

Related:

- rust-lang/cargo#10986
- rust-lang/rustup#3035

Goal:

Like the original rust-lang/rustup#2958, the goal is to enable `cargo` to call `rustc` directly, rather than through the `rustup` shim.

Solution:

`cargo` asks `rustc` to tell it what executable to run. Sometime early in compilation, `cargo` will run `$(RUSTC_WRAPPER) $(RUSTC ?: rustc) --print rustc-path`[^1]. Further calls to `rustc` to do execution will use the resolved printed executable path rather than continuing to call the "input `rustc` path," which will allow it to bypass the `rustup` shim.

The cargo side is rust-lang/cargo#10998.

[^1]: This can potentially be combined with other `--print`s, as well!

This is a tiny patch, so I've implemented it as insta-stable; this will need signoff probably from both the compiler and cargo teams. I'm working on the cargo side of the patch currently, but wanted to get this up ASAP.

cc ``@davidlattimore`` ``@bjorn3`` ``@rust-lang/cargo`` ``@rust-lang/compiler`` (sorry for the big ping list 😅)
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Aug 21, 2022
…enkov

Add `rustc --print rustc-path`

Related:

- rust-lang/cargo#10986
- rust-lang/rustup#3035

Goal:

Like the original rust-lang/rustup#2958, the goal is to enable `cargo` to call `rustc` directly, rather than through the `rustup` shim.

Solution:

`cargo` asks `rustc` to tell it what executable to run. Sometime early in compilation, `cargo` will run `$(RUSTC_WRAPPER) $(RUSTC ?: rustc) --print rustc-path`[^1]. Further calls to `rustc` to do execution will use the resolved printed executable path rather than continuing to call the "input `rustc` path," which will allow it to bypass the `rustup` shim.

The cargo side is rust-lang/cargo#10998.

[^1]: This can potentially be combined with other `--print`s, as well!

This is a tiny patch, so I've implemented it as insta-stable; this will need signoff probably from both the compiler and cargo teams. I'm working on the cargo side of the patch currently, but wanted to get this up ASAP.

cc ```@davidlattimore``` ```@bjorn3``` ```@rust-lang/cargo``` ```@rust-lang/compiler``` (sorry for the big ping list 😅)
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Aug 21, 2022
…enkov

Add `rustc --print rustc-path`

Related:

- rust-lang/cargo#10986
- rust-lang/rustup#3035

Goal:

Like the original rust-lang/rustup#2958, the goal is to enable `cargo` to call `rustc` directly, rather than through the `rustup` shim.

Solution:

`cargo` asks `rustc` to tell it what executable to run. Sometime early in compilation, `cargo` will run `$(RUSTC_WRAPPER) $(RUSTC ?: rustc) --print rustc-path`[^1]. Further calls to `rustc` to do execution will use the resolved printed executable path rather than continuing to call the "input `rustc` path," which will allow it to bypass the `rustup` shim.

The cargo side is rust-lang/cargo#10998.

[^1]: This can potentially be combined with other `--print`s, as well!

This is a tiny patch, so I've implemented it as insta-stable; this will need signoff probably from both the compiler and cargo teams. I'm working on the cargo side of the patch currently, but wanted to get this up ASAP.

cc ````@davidlattimore```` ````@bjorn3```` ````@rust-lang/cargo```` ````@rust-lang/compiler```` (sorry for the big ping list 😅)
@CAD97
Copy link
Contributor Author

CAD97 commented Nov 22, 2022

I'm closing this PR since I don't have the bandwidth to pursue furthering this change.

I personally still like this approach because it's completely agnostic to if/how rustc is being multiversioned and doesn't require setting up a hierarchy of multiple environment variables checked to determine what rustc to call.

fun factIf this saves 30ms per codegen unit, napkin math says it'd take 21 crate builds a day for this to save me an hour of time over a full year. The cost-benefit for pushing this PR doesn't pan out for me.

@CAD97 CAD97 closed this Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bypass rustup wrapper when invoking rustc
6 participants