-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize usage under rustup. #11917
Optimize usage under rustup. #11917
Changes from all commits
52317aa
d4b0b49
cdf60e3
42ef94f
4f79ac3
b9993bd
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -388,7 +388,7 @@ impl Config { | |||
/// Gets the path to the `rustdoc` executable. | ||||
pub fn rustdoc(&self) -> CargoResult<&Path> { | ||||
self.rustdoc | ||||
.try_borrow_with(|| Ok(self.get_tool("rustdoc", &self.build_config()?.rustdoc))) | ||||
.try_borrow_with(|| Ok(self.get_tool(Tool::Rustdoc, &self.build_config()?.rustdoc))) | ||||
.map(AsRef::as_ref) | ||||
} | ||||
|
||||
|
@@ -406,7 +406,7 @@ impl Config { | |||
); | ||||
|
||||
Rustc::new( | ||||
self.get_tool("rustc", &self.build_config()?.rustc), | ||||
self.get_tool(Tool::Rustc, &self.build_config()?.rustc), | ||||
wrapper, | ||||
rustc_workspace_wrapper, | ||||
&self | ||||
|
@@ -1640,11 +1640,63 @@ impl Config { | |||
} | ||||
} | ||||
|
||||
/// Looks for a path for `tool` in an environment variable or config path, defaulting to `tool` | ||||
/// as a path. | ||||
fn get_tool(&self, tool: &str, from_config: &Option<ConfigRelativePath>) -> PathBuf { | ||||
self.maybe_get_tool(tool, from_config) | ||||
.unwrap_or_else(|| PathBuf::from(tool)) | ||||
/// Returns the path for the given tool. | ||||
/// | ||||
/// This will look for the tool in the following order: | ||||
/// | ||||
/// 1. From an environment variable matching the tool name (such as `RUSTC`). | ||||
/// 2. From the given config value (which is usually something like `build.rustc`). | ||||
/// 3. Finds the tool in the PATH environment variable. | ||||
/// | ||||
/// This is intended for tools that are rustup proxies. If you need to get | ||||
/// a tool that is not a rustup proxy, use `maybe_get_tool` instead. | ||||
fn get_tool(&self, tool: Tool, from_config: &Option<ConfigRelativePath>) -> PathBuf { | ||||
let tool_str = tool.as_str(); | ||||
self.maybe_get_tool(tool_str, from_config) | ||||
.or_else(|| { | ||||
// This is an optimization to circumvent the rustup proxies | ||||
// which can have a significant performance hit. The goal here | ||||
// is to determine if calling `rustc` from PATH would end up | ||||
// calling the proxies. | ||||
// | ||||
// This is somewhat cautious trying to determine if it is safe | ||||
// to circumvent rustup, because there are some situations | ||||
// where users may do things like modify PATH, call cargo | ||||
// directly, use a custom rustup toolchain link without a | ||||
// cargo executable, etc. However, there is still some risk | ||||
// this may make the wrong decision in unusual circumstances. | ||||
// | ||||
// First, we must be running under rustup in the first place. | ||||
let toolchain = self.get_env_os("RUSTUP_TOOLCHAIN")?; | ||||
// This currently does not support toolchain paths. | ||||
// This also enforces UTF-8. | ||||
if toolchain.to_str()?.contains(&['/', '\\']) { | ||||
return None; | ||||
} | ||||
// If the tool on PATH is the same as `rustup` on path, then | ||||
// there is pretty good evidence that it will be a proxy. | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We have an exact list of the proxies we offer btw. I think it is a good idea to only take the fastpath for things we are known to proxy. I'm happy to commit to keeping a copy of that list in Cargo up to date. New proxies are very rare. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Cargo is currently hard-coded to only use this for |
||||
let tool_resolved = paths::resolve_executable(Path::new(tool_str)).ok()?; | ||||
let rustup_resolved = paths::resolve_executable(Path::new("rustup")).ok()?; | ||||
let tool_meta = tool_resolved.metadata().ok()?; | ||||
let rustup_meta = rustup_resolved.metadata().ok()?; | ||||
// This works on the assumption that rustup and its proxies | ||||
// use hard links to a single binary. If rustup ever changes | ||||
// that setup, then I think the worst consequence is that this | ||||
// optimization will not work, and it will take the slow path. | ||||
if tool_meta.len() != rustup_meta.len() { | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This seems hazardous. It's entirely possible for two binaries to have the same length without having the same contents. If you're checking for hardlinks, shouldn't this compare There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Following up: according to @ChrisDenton, you need to open both files, and then while both are open, call There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Essentially you'd need to compare There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's possible, but I'm concerned about the potential complexity or difficulty in getting that right. There are various situations where symlinks are used, and I can't guarantee that the files won't end up as a copy, or have issues across network mounts, for example. The file sizes are currently an order of magnitude different, and I think the chance for them being the same is very unlikely. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure if this matters but the reason the fs footprint of rustup looks quite different in different places is because of android (no hardlink support), bew (everything is a symlink to a symlink from our next release) (and I guess snap and other 3rd-party distributions can also differ to what one might expect from looking at our code). In particular see rust-lang/rustup#3137 for some context. tl;dr: there is no guarantee that the proxy and rustup itself are the same file, even though that is our default installation logic. On android they are symlinks. And the consequence of the tool not being a proxy is that someone has deliberately placed e.g. a 'rustc' wrapper that does something, which cargo would then not run. Checking for the same length will detect every common situation where they are different except for two cases I can see: two different binaries, alike in length, and two different symlinks, alike in length. For binaries I agree - its very unlikely that two different binaries the same length as rustup is large enough that the law of small numbers doesn't really apply. For symlinks, I suggest doing a readlink on the file. It is cheap enough to still be a lot faster than rustup manifest parsing (which I plan to do something about someday, but its not top of the list, and even after, not running code we don't need to run is how we make things fast). Oh and the final case - I alluded to above with 'common' - lets exclude other file types from consideration. Special node types should just immediate take the slow path. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Can you say more about why this would be needed? If the proxy symlink points at rustup, shouldn't they have the same size? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is possible for the symlink to point at something else with the same length path. The chance of two unrelated binary lengths colliding when they are ~11M in size (current rustup-init release size on Windows) is pretty low. But the chance of two ~100 byte paths being the same length is much much higher, and then multiply that out by our growing user bases I think its worth mitigating the risk. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe I'm still unclear, but this uses the standard If that doesn't resolve your concern, can you show a specific example? For example: /usr/bin/rustc -> /usr/bin/rust-compiler These have the same length paths, but different length targets, so they should be treated as being different. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok, so you're using fs::metadata(path).len(), not symlink_metadata ? Then I'm fine with that as-is |
||||
return None; | ||||
} | ||||
// Try to find the tool in rustup's toolchain directory. | ||||
let tool_exe = Path::new(tool_str).with_extension(env::consts::EXE_EXTENSION); | ||||
let toolchain_exe = home::rustup_home() | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Line 93 in 39684ff
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It shouldn't under normal circumstances, since the rustup proxies set There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it would be a good idea to prohibit this before stablisation @ehuss There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, I went ahead and posted #12101. |
||||
.ok()? | ||||
.join("toolchains") | ||||
.join(&toolchain) | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This could look at pretty random places on the filesystem with a path based toolchain. I suggest breaking out of this logic early based on There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch! I have added that. |
||||
.join("bin") | ||||
.join(&tool_exe); | ||||
toolchain_exe.exists().then_some(toolchain_exe) | ||||
}) | ||||
.unwrap_or_else(|| PathBuf::from(tool_str)) | ||||
} | ||||
|
||||
pub fn jobserver_from_env(&self) -> Option<&jobserver::Client> { | ||||
|
@@ -2645,3 +2697,17 @@ macro_rules! drop_eprint { | |||
$crate::__shell_print!($config, err, false, $($arg)*) | ||||
); | ||||
} | ||||
|
||||
enum Tool { | ||||
Rustc, | ||||
Rustdoc, | ||||
} | ||||
|
||||
impl Tool { | ||||
fn as_str(&self) -> &str { | ||||
match self { | ||||
Tool::Rustc => "rustc", | ||||
Tool::Rustdoc => "rustdoc", | ||||
} | ||||
} | ||||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,262 @@ | ||
//! Tests for Cargo's behavior under Rustup. | ||
use cargo_test_support::paths::{home, root, CargoPathExt}; | ||
use cargo_test_support::{cargo_process, process, project}; | ||
use std::env; | ||
use std::env::consts::EXE_EXTENSION; | ||
use std::ffi::OsString; | ||
use std::fs; | ||
use std::path::{Path, PathBuf}; | ||
|
||
/// Helper to generate an executable. | ||
fn make_exe(dest: &Path, name: &str, contents: &str, env: &[(&str, PathBuf)]) -> PathBuf { | ||
let rs_name = format!("{name}.rs"); | ||
fs::write( | ||
root().join(&rs_name), | ||
&format!("fn main() {{ {contents} }}"), | ||
) | ||
.unwrap(); | ||
let mut pb = process("rustc"); | ||
env.iter().for_each(|(key, value)| { | ||
pb.env(key, value); | ||
}); | ||
pb.arg("--edition=2021") | ||
.arg(root().join(&rs_name)) | ||
.exec() | ||
.unwrap(); | ||
let exe = Path::new(name).with_extension(EXE_EXTENSION); | ||
let output = dest.join(&exe); | ||
fs::rename(root().join(&exe), &output).unwrap(); | ||
output | ||
} | ||
|
||
fn prepend_path(path: &Path) -> OsString { | ||
let mut paths = vec![path.to_path_buf()]; | ||
paths.extend(env::split_paths(&env::var_os("PATH").unwrap_or_default())); | ||
env::join_paths(paths).unwrap() | ||
} | ||
|
||
struct RustupEnvironment { | ||
/// Path for ~/.cargo/bin | ||
cargo_bin: PathBuf, | ||
/// Path for ~/.rustup | ||
rustup_home: PathBuf, | ||
/// Path to the cargo executable in the toolchain directory | ||
/// (~/.rustup/toolchain/test-toolchain/bin/cargo.exe). | ||
cargo_toolchain_exe: PathBuf, | ||
} | ||
|
||
/// Creates an executable which prints a message and then runs the *real* rustc. | ||
fn real_rustc_wrapper(bin_dir: &Path, message: &str) -> PathBuf { | ||
let real_rustc = cargo_util::paths::resolve_executable("rustc".as_ref()).unwrap(); | ||
// The toolchain rustc needs to call the real rustc. In order to do that, | ||
// it needs to restore or clear the RUSTUP environment variables so that | ||
// if rustup is installed, it will call the correct rustc. | ||
let rustup_toolchain_setup = match std::env::var_os("RUSTUP_TOOLCHAIN") { | ||
Some(t) => format!( | ||
".env(\"RUSTUP_TOOLCHAIN\", \"{}\")", | ||
t.into_string().unwrap() | ||
), | ||
None => format!(".env_remove(\"RUSTUP_TOOLCHAIN\")"), | ||
}; | ||
let mut env = vec![("CARGO_RUSTUP_TEST_real_rustc", real_rustc)]; | ||
let rustup_home_setup = match std::env::var_os("RUSTUP_HOME") { | ||
Some(h) => { | ||
env.push(("CARGO_RUSTUP_TEST_RUSTUP_HOME", h.into())); | ||
format!(".env(\"RUSTUP_HOME\", env!(\"CARGO_RUSTUP_TEST_RUSTUP_HOME\"))") | ||
} | ||
None => format!(".env_remove(\"RUSTUP_HOME\")"), | ||
}; | ||
make_exe( | ||
bin_dir, | ||
"rustc", | ||
&format!( | ||
r#" | ||
eprintln!("{message}"); | ||
let r = std::process::Command::new(env!("CARGO_RUSTUP_TEST_real_rustc")) | ||
.args(std::env::args_os().skip(1)) | ||
{rustup_toolchain_setup} | ||
{rustup_home_setup} | ||
.status(); | ||
std::process::exit(r.unwrap().code().unwrap_or(2)); | ||
"# | ||
), | ||
&env, | ||
) | ||
} | ||
|
||
/// Creates a simulation of a rustup environment with `~/.cargo/bin` and | ||
/// `~/.rustup` directories populated with some executables that simulate | ||
/// rustup. | ||
fn simulated_rustup_environment() -> RustupEnvironment { | ||
// Set up ~/.rustup/toolchains/test-toolchain/bin with a custom rustc and cargo. | ||
let rustup_home = home().join(".rustup"); | ||
let toolchain_bin = rustup_home | ||
.join("toolchains") | ||
.join("test-toolchain") | ||
.join("bin"); | ||
toolchain_bin.mkdir_p(); | ||
let rustc_toolchain_exe = real_rustc_wrapper(&toolchain_bin, "real rustc running"); | ||
let cargo_toolchain_exe = make_exe( | ||
&toolchain_bin, | ||
"cargo", | ||
r#"panic!("cargo toolchain should not be called");"#, | ||
&[], | ||
); | ||
|
||
// Set up ~/.cargo/bin with a typical set of rustup proxies. | ||
let cargo_bin = home().join(".cargo").join("bin"); | ||
cargo_bin.mkdir_p(); | ||
|
||
let rustc_proxy = make_exe( | ||
&cargo_bin, | ||
"rustc", | ||
&format!( | ||
r#" | ||
match std::env::args().next().unwrap().as_ref() {{ | ||
"rustc" => {{}} | ||
arg => panic!("proxy only supports rustc, got {{arg:?}}"), | ||
}} | ||
eprintln!("rustc proxy running"); | ||
let r = std::process::Command::new(env!("CARGO_RUSTUP_TEST_rustc_toolchain_exe")) | ||
.args(std::env::args_os().skip(1)) | ||
.status(); | ||
std::process::exit(r.unwrap().code().unwrap_or(2)); | ||
"# | ||
), | ||
&[("CARGO_RUSTUP_TEST_rustc_toolchain_exe", rustc_toolchain_exe)], | ||
); | ||
fs::hard_link( | ||
&rustc_proxy, | ||
cargo_bin.join("cargo").with_extension(EXE_EXTENSION), | ||
) | ||
.unwrap(); | ||
fs::hard_link( | ||
&rustc_proxy, | ||
cargo_bin.join("rustup").with_extension(EXE_EXTENSION), | ||
) | ||
.unwrap(); | ||
|
||
RustupEnvironment { | ||
cargo_bin, | ||
rustup_home, | ||
cargo_toolchain_exe, | ||
} | ||
} | ||
|
||
#[cargo_test] | ||
fn typical_rustup() { | ||
// Test behavior under a typical rustup setup with a normal toolchain. | ||
let RustupEnvironment { | ||
cargo_bin, | ||
rustup_home, | ||
cargo_toolchain_exe, | ||
} = simulated_rustup_environment(); | ||
|
||
// Set up a project and run a normal cargo build. | ||
let p = project().file("src/lib.rs", "").build(); | ||
// The path is modified so that cargo will call `rustc` from | ||
// `~/.cargo/bin/rustc to use our custom rustup proxies. | ||
let path = prepend_path(&cargo_bin); | ||
p.cargo("check") | ||
.env("RUSTUP_TOOLCHAIN", "test-toolchain") | ||
.env("RUSTUP_HOME", &rustup_home) | ||
.env("PATH", &path) | ||
.with_stderr( | ||
"\ | ||
[CHECKING] foo v0.0.1 [..] | ||
real rustc running | ||
[FINISHED] [..] | ||
", | ||
) | ||
.run(); | ||
|
||
// Do a similar test, but with a toolchain link that does not have cargo | ||
// (which normally would do a fallback to nightly/beta/stable). | ||
cargo_toolchain_exe.rm_rf(); | ||
p.build_dir().rm_rf(); | ||
|
||
p.cargo("check") | ||
.env("RUSTUP_TOOLCHAIN", "test-toolchain") | ||
.env("RUSTUP_HOME", &rustup_home) | ||
.env("PATH", &path) | ||
.with_stderr( | ||
"\ | ||
[CHECKING] foo v0.0.1 [..] | ||
real rustc running | ||
[FINISHED] [..] | ||
", | ||
) | ||
.run(); | ||
} | ||
|
||
// This doesn't work on Windows because Cargo forces the PATH to contain the | ||
// sysroot_libdir, which is actually `bin`, preventing the test from | ||
// overriding the bin directory. | ||
#[cargo_test(ignore_windows = "PATH can't be overridden on Windows")] | ||
fn custom_calls_other_cargo() { | ||
// Test behavior when a custom subcommand tries to manipulate PATH to use | ||
// a different toolchain. | ||
let RustupEnvironment { | ||
cargo_bin, | ||
rustup_home, | ||
cargo_toolchain_exe: _, | ||
} = simulated_rustup_environment(); | ||
|
||
// Create a directory with a custom toolchain (outside of the rustup universe). | ||
let custom_bin = root().join("custom-bin"); | ||
custom_bin.mkdir_p(); | ||
// `cargo` points to the real cargo. | ||
let cargo_exe = cargo_test_support::cargo_exe(); | ||
fs::hard_link(&cargo_exe, custom_bin.join(cargo_exe.file_name().unwrap())).unwrap(); | ||
// `rustc` executes the real rustc. | ||
real_rustc_wrapper(&custom_bin, "custom toolchain rustc running"); | ||
|
||
// A project that cargo-custom will try to build. | ||
let p = project().file("src/lib.rs", "").build(); | ||
|
||
// Create a custom cargo subcommand. | ||
// This will modify PATH to a custom toolchain and call cargo from that. | ||
make_exe( | ||
&cargo_bin, | ||
"cargo-custom", | ||
r#" | ||
use std::env; | ||
use std::process::Command; | ||
eprintln!("custom command running"); | ||
let mut paths = vec![std::path::PathBuf::from(env!("CARGO_RUSTUP_TEST_custom_bin"))]; | ||
paths.extend(env::split_paths(&env::var_os("PATH").unwrap_or_default())); | ||
let path = env::join_paths(paths).unwrap(); | ||
let status = Command::new("cargo") | ||
.arg("check") | ||
.current_dir(env!("CARGO_RUSTUP_TEST_project_dir")) | ||
.env("PATH", path) | ||
.status() | ||
.unwrap(); | ||
assert!(status.success()); | ||
"#, | ||
&[ | ||
("CARGO_RUSTUP_TEST_custom_bin", custom_bin), | ||
("CARGO_RUSTUP_TEST_project_dir", p.root()), | ||
], | ||
); | ||
|
||
cargo_process("custom") | ||
// Set these to simulate what would happen when running under rustup. | ||
// We want to make sure that cargo-custom does not try to use the | ||
// rustup proxies. | ||
.env("RUSTUP_TOOLCHAIN", "test-toolchain") | ||
.env("RUSTUP_HOME", &rustup_home) | ||
.with_stderr( | ||
"\ | ||
custom command running | ||
[CHECKING] foo [..] | ||
custom toolchain rustc running | ||
[FINISHED] [..] | ||
", | ||
) | ||
.run(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a lot more complicated and brittle to me than #10998. In reviewing #10998, the only downside I saw listed was that it wasn't being driven by rustup which this has the same problem.
Is there something I'm missing for why we'd prefer this route over #10998?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Querying rustc is certainly a possibility, but the approach taken there incurs additional startup time for an initial cache. Adding a new flag to rustc has a fairly high bar, but adding a transparent optimization in cargo has no user-facing interaction so should be easier to move forward with. So, in terms of the global complexity (changes to cargo, rustc, and/or rustup, user-facing documentation, etc.), this seemed like the simplest solution with the least risk, and can receive benefits immediately rather than waiting a potentially very long time.
If this solution ends up having issues that one of the other solutions could address, then I think it would be worthwhile to re-investigate a different approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if its just a new enumerated value for an existing flag?
If I understand correctly, this solution could start failing and we'd never know it, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, new options are almost always added as an unstable option, and then it needs to go through the process of making a case for the compiler team to stabilize.
It is possible, though I think unlikely in most cases. I think any major regressions would require a significant change in the design of rustup, and I think that is unlikely for the foreseeable future. I could add a test that requires rustup to be installed if that may help with that concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add in that rustup has been consulted and we quite like this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely prefer a solution like this that doesn't add any additional invocations of rustc.