Performance regression in 1.72 related to normalization of opaques with late-bound vars #115283
Comments
In the meantime, I'll try to look at the flamegraphs for the code you shared... |
@kornelski: Is there another step I need to do in this build? 😅
|
Thank you for checking it out; sorry about the build error. I thought I had removed all of these. CSS is irrelevant here, can be an empty string. |
I'm finding several main sources of slowness here:
|
WG-prioritization assigning priority (Zulip discussion). @rustbot label -I-prioritize +P-high |
I'm probably not helping much, but I have a warp app with a decent number of warp filters and I'm typically seeing about 16s for incremental builds with 1.71 and around 10 minutes for the same incremental build with 1.72. |
@darkprokoba Warp (the web framework) Last time I checked, one workaround to dramatically speed up compile times was to add |
I'm already using .boxed() for my debug builds... |
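For readers unfamiliar with the boxing workaround mentioned above: boxing hides a deeply nested concrete type behind a trait object, so callers only see one small, fixed type. The sketch below shows that idea with plain `std` futures rather than warp filters. `wrap`, `nested`, `nested_boxed`, `block_on`, and `NoopWaker` are hypothetical helpers written for this illustration; they are not part of warp or of any code in this issue, and whether boxing helps a given project's compile time is something to measure rather than assume.

```rust
use std::future::Future;
use std::mem::size_of_val;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical helper: every call adds one more layer to the future's type.
async fn wrap<F: Future>(f: F) -> F::Output {
    f.await
}

// A type-erased future: callers see this alias, not the nested type behind it.
type BoxFuture<T> = Pin<Box<dyn Future<Output = T>>>;

// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal busy-polling executor, only suitable for this demonstration.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

// The concrete return type here is three layers of `wrap` around an async block.
fn nested() -> impl Future<Output = u32> {
    wrap(wrap(wrap(async { 7u32 })))
}

// Boxing erases those layers at this boundary.
fn nested_boxed() -> BoxFuture<u32> {
    Box::pin(nested())
}

fn main() {
    let boxed = nested_boxed();
    // The handle is a fat pointer (data pointer plus vtable pointer).
    println!("boxed handle: {} bytes", size_of_val(&boxed));
    println!("result: {}", block_on(boxed));
}
```

The trade-off is one heap allocation and dynamic dispatch per boxed future, which is why some people only box in debug builds.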
For any issues with 1.72.0, it would be good to check with beta and nightly as well. #114948 has fixed some 1.72.0 regressions already, for example. If the issues still happen on nightly, then some details on how to reproduce the regressions will also be very useful. |
At https://github.com/zachs18/zbus-repro/tree/zachs18-minimized I have a similar perf regression using
// this `async fn` is important, inlining the body into an `async` block in its use below does not cause perf issue
async fn asyncfn() {
    let _ = zbus::Connection::session().await;
}
pub fn box_pin_asyncfn() {
    // this binding is important, inlining the variable into its use below does not cause perf issue
    let future = asyncfn();
    let _ = Box::pin(future);
}
This is reproducible on the latest nightly ( |
Note for people doing triage:
|
I can't provide a reproducible example, but I have a medium-to-large project and I'm seeing the exact same issue. I held off upgrading the project to 1.72 due to 5x compile time overall. For what it's worth, it seems the issue is more connected to the 'diesel' part of our project, which has a lot of proc-macros. I can't say for sure it's related, but it seems like it. |
I've found that removing of
let __tracing_instrument_future = async move {
    if false {
        let __tracing_attr_fake_return: Result<HttpResponse, ServerError> = {
            ::core::panicking::panic_fmt(
                format_args!(
                    "internal error: entered unreachable code: {0}",
                    format_args!(
                        "this is just for type inference, and is unreachable code",
                    ),
                ),
            );
        };
        return __tracing_attr_fake_return;
    }
    // …
}
Here's a branch with a lot of the code cut out: |
As of rustc 1.74.0-nightly (ec08a03 2023-09-04) and https://gitlab.com/lib.rs/main/-/commit/422f6790dd7b6c0798a8dd9cc10ca6f22fbe3de6 the heaviest stack trace is:
|
I think I have another clue: async calls across crates seem to be particularly expensive. I've been deleting thousands of lines of code from other parts of the project without noticeable speed improvement, but when I removed all calls from |
Cross-crate versus local async calls shouldn't matter at this point in the compilation. The problem here has to do with the depth of futures (futures that contain many futures inside of them are problematic). Probably by removing a cross-crate future, you're just pruning it and the many many successive futures that it contains. |
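To make "depth of futures" concrete, here is a small sketch of how each wrapper embeds the future it awaits, so the outer type contains every inner one. `wrap` and `depth_sizes` are hypothetical names for this illustration only. The runtime size printed here is just a visible proxy for that nesting; it is not a measurement of the compile-time cost discussed in this issue.

```rust
use std::future::Future;
use std::mem::size_of_val;

// Hypothetical wrapper: the returned future stores `f` inside itself,
// so its type (and its size) includes the whole of `F`.
async fn wrap<F: Future>(f: F) -> F::Output {
    f.await
}

// Sizes in bytes of the same async block wrapped one, two, and three times.
fn depth_sizes() -> [usize; 3] {
    let d1 = size_of_val(&wrap(async { 0u8 }));
    let d2 = size_of_val(&wrap(wrap(async { 0u8 })));
    let d3 = size_of_val(&wrap(wrap(wrap(async { 0u8 }))));
    [d1, d2, d3]
}

fn main() {
    let [d1, d2, d3] = depth_sizes();
    println!("depth 1: {d1} bytes, depth 2: {d2} bytes, depth 3: {d3} bytes");
}
```

Removing one call near the top of such a chain removes everything nested beneath it, which would be consistent with the pruning effect described above.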
Rust 1.72.0 introduced compile time regressions for async heavy code: rust-lang/rust#115283 Reverting to 1.71.0 gets us back to reasonable compile times. We should stay there until a fix is released.
2734: fix(*): pin rust to 1.71.0 until compile time regression fixed r=zacharyhamm a=zacharyhamm Rust 1.72.0 introduced compile time regressions for async heavy code: rust-lang/rust#115283 Reverting to 1.71.0 gets us back to reasonable compile times for sdf. We should stay there until a fix is released. Co-authored-by: Zachary Hamm
I don't see any improvement in my compile times with either 1.72.1 or 1.74.0-nightly (bdb0fa3 2023-09-19) |
@darkprokoba as per your previous comment you seem to be using the Warp web framework. As mentioned before, would it be possible to support your case with a minimal reproducible example, even a small Warp application showing the regression? It could help us understand whether the slowdown you experience is connected to this issue or perhaps to something else. |
So I tried to do this with a dummy warp application with 600 filters, each with its own async handler function. I see no difference in compilation times between 1.71.1 and 1.72.2 :-( I have no idea how to arrive at a minimal reproducible case for this problem. Any pointers would be much appreciated. Can I run the compiler through a profiler and post the gathered data somewhere, as explained here: https://rustc-dev-guide.rust-lang.org/profiling/with_perf.html Anything else I should try? |
Great news, 1.74.0-nightly (5ae769f 2023-09-26) seems to be slightly faster than 1.71.1 :-) Not sure who fixed this or when, but kudos to them! |
I'm also seeing a big improvement in the latest nightly: compilation now takes 6s-16s, instead of minutes. In 1.71.1 it takes 3s-5s, so strictly speaking it's still a couple of times slower, but it's usable now. It might be thanks to one of these commits (I'm unsure if |
I wonder what fixed it -- maybe drop_tracking_mir getting stabilized? |
I've created a minimal example (maybe this could be added to Rust's perf tests?). It requires interleaving async and closures, and a lifetime in opaque types.
use core::future::Future;
fn main() {
    // let a = String::new(); // fast
    let a = &String::new(); // slow
    let b = async_wrap2(async_wrap2(async_wrap2(async_wrap2(async_wrap2(async_wrap2(async {
        let _d = vec![a];
    }))))));
    let c = closure_wrap(|| b);
    _ = async { c().await }
}
fn closure_wrap<R>(f: impl FnOnce() -> R) -> impl FnOnce() -> R {
    || f()
}
async fn async_wrap2<F, R>(f: F) -> R
where
    F: Future<Output = R>,
{
    (|| async { f.await })().await
}
|
On my side, on 1.72, 1.73 and current nightly it's still the same 4x slowdown from 1.71. I ran self-profile, and summarized it (as per https://fasterthanli.me/articles/why-is-my-rust-build-so-slow), and I could see that on 1.71, the top 2 are So it seems like the previously slowest steps got faster, but these 3 steps that previously took milliseconds now take over 7x the time of the total previous build. |
Upon further investigation, I see that the issue is related in some way to diesel-rs/diesel#3223 on my project: some bad interaction between diesel and some change in the compiler, from 1.72 onwards up to the current nightly. |
Upgrade from 1.71 to 1.72 has made compilation time of my async-heavy actix server 350 times slower (from under 5s to 30 minutes, on a 32GB M1 Max CPU).
I've bisected it to commit a20a04e (#113108).
The slowdown is still in the latest nightly (1.74.0 69e97df 2023-08-26). 1.72 and the nightly spend 30% of all compilation time in Interners::intern_ty, mainly called by <rustc_middle[b15c2eca62e7285b]::ty::Ty as rustc_type_ir[f43ddb3fb00a443c]::fold::TypeSuperFoldable<rustc_middle[b15c2eca62e7285b]::ty::context::TyCtxt>>::try_super_fold_with::<rustc_middle[b15c2eca62e7285b]::ty::generic_args::ArgFolder> and <rustc_middle[b15c2eca62e7285b]::ty::Ty as rustc_type_ir[f43ddb3fb00a443c]::fold::TypeSuperFoldable<rustc_middle[b15c2eca62e7285b]::ty::context::TyCtxt>>::try_super_fold_with::<rustc_trait_selection[961e83e42176765a]::traits::project::AssocTypeNormalizer>.
The reduced code is in: https://gitlab.com/lib.rs/main/-/commits/reproducer
outdated info
Apologies for a big spaghetti reproducer. Let me know if it'd be useful to reduce it.
git clone --recursive https://gitlab.com/lib.rs/main
cd main
git reset --hard 81c612b
cargo build -p crates-server
I suspect the culprit is this function, which is used in several places in server/src/main.rs:
@rustbot modify labels: +regression-from-stable-to-stable -regression-untriaged