Add `rust.lto=off` to bootstrap and set as compiler/library default #107241

clubby789 · 2023-01-23T21:25:45Z

The issue mentions embed-bitcode=on, but here

Lines 379 to 381 in c8e6a9e

    
           if stage >= 1 { 
        
               cargo.rustflag("-Cembed-bitcode=yes"); 
        
           }

it appears that this is always set for std stage 1+, so I'm unsure if changes are needed here.

@rustbot label +A-bootstrap

rustbot · 2023-01-23T21:25:51Z

r? @Mark-Simulacrum

(rustbot has picked a reviewer for you, use r? to override)

ehuss · 2023-01-24T00:04:22Z

src/bootstrap/compile.rs

@@ -722,6 +722,9 @@ impl Step for Rustc {
                    cargo.rustflag("-Cembed-bitcode=yes");
                }
                RustcLto::ThinLocal => { /* Do nothing, this is the default */ }
+                RustcLto::Off => {


Just FYI, this doesn't disable LTO for the compiler built in stage0. I think this would need to go outside the check for the compiler stage.

To elaborate: The reason -Clto doesn't run in stage 0 is per the comment above about LTO not working in stage 0, but the "off" setting is specifically turning it off, so it doesn't have the same restriction.

Kobzol · 2023-01-25T07:31:04Z

Can you confirm by local benchmarks that rebuilds are faster with this option set to off?

clubby789 · 2023-01-25T23:20:48Z

x b library with lto=off -

x b library  1260.84s user 31.05s system 527% cpu 4:05.01 total
x b library  1251.37s user 30.98s system 526% cpu 4:03.67 total
x b library  1248.36s user 30.92s system 510% cpu 4:10.61 total

x b library with lto=thin-local -

x b library  1462.13s user 38.48s system 567% cpu 4:24.46 total
x b library  1471.53s user 39.20s system 551% cpu 4:33.82 total
x b library  1474.35s user 39.11s system 532% cpu 4:44.30 total

For x b library --keep-stage 0 (after touch library/core/src/lib.rs), the average times were around 35s to build stage1 std with thin-local, and 15-20s for off.

Kobzol · 2023-01-26T07:52:19Z

Looks great! Hopefully the compiler won't be unusably slow without any LTO.

albertlarsan68 · 2023-01-26T08:53:41Z

$ hyperfine -L mode off,thinlocal -p "x clean; x build library --dry-run" "x build library --config config.{mode}.toml";hyperfine -L mode off,thinlocal -p "x clean; x build library; touch library/core/src/lib.rs" "x build library --config config.{mode}.toml"

# Clean build
Benchmark 1: x build library --config config.off.toml
  Time (mean ± σ):     25.388 s ±  0.361 s    [User: 59.866 s, System: 5.066 s]
  Range (min … max):   25.094 s … 26.181 s    10 runs

Benchmark 2: x build library --config config.thinlocal.toml
  Time (mean ± σ):     26.330 s ±  0.308 s    [User: 68.647 s, System: 6.686 s]
  Range (min … max):   26.096 s … 27.065 s    10 runs

Summary
  'x build library --config config.off.toml' ran
    1.04 ± 0.02 times faster than 'x build library --config config.thinlocal.toml'

# Frech incr comp (touch library/core/src/lib.rs)
Benchmark 1: x build library --config config.off.toml
  Time (mean ± σ):     11.516 s ±  0.159 s    [User: 20.368 s, System: 3.478 s]
  Range (min … max):   11.392 s … 11.936 s    10 runs

Benchmark 2: x build library --config config.thinlocal.toml
  Time (mean ± σ):     25.453 s ±  0.412 s    [User: 65.687 s, System: 5.908 s]
  Range (min … max):   25.105 s … 26.519 s    10 runs

Summary
  'x build library --config config.off.toml' ran
    2.21 ± 0.05 times faster than 'x build library --config config.thinlocal.toml'

lqd · 2023-01-26T09:18:02Z

How much slower is the resulting compiler built without LTO, e.g. on one of the longer check build benchmarks from rustc-perf (like cargo, diesel, keccak) ?

Kobzol · 2023-01-26T11:34:50Z

Oh, I'm stupid, I didn't realize that this PR also changed the default to be lto=off. Therefore I was just benchmarking the same code against itself before...

Anyway, without any LTO, it's (expectedly) quite slower:

But I would say that it's not catastrophically slower (100+%), and the rustc incremental compilation wins are really nice. This should definitely be propagated as a way of speeding up (incremental) compiler builds, and I think that it would also make sense to make it be the default.

Although maybe implementing it and switching the default should be split into 2 PRs.

jyn514

Other than Eric's comment, this looks great!

src/bootstrap/defaults/config.compiler.toml

lqd · 2023-01-26T16:38:20Z

Let's also notify people if the default for these profiles is changed: a 30% difference is significant enough to be surprising/confusing, especially when a PR will need interacting with the perfbot, its results are likely to be quite different from local runs.

Kobzol · 2023-01-26T16:40:42Z

Indeed, and probably we should also add it to the rustc-dev-guide, e.g. here: https://rustc-dev-guide.rust-lang.org/profiling.html.

jyn514 · 2023-01-26T17:10:11Z

Let's also notify people if the default for these profiles is changed: a 30% difference is significant enough to be surprising/confusing, especially when a PR will need interacting with the perfbot, its results are likely to be quite different from local runs.

What do you mean by "notify people"? In the past, we've used blog posts and bumping MAJOR_VERSION in src/bootstrap/main.rs - I would be fine with a blog post, but I don't think changing the version number is warranted since it only affects some profiles.

jyn514 · 2023-01-26T17:11:01Z

Anyway, without any LTO, it's (expectedly) quite slower:

Do you have a link to the perf run for that screenshot? I'm curious to know how much it improves bootstrap times for the compiler itself.

Kobzol · 2023-01-26T17:15:27Z

I ran it locally, just a few crates, I didn't test bootstrap. We could run the whole perf. suite in this PR by changing this line to --set rust.lto=off.

Actually, now that I think about it, we should be very careful with the off default. There are a lot of dist builders out there
and currently only dist-x86_64-msvc and dist-x86_64-linux (maybe also OS X? not sure) overwrite the default to thin. So if the default changes to off, we could start building much slower rustc artifacts for all targets except x64 Linux and x64 Windows.

Edit: I'm not sure if the defaults in the modified .toml example files are used in CI though. If not, then it's fine.

lqd · 2023-01-26T17:37:40Z

What do you mean by "notify people"?

Nothing necessarily major, but like the announcements at the beginning of t-compiler meetings, a zulip message there, and/or one to the wg-compiler-performance zulip stream. Not a full MCP or Inside Rust blog post I don't think, just a heads up that this is coming. Using these profiles is the preferred way described in the dev guide IIRC, and the perf difference could therefore surprise many contributors ?

jyn514 · 2023-01-26T17:38:35Z

@Kobzol no, CI intentionally doesn't use any of the customized defaults (I want to change it to use profile = "user" at some point, but currently it doesn't even do that).

Kobzol · 2023-01-26T17:39:19Z

Ah, good to know :) Then it should be fine.

Mark-Simulacrum · 2023-01-29T18:54:04Z

r=me after a notification at next meeting, I'll nominate for putting it on the agenda.

michaelwoerister · 2023-02-02T15:48:47Z

Did anyone look into how this affects ./x test? In such a case we are invoking the (now slower) compiler thousands of times.

EDIT: I don't mean to block this PR -- it's just something we might want to look out for after it has landed.

clubby789 · 2023-02-02T17:14:39Z

$ hyperfine -L mode off,thin-local -s "x clean; x build library --config config.{mode}.toml" "x t tests/ui --force-rerun --config config.{mode}.toml"
Benchmark 1: x t tests/ui --force-rerun --config config.off.toml
  Time (mean ± σ):     192.653 s ±  7.313 s    [User: 926.090 s, System: 301.098 s]
  Range (min … max):   186.157 s … 200.573 s    3 runs
 
Benchmark 2: x t tests/ui --force-rerun --config config.thin-local.toml
  Time (mean ± σ):     190.043 s ±  4.840 s    [User: 852.896 s, System: 298.910 s]
  Range (min … max):   185.453 s … 195.100 s    3 runs
 
Summary
  'x t tests/ui --force-rerun --config config.thin-local.toml' ran
    1.01 ± 0.05 times faster than 'x t tests/ui --force-rerun --config config.off.toml'

pnkfelix · 2023-02-02T20:14:00Z

We discussed this at today's T-compiler triage meeting.

We concluded by saying we'd like to see this PR move forward, and we'll just need to keep an eye on whether there's negative fallout from the change in defaults. But overall, we expect that a decent proportion of the developers will benefit from this change.

@rustbot label: -I-compiler-nominated

Nothing necessarily major, but like the announcements at the beginning of t-compiler meetings, a zulip message there, and/or one to the wg-compiler-performance zulip stream.

oh, and after re-reading @lqd 's message above, I opted to also post a topic about this PR moving forward in the t-compiler/performance zulip stream.

pnkfelix · 2023-02-02T20:14:27Z

@bors r=simulacrum rollup=never

(I opted not to roll this up. I know its not supposed to affect perf.rlo, but if anything that's all the more reason to ensure that its very easy to bisect, if someone is locally observing a regression and wants to track it down to this.)

bors · 2023-02-02T20:14:29Z

📌 Commit 2adf26f has been approved by simulacrum

It is now in the queue for this repository.

bors · 2023-02-03T01:19:07Z

⌛ Testing commit 2adf26f with merge f02439d...

bors · 2023-02-03T04:49:39Z

☀️ Test successful - checks-actions
Approved by: simulacrum
Pushing f02439d to master...

rust-timer · 2023-02-03T06:10:25Z

Finished benchmarking commit (f02439d): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.9%	[1.3%, 3.0%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

Sync codegen defaults with compiler defaults and add a ping message so they stay in sync Looks like this got missed in rust-lang#107241.

rustbot assigned Mark-Simulacrum Jan 23, 2023

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels Jan 23, 2023

clubby789 force-pushed the bootstrap-lto-off branch from b582992 to d6549fc Compare January 23, 2023 21:28

ehuss reviewed Jan 24, 2023

View reviewed changes

clubby789 force-pushed the bootstrap-lto-off branch from d6549fc to 54b6917 Compare January 24, 2023 11:32

This comment was marked as outdated.

Sign in to view

jyn514 approved these changes Jan 26, 2023

View reviewed changes

src/bootstrap/defaults/config.compiler.toml Show resolved Hide resolved

Add rust.lto=off to bootstrap

2adf26f

clubby789 force-pushed the bootstrap-lto-off branch from 54b6917 to 2adf26f Compare January 26, 2023 12:25

clubby789 changed the title ~~Add rust.lto=off to bootstrap~~ Add rust.lto=off to bootstrap and set as compiler/library default Jan 26, 2023

Mark-Simulacrum added the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Jan 29, 2023

rustbot removed the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Feb 2, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 2, 2023

bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 3, 2023

bors merged commit f02439d into rust-lang:master Feb 3, 2023

rustbot added this to the 1.69.0 milestone Feb 3, 2023

bors mentioned this pull request Feb 3, 2023

bootstrap: Include line numbers in debuginfo by default for library/compiler profile #104968

Closed

jyn514 mentioned this pull request Mar 1, 2023

Sync codegen defaults with compiler defaults and add a ping message so they stay in sync #108616

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `rust.lto=off` to bootstrap and set as compiler/library default #107241

Add `rust.lto=off` to bootstrap and set as compiler/library default #107241

clubby789 commented Jan 23, 2023

rustbot commented Jan 23, 2023

ehuss Jan 24, 2023 •

edited

Loading

Kobzol commented Jan 25, 2023

clubby789 commented Jan 25, 2023 •

edited

Loading

Kobzol commented Jan 26, 2023

albertlarsan68 commented Jan 26, 2023 •

edited

Loading

lqd commented Jan 26, 2023

This comment was marked as outdated.

Kobzol commented Jan 26, 2023 •

edited

Loading

jyn514 left a comment

lqd commented Jan 26, 2023

Kobzol commented Jan 26, 2023

jyn514 commented Jan 26, 2023

jyn514 commented Jan 26, 2023

Kobzol commented Jan 26, 2023 •

edited

Loading

lqd commented Jan 26, 2023

jyn514 commented Jan 26, 2023 •

edited

Loading

Kobzol commented Jan 26, 2023

Mark-Simulacrum commented Jan 29, 2023

michaelwoerister commented Feb 2, 2023 •

edited

Loading

clubby789 commented Feb 2, 2023

pnkfelix commented Feb 2, 2023 •

edited

Loading

pnkfelix commented Feb 2, 2023 •

edited

Loading

bors commented Feb 2, 2023

bors commented Feb 3, 2023

bors commented Feb 3, 2023

rust-timer commented Feb 3, 2023

Add rust.lto=off to bootstrap and set as compiler/library default #107241

Add rust.lto=off to bootstrap and set as compiler/library default #107241

Conversation

clubby789 commented Jan 23, 2023

rustbot commented Jan 23, 2023

ehuss Jan 24, 2023 • edited Loading

Choose a reason for hiding this comment

Kobzol commented Jan 25, 2023

clubby789 commented Jan 25, 2023 • edited Loading

Kobzol commented Jan 26, 2023

albertlarsan68 commented Jan 26, 2023 • edited Loading

lqd commented Jan 26, 2023

This comment was marked as outdated.

Kobzol commented Jan 26, 2023 • edited Loading

jyn514 left a comment

Choose a reason for hiding this comment

lqd commented Jan 26, 2023

Kobzol commented Jan 26, 2023

jyn514 commented Jan 26, 2023

jyn514 commented Jan 26, 2023

Kobzol commented Jan 26, 2023 • edited Loading

lqd commented Jan 26, 2023

jyn514 commented Jan 26, 2023 • edited Loading

Kobzol commented Jan 26, 2023

Mark-Simulacrum commented Jan 29, 2023

michaelwoerister commented Feb 2, 2023 • edited Loading

clubby789 commented Feb 2, 2023

pnkfelix commented Feb 2, 2023 • edited Loading

pnkfelix commented Feb 2, 2023 • edited Loading

bors commented Feb 2, 2023

bors commented Feb 3, 2023

bors commented Feb 3, 2023

rust-timer commented Feb 3, 2023

Overall result: no relevant changes - no action needed

Instruction count

Max RSS (memory usage)

Cycles

Add `rust.lto=off` to bootstrap and set as compiler/library default #107241

Add `rust.lto=off` to bootstrap and set as compiler/library default #107241

ehuss Jan 24, 2023 •

edited

Loading

clubby789 commented Jan 25, 2023 •

edited

Loading

albertlarsan68 commented Jan 26, 2023 •

edited

Loading

Kobzol commented Jan 26, 2023 •

edited

Loading

Kobzol commented Jan 26, 2023 •

edited

Loading

jyn514 commented Jan 26, 2023 •

edited

Loading

michaelwoerister commented Feb 2, 2023 •

edited

Loading

pnkfelix commented Feb 2, 2023 •

edited

Loading

pnkfelix commented Feb 2, 2023 •

edited

Loading