Allow for re-using monomorphizations in upstream crates. #48779
r? @estebank (rust_highfive has picked a reviewer for you, use r? to override)
@bors try
⌛ Trying commit 35c9b1f2f0a365187789fc684cdb5eba9afb81dd with merge 55221f5e2d6d5c71f6b89674eb29f2a213f415ef...
☀️ Test successful - status-travis
@Mark-Simulacrum, could you do a perf run for this too, please?
Perf queued. Probably about 40-45 minutes until it starts.
Thanks, @Mark-Simulacrum! Here's the link:
@Mark-Simulacrum, the results don't seem to be available yet. Is it still in the queue or has something gone wrong?
9a1af56 to 710e4d6
Hm, it does look like something went wrong -- I've restarted the build.
Will the link be the same?
Hm, it failed again -- I'm going to try and keep an eye on it and hopefully diagnose why. It also turns out we weren't properly logging the failures for try builds previously, so I've now corrected that as well.
URL works now!
Thanks, Mark!
OK, so those numbers look good. I had hoped that they would be even better though. It looks like it's mostly small functions that get re-used. But yeah, -15.9% for tokio-web-push, I'll take it.
@rust-lang/compiler & @alexcrichton, do you have any objections to pursuing this further? There's a description at the top and performance numbers are here: http://perf.rust-lang.org/compare.html?start=6f2100b92cb14fbea2102701af6a3ac5814bd06c&end=55221f5e2d6d5c71f6b89674eb29f2a213f415ef&stat=instructions%3Au
Awesome work here @michaelwoerister! It's pretty neat how it's not too difficult to play around with various schemes like this these days :) One concern I might have here is the size of binaries, but given that this only affects debug mode rather than optimized, I guess it doesn't matter too much? We rely on In general though seems like a great idea to me to keep pursuing, any bugs or surprises along the way we can probably smooth over! For the diamond problem you gisted above, is this what all that "link once ODR" stuff is for in LLVM? I feel like that's all basically intended for optimized binaries linking only one copy rather than for debug mode, so it may not benefit us much if we don't turn this on in optimized mode. Speaking of optimized mode though, we may actually be able to get some nice wins here with
The monomorphizations are still assigned We could look into
Also, thanks for the feedback, @alexcrichton!
Oh right that's true, I'd sort of doubt that I think that for executables we don't currently pass linker scripts/symbol whitelists, but AFAIK that's because we just never have before. We could likely start now!
OK, I'll make sure we do as part of the PR.
We discussed this in the @rust-lang/compiler meeting today. Everybody felt pretty good about it. It'd be nice to land this and possibly do further experimentation to see if we can enable in optimized builds without hurting perf.
be1e8f6 to d4264dc
…upport Rust dylibs.
679ba55 to 61991a5
@bors r=alexcrichton
📌 Commit 61991a5 has been approved by
Allow for re-using monomorphizations in upstream crates.

Followup to #48611. This implementation is pretty much finished modulo failing tests if there are any. Not quite ready for review yet though.

### DESCRIPTION

This PR introduces a `share-generics` mode for RLIBs and Rust dylibs. When a crate is compiled in this mode, two things will happen:

- before instantiating a monomorphization in the current crate, the compiler will look for that monomorphization in all upstream crates and link to it, if possible.
- monomorphizations are not internalized during partitioning. Instead they are added to the list of symbols exported from the crate.

This results in less code being translated and LLVMed. However, there are also downsides:

- it will impede optimization somewhat, since fewer functions can be internalized, and
- Rust dylibs will have bigger symbol tables since they'll also export monomorphizations.

Consequently, this PR only enables the `share-generics` mode for opt-levels `No`, `Less`, `Size`, and `MinSize`, and for when incremental compilation is activated. `-O2` and `-O3` will still generate generic functions per-crate.

Another thing to note is that this has a somewhat similar effect to MIR-only RLIBs, in that monomorphizations are shared, but it is less effective because it cannot share monomorphizations between sibling crates:

```
       A      <--- defines `fn foo<T>() { .. }`
      / \
     /   \
    B     C   <--- both call `foo<u32>()`
     \   /
      \ /
       D      <--- calls `foo<u32>()` too
```

With `share-generics`, both `B` and `C` have to instantiate `foo<u32>` and only `D` can re-use it (from either `B` or `C`). With MIR-only RLIBs, `B` and `C` would not instantiate anything, and in `D` we would then only instantiate `foo<u32>` once.

On the other hand, when there are many leaf crates in the graph (e.g. when compiling many individual test binaries) then the `share-generics` approach will often be more effective.
### TODO

- [x] Add codegen test that makes sure monomorphizations can be internalized in non-Rust binaries.
- [x] Add codegen-units test that makes sure we share generics.
- [x] Add run-make test that makes sure we don't export any monomorphizations from non-Rust binaries.
- [x] Review for reproducible-builds implications.
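The diamond example and the opt-level rule from the description can be sketched as a small toy model. This is only an illustration, not the compiler's implementation: `OptLevel`, `share_generics_enabled`, `Krate`, `instantiations`, and `diamond` are all hypothetical names made up for this sketch, and monomorphizations are represented as plain strings. It assumes crates are listed in dependency order and that a crate built in this mode re-exports everything it can see from upstream.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy stand-in for the opt-levels named in the description (not rustc's own type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum OptLevel { No, Less, Default, Aggressive, Size, MinSize }

/// Hypothetical predicate for the rule in the description: share generics for
/// `No`, `Less`, `Size`, `MinSize`, or whenever incremental compilation is on.
fn share_generics_enabled(opt: OptLevel, incremental: bool) -> bool {
    incremental
        || matches!(opt, OptLevel::No | OptLevel::Less | OptLevel::Size | OptLevel::MinSize)
}

/// A toy crate: its direct dependencies and the monomorphizations it calls.
struct Krate {
    name: &'static str,
    deps: Vec<&'static str>,
    uses: Vec<&'static str>,
}

type Instances = BTreeSet<&'static str>;

/// Returns, per crate, the monomorphizations that crate must instantiate itself.
/// `krates` must be in dependency order (upstream crates first).
fn instantiations(krates: &[Krate], share: bool) -> BTreeMap<&'static str, Instances> {
    // Monomorphizations each crate makes visible to its downstream crates.
    let mut exported: BTreeMap<&'static str, Instances> = BTreeMap::new();
    let mut local_per_crate = BTreeMap::new();
    for k in krates {
        // Everything reachable through upstream crates.
        let mut visible = Instances::new();
        for dep in &k.deps {
            visible.extend(exported[dep].iter().copied());
        }
        // Instantiate locally unless sharing is on and an upstream copy exists.
        let mut local = Instances::new();
        for mono in &k.uses {
            if !(share && visible.contains(mono)) {
                local.insert(*mono);
            }
        }
        // Without sharing, monomorphizations are internalized and never exported.
        if share {
            visible.extend(local.iter().copied());
        }
        exported.insert(k.name, visible);
        local_per_crate.insert(k.name, local);
    }
    local_per_crate
}

/// The diamond from the description: A defines `foo<T>`, B, C and D call `foo<u32>`.
fn diamond() -> Vec<Krate> {
    vec![
        Krate { name: "A", deps: vec![], uses: vec![] },
        Krate { name: "B", deps: vec!["A"], uses: vec!["foo<u32>"] },
        Krate { name: "C", deps: vec!["A"], uses: vec!["foo<u32>"] },
        Krate { name: "D", deps: vec!["B", "C"], uses: vec!["foo<u32>"] },
    ]
}

fn main() {
    for share in [false, true] {
        let per_crate = instantiations(&diamond(), share);
        let total: usize = per_crate.values().map(|s| s.len()).sum();
        println!("share={share}: {total} instantiation(s): {per_crate:?}");
    }
    let levels = [
        OptLevel::No, OptLevel::Less, OptLevel::Default,
        OptLevel::Aggressive, OptLevel::Size, OptLevel::MinSize,
    ];
    for opt in levels {
        println!("{opt:?}: shared = {}", share_generics_enabled(opt, false));
    }
}
```

In this model the sibling crates `B` and `C` cannot see each other, so each still gets its own copy of `foo<u32>`; only `D` benefits, which is the limitation the description contrasts with MIR-only RLIBs.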
☀️ Test successful - status-appveyor, status-travis
@@ -531,3 +530,9 @@ impl_stable_hash_for!(struct GeneratorData<'tcx> { layout });
// Tags used for encoding Spans:
pub const TAG_VALID_SPAN: u8 = 0;
pub const TAG_INVALID_SPAN: u8 = 1;

#[derive(RustcEncodable, RustcDecodable)]
pub struct EncodedExportedSymbols {
This should have a comment on it, that it's used to avoid adding a `'tcx` parameter to `CrateRoot` (which I'm not even sure is a problem, if covariant, we'd just store it as `'static`).