Implement cross-language ThinLTO #49879
Comments
I think I personally like option 4 best, although I might spin it a little differently as |
With |
@michaelwoerister yeah I think we could try to pass all the right options by default. I'm not actually sure how you configure full/thin at the linker layer? One neat thing we could do, though, is that if you're on MSVC, for example, we could switch to |
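By passing `-plugin-opt=thinlto` to the linker, I think. If we did it as `-C lto=cross-language`, we'd need another way of selecting thin vs full. Or have `-C lto=thin-cross-language`, which I find aesthetically displeasing :) I'd love if we could shift all of LTO into the linker completely. But that would mean that we essentially can't use the MSVC linker anymore. And the Make jobserver story would regress too. |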
`-Clto=thin,cross`?
On Fri, Apr 13, 2018, 11:14 Michael Woerister wrote:
I'm not actually sure how you configure full/thin at the linker layer?
By passing -plugin-opt=thinlto to the linker, I think.
If we did it as -C lto=cross-language, we'd need another way of selecting
thin vs full. Or have -C lto=thin-cross-language, which I find
aesthetically displeasing :)
I'd love if we could shift *all* of LTO into the linker completely. But
that would mean that we essentially can't use the MSVC linker anymore. And
the Make jobserver story would regress too.
|
@michaelwoerister ah good point, in that case having a separate |
@alexcrichton, do you know how to enable building the |
Ah I've never built it myself so I'm not sure :( |
A little status update:
The next steps are:
|
TODO: Make |
Add some groundwork for cross-language LTO.
Implements part of #49879:
- Adds a `-Z cross-lang-lto` flag to rustc
- Makes sure that bitcode is embedded in object files if the flag is set.
This should already allow for using cross language LTO for staticlibs (where one has to invoke the linker manually anyway). However, `rustc` will not try to enable LTO for its own linker invocations yet.
r? @alexcrichton
I now got it working the other way round too: inlining C code into a Rust executable. It turned out that the linker plugin was actually properly importing functions from the C module. But then it would refuse to inline the C functions. The missing piece was the |
So here is something interesting: I managed to build the Rust compiler and its LLVM with this and I'm seeing compile times reduced by 15-20% for release builds. Most of the reduction seems to come from LLVM just being faster. Translation is only marginally faster. Although I just realized that this isn't a fair comparison since the non-LTO version is compiled with GCC while the LTO version is compiled with CLANG... |
OK, I did another comparison, this time building LLVM with CLANG instead of GCC and the CLANG version is quite a bit faster. It seems that at least 50% of the speedup observed before is just from building with CLANG 6.0 instead of GCC 5.4. |
@michaelwoerister holy cow! Sounds like we should definitely be building with Clang! Do you have time to work on that or should I try to get that landed? |
@alexcrichton You certainly have more experience with the docker images and sccache. Just switching to Clang shouldn't be too hard. Also enabling ThinLTO is a bit more involved because of the linking step. |
Ok cool. @michaelwoerister what was the benchmark you were using? (to verify the claims as well) FWIW for Linux at least we're using a pretty ancient gcc, 4.7, and newer versions may actually have better optimizations as well |
I was using the regex (current master) and style-servo (from rustc-perf) as benchmarks. Newer GCC versions will probably generate faster code than 4.7 but with Clang we have the future option of also using ThinLTO, so I think that's the better choice in any case. |
Agreed! We mainly gotta figure out how to convince clang to work well with our custom libc builds we have all over the place. I'll work on switching to Clang 6 for everything in the near future. |
How did you deal with rdylibs here? |
What were the absolute numbers of the speedup observed, though? |
I didn't have to do anything special for them. |
@michaelwoerister I believe we'd initially have to just take a hit to compile times. We build LLVM as a static library and link it directly into one of the dynamic libraries that we create. In that sense ThinLTO would probably happen when we compile the I do think we're positioned to turn on ThinLTO with LLVM for tier 1 platforms as soon as we're ready, AFAIK it's mostly rustbuild changes. All tier 1 platforms are using Clang 6 right now to compile LLVM |
Yes, that matches with my observations.
I'll see if I can put together a PR to get some numbers. Ideally we'd want to have cross-lang LTO to speed up Another question: At least on Windows and Linux we should use LLD. Is that available on CI? |
Windows has LLD available through the clang 6 download but I believe that for Linux we'll have to compile LLD from source |
@rust-lang/core, I'd like to get this feature (cross-language LTO) stabilized. Do we need an RFC for that? Or is a tracking issue sufficient? |
I'm not sure which exact approach we'd be stabilizing -- there are 4 described in this issue -- could you explain? |
The feature that would be stabilized is cross-language LTO, meaning we'd provide facilities in the Rust compiler to have inlining and other optimizations across language boundaries, performed via linker-based LLVM-LTO plugins. Concretely a
Note that we don't want to guarantee a particular format for the object files generated and especially not that crates compiled with
Some examples of using this:
# C dependency in Rust
#=====================
# compile your C code and put it into a static archive
clang -c my_c_code.c -flto=thin -O2 -o my_c_code.o
llvm-ar rv libmy_c_code.a my_c_code.o
# Use rustc to compile your mixed Rust/C program, letting rustc take care of invoking the linker
# If clang/lld is not your default linker
rustc -Ccross-lang-lto -Clinker=clang -Clink-arg=-fuse-ld=lld -L. -O my_rust_code.rs
# If clang/lld *is* your default linker
rustc -Ccross-lang-lto -L. -O my_rust_code.rs
# If you want to use the Gold linker with a specific plugin
rustc -Ccross-lang-lto=<path to LLVMgold.so> -Clink-arg=-fuse-ld=gold -L. -O my_rust_code.rs

# Rust dependency in C
#=====================
# Compile your C code prepared for (Thin-) LTO
clang -c my_c_code.c -flto=thin -O2 -o my_c_code.o
# Compile your Rust code prepared for (Thin-) LTO into a staticlib
rustc -Ccross-lang-lto -O --crate-type=staticlib my_rust_code.rs
# Use clang/lld to link everything, including the LTO step
clang -fuse-ld=lld -flto=thin -O2 -L. -lmy_rust_code my_c_code.o |
Thanks! I think we don't need an RFC for this -- it seems like a feature addition that is quite limited. However, we should go through the usual FCP on a tracking issue (e.g., here) and cc stakeholders (not sure who, specifically, though). I think the description you give already does this, but I'd also like to be careful to not stabilize anything LLVM specific or generally dependent on a non "standard" linker feature. But it looks like ld/gold/lld support this in some fashion so I'm happy with this! |
Well, it is LLVM specific, which is an interesting point. The linker plugin mechanism is pretty universal for modern Unix linkers, but the compilers and linker plugins in question all have to be LLVM-based. We might want to take this into account somehow. In my opinion, it's not a problem to stabilize something that is only available with the LLVM backend (like we already have with |
If it's LLVM specific then renaming it seems good -- I understood it as something that is currently LLVM specific, but in theory the underlying format could be used by others, e.g. cranelift, though perhaps with a different IR format. I think it's reasonable to rename the flag to -C llvm-cross-lang-lto or something along those lines -- seems low-cost for users and makes the LLVM dependency explicit for us. |
AIUI, cross language LTO is not particularly "cross language" specific. So why should that appear in the flag name? |
It's the only form of LTO that allows for crossing language boundaries. But I agree, it's not specific to multi-language scenarios. Maybe something like |
Obviously this is inferior to having the Rust compiler provide a solution, but I played around with what was available so far on the nightly branch and wasn't very satisfied with what I found, so I thought I'd share the workaround I came up with in case it's of interest to anyone else out there who is trying to link C/C++ and Rust code together with clang. It's a simple script that takes a .rlib file and emits a .a file containing the LLVM bitcode of the library in a format suitable for passing to
This code is released under the UIUC license: https://en.wikipedia.org/wiki/University_of_Illinois/NCSA_Open_Source_License |
@dwightguth, can you elaborate on what you found lacking with what nightly provides at the moment? |
Well, I just couldn't get it to work with Cargo. No matter what I tried, it either didn't pass the -Z flag to rustc, or it crashed because it couldn't link the binaries that my library depended on. Ultimately I think I needed some dependencies to be built with cross-lang-lto and others without, but since there's no way to specify rust flags on a per crate level, I was stuck. EDIT: also, rust nightly uses llvm 8 and since it has not in fact been released yet, I didn't want to upgrade. |
I'm going to raise another issue here since I ran into it trying to get rust and C++ LTO working together, and I want you to be aware of it because it's possible it might impact this. When I try to extract the bitcode from the standard library (eg crate std), the bitcode extracted apparently does not verify with the system llvm. This occurs even if the llvm version used by rustc and the llvm version used to verify the bitcode seem to match, and it also occurs on the latest stable of rustc. However, if I use the version of lld present in the rust distribution, it works, suggesting that the problem has to do with the patches that rust added to llvm. Is it possible that the rustc llvm has been patched in ways that will change the structure of llvm bitcode? and if so, it seems unlikely that you will be able to take full advantage of lto when compiling multiple languages together unless you distribute a compatible version of llvm yourself. |
Yeah, these issues are probably both related to Rust's and Clang's LLVM version not being compatible. Unless you use the same version for both, things will likely fail. At some point we thought that LLD would be able to handle older versions of bitcode so that everything would be fine as long as your LLD is at least as new as the LLVM of the two compilers -- but that doesn't always seem to be true either. As a consequence one has to be rather careful to use the right compiler versions (which can often only be obtained by building from source). In practice this will probably mean that this optimization will only be usable in niche cases. |
This actually happens when both Rust and Clang are supposedly exactly LLVM 6.0. I suspect the issue has to do with rust's non-upstreamed llvm patches, but I don't know more than that. |
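For anyone who wants to at least rule out an obvious mismatch before attempting a cross-language LTO build, here is a minimal sketch of a version check. It assumes you have captured the text printed by `rustc -vV` (which includes an `LLVM version:` line) and by `clang --version`; the function names and the sample strings in `main` are made up for illustration. As noted above, matching major versions is necessary but apparently not sufficient, so treat a positive result as a hint only.

```rust
/// Parses the leading run of ASCII digits in `text` as a number.
fn first_number(text: &str) -> Option<u32> {
    let digits: String = text
        .trim_start()
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}

/// Extracts the LLVM major version from `rustc -vV` output, which contains
/// a line such as "LLVM version: 6.0".
fn rustc_llvm_major(rustc_vv: &str) -> Option<u32> {
    rustc_vv
        .lines()
        .find_map(|line| line.trim().strip_prefix("LLVM version:"))
        .and_then(first_number)
}

/// Extracts the major version from `clang --version` output, whose first
/// line looks like "clang version 6.0.0 (...)".
fn clang_major(clang_version: &str) -> Option<u32> {
    let (_, rest) = clang_version.split_once("clang version ")?;
    first_number(rest)
}

/// Returns true only if both versions could be parsed and the majors agree.
fn versions_match(rustc_vv: &str, clang_version: &str) -> bool {
    match (rustc_llvm_major(rustc_vv), clang_major(clang_version)) {
        (Some(r), Some(c)) => r == c,
        _ => false,
    }
}

fn main() {
    // Illustrative sample outputs, not captured from a real toolchain.
    let rustc_vv = "rustc 1.30.0\nhost: x86_64-unknown-linux-gnu\nLLVM version: 6.0\n";
    let clang = "clang version 6.0.0 (tags/RELEASE_600/final)\nTarget: x86_64-pc-linux-gnu\n";
    println!("LLVM majors match: {}", versions_match(rustc_vv, clang));
}
```

Vendor builds of clang may use their own version numbering, in which case this comparison says nothing useful.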
Is there an example project somewhere showing how to build a C library using cargo such that it can be inlined into Rust? I've been trying to get this enabled in the |
…alexcrichton Stabilize linker-plugin based LTO (aka cross-language LTO) This PR stabilizes [linker plugin based LTO](rust-lang#49879), also known as "cross-language LTO" because it allows for doing inlining and other optimizations across language boundaries in mixed Rust/C/C++ projects. As described in the tracking issue, it works by making `rustc` emit LLVM bitcode instead of machine code, the same as `clang` does. A linker with the proper plugin (like LLD) can then run (Thin)LTO across all modules. The feature has been implemented over a number of pull requests and there are various [codegen](https://github.com/rust-lang/rust/blob/master/src/test/codegen/no-dllimport-w-cross-lang-lto.rs) and [run](https://github.com/rust-lang/rust/tree/master/src/test/run-make-fulldeps/cross-lang-lto-clang)-[make](https://github.com/rust-lang/rust/tree/master/src/test/run-make-fulldeps/cross-lang-lto-upstream-rlibs) [tests](https://github.com/rust-lang/rust/tree/master/src/test/run-make-fulldeps/cross-lang-lto) that make sure that it keeps working. It also works for building big projects like [Firefox](https://treeherder.mozilla.org/#/jobs?repo=try&revision=2ce2d5ddcea6fbff790503eac406954e469b2f5d). The PR makes the feature available under the `-C linker-plugin-lto` flag. As discussed in the tracking issue it is not cross-language specific and also not LLD specific. `-C linker-plugin-lto` is descriptive of what it does. If someone has a better name, let me know `:)`
This has been stabilized in #58057. Closing. |
What is cross-language LTO?
Rust uses LLVM as its code generation backend, as does the Clang C/C++ compiler and many other languages. As a consequence, all of those LLVM-based compilers can produce artifacts that can partake in a common Link-Time-Optimization step, irrespective of the given source language. Thus, in this context, cross-language LTO means that we enable the Rust compiler to produce static libraries that can make use of LLVM-LTO-based linker plugins as exist for newer versions of `ld`, `gold`, and in `lld`.
Why is cross-language LTO a good thing?
In order for Rust to interoperate with code written in other languages, calls have to go through a C interface. This interface poses a boundary for inter-procedural optimizations like inlining. At the same time inter-procedural optimizations are very important for performance. Cross-language LTO makes this boundary transparent to LLVM, effectively allowing for C/C++ code to be inlined into Rust code and vice versa.
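To make the boundary concrete, here is a small sketch of the kind of call that is affected. The function `c_clamp_add` is a hypothetical stand-in: it is written in Rust with the C ABI only so that the example compiles on its own. In a real mixed project its body would live in a C file compiled by clang, `rustc` would see only a declaration, and without LTO across the two compilers the call inside the loop could not be inlined.

```rust
// Hypothetical stand-in for a small C function. In a real project this body
// would be C code in a separate object file; it is defined here with the C
// ABI only to keep the sketch self-contained.
pub extern "C" fn c_clamp_add(a: u32, b: u32) -> u32 {
    a.saturating_add(b)
}

// A hot Rust loop that crosses the C interface once per element. When the
// callee lives in another language's object file, each iteration pays for a
// real call unless link-time optimization can see both sides.
pub fn sum_all(values: &[u32]) -> u32 {
    values.iter().fold(0, |acc, &v| c_clamp_add(acc, v))
}

fn main() {
    let total = sum_all(&[1, 2, 3, 4]);
    println!("total = {}", total);
}
```

The callee is deliberately tiny: small helpers like this are where the cost of a call that cannot be inlined is most noticeable relative to the work done.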
How can it be implemented?
There are several options. The basic requirement is that we emit LLVM bitcode into our object files in a format that the LLVM linker plugin can handle. There are two formats that fulfill this requirement:
- `.o` files that actually aren't object files but plain LLVM bitcode files.
- Bitcode embedded in the `.llvmbc` section of the object file.

Given these requirements there are a few ways of implementing the feature:
1. Always emit bitcode into object files instead of storing them as separate files in RLIBs
2. Just stabilize `-Z embed-bitcode` and require users to do the rest
3. Add a flag that makes `rustc` emit bitcode files instead of object files
4. Add a `-C cross-language-lto` flag that (1) makes the compiler embed bitcode into RLIBs and static libraries, and (2) makes the compiler invoke the linker with the `LLVMgold.so` plugin, if applicable.
   - `rustc` can skip the redundant ThinLTO step for binaries and dylibs

I think I would opt for option (1) since it's the most straightforward to use. EDIT: Added option (4) which I also like.
cc @rust-lang/compiler @alexcrichton
(@rust-lang/wg-codegen might also be interested in this)