ci: Enable opt-dist for dist-aarch64-linux builds #133807
base: master
Some changes occurred in src/tools/opt-dist cc @Kobzol
Hi! Could you please split the part that moves the job to the aarch64 runner and the PGO/LTO part? So that we can evaluate the CI cost of these two actions separately. Thanks!
ENV SCRIPT python3 ../x.py build --set rust.debug=true opt-dist && \
./build/$HOSTS/stage0-tools-bin/opt-dist linux-ci -- python3 ../x.py dist \
--host $HOSTS --target $HOSTS --include-default-paths build-manifest bootstrap
Either way, this is a completely new Dockerfile, so do you mean just replace this with a simple ./x dist call and then wrap it with opt-dist separately? Just in separate commits, or in separate PRs altogether?
I meant a separate PR, so that we can land these two changes (move to aarch64 host first, and then enable optimizations) separately :)
Makes sense, just wanted to make sure - no problem :)
What improvements are you seeing with this PR, over the current artifacts?
I've not yet benchmarked the changes, and I'm not sure how they compare to the artifacts from cross-compilation because I was only doing aarch64 runs. Specifically, though, adding opt-dist with LTO and PGO seems to increase the binary sizes of the main artifacts as follows:
@bors try Let's also see how long it takes with the optimizations.
ci: Enable opt-dist for dist-aarch64-linux builds Move the CI dist-aarch64-linux job to an aarch64 runner and enable optimised dist builds with the opt-dist pipeline. For the time being, disable bolt on aarch64 due to upstream bolt bugs. r? `@Kobzol` cc `@lqd`
That’s not going to be the good try job Jakub :3
💔 Test failed - checks-actions
Ah, crap. Thanks! @bors try
ci: Enable opt-dist for dist-aarch64-linux builds Move the CI dist-aarch64-linux job to an aarch64 runner and enable optimised dist builds with the opt-dist pipeline. For the time being, disable bolt on aarch64 due to upstream bolt bugs. r? `@Kobzol` cc `@lqd` try-job: dist-aarch64-linux
☀️ Try build successful - checks-actions
So that's an extra hour for LTO+PGO without the cache. 2h22 vs 3h22.
@bors try
☀️ Try build successful - checks-actions
1h54 cached, not so bad. Back to roughly the same time as the x86 cross build then.
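For reference, the three timings quoted in this thread can be compared with a bit of arithmetic. This is only an illustrative sketch: the helper `to_minutes` and the variable names are made up here, and the only inputs are the durations reported above.

```python
import re


def to_minutes(stamp: str) -> int:
    """Convert a duration written like '2h22' into minutes (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)h(\d{1,2})", stamp)
    if match is None:
        raise ValueError(f"unrecognised duration: {stamp!r}")
    hours, minutes = map(int, match.groups())
    return hours * 60 + minutes


# Durations as reported in the comments above.
without_opt = to_minutes("2h22")     # run without LTO+PGO
opt_uncached = to_minutes("3h22")    # LTO+PGO, no cache
opt_cached = to_minutes("1h54")      # LTO+PGO, cached

extra_for_opt = opt_uncached - without_opt    # cost of LTO+PGO without the cache
saved_by_cache = opt_uncached - opt_cached    # what the cache wins back

print(f"LTO+PGO adds {extra_for_opt} min uncached; the cache saves {saved_by_cache} min")
```

The thread does not say whether the 2h22 run was itself cached, so the first difference is best read as the "extra hour" quoted above rather than a precise overhead figure.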
I assume good benchmark results can also help with the cost discussion.
Indeed! You can download the CI artifacts e.g. using rustup-toolchain-install-master and benchmark it locally using rustc-perf. It would be nice to see the perf. diff. Let me know on Zulip if you want help with that.
Huh TIL that this is a thing - very neat, thanks for the suggestion!
Just for completeness, I ran a build with LTO+PGO+BOLT on AArch64. The BOLT hacks and workarounds can get it to work, but the results are very mixed for the time being: there are some improvements in compile times and on average it's beneficial, but the artifact size really explodes. At the moment it doesn't work without out-of-tree patches to LLVM anyway.
Yeah, BOLT is currently quite bad at reusing the old .text segment. For BOLT, don't look at instruction counts; these will be artificially inflated by the larger artifact size. With BOLT, cycles and wall-time should improve, and it looks like they do (a mean wall-time change of -2%; wall-time measurements are quite noisy, but this looks genuine).
Is the "instructions" metric in rustc-perf not just "number of instructions executed"? Apart from that yeah, the improvements are there so once BOLT is fixed we can still enable it.
Yeah, but somewhat counter-intuitively, that includes the load of the binary itself into the address space, the work of the dynamic linker etc., and this can be inflated a lot if the binary suddenly gets a lot larger :) When we enabled BOLT for the Rust compiler (#116352 (comment)), the icounts were horrible, but cycles had really nice improvements.
Although previously, the icount regressions were mostly for tiny crates, which is what I would expect given the larger binary. In your screenshot, the icount regressions were for large crates, where the binary size of the compiler shouldn't have such an effect. Anyway, we'll deal with BOLT later, no need to dig into that now :)
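To make the tiny-crate versus large-crate point concrete, here is a toy model. All numbers are invented for illustration and are not rustc-perf measurements; the function name and parameters are hypothetical. The model assumes each compiler invocation pays a fixed start-up cost (loading the binary, dynamic linking) that grows when the binary gets larger, plus crate-dependent compile work that is left unchanged.

```python
def relative_icount_inflation(work: float,
                              startup_before: float,
                              startup_after: float) -> float:
    """Relative change in total instructions for one compiler invocation.

    work            -- instructions spent compiling the crate (hypothetical)
    startup_before  -- start-up instructions with the smaller binary (hypothetical)
    startup_after   -- start-up instructions with the larger binary (hypothetical)
    """
    before = startup_before + work
    after = startup_after + work
    return (after - before) / before


# Hypothetical figures: start-up cost triples after the binary grows.
tiny_crate = relative_icount_inflation(work=10e6, startup_before=5e6, startup_after=15e6)
large_crate = relative_icount_inflation(work=10e9, startup_before=5e6, startup_after=15e6)

print(f"tiny crate:  {tiny_crate:+.2%}")
print(f"large crate: {large_crate:+.2%}")
```

Under these assumptions the fixed cost dominates short runs and all but disappears in long ones, which is why icount regressions concentrated in large crates would point to something other than binary size alone.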
Ahh right yeah that makes sense - thank you for the explanation! Exactly, let's cross that bridge when we get to it :)
02b958d to d71a321
Move the dist-aarch64-linux CI job to an aarch64 runner instead of cross-compiling it from an x86 one. This will make it possible to perform optimisations such as LTO, PGO and BOLT later on.
Move the CI dist-aarch64-linux job to an aarch64 runner and enable optimised dist builds with the opt-dist pipeline. For the time being, disable bolt on aarch64 due to upstream bolt bugs.
d71a321 to 0f4f465
@Kobzol wanna try opt-dist on centos as well?
@bors try
☀️ Try build successful - checks-actions
Move the CI dist-aarch64-linux job to an aarch64 runner and enable optimised dist builds with the opt-dist pipeline.
For the time being, disable bolt on aarch64 due to upstream bolt bugs.
r? @Kobzol
cc @lqd
try-job: dist-aarch64-linux