Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Optimize validator binaries with LTO #4311

Closed
crystalin opened this issue Nov 17, 2021 · 14 comments
Closed

Optimize validator binaries with LTO #4311

crystalin opened this issue Nov 17, 2021 · 14 comments

Comments

@crystalin
Copy link

crystalin commented Nov 17, 2021

Currently, the polkadot binary is generated with the default release profile. This allow a good binary with a decent compilation time.
However using compilations options CARGO_PROFILE_RELEASE_LTO=true RUSTFLAGS="-C codegen-units=1" cargo build --release , it provides a much more efficient binary, 200-400% cpu perf (at the cost of twice longer to compile).

I suggest to not include it in the Cargo.toml to avoid people developing on it to get a longer compilation time. In Moonbeam, we include it only in the CI for the release binary and we added those flags in the documentation for operators building their own binary.

I believe it is important to start optimizing the validators as it is becoming a bottleneck to process the parachain blocks in time.

@ordian
Copy link
Member

ordian commented Nov 17, 2021

cc @s3krit

@bkchr
Copy link
Member

bkchr commented Nov 17, 2021

And cc @chevdor

Please realize this already for the 0.9.13 release.

I would also propose that we push a commit to each release branch that adds lto = true to the Cargo.toml file. As @crystalin said, this is relative expensive and we don't want this on master, but I don't see any reason why not add this to the release branches. The positive effect of this would be that everybody, who builds the binary on its own, would build with lto by default.

@ordian
Copy link
Member

ordian commented Nov 17, 2021

Do you know how much does RUSTFLAGS="-C codegen-units=1" bring on top of LTO?

We can also add codegen-units = 1 in https://doc.rust-lang.org/cargo/reference/profiles.html#release along with lto = true.

@burdges
Copy link
Contributor

burdges commented Nov 17, 2021

Just fyi, RUSTFLAGS="-C codegen-units=1" changes trait object pointer equality, only in a beneficial way of course. Yet, it's conceivable someone writes code that depends upon codegen-units=1 and breaks when compiled otherwise. I don't think this hurts anything, and running production on codegen-units=1 and tests without is probably ideal, but just fyi..

@crystalin
Copy link
Author

> Do you know how much does RUSTFLAGS="-C codegen-units=1"` bring on top of LTO?

We can also add codegen-units = 1 in https://doc.rust-lang.org/cargo/reference/profiles.html#release along with lto = true.

codegen-units=1 is not bringing any improvement, it can be skipped

@crystalin
Copy link
Author

FYI, other optimizations that we tried:

  • PGO (didn't give good result, but it is hard to setup properly)
  • target-cpu=native (Good result for recent CPU, but can't be applied to generic release. In Moonbeam we provide 3 binaries in release, moonbeam, moonbeam-skylake, moonbeam-znver3 to satisfy 80% of our collators)

@s3krit
Copy link
Contributor

s3krit commented Dec 2, 2021

The compilation flags provided in the OP (CARGO_PROFILE_RELEASE_LTO=true RUSTFLAGS="-C codegen-units=1" cargo build --release) have been integrated into our internal build system for building the release binaries, packages, docker images, etc. Cheers!

@bkchr
Copy link
Member

bkchr commented Dec 2, 2021

Latest rust release is bringing exactly what we want here! Custom profiles: https://blog.rust-lang.org/2021/12/02/Rust-1.57.0.html#cargo-support-for-custom-profiles

@ggwpez
Copy link
Member

ggwpez commented Jan 6, 2022

The compilation flags provided in the OP (CARGO_PROFILE_RELEASE_LTO=true RUSTFLAGS="-C codegen-units=1" cargo build --release) have been integrated into our internal build system for building the release binaries, packages, docker images, etc. Cheers!

So the binaries are using the flags, but the benchmarks are run without, right? @s3krit

@ghzlatarev
Copy link

CARGO_PROFILE_RELEASE_LTO=true RUSTFLAGS="-C codegen-units=1" cargo build --releas

Why would you not add it to your benchmarking workflows as well btw ?

@ggwpez
Copy link
Member

ggwpez commented Jan 11, 2022

Why would you not add it to your benchmarking workflows as well btw ?

This is coming up soon paritytech/substrate#10608
@ghzlatarev

@ggwpez
Copy link
Member

ggwpez commented Jan 21, 2022

I do not see anything regarding codegen-units=1 in the Cargo.toml nor github CI files.
In Substrate this leads for performance increase, so we could add it additionally to the lto=true in the Cargo.toml.

@chevdor
Copy link
Contributor

chevdor commented Jan 25, 2022

@ggwpez I do confirm that we do use it for the builds but this is rather ad'hoc right now and we should simply build using the profile in the Cargo.toml. I will make a PR to fix that.

In the meantime, I do confirm that we build using:

  • CARGO_PROFILE_RELEASE_LTO=true
  • RUSTFLAGS="-C codegen-units=1"

chevdor added a commit that referenced this issue Jan 25, 2022
paritytech-processbot bot pushed a commit that referenced this issue Jan 25, 2022
* Add codeden-units=1

ref #4311

* opt-level to 3

* Fix opt-level

* Refactor apt-level into the release profile
chevdor added a commit that referenced this issue Jan 28, 2022
* Add codeden-units=1

ref #4311

* opt-level to 3

* Fix opt-level

* Refactor apt-level into the release profile
chevdor added a commit that referenced this issue Jan 28, 2022
* Add codeden-units=1

ref #4311

* opt-level to 3

* Fix opt-level

* Refactor apt-level into the release profile

fix #4780
chevdor added a commit that referenced this issue Jan 28, 2022
* Update deps

* Fix release profile (#4778)

* Add codeden-units=1

ref #4311

* opt-level to 3

* Fix opt-level

* Refactor apt-level into the release profile

fix #4780

* wrong if-case - backport of #4798 into v0.9.16 (#4800)

* fixup

* fmt

* fix tests

* Rk fix sorting 0.9.16 (#4806)

* Fix incomplete sorting.

* fmt.

* Better test.

* Update runtime/parachains/src/disputes.rs

Co-authored-by: Bastian Köcher <[email protected]>

* cargo fmt

Co-authored-by: Bastian Köcher <[email protected]>

* Refactor check validation outputs - Backport of #4727 into v0.9.16 (#4801)

* Move PersistedValidationData check into

* Address feedback

* Remove incorrect comment

* Update runtime/parachains/src/inclusion/mod.rs

* fmt

* Add logging

Co-authored-by: Robert Habermeier <[email protected]>
Co-authored-by: Andronik <[email protected]>

Co-authored-by: Bernhard Schuster <[email protected]>
Co-authored-by: Robert Klotzner <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
Co-authored-by: Lldenaurois <[email protected]>
Co-authored-by: Robert Habermeier <[email protected]>
Co-authored-by: Andronik <[email protected]>
Wizdave97 pushed a commit to ComposableFi/polkadot that referenced this issue Feb 3, 2022
* Add codeden-units=1

ref paritytech#4311

* opt-level to 3

* Fix opt-level

* Refactor apt-level into the release profile
@s3krit s3krit removed their assignment Mar 14, 2022
@mrcnski
Copy link
Contributor

mrcnski commented Apr 23, 2023

I'm curious what the status is here? If in CI we already build production binaries with LTO, can this be closed?

(Not sure we should always have lto = true for release builds as the compilation would take too long for average Joe's. But there should be something in e.g. the Validator Guide for validators to build with this setting (i.e. the production profile) if there isn't already.)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants