Enable Link-Time Optimization (LTO) and codegen-units = 1 #317
Comments
Makes sense, can you create a PR for an additional
Sure. One question: do I need to change anything in a CI pipeline (if any exists for the project) to enable using the
There is no CI pipeline yet 😞. I'll add it after I get around to #287
Got this on a fresh fee20e5 build:
Hi!

I noticed that Link-Time Optimization (LTO) is not enabled for the project in the `Cargo.toml` file. I suggest switching it on, since it will reduce the binary size (always a good thing to have) and will likely improve the application's performance a little. If you want to read more about LTO and its possible modes, I recommend starting with the rustc documentation.

I think you can enable LTO only for Release builds so as not to sacrifice the developer experience while working on the project, since LTO adds extra time to compilation. If you think that a regular Release build should not be affected by such a change either, then I suggest adding an additional `dist` or `release-lto` profile in which LTO is enabled on top of the regular `release` optimizations. Such a profile simplifies life for maintainers and for anyone else who wants to build the most optimized version of the application. However, if we enable it at the Cargo profile level for the Release profile, users who install the application with `cargo install` will get the LTO-optimized version "automatically". E.g., check the `cargo-outdated` Release profile. You may also be interested in other optimization options like `codegen-units = 1`, etc.

Basically, enabling it takes only a few lines in the profile section of `Cargo.toml`.
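
A minimal sketch of what such lines could look like, assuming they go into the workspace root `Cargo.toml` (where Cargo reads profile settings). The first table changes the existing Release profile; the second is the alternative with a separate profile, and its name `release-lto` is only one of the names suggested here, not something the project already defines.

```toml
# Option 1: enable it directly in the Release profile, so that
# `cargo build --release` and `cargo install` pick it up.
[profile.release]
lto = "fat"        # Fat LTO; equivalent to `lto = true`
codegen-units = 1  # one codegen unit per crate: slower builds, more optimization

# Option 2 (alternative): leave `release` as it is and add a
# separate opt-in profile. The profile name is an assumption.
[profile.release-lto]
inherits = "release"
lto = "fat"
codegen-units = 1
```

With the second option, the optimized build would be requested explicitly, e.g. `cargo build --profile release-lto --workspace`, while regular Release builds keep their current compile times.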

I ran some quick tests (Fedora 41, Rust 1.82, the latest version of the project, built with the `CC=clang CXX=clang++ cargo build --release --workspace` command). Here are the results (first value: the original Release profile; second value: the original Release profile + Fat LTO + `codegen-units = 1`):

- `zluda_with`: from 405 KiB to 361 KiB
- `zluda_bindgen`: from 8.4 MiB to 5.8 MiB
- `libcuda_base.so`: from 4.6 MiB to 4.4 MiB
- `libnvcuda.so`: from 5 MiB to 4.8 MiB
- `libnvml.so`: almost no change
- `libptx_parser_macros.so`: from 4 MiB to 3.7 MiB
- `libzluda_dump.so`: from 3.2 MiB to 2.8 MiB
- `libzluda_redirect.so`: almost no change

I haven't run performance tests since I don't know how to do that for this project.

Thank you.