-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow users to override or filter list of exported symbols passed to the WASM linker #104130
Comments
A similar issue seems to be #73958, though that applies to staticlibs too (the flag proposed here may help that use case as well) |
Even if I manually invoke |
Some data for why customizing the exported symbols at link time is superior to post-processing methods such as wasm-snip: In the ICU4X library, we export approximately 500 symbols from our WASM file. Most clients need only a small subset of the 500 symbols. I did a test where I removed about 300 of the symbols (those in the formatting and collation components) with two methods:
In both cases, the source Results:
CC @fitzgen Steps to reproduce:
1. Download the icu4x repository at commit 7d4a5983e0303ff9e8ea87efae0f33ea8da4eeba
2. Create a file `build.sh` with the following contents:
#! /usr/bin/bash
set -e
# Set toolchain variable to a default if not defined
ICU4X_NIGHTLY_TOOLCHAIN="${ICU4X_NIGHTLY_TOOLCHAIN:-nightly-2022-04-05}"
# Install Rust toolchains
rustup toolchain install ${ICU4X_NIGHTLY_TOOLCHAIN}
rustup +${ICU4X_NIGHTLY_TOOLCHAIN} component add rust-src
# 100 KiB, working around a bug in older rustc
# https://github.com/unicode-org/icu4x/issues/2753
# keep in sync with .cargo/config.toml
WASM_STACK_SIZE=100000
BASEDIR=$(dirname "$(realpath "$0")")
# Build the WASM library
RUSTFLAGS="-Cpanic=abort -Copt-level=s -C link-arg=-zstack-size=${WASM_STACK_SIZE} -Clinker-plugin-lto -Ccodegen-units=1" cargo +${ICU4X_NIGHTLY_TOOLCHAIN} build \
-Z build-std=std,panic_abort -Z build-std-features=panic_immediate_abort \
--target wasm32-unknown-unknown \
--release \
--package icu_capi_cdylib \
--features wasm_default
// This file is part of ICU4X. For terms of use, please see the file
// called LICENSE at the top level of the ICU4X source tree
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).
#[diplomat::bridge]
pub mod ffi {
use alloc::boxed::Box;
#[diplomat::enum_convert(core::cmp::Ordering)]
#[diplomat::rust_link(core::cmp::Ordering, Enum)]
pub enum ICU4XOrdering {
Less = -1,
Equal = 0,
Greater = 1,
}
}
pub mod bidi;
// pub mod calendar;
// pub mod collator;
pub mod common;
pub mod data_struct;
// pub mod date;
// pub mod datetime;
// pub mod datetime_formatter;
// pub mod decimal;
pub mod errors;
pub mod fallbacker;
// pub mod fixed_decimal;
// pub mod list;
pub mod locale;
// pub mod locid_transform;
pub mod logging;
// pub mod normalizer;
// pub mod normalizer_properties;
// pub mod pluralrules;
pub mod properties_maps;
pub mod properties_sets;
pub mod provider;
pub mod script;
// pub mod segmenter_grapheme;
// pub mod segmenter_line;
// pub mod segmenter_sentence;
// pub mod segmenter_word;
// pub mod time;
// pub mod timezone;
// pub mod timezone_formatter;
// pub mod week;
// pub mod zoned_formatter;
|
What happens if you run wasm-opt after wasm-snip? Does that win back the 40kb? |
Running
So I get back some of the difference, but there's still a 30 kB diff. |
As discussed in #73958, the root cause seems to be There are a few directions here:
Any of the above could also be generalized to non-wasm linkers if applicable, but I am mostly only aware of what goes on in wasm-ld. |
From talking in Zulip it seems like a productive route forward would be to only implement the allowlist/denylist, ideally in a way that works for all platforms probably need to start here. rust/compiler/rustc_codegen_ssa/src/back/linker.rs Lines 1499 to 1514 in c1a859b
and rust/compiler/rustc_codegen_ssa/src/back/write.rs Line 1013 in c1a859b
|
@naricc is hitting this issue as well; see unicode-org/icu4x#2868 (reply in thread) |
Done by #105405 which will be in rust 1.67. |
Indeed |
I'm currently using a custom linker script as a workaround for this; would really like if this could be entirely controlled from the rustc CLI. We can either do a prettier includelist/excludelist solution (option 4) or a simpler one that removes all exports and relies on link-args to add them back (option 3) |
There's a pretty nasty bug where Cargo can't generate the shared library itself, and we need to do in CMake. This isn't a huge deal, as we can simply link to the static library created by Cargo. This is an upstream issue where Rust marks unknown symbols as local, and they never appear in the dynamic symbol table. There is an open issue to allow more granular filtering (rust-lang/rust#104130) as well as an upstream CXX issue (dtolnay/cxx#1331)
Currently, the way export visibility is handled in WASM is that rustc manually passes a list of all exported functions to the WASM linker:
rust/compiler/rustc_codegen_ssa/src/back/linker.rs
Lines 1312 to 1314 in d69c33a
In almost every mode of compilation for Rust it doesn't matter too much if the exports list covers more than is needed, because there's a linker at the very end dealing with this. Rlibs and staticlibs may end up containing extra symbols, but the end binary will have dead code removed by the final linker. Dynamic libraries end up containing extra symbols which is a disk size issue (not a huge deal), but the dynamic linker's capable of only paging in the necessary bits.
The landscape is very different for WASM, however. In WASM, the final binary is a .wasm file that gets sent over the network, and its size matters. It's impractical to compile large libraries to a single .wasm file and recommend everyone use it since most users will not be using 99% of that wasm file.
There are tools for doing dead code elimination in JS, usually going by the name "tree shaking". Furthermore it's often practical for a WASM user to just list out the symbols they need.
With all of this, it would be useful to be able to customize which symbols get exported as a part of the linker process, rather than updating the code (as different library clients will need different things and it might not always be practical to pepper the library with a hundred client specific cfgs, nor is it an appropriate separation of concerns)
It would be nice if rust had a pair of codegen flags that would:
rust-lld --export
rust-lld --export
, to "filter out" featurescc @sffc
The text was updated successfully, but these errors were encountered: