RFC: #[export] (dynamically linked crates) #3435

Open
wants to merge 7 commits into
base: master
398 changes: 398 additions & 0 deletions text/0000-export.md
@@ -0,0 +1,398 @@
- Feature Name: `export`
- Start Date: 2023-04-19
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Dynamically Linked Crates

This is a proposal for a new `#[export]` attribute to greatly simplify
the creation and use of dynamic libraries.

This proposal complements the ["crabi" ABI](https://github.com/rust-lang/rust/pull/105586) proposal.

## Problem statement

Imagine a simple library crate with just one simple function, and an application that uses it:

```rust
//! library crate

pub fn hello() {
    println!("Hello!");
}
```

```rust
//! application crate

fn main() {
    library::hello();
}
```

By default, Cargo will automatically build both crates and **statically** link them into a single binary.

However, there are many reasons why one might want to **dynamically** link the library instead.
The use cases for dynamic linking can be roughly split into two:

1. Cases where both the dynamic library and application are compiled with the exact same compiler (on the same platform, with the same settings) and shipped together.
2. Cases where the dynamic library and the application can be compiled and shipped separately from each other.

The first situation is currently relatively well supported by Rust.
The Rust compiler itself falls into this category, where we ship a single `librustc_driver.so` (or .dll or equivalent)
file that is used by `rustc`, `rustfmt`, `rustdoc`, and `clippy`.
The motivation is simply to reduce the binary size of the overall package containing all these tools.

The second situation has far more use cases and is currently not well supported by Rust.
A common use case is a library that is shipped as part of the system (e.g. `libz.so` or `kernel32.dll`),
in which case you want to use the version provided by the system the program is run on,
and not from the system it was compiled on.
In these cases, dynamically linking is important to make sure the library can be independently updated.
(And it also helps to not blow up binary sizes.)

We need a good solution for this second category of use cases.

### Solution today

Currently, a way to implement this would make use of a combination of `extern "C"`, `#[no_mangle]` and `unsafe`,
each of which has major downsides.

It'd look something like this:

```rust
//! library crate

pub fn hello() {
    println!("Hello!");
}

#[no_mangle]
pub extern "C" fn some_unique_name_for_hello() {
    hello();
}
```

```rust
//! library bindings crate

#[link(name = "library")]
extern "C" {
    fn some_unique_name_for_hello();
}

#[inline]
pub fn hello() {
    unsafe { some_unique_name_for_hello() };
}
```

```rust
//! application crate

fn main() {
    library_bindings::hello();
}
```

This is bad. It's very verbose and error prone. More specifically:

- `#[no_mangle]` is needed to export a symbol under a stable name, but it requires manually picking a good unique name that won't collide with other items from other crates.
- A stable ABI is necessary to allow linking code from a different compiler (version),
but `extern "C"` puts severe limitations on the function signatures,
as most Rust types can't directly pass through the C ABI.
- `unsafe` code is required, because the compiler cannot validate the imported symbol matches the expected function signature.
Importing the wrong library (with the same symbol name) could result in unsoundness.
- There are now two library crates: one that will be compiled into the dynamic library (the .dll/.so/.dylib file),
and one that provides the bindings to that dynamic library.
The second library is likely to be fully inlined into the final application,
as it only has wrappers, just to bring back the original (safe) function signatures.

Much of this solution could be automated by a procedural macro,
but splitting a library crate in two falls outside of what a procedural macro can reasonably do.

### Proposed solution sketch

Instead of all the manual usage of `#[no_mangle]`, `extern`, and `unsafe`,
a much better solution would look as close as possible to the original code.

With the proposal below, one only needs to add an `#[export]` attribute, and give the function a stable ABI
(e.g. `extern "C"` or (in the future) `extern "crabi"`):

```rust
//! library crate

#[export]
pub extern "C" fn hello() {
    println!("Hello!");
}
```

```rust
//! application crate

fn main() {
    library::hello();
Contributor

@petrochenkov petrochenkov May 23, 2023

I guess my main question after seeing this example is whether this proposal requires a stable Rust metadata format, or it just works on symbol names.

How is library::hello resolved here?

Does library contain some metadata saying that the crate root module contains a name hello and it's a function?
That means keeping and reading Rust metadata.
Keeping metadata format compatible across rustc versions (by making it stable, or by keeping logic for reading all previous metadata versions) would put a lot of maintenance burden on rustc development, maybe a prohibitive amount of maintenance burden.

Or we just hash the whole library::hello path to obtain some string like hello_HASH and dlsym that string from the dynamic library?
In that case hashing the path only seems not to be enough, because it means the argument types and return type are not included into the hash.
Do we need to hash the whole call expression library::hello() instead? Even then we don't know the return type to hash.
Also, if the hash requires knowing types then resolution of such paths needs to be delayed until type checking, is that right?

I don't like any of these alternatives, TBH.

Member Author

Have you fully read the RFC? The library::hello path in that example is resolved no different than it is today. There is still a crate named library with a hello item. I'm not proposing to somehow generate symbol names out of thin air based on just the call expression. The symbol name comes from the definition of the item.

whether this proposal requires a stable Rust metadata format, or it just works on symbol names

A metadata format is perhaps a future possibility, but definitely not part of this RFC. This RFC proposes symbols with a hash based on all information relevant for type safety. That hash does need to be stable, but can start out with strict limitations for only certain kinds of signatures and then slowly extended over time.

Contributor

The library::hello path in that example is resolved no different than it is today.

Then I don't understand how the dynamic library and the application can be built by different versions of rustc ("2. Cases where dynamic library and application can be compiled and shipped separately from each other.").

To resolve paths like we do it today we need to read a lot of various Rust metadata from the dynamic library, but the metadata details typically change between every rustc release and rustc 1.N.0 won't be able to correctly parse metadata generated by rustc 1.M.0.

Contributor

I read the whole RFC, and I have an impression that either I'm missing something big, or it focuses in great detail on secondary issues while skipping on the primary problem with the metadata format incompatibility.

Contributor

@digama0 digama0 May 24, 2023

Let's say that we want to dynamically link library compiled by rust 1.M.0 with main compiled by 1.N.0 (where N > M for concreteness). My understanding is that you would build both main and library with cargo + rustc 1.N.0 while signaling that the library dependency is dynamic = true. This is basically exactly like a statically linked build process except that the code of library is not emitted and instead symbols using the stable ABI mangling scheme are called instead, i.e. main would call _Rlibrary5hello_ABCD or what have you.

The previous compile of the library crate by 1.M.0 also produced a function of the name _Rlibrary5hello_ABCD because it had the same stable ABI hash, so when these two are linked together (as one would do for an FFI binding) everything works out. The metadata itself is not shared.

There are some future work discussions about being able to skip parts of library while building main on 1.N.0 (since most of that work would be discarded anyway) but it doesn't seem to be part of the RFC as written.

Contributor

@petrochenkov petrochenkov May 24, 2023

Ah, I see, the "Building Dynamic Dependencies" paragraph talks about this, although in a way that is obscured by potential future optimizations.

So, we

  • still need the source code for the dynamic library and all its dependencies;
  • instead of doing cargo build on that dynamic library we are now basically doing cargo check on it (with rustc 1.N.0) to get its rmeta.

And the benefits we get (compared to just building the dynamic library with 1.N.0) are

  • saving some time on not doing codegen;
  • having the opportunity to make changes to the pre-built dynamic library as long as they don't diverge too far from the previously published source code used for generating rmeta.

All this sounds more or less reasonable.

Member Author

still need the source code for the dynamic library and all its dependencies

Since we only need the metadata and not the function bodies, we could skip some things. Once cargo has a separation between public and private dependencies, we can skip private dependencies entirely.

Member

You need private dependencies if they are proc macros. Proc macros may expand to #[export] attributes or items with #[export] attributes.

}
```

The library can then be either linked statically or dynamically, by informing cargo of the choice:
Member

I don't think static/dynamic is the correct distinction to make here. I can see a use case for statically linked dependencies that require a stable ABI: proprietary libraries. This is quite common on Windows with C/C++ libraries.

Member Author

That's a good point. Perhaps that means we should use slightly different names, but I don't think that changes anything else substantially about the RFC.


```diff
[dependencies]
- library = { path = "..." }
+ library = { path = "...", dynamic = true }
```

## Proposal

Creating and using dynamic libraries involves three things:

1. A stable ABI that can be used for the items that are exported/imported.
2. A way to export and import items.
3. A way to create and use dynamic libraries.

For (1) we currently only have `extern "C"`, which only suffices for very simple cases.
This proposal does not include any improvements for (1),
but the ["crabi" proposal](https://github.com/rust-lang/rust/pull/105586) proposes the creation
of a new `extern "…"` ABI that is more flexible, which perfectly complements this proposal.

This proposal provides solutions for (2) and (3).
Exporting (and importing) items is done through a new language feature: the `#[export]` attribute.
Creating and using dynamic libraries is made easy through a new Cargo feature: `dynamic` dependencies.

### The `#[export]` Attribute

The `#[export]` attribute is used to mark items which are "stable" (in ABI/layout/signature)
such that they can be used across the border between (separately compiled) dynamically linked libraries/binaries.

The `#[export]` attribute can be applied to any public item that is *exportable*.
The set of *exportable* items can grow over time with future proposals.
Initially, only the following items are *exportable*:

- Non-generic functions with a stable ABI (e.g. `extern "C"`)
  for which every user defined type used in the signature is also marked as `#[export]`.
  - This includes type associated functions ("methods").
- Structs/enums/unions with a stable representation (e.g. `repr(i32)` or `repr(C)`).
- Re-exports of those items (`use` statements, `type` aliases).

An `#[export]` attribute can also be applied to a crate, module, and non-generic type `impl` block,
which is simply equivalent to applying the attribute to every public item within it.

For types, the `#[export]` attribute represents the commitment to keep the representation of the type stable.
(To differentiate from, for example, a `#[repr(i32)]` that only exists as an optimization rather than as a stable promise.)
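
As a rough illustration of what a stable representation pins down, the sketch below uses two hypothetical types (`Point` and `Color`, invented for this example) with `repr(C)` and `repr(i32)`. `#[export]` does not exist in today's Rust, so it is left out, and the `layout_of` helper is likewise only an assumption made for the example. The sizes mentioned in the comments assume a common target where `f32` and `i32` are 4 bytes with 4-byte alignment.

```rust
use std::mem::{align_of, size_of};

// Hypothetical types for illustration only. With the proposed attribute,
// each would additionally carry `#[export]` as a stability commitment.
#[repr(C)]
pub struct Point {
    pub x: f32,
    pub y: f32,
}

#[repr(i32)]
pub enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

/// Returns `(size, alignment)` in bytes of `T` as laid out by this compiler.
/// For `Point` this is `(8, 4)` and for `Color` it is `(4, 4)` on the
/// assumed targets.
pub fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}
```

Without the `repr` attributes, the compiler would be free to pick a different layout or discriminant size in a later release, which is exactly what an exported type must rule out.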

For functions, the `#[export]` attribute will make the function available from the dynamic library
under a stable "mangled" symbol that uniquely represents its crate and module path *and full signature*.
(More on that below.)

For aliases of functions, an `#[export]` attribute on the `use` statement will use the
path (and name) of the alias, not of the original function.
(So it won't 'leak' the name of any (possibly private/unstable) module it was re-exported from.)

Member

Consider throwing in a quick mention that removing #[export] is considered a major breaking change since it affects the ABI. I think this is pretty clear, but will let me have an official document I can cite in cargo-semver-checks when we implement a semver-check for this.

Suggested change
Removing `#[export]` from an item that was previously exported is a major breaking change,
since it removes that item from the stable ABI.

Member Author

How ABI stability relates to API version is an interesting question. I'll have to think about that a bit.

Member

If you'd like a sounding board to bounce ideas off of, give me a ping!

I started the cargo-semver-checks project so I've been trying to answer the same question coming from the opposite direction — e.g. we already treat removing #[repr(C)] as a major breaking change.

### Privacy

It is an error to export an item that is not public, or is part of a non-public module.
The set of exported items of a crate will always be a subset of the crate's public interface.

It's fine to `#[export]` a public alias of a public item from a private module:

```rust
mod a {
    pub extern "C" fn f() { … }
}

#[export]
pub mod b {
    pub use super::a::f;
}
```

(This will export the function f as `b::f`.)

### Importing Exported Items

Normally, when using a crate as a dependency, any `#[export]` attributes of that crate have no effect
and the dependency is statically linked into the resulting binary.

When explicitly specifying `dynamic = true` for the dependency in `Cargo.toml`,
or when using an `extern dyn crate …;` statement in the source code,
only the items marked as `#[export]` will be available and the dependency will be linked dynamically.

### Building Dynamic Dependencies
Member

How does Cargo handle the case where a crate is used as a dynamic dependency in one place in the crate graph, but as a normal dependency in another place? I presume in that case it is still compiled as a normal dependency, but the crate declaring it as a dynamic dependency is still restricted to only using exported items.

Member Author

I suppose I'd just defer that to the 'unresolved questions' section of the tracking issue. I think I agree with your expectation, but I don't feel strongly about it.

Member

@programmerjake programmerjake Jun 5, 2023

one issue we'll have with compiling a crate once for making a .so file and again for making a program that links against the .so is that compiling the same code multiple times isn't guaranteed to produce the same output, an extreme example is a proc macro designed to produce random numbers, which then end up as constants in the generated code and/or symbol names.

e.g.: https://crates.io/crates/const-random which is a dependency of the very widely used hashbrown crate

Member Author

The point of this RFC is that you do not need the exact same crate for creating the .so and for linking to it. (So you can swap compatible versions, compile it with a different compiler or with different settings, etc.) You only need something that is ABI compatible, and having a mismatch does not result in memory unsafety thanks to all relevant safety information being hashed into the symbols.

Member

having a mismatch does not result in memory unsafety thanks to all relevant safety information being hashed into the symbols.

I think there's one point where this does not work: when safety depends on functional properties of upstream code. Imagine crate A exporting a fn make_int() -> i32. Crate B re-exports that function. Crate C does something like

if A::make_int() != B::make_int() { unsafe { unreachable_unchecked(); } }

This is perfectly sound as long as B promises that its function always behaves like that of A. Even if an updated A changes the integer returned, things will keep working.

But if we are in a dynamic linking situation, can't it happen that we get one version of the function via B (since maybe it got inlined or whatever) but a different version via A (since maybe we are dynamically linking that one)? Basically exactly the reason we need hashes for types with private fields?

OTOH it seems like there are other issues in that situation (e.g. a static mut in A that would suddenly behave differently because now there are 2 copies of it), so I might also be misunderstanding what exactly is going on during dynamic linking here.

Contributor

@digama0 digama0 Jun 6, 2023

@RalfJung Did you mean to call make_int() there, or compare the function pointers? Obviously unless make_int() is a pure function we would not necessarily be allowed to assume it always returns the same value, even if it was A::make_int() != A::make_int() in the comparison.

Member

Right I was assuming a promise of A that the function is pure.

(It's common for unsafe code to depend on functional properties of safe upstream code like that.)


When using `dynamic = true` for a dependency, there is no need to build that full crate:
only the signatures of its exported items are necessary.
Cargo will pass a flag to the Rust compiler which will stop it from generating
code for non-exported items and function bodies.
Comment on lines +223 to +226
Member

I would imagine that the default behavior of Cargo in this case would be to build the full crate, but emit it as a separate .so file. Building just the current crate without the dynamic dependency would be done only if the metadata is available without the source code, but this is future work and not part of the initial plan.

Member Author

It depends a bit on the use case what makes most sense as the default. E.g. I wouldn't expect cargo to start building libjsonparser.so if we expect that library file to already exist on the target system. (Just like how we don't attempt to build libc.so or kernel32.dll or similar.)

Contributor

This is something that I could see being different for different profiles too. E.g. you may prefer to build from source for cargo test so you don't need the correct version installed, but any release builds would want to link the system. It would be nice if there were some way to configure this.

I assume the rustc flag will be something like --extern=dynamic=... that treats the extern crate like a header file?


A clear separation between "public dependencies" (which are used in the interface)
and "private dependencies" (which are only used in the implementation) is required
to avoid building unnecessary indirect dependencies.
A system for that has been proposed in [RFC 1977](https://rust-lang.github.io/rfcs/1977-public-private-dependencies.html).

### Name Mangling and Safety
Member

What will the format of the crate metadata used to look up symbol names and function signatures be? Will it be some stable binary format or a Rust-like syntax that users can directly write? The former is faster, but the latter is easier to avoid breaking across rustc versions and makes it easier to prevent the user accidentally breaking the ABI by requiring the interface file to be modified for this to happen. Also, will dylibs embed the crate metadata as a whole, just a hash to check compatibility, or nothing at all?

Member Author

The exact hashing algorithm is something to be designed later (hopefully in an incremental way so we can support e.g. i32 before we support more complicated types).

Ideally it should be specified in a way that it can also be generated by other tools that are not rustc.

Dynamic libraries will include these hashes in the symbol names. Including more information can be useful for debugging tools, but is not necessary. (See the "future possibilities" section.)


Because a dynamic dependency and the crate that uses it are compiled separately
and only combined at runtime,
it is impossible for the compiler to perform any (safety, borrow, signature, …) checks.
However, making a (perhaps accidental) change to a function signature or type
should not lead to undefined behavior at runtime.

There are two ways to solve this problem:

1. Make it the responsibility of the user.
2. Make it the responsibility of the loader/linker.

Option (1) simply means making everything `unsafe`, which isn't very helpful to the user.
Option (2) means the loader (the part that loads the dynamic library at runtime) needs to perform the checks.

Do you mean dynamic linker here, or is this to say that dynamic linking of rust libs would be skipped at initialization, and evaluated lazily by the loader? On that note, I don't see any mention of dynamic loading rust libs anywhere; maybe it's out of scope, but seems like it should be mentioned, if only as future work.

Member Author

Yes, I used "loader" as synonym for "dynamic linker" (e.g. /lib64/ld-linux-x86-64.so.2 or w/e your system uses). The part of the operating system that loads dynamic libraries and resolves symbols when executables are loaded.


Yeah they're typically the same file in practice, I just mean to ask whether libraries would be linked/loaded before or after .init.


Unless we ship our own loader as part of Rust binaries,
we can only make use of the one functionality available in the loaders of all operating systems:
looking up symbols by their name.

So, in order to be able to provide safety, the symbol name has to be unique for the full signature,
including all relevant type descriptions.
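
The consequence of relying on name lookup alone can be sketched with a toy model. Everything below is hypothetical: `ToyLibrary` stands in for the symbol table of a loaded dynamic library, and the symbol strings with their `_1111`/`_2222` suffixes are made-up placeholders for a signature hash. A real loader resolves names in much the same spirit, but through the platform's own machinery.

```rust
use std::collections::HashMap;

/// A toy stand-in for the symbol table of a loaded dynamic library.
pub struct ToyLibrary {
    symbols: HashMap<String, extern "C" fn() -> i32>,
}

impl ToyLibrary {
    pub fn new() -> Self {
        Self { symbols: HashMap::new() }
    }

    /// Registers a function under an exact symbol name.
    pub fn export(&mut self, name: &str, f: extern "C" fn() -> i32) {
        self.symbols.insert(name.to_string(), f);
    }

    /// Mimics the one operation every loader offers: lookup by exact name.
    pub fn lookup(&self, name: &str) -> Option<extern "C" fn() -> i32> {
        self.symbols.get(name).copied()
    }
}

extern "C" fn answer() -> i32 {
    42
}

/// Builds a library exporting `answer` under a name that ends in a
/// made-up signature hash (`1111`).
pub fn demo_library() -> ToyLibrary {
    let mut lib = ToyLibrary::new();
    lib.export("mycrate_answer_1111", answer);
    lib
}
```

An application that was built against a different signature would ask for a different name (say `mycrate_answer_2222`), and the lookup would fail cleanly instead of calling a function with the wrong types.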

To avoid extremely long symbol names that contain a full (encoded) version of the function signature
and all relevant type descriptions, we use a 128-bit hash based on all this information.

For example, an exported item in `foo::bar` in the crate `mycrate` would be exported with a symbol name such as:

```
_RNvNtC_7mycrate3foo3bar_f8771d213159376fafbff1d3b93bb212
```
Comment on lines +261 to +263
Member

This doesn't seem to be able to handle multiple semver incompatible versions of the same crate. We currently include the StableCrateId in the symbol name for this which includes all -Cmetadata arguments. Cargo uses a separate -Cmetadata value for every crate version.

Member Author

I'm not sure if we can properly support that. If e.g. libmycrate.so is separately compiled, we cannot (and should not) make the symbols dependent on the crate id it will have in a dependency tree of whatever binary will use this library later.

Perhaps we want to include just the major version in the symbol names, but not everyone uses versions (a cargo concept) or follows semver, nor does the version of the public API need to match the version of the stable ABI.

Maybe we need a crate-global attribute to optionally specify the ABI version (or even just a random hash) for the crate, to allow for symbols with identical paths and signatures of multiple identically named crates to exist at once, although I imagine most use cases would not use e.g. two different versions of librustls.so at the same time. (If that's a supported use case, they should probably be called librustls1.so and librustls2.so, using rustls1 and rustls2 as the name used in the symbols rather than just rustls.)

Member

The StableCrateId does not depend on any dependencies. It only depends on the crate name, if it is a lib or bin and all -Cmetadata arguments (in sorted order). The SVH (crate hash) does depend on dependencies. In addition the current cargo impl hashes dependency versions into the -Cmetadata argument it passes, but that can easily be changed to only pass the major part of the semver version.

Member Author

@m-ou-se m-ou-se May 23, 2023

That could work, but I'm not convinced the API version and the ABI version should be treated as one and the same. Renaming structs or non-exported pub functions/methods etc. etc. could result in an incompatible API while preserving a compatible ABI. And in the other direction, a change in invariants or private fields could result in an incompatible ABI while preserving a compatible API.

As I mentioned above, it seems a weird situation if a binary somehow ends up depending on two different librustls.so, considering they'd both have to have the same file name. (And if those are called librustls1.so and librustls2.so, then the rustls1 and rustls2 names are probably what should be used in the symbol names rather than just rustls. And that could just be a build option to cargo or a property in Cargo.toml.)

Contributor

These arguments actually bring into question whether there should be a supported ABI version in Cargo as well. Since Cargo already reconciles latest-compatible updates for APIs, why shouldn't it do the same with ABIs too?


Where the first part is the (mangled) path and name of the item,
and the second part is the hexadecimal representation of a 128-bit hash of all relevant signature and type information.
The hash algorithm is still to be determined.

(See also the "alternatives" section below.)
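
To make the shape of such a symbol concrete, here is a minimal sketch. It is not the scheme this RFC would standardize: the path prefix only imitates the example above for simple `crate::module::item` paths, the signature is fed in as a plain string, and FNV-1a (128-bit) is used purely as a placeholder because the hash algorithm is still to be determined. The names `fnv1a_128` and `export_symbol` are hypothetical.

```rust
/// 128-bit FNV-1a, used here only as a placeholder hash.
pub fn fnv1a_128(data: &[u8]) -> u128 {
    const OFFSET_BASIS: u128 = 0x6c62272e07bb014262b821756295c58d;
    const PRIME: u128 = 0x0000000001000000000000000000013b;
    let mut hash = OFFSET_BASIS;
    for &byte in data {
        hash ^= u128::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Builds `<path prefix>_<32 hex digits>` for an item in `krate`,
/// nested in `modules`, with the given textual `signature`.
/// The prefix is a simplified imitation of the example in the text.
pub fn export_symbol(krate: &str, modules: &[&str], item: &str, signature: &str) -> String {
    let mut symbol = String::from("_RNv");
    for _ in modules {
        symbol.push_str("Nt");
    }
    symbol.push_str("C_");

    // Each path component is written as `<length><name>`.
    let mut parts = vec![krate];
    parts.extend_from_slice(modules);
    parts.push(item);
    for part in parts {
        symbol.push_str(&format!("{}{}", part.len(), part));
    }

    format!("{}_{:032x}", symbol, fnv1a_128(signature.as_bytes()))
}
```

Calling `export_symbol("mycrate", &["foo"], "bar", …)` yields a name starting with `_RNvNtC_7mycrate3foo3bar_`, and any change to the signature string changes the trailing 32 hex digits, so a caller built against the old signature would simply fail to find the symbol.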

### Type Information

As mentioned above, the hash in a symbol name needs to cover _all_ relevant type information.
However, exactly which information is and isn't relevant for safety is a complicated question.

#### Types with Public Fields

For a simple user defined type where all fields are public, like the `Point` struct below,
the relevant parts are the size, alignment, and recursively all field information.

```rust
#[export]
#[repr(C)]
pub struct Point {
    pub x: f32,
    pub y: f32,
    pub name: &'static str,
}
```

The `#[export]` attribute is the user's commitment to keep the type stable, but without `unsafe`,
any mistakes should _not_ result in unsoundness.
Accidentally changing the struct to swap the `x` and `name` fields should result in a different hash,
such that the `f32` won't get interpreted as a `&str`, for example.

Note that, technically, the names of the type and the fields are not relevant, at least _not for memory safety_.
Swapping the `x` and `y` fields would result in surprises and bugs and shouldn't be done,
but it won't result in undefined behaviour, since any Rust code can swap the fields without using `unsafe`.

However, for public fields, the field names are already part of the stable API, so we include them in the hash as well.
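A minimal sketch of why field order and field names both influence the result, assuming a hypothetical `FieldInfo` descriptor and using the standard library's 64-bit `DefaultHasher` as a stand-in for the undecided 128-bit hash. Size and alignment are left out for brevity.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical description of one public field: its name and its type.
#[derive(Hash)]
struct FieldInfo {
    name: &'static str,
    ty: &'static str,
}

/// Hashes the fields in declaration order, so both reordering and renaming
/// change the result. `DefaultHasher` is only a stand-in here.
fn layout_hash(fields: &[FieldInfo]) -> u64 {
    let mut hasher = DefaultHasher::new();
    fields.hash(&mut hasher);
    hasher.finish()
}
```

Describing `Point` as `x: f32, y: f32, name: &str` and then again with `x` and `name` swapped yields two different values, which is the property the symbol hash relies on.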

It is an error to use a plain `#[export]` attribute on a type without a stable `#[repr(…)]`,
if it has any private fields,
or if any of the fields are not of an `#[export]`ed or builtin type.
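The three conditions above could be checked roughly as follows. This is only a sketch: `TypeDef`, `FieldDef`, `ExportError`, and `check_plain_export` are hypothetical names, and the question of exactly which field types count as acceptable is reduced to a single boolean because this text does not settle it.

```rust
/// Hypothetical summary of a type definition, as a compiler pass might see it.
struct TypeDef {
    has_stable_repr: bool,
    fields: Vec<FieldDef>,
}

struct FieldDef {
    is_public: bool,
    /// True if the field's type is itself `#[export]`ed or builtin.
    has_exportable_type: bool,
}

#[derive(Debug, PartialEq)]
enum ExportError {
    MissingStableRepr,
    PrivateField(usize),
    NonExportableFieldType(usize),
}

/// Applies the three rules for a plain `#[export]` on a type, in order.
fn check_plain_export(def: &TypeDef) -> Result<(), ExportError> {
    if !def.has_stable_repr {
        return Err(ExportError::MissingStableRepr);
    }
    for (index, field) in def.fields.iter().enumerate() {
        if !field.is_public {
            return Err(ExportError::PrivateField(index));
        }
        if !field.has_exportable_type {
            return Err(ExportError::NonExportableFieldType(index));
        }
    }
    Ok(())
}
```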

Member

Is "builtin type" right? For example, `*const Foo` is stable ABI but is opaque-ish if `Foo` isn't exported too. `&str` is also builtin, but it doesn't feel like something we should commit to a stable ABI for (at least not yet). Maybe it's worth listing the specific types here rather than naming a category?

Contributor

I think this work is largely being offloaded to the "crabi" ABI RFC. Probably the best thing to say here would be "FFI-safe type" in the sense of the improper_ctypes lint.

Member

I don't think that lint is appropriate here, it has the same questions/problems with it.

Contributor

I don't know why this RFC in particular would need to solve that problem though, the problem is pre-existing and this RFC isn't trying to address it.

Member

This RFC is prescribing an error, rather than a lint, and seems intended to start with a very small set and expand over time. So it seems to me that it is trying to do something different; it's also a safe interface AFAICT - so that also matters.

Member Author

Ah yes, I should clarify this section a bit. The rules I had in mind are a bit stricter than what I wrote down in the RFC. (Matching your expectations.)


#### Types with Private Fields

Member

I think that dealing with private fields is very complex and probably not something that many people can commit to. I would rather see a system where a type with private fields is treated as an opaque extern type (#1861). This effectively means that you can only interact with this type by reference and all operations on it must be done through exported functions.

Member Author

I agree with that for many use cases, especially once "crabi" supports dyn trait objects. But for types like NonZero or File, we should be able to allow them to be used by value, since the invariant / private details will likely not change. (We can still pick a new hash if File does change, but "it's just a file descriptor / handle" will be true for the foreseeable future.)

I do agree this is a feature that should be used less often than opaque dyn-like objects or fully public types, which is why I think it's fine that exporting a type with private fields requires an unsafe attribute, steering people away to safer alternatives for most situations.

Member

> This effectively means that you can only interact with this type by reference and all operations on it must be done through exported functions.

I don't see how that helps about the private invariant issue -- if the invariants change, it is quite important that this is considered a separate type, even if everything is by-reference.


For types where not all fields are public, the situation is much more complicated.

Private fields usually come with certain *invariants*, and come with `unsafe` code that makes assumptions about them.
For example, the private fields of a `Vec` are assumed to represent a valid "owned" pointer to an allocation together with its capacity and initialized size.

If it were possible to define an identically named type with the same fields but different (or no) invariants/assumptions,
or just change the invariants in an existing library,
it'd be possible to cause undefined behavior by loading the "wrong" dynamic library.

Therefore, we can't allow a regular `#[export]` attribute on a type with private fields,
since we have no way of automatically determining the invariants / unsafe assumptions about private fields.

Instead, for these types, we must require the user to *unsafely* commit to
ABI stability if they want to make the type available to exported functions.

Using `#[export(unsafe_stable_abi = «hash»)]`, one can make the (unsafe) promise
that the type will remain ABI compatible as long as the provided hash remains the same.
The hash must have been randomly generated to ensure uniqueness (which is part of the unsafe promise).
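This text does not say how such a token should be produced. One way to get 32 random-looking hexadecimal digits using only the standard library is sketched below; the function name `random_abi_token` is hypothetical, and the approach leans on the random keys of `RandomState` rather than a proper cryptographic generator, which real tooling would more likely use.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Returns 32 hexadecimal digits built from two randomly keyed hashers.
/// This is a std-only stand-in; it is not a cryptographic generator.
fn random_abi_token() -> String {
    // Each `RandomState::new()` carries fresh keys, so hashing nothing
    // still produces a different 64-bit value per instance.
    let high = RandomState::new().build_hasher().finish();
    let low = RandomState::new().build_hasher().finish();
    format!("{:016x}{:016x}", high, low)
}
```

The resulting string has the same shape as the token used in the `ListNode` example in this section.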

I'd like to see a brief mention of tool support here. People aren't typically familiar with the facilities to generate random numbers locally.

You can generate an unsafe_stable_abi hash by setting it to an empty string and running cargo fix.

Member Author

I wouldn't be surprised if rust-analyzer would get some kind of "generate new random number" action for these attributes.

Producing a random number as a suggestion in the diagnostic when leaving the attribute empty (which could be used by cargo fix) seems helpful.

Contributor

@digama0 digama0 May 26, 2023

But why does it have to be a random number in the first place? As I mentioned before, this is terrible for code review - if the point is to avoid some kind of collision then it is susceptible to spoofing. A simple incrementing number or string should be sufficient for ABI versioning purposes.

Member Author

> A simple incrementing number or string should be sufficient for ABI versioning purposes.

That is not sufficient. It needs to be globally unique. Otherwise you'd need to (unsafely!) assume that the same (crate) name always refers to the exact same codebase with linear versioning. If a crate is forked and both forks introduce a different new safety invariant on private fields, it would go wrong if both just increment their number, resulting in the same number being used for two ABI-incompatible but identically named types.

In the Rust ecosystem there is no expectation that crate/package names are globally unique. (There exists a lot of Rust code outside of crates.io.) There is no expectation to put e.g. a domain name in the package/crate name to make them unique, like in some other languages.

Contributor

This is also why I suggested a string instead of a number, it's a lot easier to put necessary disambiguating information in a string compared to a number. But I don't see how this changes the equation really, since it's unsafe either way. The rule is not "it must be globally unique", it is "this safety invariant must exactly match the safety invariant on any other ABI compatible types with the same unsafe_stable_abi which are linked with this crate" which is a much more manageable auditing task.

Member Author

> which are linked with this crate

The problem is that "this crate" is meaningless. Outside crates.io, crate names aren't unique.

We could easily allow a string literal instead of a hash, such that you can write "the first field must be a valid file descriptor and the second field must be a prime number" (and hash that) rather than specifying a random token like b58d7aabb705df024fa578d9a0e20a7c, but I'm not sure if that's much better.

Using GUIDs to uniquely identify something seems pretty standard for many forms of identification related to dynamic linking/FFI in various languages.

Another option would be to specify one random hash/guid for the whole crate at the top level, and then use only simple incremental numbers for the types in the crate. But that has a higher chance of resulting in issues when working on multiple versions/branches, if two branches both bump a number up to the same new number with incompatible safety invariants.

Contributor

@digama0 digama0 Jun 6, 2023

> > which are linked with this crate
>
> The problem is that "this crate" is meaningless. Outside crates.io, crate names aren't unique.

"This crate" is meaningful, I mean this particular linker object that is linked with other things. This constraint is not something the crate author can check, and the (non)uniqueness of crate names doesn't really matter here. It is the author of the final binary who has the responsibility to not link multiple copies of a crate with the same unsafe crate ABI and different actual semantics, and we can provide some assistance in making it difficult to accidentally break this rule, but we can't check this condition, and using hashes only obscures the problem, it doesn't make it any easier to solve.

> We could easily allow a string literal instead of a hash, such that you can write "the first field must be a valid file descriptor and the second field must be a prime number" (and hash that) rather than specifying a random token like b58d7aabb705df024fa578d9a0e20a7c, but I'm not sure if that's much better.

Really? A string with the literal safety comment would be much easier to verify than a random string of digits (of unknown provenance). And since it's free-form, crates can set up their own policies on how to use the field, i.e. use sequential versioning, hierarchical versions, short identifiers, or full safety comments.

How do I know that your UID is actually unique? Maybe the original programmer knows because they used some "generate random number" utility themselves but what about everyone else?

> Using GUIDs to uniquely identify something seems pretty standard for many forms of identification related to dynamic linking/FFI in various languages.

Hand validation of GUIDs isn't something anyone does though, and anything which is unsafe basically has a "please validate me" sign on it. Hashing crate names in mangled names is fine since the generation and uniqueness of these IDs is handled by rustc itself, no one has to look at them directly. Putting them in the source code is a completely different ballgame.

> Another option would be to specify one random hash/guid for the whole crate at the top level, and then use only simple incremental numbers for the types in the crate.

Certainly you could mix a crate hash into the type's hash, although that would prevent use cases where two types are deliberately sharing the same unsafe crate ABI despite being defined in different crates (because they are actually supposed to be ABI compatible).


```rust
#[export(unsafe_stable_abi = "ca83050b302bf0644a1417ac3fa6982a")]
#[repr(C)]
pub struct ListNode {
    next: *const ListNode,
    value: i32,
}
```

In this case, using the type as part of a function signature will not result in a hash based on the full (recursive) type definition,
but will instead be based on the user provided hash.
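The difference between the two modes can be sketched as follows. The names `AbiIdentity` and `abi_hash` are hypothetical, the field description is simplified to `(name, type)` pairs, and the 64-bit `DefaultHasher` again stands in for the undecided hash algorithm.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical ABI identity of a type used in an exported signature.
enum AbiIdentity {
    /// Plain `#[export]`: hash the full field information.
    Structural(Vec<(&'static str, &'static str)>),
    /// `#[export(unsafe_stable_abi = "...")]`: hash only the user's token.
    UnsafeStable(&'static str),
}

/// Computes the contribution of one type to a symbol hash.
fn abi_hash(type_name: &str, identity: &AbiIdentity) -> u64 {
    let mut hasher = DefaultHasher::new();
    type_name.hash(&mut hasher);
    match identity {
        AbiIdentity::Structural(fields) => fields.hash(&mut hasher),
        // Private fields never enter the hash; only the token does.
        AbiIdentity::UnsafeStable(token) => token.hash(&mut hasher),
    }
    hasher.finish()
}
```

Under this model, two builds of `ListNode` with the same token hash identically no matter how the private fields are declared, which is exactly why keeping them compatible is an `unsafe` promise made by the author.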

### Standard Library

Once the ["crabi"](https://github.com/rust-lang/rust/pull/105586) feature has progressed far enough,
we should consider adding `#[export]` attributes to some standard library types, effectively committing to a stable ABI for those.
For example, `Box`, `Vec`, `Option`, `String`, `NonZero`, `File`,
and many others are good candidates for `#[export(unsafe_stable_abi)]`
(if the "crabi" ABI doesn't already handle them specially).

## Future Possibilities

- A `#[no_export]` attribute, which can be useful when marking an entire crate or module as `#![export]`.
- Options within the `#[export(…)]` attribute for e.g. a future name mangling scheme version.
- `#[export(by_path)]` to export a symbol only based on the path, without the hash of relevant safety/type information.
This is useful in situations where safety is not a primary concern, to simplify cases like using Rust code from another language (with a simple symbol name).
(Importing (or using) such a symbol from Rust will be `unsafe`.)
- `#[export(opaque)]` (or `#[export(indirect)]`) for opaque types that can only be used indirectly (e.g. through a pointer or reference) in exported items,
such that their size is not a stable ABI promise.
- Exportable `static`s.
- Exportable `trait`s, for e.g. dynamic trait objects. (See also https://github.com/rust-lang/rust/pull/105586.)
- A tool to create a stripped, 'dynamic import only' version of the crate source code,
with only the exported items, without the function bodies.
- Allow exporting two identically named items to create a dynamic library that is backwards compatible with an older interface,
including both a symbol for the old and new interface.
- Next to the hash of the type information,
additionally and optionally include the full type information in an extra section,
to allow for (debug) tools to accurately diagnose mismatched symbol errors.

## Alternatives

- Alternatives for using a hash of all relevant type information:

Member

I think this RFC might want to talk about why this is the right choice where we previously made a different one in the move from legacy to v0 (which moved away from a hash, at least mostly). I think given the use case here this may be better, but the discussion seems worthwhile.

Member

it seems like a good idea to me to combine both of them -- v0 mangling so debuggers and similar can get the full source symbol name including generic arguments, and the hash to detect ABI breaks (ABI breaks won't necessarily change the v0 mangling, such as reordering struct fields)

- Don't include type information in the symbols,
but make using a dynamic dependency `unsafe` by requiring e.g. `unsafe extern dyn crate …;`.
- Include the full (encoded) type information in the symbols, without hashing it.
This results in extremely long symbol names, and all the type information will be recoverable
(which might be useful or might be undesirable, depending on the use case).
This can result in significantly larger binary sizes.
- Don't include type information in the symbols, but include the information in another way (e.g. an extra section).
If we do this, we can't make use of the loader/linker for the safety checks,
so we'll have to include extra code in Rust binaries that will perform the checks separately
before using any dynamic dependency.
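For the last alternative, the extra check that each Rust binary would need to run might look roughly like this. It is a sketch under assumptions: `verify_imports` is a hypothetical name, and the `provided` map models whatever would be read from the library's extra section, whose format is not defined here.

```rust
use std::collections::HashMap;

/// Hypothetical check run before any dynamically linked item is used.
/// `expected` holds the hashes the application was compiled against;
/// `provided` models what was read from the library's extra section.
fn verify_imports(
    expected: &[(&str, u128)],
    provided: &HashMap<String, u128>,
) -> Result<(), String> {
    for (name, hash) in expected {
        match provided.get(*name) {
            Some(found) if found == hash => {}
            Some(_) => return Err(format!("type information mismatch for `{}`", name)),
            None => return Err(format!("missing item `{}`", name)),
        }
    }
    Ok(())
}
```

When the hash is part of the symbol name instead, the loader performs the equivalent comparison for free, since a mismatched item simply fails to resolve.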

## What this Proposal is not

Questions like

- How do panics propagate across dynamically linked crates or FFI boundaries?
- How can allocated types cross an export boundary and be dropped/deallocated on the other side?

are **not** solved by `#[export]`, but instead are the responsibility of the ABI.

The existing `extern "C"` ABI 'solves' these by simply not having any such features.

The [`extern "crabi"` ABI](https://github.com/rust-lang/rust/pull/105586)
will attempt to solve these (but perhaps not in the first version),
but that falls outside the scope of this RFC.

(A separate RFC for the first version of "crabi" might very well appear soon. ^^)

Separately, the question of how this will be (optionally) used for the standard library is another question entirely,
which is left for a later proposal. (Although the hope is that this RFC gives at least a rough idea of how that might work.)