RFC: Add an alias attribute to #[link] and -l #1296
Closed
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,316 @@ | ||
- Feature Name: `link_alias` | ||
- Start Date: 2015-09-24 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
|
||
Add a new `alias` attribute to `#[link]` and the `-l` flag which indicates that | ||
the linkage will happen through another annotation to inform how a library is | ||
linked. This is then leveraged to inform the compiler about dllimport and | ||
dllexport with respect to native libraries on the MSVC platform. | ||
|
||
# Motivation | ||
|
||
Most of the time a linkage directive is only needed to inform the linker about | ||
what native libraries need to be linked into a program. On some platforms, | ||
however, the compiler needs more detailed knowledge about what's being linked | ||
from where in order to ensure that symbols are wired up correctly. | ||
|
||
For example, on MSVC, if a symbol imported from a native library is actually | ||
imported from a DLL, it is linked to differently than if it's imported from a | ||
native library that's linked statically. In linkage terms, importing a function | ||
from a DLL requires the compiler to tag the import with "dllimport", an | ||
attribute to LLVM, in order for the native library to be linked correctly. | ||
|
||
Currently the compiler is not able to correctly place dllimport annotations on | ||
imports from native libraries as it has no knowledge of whether a symbol is | ||
being imported from a DLL or not. Blocks of symbols (those wrapped in an | ||
`extern` directive) may be tagged with `#[link]` in which case the compiler | ||
could reasonably infer where the symbols come from, but not all `extern` blocks | ||
are always tagged as such: | ||
|
||
* Many native libraries are linked on the command line via `-l`, which is passed | ||
in through Cargo build scripts instead of being written in the source code | ||
itself. As a recap, a native library may change names across platforms or | ||
distributions or it may be linked dynamically in some situations and | ||
statically in others which is why build scripts are leveraged to make these | ||
dynamic decisions. | ||
* Many `extern` blocks are empty to have the `#[link]` directives located | ||
elsewhere either for convenience or for organizational purposes. | ||
|
||
Another example of where the compiler needs more knowledge about how to deal | ||
with symbols from native libraries (specifically on MSVC) is that any symbol | ||
exported from a DLL also needs to be tagged as such. This specifically comes up | ||
whenever the compiler produces a dylib which contains some statically linked | ||
native libraries. If any of the native libraries' symbols are reachable via the | ||
public API, then the symbols need to be tagged with dllexport. | ||
|
||
Similar to the dllimport situation, the compiler cannot currently reason about | ||
the connection between symbols and what libraries they came from, so the | ||
compiler isn't able to place dllexport annotations to ensure that symbols are | ||
exported correctly. | ||
|
||
Overall, the common motivation of these scenarios is that the compiler does not | ||
always have a connection between a set of symbols from a native library and | ||
which native library they came from. By enabling the compiler to have this | ||
knowledge in all situations, it can automatically handle dllexport/dllimport | ||
on MSVC and perhaps do more-clever things on Unix in the future. | ||
|
||
# Detailed design | ||
|
||
Two changes will be made to the compiler: the first adds an `alias` kind to | ||
native libraries, and the second changes how native libraries are treated with | ||
respect to dllimport and dllexport. | ||
|
||
### Adding `alias` | ||
|
||
The addition of aliases is handled in two separate locations, the `#[link]` | ||
attribute and the `-l` flag. At a high level, each of these will be able to | ||
introduce a named alias for the library also being linked, and the attribute | ||
form `#[link]` will be able to reference these aliases. First, let's look at the | ||
way to introduce an alias for a library being linked. | ||
|
||
#### Introducing an alias | ||
|
||
First, the existing `#[link]` attribute and `-l` flag will introduce an alias of | ||
the same name as the name of the library being linked. Both of the following | ||
forms will introduce an alias of the name "foo" pointing to the native library | ||
"foo": | ||
|
||
```rust | ||
#[link(name = "foo")] | ||
``` | ||
|
||
``` | ||
-l foo | ||
``` | ||
|
||
The purpose of `alias`, however, will be to introduce a name that's not exactly | ||
the same as the native library's name (because the native library could have | ||
different names across platforms/distributions). The `#[link]` attribute will be | ||
extended with an `alias` key as well as the `-l` flag being extended: | ||
|
||
```rust | ||
#[link(name = "bar", alias = "foo")] | ||
``` | ||
|
||
``` | ||
rustc -l alias=foo=bar | ||
``` | ||
|
||
(note that the `-l` argument looks like this in a Cargo build script) | ||
|
||
```rust | ||
fn main() { | ||
println!("cargo:rustc-link-lib=alias=foo=bar"); | ||
} | ||
``` | ||
|
||
These alias forms mean that the dynamic library "bar" is linked and that an | ||
alias "foo" is also introduced for it. Note that in all of these cases an | ||
optional `kind` can also be specified: | ||
|
||
```rust | ||
#[link(name = "bar", alias = "foo", kind = "static")] | ||
``` | ||
|
||
``` | ||
rustc -l alias=foo=static=bar | ||
``` | ||
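As a rough sketch of how a build script might combine these forms, the helper below picks a platform-specific library name and emits the proposed directive with or without a `kind`. The function name `link_directive`, the library names `bar` and `libbar`, and the rule for choosing between them are hypothetical; only the `alias=foo=bar` and `alias=foo=static=bar` shapes come from this RFC.

```rust
/// Build the `cargo:rustc-link-lib` line for a native library whose name
/// differs across targets, always exposing it under one stable alias.
/// (Hypothetical helper; the `alias=` syntax is the one proposed here.)
fn link_directive(alias: &str, target: &str, link_static: bool) -> String {
    // Assumption for illustration only: the library is called `libbar` on
    // MSVC targets and `bar` everywhere else.
    let name = if target.contains("msvc") { "libbar" } else { "bar" };
    if link_static {
        format!("cargo:rustc-link-lib=alias={}=static={}", alias, name)
    } else {
        format!("cargo:rustc-link-lib=alias={}={}", alias, name)
    }
}
```

A build script's `main` would print the returned line (for example with the target triple read from the `TARGET` environment variable that Cargo sets for build scripts), and the source code would then only ever mention `#[link(alias = "foo")]`.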
|
||
With these introduction forms the compiler now has a mapping from all native | ||
libraries being linked to a set of aliases that library is known under. The | ||
compiler will use this to construct a mapping from alias name to native library | ||
for use in the next section. | ||
|
||
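To make the flag grammar concrete, here is a minimal sketch of how the value of `-l` could be split into a library name, an alias, and a kind. It is not the compiler's implementation: the names `Kind`, `NativeLib`, and `parse_l_flag` are invented for illustration, and only the `dylib` and `static` kinds are modeled.

```rust
/// How a native library is linked (only the kinds used in this RFC's examples).
#[derive(Debug, PartialEq)]
enum Kind {
    Dylib,
    Static,
}

/// One parsed `-l` value: the library, the alias it is known under, its kind.
#[derive(Debug, PartialEq)]
struct NativeLib {
    name: String,
    alias: String,
    kind: Kind,
}

/// Parse `bar`, `static=bar`, `alias=foo=bar`, or `alias=foo=static=bar`.
fn parse_l_flag(value: &str) -> Result<NativeLib, String> {
    let parts: Vec<&str> = value.split('=').collect();

    // An explicit alias is introduced by a leading `alias=<name>=`.
    let (alias, rest) = match parts.as_slice() {
        ["alias", alias, rest @ ..] if !rest.is_empty() => (Some(*alias), rest),
        all => (None, all),
    };

    // What remains is the existing `[KIND=]NAME` form.
    let (kind, name) = match rest {
        [name] => (Kind::Dylib, *name),
        ["static", name] => (Kind::Static, *name),
        ["dylib", name] => (Kind::Dylib, *name),
        _ => return Err(format!("malformed -l value: {}", value)),
    };

    Ok(NativeLib {
        name: name.to_string(),
        // Without `alias=`, the library is implicitly aliased to its own name.
        alias: alias.unwrap_or(name).to_string(),
        kind,
    })
}
```

With this sketch, `parse_l_flag("alias=foo=static=bar")` yields the library `bar`, alias `foo`, kind `Static`, while a dangling `alias=foo` is rejected.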
#### Using an alias | ||
|
||
Now that we've established aliases for all native libraries being linked as part | ||
of a compilation, the compiler will also support a `#[link]` attribute of the | ||
form: | ||
|
||
```rust | ||
#[link(alias = "foo")] | ||
``` | ||
|
||
The compiler will resolve the alias name "foo" to a native library using the | ||
mapping built up from the introduction forms above. For example this attribute | ||
above would be connected with a directive or a flag that looked like: | ||
|
||
```rust | ||
#[link(name = "foo")] // implicit alias is called "foo" | ||
#[link(name = "bar", alias = "foo")] // explicitly aliased as `foo` | ||
``` | ||
|
||
``` | ||
rustc -l foo # implicit alias is called "foo" | ||
rustc -l alias=foo=bar # "bar" is explicitly aliased as "foo" | ||
``` | ||
|
||
A `#[link(alias = "...")]` annotation is required to resolve to some native | ||
library, and an error will be generated if the alias has not been introduced. | ||
The culmination of this is now the compiler can connect an alias directive to a | ||
native library to understand that the symbols in the `extern` block are | ||
contained in that native library. | ||
|
||
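The introduce-then-resolve flow can be sketched as a small lookup table. This is only an illustration under stated assumptions: `AliasTable`, `introduce`, and `resolve` are hypothetical names, and letting a later introduction overwrite an earlier one with the same alias is a choice made here, not something this RFC specifies.

```rust
use std::collections::HashMap;

/// Maps every alias introduced by `#[link(name = ...)]` or `-l` to the name
/// of the native library it stands for.
#[derive(Default)]
struct AliasTable {
    by_alias: HashMap<String, String>,
}

impl AliasTable {
    /// Record a linked library. Every library is implicitly aliased to its
    /// own name; an explicit `alias` adds one more name for it.
    fn introduce(&mut self, name: &str, alias: Option<&str>) {
        self.by_alias.insert(name.to_string(), name.to_string());
        if let Some(alias) = alias {
            self.by_alias.insert(alias.to_string(), name.to_string());
        }
    }

    /// Resolve `#[link(alias = "...")]`; an alias that was never introduced
    /// is an error.
    fn resolve(&self, alias: &str) -> Result<&str, String> {
        self.by_alias
            .get(alias)
            .map(|name| name.as_str())
            .ok_or_else(|| format!("alias `{}` has not been introduced", alias))
    }
}
```

After `introduce("d1", Some("a1"))`, both `resolve("a1")` and `resolve("d1")` return `d1`, while resolving an alias that was never introduced returns an error.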
#### Alias examples | ||
|
||
Some example usage of introducing aliases and then using them looks like: | ||
|
||
```rust | ||
// compiled with: rustc -l alias=a1=d1 | ||
|
||
#[link(name = "lib1")] | ||
extern {} | ||
|
||
// aliases the library defined above | ||
#[link(alias = "lib1")] | ||
extern {} | ||
|
||
// aliases the library `d1` on the command line | ||
#[link(alias = "a1")] | ||
extern {} | ||
|
||
// also aliases the library `d1` on the command line | ||
#[link(alias = "d1")] | ||
extern {} | ||
|
||
// introduce the alias `a2` for `lib2` | ||
#[link(name = "lib2", alias = "a2")] | ||
extern {} | ||
|
||
// aliases the library `lib2` above | ||
#[link(alias = "a2")] | ||
extern {} | ||
``` | ||
|
||
### Treatment of dll{import,export} | ||
|
||
As a recap, let's take a look at today's treatment of dll{import,export} in the | ||
compiler with respect to native libraries. Note that the terms "functions" and | ||
"statics" here refer to those defined in native libraries (e.g. connected via | ||
FFI). Currently dllimport is only applied when a static is referenced through an | ||
external crate. References to locally declared statics or functions in any | ||
circumstance never have dllimport applied. The compiler also currently has an | ||
unstable `#[linked_from]` attribute which it leverages to apply the dllexport | ||
annotation to statically linked native libraries, but otherwise dllexport is | ||
never applied. This treatment is incorrect in [a number of ways][issue], and is | ||
a strong motivating factor for this RFC! | ||
|
||
[issue]: https://github.com/rust-lang/rust/issues/27438 | ||
|
||
Armed with `alias` directives, the compiler is now able to handle | ||
dllimport/dllexport correctly in all cases for native libraries. The dllimport | ||
attribute will be applied to all symbols in an `extern` block if that block has | ||
any linkage directive indicating that the symbols are linked via a dynamic | ||
library (e.g. by following alias pointers to their concrete linkage directives). | ||
Similarly, dllexport will only be applied to a block of symbols if a directive | ||
indicates that they're linked statically. | ||
|
||
Example application of dllexport/dllimport looks like: | ||
|
||
```rust | ||
// compiled with: rustc -l l1 | ||
|
||
#[link(alias = "l1")] | ||
extern { | ||
// dllimport applied, dllexport not applied | ||
} | ||
|
||
#[link(name = "l2")] | ||
extern { | ||
// dllimport applied, dllexport not applied | ||
} | ||
|
||
#[link(name = "l3", kind = "static")] | ||
extern { | ||
// dllimport not applied, dllexport applied if linked statically to dylib | ||
} | ||
|
||
extern { | ||
// dllimport not applied, dllexport not applied | ||
} | ||
``` | ||
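The rule illustrated above can be condensed into a single decision function. This is a sketch, not the compiler's logic: `Linkage` and `dll_attrs` are hypothetical names, and the `reachable` parameter stands in for the "reachable via the public API" condition from the Motivation section.

```rust
/// How an `extern` block's symbols are linked, after following any alias to
/// its concrete linkage directive.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Linkage {
    Dylib,
    Static,
}

/// Returns `(dllimport, dllexport)` for the symbols of one `extern` block.
/// `linkage` is `None` when the block carries no linkage directive at all.
fn dll_attrs(linkage: Option<Linkage>, producing_dylib: bool, reachable: bool) -> (bool, bool) {
    match linkage {
        // Symbols coming from a DLL are imported, never re-exported.
        Some(Linkage::Dylib) => (true, false),
        // Statically linked symbols are exported only when a dylib is being
        // produced and the symbol is reachable from its public API.
        Some(Linkage::Static) => (false, producing_dylib && reachable),
        // Without a directive the compiler knows nothing about the symbols.
        None => (false, false),
    }
}
```

For instance, `dll_attrs(Some(Linkage::Static), true, true)` gives `(false, true)`, matching the `l3` block above when it is linked into a dylib.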
|
||
# Drawbacks | ||
|
||
For libraries to work robustly on MSVC, the correct `#[link]` annotation will | ||
be required. Most cases will "just work" on MSVC due to the compiler strongly | ||
favoring static linkage, but any symbols imported from a dynamic library or | ||
exported as a Rust dynamic library will need to be tagged appropriately to | ||
ensure that they work in all situations. Worse still, the `#[link]` annotations | ||
on an `extern` block are not required on any other platform to work correctly, | ||
meaning that it will be common that these attributes are left off by accident. | ||
|
||
It may be possible in the future for the compiler to parse the output of the | ||
linker for MSVC and detect common error messages which look like dllimport or | ||
dllexport aren't being applied, in which case the compiler could provide a much | ||
nicer error message about how to deal with the problem. | ||
|
||
Another drawback is that the CLI syntax is a little wonky with three `=` | ||
characters in some situations. | ||
|
||
# Alternatives | ||
|
||
* Instead of enhancing `#[link]`, a `#[linked_from = "foo"]` annotation could | ||
replace `#[link(alias = "foo")]` without supporting `alias` in `#[link]` or | ||
`-l`. This has the drawback of not being able to handle native libraries whose | ||
name is unpredictable across platforms in an easy fashion, however. | ||
Additionally, it adds an extra attribute to the compiler that wasn't known | ||
previously. | ||
|
||
* Instead of trying to connect symbols to libraries, the compiler could | ||
simply support `#[dllimport]` and `#[dllexport]` attributes for | ||
native symbols. These would directly correspond to the respective attributes | ||
and the burden of deciding when to apply them would be on the author instead | ||
of the compiler. This has a number of drawbacks, however: | ||
|
||
* The annotation burden here is much higher than with `alias` as an | ||
attribute is needed per-function. | ||
* It's not always known whether `#[dllexport]` is needed. If a native | ||
library is statically linked into an rlib then that rlib could later | ||
either become an executable or a DLL itself. In the executable case | ||
`dllexport` needs to not be applied, but in the DLL case it may need to be | ||
applied (if the symbol is reachable). Handling all this logic is possible | ||
from a crate author's perspective, but it would be quite tedious to | ||
replicate this logic across all crates in the ecosystem. | ||
* Similarly, it's not always known whether `#[dllimport]` is needed. It is | ||
not always known whether a native library is linked dynamically or | ||
statically (e.g. that's what a build script decides), so setting up the | ||
build script to enable the crate to conditionally emit `dllimport` has an | ||
even higher annotation burden than just applying `#[dllimport]` itself. | ||
|
||
Overall, it appears all usage of manual dllimport/dllexport can be encoded | ||
via `alias` which has a much smaller annotation burden and is much more robust | ||
in the face of dllexport particularly (e.g. only the compiler really knows | ||
whether the symbols are reachable or not) but also dllimport (auto applying or | ||
not applying depending on how the library is linked). | ||
|
||
* When linking native libraries, the compiler could attempt to locate each | ||
library on the filesystem and probe the contents for what symbol names are | ||
exported from the native library. This list could then be cross-referenced | ||
with all symbols declared in the program locally to understand which symbols | ||
are coming from a dylib and which are being linked statically. Some downsides | ||
of this approach may include: | ||
|
||
* It's unclear whether this will be a performant operation and not cause | ||
undue runtime overhead during compiles. | ||
|
||
* On Windows linking to a DLL involves linking to its "import library", so | ||
it may be difficult to know whether a symbol truly comes from a DLL or | ||
not. | ||
|
||
* Locating libraries on the system may be difficult as the system linker | ||
often has search paths baked in that the compiler does not know about. | ||
Oh the irony given this is exactly the problem that |
||
|
||
# Unresolved questions | ||
|
||
What does the transition plan for crates look like with these attributes? Today | ||
the compiler's liberal application of dllimport to statics enables many crates | ||
to link correctly, but once this change is implemented that will no longer be | ||
the case. If a crate wants to work on stable it cannot use | ||
`#[link(alias = "foo")]` and on nightly it *must* use it if no other `#[link]` | ||
directive is applied. Is it the case that in this situation `#[link]` is already | ||
applied with `kind = "dylib"`? | ||
|
Pointing out that `dllexport` is only applied when linked statically to a dylib seems a bit redundant as the kind is already specified as static so we know it will be linked statically and `dllexport` only matters when creating a dylib anyway.

Also it is important to note that it should only be `dllexport`ed from the first immediate dylib that is created. So if I create `a.dll` which depends on `b.dll` which statically depends on a static native library `c.lib`, then symbols from `c.lib` which end up as part of the interface of `b.dll` should be `dllexport`ed from `b.dll` but if they then end up in the interface of `a.dll` they should not be `dllexport`ed since the `c.lib` symbols don't exist in `a.dll`. Instead consumers of `a.dll` need to link to `b.dll`'s import library and get the relevant `c.lib` symbols from there.

Yes that is all handled by the compiler as it implicitly understands the dependency graph and what formats libraries are in.