Target Bundles #1711
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,246 @@ | ||
- Feature Name: target_bundles | ||
- Start Date: 2016-08 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Combine distribution of standard libraries and targets into bundles for targeting a particular | ||
platform. Such a bundle is all you need from the Rust side to cross-compile for your target. | ||
|
||
The major ideas of this RFC are to: | ||
|
||
1. Make JSON targets full-featured enough to specify targets that are as custom as they need | ||
to be; | ||
2. Convert current built-in targets to the JSON targets; | ||
3. Change distribution of libstd to include a corresponding JSON target. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
Currently there are two different ways rustc targets are distributed: built-in targets and custom | ||
JSON targets. Built-in targets are very inflexible – they cannot be changed without changing and | ||
recompiling the compiler itself. On the other side of the spectrum, custom JSON targets are easy to | ||
adjust and adapt, but feature-wise are very limited and are rarely suitable for the more uncommon | ||
use-cases. | ||
|
||
We’ve observed considerable desire among users of the language to customize the targets they | ||
use in ways not supported by our current infrastructure (short of making changes to the | ||
compiler itself, of course), and noted that the current scheme is not very feasible in the long | ||
run. This RFC should go a long way towards fixing these issues. | ||
|
||
There is also a strong need to be able to inspect arbitrary parts of a target specification, | ||
regardless of its origin. For example, in a cross-compilation setting, when the crate uses a | ||
build.rs script, `#[cfg]` variables are for the host, rather than the target. As a result, the author | ||
is forced to parse the target triple and figure out particularities of the target on their own, as | ||
rustc does not provide any way to inspect any of the built-in targets. | ||
Well, there is That said, I'm in favor of unifying our target specifications. |
||
|
||
# Detailed design | ||
[design]: #detailed-design | ||
|
||
## What constitutes a fully working target? | ||
|
||
In order to have any meaningful discussion about targets we need to decide on what constitutes a | ||
full, complete target. | ||
|
||
Currently, to compile for a specific target, a number of pieces are necessary: the compiler knows | ||
information about the target in some way, there is a set of Rust standard libraries compiled for | ||
the target, the system has native library dependencies for the target, and there exists a linker which | ||
is capable of linking code for the target. | ||
|
||
In this RFC we will *not* propose how to call custom linkers or “extend” capabilities of the LLVM | ||
used. | ||
|
||
## Changes to target format | ||
|
||
### Comments and reuse | ||
|
||
There has already been an attempt to migrate built-in targets to JSON targets, but it didn’t go all | ||
the way because of [loss of comments][comments]. One proposal was to migrate towards TOML; however, | ||
another, [more elegant solution][jsmin] which allows us to keep using JSON was proposed by Douglas | ||
Crockford himself: | ||
|
||
> Suppose you are using JSON to keep configuration files, which you would like to annotate. Go | ||
> ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your | ||
> JSON parser. | ||
|
||
This RFC proposes to use a similar scheme: adjust the build system to remove comments from all the | ||
JSON target specifications before packaging them for use by rustc. This way we get to keep using | ||
JSON and can keep on having comments in the checked-in target specifications. | ||
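
The RFC does not say how the build system would strip comments, so the following is only a minimal sketch of one way to do it. The helper name `strip_line_comments` is hypothetical, and the sketch handles `//` line comments only (not `/* */` blocks), taking care not to touch `//` sequences that appear inside string literals.

```rust
/// Hypothetical helper (not part of rustc or its build system): removes
/// `//` line comments from JSON-like text while leaving string literals,
/// including escaped quotes inside them, untouched.
pub fn strip_line_comments(src: &str) -> String {
    let mut out = String::with_capacity(src.len());
    let mut chars = src.chars().peekable();
    let mut in_string = false;
    let mut escaped = false;

    while let Some(c) = chars.next() {
        if in_string {
            out.push(c);
            if escaped {
                // The character after a backslash never ends the string.
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
        } else if c == '"' {
            in_string = true;
            out.push(c);
        } else if c == '/' && chars.peek() == Some(&'/') {
            // Drop everything up to (but not including) the newline.
            while let Some(&next) = chars.peek() {
                if next == '\n' {
                    break;
                }
                chars.next();
            }
        } else {
            out.push(c);
        }
    }
    out
}
```

Whitespace that preceded a comment is left in place, which is harmless for a JSON parser.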
|
||
A similar preprocessing step could be used to implement some form of target inheritance so the | ||
duplication between built-in targets could be reduced greatly. | ||
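
The RFC leaves the form of target inheritance open. One simple reading, sketched here under that assumption, is a shallow merge in which a derived target overrides or extends the keys of a base target. The `Spec` alias and `inherit` function are hypothetical names, and values are kept as raw JSON text to avoid depending on a JSON library.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened view of a target spec: key -> raw JSON value text.
pub type Spec = BTreeMap<String, String>;

/// Returns a copy of `base` in which every key present in `overrides`
/// replaces (or extends) the inherited one. This is a shallow merge only;
/// nested maps such as `cfg` would need a recursive variant.
pub fn inherit(base: &Spec, overrides: &Spec) -> Spec {
    let mut merged = base.clone();
    for (key, value) in overrides {
        merged.insert(key.clone(), value.clone());
    }
    merged
}
```

Under this scheme an OS X target could, for instance, inherit a generic Unix base and override only `dll_suffix` and `has_frameworks`.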
|
||
[comments]: https://github.com/rust-lang/rust/pull/34980#issuecomment-234683183 | ||
[jsmin]: https://plus.google.com/+DouglasCrockfordEsq/posts/RK8qyGVaGSr | ||
|
||
### Target quirks and cfg | ||
|
||
Currently we have a few keys dealing with the configuration variables only: `target_os`, | ||
`target_env`, `target_vendor`, `target_family`, `target_endian`, `target_pointer_width`¹, etc. | ||
Then there are options from which some configuration variables are derived, but which are also used to | ||
tweak compilation: `is_like_windows`, `is_like_osx`, `is_like_solaris`, etc. | ||
|
||
This RFC proposes replacing these keys with a different set of keys which explicitly control | ||
configuration variables and keys which control compilation details: | ||
|
||
* `cfg: {"target_env": "msvc", "windows": null}` would result in two configuration variables, | ||
`#[cfg(windows)]` and `#[cfg(target_env = "msvc")]`, evaluating to true, but no variables provided | ||
by the `cfg` key would be used to influence the behaviour of the compiler; | ||
* `debug_info: ["CodeView", 1]`: some targets require a different debuginfo format than what LLVM | ||
generates by default. MSVC targets want CodeView version 1, OS X and Android want Dwarf version 2, | ||
while LLVM appears to use the highest supported version of Dwarf by default; and | ||
* similarly for many other variables which tweak the way compilation is done. | ||
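
To make the `cfg` key's semantics concrete, here is a small sketch of how its entries could be turned into configuration predicates. The function name `cfg_predicates` and the slice-of-pairs representation are assumptions made for illustration; a `null` value in the JSON is modelled as `None`.

```rust
/// Hypothetical helper: turns entries of the proposed `cfg` map into
/// predicates in the `name` / `name="value"` form that `rustc --cfg` accepts.
/// A JSON `null` value is represented as `None`.
pub fn cfg_predicates(cfg: &[(&str, Option<&str>)]) -> Vec<String> {
    cfg.iter()
        .map(|(name, value)| match value {
            // Key-value variable, e.g. #[cfg(target_env = "msvc")].
            Some(v) => format!("{}=\"{}\"", name, v),
            // Bare variable, e.g. #[cfg(windows)].
            None => name.to_string(),
        })
        .collect()
}
```

For the example above, `cfg_predicates(&[("target_env", Some("msvc")), ("windows", None)])` yields `target_env="msvc"` and `windows`.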
|
||
¹: Technically, `target_pointer_width` is used in trans, but it does not provide any extra | ||
information over `data_layout`, which is already used elsewhere in the compiler to calculate | ||
layout, sizing information and alignment of all types, including the pointers. | ||
|
||
This would allow selectively reusing various conditional implementations that are present in the | ||
compiler for a custom target, when none of the `is_like_*` variables would fully suit the target. | ||
Moreover, being able to specify arbitrary cfg variables would allow easily adapting for various | ||
minuscule details related to the targets. For example, the targets for ARM CPUs with NEON support | ||
could export `target_has_neon` without any extra language or compiler support. | ||
|
||
That being said, it might make sense to have a whitelist of stable options and cfg variables and | ||
keep everything else unstable for some duration. | ||
|
||
### Proposed JSON key-values | ||
|
||
```js | ||
{ | ||
// REQUIRED | ||
"llvm_target": "x86_64-unknown-linux-gnu", // LLVM target triple (does not need to match with rustc triple) | ||
"data_layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128", // DataLayout for the target | ||
"arch": "x86_64", // Architecture of the target | ||
|
||
// OPTIONAL (should have sane defaults) | ||
// Configuration variables injected into the compilation units. | ||
"cfg": { | ||
"target_os": "linux", | ||
Some of these are currently mandatory.
I address this in lines 96-97. Namely, the
I also meant
We already use data-layout to calculate stuff like …ehem… layout in rustc, including sizing and alignment information of pointers. Even if data-layout format changed somehow to not contain (or have defaults for) this information anymore, that particular part of the compiler would need some way to have both pointer size and alignment specified. I’d argue that I’m also not proposing to derive
AFAICT there’s exactly two uses of this cfg in the compiler, both when deciding particulars of ABI, and these IMO should be handled by distinct keys. One of the uses would already be handled by |
||
"target_family": "unix", | ||
"target_arch": "x86_64", | ||
"target_endian": "little", | ||
"target_pointer_width": "64", | ||
"target_env": "gnu", | ||
"target_vendor": "unknown", | ||
"target_has_atomic": ["8", "16", "32", "64"], // any of #[cfg(target_has_atomic={"8","16","32","64"})] work. | ||
@Amanieu previously complained about this being an explicit listing. See rust-lang/rust#38579 |
||
"target_has_atomic_ptr": null, | ||
I would prefer the previous system where you only need to specify a single value for atomic support: just the maximum atomic bit width supported by the target. It's much simpler and gives the same information. |
||
"target_thread_local": null, | ||
"unix": null | ||
}, | ||
|
||
// Type of, and the linker used | ||
"linker_kind": "gnu", // previously linker_is_gnu: bool | ||
// possible values: "gnu", "msvc", "osx" | ||
"linker": "cc", | ||
"ar": "ar", | ||
"archive_format": "gnu", | ||
|
||
"function_sections": true, | ||
"dynamic_linking": true, | ||
"disable_redzone": false, | ||
"obj_is_bitcode": false, | ||
"allow_asm": true, | ||
"allows_weak_linkage": true, | ||
"no_default_libraries": true, | ||
"custom_unwind_resume": false, // might make sense to merge into below | ||
"eh_method": "dwarf", // NEW: what EH method to use | ||
"dll_storage_attrs": false, // NEW: should use dll storage attrs? | ||
"debug_info": ["Dwarf", 4], // NEW: what debug info format | ||
"system_abi": "C", // NEW: what "system" ABI means | ||
"c_abi_kind": "cabi_x86_64", // NEW: what C_ABI implementation to use | ||
|
||
"pre_link_args": ["-Wl,--as-needed", "-Wl,-z,noexecstack"], | ||
"post_link_args": [], | ||
"pre_link_objects_dll": [], | ||
"pre_link_objects_exe": [], | ||
"post_link_objects": [], | ||
"late_link_args": [], | ||
"gc_sections_args": [], // NEW: how to strip sections | ||
"rpath_prefix": "$ORIGIN", // CHANGED: to allow specifying rpath prefix, null to disable rpath altogether | ||
"no_compiler_rt": false, | ||
"metadata_section": ".note.rustc", // NEW: name of section for metadata | ||
"has_frameworks": false, // NEW: is concept of frameworks supported? | ||
"position_independent_executables": true, // should become a plain linker argument? | ||
"lib_allocation_crate": "alloc_system", | ||
"exe_allocation_crate": "alloc_jemalloc", | ||
// should become a template? `lib{}.so` is much nicer | ||
"dll_prefix": "lib", | ||
"dll_suffix": ".so", | ||
"exe_suffix": "", | ||
"staticlib_prefix": "lib", | ||
"staticlib_suffix": ".a", | ||
|
||
"cpu": "x86_64", // CPU of the target | ||
"features": "", // LLVM features | ||
"relocation_model": "pic", | ||
"code_model": "default", | ||
"eliminate_frame_pointer": true, | ||
|
||
// is_like_solaris is handled by the extra gc_sections_args key | ||
// is_like_msvc is handled by the extra metadata_section, linker_kind, eh_method, | ||
// dll_storage_attrs, debug_info keys | ||
// is_like_windows is handled by the extra system_abi and c_abi_kind keys | ||
// is_like_android is handled by the extra debug_info key | ||
// is_like_osx is handled by linker_kind: "osx", rpath_prefix, has_frameworks, c_abi_kind | ||
} | ||
``` | ||
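
As an illustration of how one of the new string-valued keys might be consumed, the sketch below parses `linker_kind` into an enum covering the three values listed above. The `LinkerKind` type and its error message are hypothetical and not taken from the compiler.

```rust
use std::str::FromStr;

/// Hypothetical in-compiler representation of the proposed `linker_kind` key.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LinkerKind {
    Gnu,
    Msvc,
    Osx,
}

impl FromStr for LinkerKind {
    type Err = String;

    /// Accepts exactly the values listed in the spec sketch above and
    /// rejects anything else with a descriptive error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "gnu" => Ok(LinkerKind::Gnu),
            "msvc" => Ok(LinkerKind::Msvc),
            "osx" => Ok(LinkerKind::Osx),
            other => Err(format!("unknown linker_kind: {}", other)),
        }
    }
}
```

With this in place, `"gnu".parse::<LinkerKind>()` returns `Ok(LinkerKind::Gnu)`, while an unlisted value produces an error instead of silently falling back to a default.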
|
||
## Distribution of targets | ||
|
||
Currently every built-in target is distributed along with the rustc compiler. This is suboptimal, | ||
because in the majority of cases users are interested in targets which rustc can compile for, rather | ||
than the built-in targets rustc knows about; distributing targets built into rustc therefore | ||
provides no benefit. | ||
|
||
Native libraries and linkers aside, it is obvious there’s little sense in distributing targets | ||
I don't agree with this. Once "on the fly compilation of std" becomes a thing, it actually makes sense to provide all the target specifications "upfront" with (*) This actually makes more sense for final deployments/releases. Instead of what's proposed here, I propose adding a new
Getting rid of precompiled
They aren't independent of the compiler however, as the fields will probably be subject to change for a while.
You are right. Then we could use the same "channel" policy that the toolchain uses. If you do
I do not disagree with any points raised here, but I feel like solving this concern should become a part of eventual “custom std on demand” RFC, which AFAICT not even feasible at the moment due to stability concerns. |
||
separately from the standard libraries for them. This RFC proposes to change the distribution so | ||
the rust-std and rust (not rustc or rust-docs) packages begin including the JSON target description | ||
for the target which the standard library targets. The target JSONs would get installed in | ||
`$sysroot/share/rustc/targets/` or a similar directory, and the `RUST_TARGET_PATH` environment | ||
I'd propose placing the target specifications into the In meta-rust we've been shipping a patch that does that for a while (to make our life a bit easier).
I prefer
Not necessarily. Packages regularly put non-shared libraries (xml & conf) in Also, There is no (current) allowance for
The distinction is that
Well, then should all of I'd also consider that I guess my point is that target specifications and their associated compiled libraries are clearly related, and as a result I'd prefer they are stored "close" to each other. If
I guess this is the point I disagree on. The main difference between the compiled object files and the target descriptions is just that one happens to be compiled, so people are more comfortable with them sitting in lib. The target descriptions are specific to a target, they are just parse-able outside of that target, but aren't useful outside of the context of using them for a specific target. I'd like to re-iterate that configuration files for valgrind and systemd (among other things) are also placed in lib. These aren't compiled objects. They are as specific to the architecture as our target specification is (and perhaps somewhat less specific to it in some cases). And gcc's headers in lib aren't compiled objects.
One thing to consider is that if we had This has the nice advantage of a single folder bundling the std libs and target-spec, so that renaming a target is the same as renaming a directory (instead of needing 2 renames).
At least for systemd, this is false - on multiarch systems, systemd units can very well contain architecture-specific paths, thus making them architecture-dependent in exactly the way
I don't think that works very well; see #1711 (comment) for my reply to your other comment suggesting this.
Let me clarify my point in bringing up valgrind & systemd here:
I'm then considering our target spec bits:
The key difference between the target specs & target libs is that one is "examinable" (in a traditional sense that I don't need to parse elf files :).
"Examinable" is a red herring, and I suspect why is rooted in this:
The crucial misstep is that "work with" isn't the constraint - it's "work on". If a systemd unit file reference an arch-dependent path (say, in Meanwhile, if I had a native rustc and a canadian-cross rustc, of the same versions, they could both use the exact same target specifications unchanged. This is exactly the property denoting what should go in And before you then go back to saying this is true of rustlib, it's not - not in the face of dynamic linking, which is supported. Sure, you can link to them - and then the binary remains linked to them, and is now using arch-dependent things from |
||
variable would be adjusted to include this directory by default. | ||
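
The lookup this implies can be sketched as follows. This is an illustration under stated assumptions rather than rustc's actual search logic: the function name `candidate_spec_paths` is hypothetical, the sysroot directory is searched after the user-supplied `RUST_TARGET_PATH` entries, and `$sysroot/share/rustc/targets/` is the directory suggested above.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Hypothetical helper: lists the files that would be probed for a target
/// named `target`, in order. Entries from `rust_target_path` (the value of
/// the RUST_TARGET_PATH variable) come first; the bundled directory under
/// the sysroot is appended last (an assumed ordering).
pub fn candidate_spec_paths(
    target: &str,
    rust_target_path: &OsStr,
    sysroot: &Path,
) -> Vec<PathBuf> {
    let file_name = format!("{}.json", target);
    // `split_paths` uses the platform's PATH-style separator.
    let mut dirs: Vec<PathBuf> = env::split_paths(rust_target_path).collect();
    dirs.push(sysroot.join("share").join("rustc").join("targets"));
    dirs.into_iter().map(|dir| dir.join(&file_name)).collect()
}
```

A caller would take the first candidate that exists on disk, which lets a user-provided specification shadow a bundled one of the same name.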
|
||
This scheme allows users to easily produce and distribute custom standard library and target | ||
combinations, thus removing the need to land every single target as a built-in to rustc. Moreover, | ||
under this scheme, if rustc reports knowing about a target, it is very likely it will be able to | ||
compile for it as well, instead of reporting a confusing | ||
|
||
> error: can't find crate for `std` [E0463] | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
This RFC does not attempt to solve issues surrounding linker invocation and native library | ||
discovery, especially during cross-compilation. Instead, the code built into rustc is still relied | ||
upon to deal with these problems. On the other hand, it is likely that there isn’t much more | ||
necessary than rustc knowing how to invoke the few linkers it already knows about in order to cover | ||
the great majority of use cases. | ||
|
||
Users will be able to build the standard library-target bundles, but only with nightly versions | ||
of the rustc compiler, because of the number of unstable features necessary to build libstd. On the | ||
upside, it should become as easy as `cd src/libstd && cargo build --release | ||
--target=/path/to/custom-target.json && build_bundle`. If there’s a desire to make “bundles” work | ||
with stable rustc, the target would still have to be submitted upstream. | ||
|
||
`#![no_core]` users will have to download the rust-std even if they have no use for libraries in | ||
there. | ||
|
||
The proposed change to add a key for each option, instead of having umbrella `is_like_*` keys, will | ||
result in a big increase in the number of such options. All of these are optional and should have sane default | ||
values, though. | ||
|
||
# Alternatives | ||
[alternatives]: #alternatives | ||
Seems like an alternative (in some sense) would be to have a
It's not an alternative at all, because that option does not help creating and distributing custom targets.
It's a solution to the issue of being able to determine information about a target externally (which I saw as one of the motivations for moving all targets to json), but you're right that it doesn't have anything to do with creating & distributing custom targets.
I've got the code for |
||
|
||
This RFC is still very viable without the proposed changes to JSON keys (the “target quirks and | ||
cfg” as well as the “proposed JSON key-values” sections). | ||
|
||
Keep distributing all the built-in targets with the rustc package. | ||
|
||
Since we’re already tweaking the structure of the target json spec, instead of allowing comments in | ||
JSON, just switch to TOML. | ||
|
||
# Unresolved questions | ||
[unresolved]: #unresolved-questions |
With this change, the compiler will have to learn to not use `std` binaries pre-compiled for `$TARGET` (e.g. `x86_64-unknown-linux-gnu`) if its corresponding target file (`$TARGET.json`) has been modified because codegen options may have been changed in an incompatible way. This is not a problem today because one can't override a built-in target specification with a target specification file.
Does it sound sensible to include a hash of the target spec in emitted binaries? The issue goes beyond `std`, and including a spec hash would resolve it.
A hash of the target spec makes sense to me. I think it would have to be appended to the target triple of the rustlib directories: `$sysroot/lib/rustlib/$TARGET-$HASH`. That way several variants of the same target triple can coexist in the sysroot. It may also be worth encoding the hash in the rlibs' metadata for "extra protection"; someone might rename a rustlib directory from `$TARGET-$HASH1` to `$TARGET-$HASH2` -- the hash in metadata would prevent using the wrong variant in that case.
Excellent, triple-hash solves the discoverability/ergonomics problem.
I don't think it makes sense to suffix the rustlib-target-dir with a hash. Consider that if we have a directory blessed to contain target specifications, and the name of those specifications is `<target-name>.ext`, we already know that all the installed specifications have unique names.
If we're worried about handling targets that don't have specifications installed, why would they have `std` installed in `rustlib`? Instead, for handling non-installed target specs (those not in the blessed dir), I'd propose the following:
`std` crates either in the target-spec (brittle for dynamic uses, but potentially automatic) or as a separate argument to rustc (easy for dynamic, but no automation)
Alternately (and I prefer this):
`--target` to accept a directory (so `--target ./my-target`) that contains a target spec (`./my-target/target.json` or `./my-target/target.toml`) and the `std` crates in the same way as `rustlib` today.
Strongly disagree - it solves uniqueness at one point in time, but if the target spec is edited, you can now have a target dir that contains a mix of objects built under different target spec contents. That's unsafe.
That's really, really nasty if we put the hash of the target in the path to rustlib - you need the target to hash it to find the directory in which you can find the target. You wind up having to scan every directory, looking for the target you want.
Very true. But using a hash in a directory name as a way of avoiding that unsafety isn't perfect. For avoiding unsafety, we'd really want to embed the target hash in the object files. This has the nice benefit of fixing that unsafety for all object files, instead of just those in the sysroot libs (std, etc).
Very true. Which is why the beginning of my comment discards the idea of using a hash.
We can do both, but remember that even if rustc can check rlib metadata to verify the hash, ld is not Rust-specific and won't. For interop with other tools it's still nice to get the safety of one directory per target.
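
As a rough illustration of the `$TARGET-$HASH` naming discussed in this thread, the sketch below derives a rustlib directory name from the contents of a target spec. Everything here is an assumption made for the example: the thread does not name a hash function (64-bit FNV-1a is used only because it is short and deterministic), and `rustlib_dir` is a hypothetical helper.

```rust
/// 64-bit FNV-1a over a byte slice. Chosen for this sketch only; the
/// discussion above does not specify which hash would be used.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical helper: builds `$sysroot/lib/rustlib/$TARGET-$HASH`, where
/// the hash is computed over the target spec's contents, so that editing
/// the spec selects a different directory.
pub fn rustlib_dir(sysroot: &str, target: &str, spec: &str) -> String {
    format!(
        "{}/lib/rustlib/{}-{:016x}",
        sysroot,
        target,
        fnv1a_64(spec.as_bytes())
    )
}
```

The same value could also be recorded in rlib metadata, as suggested above, so that a renamed directory is still detected as a mismatch.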