RFC: Add `target` configuration #2991
increases compile time and makes a crate incompatible with certain build
systems.

Otherwise, all available components would need to be specified separately:
I vaguely remember that the `target` isn't necessarily the concatenation of the various `target_{foo}` parts. That said, I'm having trouble finding the part of the docs that says this...
`x86_64-fuchsia` is an example (it doesn't concatenate `unknown`s into the result).
Clang/LLVM treats empty components, `none` and `unknown` as equivalent, so for example `x86_64-fuchsia`, `x86_64-unknown-fuchsia` and `x86_64-none-fuchsia` are all considered equivalent and internally normalized to `x86_64-unknown-fuchsia`.
We prefer the shortest spelling for convenience, but if Rust prefers always using normalized triples, we could switch to using `x86_64-unknown-fuchsia`, I'd be fine with that. This wasn't possible in the past when LLVM normalized empty components inconsistently, but that issue has since been resolved.
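To make the equivalence described above concrete, here is a toy sketch. It is not LLVM's actual normalization, which handles many more triple shapes; the function name `normalize_vendor` is hypothetical, and the sketch assumes the triple has the shape `arch[-vendor]-os` only, as in the Fuchsia examples.

```rust
/// Toy normalization for triples shaped like `arch[-vendor]-os`.
/// ASSUMPTION: illustrative only; real LLVM normalization is more involved
/// and also handles environment/ABI components.
fn normalize_vendor(triple: &str) -> String {
    let parts: Vec<&str> = triple.split('-').collect();
    match parts.as_slice() {
        // Missing vendor: `x86_64-fuchsia`.
        [arch, os] => format!("{}-unknown-{}", arch, os),
        // Empty, `none`, or `unknown` vendor are treated as equivalent.
        [arch, vendor, os]
            if vendor.is_empty() || *vendor == "none" || *vendor == "unknown" =>
        {
            format!("{}-unknown-{}", arch, os)
        }
        // Anything else is left untouched by this sketch.
        _ => triple.to_string(),
    }
}

fn main() {
    for t in ["x86_64-fuchsia", "x86_64-none-fuchsia", "x86_64-unknown-fuchsia"] {
        println!("{} -> {}", t, normalize_vendor(t));
    }
}
```

All three spellings map to the same string in this sketch, which is the property a `target` cfg would rely on if Rust always exposed the normalized triple.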
- `"x86_64-pc-windows-msvc"`
- `"x86_64-unknown-linux-gnu"`

# Drawbacks
Another drawback might be that technically this could be a breaking change with anybody that conditionally adds a `--cfg target=...` cfg either via RUSTFLAGS or `cargo:rustc-cfg=target=...` from a build script.
This goes for your `target_abi` PR also.
Right, so if a person's build.rs had lines like say
let target = env::var("TARGET").expect("Couldn't read `TARGET`");
println!("cargo:rustc-env=TARGET={}", target);
Then technically they'd hit a small break ;P
No, that should be fine. The code would have to be:
let target = env::var("TARGET").expect("Couldn't read `TARGET`");
println!("cargo:rustc-cfg=target={}", target);
in order to have a break... Actually, this would define the target identically to what it was before, so they'd have to define it as something different (unless their definition would take priority?), or only in some cases, I guess.
This is a problem adding any new predefined `cfg` faces, and stagnating the set of `cfg`s feels pretty undesirable to me, so I don't know if it's a case worth caring about.
The string is equal to the `TARGET` environment var anyway, so `target` would not change.
Yeah, that's what I meant by:
> Actually, this would define the target identically to what it was before, so they'd have to define it as something different (unless their definition would take priority?), or only in some cases, I guess.
It might be a breaking change for:
let target = env::var("CARGO_CFG_TARGET_ARCH").expect("Couldn't read `CARGO_CFG_TARGET_ARCH`");
println!("cargo:rustc-cfg=target={}", target);
or
if some_condition {
    let target = env::var("TARGET").expect("Couldn't read `TARGET`");
    println!("cargo:rustc-cfg=target={}", target);
}
But these are pretty obscure cases...
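The conditional case from this thread can be sketched as a standalone program. The helper names `target_cfg_directive` and `conditional_directive` are hypothetical, and the sketch assumes the value is quoted in the directive, since key-value cfgs take string literals. The fallback triple is only there so the program also runs outside of Cargo.

```rust
use std::env;

/// Formats a `cargo:rustc-cfg` directive that sets a custom `target` cfg.
/// ASSUMPTION: the value is quoted, as key-value cfgs take string literals.
fn target_cfg_directive(value: &str) -> String {
    format!("cargo:rustc-cfg=target=\"{}\"", value)
}

/// Emits the directive only when `some_condition` holds, which is the
/// obscure case discussed above; otherwise nothing is emitted.
fn conditional_directive(target: &str, some_condition: bool) -> Option<String> {
    if some_condition {
        Some(target_cfg_directive(target))
    } else {
        None
    }
}

fn main() {
    // Cargo sets TARGET for build scripts; fall back so this runs standalone.
    let target = env::var("TARGET")
        .unwrap_or_else(|_| "x86_64-unknown-linux-gnu".to_string());
    if let Some(line) = conditional_directive(&target, true) {
        println!("{}", line);
    }
}
```

A crate doing this today sees `target` set only in some builds. With a predefined `target` cfg it would be set in every build, which is where the behavior could change.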
string (e.g. `arm-unknown-linux-gnueabihf`). This also adds a `CARGO_CFG_TARGET`
environment variable for parity with other `CARGO_CFG_*` variables.

# Motivation
I don't think the motivation section provides enough of a compelling argument for why this is desirable. Fleshing this out with specific examples of existing crates that use workarounds to achieve this and why they do so would be useful.
# Drawbacks
[drawbacks]: #drawbacks

- Configuring against specific targets can be overly strict and could make
This is the biggest drawback in my experience. People will often reach for the most accessible way to target a platform even if it's the wrong one. We saw this a lot in the Firefox codebase where people would use the `#ifdef` for "using the GTK widget implementation" when they really meant "this is Linux" or other similar things.
`aarch64-unknown-none-softfloat`, yet one would likely want to include ABI
variants. The same concern applies to the target vendor.

A potential solution would be to allow glob matching (e.g.
How feasible would it be to allow specifying individual components of the target triple string by keyword inside a `target` cfg, something like:
#[cfg(target(arch="aarch64", os="none"))]
which would be equivalent to:
#[cfg(all(target_arch="aarch64", target_os="none"))]
?
I definitely agree with the assertion that "matching the entire target with the current set of `cfg` options is extremely verbose", but I'm not convinced that "allow matching the entire target string directly" is the right way to address that.
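For reference, the long form that the suggested shorthand would abbreviate can be written with the existing per-component cfgs. This is a minimal sketch; the function name `is_aarch64_bare_metal` is hypothetical, and only `target_arch` and `target_os` are used.

```rust
/// True only when compiled for a bare-metal AArch64 target.
/// Uses the existing per-component cfgs, i.e. the long form that the
/// suggested `target(arch = ..., os = ...)` shorthand would abbreviate.
fn is_aarch64_bare_metal() -> bool {
    cfg!(all(target_arch = "aarch64", target_os = "none"))
}

fn main() {
    println!("aarch64 bare metal: {}", is_aarch64_bare_metal());
}
```

The check is resolved at compile time, so on a typical desktop host the function simply returns `false`.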
One problem with matching against target names is that it doesn't work for JSON targets. This is a problem with the standard library, which has several build scripts that look at the target name, and that is something we'd specifically like to move away from. I think it would be a problem if it was allowed to use them in It might be helpful to collect data on how common it is to have cfg expressions that match more than 3
Can they not specify their own target name? Or even if they do does it not work (which would sound like a bug to me) |
The problem is that the target name is whatever the filename is minus the |
@rfcbot merge We discussed this in lang today and we feel this is a pretty obvious extension based on the features we currently have. That is, we let you cfg on each member of the triple, we should also let you cfg on the triple as a whole. We also wanted to tag @rust-lang/cargo to be aware of this RFC because it adds a cargo env variable. |
Team member @withoutboats has proposed to merge this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
@withoutboats did the team discuss @luser's suggestion of |
Relevant issue: rust-lang/rust#63217 |
We didn't talk about that, actually, and I think I'd like to see the RFC extended to cover the alternative. But before that, I'd like to hear your thoughts on why you prefer to match the entire target string (I have some thoughts but I'd be curious to hear yours first). I see @luser also raised the concern that this makes matching the entire target string too accessible, and that folks will reach for it first. Plausible, though I think ultimately I'm not too worried about it. It seems like something to solve in documentation, by emphasizing the more accessible forms first, and presenting "match the precise target ABI" as a kind of fallback.
@rfcbot concern match-target-string I'm going to go ahead and register a concern that we ought to include @luser's proposed form in the alternatives (at least) and describe why we chose not to adopt it. |
On further thought, cargo does already allow matching on the full target for specifying platform-specific dependencies, so perhaps it's not a huge problem in practice. Given that we do already have most of the components of the target already available for |
Worth noting that the arch isn't really the first part of the target string always. e.g. |
As I mentioned above, the fact that Cargo allows matching on the target name causes problems today. I personally think this would be an anti-pattern (that is, I think it should be forbidden in libstd for example). I don't see any data or examples for a motivation for this. The vast majority of cfg expressions I've seen only match on one or two components. |
One example for which I wanted |
Following up on this: I share the same concern as @nikomatsakis, and I think I'd prefer that alternative. In particular, if this is primarily about |
@rfcbot concern shorthand-for-target-by-components I would prefer the proposed alternative that provides a shorthand. |
As mentioned the shorthand is similar, but not exactly the same. The first component isn't quite the target arch, and it's valuable to match on it in many cases. |
@thomcc are you referring to this comment? Can you elaborate on what is enabled by this approach that would not otherwise work? Sorry if you're repeating yourself. |
For most cross-platform libraries in the ecosystem it's probably an antipattern to match on a specific target triple. But for other software it's quite common that it's built for two or three specific targets (e.g. a company making software for specific hardware). In those cases, it would be a lot more readable to match exactly on the few target triples that are supported, rather than having to translate those into separate components. (And stabilizing
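As a rough illustration of matching on a short list of exact triples, a build script can already compare the `TARGET` environment variable against an allow-list. This sketch assumes the two triples from the RFC text as the supported set; `SUPPORTED_TARGETS` and `is_supported_target` are hypothetical names.

```rust
use std::env;

/// Hypothetical allow-list: the exact triples a product is built for.
const SUPPORTED_TARGETS: &[&str] = &[
    "x86_64-pc-windows-msvc",
    "x86_64-unknown-linux-gnu",
];

/// Exact string match against the allow-list, with no component parsing.
fn is_supported_target(target: &str) -> bool {
    SUPPORTED_TARGETS.contains(&target)
}

fn main() {
    // Cargo sets TARGET for build scripts; fall back so this runs standalone.
    let target = env::var("TARGET")
        .unwrap_or_else(|_| "x86_64-unknown-linux-gnu".to_string());
    println!("{} supported: {}", target, is_supported_target(&target));
}
```

The proposed `target` cfg would let the same exact-match check be written directly in `#[cfg(...)]` attributes instead of in a build script.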
We discussed this in today's @rust-lang/lang meeting. Our consensus was that we'd like to see both the ability to match on full target strings as proposed in this RFC, and the shorthand to match components of the target. We'd like to see the shorthand proposal added to the RFC: |
Though, that means that |
Agreed. I was trying to briefly describe the shorthand, but yes, this should only work for |
My assumption is that this would be true only for the |
Presuming we added an alias to If we made, within the compiler, the target code actually attempt to understand the triple string, and e.g. I would also like to see the concern about json targets from above figured out before this merges. |
@nagisa I would propose that I'm not necessarily suggesting we should start elaborating/canonicalizing target strings in general; this (and similarly |
The pc ←→ unknown was just an example I had on hand, since I remembered the RFC during the discussion about potentially making these aliases. My concern is broader though – I would like this RFC to clarify what the plan would be if we made the relationship between target string and target definition not a (usually) one-to-one mapping, but rather a (usually) many-to-one mapping. In reaction to your response, I would also like to posit that it would be quite confusing if |
Perhaps, but it's really the only way that this RFC stays viable — otherwise there are a potentially unbounded number of target strings that you'd have to match. I think there needs to be a notion of a "canonical" name for the target, which the other names are normalized to (as Josh said) as early as possible. That said, I kind of feel like it's plausible that this is not this RFC's responsibility, so much as the responsibility of whatever RFC causes the currently one-to-one mapping to become a many-to-one mapping. That said, I don't feel that strongly on that point |
This wouldn't work if you were matching target strings for toolchain purposes, though (however, at that point, it may or may not be reasonable to just use/forward the |
I honestly think it'd be a leak of an implementation detail if you could detect which non-canonical alias were used. I think if we're going to have aliases at all (rather than just saying that you must pass the canonical name of the target), then those aliases should be translated to the canonical target name as early as possible, and then only that canonical target name and its components should ever be exposed to Rust code. |
What would you do, then, if you wanted to access the toolchain corresponding to the actual target? If I'm invoking with |
Cargo has a
If those both exist and refer to different programs, something has gone terribly wrong. Until we have automatic detection, it may make sense for a build script (or more likely its helper library, such as cc-rs) to simply check for names in order until it finds one that exists. For instance, try |
On Mon, Jun 21, 2021 at 14:32 Josh Triplett ***@***.***> wrote:
> What would you do, then, if you wanted to access the toolchain
> corresponding to the actual target?
>
> Cargo has a target.(target string).linker option; use that. If we need to
> expose that in more places, we should. We should also have an equivalent
> for the C compiler. And we could improve our handling of that to
> automatically find it in more cases. But build scripts should not need to
> guess.
>
> If I'm invoking with --target x86_64-linux-gnu (hypothetically), and I
> want ld for host, the program I need to find is x86_64-linux-gnu-ld not
> x86_64-unknown-linux-gnu-ld.
>
> If those both exist and refer to different programs, something has gone
> terribly wrong. Until we have automatic detection, I think it's appropriate
> for a build script (or more likely its helper library, such as cc-rs) to
> simply check for names in order until it finds one that exists. For
> instance, try x86_64-unknown-linux-gnu-ld and fall back to
> x86_64-pc-linux-gnu-ld.
Well, that works in theory, but there can be instances where different (but
otherwise functionally identical) vendor fields are used for cross
compilation. For example, when building first stage cross toolchain for
lfs, the target x86_64-lfs-linux-gnu (which canonicalizes to
x86_64-unknown-linux-gnu) is used. If you were then to try invoking
x86_64-unknown-linux-gnu-ld (between stage1 toolchain and stage2
toolchain), you might find the host toolchain instead of the cross
toolchain.
On Mon, Jun 21, 2021 at 01:36:19PM -0700, Connor Horman wrote:
> On Mon, Jun 21, 2021 at 14:32 Josh Triplett ***@***.***> wrote:
> > What would you do, then, if you wanted to access the toolchain
> > corresponding to the actual target?
> >
> > Cargo has a target.(target string).linker option; use that. If we need to
> > expose that in more places, we should. We should also have an equivalent
> > for the C compiler. And we could improve our handling of that to
> > automatically find it in more cases. But build scripts should not need to
> > guess.
> >
> > If I'm invoking with --target x86_64-linux-gnu (hypothetically), and I
> > want ld for host, the program I need to find is x86_64-linux-gnu-ld not
> > x86_64-unknown-linux-gnu-ld.
> >
> > If those both exist and refer to different programs, something has gone
> > terribly wrong. Until we have automatic detection, I think it's appropriate
> > for a build script (or more likely its helper library, such as cc-rs) to
> > simply check for names in order until it finds one that exists. For
> > instance, try x86_64-unknown-linux-gnu-ld and fall back to
> > x86_64-pc-linux-gnu-ld.
>
> Well, that works in theory, but there can be instances where different (but
> otherwise functionally identical) vendor fields are used for cross
> compilation. For example, when building first stage cross toolchain for
> lfs, the target x86_64-lfs-linux-gnu (which canonicalizes to
> x86_64-unknown-linux-gnu) is used. If you were then to try invoking
> x86_64-unknown-linux-gnu-ld (between stage1 toolchain and stage2
> toolchain), you might find the host toolchain instead of the cross
> toolchain.
We could, in theory, have rustc's/cargo's linker/toolchain detection
(which could use some improvement) look at the alias as a prefix.
Keeping that inside of rustc/cargo seems preferable to exposing it to
random build scripts that may have wildly varying mechanisms to find
toolchains.
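A minimal sketch of the lookup order under discussion: try the tool prefixed with the alias the user passed, then fall back to the canonical triple. The function names `candidate_tools` and `first_existing` are hypothetical, and the `exists` predicate is a stand-in for a real `PATH` search; nothing here reflects how rustc or Cargo actually locate tools.

```rust
/// Candidate tool names, alias-prefixed first, canonical second.
/// ASSUMPTION: `alias` is the triple as typed by the user and `canonical`
/// is what it normalizes to; both are supplied by the caller.
fn candidate_tools(alias: &str, canonical: &str, tool: &str) -> Vec<String> {
    let mut names = vec![format!("{}-{}", alias, tool)];
    if canonical != alias {
        names.push(format!("{}-{}", canonical, tool));
    }
    names
}

/// Returns the first candidate for which `exists` reports true.
/// `exists` stands in for a real PATH lookup so the sketch stays testable.
fn first_existing<F: Fn(&str) -> bool>(candidates: &[String], exists: F) -> Option<&String> {
    candidates.iter().find(|name| exists(name.as_str()))
}

fn main() {
    let names = candidate_tools("x86_64-lfs-linux-gnu", "x86_64-unknown-linux-gnu", "ld");
    // Pretend only the cross toolchain's linker is installed.
    let found = first_existing(&names, |n| n == "x86_64-lfs-linux-gnu-ld");
    println!("{:?}", found);
}
```

Checking the alias first addresses the LFS scenario above: the cross linker `x86_64-lfs-linux-gnu-ld` is found before the canonical name can resolve to the host toolchain.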
|
I'd also note that this applies to things other than the basic toolchain (linker, assembler, C/C++ compiler). I highly doubt Cargo has a need to expose things like objcopy, or custom tools.
This feels very much out of scope for this RFC. It would be great for cargo to provide access to the various parts of the toolchain (I'm quite sure I've filed issues for similar things in the past), but that's an entirely separate problem. |
We talked about this in today's @rust-lang/lang meeting. We decided that this RFC doesn't depend on any specific decision about how to handle target aliases, and we can discuss how to handle target aliases with |
@rfcbot resolve match-target-string |
@joshtriplett did you intend to resolve your concern? |
@cramertj No, per #2991 (comment) I'm waiting for the RFC to be updated to cover the |
@rfcbot fcp cancel This hasn't moved in a long enough time that I am going to cancel the FCP. I'd love for someone to pick this up and make the edits that @joshtriplett suggested, perhaps in a separate PR, it seems like we're very close to getting resolution here! |
@nikomatsakis proposal cancelled. |
We just found a use case for it in rust-lang/log#490. I'll try to pick this up because we need it. |
I opened #3239 and added what was requested. |
I am closing this in favor of #3239 since I have not been shepherding this RFC very well, and I trust @GuillaumeGomez to follow it through to acceptance. |
This proposes a new `cfg`: `target`, which matches the entire target triple string (e.g. `arm-unknown-linux-gnueabihf`). This also adds a `CARGO_CFG_TARGET` environment variable for parity with other `CARGO_CFG_*` variables.
Rendered