r[attributes.codegen]
The following attributes are used for controlling code generation.
r[attributes.codegen.hint]
r[attributes.codegen.hint.cold-inline]
The `cold` and `inline` attributes give suggestions to generate code in a way that may be faster than what the compiler would generate without the hint. The attributes are only hints, and may be ignored.
r[attributes.codegen.hint.usage]
Both attributes can be used on functions. When applied to a function in a trait, they apply only to that function when used as a default function for a trait implementation and not to all trait implementations. The attributes have no effect on a trait function without a body.
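As a minimal sketch of this rule, the hypothetical `Shape` trait below (the trait, types, and method names are assumptions made for illustration) attaches `#[inline]` to a default method. The hint covers the default body that `Square` inherits, while the override in `Unit` is a separate function that the hint does not reach.

```rust
// Hypothetical trait and types, used only to illustrate the rule above.
trait Shape {
    // No body: a hint attribute placed here would have no effect.
    fn area(&self) -> f64;

    // The hint applies to this default body only.
    #[inline]
    fn double_area(&self) -> f64 {
        self.area() * 2.0
    }
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
    // `double_area` is inherited, so the `#[inline]` hint applies to it.
}

struct Unit;

impl Shape for Unit {
    fn area(&self) -> f64 {
        1.0
    }

    // This override is a separate function. The hint on the trait's default
    // does not extend to it, so it would need its own attribute.
    fn double_area(&self) -> f64 {
        2.0
    }
}
```

Either implementation behaves the same with or without the attribute; only the compiler's inlining decisions may differ.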
r[attributes.codegen.inline]
r[attributes.codegen.inline.intro]
The `inline` attribute suggests that a copy of the attributed function should be placed in the caller, rather than generating code to call the function where it is defined.
Note: The `rustc` compiler automatically inlines functions based on internal heuristics. Incorrectly inlining functions can make the program slower, so this attribute should be used with care.
r[attributes.codegen.inline.modes]
There are three ways to use the inline attribute:
- `#[inline]` suggests performing an inline expansion.
- `#[inline(always)]` suggests that an inline expansion should always be performed.
- `#[inline(never)]` suggests that an inline expansion should never be performed.
Note: `#[inline]` in every form is a hint, with no requirements on the language to place a copy of the attributed function in the caller.
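The three forms might look as follows in practice. This is only a sketch: the function names are hypothetical, and since every form is a hint, nothing here guarantees what the compiler will actually do.

```rust
// Hypothetical helpers; the names are illustrative only.

// A plain hint: suggests that inline expansion be performed.
#[inline]
fn add_one(x: u32) -> u32 {
    x.wrapping_add(1)
}

// A stronger hint: suggests that inline expansion always be performed.
#[inline(always)]
fn double(x: u32) -> u32 {
    x.wrapping_mul(2)
}

// The opposite hint: suggests that this function never be inlined.
#[inline(never)]
fn checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &byte| acc.wrapping_add(u32::from(byte)))
}
```

The attributes do not change what the functions compute; `double(21)` is `42` regardless of whether a copy of `double` ends up in the caller.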
r[attributes.codegen.cold]
The `cold` attribute suggests that the attributed function is unlikely to be called.
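A common place for this hint is an error path. The sketch below assumes a hypothetical `parse_digit` function whose failure branch is expected to be rare; both function names are invented for illustration.

```rust
// Hypothetical error path, marked as unlikely to be called.
#[cold]
fn describe_failure(input: &str) -> String {
    format!("not a single decimal digit: {input:?}")
}

// Parses a value in the range 0..=9, delegating failures to the cold path.
fn parse_digit(input: &str) -> Result<u32, String> {
    match input.parse::<u32>() {
        Ok(n) if n < 10 => Ok(n),
        _ => Err(describe_failure(input)),
    }
}
```

The attribute is a hint about how likely the call is, so the function still runs normally whenever the rare branch is taken.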
r[attributes.codegen.no_builtins]
The `no_builtins` attribute may be applied at the crate level to disable optimizing certain code patterns to invocations of library functions that are assumed to exist.
r[attributes.codegen.target_feature]
r[attributes.codegen.target_feature.intro]
The `target_feature` attribute may be applied to a function to enable code generation of that function for specific platform architecture features. It uses the MetaListNameValueStr syntax with a single key of `enable` whose value is a string of comma-separated feature names to enable.
# #[cfg(target_feature = "avx2")]
#[target_feature(enable = "avx2")]
unsafe fn foo_avx2() {}
r[attributes.codegen.target_feature.arch]
Each target architecture has a set of features that may be enabled. It is an error to specify a feature for a target architecture that the crate is not being compiled for.
r[attributes.codegen.target_feature.target-ub]
It is undefined behavior to call a function that is compiled with a feature that is not supported on the current platform the code is running on, except if the platform explicitly documents this to be safe.
r[attributes.codegen.target_feature.inline]
Functions marked with `target_feature` are not inlined into a context that does not support the given features. The `#[inline(always)]` attribute may not be used with a `target_feature` attribute.
r[attributes.codegen.target_feature.availability]
The following is a list of the available feature names.
r[attributes.codegen.target_feature.x86]
Executing code with unsupported features is undefined behavior on this platform. Hence this platform requires that `#[target_feature]` is only applied to `unsafe` functions.
Feature | Implicitly Enables | Description |
---|---|---|
`adx` | | ADX --- Multi-Precision Add-Carry Instruction Extensions |
`aes` | `sse2` | AES --- Advanced Encryption Standard |
`avx` | `sse4.2` | AVX --- Advanced Vector Extensions |
`avx2` | `avx` | AVX2 --- Advanced Vector Extensions 2 |
`bmi1` | | BMI1 --- Bit Manipulation Instruction Sets |
`bmi2` | | BMI2 --- Bit Manipulation Instruction Sets 2 |
`cmpxchg16b` | | `cmpxchg16b` --- Compare and exchange 16 bytes (128 bits) of data atomically |
`f16c` | `avx` | F16C --- 16-bit floating point conversion instructions |
`fma` | `avx` | FMA3 --- Three-operand fused multiply-add |
`fxsr` | | `fxsave` and `fxrstor` --- Save and restore x87 FPU, MMX Technology, and SSE State |
`lzcnt` | | `lzcnt` --- Leading zeros count |
`movbe` | | `movbe` --- Move data after swapping bytes |
`pclmulqdq` | `sse2` | `pclmulqdq` --- Packed carry-less multiplication quadword |
`popcnt` | | `popcnt` --- Count of bits set to 1 |
`rdrand` | | `rdrand` --- Read random number |
`rdseed` | | `rdseed` --- Read random seed |
`sha` | `sse2` | SHA --- Secure Hash Algorithm |
`sse` | | SSE --- Streaming SIMD Extensions |
`sse2` | `sse` | SSE2 --- Streaming SIMD Extensions 2 |
`sse3` | `sse2` | SSE3 --- Streaming SIMD Extensions 3 |
`sse4.1` | `ssse3` | SSE4.1 --- Streaming SIMD Extensions 4.1 |
`sse4.2` | `sse4.1` | SSE4.2 --- Streaming SIMD Extensions 4.2 |
`ssse3` | `sse3` | SSSE3 --- Supplemental Streaming SIMD Extensions 3 |
`xsave` | | `xsave` --- Save processor extended states |
`xsavec` | | `xsavec` --- Save processor extended states with compaction |
`xsaveopt` | | `xsaveopt` --- Save processor extended states optimized |
`xsaves` | | `xsaves` --- Save processor extended states supervisor |
r[attributes.codegen.target_feature.aarch64]
This platform requires that `#[target_feature]` is only applied to `unsafe` functions.
Further documentation on these features can be found in the ARM Architecture Reference Manual, or elsewhere on developer.arm.com.
Note: The following pairs of features should both be marked as enabled or disabled together if used:
- `paca` and `pacg`, which LLVM currently implements as one feature.
Feature | Implicitly Enables | Feature Name |
---|---|---|
`aes` | `neon` | FEAT_AES & FEAT_PMULL --- Advanced SIMD AES & PMULL instructions |
`bf16` | | FEAT_BF16 --- BFloat16 instructions |
`bti` | | FEAT_BTI --- Branch Target Identification |
`crc` | | FEAT_CRC --- CRC32 checksum instructions |
`dit` | | FEAT_DIT --- Data Independent Timing instructions |
`dotprod` | | FEAT_DotProd --- Advanced SIMD Int8 dot product instructions |
`dpb` | | FEAT_DPB --- Data cache clean to point of persistence |
`dpb2` | | FEAT_DPB2 --- Data cache clean to point of deep persistence |
`f32mm` | `sve` | FEAT_F32MM --- SVE single-precision FP matrix multiply instruction |
`f64mm` | `sve` | FEAT_F64MM --- SVE double-precision FP matrix multiply instruction |
`fcma` | `neon` | FEAT_FCMA --- Floating point complex number support |
`fhm` | `fp16` | FEAT_FHM --- Half-precision FP FMLAL instructions |
`flagm` | | FEAT_FlagM --- Conditional flag manipulation |
`fp16` | `neon` | FEAT_FP16 --- Half-precision FP data processing |
`frintts` | | FEAT_FRINTTS --- Floating-point to int helper instructions |
`i8mm` | | FEAT_I8MM --- Int8 Matrix Multiplication |
`jsconv` | `neon` | FEAT_JSCVT --- JavaScript conversion instruction |
`lse` | | FEAT_LSE --- Large System Extension |
`lor` | | FEAT_LOR --- Limited Ordering Regions extension |
`mte` | | FEAT_MTE & FEAT_MTE2 --- Memory Tagging Extension |
`neon` | | FEAT_FP & FEAT_AdvSIMD --- Floating Point and Advanced SIMD extension |
`pan` | | FEAT_PAN --- Privileged Access-Never extension |
`paca` | | FEAT_PAuth --- Pointer Authentication (address authentication) |
`pacg` | | FEAT_PAuth --- Pointer Authentication (generic authentication) |
`pmuv3` | | FEAT_PMUv3 --- Performance Monitors extension (v3) |
`rand` | | FEAT_RNG --- Random Number Generator |
`ras` | | FEAT_RAS & FEAT_RASv1p1 --- Reliability, Availability and Serviceability extension |
`rcpc` | | FEAT_LRCPC --- Release consistent Processor Consistent |
`rcpc2` | `rcpc` | FEAT_LRCPC2 --- RcPc with immediate offsets |
`rdm` | | FEAT_RDM --- Rounding Double Multiply accumulate |
`sb` | | FEAT_SB --- Speculation Barrier |
`sha2` | `neon` | FEAT_SHA1 & FEAT_SHA256 --- Advanced SIMD SHA instructions |
`sha3` | `sha2` | FEAT_SHA512 & FEAT_SHA3 --- Advanced SIMD SHA instructions |
`sm4` | `neon` | FEAT_SM3 & FEAT_SM4 --- Advanced SIMD SM3/4 instructions |
`spe` | | FEAT_SPE --- Statistical Profiling Extension |
`ssbs` | | FEAT_SSBS & FEAT_SSBS2 --- Speculative Store Bypass Safe |
`sve` | `fp16` | FEAT_SVE --- Scalable Vector Extension |
`sve2` | `sve` | FEAT_SVE2 --- Scalable Vector Extension 2 |
`sve2-aes` | `sve2`, `aes` | FEAT_SVE_AES --- SVE AES instructions |
`sve2-sm4` | `sve2`, `sm4` | FEAT_SVE_SM4 --- SVE SM4 instructions |
`sve2-sha3` | `sve2`, `sha3` | FEAT_SVE_SHA3 --- SVE SHA3 instructions |
`sve2-bitperm` | `sve2` | FEAT_SVE_BitPerm --- SVE Bit Permute |
`tme` | | FEAT_TME --- Transactional Memory Extension |
`vh` | | FEAT_VHE --- Virtualization Host Extensions |
r[attributes.codegen.target_feature.riscv]
This platform requires that `#[target_feature]` is only applied to `unsafe` functions.
Further documentation on these features can be found in their respective specification. Many specifications are described in the RISC-V ISA Manual or in another manual hosted on the RISC-V GitHub Account.
Feature | Implicitly Enables | Description |
---|---|---|
`a` | | A --- Atomic instructions |
`c` | | C --- Compressed instructions |
`m` | | M --- Integer Multiplication and Division instructions |
`zb` | `zba`, `zbc`, `zbs` | Zb --- Bit Manipulation instructions |
`zba` | | Zba --- Address Generation instructions |
`zbb` | | Zbb --- Basic bit-manipulation |
`zbc` | | Zbc --- Carry-less multiplication |
`zbkb` | | Zbkb --- Bit Manipulation Instructions for Cryptography |
`zbkc` | | Zbkc --- Carry-less multiplication for Cryptography |
`zbkx` | | Zbkx --- Crossbar permutations |
`zbs` | | Zbs --- Single-bit instructions |
`zk` | `zkn`, `zkr`, `zks`, `zkt`, `zbkb`, `zbkc`, `zbkx` | Zk --- Scalar Cryptography |
`zkn` | `zknd`, `zkne`, `zknh`, `zbkb`, `zbkc`, `zbkx` | Zkn --- NIST Algorithm suite extension |
`zknd` | | Zknd --- NIST Suite: AES Decryption |
`zkne` | | Zkne --- NIST Suite: AES Encryption |
`zknh` | | Zknh --- NIST Suite: Hash Function Instructions |
`zkr` | | Zkr --- Entropy Source Extension |
`zks` | `zksed`, `zksh`, `zbkb`, `zbkc`, `zbkx` | Zks --- ShangMi Algorithm Suite |
`zksed` | | Zksed --- ShangMi Suite: SM4 Block Cipher Instructions |
`zksh` | | Zksh --- ShangMi Suite: SM3 Hash Function Instructions |
`zkt` | | Zkt --- Data Independent Execution Latency Subset |
r[attributes.codegen.target_feature.wasm]
`#[target_feature]` may be used with both safe and `unsafe` functions on Wasm platforms. It is impossible to cause undefined behavior via the `#[target_feature]` attribute because attempting to use instructions unsupported by the Wasm engine will fail at load time without the risk of being interpreted in a way different from what the compiler expected.
Feature | Implicitly Enables | Description |
---|---|---|
`bulk-memory` | | WebAssembly bulk memory operations proposal |
`extended-const` | | WebAssembly extended const expressions proposal |
`mutable-globals` | | WebAssembly mutable global proposal |
`nontrapping-fptoint` | | WebAssembly non-trapping float-to-int conversion proposal |
`relaxed-simd` | `simd128` | WebAssembly relaxed simd proposal |
`sign-ext` | | WebAssembly sign extension operators proposal |
`simd128` | | WebAssembly simd proposal |
`multivalue` | | WebAssembly multivalue proposal |
`reference-types` | | WebAssembly reference-types proposal |
`tail-call` | | WebAssembly tail-call proposal |
r[attributes.codegen.target_feature.info]
r[attributes.codegen.target_feature.remark-cfg]
See the `target_feature` conditional compilation option for selectively enabling or disabling compilation of code based on compile-time settings. Note that this option is not affected by the `target_feature` attribute, and is only driven by the features enabled for the entire crate.
r[attributes.codegen.target_feature.remark-rt]
See the `is_x86_feature_detected` or `is_aarch64_feature_detected` macros in the standard library for runtime feature detection on these platforms.
Note: `rustc` has a default set of features enabled for each target and CPU. The CPU may be chosen with the `-C target-cpu` flag. Individual features may be enabled or disabled for an entire crate with the `-C target-feature` flag.
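Runtime detection is typically combined with `#[target_feature]` behind a safe wrapper. The sketch below is one possible arrangement, not a prescribed pattern: `sum`, `sum_avx2`, and `sum_fallback` are hypothetical names, and the AVX2 path is compiled only on `x86_64` so that the example also builds on other targets.

```rust
// Hypothetical AVX2-enabled variant. The body is ordinary Rust; the attribute
// only permits the compiler to use AVX2 instructions when generating it.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(values: &[u32]) -> u32 {
    values.iter().fold(0u32, |acc, &v| acc.wrapping_add(v))
}

// Portable variant, compiled with the crate-wide feature set.
fn sum_fallback(values: &[u32]) -> u32 {
    values.iter().fold(0u32, |acc, &v| acc.wrapping_add(v))
}

// Safe entry point that selects a variant at run time.
pub fn sum(values: &[u32]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was confirmed on the running CPU just above.
            return unsafe { sum_avx2(values) };
        }
    }
    sum_fallback(values)
}
```

Keeping the `unsafe` call behind the detection check confines the undefined-behavior risk described earlier to a single, audited call site.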
r[attributes.codegen.track_caller]
r[attributes.codegen.track_caller.allowed-positions]
The `track_caller` attribute may be applied to any function with `"Rust"` ABI with the exception of the entry point `fn main`.
r[attributes.codegen.track_caller.traits]
When applied to functions and methods in trait declarations, the attribute applies to all implementations. If the trait provides a default implementation with the attribute, then the attribute also applies to override implementations.
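The following sketch illustrates the trait rule with a hypothetical `Locate` trait (the trait, the `Marker` type, and the helper function are assumptions made for this example). The attribute appears only on the declaration, yet the implementation observes its caller's location.

```rust
use std::panic::Location;

// Hypothetical trait: the attribute is placed on the declaration only.
trait Locate {
    #[track_caller]
    fn caller_line(&self) -> u32;
}

struct Marker;

impl Locate for Marker {
    // No attribute here; it is applied through the trait declaration.
    fn caller_line(&self) -> u32 {
        Location::caller().line()
    }
}

// Returns the line reported by the method next to the line it was called on.
// Both values are produced on the same source line.
fn observed_and_actual() -> (u32, u32) {
    (Marker.caller_line(), line!())
}
```

With `rustc`, the two values returned by `observed_and_actual` are expected to match, because the call is statically dispatched. Calls through a trait object are subject to the limitations on function pointers and virtual calls.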
r[attributes.codegen.track_caller.extern]
When applied to a function in an `extern` block the attribute must also be applied to any linked implementations, otherwise undefined behavior results. When applied to a function which is made available to an `extern` block, the declaration in the `extern` block must also have the attribute, otherwise undefined behavior results.
r[attributes.codegen.track_caller.behavior]
Applying the attribute to a function `f` allows code within `f` to get a hint of the `Location` of the "topmost" tracked call that led to `f`'s invocation. At the point of observation, an implementation behaves as if it walks up the stack from `f`'s frame to find the nearest frame of an unattributed function `outer`, and it returns the `Location` of the tracked call in `outer`.
#[track_caller]
fn f() {
    println!("{}", std::panic::Location::caller());
}
Note: `core` provides `core::panic::Location::caller` for observing caller locations. It wraps the `core::intrinsics::caller_location` intrinsic implemented by `rustc`.
Note: Because the resulting `Location` is a hint, an implementation may halt its walk up the stack early. See Limitations for important caveats.
When `f` is called directly by `calls_f`, code in `f` observes its callsite within `calls_f`:
# #[track_caller]
# fn f() {
#     println!("{}", std::panic::Location::caller());
# }
fn calls_f() {
    f(); // <-- f() prints this location
}
When `f` is called by another attributed function `g` which is in turn called by `calls_g`, code in both `f` and `g` observes `g`'s callsite within `calls_g`:
# #[track_caller]
# fn f() {
#     println!("{}", std::panic::Location::caller());
# }
#[track_caller]
fn g() {
    println!("{}", std::panic::Location::caller());
    f();
}
fn calls_g() {
    g(); // <-- g() prints this location twice, once itself and once from f()
}
When `g` is called by another attributed function `h` which is in turn called by `calls_h`, all code in `f`, `g`, and `h` observes `h`'s callsite within `calls_h`:
# #[track_caller]
# fn f() {
#     println!("{}", std::panic::Location::caller());
# }
# #[track_caller]
# fn g() {
#     println!("{}", std::panic::Location::caller());
#     f();
# }
#[track_caller]
fn h() {
    println!("{}", std::panic::Location::caller());
    g();
}
fn calls_h() {
    h(); // <-- prints this location three times, once itself, once from g(), once from f()
}
And so on.
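The same chain can be checked programmatically by returning the observed `Location` values instead of printing them. This is a sketch with hypothetical function names (`inner`, `outer`, `untracked_root`); it relies on the behavior described above, which `rustc` provides for direct calls like these.

```rust
use std::panic::Location;

// Innermost attributed function: reports the location it observes.
#[track_caller]
fn inner() -> &'static Location<'static> {
    Location::caller()
}

// Attributed caller: reports its own observation and the one made by `inner`.
#[track_caller]
fn outer() -> (&'static Location<'static>, &'static Location<'static>) {
    (Location::caller(), inner())
}

// Unattributed function: the call below is the "topmost" tracked call.
fn untracked_root() -> (&'static Location<'static>, &'static Location<'static>) {
    outer()
}
```

Both values returned by `untracked_root` are expected to name the same file, line, and column: the `outer()` call inside `untracked_root`.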
r[attributes.codegen.track_caller.limits]
r[attributes.codegen.track_caller.hint]
This information is a hint and implementations are not required to preserve it.
r[attributes.codegen.track_caller.decay]
In particular, coercing a function with `#[track_caller]` to a function pointer creates a shim which appears to observers to have been called at the attributed function's definition site, losing actual caller information across virtual calls. A common example of this coercion is the creation of a trait object whose methods are attributed.
Note: The aforementioned shim for function pointers is necessary because `rustc` implements `track_caller` in a codegen context by appending an implicit parameter to the function ABI, but this would be unsound for an indirect call because the parameter is not a part of the function's type and a given function pointer type may or may not refer to a function with the attribute. The creation of a shim hides the implicit parameter from callers of the function pointer, preserving soundness.
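The loss of caller information can be observed by comparing a direct call with a call through a function pointer. The sketch below uses hypothetical names (`caller_line`, `direct_and_indirect`) and reflects how `rustc` currently behaves; because the location is only a hint, other implementations may differ.

```rust
use std::panic::Location;

#[track_caller]
fn caller_line() -> u32 {
    Location::caller().line()
}

// Returns (observed, actual) line pairs for a direct and an indirect call.
fn direct_and_indirect() -> ((u32, u32), (u32, u32)) {
    // Coercing to a function pointer routes calls through the shim.
    let ptr: fn() -> u32 = caller_line;
    // Each tuple is built on a single source line, so `line!()` names the
    // line of the call written next to it.
    let direct = (caller_line(), line!());
    let indirect = (ptr(), line!());
    (direct, indirect)
}
```

The direct call is expected to report the line it was written on, while the call through `ptr` reports a location at the definition of `caller_line` instead.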
r[attributes.codegen.instruction_set]
r[attributes.codegen.instruction_set.allowed-positions]
The `instruction_set` attribute may be applied to a function to control which instruction set the function will be generated for.
r[attributes.codegen.instruction_set.behavior]
This allows mixing more than one instruction set in a single program on CPU architectures that support it.
r[attributes.codegen.instruction_set.syntax]
It uses the MetaListPath syntax, and a path consisting of the architecture family name and instruction set name.
r[attributes.codegen.instruction_set.target-limits]
It is a compilation error to use the `instruction_set` attribute on a target that does not support it.
r[attributes.codegen.instruction_set.arm]
For the `ARMv4T` and `ARMv5te` architectures, the following are supported:
- `arm::a32` --- Generate the function as A32 "ARM" code.
- `arm::t32` --- Generate the function as T32 "Thumb" code.
#[instruction_set(arm::a32)]
fn foo_arm_code() {}
#[instruction_set(arm::t32)]
fn bar_thumb_code() {}
Using the `instruction_set` attribute has the following effects:
- If the address of the function is taken as a function pointer, the low bit of the address will be set to 0 (arm) or 1 (thumb) depending on the instruction set.
- Any inline assembly in the function must use the specified instruction set instead of the target default.