Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for NEON intrinsics #90972

Closed
Amanieu opened this issue Nov 17, 2021 · 20 comments
Closed

Tracking Issue for NEON intrinsics #90972

Amanieu opened this issue Nov 17, 2021 · 20 comments
Labels
A-SIMD Area: SIMD (Single Instruction Multiple Data) A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. O-AArch64 Armv8-A or later processors in AArch64 mode O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Amanieu
Copy link
Member

Amanieu commented Nov 17, 2021

Feature gate: #![feature(neon_intrinsics)]

This is a tracking issue for NEON SIMD intrinsics on ARM and AArch64. These intrinsics can be found in the std::arch::arm and std::arch::aarch64 modules.

Public API

The NEON intrinsics that are covered by this tracking issue consist of all the ARM/AArch64 intrinsics in those modules that start with the letter v. These are checked against the equivalent Clang intrinsics using automated verifiers.

Unresolved Questions

  • None yet.
@Amanieu Amanieu added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Nov 17, 2021
@sanxiyn sanxiyn added the O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state label Nov 17, 2021
@workingjubilee workingjubilee added the A-SIMD Area: SIMD (Single Instruction Multiple Data) label Nov 18, 2021
@SparrowLii
Copy link
Member

SparrowLii commented Nov 22, 2021

Could we clarify exactly what we're going to do? For example, the feature needs to be added to the rustc?

@Amanieu
Copy link
Member Author

Amanieu commented Nov 22, 2021

I created an updated list of intrinsics that are missing/broken compared to Clang and the ARM intrinsic database CSV in rust-lang/stdarch#1258. Thanks to @JamieCunliffe for writing the tool to automate the testing of the intrinsics.

AArch64

This looks pretty good. We are only missing:

  • ARMv8.3 complex number intrinsics.
  • ARMv8.4 dot product intrinsics.
  • All intrinsics which use f16 and bf16 types, due to lack of rustc support. (These are not listed in missing_aarch64.txt)
  • The vqshlu* intrinsics currently cause an ICE in LLVM. This needs to be investigated and fixed.

I think we can start the process of stabilizing the AArch64 intrinsics now, conditional on fixing the LLVM ICEs. The missing intrinsics can be added later.

ARM

The situation here is a bit uglier and I don't think we can stabilize this as it is.

  • Clang and GCC disagree on the type of poly64_t on A32: it's a signed integer in Clang and an unsigned integer in GCC. On AArch64 both compilers use an unsigned integer. I believe GCC has the correct behavior here, but I am not sure. cc @adamgemmell @JamieCunliffe
  • The same missing intrinsics in AArch64 are also missing here (ARMv8.3 complex number, ARMv8.4 dot product, f16, bf16). This is not a blocker for stabilization.
  • We are missing quite a few intrinsics ("Implemented in stdarch for A64 only, Clang supports both A32/A64"). This must be fixed prior to stabilization.
  • We have some intrinsics that seem to be incorrectly implemented ("Failing tests: stdarch has incorrect results compared to Clang"). This must be fixed prior to stabilization.
  • Finally we also have a different set of intrinsics which cause an ICE in LLVM. This also needs to be investigated and fixed.

@Amanieu
Copy link
Member Author

Amanieu commented Nov 22, 2021

As per the report above, I would like to propose the stabilization of all the currently implemented NEON intrinsics on AArch64 only.

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Nov 22, 2021

Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Nov 22, 2021
@JamieCunliffe
Copy link
Contributor

I would say GCC is correct here, in https://github.com/ARM-software/acle/blob/main/main/acle.rst#scalar-data-types poly64_t is an unsigned integer type.

For the LLVM ICE is there a bug for this? I'll take a look into it if no one else has.

@rfcbot
Copy link

rfcbot commented Dec 2, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Dec 2, 2021
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Dec 12, 2021
@rfcbot
Copy link

rfcbot commented Dec 12, 2021

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 15, 2021
update stdarch

2 commits in d219ad6..0716b22
2021-12-9 23:50:37 +0000 to 2021-12-14 16:17:57 +0100
 * Fix a bunch of typos ([Fix a bunch of typos stdarch#1267](rust-lang/stdarch#1267))
 * Stabilize armv8 neon instruction set on aarch64 ([Stabilize armv8 neon instruction set on aarch64 stdarch#1266](rust-lang/stdarch#1266))

The update stabilizes armv8 neon instructions on aarch64. rust-lang#90972
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 16, 2021
update stdarch

2 commits in d219ad6..0716b22
2021-12-9 23:50:37 +0000 to 2021-12-14 16:17:57 +0100
 * Fix a bunch of typos ([Fix a bunch of typos stdarch#1267](rust-lang/stdarch#1267))
 * Stabilize armv8 neon instruction set on aarch64 ([Stabilize armv8 neon instruction set on aarch64 stdarch#1266](rust-lang/stdarch#1266))

The update stabilizes armv8 neon instructions on aarch64. rust-lang#90972
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 16, 2021
update stdarch

2 commits in d219ad6..0716b22
2021-12-9 23:50:37 +0000 to 2021-12-14 16:17:57 +0100
 * Fix a bunch of typos ([Fix a bunch of typos stdarch#1267](rust-lang/stdarch#1267))
 * Stabilize armv8 neon instruction set on aarch64 ([Stabilize armv8 neon instruction set on aarch64 stdarch#1266](rust-lang/stdarch#1266))

The update stabilizes armv8 neon instructions on aarch64. rust-lang#90972
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 16, 2021
update stdarch

2 commits in d219ad6..0716b22
2021-12-9 23:50:37 +0000 to 2021-12-14 16:17:57 +0100
 * Fix a bunch of typos ([Fix a bunch of typos stdarch#1267](rust-lang/stdarch#1267))
 * Stabilize armv8 neon instruction set on aarch64 ([Stabilize armv8 neon instruction set on aarch64 stdarch#1266](rust-lang/stdarch#1266))

The update stabilizes armv8 neon instructions on aarch64. rust-lang#90972
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Dec 16, 2021
@Amanieu
Copy link
Member Author

Amanieu commented Dec 19, 2021

The next step is to stabilize the NEON intrinsics on ARM. Looking at the list of intrinsics which are failing the tests, I consider the following entries to be blockers:

  • # Failing tests: stdarch has incorrect results compared to Clang
  • # Implemented in stdarch for A64 only, Clang support both A32/A64

Once these are fixed, we should be able to move forward with stabilizing the ARM intrinsics and also resolve rust-lang/stdarch#1268.

@Shnatsel
Copy link
Member

rust-lang/stdarch#1266 has landed and shipped in 1.59. Should this issue be closed, or is there something else left to do?

@hkratz
Copy link
Contributor

hkratz commented Feb 25, 2022

rust-lang/stdarch#1266 has landed and shipped in 1.59. Should this issue be closed, or is there something else left to do?

Only aarch64/armv8 neon support was stabilized, this issue also tracks armv7 neon.

@marmeladema
Copy link
Contributor

I am trying to understand what exactly has been stabilized and what target feature is needed to use the simd intrinsics on aarch64 but when I go to https://doc.rust-lang.org/core/arch/aarch64/fn.vdupq_n_u8.html it says that the intrinsic is still unstable:

🔬 This is a nightly-only experimental API. (stdsimd [#48556](https://github.com/rust-lang/rust/issues/48556))

Additionally, it says that the target feature neon is required, however I tried and I am able to use them without declaring any target feature:

bash-5.1# cat neon.rs 
use std::arch::aarch64::*;

pub fn main() {
    let a = unsafe { vdupq_n_u8(7) };
    println!("a = {:?}", a);
}

bash-5.1# rustc neon.rs 

bash-5.1# ./neon 
a = uint8x16_t(7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7)

bash-5.1# rustc --version
rustc 1.59.0 (9d1b2106e 2022-02-23)

For posterity, here is a screenshot of what you can currently see for libcore aarch64 simd intrinsics:
aarch64_simd_intrinsics

@CryZe
Copy link
Contributor

CryZe commented Feb 25, 2022

This is a rustdoc bug. The intrinsics that are shared with the 32-bit ARM targets are basically partially stable; stable on AArch64, unstable on 32-bit. Rustdoc doesn't seem to be able to handle that. As to why you don't need to specify neon: It's an active by default target feature of AArch64. However even if it wasn't, you would still be able to unsafely call it, it would just be "UB" (I think) to call the instruction without having ensured (at least at runtime) the CPU actually supports it.

@marmeladema
Copy link
Contributor

marmeladema commented Feb 25, 2022

Thank you for your answer, that explains everything.

Even though, from a user perspective, I think it's very confusing. At least, it was very confusing for me.

Is there an issue opened for the rustdoc bug already? If not, I can create one.

For the neon target feature, it's a bit weird to advertise it in the doc as a regular target feature that looks like needed to be enabled. Would it be possible to add a note saying it's enabled by default for aarch64? I can provide the PR, I just don't know where to look.

@marmeladema
Copy link
Contributor

Additionally, I tried this small snippet:

use std::arch::aarch64::*;

#[target_feature(enable = "neon")]
unsafe fn aarch64_intrinsic() {
    let a = unsafe { vdupq_n_u8(7) };
    println!("a = {:?}", a);
}

pub fn main() {
        unsafe { aarch64_intrinsic() };
}

Which doesn't compile because the neon target feature is considered unstable:

error[E0658]: the target feature `neon` is currently unstable
 --> neon.rs:3:18
  |
3 | #[target_feature(enable = "neon")]
  |                  ^^^^^^^^^^^^^^^
  |
  = note: see issue #44839 <https://github.com/rust-lang/rust/issues/44839> for more information

error: aborting due to previous error

For more information about this error, try `rustc --explain E0658`.

Now, I understand even less what has been stabilized.

@workingjubilee workingjubilee added the O-AArch64 Armv8-A or later processors in AArch64 mode label Mar 23, 2022
geky added a commit to geky/gf256 that referenced this issue May 8, 2022
…table

Note, neon still requires a nightly compiler to enable, this just means we
don't need an explicit flag to detect this. And this code should "just work"
when neon becomes stable. The state of things is a bit confusing.

SIMD in Rust: rust-lang/rust#48556
NEON in Rust: rust-lang/rust#90972
@workingjubilee workingjubilee added the A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. label Mar 3, 2023
@IceTDrinker
Copy link

IceTDrinker commented May 3, 2023

Hello, any news on this?

And if contributions are possible to push towards stabilization what's currently missing?

@workingjubilee
Copy link
Member

workingjubilee commented May 3, 2023

@IceTDrinker Most NEON intrinsics were stabilized for AArch64. To resolve this for 32-bit Armv7 requires implementation, debugging their apparent lack of support, and any bugs with them mentioned in rust-lang/stdarch. In addition, some things are still not implemented, and to finish this out, we would want to implement the last barrage of intrinsics.

Basically, just covering everything mentioned in #90972 (comment)

@IceTDrinker
Copy link

@IceTDrinker Most NEON intrinsics were stabilized for AArch64. To resolve this for 32-bit Armv7 requires implementation, debugging their apparent lack of support, and any bugs with them mentioned in rust-lang/stdarch. In addition, some things are still not implemented, and to finish this out, we would want to implement the last barrage of intrinsics.

Basically, just covering everything mentioned in #90972 (comment)

Thanks, I was mainly asking for aarch64 intrisics, will take a look at the link and failing tests, but I'm afraid I don't have an arm32 machine to take a shot at fixing those (plus I'm more than a beginner contributing to a language/compiler)

@betelgeuse
Copy link
Contributor

rust-lang/stdarch#1266 has landed and shipped in 1.59. Should this issue be closed, or is there something else left to do?

Only aarch64/armv8 neon support was stabilized, this issue also tracks armv7 neon.

I would add this information to the issue title.

@tarcieri
Copy link
Contributor

It seems like this is the tracking issue for NEON on 32-bit ARM: #111800

@Amanieu
Copy link
Member Author

Amanieu commented Jul 2, 2024

Yes, this tracking issue is now complete and AArch32 NEON support is tracked in #111800.

@Amanieu Amanieu closed this as completed Jul 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-SIMD Area: SIMD (Single Instruction Multiple Data) A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. O-AArch64 Armv8-A or later processors in AArch64 mode O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests