Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cpu.rs: Refactor and Fix ARM/Aarch64 CPU features handling. #1100

Merged
merged 1 commit into from
Nov 18, 2020

Conversation

briansmith
Copy link
Owner

Presently on aarch64-apple-* GFp_armcap_P is always zero. That's wrong; the
assembly language code needs it to be set correctly, or else the most optimized
code paths (NEON and/or SHA-2 extensions) will never be chosen.

Refactor the code so that GFp_armcap_P is set correctly, and to make it easier to
understand and maintain.

This will enable more optimized implementations on aarch64-apple-* targets, whereas
before the lowest common denominator implementations were being used for any
features that did the feature detection in assembly language code instead of Rust.

Move the definition of GFp_armcap_P to Rust so wouldn't have to keep C and Rust
code for it in sync. Remove the fallback definitions of GFp_armcap_P that use the
".comm"; they would always be set to zero if they were ever used, which wouldn't
(necessarily) match the static feature set. Removing them makes it clearer that
those definitions aren't used.

@briansmith briansmith self-assigned this Nov 18, 2020
@briansmith briansmith force-pushed the b/aarch64-static-feature-detection branch from 9943ce8 to 66d8c5e Compare November 18, 2020 05:35
Presently on aarch64-apple-* `GFp_armcap_P` is always zero. That's wrong; the
assembly language code needs it to be set correctly, or else the most optimized
code paths (NEON and/or SHA-2 extensions) will never be chosen.

Refactor the code so that `GFp_armcap_P` is set correctly, and to make it easier to
understand and maintain.

This will enable more optimized implementations on aarch64-apple-* targets, whereas
before the lowest common denominator implementations were being used for any
features that did the feature detection in assembly language code instead of Rust.

Move the definition of `GFp_armcap_P` to Rust so wouldn't have to keep C and Rust
code for it in sync. Remove the fallback definitions of `GFp_armcap_P` that use the
".comm"; they would always be set to zero if they were ever used, which wouldn't
(necessarily) match the static feature set. Removing them makes it clearer that
those definitions aren't used.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant