Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<atomic>: Avoid checks for cmpxchg16b availability #4481

Closed
AlexGuteniev opened this issue Mar 15, 2024 · 3 comments · Fixed by #4751
Closed

<atomic>: Avoid checks for cmpxchg16b availability #4481

AlexGuteniev opened this issue Mar 15, 2024 · 3 comments · Fixed by #4751
Labels
fixed Something works now, yay! performance Must go faster

Comments

@AlexGuteniev
Copy link
Contributor

AlexGuteniev commented Mar 15, 2024

Ideally, we'd want drop it completely, but need to confirm if these CPUs are not needed to be supported. They can run on systems before Windows 8.1

Alternatively we may have CPU assumption based on a certain /arch mode. I don't know which, though. I'm sure AVX512 implies that, but looks like the bar can be lower.

@Alcaro
Copy link
Contributor

Alcaro commented Mar 15, 2024

Might as well repost from Discord.

According to assorted sources, every currently-existing CPU that supports AVX also supports CX16, with exception of VIA eden-x4. If ignoring a few other VIAs, everything with SSSE3 has CX16. It's also present in -march=x86-64-v2, and mandatory for w8+, so future CPUs without that instruction seem quite unlikely to me.

I'd say the first step should be to ask the VIA folks if gcc's lack of that instruction is a gcc bug - this guy says AVX-enabled VIAs have atomic xmm moves (though I wasn't able to find that llvm bug), and if that one's atomic, not supporting CX16 feels like a quite strange move to me.

If the instruction is absent, and supporting that CPU is deemed valuable, then the only /arch that guarantees CX16 is avx512, which is (to my knowledge) a quite rare value, hardly worth optimizing for. Perhaps the best solution is simply wait until w7 departs.

(Probably not far off. Rest in peace.)

@StephanTLavavej StephanTLavavej added performance Must go faster blocked Something is preventing work on this vNext Breaks binary compatibility and removed blocked Something is preventing work on this labels Mar 20, 2024
@StephanTLavavej
Copy link
Member

At this time, we definitely still need to support ancient CPUs, e.g. those that don't have SSE4.2. (There was a recent compiler back-end bug regarding unguarded use of the popcnt instruction, which wrought havoc before it was fixed.)

I'm also not exactly sure of the precise timing of when this instruction appeared (with respect to increasing ISA feature sets) and when we'll be able to start assuming its presence. We talked about this at the weekly maintainer meeting, and this really seems like a vNext optimization, creating a clean baseline for performance going forward.

@StephanTLavavej StephanTLavavej removed the vNext Breaks binary compatibility label Jun 17, 2024
@StephanTLavavej
Copy link
Member

We can do this before vNext - I have the necessary changes, plus an argument for why it's reasonable to do. Stay tuned...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fixed Something works now, yay! performance Must go faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants