Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the shuffle1_dyn API #228

Open
3 tasks
gnzlbg opened this issue Mar 12, 2019 · 1 comment
Open
3 tasks

Update the shuffle1_dyn API #228

gnzlbg opened this issue Mar 12, 2019 · 1 comment
Labels
Bug Something isn't working Documentation Documentation issues Enhancement New feature or request

Comments

@gnzlbg
Copy link
Contributor

gnzlbg commented Mar 12, 2019

Currently, shuffle1_dyn is a safe function, but if the index is out-of-bounds its behavior is undefined: it will panic! in some archs, and return an unspecified lane value in others - this is a bug (allowing safe Rust to have UB).

  • We should add an unsafe shuffle1_dyn_unchecked API that has undefined behavior if the index is out-of-bounds and not 0x80.

  • We should change shuffle1_dyn to panic! if any index is out-of-bounds and not 0x80 (that is, shuffle1_dyn checks the indices and calls shuffle1_dyn_unchecked).

  • Both APIs should set the resulting lane to 0 if the index is 0x80 - we can provide a: const SHUFFLE1_DYN_ZERO: usize = 0x80; constant for this.

See WebAssembly/simd#68 , WebAssembly/simd#24, WebAssembly/simd#71

where @zeux mentioned:

x86_64: If bit 7 is 1, set to 0, otherwise use lower 4 bits for lookup
ARM: if index is out of range (0-15), set to 0
PPC64: use lower 4 bits for lookup (no out of range handling)
MIPS: if bit 6 or bit 7 is 1, set to 0; otherwise use lower 4 bits for lookup (or rather use lower 5 bits for lookup into a table that has 32 elements constructed from 2 input vectors, but if both vectors are the same then it effectively means bits 4,5 are ignored)
RISCV: if index is out of range (0-15), set to 0.

[...] if the specification says that all out-of-range indices return 0, then ARM and RISCV will use the native instruction, x86_64/PPC64/MIPS will have to generate additional checks using a 3-instruction sequence like "compare with 15; shuffle; andnot".

we could say that if index is 0x80 we return 0, if index is 0-15 we return the index result, otherwise the resulting byte is unspecified. This seems like the cleanest way to unify x86_64 behavior with the rest, resulting in zero-cost implementation for x86_64, ARM, MIPS, RISCV, and a 3-instruction sequence (compare to 0x80, shuffle, andnot) on PPC64.

@gnzlbg gnzlbg added Bug Something isn't working Enhancement New feature or request Documentation Documentation issues labels Mar 12, 2019
@reinerp
Copy link

reinerp commented Dec 29, 2019

We should add an unsafe shuffle1_dyn_unchecked API that has undefined behavior if the index is out-of-bounds and not 0x80.

We should change shuffle1_dyn to panic! if any index is out-of-bounds and not 0x80 (that is, shuffle1_dyn checks the indices and calls shuffle1_dyn_unchecked).

Both APIs should set the resulting lane to 0 if the index is 0x80 - we can provide a: const SHUFFLE1_DYN_ZERO: usize = 0x80; constant for this.Both APIs should set the resulting lane to 0 if the index is 0x80 - we can provide a: const SHUFFLE1_DYN_ZERO: usize = 0x80; constant for this.

I'd prefer the following, which is equally as consistent across architectures, and is defined in more cases: instead of the "return-zero" case being ==0x80, instead make it >=0x80. On all of x86_64, ARM, MIPS, RISCV it has a zero-cost implementation. On PPC64 I'd guess (although I don't know that ISA) it still has a 3-instruction sequence (compare to 0x80, shuffle, andnot), but this time the comparison is >= instead of ==.

With >=0x80 the implementation is defined in more cases, which is more useful for users (admittedly it constrains future hardware implementors). An especially useful case is that we'd like an index of 0xFF to return zero; this comes up naturally when ORing with m8 mask objects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working Documentation Documentation issues Enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants