-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[API Proposal]: Vector128.ShuffleUnsafe #81609
Comments
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics Issue DetailsBackground and motivationWhen implementing vectorized algorithms using the new
The solution in both cases is to use the architecture-specific instruction directly, often writing a helper such as private static Vector128<byte> Shuffle(Vector128<byte> vector, Vector128<byte> indices)
{
return Ssse3.IsSupported
? Ssse3.Shuffle(vector, indices)
: AdvSimd.Arm64.VectorTableLookup(vector, indices);
} (as of #80963, we already have 4 such helpers in runtime) I propose adding a API Proposalnamespace System.Runtime.Intrinsics;
public static class Vector128
{
// Existing
public static Vector128<byte> Shuffle(Vector128<byte> vector, Vector128<byte> indices);
// Proposed
public static Vector128<byte> ShuffleUnsafe(Vector128<byte> vector, Vector128<byte> indices);
} API Usage-Vector128<byte> bitMask = Shuffle(bitmapLookup, lowNibbles);
-Vector128<byte> bitPositions = Shuffle(Vector128.Create(0x8040201008040201).AsByte(),
+Vector128<byte> bitMask = Vector128.ShuffleUnsafe(bitmapLookup, lowNibbles);
+Vector128<byte> bitPositions = Vector128.ShuffleUnsafe(Vector128.Create(0x8040201008040201).AsByte(), highNibbles);
-private static Vector128<byte> Shuffle(Vector128<byte> vector, Vector128<byte> indices)
-{
- return Ssse3.IsSupported
- ? Ssse3.Shuffle(vector, indices)
- : AdvSimd.Arm64.VectorTableLookup(vector, indices);
-} Alternative DesignsNo response RisksNo response
|
namespace System.Runtime.Intrinsics;
public partial class Vector64
{
// Existing
// public static Vector64<byte> Shuffle(Vector64<byte> vector, Vector64<byte> indices);
public static Vector64<byte> ShuffleUnsafe(Vector64<byte> vector, Vector64<byte> indices);
}
public partial class Vector128
{
// Existing
// public static Vector128<byte> Shuffle(Vector128<byte> vector, Vector128<byte> indices);
public static Vector128<byte> ShuffleUnsafe(Vector128<byte> vector, Vector128<byte> indices);
}
public partial class Vector256
{
// Existing
// public static Vector256<byte> Shuffle(Vector256<byte> vector, Vector256<byte> indices);
public static Vector256<byte> ShuffleUnsafe(Vector256<byte> vector, Vector256<byte> indices);
}
public partial class Vector512
{
// Existing
// public static Vector512<byte> Shuffle(Vector512<byte> vector, Vector512<byte> indices);
public static Vector512<byte> ShuffleUnsafe(Vector512<byte> vector, Vector512<byte> indices);
} |
This won't make .NET 8 and isn't critical enough to warrant a bar check given there are several trivial workarounds. We can finish up the PR once CI is passing anytime after main opens for .NET 9 changes next month |
Background and motivation
When implementing vectorized algorithms using the new
Vector128
and friends APIs,Vector128.Shuffle
is a case where one sometimes has to drop back down to using architecture-specific instructions.Vector128.Shuffle
guarantees that results for out-of-range indices (values above 15) are normalized to 0.If the JIT can't prove that indices are already in the [0, 15] range, it is forced to emit extra logic to do the normalization.
For some algorithms, you may also want exactly the behavior of the underlying intrinsic (e.g. X86's PSHUFB will treat indices MOD 16, and set the result to 0 if the high bit is set).
The solution in both cases is to use the architecture-specific instruction directly, often writing a helper such as
(We already have 5 such helpers in runtime)
I propose adding a
Vector128.ShuffleUnsafe
helper that does not guarantee identical behavior for out-of-range indices across platforms. Instead, it would be defined as using the underlying intrinsic directly when available.API Proposal
API Usage
Alternative Designs
No response
Risks
No response
The text was updated successfully, but these errors were encountered: