[Analyzer] Suggest more optimal SIMD patterns where applicable #82488
Comments
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
Issue Details
Writing SIMD code can be complex and the "optimal" pattern can vary per platform/architecture. This can get more complicated when newer ISAs are also available for use.
As such, we should provide a code fixer that identifies such patterns and suggests simple transformations to alternative methods that are likely to be better performing.
Category: Performance
Example
A simple example is given below. The exact scenarios supported/recognized will likely grow over time based on input from SIMD experts:
- return value * 2;
+ return value + value;
-or-
- return Sse.Shuffle(left, right, 0b11_11_10_10);
+ return Sse.UnpackHigh(left, right);
Some of these are patterns that the JIT could recognize and optimize implicitly, but that can be equally complex. Recognizing and suggesting the fixes in C# is often simpler/faster.
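The first rewrite above can be sketched with the portable `Vector128<T>` operators. This is only an illustration: the class and method names (`DoublePattern`, `TimesTwo`, `AddSelf`) are hypothetical, it assumes .NET 7 or later (where `Vector128<T>` exposes these operators), and it makes no claim about which form any particular JIT version compiles better.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical helper type, used only to put the two forms side by side.
static class DoublePattern
{
    // Original pattern: multiply every lane by the scalar 2.
    public static Vector128<int> TimesTwo(Vector128<int> value) => value * 2;

    // Suggested pattern: add the vector to itself, lane by lane.
    public static Vector128<int> AddSelf(Vector128<int> value) => value + value;
}
```

For any input, both methods should produce the same lanes (integer overflow wraps the same way in both), which is what would make such a suggestion safe to apply mechanically.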
Category: Performance
For code that replaces calls into intrinsics this seems fine, such as:
- Sse.Shuffle(value, value, 0b11_11_10_10);
+ Sse.UnpackHigh(value, value);
For more general code that doesn't involve calls to intrinsics we should consider recognizing those in the JIT instead, such as:
- value * 2;
+ value + value;
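To make the intrinsic rewrite concrete, here is a minimal sketch that puts both forms side by side. The class and method names (`UnpackPattern`, `ViaShuffle`, `ViaUnpackHigh`) are hypothetical; `Sse.Shuffle`, `Sse.UnpackHigh`, and `Sse.IsSupported` are the existing `System.Runtime.Intrinsics.X86` APIs. Note that the equivalence relies on both operands being the same vector: with two different inputs, this shuffle control selects `{left[2], left[2], right[3], right[3]}`, whereas `UnpackHigh` interleaves `{left[2], right[2], left[3], right[3]}`.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper type; callers are assumed to check Sse.IsSupported first.
static class UnpackPattern
{
    // Original pattern: the control byte picks lanes 2, 2 from the first
    // operand and lanes 3, 3 from the second operand.
    public static Vector128<float> ViaShuffle(Vector128<float> value)
        => Sse.Shuffle(value, value, 0b11_11_10_10);

    // Suggested pattern: interleave the upper halves of both operands.
    public static Vector128<float> ViaUnpackHigh(Vector128<float> value)
        => Sse.UnpackHigh(value, value);
}
```

On hardware where `Sse.IsSupported` is true, passing `Vector128.Create(1f, 2f, 3f, 4f)` to either method should give the lanes `3, 3, 4, 4`.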
Analyzers like this really cannot come soon enough. Even a cheat sheet would be useful just now.
The set of recommended patterns is likely to grow in the future.
I think it really depends on the operation in question and how likely it is for users to want to separate a given pattern out from the rest.
I do think #82486 is the more prominent/important one to handle first; while it's targeted around maintainability/portability, it also has a big impact on JIT codegen and factors into perf as well. The analyzer represented by this issue (82488) is then more about recommending alternative patterns that the underlying runtime may not support yet, or that may be available on newer ISAs the user hasn't considered yet, and that should be smaller or more performant. In some cases, the two analyzers may overlap; for example
It will ultimately come down to some level of brainstorming, but there are plenty of patterns to be handled, many of which are called out in the respective architecture optimization manuals or which other compilers like MSVC/Clang or even RyuJIT itself are already handling. Many of them may even be simplifications that are recommended so that we can keep the amount of runtime pattern recognition simpler. If the analyzer exists to tell users to call
#103557 is an example of a place where this analyzer applies. Simply in removing redundant |
Writing SIMD code can be complex and the "optimal" pattern can vary per platform/architecture. This can get more complicated when newer ISAs are also available for use.
As such, we should provide a code fixer that identifies such patterns and suggests simple transformations to alternative methods that are likely to be better performing.
Category: Performance
Severity = suggestion
Example
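A simple example is given below. The exact scenarios supported/recognized will likely grow over time based on input from SIMD experts:
- return value * 2;
+ return value + value;
-or-
- return Sse.Shuffle(left, right, 0b11_11_10_10);
+ return Sse.UnpackHigh(left, right);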
Some of these are patterns that the JIT could recognize and optimize implicitly, but that can be equally complex. Recognizing and suggesting the fixes in C# is often simpler/faster.