Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wasm] Optimize Vector128<float>/<double>.Equals in interp/jiterp #88064

Merged
merged 3 commits into from
Jul 12, 2023

Conversation

kg
Copy link
Member

@kg kg commented Jun 26, 2023

Calling Equals on vectors of floating point types produces extremely nasty interp code that compiles down to a huge blob of wasm, which makes it hard to write tests/measurements that actually fit into traces. This PR adds intrinsics for doing equality comparisons on R4/R8 vectors and then implements them in jiterp. The performance also improves slightly over what it was before since the generated code is much improved.

The c interp implementation could probably be improved on, though - perhaps if clang has an all bits set intrinsic and an all_true intrinsic?

@ghost
Copy link

ghost commented Jun 26, 2023

Tagging subscribers to this area: @BrzVlad, @kotlarmilos
See info in area-owners.md if you want to be subscribed.

Issue Details

Calling Equals on vectors of floating point types produces extremely nasty interp code that compiles down to a huge blob of wasm, which makes it hard to write tests/measurements that actually fit into traces. This PR adds intrinsics for doing equality comparisons on R4/R8 vectors and then implements them in jiterp. The performance also improves slightly over what it was before since the generated code is much improved.

The c interp implementation could probably be improved on, though - perhaps if clang has an all bits set intrinsic and an all_true intrinsic?

Author: kg
Assignees: -
Labels:

area-Codegen-Interpreter-mono

Milestone: -

@kg
Copy link
Member Author

kg commented Jul 11, 2023

@tannergooding any concerns about this change to Equals?

@lewing lewing requested a review from tannergooding July 11, 2023 03:51
@lewing
Copy link
Member

lewing commented Jul 11, 2023

coreclr failures do not apply, @kg merge at will. @tannergooding we welcome feedback before or after merge

Comment on lines +395 to +399
internal static bool EqualsFloatingPoint (Vector128<T> lhs, Vector128<T> rhs)
{
Vector128<T> result = Vector128.Equals(lhs, rhs) | ~(Vector128.Equals(lhs, lhs) | Vector128.Equals(rhs, rhs));
return result.AsInt32() == Vector128<int>.AllBitsSet;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just handle Equals directly? There are several APIs on Vector2/3/4 and Vector<T> that are [Intrinsic] instance methods, so I would expect Mono has the support for handling that already?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now we already vectorize the underlying Equals operation, the problem is that this one desugars (?) to a bunch of individual SIMD operations that each get their own interp opcode.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is, to be clear, if you call Vector128<float>.Equals(...) the interp generates something like this right now:

v128_equals_r4 (lhs, rhs)
v128_equals_r4 (lhs, lhs)
v128_equals_r4 (rhs, rhs)
v128_or stack
v128_not stack
v128_or stack
v128_load_allbitsset
v128_equals_i4 stack
v128_all_true

Copy link
Member

@tannergooding tannergooding Jul 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant why not mark the instance Equals method as [Intrinsic] and have Mono treat that to be the same as operator == for integers and to be a single opcode representing this sequence for float/double

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's what you prefer I can figure out how to do it. I wanted to keep this change as narrow as possible since the existing Equals method is fine for ints as-is. I'll see how much work it is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is, why split off a separate EqualsFloatingPoint at all, rather than simply handling its only caller directly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries if its overly complex to do. I just figured it would be better overall for both RyuJIT and Mono.

We actually used to do just what I've proposed in RyuJIT, but dropped that support a while back since the inliner was able to handle it and the instance equals calls were much rarer to encounter.

kg added 3 commits July 11, 2023 16:22
Add interp intrinsics for Vector128 float and double Equals methods
Implement Vector128 float and double Equals methods in jiterp
Add validation to make sure we never appendSimd(0) by accident
@kg kg force-pushed the interp-vec-fp-equals branch from 7954f7d to 4aaffd1 Compare July 12, 2023 00:02
@kg
Copy link
Member Author

kg commented Jul 12, 2023

Some of the PackedSimd changes on main broke this, so I had to update it. Good thing I ran my tests again :-)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants