Have gen* methods use the optimized BitOperations where possible #81697

Closed
wants to merge 1 commit into from

Conversation

tannergooding
Member

@tannergooding tannergooding commented Feb 6, 2023

The only real change here is that:

  • BitOperations::BitScanReverse doesn't require an "out" parameter, it just directly asserts "not zero"
  • BitOperations::BitScanForward is the same, which means it can use __builtin_ctz on Unix, leading to better codegen
  • genCountBits uses __builtin_popcount on Clang/GCC and the well-known bit twiddling hacks on MSVC (which is what Clang/GCC use for their builtin).
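As a rough illustration of the pattern described in the bullets above (not the actual code from this PR), the sketch below shows a bit-scan helper that returns its result directly and asserts a non-zero input, plus a population count that forwards to the compiler builtin on Clang/GCC and to the classic bit twiddling fallback elsewhere. The names `BitScanForwardSketch`, `PopCountFallback`, and `PopCountSketch` are hypothetical and are not the JIT's real signatures.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a BitScanForward-style helper: returns the index
// of the lowest set bit directly (no "out" parameter) and asserts non-zero.
inline uint32_t BitScanForwardSketch(uint32_t value)
{
    assert(value != 0);
#if defined(__GNUC__) || defined(__clang__)
    // __builtin_ctz is undefined for 0, which the assert above rules out.
    return static_cast<uint32_t>(__builtin_ctz(value));
#else
    // Portable fallback for this sketch only: shift until the low bit is set.
    uint32_t index = 0;
    while ((value & 1u) == 0)
    {
        value >>= 1;
        index++;
    }
    return index;
#endif
}

// The well-known parallel (SWAR) bit count for a 32-bit value.
inline uint32_t PopCountFallback(uint32_t v)
{
    v = v - ((v >> 1) & 0x55555555u);                 // 2-bit partial sums
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); // 4-bit partial sums
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 // 8-bit partial sums
    return (v * 0x01010101u) >> 24;                   // add the four bytes
}

// Hypothetical genCountBits-style wrapper: builtin where available,
// otherwise the fallback above.
inline uint32_t PopCountSketch(uint32_t v)
{
#if defined(__GNUC__) || defined(__clang__)
    return static_cast<uint32_t>(__builtin_popcount(v));
#else
    return PopCountFallback(v);
#endif
}
```

Asserting instead of reporting "was zero" through an out parameter moves the zero check to the caller, which is what lets the helper map onto a single builtin.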

The other functions are just being made consistent and now forward to a centralized implementation. In a couple of places, comments were added to clarify edge-case behavior that a given gen* method was assuming (e.g. genLog2 is not a real log2, but rather assumes the input has exactly one bit set).
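To make the genLog2 caveat concrete, here is a minimal sketch, assuming a helper that is only meaningful for inputs with exactly one bit set. For such inputs the index of that bit is the base-2 logarithm, and for anything else the precondition fails. `IsPow2Sketch` and `Log2OfSingleBitSketch` are hypothetical names, not the JIT's actual functions.

```cpp
#include <cassert>
#include <cstdint>

// True when exactly one bit is set (hypothetical helper for this sketch).
inline bool IsPow2Sketch(uint64_t value)
{
    return (value != 0) && ((value & (value - 1)) == 0);
}

// Not a general log2: the precondition is that exactly one bit is set,
// so the answer is simply the index of that bit.
inline uint32_t Log2OfSingleBitSketch(uint64_t value)
{
    assert(IsPow2Sketch(value));
    uint32_t index = 0;
    while ((value & 1u) == 0)
    {
        value >>= 1;
        index++;
    }
    return index;
}
```

For example, an input of `uint64_t(1) << 40` yields 40, while an input such as 6 (two bits set) violates the precondition rather than returning a rounded logarithm.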

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 6, 2023
@ghost ghost assigned tannergooding Feb 6, 2023
@ghost

ghost commented Feb 6, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

@tannergooding
Member Author

These were spotted when looking into ways to reduce TP for #79544 (comment)

Once this goes in, a separate PR that tweaks how LSRA consumes these can be tested.

@tannergooding tannergooding force-pushed the better-util branch 2 times, most recently from c29f736 to 0168e15 Compare February 6, 2023 15:36
@tannergooding
Member Author

tannergooding commented Feb 6, 2023

Actual impact is a bit hit-or-miss. Some platforms are better; others are worse by about the same amount.

There's probably a smaller subset that could be extracted, but it's not something I'm going to spend cycles digging into atm given how small the benefit is showing up as.

It's possible it's something small getting in the way, like inlining.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 8, 2023