Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Add a primitive to split statements at a specified tree #83005

Merged
merged 22 commits into from
Mar 11, 2023

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Mar 5, 2023

Add a gtSplitTree that can split a statement at an arbitrary tree: this creates new statements before the statement such that the original statement now executes the input tree as the very first non-invariant node.

This can be used for example to introduce control flow at an arbitrary point within a tree.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 5, 2023
@ghost ghost assigned jakobbotsch Mar 5, 2023
@ghost
Copy link

ghost commented Mar 5, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

null

Author: jakobbotsch
Assignees: jakobbotsch
Labels:

area-CodeGen-coreclr

Milestone: -

@BruceForstall
Copy link
Member

It seems totally reasonable to add this as one of the JIT stress modes if it is incorporated into the JIT.

Could we use gtSplitTree to remove all GT_COMMA by hoisting the op1 to its own statement? There are probably lots of optimizations that are doing tree-based pattern matching including commas that would generate worse code (or fail) as a result, but maybe as a stress mode?

@jakobbotsch
Copy link
Member Author

It seems totally reasonable to add this as one of the JIT stress modes if it is incorporated into the JIT.

Could we use gtSplitTree to remove all GT_COMMA by hoisting the op1 to its own statement? There are probably lots of optimizations that are doing tree-based pattern matching including commas that would generate worse code (or fail) as a result, but maybe as a stress mode?

Yes, I think I will end up PR'ing this primitive separately from #81635, and it seems like a good idea to have these stress modes. Indeed it can be used to remove all COMMA nodes, that would also be an interesting stress mode, I'll add it once this is in a more workable state.

@jakobbotsch
Copy link
Member Author

/azp run Fuzzlyn

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Also fix LIR links traversed initialization
@jakobbotsch
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@jakobbotsch jakobbotsch changed the title Stress testing gtSplitTree JIT: Add a primitive to split statements at a specified tree Mar 10, 2023
@jakobbotsch
Copy link
Member Author

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Mar 11, 2023

There are probably lots of optimizations that are doing tree-based pattern matching including commas that would generate worse code (or fail) as a result, but maybe as a stress mode?

In case you are curious, here are the diffs if we remove all commas before the optimization phases:
https://gist.github.com/jakobbotsch/e210230f6d154c7547ae0823ed32b230

The interference checking done by gtSplitTree in this PR is very rudimentary, we would need something closer to IsInvariantInRange in lowering if we actually wanted to try this for real (to avoid being conservative).

@BruceForstall
Copy link
Member

In case you are curious, here are the diffs

As expected... bad :-) If optimizations used SSA instead of tree-based pattern matching, maybe it wouldn't be so bad.

@jakobbotsch
Copy link
Member Author

Previous stress runs were clean except for #83298. I had to merge this to pick up #83224 that this was based on, but that shouldn't require a rerun of those pipelines.

cc @dotnet/jit-contrib PTAL @EgorBo

Comment on lines +5403 to +5413
if (gtSplitTree(block, stmt, tree, &newStmt, &use))
{
while ((newStmt != nullptr) && (newStmt != stmt))
{
fgMorphStmtBlockOps(block, newStmt);
newStmt = newStmt->GetNextStmt();
}

fgMorphStmtBlockOps(block, stmt);
gtUpdateStmtSideEffects(stmt);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EgorBo Your PR will need to do something similar like this, but note that calling fgMorphStmtBlockOps can invalidate the use, so it should be done only after you've replaced it with the new local.

@@ -18,7 +18,7 @@ bool Compiler::fgBlockContainsStatementBounded(BasicBlock* block,
Statement* stmt,
bool answerOnBoundExceeded /*= true*/)
{
const __int64 maxLinks = 1000000000;
const __int64 maxLinks = 100000000;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this limit was a bit on the high side before.

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've been following changes in this PR so LGTM

@jakobbotsch jakobbotsch merged commit 5265218 into dotnet:main Mar 11, 2023
@jakobbotsch jakobbotsch deleted the stress-split-tree branch March 11, 2023 21:11
@ghost ghost locked as resolved and limited conversation to collaborators Apr 11, 2023
@jakobbotsch
Copy link
Member Author

As expected... bad :-) If optimizations used SSA instead of tree-based pattern matching, maybe it wouldn't be so bad.

I realized while working on #89560 that the comma removal is more conservative than necessary -- it introduces a local for the comma's op2, which is unnecessary.

With that fixed the diffs look like https://gist.github.com/jakobbotsch/dd2ea6d7b94d9dbf80283191d98e2dd0, so it goes from "very bad" to "still very bad, but only half as very bad".

@EgorBo
Copy link
Member

EgorBo commented Jul 27, 2023

Wonder what happens in this small guy: System.Numerics.Vector2:get_Item(int):float:this (FullOpts)
image

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Jul 27, 2023

Wonder what happens in this small guy: System.Numerics.Vector2:get_Item(int):float:this (FullOpts)

Removal of commas changes

N009 ( 12, 14) [000008] ---XG------                           RETURN    float  $205
N008 ( 11, 13) [000007] ---XG------                         └──▌  HWINTRINSIC float  float GetElement <l:$2c3, c:$2c2>
N002 (  3,  2) [000001] ---XG------                            ├──▌  IND       simd8  <l:$240, c:$241>
N001 (  1,  1) [000000] -----------                              └──▌  LCL_VAR   byref  V00 this         u:1 (last use) $80
N007 (  7, 10) [000006] ---X-------                            └──▌  COMMA     int    $280
N005 (  6,  9) [000005] ---X-------                               ├──▌  BOUNDS_CHECK_ArgRng void   $203
N003 (  1,  1) [000004] -----------                                 ├──▌  LCL_VAR   int    V01 arg1         u:1 $c0
N004 (  1,  1) [000003] -----------                                 └──▌  CNS_INT   int    2 $44
N006 (  1,  1) [000002] -----------                               └──▌  LCL_VAR   int    V01 arg1         u:1 (last use) $c0

into

------------ BB01 [000..00D) (return), preds={} succs={}

***** BB01
STMT00001 ( 0x000[E-] ... ??? )
N003 (  7,  5) [000009] DA-XG------                           STORE_LCL_VAR simd8  V03 rat0         
N002 (  3,  2) [000001] ---XG------                         └──▌  IND       simd8  <l:$240, c:$241>
N001 (  1,  1) [000000] -----------                            └──▌  LCL_VAR   byref  V00 this         u:1 (last use) $80

***** BB01
STMT00002 ( ??? ... ??? )
N003 (  6,  9) [000005] ---X-------                           BOUNDS_CHECK_ArgRng void   $203
N001 (  1,  1) [000004] -----------                         ├──▌  LCL_VAR   int    V01 arg1         u:1 $c0
N002 (  1,  1) [000003] -----------                         └──▌  CNS_INT   int    2 $44

***** BB01
STMT00000 ( 0x000[E-] ... 0x00C )
N004 (  6,  5) [000008] ----G------                           RETURN    float  $205
N003 (  5,  4) [000007] ----G------                         └──▌  HWINTRINSIC float  float GetElement <l:$2c3, c:$2c2>
N001 (  3,  2) [000010] -----------                            ├──▌  LCL_VAR   simd8  V03 rat0         
N002 (  1,  1) [000002] -----------                            └──▌  LCL_VAR   int    V01 arg1         u:1 (last use) $c0

It actually looks like LowerHWIntrinsicGetElement is doing something illegal here in the baseline. It ends up reordering [000001] and [000005], which reorders a ArgumentOutOfRangeException and NullReferenceException.
Although in this case presumably [000001] is non-faulting, even though it's not marked as such.

Edit: opened #89565 for this.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants