Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/7.0] JIT: fix bug in cloning conditions for jagged array (#83414) #83462

Merged
merged 2 commits into from
Mar 31, 2023

Conversation

AndyAyersMS
Copy link
Member

@AndyAyersMS AndyAyersMS commented Mar 15, 2023

Backport of #83414 to release/7.0, fixes #83242

When checking that an inner array access is in bounds, we must ensure any outer access is fully in bounds too. We were checking that idx < array.Len but not that idx >= 0.

Use an unsigned compare for this check so we can do both sides with a single instruction.

Fixes #83242.

Customer Impact

Customer reported access violation when updating their app to run on .NET 7. Because of this bug they are unable to update.

When the JIT clones a loop it has to create a series of checks to decide if it executing the cloned version will be safe, since the clone will bypass bounds checks and the like -- the idea being to get all these checks out of the way before getting into the loop, so there is no need to repeatedly do them inside the loop .

These safety checks are ordered because some checks depend on others. For instance, we have to null check a possible array deference before validating the array length is suitable. In this case we were missing one check -- that the array index might be negative, and a subsequent array length check was dependent on that check, so we caused the customer app to AV in the code the added by the JIT to decide which version of the loop to run.

In .NET 6 the JIT included that missing check and a whole bunch of other redundant checks. We made a change during .NET 7 to prune away these redundant checks, but we did not realize one of them was not fully redundant and was safeguarding a later check.

As far as I know this bug will only surface if the method has a loop over the final dimension of a jagged array.

For example, the JIT may try and clone the following loop to produce a version with no bounds checks on a[ii][j].

for (int j = 0; j < 5; j++)
{
    if (ii >= 0)
    {
        sum += a[ii][j];
    }
}

To safely invoke this version the jit must do 5 checks: (a != null) && (ii >=0) && (ii < a.Length) && (a[ii] != null) && (4 < a[ii].Length). Note some of the latter checks are dependent on the earlier ones.

The bug fixed here is that in current .NET 7 codegen the (ii >= 0) check is missing from the above, so the later check of a[ii] may cause an out of bounds access, leading to an AV or other unpredictable behavior.

The fix is to logically add in that missing check; however, it turns out there is a well-known way to leverage unsigned compares to check both ii >= 0 (which we were missing) and ii < a.Length (which we were doing) with a single compare, so my fix just modifies the existing ii < a.Length check to do an unsigned comparison.

Testing

Verified the fix on a sample app provided by the customer. From that was able to distill a very simple repro which I've added as a new test case.

Risk

This is a regression introduced in .NET 7.

Risk is low. Requires a particular pattern of jagged array access in a loop. Small (~20) number of impacted methods seen in SPMI testing; typical change in codegen is to use an unsigned compare rather than a signed compare, so fix is quite surgical.

…3414)

Backport of #83414 to release/7.0, fixes #83242

When checking that an inner array access is in bounds, we must ensure any outer
access is fully in bounds too. We were checking that `idx < array.Len` but not
that `idx >= 0`.

Use an unsigned compare for this check so we can do both sides with a single
instruction.

Fixes #83242.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 15, 2023
@ghost ghost assigned AndyAyersMS Mar 15, 2023
@ghost
Copy link

ghost commented Mar 15, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

Backport of #83414 to release/7.0, fixes #83242

When checking that an inner array access is in bounds, we must ensure any outer access is fully in bounds too. We were checking that idx < array.Len but not that idx >= 0.

Use an unsigned compare for this check so we can do both sides with a single instruction.

Fixes #83242.

Customer Impact

Customer reported access violation when updating their app to run on .NET 7. Because of this bug they are unable to update.

Testing

Verified the fix on a sample app provided by the customer. From that was able to distill a very simple repro which I've added as a new test case.

Risk

This is a regression introduced in .NET 7.

Risk is low. Requires a particular pattern of jagged array access in a loop. Small (~20) number of impacted methods seen in SPMI testing; typical change in codegen is to use an unsigned compare rather than a signed compare, so fix is quite surgical.

Author: AndyAyersMS
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@BruceForstall PTAL. This was a manual backport.

@AndyAyersMS
Copy link
Member Author

cc @dotnet/jit-contrib

Copy link
Member

@jeffschwMSFT jeffschwMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved. we will take for consideration in 7.0.x

@jeffschwMSFT jeffschwMSFT added the Servicing-consider Issue for next servicing release review label Mar 15, 2023
@jeffschwMSFT jeffschwMSFT added this to the 7.0.x milestone Mar 15, 2023
@rbhanda rbhanda added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Mar 16, 2023
@rbhanda rbhanda modified the milestones: 7.0.x, 7.0.6 Mar 16, 2023
@danmoseley
Copy link
Member

&& (a[ii].Length < 5)

should this be > 4 ?

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Mar 16, 2023

&& (a[ii].Length < 5)

should this be > 4 ?

Yes. My mistake in transcribing -- the JIT gets this right. Internal jit output:

Deriving cloning conditions for L01
Conditions: (5 LE V00[V03].Length)
Array deref condition tree:
V00, level 0 => {
    V03, level 1 => {
    }
}
Block conditions:
0 = (V00 NE null)
1 = (V03 LTU V00.Length)
2 = (V00[V03] NE null)

(here V00 is a and V03 is ii).

@carlossanlop
Copy link
Member

I'm retargeting this PR to the new release/7.0-staging branch, which is the one that we will use from now on for servicing fixes.

Repo maintainers will now be allowed to merge their own servicing PR as long as it meets the requirements:

  • It is approved by Tactics (signaled by adding the Servicing-approved label).
  • It's signed-off by an area owner.
  • The CI is green, or the failures are investigated as unrelated.
  • And if the PR touches an OOB package, the necessary OOB authoring changes are added.

The new process is described here: runtime/docs/project/library-servicing.md.

The infra team will be actively monitoring servicing PRs to ensure all requirements are met and to help with any issues.

Let me know if you have any questions.

@AndyAyersMS
Copy link
Member Author

Failure is unrelated, opened #84130 for build analysis.

@AndyAyersMS
Copy link
Member Author

Let me know if you have any questions.

"check-labels" seems to be stuck?

@carlossanlop carlossanlop added Servicing-approved Approved for servicing release and removed Servicing-approved Approved for servicing release labels Mar 30, 2023
@carlossanlop
Copy link
Member

"check-labels" seems to be stuck?

Fixed by removing +readding Servicing-approved to retrigger it. You can merge now @AndyAyersMS.

Yesterday morning we were seeing some weird behaviors with GitHub Actions and this one in particular stopped working entirely. We made a couple of changes and it started working normally, but for affected PRs it seems that we need to manually retrigger it to get a valid result again (removing+readding the label). cc @hoyosjs

@AndyAyersMS AndyAyersMS merged commit 7cd68bc into release/7.0-staging Mar 31, 2023
@jkotas jkotas deleted the Fix83242Backport branch March 31, 2023 21:33
markples added a commit that referenced this pull request Apr 6, 2023
This should fix test builds with `set BuildAsStandalone=true`.

- [Remove](5f14f05) unconditional `<BuildAsStandalone>false</BuildAsStandalone>` from various HardwareIntrinsics tests. This is already the default via Directory.Build.props but interferes with `set BuildAsStandalone=true`.  (nit - also remove an empty `PropertyGroup` from GitHub_43569.csproj)
- [Remove `Main` methods](0a886f8). We might be able to pursue a strategy where the test author provides a `Main(args)` method for local custom execution. The test would either need a separate parameterless [Fact] method or a [Theory] plus data for `Main` for merged group to execute it. However, a RequiresProcessIsolation/BuildAsStandalone build would ignore that metadata, so I'd like to avoid that complexity. The Main methods here set a bool or parse an int, which can be added locally to any test as needed.
- [Add comment](e79a10c) to Directory.Merged.props (addressing feedback on #83462)
- [Add the Methodical test back into the superpmi test](7c61de8) using RequiresProcessIsolation

Fixes #76421
Fixes #81984
@ghost ghost locked as resolved and limited conversation to collaborators May 1, 2023
@rbhanda rbhanda modified the milestones: 7.0.6, 7.0.7 Jun 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Servicing-approved Approved for servicing release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants