windows arm/arm64: runtime-coreclr outerloop tests failing with stack overflow #56570

Closed
kunalspathak opened this issue Jul 29, 2021 · 9 comments · Fixed by #56585
Labels
area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
blocking-outerloop: Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs
untriaged: New issue has not been triaged by the area owner

Comments

@kunalspathak
Member

The majority of runtime-coreclr outerloop tests are failing with the following fundamental error on windows/arm64 R2R-CG2 and CoreCLR windows/arm:

Stack overflow.
Repeat 319 times:
at System.SR.GetResourceString(System.String)
at System.AccessViolationException..ctor()
at System.SR.InternalGetResourceString(System.String)
at System.SR.GetResourceString(System.String)
at System.AccessViolationException..ctor()
at System.Collections.Generic.List`1[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]..cctor()
at System.SR.InternalGetResourceString(System.String)
at System.SR.GetResourceString(System.String)
at System.AccessViolationException..ctor()
at System.SR..cctor()
at System.SR.GetResourceString(System.String)
at System.AccessViolationException..ctor()
at System.Collections.HashHelpers..cctor()
at System.Collections.HashHelpers.GetPrime(Int32)
at System.Collections.Generic.Dictionary`2[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Initialize(Int32)
at System.Collections.Generic.Dictionary`2[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]..ctor(Int32, System.Collections.Generic.IEqualityComparer`1<System.__Canon>)
at System.AppContext.Setup(Char**, Char**, Int32)

Here is the query for the failing tests: https://runfo.azurewebsites.net/search/tests/?q=started%3A%7E2+definition%3A655 . The tests were passing in https://dev.azure.com/dnceng/public/_build/results?buildId=1264415&view=results and started failing in https://dev.azure.com/dnceng/public/_build/results?buildId=1264799&view=results. Here are the changes that went in between 7b3e22b...db1b302:

[ 7b3e22b9777 ] 2021-07-28 17:34 radical@.. [wasm] Fix Publish for Blazorwasm projects on VS17 (#56432)
[ 39803d4d3cb ] 2021-07-28 22:51 jan.vorl.. Fix redhat arm64 (#52244)
[ beaea95307a ] 2021-07-28 13:50 lakshanf.. EventSource Manifest Trimmer test (#56463)
[ cd1b4cff818 ] 2021-07-28 13:02 tarekms@.. Fix fr-CA culture time formatting and parsing (#56443)
[ 5d03d42eefa ] 2021-07-28 13:55 danmose@.. Update area-owners.md (#56481)
[ ca908f5cff6 ] 2021-07-28 12:43 Kunal.Pa.. Assert if we find undefined use during interval validation (#56439)
[ b25bd29f9ee ] 2021-07-28 12:39 andya@mi.. JIT: properly update loop memory dependence when loops are removed (#56436)

I am guessing it is most likely because of #56436.

@kunalspathak
Member Author

@AndyAyersMS - can you please take a look and confirm?

@kunalspathak added the area-CodeGen-coreclr and blocking-outerloop labels on Jul 29, 2021
@dotnet-issue-labeler (bot) added the untriaged label on Jul 29, 2021
@AndyAyersMS
Member

Sure. It seems odd that, if it's this change, the behavior would be arch-specific.

@kunalspathak
Member Author

Another suspect would be 39803d4 by @janvorli in #52244.

@AndyAyersMS
Member

@janvorli I see what look like all-platform changes from your PR #52244 in arm/stubs.cpp, and changes in the Unix assembly, but nothing on the Windows assembly side...?

@janvorli
Member

Oops, I've left out the Windows arm asm part. I wonder how come the CI was green.
But this is just for arm; there were no arm64 changes.

@AndyAyersMS
Member

Pretty sure these are just arm tests failing, and not arm64 tests. The title is confusing though:

R2R-CG2 windows arm Checked no_tiered_compilation @ Windows.10.Arm64v8.Open

@kunalspathak
Member Author

Yes...I misread the failure titles...they are all windows/arm.

> I wonder how come the CI was green.

We should definitely add a P0 test that will be run in CI.

@BruceForstall
Member

> We should definitely add a P0 test that will be run in CI.

When Windows arm32 was removed from the supported platform set, we intentionally removed it from many testing runs. See #39655. We keep it minimally alive via outerloop testing to aid JIT developers (mostly), for whom working on Windows arm32 can be a better experience than working on Linux arm32.

@BruceForstall
Member

> The title is confusing though:

The format for jobs puts the Helix machine queue after the @ sign, and Windows arm32 jobs run on Windows arm64 machines, just as Linux arm32 runs on Linux arm64 machines, and Windows x86 on Windows x64 machines.
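
As a rough illustration of that naming convention, the job title quoted earlier in this thread can be split at the `@` sign into the test configuration and the Helix queue. This is only a sketch: the helper name `split_job_title` is hypothetical and not part of any dotnet tooling, and it assumes the queue is whatever follows the last `@`.

```python
from typing import Tuple


def split_job_title(title: str) -> Tuple[str, str]:
    """Split a job title into (configuration, Helix queue) at the last '@'.

    Hypothetical helper for illustration; assumes the queue name follows
    the final '@' in the title.
    """
    config, sep, queue = title.rpartition("@")
    if not sep:
        raise ValueError(f"no '@' in job title: {title!r}")
    return config.strip(), queue.strip()


title = "R2R-CG2 windows arm Checked no_tiered_compilation @ Windows.10.Arm64v8.Open"
config, queue = split_job_title(title)
print(config)  # the configuration under test (windows arm)
print(queue)   # the machine queue the job runs on (an arm64 machine)
```

Read this way, the failing job tests windows arm and merely runs on an arm64 queue.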

@ghost added the in-pr label (there is an active PR which will close this issue when it is merged) on Jul 29, 2021
@ghost removed the in-pr label on Jul 30, 2021
@ghost locked as resolved and limited conversation to collaborators on Aug 29, 2021
4 participants