jdk19 OpenJDK java/lang/Thread/virtual/stress/TimedGet.java crash vmState=0x00000000 #16729

Closed
pshipton opened this issue Feb 15, 2023 · 24 comments · Fixed by #16953
Labels
comp:vm jdk19 segfault Issues that describe segfaults / JVM crashes test failure
Milestone

Comments

@pshipton
Member

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_x86-64_mac_Nightly/110
jdk_lang_0
java/lang/Thread/virtual/stress/TimedGet.java

00:25:15  Unhandled exception
00:25:15  Type=Segmentation error vmState=0x00000000
00:25:15  Unhandled exception
00:25:15  Type=Segmentation error vmState=0x00000000
00:25:15  Unhandled exception
00:25:15  Type=Segmentation error vmState=0x00000000
00:25:15  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
00:25:15  Handler1=000000000C837C60 Handler2=000000000AF859F0 InaccessibleAddress=0000000000000003
00:25:15  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
00:25:15  Handler1=000000000C837C60 Handler2=000000000AF859F0 InaccessibleAddress=0000000000000003
00:25:15  Unhandled exception
00:25:15  Type=Segmentation error vmState=0x00000000
00:25:15  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
00:25:15  Handler1=000000000C837C60 Handler2=000000000AF859F0 InaccessibleAddress=0000000000000003
00:25:15  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
00:25:15  Handler1=000000000C837C60 Handler2=000000000AF859F0 InaccessibleAddress=0000000000000003
00:25:15  RDI=0000000000000015 RSI=0000000000000003 RAX=0000000000000000 RBX=0000700008653AE0
00:25:15  RCX=000000001023AB00 RDX=000000001023A6C8 R8=000000000000F232 R9=0000000000000014
00:25:15  R10=00007FA9F3100000 R11=0000000000000000 R12=0000700008653AF8 R13=00000000C49AA600
00:25:15  R14=0000700008653AF0 R15=000000001023A6F0
00:25:15  RDI=0000000000000015 RSI=0000000000000003 RAX=0000000000000000 RBX=0000700008A6BAE0
00:25:15  RCX=0000000010287000 RDX=0000000010286B58 R8=0000000000009003 R9=0000000000000014
00:25:15  R10=00007FA9F3300000 R11=0000000000000000 R12=0000700008A6BAF8 R13=00000000C49AAB78
00:25:15  R14=0000700008A6BAF0 R15=0000000010286B80
00:25:15  RDI=0000000000000015 RSI=0000000000000003 RAX=0000000000000000 RBX=00007000088E2AE0
00:25:15  RCX=000000001023D000 RDX=0000000010275128 R8=000000000000F7AA R9=0000000000000014
00:25:15  R10=00007FA9F3400000 R11=0000000000000000 R12=00007000088E2AF8 R13=00000000C49AAFD8
00:25:15  R14=00007000088E2AF0 R15=0000000010275150
00:25:15  RDI=0000000000000015 RSI=0000000000000003 RAX=0000000000000000 RBX=0000700008AEEAE0
00:25:15  RCX=000000001028E200 RDX=000000001028DD48 R8=000000000000464F R9=0000000000000014
00:25:15  R10=00007FA9F3B00000 R11=0000000000000000 R12=0000700008AEEAF8 R13=00000000C49AAC90
00:25:15  R14=0000700008AEEAF0 R15=000000001028DD70
00:25:15  RIP=000000000C8AD792 GS=0000 FS=0000 RSP=0000700008653840
00:25:15  RFlags=0000000000010246 CS=002B RBP=0000700008653AD0 ERR=0000000300000004
00:25:15  TRAPNO=000000040000000E CPU=0003000000040000 FAULTVADDR=0000000000000003
00:25:15  RIP=000000000C8AD792 GS=0000 FS=0000 RSP=0000700008A6B840
00:25:15  RFlags=0000000000010246 CS=002B RBP=0000700008A6BAD0 ERR=0000000300000004
00:25:15  TRAPNO=000000040000000E CPU=0003000000040000 FAULTVADDR=0000000000000003
00:25:15  RIP=000000000C8AD792 GS=0000 FS=0000 RSP=00007000088E2840
00:25:15  RFlags=0000000000010246 CS=002B RBP=00007000088E2AD0 ERR=0000000300000004
00:25:15  TRAPNO=000000040000000E CPU=0003000000040000 FAULTVADDR=0000000000000003
00:25:15  RIP=000000000C8AD792 GS=0000 FS=0000 RSP=0000700008AEE840
00:25:15  RFlags=0000000000010246 CS=002B RBP=0000700008AEEAD0 ERR=0000000300000004
00:25:15  TRAPNO=000000040000000E CPU=0003000000040000 FAULTVADDR=0000000000000003
00:25:15  XMM0 3f60624dd2f1a9fc (f: 3539053056.000000, d: 2.000000e-03)
00:25:15  XMM1 43e0000000000000 (f: 0.000000, d: 9.223372e+18)
00:25:15  XMM2 c3e0000000000000 (f: 0.000000, d: -9.223372e+18)
00:25:15  XMM3 4330000000000000 (f: 0.000000, d: 4.503600e+15)
00:25:15  XMM4 412e848000000000 (f: 0.000000, d: 1.000000e+06)
00:25:15  XMM5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM6 3fcb5b519e8fb5a4 (f: 2660218368.000000, d: 2.137243e-01)
00:25:15  XMM7 402fe2804e87b348 (f: 1317516160.000000, d: 1.594239e+01)
00:25:15  XMM8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM0 3f689374bc6a7efa (f: 3161095936.000000, d: 3.000000e-03)
00:25:15  XMM1 43e0000000000000 (f: 0.000000, d: 9.223372e+18)
00:25:15  XMM2 c3e0000000000000 (f: 0.000000, d: -9.223372e+18)
00:25:15  XMM3 4330000000000000 (f: 0.000000, d: 4.503600e+15)
00:25:15  XMM4 412e848000000000 (f: 0.000000, d: 1.000000e+06)
00:25:15  XMM5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM6 3fcaf3c94e80bff3 (f: 1317060608.000000, d: 2.105648e-01)
00:25:15  XMM7 402fe2804e87b348 (f: 1317516160.000000, d: 1.594239e+01)
00:25:15  XMM8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM0 3f60624dd2f1a9fc (f: 3539053056.000000, d: 2.000000e-03)
00:25:15  XMM1 43e0000000000000 (f: 0.000000, d: 9.223372e+18)
00:25:15  XMM2 c3e0000000000000 (f: 0.000000, d: -9.223372e+18)
00:25:15  XMM3 4330000000000000 (f: 0.000000, d: 4.503600e+15)
00:25:15  XMM4 412e848000000000 (f: 0.000000, d: 1.000000e+06)
00:25:15  XMM5 0000000000ca0000 (f: 13238272.000000, d: 6.540575e-317)
00:25:15  XMM6 3fcb5b519e8fb5a4 (f: 2660218368.000000, d: 2.137243e-01)
00:25:15  XMM7 402fe2804e87b348 (f: 1317516160.000000, d: 1.594239e+01)
00:25:15  XMM8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM0 3f689374bc6a7efa (f: 3161095936.000000, d: 3.000000e-03)
00:25:15  XMM1 43e0000000000000 (f: 0.000000, d: 9.223372e+18)
00:25:15  XMM2 c3e0000000000000 (f: 0.000000, d: -9.223372e+18)
00:25:15  XMM3 4330000000000000 (f: 0.000000, d: 4.503600e+15)
00:25:15  XMM4 412e848000000000 (f: 0.000000, d: 1.000000e+06)
00:25:15  XMM5 0000000000520000 (f: 5373952.000000, d: 2.655085e-317)
00:25:15  XMM6 3fcc8ff7c79a9a22 (f: 3348797952.000000, d: 2.231436e-01)
00:25:15  XMM7 402fe2804e87b348 (f: 1317516160.000000, d: 1.594239e+01)
00:25:15  XMM8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:25:15  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2023/02/15 00:20:47 - please wait.
00:25:15  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2023/02/15 00:20:47 - please wait.
00:25:15  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2023/02/15 00:20:47 - please wait.
00:25:15  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2023/02/15 00:20:47 - please wait.
00:25:15  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2023/02/15 00:20:47 - please wait.
00:25:15  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2023/02/15 00:20:47 - please wait.
00:25:15  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2023/02/15 00:20:47 - please wait.
00:25:15  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2023/02/15 00:20:47 - please wait.
00:25:15  Module=/Users/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_x86-64_mac_Nightly/openjdkbinary/j2sdk-image/lib/default/libj9vm29.dylib
00:25:15  Module_base_address=000000000C800000 Symbol=_ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread
00:25:15  Symbol_address=000000000C88F4D0
00:25:15  Target=2_90_20230214_179 (Mac OS X 10.15.7)
00:25:15  CPU=amd64 (12 logical CPUs) (0x400000000 RAM)
00:25:15  ----------- Stack Backtrace -----------
00:25:15  Module=/Users/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_x86-64_mac_Nightly/openjdkbinary/j2sdk-image/lib/default/libj9vm29.dylib
00:25:15  Module_base_address=000000000C800000 Symbol=_ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread
00:25:15  Symbol_address=000000000C88F4D0
00:25:15  Target=2_90_20230214_179 (Mac OS X 10.15.7)
00:25:15  CPU=amd64 (12 logical CPUs) (0x400000000 RAM)
00:25:15  ----------- Stack Backtrace -----------
00:25:15  Module=/Users/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_x86-64_mac_Nightly/openjdkbinary/j2sdk-image/lib/default/libj9vm29.dylib
00:25:15  Module_base_address=000000000C800000 Symbol=_ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread
00:25:15  Symbol_address=000000000C88F4D0
00:25:15  Target=2_90_20230214_179 (Mac OS X 10.15.7)
00:25:15  CPU=amd64 (12 logical CPUs) (0x400000000 RAM)
00:25:15  ----------- Stack Backtrace -----------
00:25:15  Module=/Users/jenkins/workspace/Test_openjdk19_j9_sanity.openjdk_x86-64_mac_Nightly/openjdkbinary/j2sdk-image/lib/default/libj9vm29.dylib
00:25:15  Module_base_address=000000000C800000 Symbol=_ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread
00:25:15  Symbol_address=000000000C88F4D0
00:25:15  Target=2_90_20230214_179 (Mac OS X 10.15.7)
00:25:15  CPU=amd64 (12 logical CPUs) (0x400000000 RAM)
00:25:15  ----------- Stack Backtrace -----------
00:25:15  _ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread+0x1e2c3 (0x000000000C8AD793 [libj9vm29.dylib+0xad793])
00:25:15  bytecodeLoopCompressed+0xb4 (0x000000000C88F4C4 [libj9vm29.dylib+0x8f4c4])
00:25:15  ---------------------------------------
00:25:15  _ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread+0x1e2c3 (0x000000000C8AD793 [libj9vm29.dylib+0xad793])
00:25:15  bytecodeLoopCompressed+0xb4 (0x000000000C88F4C4 [libj9vm29.dylib+0x8f4c4])
00:25:15  ---------------------------------------
00:25:15  _ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread+0x1e2c3 (0x000000000C8AD793 [libj9vm29.dylib+0xad793])
00:25:15  bytecodeLoopCompressed+0xb4 (0x000000000C88F4C4 [libj9vm29.dylib+0x8f4c4])
00:25:15  ---------------------------------------
00:25:15  _ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread+0x1e2c3 (0x000000000C8AD793 [libj9vm29.dylib+0xad793])
00:25:15  bytecodeLoopCompressed+0xb4 (0x000000000C88F4C4 [libj9vm29.dylib+0x8f4c4])
00:25:15  ---------------------------------------

@babsingh is this caused by #16678
@tajila @fengxue-IS fyi

@pshipton pshipton added comp:vm test failure segfault Issues that describe segfaults / JVM crashes jdk19 labels Feb 15, 2023
@pshipton pshipton added this to the Java 19 0.37 milestone Feb 15, 2023
@fengxue-IS
Contributor

This test has just been re-enabled by adoptium/aqa-tests#4337.
I did not see this failure during local or Grinder runs before re-enabling it. I will try to reproduce it to investigate further.

@babsingh
Contributor

babsingh commented Feb 15, 2023

@babsingh is this caused by #16678 ?

JDK19 impl stays the same in #16678. #16678 only renames a J9VMThread field in JDK19 to make clean up easier once JDK19 goes out of service. So, it should not cause this failure.

@pshipton
Member Author

@fengxue-IS
Copy link
Contributor

A Grinder run with -Xnocompressedrefs passes. Verifying whether this is due to a sub-4G memory issue.

@babsingh
Contributor

babsingh commented Mar 13, 2023

@pshipton @tajila This is a perf issue. Refer to #15184 for more details. -Xnocompressedrefs is a workaround. We had decided to address perf issues closer to the JEP's GA milestone. This issue should not block the JDK19 release.

@pshipton
Member Author

@babsingh how is a perf issue causing a crash?

@babsingh
Contributor

how is a perf issue causing a crash?

The stress test exhausts sub-4G memory by allocating Java stack for 250K virtual threads. The -Xnocompressedrefs workaround resolves this problem.
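A rough back-of-the-envelope sketch of why 250K stacks can exhaust the space below 4 GiB. The per-stack sizes used here (16 KiB and 32 KiB) are hypothetical values picked for illustration, not OpenJ9's actual defaults, and the helper names are made up. The sketch also assumes the whole low 4 GiB is available for stacks, which it is not in practice.

```cpp
#include <cstdint>

// Size of the address range addressable with 32 bits.
constexpr std::uint64_t kLow4G = 1ULL << 32;

// Upper bound on the number of equally sized stacks that fit below 4 GiB,
// assuming (unrealistically) that nothing else is allocated there.
constexpr std::uint64_t maxStacksBelow4G(std::uint64_t stackBytes) {
    return kLow4G / stackBytes;
}

// True if `threads` stacks of `stackBytes` each could fit below 4 GiB.
constexpr bool fitsBelow4G(std::uint64_t threads, std::uint64_t stackBytes) {
    return threads <= maxStacksBelow4G(stackBytes);
}

// With a hypothetical 16 KiB stack the bound is 262144 stacks, so 250K
// threads would only just fit; at 32 KiB they would not fit at all.
static_assert(maxStacksBelow4G(16 * 1024) == 262144, "4 GiB / 16 KiB");
static_assert(fitsBelow4G(250000, 16 * 1024), "fits only on paper");
static_assert(!fitsBelow4G(250000, 32 * 1024), "exceeds the low 4 GiB");
```

Since other allocations share the same range, the real limit is lower than this bound, which would be consistent with the "native memory exhausted" errors in the log.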

@tajila
Contributor

tajila commented Mar 14, 2023

The stress test exhausts sub-4G memory by allocating Java stack for 250K virtual threads. The -Xnocompressedrefs workaround resolves this problem.

Don't we throw OOM if we can't allocate the stack?

@babsingh
Contributor

Don't we throw OOM if we can't allocate the stack?

Yes, OOM is thrown multiple times:

00:25:15  JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "native memory exhausted" at 2023/02/15 00:20:47 - please wait.

@tajila
Contributor

tajila commented Mar 14, 2023

So what is the cause of the Unhandled exception?

@babsingh
Contributor

babsingh commented Mar 14, 2023

We reproduced the failure on @ehrenjulzert's machine last Friday but lldb didn't show line numbers for the source code. @ehrenjulzert can you try again to find the cause of the Unhandled exception?

@ehrenjulzert

Sure, I can try taking a look again today

@ehrenjulzert

I'm still not sure what's going on, but here's what we found so far:

Originally we were only getting this traceback in lldb, which didn't tell us much:

Process 82296 launched: '/Users/ehren/Documents/openj9-openjdk-jdk19/build/macosx-x86_64-server-slowdebug/images/jdk/bin/java' (x86_64)
libj9vm29.dylib was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 82296 stopped
* thread #46, stop reason = EXC_BAD_ACCESS (code=1, address=0x3)
    frame #0: 0x0000000001bcaa45 libj9vm29.dylib`VM_BytecodeInterpreterCompressed::run(this=0x0000700002d59ae0, vmThread=<unavailable>) at BytecodeInterpreter.hpp:0 [opt]
   1   	/*******************************************************************************
   2   	 * Copyright (c) 1991, 2023 IBM Corp. and others
   3   	 *
   4   	 * This program and the accompanying materials are made available under
   5   	 * the terms of the Eclipse Public License 2.0 which accompanies this
   6   	 * distribution and is available at https://www.eclipse.org/legal/epl-2.0/
   7   	 * or the Apache License, Version 2.0 which accompanies this distribution and
  thread #47, stop reason = EXC_BAD_ACCESS (code=1, address=0x3)
    frame #0: 0x0000000001bcaa45 libj9vm29.dylib`VM_BytecodeInterpreterCompressed::run(this=0x0000700002ddcae0, vmThread=<unavailable>) at BytecodeInterpreter.hpp:0 [opt]
   1   	/*******************************************************************************
   2   	 * Copyright (c) 1991, 2023 IBM Corp. and others
   3   	 *
   4   	 * This program and the accompanying materials are made available under
   5   	 * the terms of the Eclipse Public License 2.0 which accompanies this
   6   	 * distribution and is available at https://www.eclipse.org/legal/epl-2.0/
   7   	 * or the Apache License, Version 2.0 which accompanies this distribution and
  thread #48, stop reason = EXC_BAD_ACCESS (code=1, address=0x3)
    frame #0: 0x0000000001bcaa45 libj9vm29.dylib`VM_BytecodeInterpreterCompressed::run(this=0x0000700002e5fae0, vmThread=<unavailable>) at BytecodeInterpreter.hpp:0 [opt]
   1   	/*******************************************************************************
   2   	 * Copyright (c) 1991, 2023 IBM Corp. and others
   3   	 *
   4   	 * This program and the accompanying materials are made available under
   5   	 * the terms of the Eclipse Public License 2.0 which accompanies this
   6   	 * distribution and is available at https://www.eclipse.org/legal/epl-2.0/
   7   	 * or the Apache License, Version 2.0 which accompanies this distribution and
Target 0: (java) stopped.
(lldb) bt
* thread #46, stop reason = EXC_BAD_ACCESS (code=1, address=0x3)
  * frame #0: 0x0000000001bcaa45 libj9vm29.dylib`VM_BytecodeInterpreterCompressed::run(this=0x0000700002d59ae0, vmThread=<unavailable>) at BytecodeInterpreter.hpp:0 [opt]
    frame #1: 0x0000000001baab8b libj9vm29.dylib`::bytecodeLoopCompressed(currentThread=<unavailable>) at BytecodeInterpreter.inc:112:21 [opt]
    frame #2: 0x0000000001d0a052 libj9vm29.dylib`cInterpreter + 22
    frame #3: 0x0000000001b2ea69 libj9vm29.dylib`::runJavaThread(currentThread=<unavailable>) at callin.cpp:682:4 [opt]
    frame #4: 0x0000000001ba5d24 libj9vm29.dylib`javaProtectedThreadProc(portLibrary=0x00000000003471f8, entryarg=0x00000000182e0200) at vmthread.cpp:2093:3 [opt]
    frame #5: 0x000000000038a050 libj9prt29.dylib`omrsig_protect(portLibrary=0x00000000003471f8, fn=(libj9vm29.dylib`javaProtectedThreadProc(J9PortLibrary*, void*) at vmthread.cpp:2073), fn_arg=0x00000000182e0200, handler=(libj9vm29.dylib`structuredSignalHandler at gphandle.c:613), handler_arg=<unavailable>, flags=<unavailable>, result=0x0000700002d59f70) at omrsignal.c:425:12 [opt]
    frame #6: 0x0000000001ba5c38 libj9vm29.dylib`::javaThreadProc(entryarg=0x0000000001013420) at vmthread.cpp:372:2 [opt]
    frame #7: 0x00000000001b0623 libj9thr29.dylib`thread_wrapper(arg=0x00000000078225d8) at omrthread.c:1733:2 [opt]
    frame #8: 0x00007ff8083564e1 libsystem_pthread.dylib`_pthread_start + 125
    frame #9: 0x00007ff808351f6b libsystem_pthread.dylib`thread_start + 15

We tried changing the code in enterContinuationImpl to the following:

VMINLINE VM_BytecodeAction
	enterContinuationImpl(REGISTER_ARGS_LIST)
	{
		VM_BytecodeAction rc = EXECUTE_BYTECODE;

		j9object_t continuationObject = *(j9object_t*)_sp;

		buildInternalNativeStackFrame(REGISTER_ARGS);
		updateVMStruct(REGISTER_ARGS);

		/* Notify GC of Continuation stack swap */
		_vm->memoryManagerFunctions->preMountContinuation(_currentThread, continuationObject);

		BOOLEAN result = enterContinuation(_currentThread, continuationObject);

		VMStructHasBeenUpdated(REGISTER_ARGS);

		if (result) {
			_sendMethod = J9VMJDKINTERNALVMCONTINUATION_ENTER_METHOD(_currentThread->javaVM);
			rc = GOTO_RUN_METHOD;
		} else {
			restoreInternalNativeStackFrame(REGISTER_ARGS);
		}

		if (immediateAsyncPending()) {
			rc = GOTO_ASYNC_CHECK;
		} else if (VM_VMHelpers::exceptionPending(_currentThread)) {
			rc = GOTO_THROW_CURRENT_EXCEPTION;
		}

		return rc;
	}

This gave us a much nicer traceback, but it is still crashing in dropPendingSendPushes:

Process 83779 launched: '/Users/ehren/Documents/openj9-openjdk-jdk19/build/macosx-x86_64-server-slowdebug/images/jdk/bin/java' (x86_64)
libj9vm29.dylib was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 83779 stopped
* thread #48, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x0000000002c4445a libj9vm29.dylib`dropPendingSendPushes(currentThread=0x00000000282f8600) at drophelp.c:58:22 [opt]
   55  			} else {
   56  				J9Method * method = currentThread->literals;
   57  				J9ROMMethod * romMethod = J9_ROM_METHOD_FROM_RAM_METHOD(method);
-> 58  				UDATA slotCount = J9_ARG_COUNT_FROM_ROM_METHOD(romMethod) + J9_TEMP_COUNT_FROM_ROM_METHOD(romMethod);
   59  	
   60  	
   61  				if (romMethod->modifiers & J9AccSynchronized) {
  thread #49, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x0000000002c4445a libj9vm29.dylib`dropPendingSendPushes(currentThread=0x0000000028300d00) at drophelp.c:58:22 [opt]
   55  			} else {
   56  				J9Method * method = currentThread->literals;
   57  				J9ROMMethod * romMethod = J9_ROM_METHOD_FROM_RAM_METHOD(method);
-> 58  				UDATA slotCount = J9_ARG_COUNT_FROM_ROM_METHOD(romMethod) + J9_TEMP_COUNT_FROM_ROM_METHOD(romMethod);
   59  	
   60  	
   61  				if (romMethod->modifiers & J9AccSynchronized) {
  thread #50, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x0000000002c4445a libj9vm29.dylib`dropPendingSendPushes(currentThread=0x000000002830ef00) at drophelp.c:58:22 [opt]
   55  			} else {
   56  				J9Method * method = currentThread->literals;
   57  				J9ROMMethod * romMethod = J9_ROM_METHOD_FROM_RAM_METHOD(method);
-> 58  				UDATA slotCount = J9_ARG_COUNT_FROM_ROM_METHOD(romMethod) + J9_TEMP_COUNT_FROM_ROM_METHOD(romMethod);
   59  	
   60  	
   61  				if (romMethod->modifiers & J9AccSynchronized) {
Target 0: (java) stopped.
(lldb) bt
* thread #48, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x0000000002c4445a libj9vm29.dylib`dropPendingSendPushes(currentThread=0x00000000282f8600) at drophelp.c:58:22 [opt]
    frame #1: 0x0000000002c4458e libj9vm29.dylib`prepareForExceptionThrow(currentThread=0x00000000282f8600) at drophelp.c:85:6 [opt]
    frame #2: 0x0000000002cac4b5 libj9vm29.dylib`VM_BytecodeInterpreterCompressed::run(J9VMThread*) [inlined] VM_BytecodeInterpreterCompressed::throwException(this=0x0000700001f3fae0, _sp=0x0000700001f3f918, _pc=<unavailable>) at BytecodeInterpreter.hpp:2527:3 [opt]
    frame #3: 0x0000000002cac47b libj9vm29.dylib`VM_BytecodeInterpreterCompressed::run(this=0x0000700001f3fae0, vmThread=<unavailable>) at BytecodeInterpreter.hpp:10832:2 [opt]
    frame #4: 0x0000000002caab8b libj9vm29.dylib`::bytecodeLoopCompressed(currentThread=<unavailable>) at BytecodeInterpreter.inc:112:21 [opt]
    frame #5: 0x0000000002e0a052 libj9vm29.dylib`cInterpreter + 22
    frame #6: 0x0000000002c2ea69 libj9vm29.dylib`::runJavaThread(currentThread=<unavailable>) at callin.cpp:682:4 [opt]
    frame #7: 0x0000000002ca5d24 libj9vm29.dylib`javaProtectedThreadProc(portLibrary=0x00000000003471f8, entryarg=0x00000000282f8600) at vmthread.cpp:2093:3 [opt]
    frame #8: 0x000000000038a050 libj9prt29.dylib`omrsig_protect(portLibrary=0x00000000003471f8, fn=(libj9vm29.dylib`javaProtectedThreadProc(J9PortLibrary*, void*) at vmthread.cpp:2073), fn_arg=0x00000000282f8600, handler=(libj9vm29.dylib`structuredSignalHandler at gphandle.c:613), handler_arg=<unavailable>, flags=<unavailable>, result=0x0000700001f3ff70) at omrsignal.c:425:12 [opt]
    frame #9: 0x0000000002ca5c38 libj9vm29.dylib`::javaThreadProc(entryarg=0x000000000200b420) at vmthread.cpp:372:2 [opt]
    frame #10: 0x00000000001b0623 libj9thr29.dylib`thread_wrapper(arg=0x000000007580f3d8) at omrthread.c:1733:2 [opt]
    frame #11: 0x00007ff8083564e1 libsystem_pthread.dylib`_pthread_start + 125
    frame #12: 0x00007ff808351f6b libsystem_pthread.dylib`thread_start + 15

We also tried removing the immediateAsyncPending check, resulting in the following code:

VMINLINE VM_BytecodeAction
	enterContinuationImpl(REGISTER_ARGS_LIST)
	{
		VM_BytecodeAction rc = EXECUTE_BYTECODE;

		j9object_t continuationObject = *(j9object_t*)_sp;

		buildInternalNativeStackFrame(REGISTER_ARGS);
		updateVMStruct(REGISTER_ARGS);

		/* Notify GC of Continuation stack swap */
		_vm->memoryManagerFunctions->preMountContinuation(_currentThread, continuationObject);

		BOOLEAN result = enterContinuation(_currentThread, continuationObject);

		VMStructHasBeenUpdated(REGISTER_ARGS);

		if (result) {
			_sendMethod = J9VMJDKINTERNALVMCONTINUATION_ENTER_METHOD(_currentThread->javaVM);
			rc = GOTO_RUN_METHOD;
		} else {
			rc = GOTO_THROW_CURRENT_EXCEPTION;
		}

		return rc;
	}

When I ran this I got the same crash as with the first change we made, but it's weirdly inconsistent. Sometimes it crashes in dropPendingSendPushes as before, and sometimes it crashes in the JIT (which is what the following traceback shows):

* thread #6, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x0000000005d2d5fc libj9jit29.dylib`getOriginalROMMethodUnchecked(method=0x00000000469ccb98) at romhelp.c:32:26 [opt]
   29  	J9ROMMethod *
   30  	getOriginalROMMethodUnchecked(J9Method * method)
   31  	{
-> 32  		J9Class * methodClass = J9_CLASS_FROM_METHOD(method);
   33  		J9ROMClass * romClass = methodClass->romClass;
   34  		J9ROMMethod * romMethod = J9_ROM_METHOD_FROM_RAM_METHOD(method);
   35  		U_8 * bytecodes = J9_BYTECODE_START_FROM_ROM_METHOD(romMethod);
Target 0: (java) stopped.
(lldb) bt
* thread #6, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x0000000005d2d5fc libj9jit29.dylib`getOriginalROMMethodUnchecked(method=0x00000000469ccb98) at romhelp.c:32:26 [opt]
    frame #1: 0x0000000005d2d756 libj9jit29.dylib`getOriginalROMMethod(method=0x00000000469ccb98) at romhelp.c:65:14 [opt]
    frame #2: 0x0000000004a9f3ee libj9jit29.dylib`TR_J9Method::TR_J9Method(this=0x0000000090d80340, fe=0x0000600003510020, trMemory=0x0000700002b31af8, aMethod=0x00000000469ccb98) at j9method.cpp:1970:19 [opt]
    frame #3: 0x0000000004a9c40e libj9jit29.dylib`TR_ResolvedJ9Method::TR_ResolvedJ9Method(this=0x0000000090d80340, aMethod=0x00000000469ccb98, fe=0x0000600003510020, trMemory=0x0000700002b31af8, owner=0x0000700002b31c70, vTableSlot=0) at j9method.cpp:2005:6 [opt]
    frame #4: 0x0000000004a9abf1 libj9jit29.dylib`TR_J9VMBase::createResolvedMethodWithSignature(TR_Memory*, TR_OpaqueMethodBlock*, TR_OpaqueClassBlock*, char*, int, TR_ResolvedMethod*, unsigned int) [inlined] TR_ResolvedJ9Method::TR_ResolvedJ9Method(this=0x0000000090d80340, aMethod=0x00000000469ccb98, fe=0x0000600003510020, trMemory=0x0000700002b31af8, owner=0x0000700002b31c70, vTableSlot=<unavailable>) at j9method.cpp:2006:4 [opt]
    frame #5: 0x0000000004a9abd5 libj9jit29.dylib`TR_J9VMBase::createResolvedMethodWithSignature(this=0x0000600003510020, trMemory=0x0000700002b31af8, aMethod=0x00000000469ccb98, classForNewInstance=0x0000000000000000, signature=0x0000000000000000, signatureLength=-1, owningMethod=0x0000700002b31c70, vTableSlot=0) at j9method.cpp:241:47 [opt]
    frame #6: 0x0000000004a9aa75 libj9jit29.dylib`TR_J9VMBase::createResolvedMethod(this=<unavailable>, trMemory=<unavailable>, aMethod=<unavailable>, owningMethod=<unavailable>, classForNewInstance=<unavailable>) at j9method.cpp:204:11 [opt]
    frame #7: 0x0000000004aa513b libj9jit29.dylib`TR_ResolvedJ9Method::getResolvedDynamicMethod(this=<unavailable>, comp=0x0000000090d00000, callSiteIndex=<unavailable>, unresolvedInCP=<unavailable>, isInvokeCacheAppendixNull=<unavailable>) at j9method.cpp:6637:24 [opt]
    frame #8: 0x0000000004a58abe libj9jit29.dylib`J9::SymbolReferenceTable::findOrCreateDynamicMethodSymbol(this=0x0000000090d01c40, owningMethodSymbol=0x0000000090d04980, callSiteIndex=<unavailable>, unresolvedInCP=<unavailable>, isInvokeCacheAppendixNull=<unavailable>) at J9SymbolReferenceTable.cpp:440:75 [opt]
    frame #9: 0x0000000004af375a libj9jit29.dylib`TR_J9ByteCodeIlGenerator::genInvokeDynamic(this=0x0000000090d22320, callSiteIndex=1) at Walker.cpp:3181:60 [opt]
    frame #10: 0x0000000004aeef6c libj9jit29.dylib`TR_J9ByteCodeIlGenerator::walker(this=<unavailable>, prevBlock=0x0000000000000000) at Walker.cpp:454:13 [opt]
    frame #11: 0x0000000004adf87a libj9jit29.dylib`TR_J9ByteCodeIlGenerator::genILFromByteCodes(this=0x0000000090d22320) at IlGenerator.cpp:359:28 [opt]
    frame #12: 0x0000000004adeca6 libj9jit29.dylib`TR_J9ByteCodeIlGenerator::internalGenIL(this=<unavailable>) at IlGenerator.cpp:0 [opt] [artificial]
    frame #13: 0x0000000004ade6ab libj9jit29.dylib`TR_J9ByteCodeIlGenerator::genIL(this=0x0000000090d22320) at IlGenerator.cpp:144:19 [opt]
    frame #14: 0x0000000004d88b2b libj9jit29.dylib`OMR::ResolvedMethodSymbol::genIL(this=0x0000000090d04980, fe=0x0000600003510020, comp=0x0000000090d00000, symRefTab=0x0000000090d01c40, customRequest=0x0000700002b31ae8) at OMRResolvedMethodSymbol.cpp:1207:33 [opt]
    frame #15: 0x0000000004d70a62 libj9jit29.dylib`OMR::Compilation::compile(this=0x0000000090d00000) at OMRCompilation.cpp:1007:37 [opt]
    frame #16: 0x0000000004a7378e libj9jit29.dylib`TR::CompilationInfoPerThreadBase::compile(this=0x0000000010101310, vmThread=<unavailable>, compiler=0x0000000090d00000, compilee=0x0000700002b31c70, vm=0x0000600003510020, optimizationPlan=0x000000001857baf0, scratchSegmentProvider=0x0000700002b31b58) at CompilationThread.cpp:9967:26 [opt]
    frame #17: 0x0000000004a711d1 libj9jit29.dylib`TR::CompilationInfoPerThreadBase::wrappedCompile(portLib=<unavailable>, opaqueParameters=0x0000700002b31aa8) at CompilationThread.cpp:9460:24 [opt]
    frame #18: 0x000000000038a050 libj9prt29.dylib`omrsig_protect(portLibrary=0x00000000003471f8, fn=(libj9jit29.dylib`TR::CompilationInfoPerThreadBase::wrappedCompile(J9PortLibrary*, void*) at CompilationThread.cpp:8517), fn_arg=0x0000700002b31aa8, handler=(libj9jit29.dylib`jitSignalHandler(J9PortLibrary*, unsigned int, void*, void*) at CompilationThread.cpp:209), handler_arg=<unavailable>, flags=<unavailable>, result=0x0000700002b31a70) at omrsignal.c:425:12 [opt]
    frame #19: 0x0000000004a6c362 libj9jit29.dylib`TR::CompilationInfoPerThreadBase::compile(this=0x0000000010101310, vmThread=0x0000000020019800, entry=0x0000600003e34120, scratchSegmentProvider=<unavailable>) at CompilationThread.cpp:8445:31 [opt]
    frame #20: 0x0000000004a6ba1c libj9jit29.dylib`TR::CompilationInfoPerThread::processEntry(this=0x0000000010101310, entry=0x0000600003e34120, scratchSegmentProvider=<unavailable>) at CompilationThread.cpp:4668:20 [opt]
    frame #21: 0x0000000004a6ad89 libj9jit29.dylib`TR::CompilationInfoPerThread::processEntries(this=0x0000000010101310) at CompilationThread.cpp:4373:13 [opt]
    frame #22: 0x0000000004a6ab4a libj9jit29.dylib`TR::CompilationInfoPerThread::run(this=0x0000000010101310) at CompilationThread.cpp:4208:13 [opt]
    frame #23: 0x0000000004a6a900 libj9jit29.dylib`protectedCompilationThreadProc((null)=<unavailable>, compInfoPT=0x0000000010101310) at CompilationThread.cpp:4142:16 [opt]
    frame #24: 0x000000000038a050 libj9prt29.dylib`omrsig_protect(portLibrary=0x00000000003471f8, fn=(libj9jit29.dylib`protectedCompilationThreadProc(J9PortLibrary*, TR::CompilationInfoPerThread*) at CompilationThread.cpp:4069), fn_arg=0x0000000010101310, handler=(libj9vm29.dylib`structuredSignalHandler at gphandle.c:613), handler_arg=<unavailable>, flags=<unavailable>, result=0x0000700002b32f68) at omrsignal.c:425:12 [opt]
    frame #25: 0x0000000004a68cc6 libj9jit29.dylib`compilationThreadProc(entryarg=0x0000000010101310) at CompilationThread.cpp:4047:25 [opt]
    frame #26: 0x00000000001b0623 libj9thr29.dylib`thread_wrapper(arg=0x00000000020211d8) at omrthread.c:1733:2 [opt]
    frame #27: 0x00007ff8083564e1 libsystem_pthread.dylib`_pthread_start + 125
    frame #28: 0x00007ff808351f6b libsystem_pthread.dylib`thread_start + 15

Also, for some reason, adding -Xnocompressedrefs while running the 2nd version of the code (the one with the immediateAsyncPending check removed) no longer fixes the issue.

@fengxue-IS
Contributor

The return value of enterContinuation is also used to determine whether this is a first enter or a yield resume, so you can't use the else case in this way.

The correct code should be the following (apologies for the confusion):

VMINLINE VM_BytecodeAction
enterContinuationImpl(REGISTER_ARGS_LIST)
{
	VM_BytecodeAction rc = EXECUTE_BYTECODE;

	/* The Continuation object is read from the top of the operand stack */
	j9object_t continuationObject = *(j9object_t*)_sp;

	buildInternalNativeStackFrame(REGISTER_ARGS);
	updateVMStruct(REGISTER_ARGS);

	/* Notify GC of Continuation stack swap */
	_vm->memoryManagerFunctions->preMountContinuation(_currentThread, continuationObject);

	/* A true return dispatches to the Continuation enter method; otherwise rc stays EXECUTE_BYTECODE */
	if (enterContinuation(_currentThread, continuationObject)) {
		_sendMethod = J9VMJDKINTERNALVMCONTINUATION_ENTER_METHOD(_currentThread->javaVM);
		rc = GOTO_RUN_METHOD;
	}

	VMStructHasBeenUpdated(REGISTER_ARGS);

	/* A pending async event or a pending exception overrides rc; the async check is tested first */
	if (immediateAsyncPending()) {
		rc = GOTO_ASYNC_CHECK;
	} else if (VM_VMHelpers::exceptionPending(_currentThread)) {
		rc = GOTO_THROW_CURRENT_EXCEPTION;
	}
	return rc;
}
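
To make the ordering of the checks in the snippet above easier to reason about, here is a minimal standalone sketch of the same decision logic. It is only a model, not OpenJ9 code: `Action`, `PostEnterState` and `selectAction` are hypothetical names, and it assumes that a true return from enterContinuation means a first enter, as the snippet suggests.

```cpp
#include <cassert>

// Simplified stand-ins for the interpreter's action codes (hypothetical enum).
enum class Action {
    ExecuteBytecode,
    GotoRunMethod,
    GotoAsyncCheck,
    GotoThrowCurrentException
};

// Hypothetical snapshot of what the interpreter observes after enterContinuation returns.
struct PostEnterState {
    bool firstEnter;        // assumed: enterContinuation(...) returned true
    bool asyncPending;      // models immediateAsyncPending()
    bool exceptionPending;  // models VM_VMHelpers::exceptionPending(...)
};

// Mirrors the order of the checks in the snippet: the return value picks the
// default action, then a pending async event or exception overrides it.
constexpr Action selectAction(const PostEnterState &s)
{
    Action rc = Action::ExecuteBytecode;
    if (s.firstEnter) {
        rc = Action::GotoRunMethod;
    }
    if (s.asyncPending) {
        rc = Action::GotoAsyncCheck;
    } else if (s.exceptionPending) {
        rc = Action::GotoThrowCurrentException;
    }
    return rc;
}
```

In this model the return value is always examined before the pending checks, so it is never folded into an else branch of them, which is the point made in the comment above.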

@ehrenjulzert

OK, with that change I'm still getting java.lang.OutOfMemoryErrors, but the process no longer crashes. And -Xnocompressedrefs once again fixes the error.

@babsingh
Contributor

> OK, with that change I'm still getting java.lang.OutOfMemoryErrors, but the process no longer crashes. And -Xnocompressedrefs once again fixes the error.

Excellent, that's the expected behaviour. @ehrenjulzert Can you create a PR for this fix?

ehrenjulzert pushed a commit to ehrenjulzert/openj9 that referenced this issue Mar 17, 2023
@ehrenjulzert

PR: #16953

@babsingh
Contributor

babsingh commented Mar 20, 2023

@ehrenjulzert Two more pending tasks:

  1. We need to port the fix to the 0.37 release branch (JDK19). For this, you will need to create a branch from https://github.com/eclipse-openj9/openj9/tree/v0.37.0-release, cherry-pick the commit with the fix, and open a PR to deliver the fix to the 0.37 release branch.
  2. Since this issue will be closed, we need to update the test exclude lists. There is an automated script that tries to re-enable tests once the corresponding issues have been fixed/closed. The test will stay excluded, but the reason (issue number) will change. The lines below need to be updated to java/lang/Thread/virtual/stress/TimedGet.java https://github.com/eclipse-openj9/openj9/issues/15184 macosx-x64.
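
As a rough illustration of why the issue number in the exclude entry matters, here is a minimal sketch of the kind of check such a script might perform. The real script's logic is not shown in this thread, so `ExcludeEntry`, `parseEntry` and `canReenable` are hypothetical, and the three-field entry layout is assumed from the line quoted in item 2.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical view of one exclude-list entry: "<test> <issue URL> <platform>".
struct ExcludeEntry {
    std::string test;
    std::string issueUrl;
    std::string platform;
};

// Split a whitespace-separated entry into its three assumed fields.
inline ExcludeEntry parseEntry(const std::string &line)
{
    ExcludeEntry entry;
    std::istringstream in(line);
    in >> entry.test >> entry.issueUrl >> entry.platform;
    return entry;
}

// Assumed rule: an entry becomes a candidate for re-enabling once the issue
// it references is in the set of closed issues.
inline bool canReenable(const ExcludeEntry &entry, const std::set<std::string> &closedIssues)
{
    return closedIssues.count(entry.issueUrl) > 0;
}
```

Under these assumptions, an entry that references #16729 would become a re-enable candidate once this issue is closed, while the same entry rewritten to reference #15184 would stay excluded for as long as that issue remains open.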

@babsingh babsingh reopened this Mar 20, 2023
ehrenjulzert pushed a commit to ehrenjulzert/aqa-tests that referenced this issue Mar 20, 2023
Changed due to eclipse-openj9/openj9/issues/16729 being resolved

Signed-off-by: Ehren Julien-Neitzert <[email protected]>
@ehrenjulzert

aqa-tests PR: adoptium/aqa-tests#4452

@ehrenjulzert

0.37 PR: #16961

@pshipton pshipton modified the milestones: Java 20 0.39?, Java 19 0.37 Mar 20, 2023
@babsingh
Contributor

Closing. The pending tasks in #16729 (comment) have been completed.

Mesbah-Alam pushed a commit to adoptium/aqa-tests that referenced this issue Mar 21, 2023
Changed due to eclipse-openj9/openj9/issues/16729 being resolved

Signed-off-by: Ehren Julien-Neitzert <[email protected]>
Co-authored-by: Ehren Julien-Neitzert <[email protected]>
@pshipton
Member Author

pshipton commented Apr 3, 2023

I'll reopen this until we unexclude the test.

@pshipton pshipton reopened this Apr 3, 2023
@pshipton
Member Author

pshipton commented Apr 3, 2023

I was looking at an old exclude; the test is excluded against #15184.

@pshipton pshipton closed this as completed Apr 3, 2023