Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test-extended.system win SharedClassesAPI_0 assert The softmx limit for shared cache usage is smaller than the minimum feasible value #4351

Closed
pshipton opened this issue Jan 18, 2019 · 11 comments · Fixed by #4532

Comments

@pshipton
Copy link
Member

STF 17:43:37.304 - +------ Step 31 - DefinedLocationGroupAccessJava: Monitor workload processes
STF 17:43:37.304 - | Wait for processes to meet expectations
STF 17:43:37.304 - |   Processes: [WL1, WL2, WL3, WL4]
STF 17:43:37.304 - |
STF 17:43:37.304 - Monitoring processes: WL1 WL2 WL3 WL4
STF 17:43:37.436 - Found dump at: C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\openjdk-tests\TestConfig\test_output_15477678633841\SharedClassesAPI_0\20190117-174310-SharedClassesAPI\results\core.20190117.174337.2916.0001.dmp
STF 17:43:37.476 - Found dump at: C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\openjdk-tests\TestConfig\test_output_15477678633841\SharedClassesAPI_0\20190117-174310-SharedClassesAPI\results\core.20190117.174337.7228.0001.dmp
STF 17:43:39.106 - Found dump at: C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\openjdk-tests\TestConfig\test_output_15477678633841\SharedClassesAPI_0\20190117-174310-SharedClassesAPI\results\javacore.20190117.174337.7228.0002.txt
STF 17:43:42.090 - Found dump at: C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\openjdk-tests\TestConfig\test_output_15477678633841\SharedClassesAPI_0\20190117-174310-SharedClassesAPI\results\Snap.20190117.174337.7228.0003.trc
STF 17:43:44.403 - Found dump at: C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\openjdk-tests\TestConfig\test_output_15477678633841\SharedClassesAPI_0\20190117-174310-SharedClassesAPI\results\Snap.20190117.174337.2916.0003.trc
STF 17:43:44.462 - Found dump at: C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\openjdk-tests\TestConfig\test_output_15477678633841\SharedClassesAPI_0\20190117-174310-SharedClassesAPI\results\javacore.20190117.174337.2916.0002.txt
STF 17:43:44.718 - **FAILED** Process WL3 ended with exit code (4294967295) and not the expected exit code/s (0)
STF 17:43:44.719 - Monitoring Report Summary:
STF 17:43:44.719 -   o Process WL1 has crashed unexpectedly
STF 17:43:44.720 -   o Process WL2 ended with the expected exit code (0)
STF 17:43:44.720 -   o Process WL3 has crashed unexpectedly
STF 17:43:44.720 -   o Process WL4 has crashed unexpectedly

in stderr output

JVMSHRC777W The softmx limit for shared cache usage is smaller than the minimum feasible value 3980759048 bytes.
23:43:37.434 0xee9a00   j9shr.1012   *   ** ASSERTION FAILED ** at CompositeCache.cpp:6276: (((0 != _theca) && hasWriteMutex(currentThread) && (getTotalSize() >= softMaxBytes) && (softMaxBytes >= getUsedBytes())))

in javacore

1XECTHTYPE     Current thread history (J9VMThread:0x00000000017D9A00)
3XEHSTTYPE     23:43:37:474859273 GMT j9shr.1012 - * ** ASSERTION FAILED ** at CompositeCache.cpp:6276: (((0 != _theca) && hasWriteMutex(currentThread) && (getTotalSize() >= softMaxBytes) && (softMaxBytes >= getUsedBytes()))) 
3XEHSTTYPE     23:43:37:474851307 GMT j9prt.58 -  <j9shmem_stat exit - contents of controlFile C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\openjdk-tests\TestConfig\test_output_15477678633841\SharedClassesAPI_0\20190117-174310-SharedClassesAPI\results\caches\javasharedresources\\C290M4F1A64_DefinedLocationGroupAccessJavaWL3_G35 is corrupted 
3XEHSTTYPE     23:43:37:474838657 GMT j9prt.56 -  >j9shmem_stat entered, name = C290M4F1A64_DefinedLocationGroupAccessJavaWL3_G35 
3XEHSTTYPE     23:43:37:474830133 GMT j9shr.2197 - * CC setCacheAreaBoundaries: The softmx bytes 0 is smaller than the already used bytes 3980759048 
3XEHSTTYPE     23:43:37:474393980 GMT j9prt.58 -  <j9shmem_stat exit - contents of controlFile C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\openjdk-tests\TestConfig\test_output_15477678633841\SharedClassesAPI_0\20190117-174310-SharedClassesAPI\results\caches\javasharedresources\\C290M4F1A64_DefinedLocationGroupAccessJavaWL3_G35 is corrupted 
3XEHSTTYPE     23:43:37:474383497 GMT j9prt.56 -  >j9shmem_stat entered, name = C290M4F1A64_DefinedLocationGroupAccessJavaWL3_G35 
3XEHSTTYPE     23:43:37:474374591 GMT j9shr.1028 -   CC setCacheAreaBoundaries: finalSegmentStart=000000000FA99000, newCacheEnd=0000000022640000, _theca->totalBytes=314572608 
3XEHSTTYPE     23:43:37:474367741 GMT j9shr.1027 -   CC setCacheAreaBoundaries: finalSegmentStart=000000000FA981E0, origCacheEnd=0000000022640000, _theca->totalBytes=314572608 
3XEHSTTYPE     23:43:37:474357480 GMT j9shr.1026 -  >CC setCacheAreaBoundaries: Entering 
3XEHSTTYPE     23:43:37:474347666 GMT j9shr.2269 -   CC startup: initializing a new cache 
@pshipton
Copy link
Member Author

@Mesbah-Alam @hangshao0

@pshipton
Copy link
Member Author

Mesbah said in adoptium/openj9-systemtest#74

The Workload that crashes, WL3 runs the following command:

C:/Users/jenkins/workspace/Test-extended.system-JDK8-win_x86-64_cmprssptrs/openjdkbinary/j2sdk-image/jre/bin/../../bin/java -Xshareclasses:name=DefinedLocationGroupAccessJavaWL3,cacheDir=C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\openjdk-tests\TestConfig\test_output_15477678633841\SharedClassesAPI_0\20190117-174310-SharedClassesAPI\results\caches,nonpersistent,groupAccess -Xcompressedrefs -classpath C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\jvmtest\systemtest\stf\stf.load\bin;C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\jvmtest\systemtest\stf\stf.core\bin;C:/Users/jenkins/workspace/Test-extended.system-JDK8-win_x86-64_cmprssptrs/jvmtest/systemtest/systemtest_prereqs/log4j-2.3/log4j-api-2.3.jar;C:/Users/jenkins/workspace/Test-extended.system-JDK8-win_x86-64_cmprssptrs/jvmtest/systemtest/systemtest_prereqs/log4j-2.3/log4j-core-2.3.jar;C:/Users/jenkins/workspace/Test-extended.system-JDK8-win_x86-64_cmprssptrs/jvmtest/systemtest/systemtest_prereqs/junit-4.12/junit-4.12.jar;C:/Users/jenkins/workspace/Test-extended.system-JDK8-win_x86-64_cmprssptrs/jvmtest/systemtest/systemtest_prereqs/junit-4.12/hamcrest-core-1.3.jar;C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\jvmtest\systemtest\openjdk-systemtest\openjdk.test.lang\bin;C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\jvmtest\systemtest\openjdk-systemtest\openjdk.test.util\bin;C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\jvmtest\systemtest\openjdk-systemtest\openjdk.test.math\bin net.adoptopenjdk.loadTest.LoadTest -resultsDir C:\Users\jenkins\workspace\Test-extended.system-JDK8-win_x86-64_cmprssptrs\openjdk-tests\TestConfig\test_output_15477678633841\SharedClassesAPI_0\20190117-174310-SharedClassesAPI\results -resultsPrefix 29.WL3. -dumpRequested false -reportFailureLimit 1 -abortAtFailureLimit 10 -maxTotalLogFileSpace 200M -maxSingleLogSize 1/25 -suite.mini-mix.inventoryFile C:/Users/jenkins/workspace/Test-extended.system-JDK8-win_x86-64_cmprssptrs/openjdk-tests/TestConfig/test_output_15477678633841/SharedClassesAPI_0/20190117-174310-SharedClassesAPI/results/29.WL3.inventory/openjdk.test.load/config/inventories/mix/mini-mix.xml -suite.mini-mix.inventoryExcludeFile C:/Users/jenkins/workspace/Test-extended.system-JDK8-win_x86-64_cmprssptrs/openjdk-tests/TestConfig/test_output_15477678633841/SharedClassesAPI_0/20190117-174310-SharedClassesAPI/results/29.WL3.inventory/openjdk.test.load/config/inventories/mix/mini-mix_exclude.xml -suite.mini-mix.threadCount 3 -suite.mini-mix.totalNumberTests 78 -suite.mini-mix.selection random -suite.mini-mix.seed -1 -suite.mini-mix.repeatCount 1 -suite.mini-mix.thinkingTime 0ms..0ms

@pshipton
Copy link
Member Author

@Mesbah-Alam I was thinking more along the lines of, what is the test doing that causes the assert.

Is the 'minimum feasible' value for softmx universal or platform specific?

It seems to be related to the cache contents. The assert checks the softmx value is between the amount of data which is in the cache, but not bigger than the total size of the cache.
(getTotalSize() >= softMaxBytes) && (softMaxBytes >= getUsedBytes())

@Mesbah-Alam
Copy link
Contributor

Mesbah-Alam commented Jan 21, 2019

There was a coma missing from the sharedClasses options being formulated to be used in the testcase. Added here: adoptium/openj9-systemtest#76

Not sure if it is the root cause of the crash though.

Running a 5x Grinder in internal Jenkins on Windows with the above pull request atm.

@hangshao0
Copy link
Contributor

This test is now passing:

Windows JDK8:

22:20:49 STF 21:20:51.524 - Overall result: PASSED
22:20:49 
22:20:49 SharedClassesAPI_0_PASSED

https://ci.eclipse.org/openj9/job/Test-extended.system-JDK8-win_x86-64_cmprssptrs/152/consoleFull

Window JDK11:
https://ci.eclipse.org/openj9/job/Test-extended.system-JDK11-win_x86-64_cmprssptrs/152/

@hangshao0
Copy link
Contributor

Linking the systemtest repo PR here: adoptium/openj9-systemtest#80

@pshipton
Copy link
Member Author

@hangshao0 do you know what condition caused the VM to assert? A user shouldn't be able to cause the VM to assert via command line arguments, the underlying problem should still be fixed.

@hangshao0
Copy link
Contributor

hangshao0 commented Jan 29, 2019

memHandle = CreateFileW(unicodePath, GENERIC_READ, (FILE_SHARE_READ | FILE_SHARE_WRITE), NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
	
	if (unicodeBuffer != unicodePath) {
		omrmem_free_memory(unicodePath);
	}
		
	if ((NULL == memHandle) || (INVALID_HANDLE_VALUE==memHandle)) {
		Trc_PRT_shmem_j9shmem_stat_Exit2(sharedMemoryFullPath);        <---------- triggered
		omrmem_free_memory(sharedMemoryFullPath);
		return -1;
	}

We call the above code to check the cache file size. We are not able to open the cache file, so cache file size _actualCacheSize is 0. When setting the softmx, we assert than the sotfmx < _actualCacheSize (=0).

@hangshao0
Copy link
Contributor

I will create a Pull Request to ensure that the cache file path is not longer than the OS limit before starting up the shared cache.

hangshao0 added a commit to hangshao0/openj9 that referenced this issue Feb 5, 2019
1. Remove duplicated define of J9SH_MAXPATH in j9sharedhelper.h
2. Make sure the length of non-persistent cache mutex name is less than
_MAX_PATH on Windows.
3. Make sure the cache file path is shorter than J9SH_MAXPATH before
starting up the shared cache.

closes eclipse-openj9#4351

Signed-off-by: hangshao <[email protected]>
@pshipton
Copy link
Member Author

pshipton commented Feb 6, 2019

Reopened since #4532 was reverted.

@hangshao0
Copy link
Contributor

OMR PR (eclipse-omr/omr#3553) to export file_get_unicode_path() which prepend \\?\ when path is > MAX_PATH.

hangshao0 added a commit to hangshao0/openj9 that referenced this issue Feb 14, 2019
On Windows, "\\?\" needs to be prepended to the path if its length is >
MAX_PATH. port_convertFromUTF8() won't prepend "\\?\". Call
port_file_get_unicode_path() so that "\\?\" is prepended.  

Closes eclipse-openj9#4351

Signed-off-by: hangshao <[email protected]>
jason-hall pushed a commit to jason-hall/openj9 that referenced this issue Feb 15, 2019
On Windows, "\\?\" needs to be prepended to the path if its length is >
MAX_PATH. port_convertFromUTF8() won't prepend "\\?\". Call
port_file_get_unicode_path() so that "\\?\" is prepended.  

Closes eclipse-openj9#4351

Signed-off-by: hangshao <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants