SEA tests run into SIGSEGV during SEA execution #50740
Crashed on GHA too on test-linux: test/sequential/test-single-executable-application-snapshot-and-code-cache.js#L64
--- stderr ---
[process 207047]: --- stderr ---
[process 207047]: --- stdout ---
[process 207047]: status = null, signal = SIGSEGV
/home/runner/work/node/node/test/common/child_process.js:86
throw new Error(`${failures.join('\n')}`);
^
Error: - process terminated with status null, expected 0
- process terminated with signal SIGSEGV, expected null
at logAndThrow (/home/runner/work/node/node/test/common/child_process.js:86:11)
at expectSyncExit (/home/runner/work/node/node/test/common/child_process.js:91:5)
at spawnSyncAndExitWithoutError (/home/runner/work/node/node/test/common/child_process.js:125:10)
at Object.<anonymous> (/home/runner/work/node/node/test/sequential/test-single-executable-application-snapshot-and-code-cache.js:64:3)
at Module._compile (node:internal/modules/cjs/loader:1376:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1435:10)
at Module.load (node:internal/modules/cjs/loader:1207:32)
at Module._load (node:internal/modules/cjs/loader:1023:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:135:12)
at node:internal/main/run_main_module:28:49
Node.js v22.0.0-pre
Command: out/Release/node --test-reporter=spec --test-reporter-destination=stdout --test-reporter=./tools/github_reporter/index.js --test-reporter-destination=stdout /home/runner/work/node/node/test/sequential/test-single-executable-application-snapshot-and-code-cache.js
Not sure if this is the same as the "all failed on PPC" that we are seeing. I'll open another PR to at least log some stuff. Separately, it would be great if we could install a built-in SIGSEGV handler to dump the stack trace.
PR-URL: #50750 Refs: #50740 Refs: nodejs/reliability#718 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Michael Dawson <[email protected]>
PR-URL: #50759 Refs: #50740 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Darshan Sen <[email protected]>
- Use spawnSyncAndExitWithoutError to log more information on error.
- Use NODE_DEBUG_NATIVE to log internals
- Skip the test when available disk space < 120MB
PR-URL: #50759 Refs: #50740 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Darshan Sen <[email protected]>
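The disk space check in the last bullet could look roughly like the sketch below. It is not the code from the PR: the helper name `hasEnoughDiskSpace`, the constant `MIN_FREE_BYTES`, and reading "120MB" as 120 * 1024 * 1024 bytes are assumptions made for illustration. It relies on `fs.statfsSync()`, which is only available in recent Node.js releases.

```javascript
const fs = require('node:fs');
const os = require('node:os');

// Assumption: "120MB" from the commit message, interpreted as mebibytes.
const MIN_FREE_BYTES = 120 * 1024 * 1024;

// Hypothetical helper: returns true when the file system containing `dir`
// has at least `minBytes` available to unprivileged users.
function hasEnoughDiskSpace(dir, minBytes = MIN_FREE_BYTES) {
  const stats = fs.statfsSync(dir);
  // bavail = free blocks available to unprivileged users, bsize = block size.
  return stats.bavail * stats.bsize >= minBytes;
}

if (!hasEnoughDiskSpace(os.tmpdir())) {
  console.log(`Skipping: less than 120MB free in ${os.tmpdir()}`);
}
```

A test would call this before generating the SEA, since copying the node binary and injecting the blob is what needs the space.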
In test status files, `$system` will be the OS and not the arch (which would be `$arch`). Add missing single-executable-application test to the list of tests marked flaky on Linux ppc64le. PR-URL: #51422 Refs: #50828 Refs: #50740 Reviewed-By: Rafael Gonzaga <[email protected]> Reviewed-By: Michael Dawson <[email protected]> Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Joyee Cheung <[email protected]> Reviewed-By: Luigi Pinca <[email protected]>
One day I was looking at the CI logs, and I think I know what's going on: #52000
The string writing/reading was intended for debugging info in the snapshot, which had a CHECK_GT(length, 0) check. It then got repurposed for SEA resource writing/reading and turned into a helper for string views, but was not updated to handle empty views, causing occasional crashes in the CI when the read is protected. This patch fixes it. PR-URL: #52000 Fixes: #50740 Reviewed-By: Michaël Zasso <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Luigi Pinca <[email protected]>
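To illustrate the class of bug described in that commit message, here is a minimal JavaScript sketch of a length-prefixed string reader and writer that treats an empty view as valid. The real helpers are C++ code inside Node.js; the names `writeStringView` and `readStringView` and the 4-byte little-endian length prefix are assumptions chosen for this example only.

```javascript
// Hypothetical writer: appends a 4-byte little-endian length, then the bytes.
function writeStringView(chunks, str) {
  const bytes = Buffer.from(str, 'utf8');
  const header = Buffer.alloc(4);
  header.writeUInt32LE(bytes.length, 0);
  chunks.push(header);
  // An empty view contributes only its length prefix.
  if (bytes.length > 0) chunks.push(bytes);
}

// Hypothetical reader: returns the decoded string and the next read offset.
function readStringView(buffer, offset) {
  const length = buffer.readUInt32LE(offset);
  const start = offset + 4;
  // Empty views are valid: return early instead of asserting length > 0
  // or touching payload bytes that may not exist.
  if (length === 0) return { value: '', next: start };
  if (start + length > buffer.length) {
    throw new RangeError('string view extends past the end of the blob');
  }
  return {
    value: buffer.toString('utf8', start, start + length),
    next: start + length,
  };
}
```

The point of the early return is that a zero-length entry never reads from the payload region, so an empty view at the very end of the blob is handled the same way as one in the middle.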
Still happening: https://ci.nodejs.org/job/node-test-commit-linux-containered/42065/nodes=ubuntu2204_sharedlibs_withoutintl_x64/console
I figured out the cause of the crash and opened #54120
When both useCodeCache and useSnapshot are set, we generate the snapshot and skip the generation of the code cache since the snapshot already includes the code cache. But we previously still persisted the code cache setting in the flags that got serialized into the SEA, so the resulting executable would still try to read the code cache even if it was not added to the SEA, leading to a flaky crash caused by OOB on some platforms. This patch fixes the crash by ignoring the code cache setting when generating the flag if both snapshot and code cache are configured. PR-URL: #54120 Fixes: #50740 Reviewed-By: Chengzhong Wu <[email protected]> Reviewed-By: Richard Lau <[email protected]>
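The flag logic described above can be sketched in JavaScript as follows. This is an illustration only: the real fix is in Node.js's C++ code, and the `SeaFlags` bit values and the function name `computeSeaFlags` are hypothetical. Only the `useSnapshot` and `useCodeCache` setting names come from the commit message.

```javascript
// Hypothetical flag bits, not the values used by Node.js.
const SeaFlags = Object.freeze({
  kDefault: 0,
  kUseSnapshot: 1 << 0,
  kUseCodeCache: 1 << 1,
});

// Compute the flags to serialize into the SEA blob from the user's config.
function computeSeaFlags(config) {
  let flags = SeaFlags.kDefault;
  if (config.useSnapshot) flags |= SeaFlags.kUseSnapshot;
  // Only advertise a code cache when one is actually embedded. With a
  // snapshot, no separate code cache is generated, so the flag is dropped.
  if (config.useCodeCache && !config.useSnapshot) {
    flags |= SeaFlags.kUseCodeCache;
  }
  return flags;
}
```

With this rule the serialized flags always match what the blob contains, so the executable never tries to read a code cache that was not written.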
From the latest reliability report, it seems many of the SEA tests are running into SIGSEGV on Windows/macOS/RHEL when executing the generated SEA.
nodejs/reliability#718
Some ideas for investigating the flakes:
- Use spawnSyncAndExitWithoutError() for better output
- Use NODE_DEBUG_NATIVE=SEA to log the deserialization
cc @nodejs/single-executable
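Outside the Node.js test suite, the two ideas above can be approximated with a small standalone sketch. `spawnSyncAndExitWithoutError()` is a helper in the repository's test/common/child_process.js and is not used here. The functions `collectFailures` and `runWithSeaDebug` are hypothetical stand-ins built on the public `child_process.spawnSync()` API, and the failure messages simply mimic the format seen in the CI log.

```javascript
const { spawnSync } = require('node:child_process');

// Turn a spawnSync() result into a list of human-readable failures,
// in the same style as the "expected 0" / "expected null" lines in the log.
function collectFailures(result) {
  const failures = [];
  if (result.error) {
    failures.push(`- failed to spawn: ${result.error.message}`);
  }
  if (result.status !== 0) {
    failures.push(`- process terminated with status ${result.status}, expected 0`);
  }
  if (result.signal !== null) {
    failures.push(`- process terminated with signal ${result.signal}, expected null`);
  }
  return failures;
}

// Run a binary with NODE_DEBUG_NATIVE=SEA set so that native SEA debug
// logging is enabled, and throw with the captured output on failure.
function runWithSeaDebug(binary, args = []) {
  const result = spawnSync(binary, args, {
    env: { ...process.env, NODE_DEBUG_NATIVE: 'SEA' },
    encoding: 'utf8',
  });
  const failures = collectFailures(result);
  if (failures.length > 0) {
    throw new Error([
      ...failures,
      '--- stderr ---', result.stderr,
      '--- stdout ---', result.stdout,
    ].join('\n'));
  }
  return result;
}
```

Calling `runWithSeaDebug('./sea')` on a generated executable (the path is a placeholder) would then surface both the exit signal and whatever the native code logged before the crash.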