-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tools: include exit code in test failures #19855
Conversation
Include the exit code in test failures. This will give us more information during the currently-puzzling failures that provide no information in CI such as: ``` 03:10:10 not ok 563 parallel/test-fs-truncate 03:10:10 --- 03:10:10 duration_ms: 1.119 03:10:10 severity: fail 03:10:10 stack: |- 03:10:10 ... ```
Tad "hacky", I'll see if I can make something more TAPish. |
Just scribbling some tweaks |
tools/test.py
Outdated
@@ -325,12 +329,10 @@ def HasRun(self, output): | |||
# duration_ms is measured in seconds and is read as such by TAP parsers. | |||
# It should read as "duration including ms" rather than "duration in ms" | |||
logger.info(' ---') | |||
logger.info(' duration_ms: %d.%d' % | |||
logger.info(' duration (s): %d.%d' % |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is an annoyance for many folks but I'd advise against changing this and instead learn to read it as "duration including milliseconds". From memory there's some tooling in Jenkins that depends on this. It's become a TAP quirk, not something we've introduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. I'm just smoke testing the CI TAP tooling
tools/test.py
Outdated
@@ -299,6 +302,7 @@ def HasRun(self, output): | |||
|
|||
if output.HasTimedOut(): | |||
self.severity = 'fail' | |||
self.traceback += '\nTimed Out' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal but I'd prefer we leave existing strings alone on this output where possible, it's parsable and we don't know what might depend on it--likely nothing but why risk it. I assume that in the majority of cases there's not going to be a traceback to append to so perhaps if you check for it being length>0 and then append \n
and then regardless append 'timeout', so in the majority of cases we just get a 'timeout' and nothing's changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My motivation was that the previous code was self.traceback = 'timeout'
, so in cases where there was output from the test we have been losing it if the test timesout.
As for the parsability of the output, I'm on a deepdive into the CI parsing code, trying to grok what's actually being consumed and what just passes through (I have same concern for the new exitcode
field).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P.S. duration_ms
is consumed in two code paths, one treats it as seconds (the red arrow):
https://github.com/jenkinsci/tap-plugin/blob/05bd8843f0c87a379fdaec4f6e007c1124f1a35c/src/main/java/org/tap4j/plugin/model/TapStreamResult.java#L165-L167
and another place as milliseconds (the green arrow):
https://github.com/jenkinsci/tap-plugin/blob/3a80a472c6acff847ae3f572bb6bcc47d64ab9dd/src/main/java/org/tap4j/plugin/model/TapTestResultResult.java#L260-L271
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P.P.S. I'm looking into finishing the work started by @jbergstroem (tap2junit
) as the final goal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest of the changes look okay, but would it be okay to use self.severity
to signal that the test has timed out? I think its better not to alter the traceback. Extreme corner case: if a test prints Timed Out and exits, we cannot be sure if the test exceeded the timelimit or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for context the old code did:
Line 332 in bdcd041
self.traceback = 'timeout' |
And even the current code does:
Line 287 in 800ce94
self.traceback = output.output.stdout + output.output.stderr |
and
Line 298 in 800ce94
self.traceback = "oh no!\nexit code: " + PrintCrashed(exit_code) |
So we are already being naughty with
traceback
As for severity
, it is the only really semantic field (the TAP plugin uses it to categorize the results), so I would advise against changing it. A different option is to add a message
field or even a dedicated timed out
field.
Because these changes are getting mildly more involved, I'll remove the |
Ohh boy I got rusty with the git 👴, wanted to draft something with the GitHub GUI (always a mistake) and ended up pushing to your branch... Pulled my changes out of your branch, sorry. |
It's cool. You can work in my branch if you want. Or work in your own branch. Whatevs. |
Always the gracious host 🎩 |
@refack Delighted you're back! |
@refack: As we continue to see |
AFAICT this is ready to land as is. I've tested it in multiple scenarios. We just need a +1 from someone. |
@nodejs/python |
Linux failure is build/infra-related. Unfortunately, I'm unable to fix it. (Tried wiping the workspace in Jenkins.) |
Resume build: https://ci.nodejs.org/job/node-test-commit/17575/ |
@refack please only add the |
@nodejs/python @nodejs/build PTAL. This needs some LGs. |
Re-running Windows (the only CI task that failed): https://ci.nodejs.org/job/node-test-commit-windows-fanned/17087/ |
I'm still not cool with the change of the string |
I rolled-back that specific change, But I do want to point out: Line 287 in 800ce94
so the current code swallows not only stacks, but all output to stderr & stdout .
|
LGTM but I guess I can't approve my own PR even if I didn't write any of the code. :-P |
Would be nice to get at least one more approval. @nodejs/build @nodejs/python @jbergstroem @bnoordhuis @Fishrock123 @mscdex @rvagg @nodejs/testing |
Re-running linux CI: https://ci.nodejs.org/job/node-test-commit-linux/17876/ |
These Linux yet again: https://ci.nodejs.org/job/node-test-commit-linux/17884/ |
Include the exit code in test failures. This will give us more information during the currently-puzzling failures that provide no information in CI such as: ``` 03:10:10 not ok 563 parallel/test-fs-truncate 03:10:10 --- 03:10:10 duration_ms: 1.119 03:10:10 severity: fail 03:10:10 stack: |- 03:10:10 ... ``` PR-URL: nodejs#19855 Reviewed-By: Sakthipriyan Vairamani <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Gibson Fahnestock <[email protected]> Reviewed-By: Rod Vagg <[email protected]>
PR-URL: nodejs#19855 Reviewed-By: Sakthipriyan Vairamani <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Gibson Fahnestock <[email protected]> Reviewed-By: Rod Vagg <[email protected]>
Landed in f3f1298...a3db1cc |
Include the exit code in test failures. This will give us more information during the currently-puzzling failures that provide no information in CI such as: ``` 03:10:10 not ok 563 parallel/test-fs-truncate 03:10:10 --- 03:10:10 duration_ms: 1.119 03:10:10 severity: fail 03:10:10 stack: |- 03:10:10 ... ``` PR-URL: #19855 Reviewed-By: Sakthipriyan Vairamani <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Gibson Fahnestock <[email protected]> Reviewed-By: Rod Vagg <[email protected]>
PR-URL: #19855 Reviewed-By: Sakthipriyan Vairamani <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Gibson Fahnestock <[email protected]> Reviewed-By: Rod Vagg <[email protected]>
Include the exit code in test failures. This will give us more information during the currently-puzzling failures that provide no information in CI such as: ``` 03:10:10 not ok 563 parallel/test-fs-truncate 03:10:10 --- 03:10:10 duration_ms: 1.119 03:10:10 severity: fail 03:10:10 stack: |- 03:10:10 ... ``` PR-URL: #19855 Reviewed-By: Sakthipriyan Vairamani <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Gibson Fahnestock <[email protected]> Reviewed-By: Rod Vagg <[email protected]>
PR-URL: #19855 Reviewed-By: Sakthipriyan Vairamani <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Gibson Fahnestock <[email protected]> Reviewed-By: Rod Vagg <[email protected]>
Include the exit code in test failures. This will give us more information during the currently-puzzling failures that provide no information in CI such as: ``` 03:10:10 not ok 563 parallel/test-fs-truncate 03:10:10 --- 03:10:10 duration_ms: 1.119 03:10:10 severity: fail 03:10:10 stack: |- 03:10:10 ... ``` PR-URL: #19855 Reviewed-By: Sakthipriyan Vairamani <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Gibson Fahnestock <[email protected]> Reviewed-By: Rod Vagg <[email protected]>
PR-URL: #19855 Reviewed-By: Sakthipriyan Vairamani <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Gibson Fahnestock <[email protected]> Reviewed-By: Rod Vagg <[email protected]>
Include the exit code in test failures. This will give us more
information during the currently-puzzling failures that provide no
information in CI such as:
@BridgeAR @addaleax @nodejs/build @nodejs/testing
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes