Disables flaky test test_operator_gpu.test_deconvolution #14146
Conversation
@mxnet-label-bot add [pr-awaiting-review]
@mxnet-label-bot add [flaky, test]
Force-pushed from 0dd80b5 to 6b90641 (compare)
Force-pushed from 6b90641 to b27a6b1 (compare)
Have you run the tests against the master branch?
Not specifically, but it was detected in a PR with unrelated changes. My understanding is that the PR jobs take the current master, apply the changes, and run validation against that. So, in a way, the tests were run against master as it stood at the time of that PR.
I think this is not a very strict way to identify a flaky test. A flaky test should be one that fails on the master branch. Any PR, no matter how seemingly unrelated, could have a potential impact on the existing code. That's why we have CI in place. This PR should be rejected unless the test failure can be reproduced on master.
I think what you are saying makes sense as a general statement. But in this case (#14144), which I would suggest you look at, the changes were to the test script for the installation documentation. I find it difficult to believe that a bash script run by CI would trigger the (flaky) test_operator_gpu.test_deconvolution, especially since a) the PR was merged after the job was re-run on the same code and came back green, and b) the error report from the Jenkins job (below) has all the hallmarks of a flaky test, namely minuscule floating point variations.
I would argue that CI is in place to catch potential errors from merging your code into the code base. If it catches an error on one run but not on a second re-run of the exact same code, then the unreliably failing test is flaky by definition and CI isn't doing its job properly. Given all the data above, I don't feel the need to spend time reproducing this on master.
@perdasilva Thanks for your detailed explanation. I am not disagreeing with your analysis. However, such analysis based on an individual's case-by-case expertise is, IMHO, not a scalable mechanism for a software release and deployment process. On the other hand, even if this PR passes and fails CI tests at random times, we cannot exclude the possibility that the flakiness was introduced by this particular PR. The most reliable way to monitor flaky tests should be, and only be, through the master branch over a period of time. CI is our last guard for protecting the quality of our software. I think it's better to be conservative rather than optimistic.
@apeforest Sure, no problem. Let's kick the ball down the road.
Description
Disables the flaky test test_operator_gpu.test_deconvolution
Relates to #10973
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Disables the test_operator_gpu.test_deconvolution test (see the sketch below for the typical skip pattern)
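For illustration only: disabling a flaky MXNet test is usually a one-line decorator change pointing at the tracking issue. The sketch below assumes the standard unittest.skip pattern; the exact file, decorator stack, and message may differ from the actual diff in this PR.

```python
# Illustrative sketch, not the literal diff from this PR: the usual pattern for
# disabling a flaky MXNet test is to add a unittest.skip decorator that links
# to the tracking issue, leaving the test body untouched.
import unittest


@unittest.skip("Flaky test: https://github.com/apache/incubator-mxnet/issues/10973")
def test_deconvolution():
    # original test body is left unchanged; only the decorator is added
    ...
```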
Comments
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fwindows-gpu/detail/PR-14144/6/pipeline
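For context on the "minuscule floating point variations" mentioned in the discussion above, the operator tests compare results with tolerance-based assertions. The following is a simplified, hypothetical example (the values and tolerances are made up, not taken from the Jenkins log) of how such a check behaves when outputs differ only slightly:

```python
# Hypothetical illustration of a tolerance-based check like those used in the
# operator tests; values and tolerances are invented for demonstration.
import numpy as np
import mxnet as mx

expected = np.array([1.0000000, 2.0000000], dtype=np.float32)
actual = np.array([1.0000002, 2.0000001], dtype=np.float32)

# Whether this passes depends entirely on rtol/atol; flaky failures typically
# show differences only marginally outside the configured tolerances.
mx.test_utils.assert_almost_equal(actual, expected, rtol=1e-4, atol=1e-6)
```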