Make test suites for js-ipfs and js-ipfs-api be flaky-free #127
Comments
To isolate the flaky tests, I started by looking at the master branch of the js-ipfs project. I created a tool that gathers the failure report data and creates a simple chart to help highlight failures. For each error / test, I calculate a standard deviation over the job run numbers in which it failed, and only include failures with a stdev over 1; this eliminates failures that happen consecutively. The more sporadic the failures, the higher the stdev. The rows are ordered by the total number of failures for the given error / test.
This is the chart for js-ipfs/master for runs 150 (early July) to 237 (now).
Chart: https://gateway.ipfs.io/ipfs/QmdkRvAuDbktNJzEF3W6Sm9GuqRcsTFKMGCExZ3bYUuyoA/
The following failures seem the most likely candidates for being flaky tests.
Some of these will only fail on certain platforms / versions of nodejs though.
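The filtering described above can be sketched roughly as follows. This is not the code from the actual tool: the input shape (`failuresByTest`, a map from test name to the run numbers in which it failed), the function names, and the choice of population rather than sample standard deviation are all assumptions made for illustration.

```javascript
// Population standard deviation of a list of numbers (assumed variant).
function stdev(values) {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

// failuresByTest is a hypothetical shape: { [testName]: number[] },
// where each array holds the job run numbers in which that test failed.
function flakyCandidates(failuresByTest, threshold = 1) {
  return Object.entries(failuresByTest)
    .map(([test, runs]) => ({
      test,
      failures: runs.length,
      stdev: stdev(runs),
    }))
    // Keep only failures whose run numbers are spread out.
    .filter((row) => row.stdev > threshold)
    // Order rows by total number of failures, most frequent first.
    .sort((a, b) => b.failures - a.failures);
}
```

Under these assumptions, a test that failed in runs 200, 201 and 202 has a stdev of about 0.82 and is dropped, while one that failed in runs 150, 180 and 230 is kept. Note that a longer unbroken streak (four or more consecutive runs) would exceed the threshold of 1 with this formula, so the filter only removes short consecutive bursts.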
I'm going to take this information and try to pull out some of the tests that I think we should apply retry logic to. I will also push up the tools I used to generate this information so that we can use them for other projects.
This is so awesome, really great to know where to focus our energy.
Agree with Alan, awesome work @travisperson! We should be able to separate test failures caused by timeouts (which I think many of these are) from "normal" test failures and "exceptional" test failures. Normal test failures would be a test case which has an assertion that is failing; exceptional test failures would be things like
It would be very useful to show the output of the failure when hovering over or clicking on a cell in the table, so we could see directly what's going wrong. @travisperson can you publish the code for generating this somewhere?
Published https://github.com/travisperson/jenkins-flake-report
I think most of them are.
Ya, the data I'm getting from Jenkins has some information we can test for to see if it's a timeout.
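One way such a check could look, as a rough sketch: Mocha reports a timed-out test with an error message beginning "Timeout of <N>ms exceeded", so matching on the failure text is enough to separate timeouts from failed assertions. The function name, the category labels, and the assumption that the failure text arrives as a single string (called `errorDetails` here) are hypothetical and not taken from the published tool.

```javascript
// Mocha's timeout errors start with "Timeout of <N>ms exceeded".
const TIMEOUT_PATTERN = /Timeout of \d+ms exceeded/;

// Assertion libraries such as Node's assert and chai throw AssertionError.
const ASSERTION_PATTERN = /AssertionError/;

// errorDetails is assumed to be the failure text reported for one test case.
function classifyFailure(errorDetails) {
  if (TIMEOUT_PATTERN.test(errorDetails)) return "timeout";
  if (ASSERTION_PATTERN.test(errorDetails)) return "assertion";
  return "other";
}
```

Anything that is neither a timeout nor an assertion failure falls into "other", which is where a more detailed breakdown could be added later.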
Ya I think that would be great. Currently the script is just a golang html template, but we could at least use a
js-ipfs and js-libp2p are the biggest JS codebases we have, and their test suites not only take a long time to run but are also flaky, meaning they fail randomly.
We should make use of whatever tools we have (mocha's .retry, for example) to make them not flaky. Nothing is worse than seeing, after a 40-minute test run, that one test timed out.
We can consider this solved once you can run the tests 10 times for the same commit on CI and always get a successful run (if it was successful the first time).
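For reference, Mocha's built-in retry mechanism is `this.retries(n)`, called inside a `describe` or `it` callback declared with `function` (not an arrow function), or the `--retries <n>` command-line flag. The sketch below shows the same idea as a plain helper so that it runs without Mocha; the name `withRetries` and its signature are made up for illustration and are not part of Mocha or of the js-ipfs test suites.

```javascript
// Run an async function up to `attempts` times.
// Resolves with the first successful result; if every attempt fails,
// rejects with the error from the last attempt.
async function withRetries(fn, attempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // Pass the attempt number so callers can log or vary behaviour.
      return await fn(attempt);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

Retries hide flakiness rather than fix it, so it is probably best to apply them only to the specific tests identified as flaky and to keep the attempt count small.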