Rerun Failing Tests #882
e.g. http://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html
Anybody working on this?

I started, but unfortunately I haven't been able to make time for it.
I think I can work on this. I'll take a look at your PR, @richarda, and go from there.

BTW, if anybody has any ideas on how to simulate a flaky test in the features, I'd be glad to hear them. It's trivial to make the feature under test fail or pass at random, but I'm still working out how to make it fail, then pass, as needed for the test. @richarda, I'm using pretty much your same feature file at this point. Any thoughts?

You could use a global variable in the step def, a bit like we do here: https://github.com/cucumber/cucumber-ruby/blob/master/features/docs/cli/randomize.feature
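For illustration, a minimal sketch of that suggestion: a step definition whose behaviour depends on a global counter, so the scenario fails on its first attempt and passes on the next. The step wording and variable name are made up for this example.

```ruby
# Illustrative only: simulate a scenario that fails once, then passes,
# by tracking attempts in a global variable that survives across reruns
# within the same process.
$flaky_attempts = 0

Given(/^a step that fails once, then passes$/) do
  $flaky_attempts += 1
  raise 'deliberate flaky failure' if $flaky_attempts == 1
end
```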
Thanks, @mattwynne, I'll go ahead and do it that way.

Any objections to my implementing this as a formatter?

I think the event API should be used (which @mattwynne changed the fail-fast formatter to use in a296821, #903). It is intended both for internal clients and external clients like user-defined filters. Today filters generally live in Cucumber::Filters; maybe we should create Cucumber::Listeners as a home for event listeners?

A simplistic solution would be to make this both a filter and an event listener - listen for failed test cases, then pump them back through the filter API for re-evaluation. Where this falls down is that, presumably, we'll want to be able to report a test case as passed if it has passed within the specified number of re-runs. To do that we'd need to reach deeper into things. I think this might need changes within the core to do it properly.
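As a rough illustration of that shape - a filter that also listens for result events and pumps failed test cases back through for re-evaluation - here is a self-contained toy sketch. It deliberately does not use the real cucumber-core filter or event API; the class name, event name, and method signatures are all invented.

```ruby
# Toy sketch, not the real cucumber-core API: a pipeline filter that also
# subscribes to an event bus and re-enqueues failed test cases.
class RetryFilter
  def initialize(receiver, event_bus, max_retries: 2)
    @receiver = receiver          # next stage of the pipeline (e.g. the runner)
    @max_retries = max_retries
    @attempts = Hash.new(0)
    # Event-listener half: watch for finished test cases and decide on a rerun.
    event_bus.on(:test_case_finished) do |test_case, result|
      rerun(test_case) if result == :failed && @attempts[test_case] <= @max_retries
    end
  end

  # Filter half: every test case that flows through is counted and passed on.
  def test_case(test_case)
    @attempts[test_case] += 1
    @receiver.test_case(test_case)
  end

  private

  # Pump the failed test case back through the same filter API.
  def rerun(test_case)
    test_case(test_case)
  end
end
```

As noted above, this still reports every attempt downstream; deciding how a "failed then passed" case shows up in the summary would need support deeper in the core.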
@danascheider I think the place to start is with a failing feature anyway.

You're now making me think I want to get rid of the Filters namespace!

I have failing features on my fork already if you want me to make a PR. I went ahead and wrote them with the same idea that @richarda used, which was that test cases that passed within the specified number of retries were considered passing, but the output noted that they were flakes by saying "1 passing, 1 flake". The disadvantage of that is that it's a little unclear whether it's talking about two test cases or one, so one alternative would be to consider flakes separately from either passes or failures, but still exit with status 0. I'm open to suggestions.

Incidentally, after working on this more, I also came to the conclusion that a formatter is not the best way to do this, and was tending more toward a filter or event listener. I should have commented to that effect, sorry! And @mattwynne, it is starting to look to me as well that implementing this "right" would involve some changes to core, although probably not very dramatic ones. In any case, if you think of a better implementation that hasn't been discussed here, let me know and I'll be happy to incorporate it.

One option for the summary output is to use the same principle as the --wip option. For instance, if only one scenario was executed, first failing and then passing when rerun, the summary output would be something like …, where "x" is the parameter for the maximum number of rerun attempts.
Deciding on the output is tricky. I think that the implementation will be enough to focus on for now, so I suggest that for this first iteration, we just stick with the current formatter output. For example:

```gherkin
Background:
  Given a scenario "Flakey" that fails once, then passes
  And a scenario "Shakey" that fails twice, then passes
  And a scenario "Solid" that passes

Scenario:
  When I run `cucumber --retry 2`
  Then it should pass with:
    """
    3 scenarios (2 passed, 1 failed)
    """
```
I also think that focusing on the implementation is enough for now. I think that (the core of) the runner should be ignorant of the fact that the same test case is executed again (and again); it should just get the next test case to be executed and run it. Some other part (maybe in cucumber-ruby-core, maybe in cucumber-ruby) should be responsible for detecting failures and triggering reruns.

With some relation to the thoughts about the future of reporting (cucumber-attic/gherkin#12 (comment)), from which I get the idea of a primary (json) report from which other reports are created, I think that the first level of reporting should be very much WYSIWWE ("What You See Is What Was Executed"). In the case of the example above, I would expect the events/formatter calls to reflect every attempt that was executed, not just the final result of each test case.
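A hedged illustration of what that could mean in practice for `cucumber --retry 2` with the Flakey/Shakey/Solid background above: every attempt gets its own before/after pair. The `before_test_case`/`after_test_case` names follow the Cucumber 2.x formatter API; the trace formatter and the attempt data are invented for this sketch.

```ruby
# Illustrative sketch: print the formatter calls a WYSIWWE report would see
# when Flakey passes on its 2nd attempt, Shakey on its 3rd, Solid on its 1st.
class TraceFormatter
  def before_test_case(name)
    puts "before_test_case #{name}"
  end

  def after_test_case(name, result)
    puts "after_test_case  #{name} (#{result})"
  end

  def done
    puts 'done'
  end
end

formatter = TraceFormatter.new
{ 'Flakey' => %i[failed passed],
  'Shakey' => %i[failed failed passed],
  'Solid'  => %i[passed] }.each do |name, attempts|
  attempts.each do |result|
    formatter.before_test_case(name)
    formatter.after_test_case(name, result)
  end
end
formatter.done
```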
Maybe I interpret `--retry 2` differently - I read it as allowing two retries (three attempts in total), in which case "Shakey" would pass in the example above.
@brasmusson - I was interpreting `--retry 2` as two attempts in total, which is why "Shakey" failed in my example.
You're right! I wasn't thinking straight. So it should be:

```gherkin
Background:
  Given a scenario "Flakey" that fails once, then passes
  And a scenario "Shakey" that fails twice, then passes
  And a scenario "Solid" that passes

Scenario:
  When I run `cucumber --retry 1`
  Then it should fail with:
    """
    3 scenarios (2 passed, 1 failed)
    """

Scenario:
  When I run `cucumber --retry 2`
  Then it should pass with:
    """
    3 scenarios (3 passed)
    """
```

I think you can do this with a filter that listens for events, and pumps failed test cases back into the pipeline within the retry limit. Make sense @danascheider?
Interesting UI here: https://github.com/NoRedInk/rspec-retry They use what are effectively tags to mark specific scenarios for rerunning. That seems like it might be more useful than a one-size-fits-all CLI option. WDYT?

The problem I see there is that it would require the user to figure out, or at least guess, which features are likely to be flaky to begin with. I use cucumber-ruby to run integration tests against my Backbone.js app, so thinking of my use case there, literally any of the features are potentially flaky in the event the remote test API hiccups or there's a problem with the connection. Having to put a tag on all of them could be cumbersome. That said, I do think a tag is a good idea in cases like the wire protocol features here. Is there a reason we couldn't implement both?

👍
@mattwynne has pointed me in this direction from the cukes forums after I made a similar request. I'd suggested the following, using an around filter: the idea was to catch the exception of a failing test and to rerun it - in this case, once. In my particular flaky tests, setting a slowdown global helps them to run. If it were implemented using hooks it would give the dev control over how to handle the failing test. In the use case for Maven it could inspect an environment variable from your CI platform and retry that number of times.

This is all well and good, but there's no exception to catch :-/ I don't know enough about the internals of cukes to make an informed suggestion. When I realised there's no exception to catch, I stopped. Would it be possible to pass another variable into the block, against which the success or failure of the scenario can be set?
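A minimal sketch of the kind of Around hook being described, assuming one retry and a generic rescue; the attempt counter and rescued class are illustrative, and, as the next comments explain, the rescue never actually fires.

```ruby
# Hypothetical Around hook: try to catch a failing scenario and rerun it once.
Around do |scenario, block|
  attempts = 0
  begin
    attempts += 1
    block.call
  rescue StandardError => e
    # Never reached in practice: Cucumber catches step failures inside
    # block.call and reports them to the formatters instead of raising here.
    retry if attempts < 2
    raise e
  end
end
```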
@jnpoyser Doesn't your failing expectation create the exception?

@diabolo No - with a hook like the one above, the rescue block is not executed.
@jnpoyser @diabolo no, when those Around hook blocks run, any exceptions from the scenario are caught by Cucumber and reported to the formatters. Around hooks are a bit weird.

Let's look forward and imagine how you might do this using the new event API. We could add a way for something listening to a test case's result to ask for that test case to be retried.
@mattwynne retry sounds great, and straightforward - if that kind of before-retry hook were present it would solve this problem. It would be a nice-to-have to be able to list the retried tests at the end of a run, but that's not a deal breaker - especially as CI servers wouldn't know what to do with that.

@mattwynne I get really uncomfortable with the notion of the receiver of an "event" being able to influence the "event" sender by setting a variable or something. Then you are talking about something other than events. If it weren't for the fact that "hook" in Cucumber means something that becomes a step in a test case, it would be a good name for what you are talking about here.

I would also like to relate this to the discussion of Parallel scenario execution and the vision of a Cucumber IO using a set of Cucumber runners distributed across machines. The events you are referring to are all generated inside the runners, and in my mind they will be propagated to the Cucumber IO, so that formatters and other event recipients will not see any difference compared to non-parallel Cucumber execution. In that context, an event recipient on the Cucumber IO side certainly cannot expect to "set a variable inside" and expect that it will influence a runner on a different machine.
I don't really have too much to add except for another possible use case that a retry mechanism could support. I stumbled on this issue when looking for ways that I could interactively retry individual scenarios from pry on my local box before pushing fixes. Basically, what I have right now is a hook like this:

```ruby
After do |scenario|
  binding.pry if scenario.failed?
end
```

This works beautifully. I can play around with the thing being tested in pry at the exact moment it failed. I can also hotswap in possible fixes or new features (in the case of TDD) and play around in pry to see if that yields good results. However, when I try to make a particular test pass, I often just want to run the same test again and again until it works while making changes to the thing I am testing. A way to trigger a retry from inside that hook would cover this. Basically I am trying to build bridges as I am crossing them.
@Archenoth that's an interesting idea. Could we discuss that in a separate ticket?
I'm closing this ticket. IMO the `--retry` option covers the main use case discussed here; anything further can be raised in new issues.
Thanks for all your hard work on this one @danascheider @brasmusson and everyone who gave us feedback!
@mattwynne Is there any way to expose the --retry value as an environment variable? That way I can do something like …