Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polish off retry #992

Merged
merged 1 commit into from
Sep 2, 2016
Merged

Conversation

mattwynne
Copy link
Member

@mattwynne mattwynne commented Jul 2, 2016

We've had a problem since we merged #920 that the retry behaviour doesn't quite work in integration.

This PR is my attempt to resolve that, so we can remove the @wip tags and close #982 and #981.

There's a puzzle here with the output on the second scenario, with two
retries. I've detailed what I think the summary should be, but there
seems to be a passing scenario missing from the results totals, so the second scenario (Retry twice, so Shakey starts to pass too) is currently failing.

Did I make a mistake, or is the code still wrong?

TO DO

@mattwynne mattwynne force-pushed the fix-regression-displaying-cli-help branch from ec57acf to 78cedb6 Compare July 2, 2016 11:58
@mattwynne
Copy link
Member Author

ping @brasmusson. I wasn't able to reproduce the crash you mentioned, but the totals in the second scenario do look funny, don't they? Any thoughts why?

@mattwynne
Copy link
Member Author

Note that also the pretty formatter output is messy (missing steps etc) but I don't care about that right now.

@mattwynne mattwynne force-pushed the fix-regression-displaying-cli-help branch 2 times, most recently from 3a27fa1 to eb81b6b Compare July 2, 2016 21:45
@brasmusson
Copy link
Contributor

@mattwynne The crash I'm talking about is this one (Travis build), however it does not occur after the Event bus have been moved the the core (that is only on the v2.x-bugfix branch).

@mattwynne
Copy link
Member Author

OK I believe this is ready to merge. Anyone else want to take a look @cucumber/cucumber-ruby ?

- changed the acceptance test to use more descriptive names for the test
  scenarios (I kept getting confused)
- fixed a bug in the acceptance tests caused by our scenarios all running
  in one process, and using global variables (I added init scripts to
  reset them to zero at the start of a test run).
- tidied up the docs a bit in the acceptance test
- removed the @wip tag
- added the retry filter into the filterchain in runtime

There's a puzzle here with the output on the second scenario, with two
retries. I've detailed what I think the summary should be, but there
seems to be a passing scenario missing from the results totals.

Did I make a mistake, or is the code still wrong?
@@ -229,6 +229,7 @@ def filters
filters << Cucumber::Core::Test::LocationsFilter.new(filespecs.locations)
filters << Filters::Randomizer.new(@configuration.seed) if @configuration.randomize?
filters << Filters::Quit.new
filters << Filters::Retry.new(@configuration)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Quit be the last filter, so that it is possible to abort the retry of a long running test case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect I'm missing something, but won't the Quit filter be invoked as soon as the first attempt to run the test case is passed on by the retry filter?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect the following behaviour:

  Given I use retry
  And a long-running, failing test has started to execute
  When Cucumber.wants_to_quit is set
  Then the long-running, failing test case is only executed once 

As far as I can tell, the order of the Quit and the Retry filter needs to be swapped, to get this behaviour.

@azohra
Copy link

azohra commented Aug 5, 2016

I have been giving this a try locally and I noticed that if you put --retry (+ any int) in a profile under cucumber.yml, it is seemingly ignored. Only when I added --retry to the command line were my tests actually retried.

This is my cucumber.yml (which works fine for everything but retry)

#Profiles
default: -p retry -p base_options -p cli_summary -p allure -p json_pretty -p test_tags -r features
create_runtime_log: -p base_options -p cli_pretty -p test_tags -p parallel_logger-r features

#Options
retry: --retry 3
base_options: --no-source --color
test_tags:  --tags ~@ignore

#Formatters
cli_summary: --format summary
cli_pretty: --format pretty
allure: --format AllureCucumber::Formatter --out /fake_dir
json_pretty: --format json_pretty --out report/json-results/results_<%= rand(36**10).to_s(36)  %>.json
parellel_logger: --format ParallelTests::Gherkin::RuntimeLogger  --out parallel_runtime_new.log

Moreover, I have noticed that when using the pretty or summary formatter, the failure output at the end of the test run is duplicated for each individual failure of the same test.

This seems dangerous for re-run logic and reporting. In my view, Cucumber should simply report the test as passed or failed.

Failing Scenarios:
cucumber -p retry -p base_options -p cli_pretty -p allure -p json_pretty -p test_tags features/tests/repo/login.feature:32
cucumber -p retry -p base_options -p cli_pretty -p allure -p json_pretty -p test_tags features/tests/repo/login.feature:32
cucumber -p retry -p base_options -p cli_pretty -p allure -p json_pretty -p test_tags features/tests/repo/login.feature:32
cucumber -p retry -p base_options -p cli_pretty -p allure -p json_pretty -p test_tags features/tests/repo/login.feature:32`

@danascheider
Copy link
Contributor

@tk8817 - thanks for reporting! @mattwynne and @brasmusson - do you think we should move this to a separate issue or fix it as part of this one?

@brasmusson
Copy link
Contributor

Cucumber should simply report the test as passed or failed.

No, a test that passes after retry is not the same as passed, it should rather be reported as flaky in a summary. Thinking ahead on a feature like running all test cases x times, to asses their flakiness, then there are three interesting categories:

  • the test cases that passed all x times -> passed,
  • the test cases that failed all x times -> failed, and
  • the test cases that passes some times and failed some times -> flaky.

@danascheider The summary reporting and exit code handling when using --retry is not ideal with this PR, but I can live with merging this PR and open another issue to polish the summary reporting and exit code handling.

@azohra
Copy link

azohra commented Aug 6, 2016

@brasmusson - I agree with you 100%, apologies if my comment did not disaply that. As you can see in my example, the test failed 3 times, it never passed. My contention is that the formatter is showing it as 3 failures and if I were to re-run based on my junit/json results, it would attempt to re-run the same test case 3 times.

I was not attempting to make a claim on cases which switch from failed to passed and indeed adding a status of 'flaky' would be very interesting.

My larger concern is the fact that retry does not work when invoked from cucumber.yml

Best,
Justin

@mattwynne
Copy link
Member Author

@danascheider I'd rather deal with these as separate issues I think. The profiles thing seems like a bug, but the formatting / reporting thing needs more thinking about, and will probably be more work to implement.

@tk8817 try using the new summary formatter as we do in the scenarios for retry. You might find the output is a bit clearer about what's going on.

@azohra
Copy link

azohra commented Aug 7, 2016

Hmm, I am not articulating this well :( The issue is not about the visual output in realtime. I noted with the pretty & summary formatters that Cucumber is reporting the same test case as a unique failure X times for X retries when X retries fail.

Failing Scenarios:
cucumber -p retry -p base_options -p cli_pretty -p allure -p json_pretty -p test_tags features/tests/repo/login.feature:32
cucumber -p retry -p base_options -p cli_pretty -p allure -p json_pretty -p test_tags features/tests/repo/login.feature:32
cucumber -p retry -p base_options -p cli_pretty -p allure -p json_pretty -p test_tags features/tests/repo/login.feature:32
cucumber -p retry -p base_options -p cli_pretty -p allure -p json_pretty -p test_tags features/tests/repo/login.feature:32`

My concerns with this are

  1. When re-running failed scenarios this single scenario is being re-run 4 times as it believes there are 4 unique failures, but its just the same scenario which failed 4 retries.
  2. The json & junit formatter output also shows the failures as unique and as such I am seeing what some may deem incorrect statistics and output (same test multiple times).

Until there is a strategy to coin tests as flaky or differentiate an original test from a retry, it seems that there will be downsides to using retry with existing formatters / infrastructure.

One alternate option would be to only report the final run of the scenario. As Matt said, a test will only be run for exactly as many times as it needs to. The upside is much cleaner reporting / re-running support. The downside is NO indication that a test was retried in that reporting.

Rock, meet hard place :)

@brasmusson
Copy link
Contributor

With respect to the summary and the pretty formatters I think the current behaviour is correct:

      Fails-twice feature
        Fails-twice ✗
        Fails-twice ✗
        Fails-twice ✓

This is what happened, and it should therefore be reported that way by the summary formatter (and similarly by the pretty formatter).

With respect to the summary printout (side not - Cucumber-JVM prints the summary statistics, snippets etc. separately so if the pretty formatter output is sent to a file, then the summary statistics etc. is not included in the file), there is no need to list the same test case several times under "Failing Scenarios:", it should rather be listed once under "Flaky Scenarios:" - and the same in the Scenario statistics ("1 scenario (1 flaky)"). The Step statistics, I do not think should be changed from the current behaviour - we cannot assume that the same step will fail when a flaky test case has failed several times, so a flaky test case with three steps will add more then three steps to the Step statistics.

The Json formatter should absolutely include every individual execution of a flaky test case, we can neither assume that the same step fails each time a flaky test case fails, and if the same step actually happen to fail, we cannot assume that the error is the same. Everything need to be included in the Json formatter output. Using the Json formatter output it should be possible to find out exactly what happened, so each time any step definition or hook has executed the information about this needs to be in the Json formatter output.

With respect to the JUnit formatter, reporting a flaky test case once could be an option. As the Cucumber statuses are mapped down to the three statuses, passed, skipped and failed in the JUnit formatter output, a flaky test case should in that case be mapped to passed in non-strict mode and failed in strict mode.

@mattwynne
Copy link
Member Author

OK merging this one, let's discuss enhancements in new tickets.

@mattwynne mattwynne merged commit 547e86c into fix-regression-displaying-cli-help Sep 2, 2016
@mattwynne
Copy link
Member Author

Somehow I merged this into the wrong branch instead of master. Have fixed that in 810384e

@mattwynne mattwynne deleted the finish-off-retry branch September 2, 2016 22:01
@lock
Copy link

lock bot commented Oct 25, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Oct 25, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants