Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-108834: regrtest reruns failed tests in subprocesses #108839

Merged
merged 4 commits into from
Sep 3, 2023

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Sep 3, 2023

Rename --verbose2 option (-w) to --rerun. Keep --verbose2 as a deprecated alias.

Fix statistics in regrtest summary.

Changes:

  • Add RunTests class.
  • Add TestResult.get_rerun_match_tests() method
  • Rewrite code to serialize/deserialize worker arguments as JSON using a new WorkerJob class.
  • Fix stats when a test is run with --forever --rerun.
  • If failed test names cannot be parsed, log a warning and don't filter tests.
  • test_regrtest.test_rerun_success() now uses a marker file, since the test is re-run in a separated process.
  • Add tests on normalize_test_name() function.

@vstinner
Copy link
Member Author

vstinner commented Sep 3, 2023

cc @AlexWaygood

@rhettinger rhettinger removed their request for review September 3, 2023 15:53
Rename --verbose2 option (-w) to --rerun. Keep --verbose2 as a
deprecated alias.

Changes:

* Fix and enhance statistics in regrtest summary. Add "(filtered)"
  when --match and/or --ignore options are used.
* Add RunTests class.
* Add TestResult.get_rerun_match_tests() method
* Rewrite code to serialize/deserialize worker arguments as JSON
  using a new WorkerJob class.
* Fix stats when a test is run with --forever --rerun.
* If failed test names cannot be parsed, log a warning and don't
  filter tests.
* test_regrtest.test_rerun_success() now uses a marker file, since
  the test is re-run in a separated process.
* Add tests on normalize_test_name() function.
Add test_success() and test_skip() tests.
@vstinner
Copy link
Member Author

vstinner commented Sep 3, 2023

Changes in statistics:

  • Total tests now calls all runs of the same test file, previously only latest stats of the latest run of a test file was counted.
  • Total test files: replaced success=xxx with run=xxx.

Example without rerun:

$ ./python -m test test_sys --forever 
(...)
Total duration: 3.6 sec
Total tests: run=115 failures=1 skipped=12
Total test files: run=2 failed=1
Result: FAILURE

Example with rerun:

$ ./python -m test test_sys --forever --rerun
(...)
Total duration: 4.0 sec
Total tests: run=116 failures=1 skipped=12
Total test files: run=3 rerun=1
Result: FAILURE then SUCCESS

Difference when rerun is used:

  • Total tests: run=115 failures=1 skipped=12 => Total tests: run=116 failures=1 skipped=12
  • Total test files: run=2 failed=1 => Total test files: run=3 rerun=1

Comparison with the main branch:

Without rerun:

$ ./python -m test test_sys --forever 
(...)
Total duration: 4.9 sec
Total tests: run=72 failures=2 skipped=7
Total test files: success=1 failed=1

With rerun:

 ./python -m test test_sys --forever --rerun
(...)
Total duration: 4.9 sec
Total tests: run=2 failures=2
Total test files: success=1 failed=1 rerun=1
Result: FAILURE then FAILURE

Comparison main => this PR with forever (but without --rerun):

  • Total tests: run=72 failures=2 skipped=7 => Total tests: run=115 failures=1 skipped=12
  • Total test files: success=1 failed=1 => Total test files: run=2 failed=1

Comparison main => this PR with forever and rerun:

  • Total tests: run=2 failures=2 => Total tests: run=116 failures=1 skipped=12
  • Total test files: success=1 failed=1 rerun=1 => Total test files: run=3 rerun=1

@vstinner
Copy link
Member Author

vstinner commented Sep 3, 2023

Tests / Windows (x64) (pull_request) Failing

Yeah! That's exactly why I wrote this change. It worked as expected!

0:00:00 Run tests in parallel using 4 child processes (timeout: 20 min, worker timeout: 25 min)
0:00:01 [  1/462] test_syslog skipped
test_syslog skipped -- No module named 'syslog'
0:00:01 [  2/462] test_type_cache passed
0:00:02 [  3/462] test_memoryview passed
0:00:03 [  4/462] test_sched passed
(...)
0:33:01 load avg: 0.05 running: test.test_concurrent_futures.test_deadlock (18 min 14 sec)
0:33:31 load avg: 0.06 running: test.test_concurrent_futures.test_deadlock (18 min 44 sec)
0:34:01 load avg: 0.05 running: test.test_concurrent_futures.test_deadlock (19 min 14 sec)
0:34:31 load avg: 0.03 running: test.test_concurrent_futures.test_deadlock (19 min 44 sec)
0:34:47 load avg: 0.05 [462/462/1] test.test_concurrent_futures.test_deadlock process crashed (Exit code 1)
Timeout (0:20:00)!
Thread 0x000002e8 (most recent call first):
  File "D:\a\cpython\cpython\Lib\multiprocessing\connection.py", line 282 in _send_bytes
  File "D:\a\cpython\cpython\Lib\multiprocessing\connection.py", line 199 in send_bytes
  File "D:\a\cpython\cpython\Lib\multiprocessing\queues.py", line 246 in _feed
  File "D:\a\cpython\cpython\Lib\threading.py", line 996 in run
  File "D:\a\cpython\cpython\Lib\threading.py", line 1059 in _bootstrap_inner
  File "D:\a\cpython\cpython\Lib\threading.py", line 1016 in _bootstrap

Thread 0x00001ad8 (most recent call first):
  File "D:\a\cpython\cpython\Lib\threading.py", line 1153 in _wait_for_tstate_lock
  File "D:\a\cpython\cpython\Lib\threading.py", line 1133 in join
  File "D:\a\cpython\cpython\Lib\multiprocessing\queues.py", line 199 in _finalize_join
  File "D:\a\cpython\cpython\Lib\multiprocessing\util.py", line 224 in __call__
  File "D:\a\cpython\cpython\Lib\multiprocessing\queues.py", line 151 in join_thread
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 553 in join_executor_internals
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 507 in terminate_broken
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 344 in run
  File "D:\a\cpython\cpython\Lib\threading.py", line 1059 in _bootstrap_inner
  File "D:\a\cpython\cpython\Lib\threading.py", line 1016 in _bootstrap

Thread 0x0000158c (most recent call first):
  File "D:\a\cpython\cpython\Lib\threading.py", line 1153 in _wait_for_tstate_lock
  File "D:\a\cpython\cpython\Lib\threading.py", line 1133 in join
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 836 in shutdown
  File "D:\a\cpython\cpython\Lib\concurrent\futures\_base.py", line 647 in __exit__
  File "D:\a\cpython\cpython\Lib\test\test_concurrent_futures\test_deadlock.py", line 236 in test_crash_big_data
(...)


== Tests result: FAILURE ==

423 tests OK.

(...)

1 test failed:
    test.test_concurrent_futures.test_deadlock

(...)

0:34:47 Re-running 1 failed tests in verbose mode in subprocesses
0:34:47 Run tests in parallel using 4 child processes (timeout: 20 min, worker timeout: 25 min)
0:35:17 load avg: 0.06 running: test.test_concurrent_futures.test_deadlock (30.0 sec)
0:35:47 load avg: 0.10 running: test.test_concurrent_futures.test_deadlock (1 min)
0:36:17 load avg: 0.06 running: test.test_concurrent_futures.test_deadlock (1 min 30 sec)
0:36:47 load avg: 0.06 running: test.test_concurrent_futures.test_deadlock (2 min)
0:37:17 load avg: 0.04 running: test.test_concurrent_futures.test_deadlock (2 min 30 sec)
0:37:47 load avg: 0.04 running: test.test_concurrent_futures.test_deadlock (3 min)
0:38:17 load avg: 0.08 running: test.test_concurrent_futures.test_deadlock (3 min 30 sec)
(...)

0:52:47 load avg: 0.27 running: test.test_concurrent_futures.test_deadlock (18 min)
0:53:17 load avg: 0.59 running: test.test_concurrent_futures.test_deadlock (18 min 30 sec)
0:53:47 load avg: 0.41 running: test.test_concurrent_futures.test_deadlock (19 min)
0:54:17 load avg: 0.28 running: test.test_concurrent_futures.test_deadlock (19 min 30 sec)
0:54:47 load avg: 0.17 running: test.test_concurrent_futures.test_deadlock (20 min)
0:54:47 load avg: 0.17 [1/1/1] test.test_concurrent_futures.test_deadlock process crashed (Exit code 1)
Re-running test.test_concurrent_futures.test_deadlock in verbose mode
(...)

test_crash_at_task_unpickle (test.test_concurrent_futures.test_deadlock.ProcessPoolSpawnExecutorDeadlockTest.test_crash_at_task_unpickle) ... 0.47s ok
test_crash_big_data (test.test_concurrent_futures.test_deadlock.ProcessPoolSpawnExecutorDeadlockTest.test_crash_big_data) ... Warning -- Uncaught thread exception: InvalidStateError
Exception in thread Thread-2:
Traceback (most recent call last):
  File "D:\a\cpython\cpython\Lib\threading.py", line 1059, in _bootstrap_inner
    self.run()
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 344, in run
    self.terminate_broken(cause)
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 492, in terminate_broken
    work_item.future.set_exception(bpe)
  File "D:\a\cpython\cpython\Lib\concurrent\futures\_base.py", line 559, in set_exception
    raise InvalidStateError('{}: {!r}'.format(self._state, self))
concurrent.futures._base.InvalidStateError: CANCELLED: <Future at 0x1b7d175c150 state=cancelled>
0.52s Warning -- threading_cleanup() failed to cleanup 1 threads (count: 1, dangling: 2)
Warning -- Dangling thread: <Thread(QueueFeederThread, started daemon 7084)>
Warning -- Dangling thread: <_MainThread(MainThread, started 4784)>
ok
(...)

test_shutdown_deadlock_pickle (test.test_concurrent_futures.test_deadlock.ProcessPoolSpawnExecutorDeadlockTest.test_shutdown_deadlock_pickle) ... 0.53s ok
Warning -- threading_cleanup() failed to cleanup 1 threads (count: 1, dangling: 2)
Warning -- Dangling thread: <_MainThread(MainThread, started 4784)>
Warning -- Dangling thread: <Thread(QueueFeederThread, started daemon 7084)>
Timeout (0:20:00)!
Thread 0x00001bac (most recent call first):
  File "D:\a\cpython\cpython\Lib\multiprocessing\connection.py", line 282 in _send_bytes
  File "D:\a\cpython\cpython\Lib\multiprocessing\connection.py", line 199 in send_bytes
  File "D:\a\cpython\cpython\Lib\multiprocessing\queues.py", line 246 in _feed
  File "D:\a\cpython\cpython\Lib\threading.py", line 996 in run
  File "D:\a\cpython\cpython\Lib\threading.py", line 1059 in _bootstrap_inner
  File "D:\a\cpython\cpython\Lib\threading.py", line 1016 in _bootstrap

Thread 0x000012b0 (most recent call first):
  File "D:\a\cpython\cpython\Lib\threading.py", line 1153 in _wait_for_tstate_lock
  File "D:\a\cpython\cpython\Lib\threading.py", line 1133 in join
  File "D:\a\cpython\cpython\Lib\multiprocessing\queues.py", line 199 in _finalize_join
  File "D:\a\cpython\cpython\Lib\multiprocessing\util.py", line 224 in __call__
  File "D:\a\cpython\cpython\Lib\multiprocessing\util.py", line 300 in _run_finalizers
  File "D:\a\cpython\cpython\Lib\multiprocessing\util.py", line 488 in _cleanup_tests
  File "D:\a\cpython\cpython\Lib\unittest\case.py", line 139 in doModuleCleanups
  File "D:\a\cpython\cpython\Lib\unittest\suite.py", line 275 in _handleModuleTearDown
  File "D:\a\cpython\cpython\Lib\unittest\suite.py", line 131 in run
  File "D:\a\cpython\cpython\Lib\unittest\suite.py", line 84 in __call__
  File "D:\a\cpython\cpython\Lib\unittest\runner.py", line 240 in run
  File "D:\a\cpython\cpython\Lib\test\support\__init__.py", line 1142 in _run_suite
  File "D:\a\cpython\cpython\Lib\test\support\__init__.py", line 1269 in run_unittest
  File "D:\a\cpython\cpython\Lib\test\libregrtest\runtest.py", line 393 in run_unittest
(...)

1 re-run test:
    test.test_concurrent_futures.test_deadlock

Total duration: 54 min 47 sec
Total tests: run=40,332 skipped=2,283
Total test files: run=461/462 failed=1 skipped=36 resource_denied=2 rerun=1
Result: FAILURE then FAILURE

Explanation:

  1. The test suite is run: Run tests in parallel using 4 child processes (timeout: 20 min, worker timeout: 25 min)
  2. At 0:34:47, test_concurrent_futures.test_deadlock process crashed (Exit code 1) with a timeout: Timeout (0:20:00)!
  3. regrtest displays a a first report: 423 tests OK. but 1 test failed: test.test_concurrent_futures.test_deadlock
  4. regrtest decides to re-run the failed test: Re-running 1 failed tests in verbose mode in subprocesses and Run tests in parallel using 4 child processes. First new feature: failed tests are run in parallel with 4 processes!
  5. As expected, test_concurrent_futures hangs again (known bug, see issue test_concurrent_futures.test_deadlock: test_crash_big_data() hangs randomly on Windows #107219) and is killed with a timeout. Nothing surprising here.
  6. WAAAAAAAAAAAAAAIT for the last part!!! regrtest gets back the control, and report the crashed worker process as nothing happens and it writes its final report! That's the second new feature: regrtest main process no longer crash when the re-run test crash. Instead, the test is run in a worker progress, and if the work crash, that's fine: the main process takes back the control and just write its report, cleanup the temporary directory, etc.
Total tests: run=40,332 skipped=2,283
Total test files: run=461/462 failed=1 skipped=36 resource_denied=2 rerun=1
Result: FAILURE then FAILURE

Before that, the logs stopped at Timeout (0:20:00)! and so I had to read all logs to get the previous report, understand which tests failed, why a test crashed, etc.

@vstinner vstinner enabled auto-merge (squash) September 3, 2023 21:13
@vstinner vstinner merged commit 31c2945 into python:main Sep 3, 2023
@vstinner vstinner deleted the regrtest_rerun branch September 3, 2023 21:37
@vstinner vstinner added needs backport to 3.11 only security fixes needs backport to 3.12 bug and security fixes labels Sep 5, 2023
@miss-islington
Copy link
Contributor

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Sorry, @vstinner, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 31c2945f143c6b80c837fcf09a5cfb85fea9ea4c 3.12

@miss-islington
Copy link
Contributor

Sorry, @vstinner, I could not cleanly backport this to 3.11 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 31c2945f143c6b80c837fcf09a5cfb85fea9ea4c 3.11

vstinner added a commit to vstinner/cpython that referenced this pull request Sep 5, 2023
…#108839)

When using --rerun option, regrtest now re-runs failed tests
in verbose mode in fresh worker processes to have more
deterministic behavior. So it can write its final report even
if a test killed a worker progress.

Add --fail-rerun option to regrtest: exit with non-zero exit code
if a test failed pass passed when re-run in verbose mode (in a
fresh process). That's now more useful since tests can pass
when re-run in a fresh worker progress, whereas they failed
when run after other tests when tests are run sequentially.

Rename --verbose2 option (-w) to --rerun. Keep --verbose2 as a
deprecated alias.

Changes:

* Fix and enhance statistics in regrtest summary. Add "(filtered)"
  when --match and/or --ignore options are used.
* Add RunTests class.
* Add TestResult.get_rerun_match_tests() method
* Rewrite code to serialize/deserialize worker arguments as JSON
  using a new WorkerJob class.
* Fix stats when a test is run with --forever --rerun.
* If failed test names cannot be parsed, log a warning and don't
  filter tests.
* test_regrtest.test_rerun_success() now uses a marker file, since
  the test is re-run in a separated process.
* Add tests on normalize_test_name() function.
* Add test_success() and test_skip() tests to test_regrtest.

(cherry picked from commit 31c2945)
Yhg1s pushed a commit that referenced this pull request Sep 8, 2023
* gh-108834: regrtest reruns failed tests in subprocesses (#108839)

When using --rerun option, regrtest now re-runs failed tests
in verbose mode in fresh worker processes to have more
deterministic behavior. So it can write its final report even
if a test killed a worker progress.

Add --fail-rerun option to regrtest: exit with non-zero exit code
if a test failed pass passed when re-run in verbose mode (in a
fresh process). That's now more useful since tests can pass
when re-run in a fresh worker progress, whereas they failed
when run after other tests when tests are run sequentially.

Rename --verbose2 option (-w) to --rerun. Keep --verbose2 as a
deprecated alias.

Changes:

* Fix and enhance statistics in regrtest summary. Add "(filtered)"
  when --match and/or --ignore options are used.
* Add RunTests class.
* Add TestResult.get_rerun_match_tests() method
* Rewrite code to serialize/deserialize worker arguments as JSON
  using a new WorkerJob class.
* Fix stats when a test is run with --forever --rerun.
* If failed test names cannot be parsed, log a warning and don't
  filter tests.
* test_regrtest.test_rerun_success() now uses a marker file, since
  the test is re-run in a separated process.
* Add tests on normalize_test_name() function.
* Add test_success() and test_skip() tests to test_regrtest.

(cherry picked from commit 31c2945)

* gh-108834: regrtest --fail-rerun exits with code 5 (#108896)

When the --fail-rerun option is used and a test fails and then pass,
regrtest now uses exit code 5 ("rerun) instead of 2 ("bad test").

(cherry picked from commit 1170d5a)

* gh-108416: Mark slow but not CPU bound test methods with requires_resource('walltime') (GH-108480)

(cherry picked from commit 1e0d627)

* Manually sync Lib/test/libregrtest/ from main

---------

Co-authored-by: Serhiy Storchaka <[email protected]>
@picnixz
Copy link
Member

picnixz commented Sep 14, 2024

3.12 got its backport but 3.11 didn't get its backport but since it only receives security-fixes, it won't get any.

@picnixz picnixz removed needs backport to 3.11 only security fixes needs backport to 3.12 bug and security fixes labels Sep 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants