gh-108834: regrtest reruns failed tests in subprocesses #108839

vstinner · 2023-09-03T07:19:31Z

Rename --verbose2 option (-w) to --rerun. Keep --verbose2 as a deprecated alias.

Fix statistics in regrtest summary.

Changes:

Add RunTests class.
Add TestResult.get_rerun_match_tests() method
Rewrite code to serialize/deserialize worker arguments as JSON using a new WorkerJob class.
Fix stats when a test is run with --forever --rerun.
If failed test names cannot be parsed, log a warning and don't filter tests.
test_regrtest.test_rerun_success() now uses a marker file, since the test is re-run in a separated process.
Add tests on normalize_test_name() function.

Issue: regrtest should re-run failed tests in subprocesses #108834

vstinner · 2023-09-03T07:19:40Z

Rename --verbose2 option (-w) to --rerun. Keep --verbose2 as a deprecated alias. Changes: * Fix and enhance statistics in regrtest summary. Add "(filtered)" when --match and/or --ignore options are used. * Add RunTests class. * Add TestResult.get_rerun_match_tests() method * Rewrite code to serialize/deserialize worker arguments as JSON using a new WorkerJob class. * Fix stats when a test is run with --forever --rerun. * If failed test names cannot be parsed, log a warning and don't filter tests. * test_regrtest.test_rerun_success() now uses a marker file, since the test is re-run in a separated process. * Add tests on normalize_test_name() function.

Add test_success() and test_skip() tests.

vstinner · 2023-09-03T17:57:18Z

Changes in statistics:

Total tests now calls all runs of the same test file, previously only latest stats of the latest run of a test file was counted.
Total test files: replaced success=xxx with run=xxx.

Example without rerun:

$ ./python -m test test_sys --forever 
(...)
Total duration: 3.6 sec
Total tests: run=115 failures=1 skipped=12
Total test files: run=2 failed=1
Result: FAILURE

Example with rerun:

$ ./python -m test test_sys --forever --rerun
(...)
Total duration: 4.0 sec
Total tests: run=116 failures=1 skipped=12
Total test files: run=3 rerun=1
Result: FAILURE then SUCCESS

Difference when rerun is used:

Total tests: run=115 failures=1 skipped=12 => Total tests: run=116 failures=1 skipped=12
Total test files: run=2 failed=1 => Total test files: run=3 rerun=1

Comparison with the main branch:

Without rerun:

$ ./python -m test test_sys --forever 
(...)
Total duration: 4.9 sec
Total tests: run=72 failures=2 skipped=7
Total test files: success=1 failed=1

With rerun:

 ./python -m test test_sys --forever --rerun
(...)
Total duration: 4.9 sec
Total tests: run=2 failures=2
Total test files: success=1 failed=1 rerun=1
Result: FAILURE then FAILURE

Comparison main => this PR with forever (but without --rerun):

Total tests: run=72 failures=2 skipped=7 => Total tests: run=115 failures=1 skipped=12
Total test files: success=1 failed=1 => Total test files: run=2 failed=1

Comparison main => this PR with forever and rerun:

Total tests: run=2 failures=2 => Total tests: run=116 failures=1 skipped=12
Total test files: success=1 failed=1 rerun=1 => Total test files: run=3 rerun=1

vstinner · 2023-09-03T21:11:00Z

Tests / Windows (x64) (pull_request) Failing

Yeah! That's exactly why I wrote this change. It worked as expected!

0:00:00 Run tests in parallel using 4 child processes (timeout: 20 min, worker timeout: 25 min)
0:00:01 [  1/462] test_syslog skipped
test_syslog skipped -- No module named 'syslog'
0:00:01 [  2/462] test_type_cache passed
0:00:02 [  3/462] test_memoryview passed
0:00:03 [  4/462] test_sched passed
(...)
0:33:01 load avg: 0.05 running: test.test_concurrent_futures.test_deadlock (18 min 14 sec)
0:33:31 load avg: 0.06 running: test.test_concurrent_futures.test_deadlock (18 min 44 sec)
0:34:01 load avg: 0.05 running: test.test_concurrent_futures.test_deadlock (19 min 14 sec)
0:34:31 load avg: 0.03 running: test.test_concurrent_futures.test_deadlock (19 min 44 sec)
0:34:47 load avg: 0.05 [462/462/1] test.test_concurrent_futures.test_deadlock process crashed (Exit code 1)
Timeout (0:20:00)!
Thread 0x000002e8 (most recent call first):
  File "D:\a\cpython\cpython\Lib\multiprocessing\connection.py", line 282 in _send_bytes
  File "D:\a\cpython\cpython\Lib\multiprocessing\connection.py", line 199 in send_bytes
  File "D:\a\cpython\cpython\Lib\multiprocessing\queues.py", line 246 in _feed
  File "D:\a\cpython\cpython\Lib\threading.py", line 996 in run
  File "D:\a\cpython\cpython\Lib\threading.py", line 1059 in _bootstrap_inner
  File "D:\a\cpython\cpython\Lib\threading.py", line 1016 in _bootstrap

Thread 0x00001ad8 (most recent call first):
  File "D:\a\cpython\cpython\Lib\threading.py", line 1153 in _wait_for_tstate_lock
  File "D:\a\cpython\cpython\Lib\threading.py", line 1133 in join
  File "D:\a\cpython\cpython\Lib\multiprocessing\queues.py", line 199 in _finalize_join
  File "D:\a\cpython\cpython\Lib\multiprocessing\util.py", line 224 in __call__
  File "D:\a\cpython\cpython\Lib\multiprocessing\queues.py", line 151 in join_thread
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 553 in join_executor_internals
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 507 in terminate_broken
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 344 in run
  File "D:\a\cpython\cpython\Lib\threading.py", line 1059 in _bootstrap_inner
  File "D:\a\cpython\cpython\Lib\threading.py", line 1016 in _bootstrap

Thread 0x0000158c (most recent call first):
  File "D:\a\cpython\cpython\Lib\threading.py", line 1153 in _wait_for_tstate_lock
  File "D:\a\cpython\cpython\Lib\threading.py", line 1133 in join
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 836 in shutdown
  File "D:\a\cpython\cpython\Lib\concurrent\futures\_base.py", line 647 in __exit__
  File "D:\a\cpython\cpython\Lib\test\test_concurrent_futures\test_deadlock.py", line 236 in test_crash_big_data
(...)


== Tests result: FAILURE ==

423 tests OK.

(...)

1 test failed:
    test.test_concurrent_futures.test_deadlock

(...)

0:34:47 Re-running 1 failed tests in verbose mode in subprocesses
0:34:47 Run tests in parallel using 4 child processes (timeout: 20 min, worker timeout: 25 min)
0:35:17 load avg: 0.06 running: test.test_concurrent_futures.test_deadlock (30.0 sec)
0:35:47 load avg: 0.10 running: test.test_concurrent_futures.test_deadlock (1 min)
0:36:17 load avg: 0.06 running: test.test_concurrent_futures.test_deadlock (1 min 30 sec)
0:36:47 load avg: 0.06 running: test.test_concurrent_futures.test_deadlock (2 min)
0:37:17 load avg: 0.04 running: test.test_concurrent_futures.test_deadlock (2 min 30 sec)
0:37:47 load avg: 0.04 running: test.test_concurrent_futures.test_deadlock (3 min)
0:38:17 load avg: 0.08 running: test.test_concurrent_futures.test_deadlock (3 min 30 sec)
(...)

0:52:47 load avg: 0.27 running: test.test_concurrent_futures.test_deadlock (18 min)
0:53:17 load avg: 0.59 running: test.test_concurrent_futures.test_deadlock (18 min 30 sec)
0:53:47 load avg: 0.41 running: test.test_concurrent_futures.test_deadlock (19 min)
0:54:17 load avg: 0.28 running: test.test_concurrent_futures.test_deadlock (19 min 30 sec)
0:54:47 load avg: 0.17 running: test.test_concurrent_futures.test_deadlock (20 min)
0:54:47 load avg: 0.17 [1/1/1] test.test_concurrent_futures.test_deadlock process crashed (Exit code 1)
Re-running test.test_concurrent_futures.test_deadlock in verbose mode
(...)

test_crash_at_task_unpickle (test.test_concurrent_futures.test_deadlock.ProcessPoolSpawnExecutorDeadlockTest.test_crash_at_task_unpickle) ... 0.47s ok
test_crash_big_data (test.test_concurrent_futures.test_deadlock.ProcessPoolSpawnExecutorDeadlockTest.test_crash_big_data) ... Warning -- Uncaught thread exception: InvalidStateError
Exception in thread Thread-2:
Traceback (most recent call last):
  File "D:\a\cpython\cpython\Lib\threading.py", line 1059, in _bootstrap_inner
    self.run()
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 344, in run
    self.terminate_broken(cause)
  File "D:\a\cpython\cpython\Lib\concurrent\futures\process.py", line 492, in terminate_broken
    work_item.future.set_exception(bpe)
  File "D:\a\cpython\cpython\Lib\concurrent\futures\_base.py", line 559, in set_exception
    raise InvalidStateError('{}: {!r}'.format(self._state, self))
concurrent.futures._base.InvalidStateError: CANCELLED: <Future at 0x1b7d175c150 state=cancelled>
0.52s Warning -- threading_cleanup() failed to cleanup 1 threads (count: 1, dangling: 2)
Warning -- Dangling thread: <Thread(QueueFeederThread, started daemon 7084)>
Warning -- Dangling thread: <_MainThread(MainThread, started 4784)>
ok
(...)

test_shutdown_deadlock_pickle (test.test_concurrent_futures.test_deadlock.ProcessPoolSpawnExecutorDeadlockTest.test_shutdown_deadlock_pickle) ... 0.53s ok
Warning -- threading_cleanup() failed to cleanup 1 threads (count: 1, dangling: 2)
Warning -- Dangling thread: <_MainThread(MainThread, started 4784)>
Warning -- Dangling thread: <Thread(QueueFeederThread, started daemon 7084)>
Timeout (0:20:00)!
Thread 0x00001bac (most recent call first):
  File "D:\a\cpython\cpython\Lib\multiprocessing\connection.py", line 282 in _send_bytes
  File "D:\a\cpython\cpython\Lib\multiprocessing\connection.py", line 199 in send_bytes
  File "D:\a\cpython\cpython\Lib\multiprocessing\queues.py", line 246 in _feed
  File "D:\a\cpython\cpython\Lib\threading.py", line 996 in run
  File "D:\a\cpython\cpython\Lib\threading.py", line 1059 in _bootstrap_inner
  File "D:\a\cpython\cpython\Lib\threading.py", line 1016 in _bootstrap

Thread 0x000012b0 (most recent call first):
  File "D:\a\cpython\cpython\Lib\threading.py", line 1153 in _wait_for_tstate_lock
  File "D:\a\cpython\cpython\Lib\threading.py", line 1133 in join
  File "D:\a\cpython\cpython\Lib\multiprocessing\queues.py", line 199 in _finalize_join
  File "D:\a\cpython\cpython\Lib\multiprocessing\util.py", line 224 in __call__
  File "D:\a\cpython\cpython\Lib\multiprocessing\util.py", line 300 in _run_finalizers
  File "D:\a\cpython\cpython\Lib\multiprocessing\util.py", line 488 in _cleanup_tests
  File "D:\a\cpython\cpython\Lib\unittest\case.py", line 139 in doModuleCleanups
  File "D:\a\cpython\cpython\Lib\unittest\suite.py", line 275 in _handleModuleTearDown
  File "D:\a\cpython\cpython\Lib\unittest\suite.py", line 131 in run
  File "D:\a\cpython\cpython\Lib\unittest\suite.py", line 84 in __call__
  File "D:\a\cpython\cpython\Lib\unittest\runner.py", line 240 in run
  File "D:\a\cpython\cpython\Lib\test\support\__init__.py", line 1142 in _run_suite
  File "D:\a\cpython\cpython\Lib\test\support\__init__.py", line 1269 in run_unittest
  File "D:\a\cpython\cpython\Lib\test\libregrtest\runtest.py", line 393 in run_unittest
(...)

1 re-run test:
    test.test_concurrent_futures.test_deadlock

Total duration: 54 min 47 sec
Total tests: run=40,332 skipped=2,283
Total test files: run=461/462 failed=1 skipped=36 resource_denied=2 rerun=1
Result: FAILURE then FAILURE

Explanation:

The test suite is run: Run tests in parallel using 4 child processes (timeout: 20 min, worker timeout: 25 min)
At 0:34:47, test_concurrent_futures.test_deadlock process crashed (Exit code 1) with a timeout: Timeout (0:20:00)!
regrtest displays a a first report: 423 tests OK. but 1 test failed: test.test_concurrent_futures.test_deadlock
regrtest decides to re-run the failed test: Re-running 1 failed tests in verbose mode in subprocesses and Run tests in parallel using 4 child processes. First new feature: failed tests are run in parallel with 4 processes!
As expected, test_concurrent_futures hangs again (known bug, see issue test_concurrent_futures.test_deadlock: test_crash_big_data() hangs randomly on Windows #107219) and is killed with a timeout. Nothing surprising here.
WAAAAAAAAAAAAAAIT for the last part!!! regrtest gets back the control, and report the crashed worker process as nothing happens and it writes its final report! That's the second new feature: regrtest main process no longer crash when the re-run test crash. Instead, the test is run in a worker progress, and if the work crash, that's fine: the main process takes back the control and just write its report, cleanup the temporary directory, etc.

Total tests: run=40,332 skipped=2,283
Total test files: run=461/462 failed=1 skipped=36 resource_denied=2 rerun=1
Result: FAILURE then FAILURE

Before that, the logs stopped at Timeout (0:20:00)! and so I had to read all logs to get the previous report, understand which tests failed, why a test crashed, etc.

miss-islington · 2023-09-05T22:32:12Z

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

miss-islington · 2023-09-05T22:32:13Z

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

miss-islington · 2023-09-05T22:32:15Z

Sorry, @vstinner, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 31c2945f143c6b80c837fcf09a5cfb85fea9ea4c 3.12

miss-islington · 2023-09-05T22:32:18Z

Sorry, @vstinner, I could not cleanly backport this to 3.11 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 31c2945f143c6b80c837fcf09a5cfb85fea9ea4c 3.11

…#108839) When using --rerun option, regrtest now re-runs failed tests in verbose mode in fresh worker processes to have more deterministic behavior. So it can write its final report even if a test killed a worker progress. Add --fail-rerun option to regrtest: exit with non-zero exit code if a test failed pass passed when re-run in verbose mode (in a fresh process). That's now more useful since tests can pass when re-run in a fresh worker progress, whereas they failed when run after other tests when tests are run sequentially. Rename --verbose2 option (-w) to --rerun. Keep --verbose2 as a deprecated alias. Changes: * Fix and enhance statistics in regrtest summary. Add "(filtered)" when --match and/or --ignore options are used. * Add RunTests class. * Add TestResult.get_rerun_match_tests() method * Rewrite code to serialize/deserialize worker arguments as JSON using a new WorkerJob class. * Fix stats when a test is run with --forever --rerun. * If failed test names cannot be parsed, log a warning and don't filter tests. * test_regrtest.test_rerun_success() now uses a marker file, since the test is re-run in a separated process. * Add tests on normalize_test_name() function. * Add test_success() and test_skip() tests to test_regrtest. (cherry picked from commit 31c2945)

* gh-108834: regrtest reruns failed tests in subprocesses (#108839) When using --rerun option, regrtest now re-runs failed tests in verbose mode in fresh worker processes to have more deterministic behavior. So it can write its final report even if a test killed a worker progress. Add --fail-rerun option to regrtest: exit with non-zero exit code if a test failed pass passed when re-run in verbose mode (in a fresh process). That's now more useful since tests can pass when re-run in a fresh worker progress, whereas they failed when run after other tests when tests are run sequentially. Rename --verbose2 option (-w) to --rerun. Keep --verbose2 as a deprecated alias. Changes: * Fix and enhance statistics in regrtest summary. Add "(filtered)" when --match and/or --ignore options are used. * Add RunTests class. * Add TestResult.get_rerun_match_tests() method * Rewrite code to serialize/deserialize worker arguments as JSON using a new WorkerJob class. * Fix stats when a test is run with --forever --rerun. * If failed test names cannot be parsed, log a warning and don't filter tests. * test_regrtest.test_rerun_success() now uses a marker file, since the test is re-run in a separated process. * Add tests on normalize_test_name() function. * Add test_success() and test_skip() tests to test_regrtest. (cherry picked from commit 31c2945) * gh-108834: regrtest --fail-rerun exits with code 5 (#108896) When the --fail-rerun option is used and a test fails and then pass, regrtest now uses exit code 5 ("rerun) instead of 2 ("bad test"). (cherry picked from commit 1170d5a) * gh-108416: Mark slow but not CPU bound test methods with requires_resource('walltime') (GH-108480) (cherry picked from commit 1e0d627) * Manually sync Lib/test/libregrtest/ from main --------- Co-authored-by: Serhiy Storchaka <[email protected]>

picnixz · 2024-09-14T18:08:32Z

3.12 got its backport but 3.11 didn't get its backport but since it only receives security-fixes, it won't get any.

vstinner requested a review from rhettinger as a code owner September 3, 2023 07:19

bedevere-bot mentioned this pull request Sep 3, 2023

regrtest should re-run failed tests in subprocesses #108834

Closed

bedevere-bot added the awaiting core review label Sep 3, 2023

rhettinger removed their request for review September 3, 2023 15:53

vstinner force-pushed the regrtest_rerun branch from e132cde to 1ae6964 Compare September 3, 2023 16:42

Fix "skipped=count" stats

d96ebfa

Add test_success() and test_skip() tests.

vstinner added 2 commits September 3, 2023 20:06

Fix "Total test files" total and fix test_regrtest

c7b7853

Add --fail-rerun option

d6296b8

vstinner enabled auto-merge (squash) September 3, 2023 21:13

vstinner merged commit 31c2945 into python:main Sep 3, 2023

vstinner deleted the regrtest_rerun branch September 3, 2023 21:37

bedevere-bot removed the awaiting core review label Sep 3, 2023

vstinner added needs backport to 3.11 only security fixes needs backport to 3.12 bug and security fixes labels Sep 5, 2023

miss-islington assigned vstinner Sep 5, 2023

vstinner mentioned this pull request Sep 5, 2023

[3.12] gh-108834: Sync libregrtest with the main branch #108966

Merged

picnixz removed needs backport to 3.11 only security fixes needs backport to 3.12 bug and security fixes labels Sep 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gh-108834: regrtest reruns failed tests in subprocesses #108839

gh-108834: regrtest reruns failed tests in subprocesses #108839

vstinner commented Sep 3, 2023 •

edited by bedevere-bot

Loading

vstinner commented Sep 3, 2023

vstinner commented Sep 3, 2023

vstinner commented Sep 3, 2023

miss-islington commented Sep 5, 2023

miss-islington commented Sep 5, 2023

miss-islington commented Sep 5, 2023

miss-islington commented Sep 5, 2023

picnixz commented Sep 14, 2024

gh-108834: regrtest reruns failed tests in subprocesses #108839

gh-108834: regrtest reruns failed tests in subprocesses #108839

Conversation

vstinner commented Sep 3, 2023 • edited by bedevere-bot Loading

vstinner commented Sep 3, 2023

vstinner commented Sep 3, 2023

vstinner commented Sep 3, 2023

miss-islington commented Sep 5, 2023

miss-islington commented Sep 5, 2023

miss-islington commented Sep 5, 2023

miss-islington commented Sep 5, 2023

picnixz commented Sep 14, 2024

vstinner commented Sep 3, 2023 •

edited by bedevere-bot

Loading