Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-46523: fix tests rerun when setUp[Class|Module] fails #30895

Merged
merged 6 commits into from
Apr 7, 2023

Conversation

sobolevn
Copy link
Member

@sobolevn sobolevn commented Jan 25, 2022

Problem

There were two problems:

  1. Tests were trying to rerun setUpClass as a test name. It is visible here: Re-running test_typing in verbose mode (matching: setUpClass)
  2. Rerun was not considered an error even if there were errors.

I've fixed both problems.

Local testing

setUpModule

» ./python.exe -m test -w -v test_typing
== CPython 3.11.0a4+ (heads/main-dirty:ef3ef6fa43, Jan 20 2022, 20:48:25) [Clang 11.0.0 (clang-1100.0.33.16)]
== macOS-10.14.6-x86_64-i386-64bit little-endian
== cwd: /Users/sobolev/Desktop/cpython/build/test_python_6745æ
== CPU count: 4
== encodings: locale=UTF-8, FS=utf-8
0:00:00 load avg: 9.85 Run tests sequentially
0:00:00 load avg: 9.85 [1/1] test_typing
setUpModule (test.test_typing) ... ERROR

======================================================================
ERROR: setUpModule (test.test_typing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/sobolev/Desktop/cpython/Lib/test/test_typing.py", line 43, in setUpModule
    raise TypeError('mo')
    ^^^^^^^^^^^^^^^^^^^^^
TypeError: mo

----------------------------------------------------------------------
Ran 0 tests in 0.004s

FAILED (errors=1)
test test_typing failed
test_typing failed (1 error)

== Tests result: FAILURE ==

1 test failed:
    test_typing
0:00:00 load avg: 9.85
0:00:00 load avg: 9.85 Re-running failed tests in verbose mode
0:00:00 load avg: 9.85 Re-running test_typing in verbose mode (matching: test_typing)
setUpModule (test.test_typing) ... ERROR

======================================================================
ERROR: setUpModule (test.test_typing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/sobolev/Desktop/cpython/Lib/test/test_typing.py", line 43, in setUpModule
    raise TypeError('mo')
    ^^^^^^^^^^^^^^^^^^^^^
TypeError: mo

----------------------------------------------------------------------
Ran 0 tests in 0.003s

FAILED (errors=1)
test test_typing failed
1 test failed again:
    test_typing

== Tests result: FAILURE then FAILURE ==

1 test failed:
    test_typing

1 re-run test:
    test_typing

Total duration: 378 ms
Tests result: FAILURE then FAILURE

setUpClass

https://gist.github.com/sobolevn/adcaf0d470226b8644f70ef81178e028

setUp

https://gist.github.com/sobolevn/acf04e0f7ea785e1fa4c45a94f545743

https://bugs.python.org/issue46523

@@ -4001,6 +4001,7 @@ def cleanup(self):

@classmethod
def setUpClass(cls):
raise ValueError
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional. Let's see the CI output, it should be red.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sobolevn sobolevn added needs backport to 3.9 only security fixes needs backport to 3.10 only security fixes skip news labels Jan 25, 2022
@sobolevn sobolevn requested a review from vstinner January 26, 2022 09:58
@vstinner
Copy link
Member

@ambv: This change is about the code that you wrote if I understood correctly.

@sobolevn sobolevn requested a review from ambv January 30, 2022 08:47
@serhiy-storchaka serhiy-storchaka added needs backport to 3.11 only security fixes and removed needs backport to 3.9 only security fixes labels May 20, 2022
@MaxwellDupre
Copy link
Contributor

I cant reproduce in 3.12.0a0 latest.
Also, bit confused your command line uses .exe but output is with Linux type directories.
The use case is not clear to me.

@sobolevn
Copy link
Member Author

sobolevn commented Jul 5, 2022

I cant reproduce in 3.12.0a0 latest.

I haven't tried it on 3.12, only 3.11.x

Also, bit confused your command line uses .exe but output is with Linux type directories.

I am on macos, since we have Python/ directory and macos does not recognise ./python executable and Python/ directory at the same time, we have to use .exe extension - just like Windows does.

The use case is not clear to me.

There was a problem with how tests were re-run. See #90681 (comment) for more details.

@vstinner
Copy link
Member

vstinner commented Apr 5, 2023

@ambv: You would mind to review this regrtest fix?

@sobolevn
Copy link
Member Author

sobolevn commented Apr 5, 2023

I can merge main later tomorrow to get this back into shape if someone wants to get this fixed :)

@vstinner
Copy link
Member

vstinner commented Apr 6, 2023

If you want me to review, I would feel more comfortable with a new test added to test_regrtest. This code is tricky and doesn't seem to be carefully tested.

@sobolevn
Copy link
Member Author

sobolevn commented Apr 6, 2023

Sure thing, will do! 👍
Happy to see that you are back!

@sobolevn
Copy link
Member Author

sobolevn commented Apr 7, 2023

@vstinner done!

I've added 8 new test cases:

  • setUpClass and tearDownClass (were affected, now fixed)
  • setUpModule and tearDownModule (were affected, now fixed)
  • setUp and tearDown (were not affected, were not tested, now are tested)
  • asyncSetUp and asyncTearDown (were not affected, were not tested, now are tested)

@ambv ambv merged commit 9953860 into python:main Apr 7, 2023
@miss-islington
Copy link
Contributor

Thanks @sobolevn for the PR, and @ambv for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.11.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Sorry, @sobolevn and @ambv, I could not cleanly backport this to 3.11 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 995386071f96e4cfebfa027a71ca9134e4651d2a 3.11

@miss-islington
Copy link
Contributor

Sorry @sobolevn and @ambv, I had trouble checking out the 3.10 backport branch.
Please retry by removing and re-adding the "needs backport to 3.10" label.
Alternatively, you can backport using cherry_picker on the command line.
cherry_picker 995386071f96e4cfebfa027a71ca9134e4651d2a 3.10

@ambv ambv removed the needs backport to 3.10 only security fixes label Apr 7, 2023
@ambv
Copy link
Contributor

ambv commented Apr 7, 2023

3.10 is out of scope already. 3.11 I'll merge myself.

@sobolevn
Copy link
Member Author

sobolevn commented Apr 7, 2023

Thanks everyone! 🎉
I am always happy when some old PR gets merged 😊

@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot s390x RHEL7 LTO + PGO 3.x has failed when building commit 9953860.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/244/builds/4176) and take a look at the build logs.
  4. Check if the failure is related to this commit (9953860) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/244/builds/4176

Failed tests:

  • test_asyncio
  • test_tools

Failed subtests:

  • test_subprocess_consistent_callbacks - test.test_asyncio.test_subprocess.SubprocessThreadedWatcherTests.test_subprocess_consistent_callbacks
  • test_freeze_simple_script - test.test_tools.test_freeze.TestFreeze.test_freeze_simple_script

Summary of the results of the build (if available):

== Tests result: FAILURE then FAILURE ==

409 tests OK.

10 slowest tests:

  • test_concurrent_futures: 2 min 25 sec
  • test_asyncio: 2 min 9 sec
  • test_multiprocessing_spawn: 2 min 4 sec
  • test_multiprocessing_forkserver: 1 min 33 sec
  • test_math: 1 min 31 sec
  • test_multiprocessing_fork: 1 min 12 sec
  • test_tokenize: 51.7 sec
  • test_signal: 48.5 sec
  • test_hashlib: 43.4 sec
  • test_io: 39.3 sec

1 test failed:
test_asyncio

23 tests skipped:
test_devpoll test_gdb test_idle test_ioctl test_kqueue
test_launcher test_msilib test_peg_generator test_perf_profiler
test_smtpnet test_ssl test_startfile test_tcl test_tix
test_tkinter test_ttk test_ttk_textonly test_turtle
test_winconsoleio test_winreg test_winsound test_wmi
test_zipfile64

2 re-run tests:
test_asyncio test_tools

Total duration: 13 min 17 sec

Click to see traceback logs
Traceback (most recent call last):
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Lib/test/test_asyncio/test_subprocess.py", line 770, in test_subprocess_consistent_callbacks
    self.loop.run_until_complete(main())
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Lib/asyncio/base_events.py", line 664, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Lib/test/test_asyncio/test_subprocess.py", line 762, in main
    self.assertEqual(events, [
AssertionError: Lists differ: [('pi[69 chars]), 'process_exited', 'pipe_connection_lost', '[17 chars]ost'] != [('pi[69 chars]), 'pipe_connection_lost', 'pipe_connection_lo[17 chars]ted']


Traceback (most recent call last):
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Lib/test/test_tools/test_freeze.py", line 27, in test_freeze_simple_script
    outdir, scriptfile, python = helper.prepare(script, outdir)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Tools/freeze/test/freeze.py", line 146, in prepare
    copy_source_tree(srcdir, SRCDIR)
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Tools/freeze/test/freeze.py", line 95, in copy_source_tree
    shutil.copytree(oldroot, newroot, ignore=ignore_non_src)
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Lib/shutil.py", line 563, in copytree
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Lib/shutil.py", line 517, in _copytree
    raise Error(errors)
shutil.Error: [(<DirEntry 'ziptest2dir'>, '/tmp/test_python_ata8db4y/tmpza4jrie3/cpython/build/test_python_41350æ/test_python_worker_49510æ/tempcwd/ziptest2dir', "[Errno 2] No such file or directory: '/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/build/test_python_41350æ/test_python_worker_49510æ/tempcwd/ziptest2dir'"), (<DirEntry 'tempcwd'>, '/tmp/test_python_ata8db4y/tmpza4jrie3/cpython/build/test_python_41350æ/test_python_worker_49510æ/tempcwd', "[Errno 2] No such file or directory: '/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/build/test_python_41350æ/test_python_worker_49510æ/tempcwd'")]


Traceback (most recent call last):
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Lib/test/test_asyncio/test_subprocess.py", line 770, in test_subprocess_consistent_callbacks
    self.loop.run_until_complete(main())
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Lib/asyncio/base_events.py", line 664, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto-pgo/build/Lib/test/test_asyncio/test_subprocess.py", line 762, in main
    self.assertEqual(events, [
AssertionError: Lists differ: [('pi[29 chars]t'), 'pipe_connection_lost', ('pipe_data_recei[57 chars]ted'] != [('pi[29 chars]t'), ('pipe_data_received', 2, b'stderr'), 'pi[57 chars]ted']

@sobolevn
Copy link
Member Author

sobolevn commented Apr 7, 2023

See #103170 for this buildbot failure.

@bedevere-bot
Copy link

GH-103342 is a backport of this pull request to the 3.11 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Apr 7, 2023
ambv pushed a commit to ambv/cpython that referenced this pull request Apr 7, 2023
…ythonGH-30895)

Co-authored-by: Jelle Zijlstra <[email protected]>
Co-authored-by: Łukasz Langa <[email protected]>
(cherry picked from commit 9953860)

Co-authored-by: Nikita Sobolev <[email protected]>
ambv added a commit that referenced this pull request Apr 7, 2023
warsaw pushed a commit to warsaw/cpython that referenced this pull request Apr 11, 2023
vstinner added a commit to vstinner/cpython that referenced this pull request Sep 2, 2023
vstinner pushed a commit to vstinner/cpython that referenced this pull request Sep 2, 2023
…0895)

Co-authored-by: Jelle Zijlstra <[email protected]>
Co-authored-by: Łukasz Langa <[email protected]>
(cherry picked from commit 9953860)
vstinner added a commit that referenced this pull request Sep 3, 2023
…108820)

* Revert "[3.11] gh-101634: regrtest reports decoding error as failed test (#106169) (#106175)"

This reverts commit d5418e9.

* Revert "[3.11] bpo-46523: fix tests rerun when `setUp[Class|Module]` fails (GH-30895) (GH-103342)"

This reverts commit ecb09a8.

* Revert "gh-95027: Fix regrtest stdout encoding on Windows (GH-98492)"

This reverts commit b2aa28e.

* Revert "[3.11] gh-94026: Buffer regrtest worker stdout in temporary file (GH-94253) (GH-94408)"

This reverts commit 0122ab2.

* Revert "Run Tools/scripts/reindent.py (GH-94225)"

This reverts commit f0f3a42.

* Revert "gh-94052: Don't re-run failed tests with --python option (GH-94054)"

This reverts commit 1347607.

* Revert "[3.11] gh-84461: Fix Emscripten umask and permission issues (GH-94002) (GH-94006)"

This reverts commit 1073184.

* gh-93353: regrtest checks for leaked temporary files (#93776)

When running tests with -jN, create a temporary directory per process
and mark a test as "environment changed" if a test leaks a temporary
file or directory.

(cherry picked from commit e566ce5)

* gh-93353: Fix regrtest for -jN with N >= 2 (GH-93813)

(cherry picked from commit 36934a1)

* gh-93353: regrtest supports checking tmp files with -j2 (#93909)

regrtest now also implements checking for leaked temporary files and
directories when using -jN for N >= 2. Use tempfile.mkdtemp() to
create the temporary directory. Skip this check on WASI.

(cherry picked from commit 4f85cec)

* gh-84461: Fix Emscripten umask and permission issues (GH-94002)

- Emscripten's default umask is too strict, see
  emscripten-core/emscripten#17269
- getuid/getgid and geteuid/getegid are stubs that always return 0
  (root). Disable effective uid/gid syscalls and fix tests that use
  chmod() current user.
- Cannot drop X bit from directory.

(cherry picked from commit 2702e40)

* gh-94052: Don't re-run failed tests with --python option (#94054)

(cherry picked from commit 0ff7b99)

* Run Tools/scripts/reindent.py (#94225)

Reindent files which were not properly formatted (PEP 8: 4 spaces).

Remove also some trailing spaces.

(cherry picked from commit e87ada4)

* gh-94026: Buffer regrtest worker stdout in temporary file (GH-94253)

Co-authored-by: Victor Stinner <[email protected]>
(cherry picked from commit 199ba23)

* gh-96465: Clear fractions hash lru_cache under refleak testing (GH-96689)

Automerge-Triggered-By: GH:zware
(cherry picked from commit 9c8f379)

* gh-95027: Fix regrtest stdout encoding on Windows (#98492)

On Windows, when the Python test suite is run with the -jN option,
the ANSI code page is now used as the encoding for the stdout
temporary file, rather than using UTF-8 which can lead to decoding
errors.

(cherry picked from commit ec1f6f5)

* gh-98903: Test suite fails with exit code 4 if no tests ran (#98904)

The Python test suite now fails wit exit code 4 if no tests ran. It
should help detecting typos in test names and test methods.

* Add "EXITCODE_" constants to Lib/test/libregrtest/main.py.
* Fix a typo: "NO TEST RUN" becomes "NO TESTS RAN"

(cherry picked from commit c76db37)

* gh-100086: Add build info to test.libregrtest (#100093)

The Python test runner (libregrtest) now logs Python build information like
"debug" vs "release" build, or LTO and PGO optimizations.

(cherry picked from commit 3c89202)

* bpo-46523: fix tests rerun when `setUp[Class|Module]` fails (#30895)

Co-authored-by: Jelle Zijlstra <[email protected]>
Co-authored-by: Łukasz Langa <[email protected]>
(cherry picked from commit 9953860)

* gh-82054: allow test runner to split test_asyncio to execute in parallel by sharding. (#103927)

This runs test_asyncio sub-tests in parallel using sharding from Cinder. This suite is typically the longest-pole in runs because it is a test package with a lot of further sub-tests otherwise run serially. By breaking out the sub-tests as independent modules we can run a lot more in parallel.

After porting we can see the direct impact on a multicore system.

Without this change:
  Running make test is 5 min 26 seconds
With this change:
  Running make test takes 3 min 39 seconds

That'll vary based on system and parallelism. On a `-j 4` run similar to what CI and buildbot systems often do, it reduced the overall test suite completion latency by 10%.

The drawbacks are that this implementation is hacky and due to the sorting of the tests it obscures when the asyncio tests occur and involves changing CPython test infrastructure but, the wall time saved it is worth it, especially in low-core count CI runs as it pulls a long tail. The win for productivity and reserved CI resource usage is significant.

Future tests that deserve to be refactored into split up suites to benefit from are test_concurrent_futures and the way the _test_multiprocessing suite gets run for all start methods. As exposed by passing the -o flag to python -m test to get a list of the 10 longest running tests.

---------

Co-authored-by: Carl Meyer <[email protected]>
Co-authored-by: Gregory P. Smith <[email protected]> [Google, LLC]
(cherry picked from commit 9e011e7)

* Display the sanitizer config in the regrtest header. (#105301)

Display the sanitizers present in libregrtest.

Having this in the CI output for tests with the relevant environment
variable displayed will help make it easier to do what we need to
create an equivalent local test run.

(cherry picked from commit 852348a)

* gh-101634: regrtest reports decoding error as failed test (#106169)

When running the Python test suite with -jN option, if a worker stdout
cannot be decoded from the locale encoding report a failed testn so the
exitcode is non-zero.

(cherry picked from commit 2ac3eec)

* gh-108223: test.pythoninfo and libregrtest log Py_NOGIL (#108238)

Enable with --disable-gil --without-pydebug:

    $ make pythoninfo|grep NOGIL
    sysconfig[Py_NOGIL]: 1

    $ ./python -m test
    ...
    == Python build: nogil debug
    ...

(cherry picked from commit 5afe0c1)

* gh-90791: test.pythoninfo logs ASAN_OPTIONS env var (#108289)

* Cleanup libregrtest code logging ASAN_OPTIONS.
* Fix a typo on "ASAN_OPTIONS" vs "MSAN_OPTIONS".

(cherry picked from commit 3a1ac87)

* gh-108388: regrtest splits test_asyncio package (#108393)

Currently, test_asyncio package is only splitted into sub-tests when
using command "./python -m test". With this change, it's also
splitted when passing it on the command line:
"./python -m test test_asyncio".

Remove the concept of "STDTESTS". Python is now mature enough to not
have to bother with that anymore. Removing STDTESTS simplify the
code.

(cherry picked from commit 174e9da)

* regrtest computes statistics (#108793)

test_netrc, test_pep646_syntax and test_xml_etree now return results
in the test_main() function.

Changes:

* Rewrite TestResult as a dataclass with a new State class.
* Add test.support.TestStats class and Regrtest.stats_dict attribute.
* libregrtest.runtest functions now modify a TestResult instance
  in-place.
* libregrtest summary lists the number of run tests and skipped
  tests, and denied resources.
* Add TestResult.has_meaningful_duration() method.
* Compute TestResult duration in the upper function.
* Use time.perf_counter() instead of time.monotonic().
* Regrtest: rename 'resource_denieds' attribute to 'resource_denied'.
* Rename CHILD_ERROR to MULTIPROCESSING_ERROR.
* Use match/case syntadx to have different code depending on the
  test state.

Co-authored-by: Alex Waygood <[email protected]>
(cherry picked from commit d4e534c)

* gh-108822: Add Changelog entry for regrtest statistics (#108821)

---------

Co-authored-by: Christian Heimes <[email protected]>
Co-authored-by: Zachary Ware <[email protected]>
Co-authored-by: Nikita Sobolev <[email protected]>
Co-authored-by: Joshua Herman <[email protected]>
Co-authored-by: Gregory P. Smith <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants