stopTest issue with Python 3.12.1 and skipped tests #277

sodul · 2023-12-12T00:19:59Z

We have a unittest that is decorated with @unittest.skipIf(...), and when green is collecting tests results in stopTest() we get an AttributeError for start_time on this line:
self.test_time = str(time.time() - self.start_time).

Since self.start_time is being set in startTest() but this seems to be skipped on skipped tests, but stopTest() is still being called, it looks like it is a new compatibility issue.

If I silence the AttributeError, we get an other one later on proto_test_result.errors in addProtoTestResult(), so the fix might get tricky.

I will take a look at a permanent fix after 3.5.0 gets released. For new we have commented out the skipped test in our codebase so we can use 3.12.1.

https://docs.python.org/3/whatsnew/changelog.html#python-3-12-1-final

I suspect it might be caused by this change: python/cpython#109725

The text was updated successfully, but these errors were encountered:

CleanCut · 2023-12-15T19:32:02Z

We should be able to check if self.start_time exists in stopTest(), and set self.test_time to a zero duration if it does not.

sodul · 2023-12-19T19:05:10Z

For the short term that should work.

Something I was thinking about looking into is to have the classes used by green to inherit from the official classes so that they will be more future proof. But I would do that after cleaning up the code from all pre 3.8 support to get the codebase cleaned up and organized.

sodul · 2024-01-13T04:42:50Z

My plan is to get v4.0.0 released based of the current main in the next few days so that we can have support for python 3.12.0 with a note that 3.12.1 is known to break on skipped tests. This should allow most users of green to be able to use green with python 3.12 while I work on fixing the skipped tests issue. I will probably release v4.1.0 a few days later with this issue fixed.

sodul · 2024-01-16T22:24:32Z

I've released Green 4.0.0 with python 3.12 support. Now that the minimal python version is 3.8 I should be able to refactor the code more easily and will work on the 3.12.1 compatibility issue this week.

sodul · 2024-01-22T06:45:25Z

I made some progress on this issue. I added a test with the skip annotation to reproduce the bug and I'm able to run ./g -s1 green with python 3.12.1, but without -s1.

> ./g -s1 green   
..........................................ssssssss........................................................................s
[ ... snip ... ]
Ran 253 tests in 4.727s using 1 process

OK (passes=114, skips=9)

And:

> ./g green    
..........................................ssssssss.................................................................[stuck]

After looking at green/result.py and unittest/result.py, it is clear that the internals have drifted upstream so my current approach is to have green's BaseTestResult inherit from unittest.TestResult and call super().<method>() where needed. I believe this is the approach that will provide the most forward compatibility while potentially reducing duplicated code.

I had to add some type annotation to help me track things, but this is triggering some cascading error reporting and I do not want to fix everything at once here.

sodul · 2024-01-22T06:51:53Z

-s2 seems ok

> ./g -s3 green
..........................................ssssssss.................................................................^C^CTraceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/stephane/repos/clumio-oss/green/green/cmdline.py", line 103, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/stephane/repos/clumio-oss/green/green/cmdline.py", line 99, in main
    return _main(argv, testing)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/stephane/repos/clumio-oss/green/green/cmdline.py", line 67, in _main
    result = run(test_suite, stream, args, testing)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/stephane/repos/clumio-oss/green/green/runner.py", line 134, in run
    proto_test_result: ProtoTest | ProtoTestResult = queue.get()
                                                     ^^^^^^^^^^^

It seems that it is having an issue with the tests in test_runner.py. I might have to update the mocks there since python -m unittest green/test/test_runner.py is getting stuck as well.

sodul · 2024-01-25T05:52:37Z

My original idea to have green's BaseTestResult class inherit from unittest.TestResult seems to get the tests to freeze on the internal tests with macOS and Windows, but not in Linux according to GH actions. I'll shelve this for now and will re-implement the new attributes added in Python 3.12.1 as a short term fix. Longer term I do believe inheriting is the proper approach but we will need refactor the internal queueing code of green to properly identify where the race condition occurs. Thanks to mypy some issues have been uncovered but it will take some time to implement all this refactoring.

main...clumio-oss:green:python-3.12.1

CleanCut · 2024-01-25T21:14:39Z

I think it would be great to mostly inherit and/or use upstream implementations. I tried to do that as much as I could...but it wouldn't work in a lot of places. Maybe it will work better now.

sodul · 2024-01-26T17:48:17Z

There is something odd with the ProtoTestResult queue when I use py3.12.1, after adding the missing attributes, now I'm getting errors because we get None and ProtoTest instances passed to addProtoTestResult(), which then leads to further errors. I tried to 'ignore' by checking for these unexpected types, but then we get stuck on queue.get(). I'm currently looking at adding more type annotation to better understand how the queue is constructed. I suspect that the logic to detect that there is no more work to be done will have to be tuned. I'll get to the bottom of this.

CleanCut · 2024-01-30T20:19:32Z

Ya, the queue communication has always been super finicky and hard to debug. Any improvements to that (or to the surrounding architecture) are most welcome.

sodul · 2024-02-07T01:28:27Z

For reference I did a diff of the unittest folder in python 3.12.0 and 3.12.1 to help me better understand what the actual changes are:

diff -u ut3.12.0/case.py ut3.12.1/case.py
--- ut3.12.0/case.py	2024-02-06 17:25:20
+++ ut3.12.1/case.py	2024-02-06 17:24:26
@@ -606,7 +606,6 @@
         else:
             stopTestRun = None
 
-        result.startTest(self)
         try:
             testMethod = getattr(self, self._testMethodName)
             if (getattr(self.__class__, "__unittest_skip__", False) or
@@ -616,6 +615,9 @@
                             or getattr(testMethod, '__unittest_skip_why__', ''))
                 _addSkip(result, self, skip_why)
                 return result
+
+            # Increase the number of tests only if it hasn't been skipped
+            result.startTest(self)
 
             expecting_failure = (
                 getattr(self, "__unittest_expecting_failure__", False) or
diff -u ut3.12.0/loader.py ut3.12.1/loader.py
--- ut3.12.0/loader.py	2024-02-06 17:25:20
+++ ut3.12.1/loader.py	2024-02-06 17:24:26
@@ -84,9 +84,13 @@
             raise TypeError("Test cases should not be derived from "
                             "TestSuite. Maybe you meant to derive from "
                             "TestCase?")
-        testCaseNames = self.getTestCaseNames(testCaseClass)
-        if not testCaseNames and hasattr(testCaseClass, 'runTest'):
-            testCaseNames = ['runTest']
+        if testCaseClass in (case.TestCase, case.FunctionTestCase):
+            # We don't load any tests from base types that should not be loaded.
+            testCaseNames = []
+        else:
+            testCaseNames = self.getTestCaseNames(testCaseClass)
+            if not testCaseNames and hasattr(testCaseClass, 'runTest'):
+                testCaseNames = ['runTest']
         loaded_suite = self.suiteClass(map(testCaseClass, testCaseNames))
         return loaded_suite
 
@@ -95,7 +99,11 @@
         tests = []
         for name in dir(module):
             obj = getattr(module, name)
-            if isinstance(obj, type) and issubclass(obj, case.TestCase):
+            if (
+                isinstance(obj, type)
+                and issubclass(obj, case.TestCase)
+                and obj not in (case.TestCase, case.FunctionTestCase)
+            ):
                 tests.append(self.loadTestsFromTestCase(obj))
 
         load_tests = getattr(module, 'load_tests', None)
@@ -164,7 +172,11 @@
 
         if isinstance(obj, types.ModuleType):
             return self.loadTestsFromModule(obj)
-        elif isinstance(obj, type) and issubclass(obj, case.TestCase):
+        elif (
+            isinstance(obj, type)
+            and issubclass(obj, case.TestCase)
+            and obj not in (case.TestCase, case.FunctionTestCase)
+        ):
             return self.loadTestsFromTestCase(obj)
         elif (isinstance(obj, types.FunctionType) and
               isinstance(parent, type) and
diff -u ut3.12.0/mock.py ut3.12.1/mock.py
--- ut3.12.0/mock.py	2024-02-06 17:25:20
+++ ut3.12.1/mock.py	2024-02-06 17:24:26
@@ -827,7 +827,7 @@
 
 
     def _format_mock_failure_message(self, args, kwargs, action='call'):
-        message = 'expected %s not found.\nExpected: %s\nActual: %s'
+        message = 'expected %s not found.\nExpected: %s\n  Actual: %s'
         expected_string = self._format_mock_call_signature(args, kwargs)
         call_args = self.call_args
         actual_string = self._format_mock_call_signature(*call_args)
@@ -930,7 +930,7 @@
         if self.call_args is None:
             expected = self._format_mock_call_signature(args, kwargs)
             actual = 'not called.'
-            error_message = ('expected call not found.\nExpected: %s\nActual: %s'
+            error_message = ('expected call not found.\nExpected: %s\n  Actual: %s'
                     % (expected, actual))
             raise AssertionError(error_message)
 
@@ -981,7 +981,7 @@
                 raise AssertionError(
                     f'{problem}\n'
                     f'Expected: {_CallList(calls)}'
-                    f'{self._calls_repr(prefix="Actual").rstrip(".")}'
+                    f'{self._calls_repr(prefix="  Actual").rstrip(".")}'
                 ) from cause
             return
 
diff -u ut3.12.0/result.py ut3.12.1/result.py
--- ut3.12.0/result.py	2024-02-06 17:25:20
+++ ut3.12.1/result.py	2024-02-06 17:24:26
@@ -97,10 +97,12 @@
 
             sys.stdout = self._original_stdout
             sys.stderr = self._original_stderr
-            self._stdout_buffer.seek(0)
-            self._stdout_buffer.truncate()
-            self._stderr_buffer.seek(0)
-            self._stderr_buffer.truncate()
+            if self._stdout_buffer is not None:
+                self._stdout_buffer.seek(0)
+                self._stdout_buffer.truncate()
+            if self._stderr_buffer is not None:
+                self._stderr_buffer.seek(0)
+                self._stderr_buffer.truncate()
 
     def stopTestRun(self):
         """Called once after all tests are executed.

Taking the unittest folder from py3.12.0 and putting it in 3.12.1 seems to help. This is good because this should help with getting to the root of the issue.

sodul · 2024-02-07T19:15:38Z

I have encouraging news!

Python 3.12.2 was released yesterday and the bug fixes include changes to the unittests package. Since pyenv does not support it yet I took the unittest folder from 3.12.2 and placed it inside a 3.12.1 install and green started to work again on skipped tests:

> green --version
Green 4.0.1-dev, Coverage 7.4.1, Python 3.12.1

> green -rv green
...................................................................................................................s....................................................................................................................s

Skipped green.test.test_output.TestGreenStream.testDisableWindowsFalse - Colorama won't strip ANSI unless running on Windows

Skipped green.test.test_windows.TestWindows.test_colorOutput - This test is for windows-specific behavior.

Name                     Stmts   Miss  Cover   Missing
------------------------------------------------------
green/__init__.py            2      2     0%   1-3
green/__main__.py            3      3     0%   1-5
green/cmdline.py            61     13    79%   90-105
green/command.py            46      0   100%
green/config.py              0      0   100%
green/djangorunner.py       62      0   100%
green/examples.py           27     27     0%   1-69
green/exceptions.py          3      0   100%
green/junit.py             123      0   100%
green/loader.py            264      0   100%
green/output.py            101      0   100%
green/process.py            54      0   100%
green/result.py            412      3    99%   205, 223, 342
green/runner.py             73      0   100%
green/suite.py             126     13    90%   109, 113-116, 128-133, 139-142
green/test/__init__.py       0      0   100%
green/version.py             0      0   100%
------------------------------------------------------
TOTAL                     1357     61    96%

Ran 233 tests in 5.456s using 10 processes

OK (passes=231, skips=2)

The diff is pretty small so I might be able to add custom logic just for 3.12.1, or we can just flag 3.12.1 as incompatible with green.

> diff ut3.12.1 ut3.12.2
diff ut3.12.1/case.py ut3.12.2/case.py
608a609
>         result.startTest(self)
618,620d618
< 
<             # Increase the number of tests only if it hasn't been skipped
<             result.startTest(self)
diff ut3.12.1/main.py ut3.12.2/main.py
283c283
<             if self.result.testsRun == 0:
---
>             if self.result.testsRun == 0 and len(self.result.skipped) == 0:
diff ut3.12.1/mock.py ut3.12.2/mock.py
2202,2203c2202,2206
< _CODE_ATTRS = dir(CodeType)
< _CODE_SIG = inspect.signature(partial(CodeType.__init__, None))
---
> try:
>     _CODE_SIG = inspect.signature(partial(CodeType.__init__, None))
>     _CODE_ATTRS = dir(CodeType)
> except ValueError:
>     _CODE_SIG = None
2223,2225c2226,2231
<         code_mock = NonCallableMock(spec_set=_CODE_ATTRS)
<         code_mock.__dict__["_spec_class"] = CodeType
<         code_mock.__dict__["_spec_signature"] = _CODE_SIG
---
>         if _CODE_SIG:
>             code_mock = NonCallableMock(spec_set=_CODE_ATTRS)
>             code_mock.__dict__["_spec_class"] = CodeType
>             code_mock.__dict__["_spec_signature"] = _CODE_SIG
>         else:
>             code_mock = NonCallableMock(spec_set=CodeType)
diff ut3.12.1/result.py ut3.12.2/result.py
100,105c100,103
<             if self._stdout_buffer is not None:
<                 self._stdout_buffer.seek(0)
<                 self._stdout_buffer.truncate()
<             if self._stderr_buffer is not None:
<                 self._stderr_buffer.seek(0)
<                 self._stderr_buffer.truncate()
---
>             self._stdout_buffer.seek(0)
>             self._stdout_buffer.truncate()
>             self._stderr_buffer.seek(0)
>             self._stderr_buffer.truncate()
diff ut3.12.1/runner.py ut3.12.2/runner.py
277c277
<         elif run == 0:
---
>         elif run == 0 and not skipped:

sodul · 2024-02-07T19:23:28Z

This is the bug report on the CPython side: python/cpython#113267

CleanCut · 2024-02-07T19:34:54Z

we can just flag 3.12.1 as incompatible with green

That would be my vote, but I'll let you decide.

sodul · 2024-02-10T06:45:37Z

I'm waiting for pyenv and GH actions to both provide 3.12.2 and then I will update the GH actions, documentation and the toml file to mark 3.12.1 incompatible. There is no rush since pypi.org and pip do not make it easy to update older releases to reflect compatibility conflicts unfortunately.

sodul mentioned this issue Dec 12, 2023

[3.12] gh-106584: Fix exit code for unittest in Python 3.12 (GH-106588) python/cpython#109725

Merged

sodul added this to the 4.1.0 milestone Jan 17, 2024

sodul self-assigned this Jan 17, 2024

CleanCut added the bug label Jan 19, 2024

sodul mentioned this issue Feb 7, 2024

TestResult.stopTest called when test is skipped, despite TestResult.startTest not being called python/cpython#113267

Closed

sodul closed this as completed Feb 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

stopTest issue with Python 3.12.1 and skipped tests #277

stopTest issue with Python 3.12.1 and skipped tests #277

sodul commented Dec 12, 2023 •

edited

Loading

CleanCut commented Dec 15, 2023

sodul commented Dec 19, 2023

sodul commented Jan 13, 2024

sodul commented Jan 16, 2024

sodul commented Jan 22, 2024

sodul commented Jan 22, 2024 •

edited

Loading

sodul commented Jan 25, 2024

CleanCut commented Jan 25, 2024

sodul commented Jan 26, 2024 •

edited

Loading

CleanCut commented Jan 30, 2024

sodul commented Feb 7, 2024 •

edited

Loading

sodul commented Feb 7, 2024 •

edited

Loading

sodul commented Feb 7, 2024

CleanCut commented Feb 7, 2024

sodul commented Feb 10, 2024 •

edited

Loading

stopTest issue with Python 3.12.1 and skipped tests #277

stopTest issue with Python 3.12.1 and skipped tests #277

Comments

sodul commented Dec 12, 2023 • edited Loading

CleanCut commented Dec 15, 2023

sodul commented Dec 19, 2023

sodul commented Jan 13, 2024

sodul commented Jan 16, 2024

sodul commented Jan 22, 2024

sodul commented Jan 22, 2024 • edited Loading

sodul commented Jan 25, 2024

CleanCut commented Jan 25, 2024

sodul commented Jan 26, 2024 • edited Loading

CleanCut commented Jan 30, 2024

sodul commented Feb 7, 2024 • edited Loading

sodul commented Feb 7, 2024 • edited Loading

sodul commented Feb 7, 2024

CleanCut commented Feb 7, 2024

sodul commented Feb 10, 2024 • edited Loading

sodul commented Dec 12, 2023 •

edited

Loading

sodul commented Jan 22, 2024 •

edited

Loading

sodul commented Jan 26, 2024 •

edited

Loading

sodul commented Feb 7, 2024 •

edited

Loading

sodul commented Feb 7, 2024 •

edited

Loading

sodul commented Feb 10, 2024 •

edited

Loading