Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stopTest issue with Python 3.12.1 and skipped tests #277

Closed
sodul opened this issue Dec 12, 2023 · 15 comments
Closed

stopTest issue with Python 3.12.1 and skipped tests #277

sodul opened this issue Dec 12, 2023 · 15 comments
Assignees
Labels
Milestone

Comments

@sodul
Copy link
Collaborator

sodul commented Dec 12, 2023

We have a unittest that is decorated with @unittest.skipIf(...), and when green is collecting tests results in stopTest() we get an AttributeError for start_time on this line:
self.test_time = str(time.time() - self.start_time).

Since self.start_time is being set in startTest() but this seems to be skipped on skipped tests, but stopTest() is still being called, it looks like it is a new compatibility issue.

If I silence the AttributeError, we get an other one later on proto_test_result.errors in addProtoTestResult(), so the fix might get tricky.

I will take a look at a permanent fix after 3.5.0 gets released. For new we have commented out the skipped test in our codebase so we can use 3.12.1.

https://docs.python.org/3/whatsnew/changelog.html#python-3-12-1-final

I suspect it might be caused by this change: python/cpython#109725

@CleanCut
Copy link
Owner

We should be able to check if self.start_time exists in stopTest(), and set self.test_time to a zero duration if it does not.

@sodul
Copy link
Collaborator Author

sodul commented Dec 19, 2023

For the short term that should work.

Something I was thinking about looking into is to have the classes used by green to inherit from the official classes so that they will be more future proof. But I would do that after cleaning up the code from all pre 3.8 support to get the codebase cleaned up and organized.

@sodul
Copy link
Collaborator Author

sodul commented Jan 13, 2024

My plan is to get v4.0.0 released based of the current main in the next few days so that we can have support for python 3.12.0 with a note that 3.12.1 is known to break on skipped tests. This should allow most users of green to be able to use green with python 3.12 while I work on fixing the skipped tests issue. I will probably release v4.1.0 a few days later with this issue fixed.

@sodul
Copy link
Collaborator Author

sodul commented Jan 16, 2024

I've released Green 4.0.0 with python 3.12 support. Now that the minimal python version is 3.8 I should be able to refactor the code more easily and will work on the 3.12.1 compatibility issue this week.

@sodul sodul added this to the 4.1.0 milestone Jan 17, 2024
@sodul sodul self-assigned this Jan 17, 2024
@CleanCut CleanCut added the bug label Jan 19, 2024
@sodul
Copy link
Collaborator Author

sodul commented Jan 22, 2024

I made some progress on this issue. I added a test with the skip annotation to reproduce the bug and I'm able to run ./g -s1 green with python 3.12.1, but without -s1.

> ./g -s1 green   
..........................................ssssssss........................................................................s
[ ... snip ... ]
Ran 253 tests in 4.727s using 1 process

OK (passes=114, skips=9)

And:

> ./g green    
..........................................ssssssss.................................................................[stuck]

After looking at green/result.py and unittest/result.py, it is clear that the internals have drifted upstream so my current approach is to have green's BaseTestResult inherit from unittest.TestResult and call super().<method>() where needed. I believe this is the approach that will provide the most forward compatibility while potentially reducing duplicated code.

I had to add some type annotation to help me track things, but this is triggering some cascading error reporting and I do not want to fix everything at once here.

@sodul
Copy link
Collaborator Author

sodul commented Jan 22, 2024

-s2 seems ok

> ./g -s3 green
..........................................ssssssss.................................................................^C^CTraceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/stephane/repos/clumio-oss/green/green/cmdline.py", line 103, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/stephane/repos/clumio-oss/green/green/cmdline.py", line 99, in main
    return _main(argv, testing)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/stephane/repos/clumio-oss/green/green/cmdline.py", line 67, in _main
    result = run(test_suite, stream, args, testing)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/stephane/repos/clumio-oss/green/green/runner.py", line 134, in run
    proto_test_result: ProtoTest | ProtoTestResult = queue.get()
                                                     ^^^^^^^^^^^

It seems that it is having an issue with the tests in test_runner.py. I might have to update the mocks there since python -m unittest green/test/test_runner.py is getting stuck as well.

@sodul
Copy link
Collaborator Author

sodul commented Jan 25, 2024

My original idea to have green's BaseTestResult class inherit from unittest.TestResult seems to get the tests to freeze on the internal tests with macOS and Windows, but not in Linux according to GH actions. I'll shelve this for now and will re-implement the new attributes added in Python 3.12.1 as a short term fix. Longer term I do believe inheriting is the proper approach but we will need refactor the internal queueing code of green to properly identify where the race condition occurs. Thanks to mypy some issues have been uncovered but it will take some time to implement all this refactoring.

main...clumio-oss:green:python-3.12.1

@CleanCut
Copy link
Owner

I think it would be great to mostly inherit and/or use upstream implementations. I tried to do that as much as I could...but it wouldn't work in a lot of places. Maybe it will work better now.

@sodul
Copy link
Collaborator Author

sodul commented Jan 26, 2024

There is something odd with the ProtoTestResult queue when I use py3.12.1, after adding the missing attributes, now I'm getting errors because we get None and ProtoTest instances passed to addProtoTestResult(), which then leads to further errors. I tried to 'ignore' by checking for these unexpected types, but then we get stuck on queue.get(). I'm currently looking at adding more type annotation to better understand how the queue is constructed. I suspect that the logic to detect that there is no more work to be done will have to be tuned. I'll get to the bottom of this.

@CleanCut
Copy link
Owner

Ya, the queue communication has always been super finicky and hard to debug. Any improvements to that (or to the surrounding architecture) are most welcome.

@sodul
Copy link
Collaborator Author

sodul commented Feb 7, 2024

For reference I did a diff of the unittest folder in python 3.12.0 and 3.12.1 to help me better understand what the actual changes are:

diff -u ut3.12.0/case.py ut3.12.1/case.py
--- ut3.12.0/case.py	2024-02-06 17:25:20
+++ ut3.12.1/case.py	2024-02-06 17:24:26
@@ -606,7 +606,6 @@
         else:
             stopTestRun = None
 
-        result.startTest(self)
         try:
             testMethod = getattr(self, self._testMethodName)
             if (getattr(self.__class__, "__unittest_skip__", False) or
@@ -616,6 +615,9 @@
                             or getattr(testMethod, '__unittest_skip_why__', ''))
                 _addSkip(result, self, skip_why)
                 return result
+
+            # Increase the number of tests only if it hasn't been skipped
+            result.startTest(self)
 
             expecting_failure = (
                 getattr(self, "__unittest_expecting_failure__", False) or
diff -u ut3.12.0/loader.py ut3.12.1/loader.py
--- ut3.12.0/loader.py	2024-02-06 17:25:20
+++ ut3.12.1/loader.py	2024-02-06 17:24:26
@@ -84,9 +84,13 @@
             raise TypeError("Test cases should not be derived from "
                             "TestSuite. Maybe you meant to derive from "
                             "TestCase?")
-        testCaseNames = self.getTestCaseNames(testCaseClass)
-        if not testCaseNames and hasattr(testCaseClass, 'runTest'):
-            testCaseNames = ['runTest']
+        if testCaseClass in (case.TestCase, case.FunctionTestCase):
+            # We don't load any tests from base types that should not be loaded.
+            testCaseNames = []
+        else:
+            testCaseNames = self.getTestCaseNames(testCaseClass)
+            if not testCaseNames and hasattr(testCaseClass, 'runTest'):
+                testCaseNames = ['runTest']
         loaded_suite = self.suiteClass(map(testCaseClass, testCaseNames))
         return loaded_suite
 
@@ -95,7 +99,11 @@
         tests = []
         for name in dir(module):
             obj = getattr(module, name)
-            if isinstance(obj, type) and issubclass(obj, case.TestCase):
+            if (
+                isinstance(obj, type)
+                and issubclass(obj, case.TestCase)
+                and obj not in (case.TestCase, case.FunctionTestCase)
+            ):
                 tests.append(self.loadTestsFromTestCase(obj))
 
         load_tests = getattr(module, 'load_tests', None)
@@ -164,7 +172,11 @@
 
         if isinstance(obj, types.ModuleType):
             return self.loadTestsFromModule(obj)
-        elif isinstance(obj, type) and issubclass(obj, case.TestCase):
+        elif (
+            isinstance(obj, type)
+            and issubclass(obj, case.TestCase)
+            and obj not in (case.TestCase, case.FunctionTestCase)
+        ):
             return self.loadTestsFromTestCase(obj)
         elif (isinstance(obj, types.FunctionType) and
               isinstance(parent, type) and
diff -u ut3.12.0/mock.py ut3.12.1/mock.py
--- ut3.12.0/mock.py	2024-02-06 17:25:20
+++ ut3.12.1/mock.py	2024-02-06 17:24:26
@@ -827,7 +827,7 @@
 
 
     def _format_mock_failure_message(self, args, kwargs, action='call'):
-        message = 'expected %s not found.\nExpected: %s\nActual: %s'
+        message = 'expected %s not found.\nExpected: %s\n  Actual: %s'
         expected_string = self._format_mock_call_signature(args, kwargs)
         call_args = self.call_args
         actual_string = self._format_mock_call_signature(*call_args)
@@ -930,7 +930,7 @@
         if self.call_args is None:
             expected = self._format_mock_call_signature(args, kwargs)
             actual = 'not called.'
-            error_message = ('expected call not found.\nExpected: %s\nActual: %s'
+            error_message = ('expected call not found.\nExpected: %s\n  Actual: %s'
                     % (expected, actual))
             raise AssertionError(error_message)
 
@@ -981,7 +981,7 @@
                 raise AssertionError(
                     f'{problem}\n'
                     f'Expected: {_CallList(calls)}'
-                    f'{self._calls_repr(prefix="Actual").rstrip(".")}'
+                    f'{self._calls_repr(prefix="  Actual").rstrip(".")}'
                 ) from cause
             return
 
diff -u ut3.12.0/result.py ut3.12.1/result.py
--- ut3.12.0/result.py	2024-02-06 17:25:20
+++ ut3.12.1/result.py	2024-02-06 17:24:26
@@ -97,10 +97,12 @@
 
             sys.stdout = self._original_stdout
             sys.stderr = self._original_stderr
-            self._stdout_buffer.seek(0)
-            self._stdout_buffer.truncate()
-            self._stderr_buffer.seek(0)
-            self._stderr_buffer.truncate()
+            if self._stdout_buffer is not None:
+                self._stdout_buffer.seek(0)
+                self._stdout_buffer.truncate()
+            if self._stderr_buffer is not None:
+                self._stderr_buffer.seek(0)
+                self._stderr_buffer.truncate()
 
     def stopTestRun(self):
         """Called once after all tests are executed.

Taking the unittest folder from py3.12.0 and putting it in 3.12.1 seems to help. This is good because this should help with getting to the root of the issue.

@sodul
Copy link
Collaborator Author

sodul commented Feb 7, 2024

I have encouraging news!

Python 3.12.2 was released yesterday and the bug fixes include changes to the unittests package. Since pyenv does not support it yet I took the unittest folder from 3.12.2 and placed it inside a 3.12.1 install and green started to work again on skipped tests:

> green --version
Green 4.0.1-dev, Coverage 7.4.1, Python 3.12.1

> green -rv green
...................................................................................................................s....................................................................................................................s

Skipped green.test.test_output.TestGreenStream.testDisableWindowsFalse - Colorama won't strip ANSI unless running on Windows

Skipped green.test.test_windows.TestWindows.test_colorOutput - This test is for windows-specific behavior.

Name                     Stmts   Miss  Cover   Missing
------------------------------------------------------
green/__init__.py            2      2     0%   1-3
green/__main__.py            3      3     0%   1-5
green/cmdline.py            61     13    79%   90-105
green/command.py            46      0   100%
green/config.py              0      0   100%
green/djangorunner.py       62      0   100%
green/examples.py           27     27     0%   1-69
green/exceptions.py          3      0   100%
green/junit.py             123      0   100%
green/loader.py            264      0   100%
green/output.py            101      0   100%
green/process.py            54      0   100%
green/result.py            412      3    99%   205, 223, 342
green/runner.py             73      0   100%
green/suite.py             126     13    90%   109, 113-116, 128-133, 139-142
green/test/__init__.py       0      0   100%
green/version.py             0      0   100%
------------------------------------------------------
TOTAL                     1357     61    96%

Ran 233 tests in 5.456s using 10 processes

OK (passes=231, skips=2)

The diff is pretty small so I might be able to add custom logic just for 3.12.1, or we can just flag 3.12.1 as incompatible with green.

> diff ut3.12.1 ut3.12.2
diff ut3.12.1/case.py ut3.12.2/case.py
608a609
>         result.startTest(self)
618,620d618
< 
<             # Increase the number of tests only if it hasn't been skipped
<             result.startTest(self)
diff ut3.12.1/main.py ut3.12.2/main.py
283c283
<             if self.result.testsRun == 0:
---
>             if self.result.testsRun == 0 and len(self.result.skipped) == 0:
diff ut3.12.1/mock.py ut3.12.2/mock.py
2202,2203c2202,2206
< _CODE_ATTRS = dir(CodeType)
< _CODE_SIG = inspect.signature(partial(CodeType.__init__, None))
---
> try:
>     _CODE_SIG = inspect.signature(partial(CodeType.__init__, None))
>     _CODE_ATTRS = dir(CodeType)
> except ValueError:
>     _CODE_SIG = None
2223,2225c2226,2231
<         code_mock = NonCallableMock(spec_set=_CODE_ATTRS)
<         code_mock.__dict__["_spec_class"] = CodeType
<         code_mock.__dict__["_spec_signature"] = _CODE_SIG
---
>         if _CODE_SIG:
>             code_mock = NonCallableMock(spec_set=_CODE_ATTRS)
>             code_mock.__dict__["_spec_class"] = CodeType
>             code_mock.__dict__["_spec_signature"] = _CODE_SIG
>         else:
>             code_mock = NonCallableMock(spec_set=CodeType)
diff ut3.12.1/result.py ut3.12.2/result.py
100,105c100,103
<             if self._stdout_buffer is not None:
<                 self._stdout_buffer.seek(0)
<                 self._stdout_buffer.truncate()
<             if self._stderr_buffer is not None:
<                 self._stderr_buffer.seek(0)
<                 self._stderr_buffer.truncate()
---
>             self._stdout_buffer.seek(0)
>             self._stdout_buffer.truncate()
>             self._stderr_buffer.seek(0)
>             self._stderr_buffer.truncate()
diff ut3.12.1/runner.py ut3.12.2/runner.py
277c277
<         elif run == 0:
---
>         elif run == 0 and not skipped:

@sodul
Copy link
Collaborator Author

sodul commented Feb 7, 2024

This is the bug report on the CPython side: python/cpython#113267

@CleanCut
Copy link
Owner

CleanCut commented Feb 7, 2024

we can just flag 3.12.1 as incompatible with green

That would be my vote, but I'll let you decide.

@sodul
Copy link
Collaborator Author

sodul commented Feb 10, 2024

I'm waiting for pyenv and GH actions to both provide 3.12.2 and then I will update the GH actions, documentation and the toml file to mark 3.12.1 incompatible. There is no rush since pypi.org and pip do not make it easy to update older releases to reflect compatibility conflicts unfortunately.

@sodul sodul closed this as completed Feb 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants