test_concurrent_futures sometimes hangs indefinitely on Windows #92222

Closed
AlexWaygood opened this issue May 3, 2022 · 7 comments
Labels
OS-windows · tests (Tests in the Lib/test dir) · type-bug (An unexpected behavior, bug, or error)

Comments

@AlexWaygood
Member

AlexWaygood commented May 3, 2022

Bug report

test_concurrent_futures appears to be hanging indefinitely on Windows in our CI. This has happened three times in two days, on three separate, unrelated CI runs.

Two PRs:

test_concurrent_futures has also been running for 4hr30 (and counting) on Windows (x86) on this recent push to main:

Neither PR had anything to do with concurrent.futures (and the push to main didn't have anything to do with concurrent.futures either). For the first PR, test_concurrent_futures was hanging on Windows (x64) but passed on Windows (x86). For the second PR, the test has passed on Windows (x64) but is hanging on Windows (x86).

cc. @pitrou, @brianquinlan, as listed experts for concurrent.futures.

AlexWaygood added the type-bug and tests labels May 3, 2022
AlexWaygood changed the title from "test_concurrent_futures hangs indefinitely on Windows" to "test_concurrent_futures sometimes hangs indefinitely on Windows" May 3, 2022
@AlexWaygood
Member Author

AlexWaygood commented May 3, 2022

I've run python -m test test_concurrent_futures -v five times in succession on my Windows x64 machine, but haven't yet been able to reproduce this behaviour locally.
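
Repeated runs like this can be automated so that a hang is detected rather than waited on. The following is only a minimal sketch, not part of the test suite: `run_repeatedly` and `DEFAULT_CMD` are hypothetical names, and the 30-minute default timeout is an arbitrary assumption.

```python
import subprocess
import sys

# Assumed command: the same invocation as above, run with the current interpreter.
DEFAULT_CMD = [sys.executable, "-m", "test", "test_concurrent_futures", "-v"]


def run_repeatedly(cmd, runs=5, timeout=1800):
    """Run cmd `runs` times and classify each run as 'pass', 'fail', or 'hang'."""
    outcomes = []
    for _ in range(runs):
        try:
            # subprocess.run kills the child process if the timeout expires.
            proc = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            outcomes.append("hang")
        else:
            outcomes.append("pass" if proc.returncode == 0 else "fail")
    return outcomes
```

Calling `run_repeatedly(DEFAULT_CMD)` from a CPython checkout would return a list of five outcomes. Note that only the direct child is killed on timeout, so worker processes started by the test may be left behind after a hang.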

@AlexWaygood
Member Author

AlexWaygood commented May 3, 2022

The most recent significant change to concurrent.futures was #31408, merged by @JelleZijlstra. The most recent significant change to test_concurrent_futures was #91600, by @gpshead. I've no idea if either change is related to this.

@gpshead
Member

gpshead commented May 7, 2022

FWIW, my change to the test removed the logic below, which put a 0.1 second delay at the start of nearly every test (this is already one of the longest-running tests in our suite). It was added over a decade ago without explanation and doesn't make logical sense:

    def _prime_executor(self):
        # Make sure that the executor is ready to do work before running the
        # tests. This should reduce the probability of timeouts in the tests.
        futures = [self.executor.submit(time.sleep, 0.1)
                   for _ in range(self.worker_count)]
        for f in futures:
            f.result()

The only thing code like that can do is work around a logical flaw in the tests or hide an actual race condition bug. So if there is a logic flaw or race condition somewhere, we'd be best off finding it as the real fix.

So far I've not observed timeout failures, but if anyone can reproduce them it'd be great to get a snapshot of what processes were stuck running and hanging on what so we can understand the root of the issue.
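
One way such a snapshot might be captured is with the standard library's `faulthandler` module, which can dump the tracebacks of all threads after a timeout. This is a sketch under assumptions, not what the test suite does: `watch_for_hang` and `demo` are hypothetical names, and the short sleep stands in for a hung test.

```python
import faulthandler
import tempfile
import time


def watch_for_hang(timeout, log_file):
    """Arm a watchdog: unless cancelled within `timeout` seconds, dump the
    tracebacks of all threads in this process to `log_file`."""
    faulthandler.dump_traceback_later(timeout, exit=False, file=log_file)


def demo():
    """Simulate a stuck test and return what the watchdog wrote."""
    with tempfile.TemporaryFile(mode="w+") as log_file:
        watch_for_hang(0.2, log_file)
        time.sleep(0.6)  # stand-in for a test that hangs
        faulthandler.cancel_dump_traceback_later()
        log_file.seek(0)
        return log_file.read()
```

The watchdog only covers the process in which it is armed. To see where worker processes are stuck, each worker would need to arm its own, for example through the `initializer` argument of `ProcessPoolExecutor`.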

@pitrou
Member

pitrou commented May 7, 2022

> So far I've not observed timeout failures, but if anyone can reproduce them it'd be great to get a snapshot of what processes were stuck running and hanging on what so we can understand the root of the issue.

If nothing bad happens on the buildbots then great :-)

@AlexWaygood
Member Author

I opened this issue because I saw the same CI failure three times in two days. I haven't seen it since, though, and haven't been able to reproduce it locally, so I'll close this for now. If it happens again, I'll reopen.

@neonene
Contributor

neonene commented Jun 17, 2022

I saw this issue on 3.11 branch Windows (x64):
https://github.com/python/cpython/runs/6939506633?check_suite_focus=true

@neonene
Contributor

neonene commented Jun 18, 2022

I am not able to reproduce this locally either. Would it be possible to inject something that shows which test case is still running or got canceled?
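
As a rough illustration of the kind of injection meant here, a `unittest` result class can log when each test case starts and finishes, so a hung test shows up as a "started" line with no matching "finished" line. `RunningTestTracker`, `Sample`, and `run_sample` are hypothetical names for this sketch, and it says nothing about how the CI runner captures output.

```python
import io
import unittest


class RunningTestTracker(unittest.TextTestResult):
    """Hypothetical result class that logs the start and end of each test."""

    def startTest(self, test):
        super().startTest(test)
        self.stream.write(f"[started] {test.id()}\n")
        self.stream.flush()  # flush so the line survives a later hang

    def stopTest(self, test):
        self.stream.write(f"[finished] {test.id()}\n")
        self.stream.flush()
        super().stopTest(test)


class Sample(unittest.TestCase):
    """Trivial stand-in for a real test case."""

    def test_ok(self):
        self.assertEqual(1 + 1, 2)


def run_sample():
    """Run Sample with the tracker and return (result, captured log)."""
    buf = io.StringIO()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Sample)
    runner = unittest.TextTestRunner(
        stream=buf, resultclass=RunningTestTracker, verbosity=0
    )
    result = runner.run(suite)
    return result, buf.getvalue()
```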
