-
-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-90908: Document asyncio.Task.cancelling() and asyncio.Task.uncancel() #95253
Conversation
ambv
commented
Jul 25, 2022
•
edited by bedevere-bot
Loading
edited by bedevere-bot
- Issue: Introduce task groups to asyncio and change task cancellation semantics #90908
Can I wait until the other reviewers have done their thing? I'm kind of stressed for time right now. |
@gvanrossum Take your time, we'll be shaping this into a final form over the next day or two. Thanks! |
I'll deal with the code example separately but in general, I have this question: don't you think the ordering of documented methods on How about instead of doing:
we could do:
|
Co-authored-by: Thomas Grainger <[email protected]>
.. method:: cancel(msg=None) | ||
|
||
Request the Task to be cancelled. | ||
|
||
This arranges for a :exc:`CancelledError` exception to be thrown | ||
into the wrapped coroutine on the next cycle of the event loop. | ||
|
||
The coroutine then has a chance to clean up or even deny the | ||
request by suppressing the exception with a :keyword:`try` ... | ||
... ``except CancelledError`` ... :keyword:`finally` block. | ||
Therefore, unlike :meth:`Future.cancel`, :meth:`Task.cancel` does | ||
not guarantee that the Task will be cancelled, although | ||
suppressing cancellation completely is not common and is actively | ||
discouraged. | ||
|
||
.. versionchanged:: 3.9 | ||
Added the *msg* parameter. | ||
|
||
.. deprecated-removed:: 3.11 3.14 | ||
*msg* parameter is ambiguous when multiple :meth:`cancel` | ||
are called with different cancellation messages. | ||
The argument will be removed. | ||
|
||
.. _asyncio_example_task_cancel: | ||
|
||
The following example illustrates how coroutines can intercept | ||
the cancellation request:: | ||
|
||
async def cancel_me(): | ||
print('cancel_me(): before sleep') | ||
|
||
try: | ||
# Wait for 1 hour | ||
await asyncio.sleep(3600) | ||
except asyncio.CancelledError: | ||
print('cancel_me(): cancel sleep') | ||
raise | ||
finally: | ||
print('cancel_me(): after sleep') | ||
|
||
async def main(): | ||
# Create a "cancel_me" Task | ||
task = asyncio.create_task(cancel_me()) | ||
|
||
# Wait for 1 second | ||
await asyncio.sleep(1) | ||
|
||
task.cancel() | ||
try: | ||
await task | ||
except asyncio.CancelledError: | ||
print("main(): cancel_me is cancelled now") | ||
|
||
asyncio.run(main()) | ||
|
||
# Expected output: | ||
# | ||
# cancel_me(): before sleep | ||
# cancel_me(): cancel sleep | ||
# cancel_me(): after sleep | ||
# main(): cancel_me is cancelled now | ||
|
||
.. method:: cancelled() | ||
|
||
Return ``True`` if the Task is *cancelled*. | ||
|
||
The Task is *cancelled* when the cancellation was requested with | ||
:meth:`cancel` and the wrapped coroutine propagated the | ||
:exc:`CancelledError` exception thrown into it. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is unchanged, only moved down. IMO cancellation isn't as important as getting other things out of a task. Plus this move allows us to keep cancel-specific methods next to each other, uncancel
in particular being pretty low-level.
ce195cb
to
0203e01
Compare
Doc/library/asyncio-task.rst
Outdated
continue running even in case of the timeout. This can be | ||
implemented with :meth:`uncancel` as follows:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we need to give an implementation of a structured concurrency primitive as an example in the docs, given that we don't expect (or want!) people to do this. I also don't have time to review the example carefully enough to trust it doesn't have bugs that would be replicated if people copy this example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the example to tests but kept it there as reference docs for our own purposes.
Co-authored-by: Guido van Rossum <[email protected]>
Co-authored-by: Thomas Grainger <[email protected]>
This should be used by tasks that catch CancelledError | ||
and wish to continue indefinitely until they are cancelled again. | ||
This should be called by the party that called `cancel()` on the task | ||
beforehand. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous docstring was invalid, we actively don't want for user code to call uncancel()
when catching CancelledError.
finally: | ||
loop.close() | ||
|
||
def test_uncancel_structured_blocks(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like you to look at this test and tell me if you think anything here (esp. the comments!) is not factual.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the length of some of the comments. The history of uncancel() is complex:
- Initially there was a boolean flag, and if it was already set, cancel() would return False early
- Then there was a counter, and if it was already nonzero, cancel() would return False early
- Finally the counter was "defanged" and cancel() would bump the counter but still always go through the rest of its routine
The logic bug in TaskGroup (#95289) doesn't help restore confidence. But I think the following guidelines are helpful:
- Every structured concurrency primitive must use an internal flag that is set when it calls cancel().
- It must check this flag when exiting and if set, call uncancel(), so it always has a balanced pair of cancel()/uncancel() calls.
- If unstructured code calls cancel() without uncancel(), every structured concurrency primitive will act as if surrounded by yet another structured concurrency primitive that has called cancel() but hasn't called uncancel() yet.
Bug #95289 is about TaskGroup failing the second bullet when the CancellationError is suppressed or replaced with something else. The fix is to always check the flag and (if set) call uncancel(), even if no CancellationError was perceived -- because we know cancel() was called so we must call uncancel().
There is no similar bug in timeout because it doesn't check for CancelledError -- it only uses its internal state.
While the block with ``make_request()`` and ``make_another_request()`` | ||
might get cancelled due to the timeout, ``unrelated_code()`` should | ||
continue running even in case of the timeout. This is implemented | ||
with :meth:`uncancel`. :class:`TaskGroup` context managers use | ||
:func:`uncancel` in a similar fashion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to imply that without uncancel()
, await unrelated_code()
would be cancelled. But even with the most naive implementation possible using only primitives existing in 3.10, if except TimeoutError:
is triggered, the following await
will executed just fine. (In Trio this would not be the case, since cancellation there is level-triggered, i.e. once a task is cancelled it stays cancelled and specifically any further await
will immediately be cancelled; but in asyncio
cancellation is edge-triggered, and once the task has regained control, which the context manager can do, it is no longer cancelled.)
So why does uncancel()
exist and need to be called? Because when structured concurrency primitives are nested, they may cancel the same task. The inner primitive(s) of these must let the CancellationError
bubble out, while the outermost one must take the action it promises its caller (e.g. raise TimeoutError
or raise an ExceptionGroup
bundling the error(s) experienced by failed but not cancelled subtasks).
I think this means that to make the example meaningful, you'd have to nest two timeout blocks whose timers go off simultaneously. Then the inner one will raise CancellationError
(so an except TimeoutError
will not trigger!) and the outer one will raise TimeoutError
.
It is not a coincidence that uncancel()
is called in __aexit__()
by both timeout and TaskGroup. The pattern is pretty much
- some async event calls
t.cancel()
on some task, and sets an internal flag indicating it did so; - later, typically in
__aexit__()
, if the internal flag is set,t.uncancel()
is called, and if it returns a value greater than zero,CancelledError
is (re)raised (by returningNone
from__aexit__()
!), else some other action is taken.
Note that if this number is greater than zero but the Task is | ||
still executing, :meth:`cancelled` will still return ``False``. | ||
This is because this number can be lowered by calling :meth:`uncancel`, | ||
which can lead to the task not being cancelled after all if the | ||
cancellation requests go down to zero. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to imply that the effect of cancel()
+ uncancel()
is a no-op, but I'm not sure that's always the case. Honestly after several attempts I'm still not sure how cancel()
and __step()
interacted even in 3.10. :-( But it seems at least possible that .cancel()
sets _must_cancel
which causes __step()
to throw a CancelledError
into the coroutine even if uncancel()
is called immediately after cancel()
.
See also the comment added to cancel()
starting with "These two lines are controversial."
(Also 7d611b4)
# a ConnectionLostError from a database client), simply | ||
# propagate them. | ||
# | ||
# Those checks need to take place in this exact order to make |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Those checks" == (2), (4), (5), right? Because (1) and (3) don't describe "checks".
Other than that nit, these comments seem correct, and the code looks so too (but I'm not taking money :-).
(Another nit: the huge comment interrupts the logic. Maybe move it to the top of the test function?)
# cancellation request count go down to 0. We need to look | ||
# at the counter vs having a simple boolean flag because our | ||
# code might have been nested (think multiple timeouts). See | ||
# commit 7fce1063b6e5a366f8504e039a8ccdd6944625cd for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that commit 7d611b4 changes the behavior again.
and task.uncancel() == 0 | ||
and sys.exc_info()[0] is asyncio.CancelledError | ||
): | ||
# Note the five rules that are needed here to satisfy proper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Canceling the stimulus (timeouts), or guarding against cancelling after calling uncancel() (TaskGroup and Runner) I think is a sixth rule
# Note the five rules that are needed here to satisfy proper | |
# Note the six rules that are needed here to satisfy proper |
Should this become a deferred blocker so we can land it in RC2? |
3.11 final is due next month, so it would be better to get this in earlier than release. |
Assuming this passes the doc tests I'll just merge it, we can debate the remaining issues I had back in July later. |
Thanks @ambv for the PR, and @gvanrossum for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11. |
GH-97708 is a backport of this pull request to the 3.11 branch. |
…ncancel() (pythonGH-95253) Co-authored-by: Thomas Grainger <[email protected]> (cherry picked from commit f00645d) Co-authored-by: Łukasz Langa <[email protected]>
|
I thought this was a docs only PR, but it updates a test. Presumably that’s what caused the buildbot failure? I can look into this tomorrow. |
|
Then again how could test_embed be affected? |
…l() (GH-95253) Co-authored-by: Thomas Grainger <[email protected]> (cherry picked from commit f00645d) Co-authored-by: Łukasz Langa <[email protected]>
…ncancel() (python#95253) Co-authored-by: Thomas Grainger <[email protected]>
More recent runs of the same buildbot passed, so I conclude this was a flake. |
…l() (GH-95253) Co-authored-by: Thomas Grainger <[email protected]> (cherry picked from commit f00645d) Co-authored-by: Łukasz Langa <[email protected]>