Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client request timeout should be optional #1180

Closed
adamrothman opened this issue Sep 16, 2016 · 26 comments
Closed

Client request timeout should be optional #1180

adamrothman opened this issue Sep 16, 2016 · 26 comments
Labels

Comments

@adamrothman
Copy link

adamrothman commented Sep 16, 2016

Long story short

As described in #877, attempting to use aiohttp.Timeout in a Tornado-based app results in a RuntimeError. In that issue, @bdarnell offers a solution that uses Tornado's own timeout utility. Through version 0.22.5, aiohttp's client request plumbing does not use aiohttp.Timeout internally; users were expected to apply it externally when desired.

The 1.0.0 release of aiohttp includes dc79e14, which forces all requests to use a Timeout. This seems a bit heavy-handed as it makes it impossible to use aiohttp's client with Tornado, even with Ben's suggested workaround. If the timeout kwarg to aiohttp.ClientSession::request is None, it should work the way it used to.

Steps to reproduce

The following test server can be used to consistently reproduce a RuntimeError as in #877 (trace follows):

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from asyncio import get_event_loop

from aiohttp import ClientSession
from tornado.httpserver import HTTPServer
from tornado.platform.asyncio import AsyncIOMainLoop
from tornado.web import Application
from tornado.web import RequestHandler


class Handler(RequestHandler):

    async def get(self):
        async with ClientSession() as client:
            async with client.get('https://www.google.com') as response:
                self.write(await response.text())


if __name__ == '__main__':
    AsyncIOMainLoop().install()

    app = Application([
        (r'/', Handler),
    ])
    server = HTTPServer(app)
    server.listen(9999)

    get_event_loop().run_forever()

And the resulting error:

Uncaught exception GET / (127.0.0.1)
HTTPServerRequest(protocol='http', host='localhost:1337', method='GET', uri='/', version='HTTP/1.1', remote_ip='127.0.0.1', headers={'Host': 'localhost:1337', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'keep-alive', 'User-Agent': 'HTTPie/0.9.6', 'Accept': '*/*'})
Traceback (most recent call last):
  File "/Users/adam/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tornado/web.py", line 1469, in _execute
    result = yield result
  File "/Users/adam/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tornado/gen.py", line 1015, in run
    value = future.result()
  File "/Users/adam/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tornado/concurrent.py", line 237, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 3, in raise_exc_info
  File "/Users/adam/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tornado/gen.py", line 285, in wrapper
    yielded = next(result)
  File "<string>", line 6, in _wrap_awaitable
  File "./test.py", line 16, in get
    async with client.get('https://www.google.com') as response:
  File "/Users/adam/.pyenv/versions/3.5.2/lib/python3.5/site-packages/aiohttp/client.py", line 565, in __aenter__
    self._resp = yield from self._coro
  File "/Users/adam/.pyenv/versions/3.5.2/lib/python3.5/site-packages/aiohttp/client.py", line 197, in _request
    with Timeout(timeout, loop=self._loop):
  File "/Users/adam/.pyenv/versions/3.5.2/lib/python3.5/site-packages/async_timeout/__init__.py", line 33, in __enter__
    raise RuntimeError('Timeout context manager should be used '
RuntimeError: Timeout context manager should be used inside a task
500 GET / (127.0.0.1) 3.57ms

Environment

  • macOS 10.11.6
  • Python 3.5.2
  • aiohttp 1.0.0
  • tornado 4.4.1

Thoughts

aiohttp relies on @asvetlov's async_timeout library internally to handle timeouts. async_timeout is incompatible with coroutine runners that are not asyncio, as it relies on calling asyncio.Task.current_task. Perhaps it would be possible to make async_timeout compatible with other runners? Or maybe it's not possible. That's out of the scope of what I'm proposing here. I see the value of making it possible to manage the timeout internally, but it would be nice if there was a way to disable this behavior when desired.

@asvetlov
Copy link
Member

Tornado's couroutine runner is quite not compatible with asyncio, you know.
I'll update (with Guido's preliminary approval) the asyncio PEP 3156 by making explicit statement: every coroutine should be executed executed inside a task -- either directly by loop.create_task(coro()) call or indirectly by using await coro() inside other coroutine.

Unfortunately Tornado doesn't follow the rule (BTW it doesn't support own coroutine runner cancelling as well).

As workaround I'm suggesting the following decorator:

def deco(func):
    async def wrapper(*args, **kwargs):
        return asyncio.get_event_loop().create_task(func(*args, **kwargs))
   return wrapper

class Handler(tornado.web.RequestHandler):

    @deco
    async def get(self):
        ...

See also #1176

P.S.
async_timeout library will be used at least by aiopg and aiomysql soon.
Please live with it.

P.P.S
It's possible to skip asyncio.Task.current_task check in async_timeout if timeout is None but it brings new quirks: if user will change timeout he will get RuntimeError surprisingly.
I believe better to inform a user about potential problem as early as possible.

@adamrothman
Copy link
Author

@asvetlov Thanks for the quick response! I'm not suggesting things change at the level of the async_timeout library. I get the desire to only have it be compatible with native asyncio. But I don't think it's unreasonable for aiohttp to allow consumers to opt out of the usage of Timeout.

@asvetlov
Copy link
Member

async_timeout is 100% compatible with asyncio, sure.

If the proposed decorator works for you -- we might add a hint to documentation.
When I had dug into tornado's source for corutines I was unable to propose a quick patch for solving the issue, sorry. Wrapping every .get() method into a task looks like a solution but it may break old code that uses tornado.gen.task way.
For achieving 100% compatibility with asyncio Tornado should have a cancellable task compatible with asyncio.Task by duck typing maybe.
But, sorry again, I have no time/interest for such deep contribution into Tornado.

@adamrothman
Copy link
Author

adamrothman commented Sep 16, 2016

@asvetlov Right, I understand. But that is much more complicated than what I'm suggesting here. Inside of aiohttp.ClientSession::_request (https://github.com/KeepSafe/aiohttp/blob/c017e8cf3174bbea0b523fcdf94fe5c44fca1663/aiohttp/client.py#L197), I'm proposing something like this:

if timeout is None:
    conn = yield from self._connector.connect(req)
else:
    with Timeout(timeout, loop=self._loop):
        conn = yield from self._connector.connect(req)

(with similar changes applied in the other places Timeout was added)

No decorators required, and no mucking around in the depths of Tornado. This way people can use aiohttp's built-in timeout if they want to, or apply some other timeout mechanism, or not use a timeout at all. When passed timeout=None, these request methods should behave exactly the same as they did before dc79e14.

@bdarnell with asyncio settled into the standard library and considering the update to PEP 3156 proposed above, it seems like Tornado's AsyncIOLoop needs to play by the rules with respect to asyncio.Task if it's going to continue to be compatible with asyncio. What do you think?

@bdarnell
Copy link

I'd really appreciate it if the new rules made it into the PEP first, and then aiohttp started relying on them. Until then aiohttp is relying on undocumented implementation details of asyncio.

What exactly is the proposed modification to PEP 3156 and where is this being discussed? It sounds like it's going to be a backwards-incompatible change for Tornado to become compliant with the proposed new rules; changes like this should probably be hammered out on async-sig so that people like Glyph and I can participate and minimize the disruptions.

I think there's a path forward in which Tornado moves to using asyncio.Task instead of its own coroutine runner on Python versions where asyncio is available, but it needs participation from the asyncio side as well. For example, if the patch I proposed in this thread were merged, asyncio.Task would gain some features that Tornado's coroutine runner currently has, which would make it easier to make this move.

@asvetlov
Copy link
Member

asvetlov commented Sep 17, 2016

@bdarnell a PR for PEP 3156 updating is not exists yet but I want to publish it soon.
Everything that exists now is https://groups.google.com/forum/#!topic/python-ideas/MnzSruoDBms discussion (very short though).
I'll invite you and @glyph to PR's review when the PR will be ready. I hope to finish my work very soon, I've postponed this job until aiohttp 1.0 release which was done yesterday.
Maybe we'll decide that Task.current_task() and task cancellation are asyncio implementation details but now I insist they are should be a part of standard.

UPD: task cancellation is the part of the standard: https://www.python.org/dev/peps/pep-3156/#tasks
But task public API is not described well.

Don't get me wrong: I'd love to see Tornado compatible with aiohttp client (compatibility with server API is not Tornado's goal at all, sure).
aiohttp relies on documented asyncio implementation: asyncio.Task.current_task.
Unfortunately it's not documented explicitly by PEP 3156, this is the problem.

I've looked into Tornado's support for coroutines and found that the problem cannot be solved by couple lines patch, it's true. Please, correct me if I'm wrong but right now Tornado's coroutine runner doesn't support task cancellation, this is the main problem. Accessing to Task.current_task via the single Task class is another problem. current_task is the classmethod, it's really hard to override class methods without monkey patching or modifying Task._all_tasks and Task._current_tasks private class attributes (which both are implementation details, sure).

asyncio event loop has a AbstractEventLoop.create_task() method and even AbstractEventLoop.set_task_factory (both are not mentioned by PEP but reflected in asyncio docs again).

Theoretically Tornado may reimplement the create_task()method or register own task factory for returning something compliant with both Task and tornado.gen.Runner. Sorry, I don't know Tornado's internals well enough for providing working solution.

Your patch makes sense also.
Maybe it's the best way, I don't know honestly. Looks like the patch is harmless and very trivial.
One extra function call cannot slowdown task's execution time significantly -- but maybe we need to check it first.
Unfortunately I've missed to look on it at right time.
Now Python 3.6 is in beta stage, the time for adding new features is gone.
If after the patch resurrecting @gvanrossum will approve it for inclusion into 3.6 and this will solve all our problems with Tornado and asyncio compatibility -- I'll be happy.
On other hand it can be done by already mentioned loop.create_task()/loop.set_task_factory() by overriding Task._step with all gory details required by Tornado.
@gvanrossum and @1st1 what do you think about? I don't want to mention Ned Deily (Python 3.6 Release Manager) right now, maybe the problem could be solved without adding new API in beta period.
Maybe @Haypo has something to add also. Honestly I don't know other persons who are very deeply involved into asyncio but the list is open.

Anyway, aiohttp and all other libraries under aio-libs umbrella will keep compatibility with asyncio only, sorry.
We've rejected trollius for similar reason: it was the pain to support both yield from f() and yield trollius.From(f()).
Using async_timeout.timeout() allowed me to shrink low level aiohttp server about twice. Previously it was built on top of many .call_later() calls and proper delayed call cancellations.
The support of this coding style is really painful -- I had unable to keep all possible combinations in my brain at once.
After rewriting the code could be supported by average user.

That's it.

@bdarnell
Copy link

Interesting. Sounds like there's been a lot of public APIs added to the stdlib asyncio module without either being added to the PEP (which I thought was supposed to define everything interoperability-related) or in most cases even being discussed on the python-tulip or async-sig mailing lists.

EventLoop.create_task() is especially interesting. If we can freeze and document the Task interface, then I think Tornado's coroutine runner should be a superset of asyncio's, so I can plug it in and solve this problem. Implementing cancellation is not trivial, but doable.

But this is the most important part:

Anyway, aiohttp and all other libraries under aio-libs umbrella will keep compatibility with asyncio only, sorry.

Without some assurance that, if I make this change and get it working, you won't unilaterally break it again with no notice in the future, I don't know if I should bother. Instead maybe my time would be better served by figuring out how to distinguish Tornado native coroutines from asyncio native coroutines so we can handle the latter as "foreign" (whether that means wrapping them in an asyncio.Task or failing cleanly), instead of the cryptic sometimes-failure you get now (I'm not sure if this can be done at the application layer; it might need additions to the native coroutine implementation).

@asvetlov
Copy link
Member

Making Tornado's coroutine runner a superset of asyncio's sounds a great idea.

I have no striving to break Tornado's compatibility for freak, sure.
I want to write a code compatible with asyncio. When upstream adds a new API -- we could want to support it if it brings a value.
aiohttp and others keeps backward compatibility with older asyncio versions for long period (right now the minimal supported release is Python 3.4.2).
Is it works for you or do we need to discuss additional guarantees?

P.S.
Task.current_task was in the very first public asyncio release BTW -- it's now a brand new addition.

@gvanrossum
Copy link

On Sun, Sep 18, 2016 at 7:00 AM, Ben Darnell [email protected] wrote:

Interesting. Sounds like there's been a lot of public APIs added to the stdlib asyncio module without either being added to the PEP (which I thought was supposed to define everything interoperability-related) or in most cases even being discussed on the python-tulip or async-sig mailing lists.

Sorry about that. We've been using a variety of venues for these discussions, e.g. the asyncio tracker, python-ideas, python-dev, the bugs.python.org tracker. We also fell behind on updating the PEP -- I actually want it updated, and the API frozen for Python 3.6, and the PEP no longer provisional, so that future API changes will have to be backwards compatible.

So now's a good time to hammer out the remaining issues. Possibly the actual code is more amenable to reuse by Tornado than the PEP, in which case we should just make sure that the PEP describes that behavior in sufficient detail. It's also possible that changes to the actual code need to be made, in which case we're slightly more in a hurry (the Python 3.6 release schedule moves inexorably forward, see PEP 494.

@thehesiod
Copy link
Contributor

FYI: I've created PR #1524 to be able to achieve separate read + connection timeouts discussed in the fifth comment by @adamrothman

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 8, 2017

closing as it is possible to pass None as timeout now

@fafhrd91 fafhrd91 closed this as completed Feb 8, 2017
@adamrothman
Copy link
Author

adamrothman commented Feb 8, 2017

@fafhrd91 I'm not sure the recent changes address the original issue, which was to be able to pass None and have no timeout be applied. That way I can use my own timeout or none at all.

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 8, 2017

if you pass none no timeout logic will be applied to request

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 8, 2017

@adamrothman
Copy link
Author

Sorry! I misunderstood. Thanks @fafhrd91 and @thehesiod.

@thehesiod
Copy link
Contributor

wow, just realized gvanrossum commented in this thread, impressive :)

@adamrothman
Copy link
Author

@bdarnell this server works just fine on Tornado 4.4.2 with timeout=None:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from asyncio import get_event_loop

from aiohttp import ClientSession
from tornado.httpserver import HTTPServer
from tornado.platform.asyncio import AsyncIOMainLoop
from tornado.web import Application
from tornado.web import RequestHandler


class Handler(RequestHandler):

    async def get(self):
        async with ClientSession() as client:
            async with client.get('https://www.google.com', timeout=None) as response:
                self.write(await response.text())


if __name__ == '__main__':
    AsyncIOMainLoop().install()

    app = Application([
        (r'/', Handler),
    ])
    server = HTTPServer(app)
    server.listen(9999)

    get_event_loop().run_forever()

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 8, 2017

@adamrothman technically you can create custom TimeService and use it with session, in that case session can use custom timeout context manager.

@jcugat
Copy link
Contributor

jcugat commented Apr 24, 2017

This is the decorator I ended using based on @asvetlov's answer (there was a missing await):

import functools

def deco(func):
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        return await asyncio.get_event_loop().create_task(func(*args, **kwargs))
    return wrapper

class Handler(tornado.web.RequestHandler):

    @deco
    async def get(self):
        ...

@fafhrd91
Copy link
Member

this situation with asyncio.Task.current_task is bad. api is published but there is no any way how to participate with it.

@gvanrossum
Copy link

What change would you propose for asyncio.Task.current_task? Doc changes? PEP changes? Code changes? Is there an issue in the bugs.python.org tracker? Something that can be fixed in 3.6.2? 3.7?

@fafhrd91
Copy link
Member

I think current_task() should be event loop's method

I will work on bug report.

@soar
Copy link

soar commented Jul 24, 2017

I've spent about 2 hours finding problem and right solution. Thank you @adamrothman @asvetlov and @jcugat - It would be great to see these snippets in docs.

@AraHaan
Copy link
Contributor

AraHaan commented Jul 25, 2017

@jcugat last time I knew AbstractEventLoop.create_task was an normal syncronous function, not an asyncronous one.

@gvanrossum
Copy link

gvanrossum commented Jul 25, 2017 via email

@lock
Copy link

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel like there's important points made in this discussion,
please include those exceprts into that new issue.

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

9 participants