Flow control support for large transfers #19
Hey Greg, I think it is too early to make a mailing list for aiohttp; I don't want to create another dead mailing list.
Large requests are also supported; you can pass a generator as data.
You can also chain a GET request with a POST request; a rough sketch follows below.
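A hedged sketch of both points, using the old-style yield from API that the rest of this thread uses. The hosts and paths are placeholders, and passing a GET response's content stream straight into data= for the follow-up request is an assumption about the client API rather than something confirmed in this thread:

from asyncio import coroutine, get_event_loop
from aiohttp import request

def gen_data():
    # A plain generator works as the request body: each yielded bytes
    # chunk is written out as it is produced, so nothing is buffered whole.
    for i in range(10):
        yield ('chunk %d\n' % i).encode()

@coroutine
def chain():
    # Streaming upload: pass the generator itself, not its joined output.
    put_resp = yield from request(
        'PUT', 'https://host/upload', data=gen_data(), chunked=True)
    put_resp.close()

    # Chaining: feed one response's body into the next request.
    # (Assumption: the content stream is accepted directly as data.)
    get_resp = yield from request('GET', 'https://host/source')
    post_resp = yield from request(
        'POST', 'https://host/destination', data=get_resp.content)
    post_resp.close()
    get_resp.close()

get_event_loop().run_until_complete(chain())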
I think the main problem right now is documentation.
I created a separate issue for the file object problem: #20
Oh cool, thanks for the info. I figured the read side out but hadn't caught on to the send side; makes sense though.
I tried this technique and it was still gobbling up memory. I wrote a quick test script so maybe you can tell me where I've gone horribly wrong:

from asyncio import coroutine, get_event_loop
from aiohttp import EofStream, request

def send_data():
    with open('big1Gfile', 'rb') as fp:
        chunk = fp.read(65536)
        while chunk:
            yield chunk
            chunk = fp.read(65536)

@coroutine
def func():
    response = yield from request(
        'PUT',
        'https://host/path',
        headers={'x-auth-token': 'blah'},
        data=send_data(),
        chunked=True)
    try:
        while True:
            chunk = yield from response.content.read()
            print(chunk)
    except EofStream:
        pass
    response.close()

get_event_loop().run_until_complete(func())
I get similarly large memory usage when not using chunked transfer encoding as well. When I GET a large file the memory usage is fine, but I wonder if that's just because I can write the response to disk much faster than the network can deliver it. In other words, if I were chaining the GET to a PUT to another, much slower, host, would the memory usage balloon?
Ah yeah, verified that a big GET with a slow reader of the response also uses a lot of memory. I've got to be doing something wrong but I'm not sure how to tell aiohttp how large its buffers may be:

from asyncio import coroutine, get_event_loop, sleep
from aiohttp import EofStream, request

@coroutine
def func():
    response = yield from request(
        'GET',
        'https://host/big1Gfile',
        headers={'x-auth-token': 'blah'})
    try:
        while True:
            chunk = yield from response.content.read()
            yield from sleep(1)
    except EofStream:
        pass
    response.close()

get_event_loop().run_until_complete(func())
Greg, you pointed to real problems. The client part has to implement a flow control subsystem.
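For readers new to the idea, write flow control in asyncio means pausing the producer while the transport's send buffer is above its high-water mark and resuming it once the buffer drains. A minimal sketch of the concept with plain asyncio streams (not aiohttp's internals, which are what this issue is about adding); the host, port and file name are placeholders:

from asyncio import coroutine, get_event_loop, open_connection

@coroutine
def upload(path, host, port):
    reader, writer = yield from open_connection(host, port)
    with open(path, 'rb') as fp:
        chunk = fp.read(65536)
        while chunk:
            writer.write(chunk)
            # drain() suspends this coroutine whenever the transport's write
            # buffer is full, so the file is read no faster than the peer can
            # accept it -- that pause/resume cycle is the flow control.
            yield from writer.drain()
            chunk = fp.read(65536)
    writer.close()

get_event_loop().run_until_complete(upload('big1Gfile', 'host', 80))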
Ah okay. No rush at all, as I'm not trying to use this in a production environment or anything yet. I had read a bit about flow control and it seems like something you'd want to take your time on to get just right. Thanks for all your work!
I've just committed write flow control; could you try latest master for the 'put' request?
Yes, looks good on the write end of things. I uploaded a 1G file and my process never exceeded 21m resident memory now. It figures that read flow control is more difficult and yet less commonly an issue, hah. I was hitting my request timeout, as I had set it to 60 thinking that was the time of no activity before expiring, but I now see it's the overall time of the request, so I set that back to None. I don't suppose there's a no-activity timeout? I wonder if that's something I should put into my send_data generator somehow?
I've just committed read flow control. Read flow control is useful server side, for example if you use a third-party HTTP service. The timeout is a timeout for sending the request and receiving all headers; after that you can use asyncio.wait_for to read the response body.
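A hedged sketch of that per-read inactivity timeout, reusing the read loop from the earlier script; the URL and the 60-second value are just placeholders:

from asyncio import coroutine, get_event_loop, wait_for
from aiohttp import EofStream, request

@coroutine
def download():
    response = yield from request('GET', 'https://host/big1Gfile')
    try:
        while True:
            # Bound each individual read instead of the whole transfer:
            # 60 idle seconds on a single chunk raises asyncio.TimeoutError.
            chunk = yield from wait_for(response.content.read(), 60)
    except EofStream:
        pass
    response.close()

get_event_loop().run_until_complete(download())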
Seems to be working like a charm. Thank you much!
Do you folks have a mailing list or IRC channel for aiohttp discussions? Or is creating Issues good enough?
My current thoughts lie in the aiohttp.client area. The HttpResponse looks to need some love with respect to large responses. It seems it would load a 5G download entirely into memory first if read() is called. Using content.read() directly is certainly okay, but I wonder whether you already have plans to improve this area?
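(A minimal sketch of the chunked content.read() approach mentioned above, streaming to disk instead of buffering the whole body; the URL and output path are placeholders:)

from asyncio import coroutine, get_event_loop
from aiohttp import EofStream, request

@coroutine
def download_to_disk():
    response = yield from request('GET', 'https://host/big1Gfile')
    with open('big1Gfile.out', 'wb') as out:
        try:
            while True:
                # Each read hands back one buffered chunk rather than the
                # whole multi-gigabyte body at once.
                chunk = yield from response.content.read()
                out.write(chunk)
        except EofStream:
            pass
    response.close()

get_event_loop().run_until_complete(download_to_disk())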
On the flip side, what about sending large requests? I'm pretty new to all of this (asyncio and aiohttp) so forgive me if this is obvious stuff to you folks. I've tried setting data to a file-like object from the standard open(path) but it just hangs for me. If I pass open(path).read() instead it works, but of course that's bad if it's a huge file.
Just to give some context, I'm working on an SDK to work as a client against OpenStack Swift / Rackspace Cloud Files, something I'm quite familiar with ;) as I've been working on that project for years now, but only in the Python 2.6 and 2.7 realm. This is my first real attempt with Python 3, specifically 3.4.