-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cleanup download utils #8273
cleanup download utils #8273
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/8273
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (1 Unrelated Failure)As of commit dabc313 with merge base f21c6bd (): BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pmeier , LGTM but please check my comment below
for chunk in content: | ||
# filter out keep-alive new chunks | ||
if not chunk: | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new changes are only equivalent to the old ones if this continue
was meant to be a break
. Was this meant to be a break
? In other words, can there be an empty chunk and then a non-empty chunk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TL;DR they are equivalent since content
now can only be one type or iterator for which the branch will never trigger.
These keep-alive chunks are b""
. Content iterators coming from requests
could actually contain them. However, the content iterator coming from our _urlretrieve
function cannot.
The way iter
works when you have two inputs is quite unintuitive. Basically waht iter(lambda: response.read(chunk_size), b"")
does is it calls the lambda
forever until that returns the sentinel b""
. If that happens it raises a StopIteration
. Meaning
for chunk in iter(lambda: response.read(chunk_size), b""):
if not chunk:
# this will never be reached
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To answer your question directly:
In other words, can there be an empty chunk and then a non-empty chunk?
No.
items = [b"foo", b"", b"bar"]
idx = 0
def fn():
global idx
item = items[idx]
idx += 1
return item
for chunk in iter(fn, b""):
print(chunk)
b'foo'
Reviewed By: vmoens Differential Revision: D55062790 fbshipit-source-id: d85114aa859d025a467642ee53197ff1d88b1d1d
Follow-up to #8237. We only had the
_save_response_content
as a separate function because it was used for the GDrive as well. Now, we can just inline it.