Implement sliced object download in the Python Client library. #388
Comments
Thanks for your detailed request. I'll look into this.
I'm surprised a 16-slice download improves your time by 10x. Does it really take ten slices or more to saturate your download bandwidth? Is this perhaps functionally a workaround for some sort of bandwidth limiting in your ingress or Google's egress?
I'm not sure - I ran it on a GCE instance (e2-highmem-16). How can I specifically test for network saturation?
Thanks, that should be enough info; if it was on a GCE instance then we can use that for analysis when we're tackling this feature.
Sounds good - and to be clear, the key characteristic here is that when running a sliced download with
I see, so it's CPU-bound in your use case. That will be the first thing to look into, then. Thanks.
This is solved by #1002. |
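For reference, a minimal sketch of the transfer_manager-based approach mentioned in the following comment (assuming a recent google-cloud-storage release where transfer_manager is available; the bucket and object names, chunk size, and worker count are illustrative only, not recommendations):

```python
# Sketch only: download one large object in parallel slices via transfer_manager.
from google.cloud import storage
from google.cloud.storage import transfer_manager

client = storage.Client()
blob = client.bucket("my-bucket").blob("path/to/large-object")  # hypothetical names

transfer_manager.download_chunks_concurrently(
    blob,
    "/path/to/destination",
    chunk_size=32 * 1024 * 1024,  # 32 MiB per slice; tune for your workload
    max_workers=16,               # roughly analogous to gsutil's max components
)
```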
@andrewsg @tqa236 Your help is really appreciated. Basically, blob.download_to_filename and transfer_manager.download_chunks_concurrently show no difference in speed on Cloud Run, whereas the concurrent download works well in Cloud Shell.
@gdhananjay I'm sorry, I don't have any insight into Cloud Run in particular and I'll have to recommend you reach out to support for that product.
@gdhananjay If your issue persists and you believe it is a problem with the client library, please feel free to file a new issue here with more details as to the observed performance of single-threaded vs. multi-threaded download, and the context of your application. It looked like you mentioned it was Cloud Functions Gen 2, which I thought was separate from Cloud Run - more information will be helpful.
I tried on Cloud Run as well. The first option is the Python client library, since my application is in Python, but the download never goes beyond 67 MB/s. I tried many combinations of process counts and chunk sizes using the transfer manager; stats are attached. My basic doubt is that with a worker count of 48 it should at least consume more CPU, but it doesn't seem to. Is it possible to verify the download speed, and whether all CPU cores are really being used, on Cloud Run? The fact is this works as expected in Cloud Shell. Could you guide me on how to reach out to Cloud Run support? I already raised it in the forum. This is critical for us: if it won't work, we will have to try AWS S3 and Lambda to get the required download speed in Python.
Also, can we test this under platform support?
@gdhananjay Okay, please open a separate issue on this GitHub tracker, as I won't get notifications for comments on this closed issue. When you open that separate issue, please answer this question as well: are you sure you are not running into the maximum allocated network speed of your Cloud Run or Functions instance? Are there other services that you can access with higher throughput?
Is your feature request related to a problem? Please describe.
My use case is to download a single, large blob (~16 GB) into memory in a Python application. This happens as part of a startup process that currently takes 5 minutes. The command-line utility, gsutil, has a way to enable sliced downloads and only takes 30 seconds on the same machine and network. I would like to take advantage of this optimization in a Pythonic way.
Describe the solution you'd like
Enable sliced downloads in the Python client library, such as:
blob.download_to_filename(..., sliced_downloads=True, max_components=16)
This would match gsutil, which copies the blob to the local filesystem. It would be great, however, if the blob could also be downloaded into memory, like:
blob.download_as_bytes(..., sliced_downloads=True, max_components=16)
Describe alternatives you've considered
Knowing that gsutil can run the download concurrently, I tried using the subprocess module to call it. This doesn't work because it will not run more than one process, unlike calling it from the command line. It's also not great to run a shell command from a Python process because it assumes the Cloud SDK is set up.
I've also tried using ChunkedDownload in conjunction with multiprocessing, but I have not been able to get it to download chunks in parallel. There is also the additional overhead of dealing with the byte-stream buffer, transport authentication, checksum/data validation, etc., making it non-trivial; a rough sketch of the kind of approach I mean is below.
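For concreteness, a rough sketch of the manual slicing I was aiming for, written here with ranged download_as_bytes calls rather than ChunkedDownload (the helper names are mine, error handling and retries are omitted, and whether this parallelizes efficiently is exactly the open question):

```python
# Sketch of a manual sliced download using HTTP Range requests.
from concurrent.futures import ProcessPoolExecutor

from google.cloud import storage


def _download_slice(bucket_name, blob_name, start, end):
    # Each worker builds its own client; Client objects are not picklable.
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    # start/end are inclusive byte offsets, sent as an HTTP Range header.
    return blob.download_as_bytes(start=start, end=end)


def download_sliced(bucket_name, blob_name, max_components=16):
    client = storage.Client()
    blob = client.bucket(bucket_name).get_blob(blob_name)  # fetches metadata, incl. size
    slice_size = -(-blob.size // max_components)  # ceiling division
    ranges = [
        (offset, min(offset + slice_size, blob.size) - 1)
        for offset in range(0, blob.size, slice_size)
    ]
    # Note: with spawn-based multiprocessing this must be called from a module
    # guarded by `if __name__ == "__main__":`.
    with ProcessPoolExecutor(max_workers=max_components) as pool:
        futures = [
            pool.submit(_download_slice, bucket_name, blob_name, s, e)
            for s, e in ranges
        ]
        parts = [f.result() for f in futures]  # preserve slice order
    return b"".join(parts)
```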
Additional context
Since gsutil is a Python executable itself, I would imagine this could be implemented in the client library (ultimately making the same HTTP Range requests). The gsutil command I used on a GCE instance with 16 vCPUs:
gsutil -o 'GSUtil:parallel_thread_count=1' -o 'GSUtil:sliced_object_download_max_components=16' cp gs://bucket/key /path/to/destination
I'm also open to an existing solution I'm not aware of, but documentation is sparse on this topic.