transfer_manager sliced download is slow on cloud run #1093

Closed
gdhananjay opened this issue Jul 25, 2023 · 4 comments


gdhananjay commented Jul 25, 2023

My aim is to achieve a higher download speed for a 1 GB file on Cloud Run (serverless) with 4 CPUs and 8 GB of memory.

The first option is the Python client library, since my application is in Python, but the download never goes beyond 67 MB/s. I tried many combinations of worker counts and chunk sizes with the transfer manager; the stats are attached. My basic doubt: with a worker count of 48, it should at least consume more CPU, but it doesn't seem to be consuming more CPU.

Source code:

from google.cloud.storage import Client, transfer_manager
from datetime import datetime, timezone
import os

storage_client = Client()

bucket = storage_client.bucket('myBucket')
# Chunk sizes in bytes (32 MiB, 50 MiB, ~75 MiB, 100 MiB) and worker counts to sweep.
chunk_list = [33554432, 52428800, 78905344, 104857600]
work_list = [4, 8, 16, 22, 32, 48]
for chunk in chunk_list:
    for worker in work_list:
        blob = bucket.blob('data_sets/my1GbFile')
        print('download started: ', 'worker:', worker, 'chunk_size: ', chunk)
        start_time = datetime.now(timezone.utc)
        # Sliced download: each worker fetches one chunk-sized byte range.
        transfer_manager.download_chunks_concurrently(
            blob, '/tmp/myTmpFile_' + str(worker) + '_' + str(chunk),
            chunk_size=chunk, max_workers=worker)
        delta_time = datetime.now(timezone.utc) - start_time
        execution_time_ms = round(delta_time.total_seconds() * 1000)
        print('download completed: ', worker, chunk, execution_time_ms)
        os.remove('/tmp/myTmpFile_' + str(worker) + '_' + str(chunk))
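
For reference, the MB/s figures quoted above follow from the printed milliseconds. A minimal sketch, assuming the test object is exactly 1 GiB (adjust FILE_SIZE_BYTES to the real size):

# Assumption: the test object is exactly 1 GiB.
FILE_SIZE_BYTES = 1024 ** 3

def throughput_mb_per_s(elapsed_ms):
    """MB/s for a FILE_SIZE_BYTES transfer that took elapsed_ms milliseconds."""
    return (FILE_SIZE_BYTES / 1024 ** 2) / (elapsed_ms / 1000)

print(round(throughput_mb_per_s(15625), 1))  # e.g. 15625 ms -> 65.5 MB/s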

Output stats on Cloud Shell:
output_cloudshell.txt

Output stats on Cloud Run:

gen 2- stats.csv

Is it possible to verify the download speed, and whether all CPU cores are really being used, on Cloud Run? The fact is that this works as expected on Cloud Shell.
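
One way to check per-core utilization around a sliced download is something like the following minimal sketch (it assumes psutil is installed in the image; system-wide sampling also captures any worker subprocesses):

import time
import psutil  # assumption: installed via `pip install psutil`
from google.cloud.storage import Client, transfer_manager

blob = Client().bucket('myBucket').blob('data_sets/my1GbFile')

psutil.cpu_percent(percpu=True)  # prime the counters; the first reading is meaningless
start = time.monotonic()
transfer_manager.download_chunks_concurrently(
    blob, '/tmp/cpu_probe', chunk_size=33554432, max_workers=16)
elapsed = time.monotonic() - start
per_core = psutil.cpu_percent(percpu=True)  # average per-core CPU % since the priming call
print('elapsed s:', round(elapsed, 2))
print('per-core CPU %:', per_core)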

Could you guide me on how to reach Cloud Run support? I already raised this in the forum:
https://www.googlecloudcommunity.com/gc/Serverless/cloud-bucket-blob-download-is-very-slow-in-cloud-run/m-p/614852/highlight/true#M1926

Note: I tried Cloud Run with 8 CPUs and 16 GB of RAM; still no change in the stats.

Attached is my Dockerfile for Cloud Run:

Dockerfile.txt

product-auto-label bot added the api: storage label (Issues related to the googleapis/python-storage API) on Jul 25, 2023
andrewsg (Contributor) commented

Can you verify that your Cloud Run instance can achieve higher download speeds from sources other than Cloud Storage? I'm unclear on what baseline network performance to expect from Cloud Run instances.
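
For example, a baseline test along these lines (a minimal sketch; the URL is a placeholder for any large file hosted outside Cloud Storage, and requests is assumed to be available in the image):

import time
import requests  # assumption: available in the container image

URL = 'https://example.com/large-test-file'  # placeholder; use any large non-GCS file

start = time.monotonic()
bytes_read = 0
with requests.get(URL, stream=True, timeout=300) as resp:
    resp.raise_for_status()
    for part in resp.iter_content(chunk_size=1024 * 1024):  # read in 1 MiB pieces
        bytes_read += len(part)
elapsed = time.monotonic() - start
print('MB downloaded:', round(bytes_read / 1024 ** 2, 1))
print('MB/s:', round(bytes_read / 1024 ** 2 / elapsed, 1))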

gdhananjay (Author) commented

I will try downloading from somewhere other than GCS. I also don't see any specific statement about baseline performance in the docs; I have been searching for the last 4-5 days. Is it possible to check from your side?
This is actually very important, as it could change the direction of my project. If that's the case, it would also be good to document it in all the GCS client libraries.

andrewsg (Contributor) commented

No, I'm afraid I don't have any information on Cloud Run performance. I would recommend performing realistic tests from your actual application that download data from other sources, and seeing whether any of them are substantially faster than your downloads from GCS.

gdhananjay (Author) commented

I tried the same container on GKE, on a local machine, and on Cloud Run. It seems Cloud Run is slow in download/upload speed.
