
Extra disk space usage when scheduling multiple large background uploads to S3 #5022

Closed
tspop opened this issue Oct 30, 2023 · 2 comments
Labels: pending-community-response (Issue is pending response from the issue requestor), question (General question)

Comments

tspop commented Oct 30, 2023

Let's say I want to schedule background uploads for 5 large files, each being 1GB in size.

Will the S3 SDK divide these files into smaller chunks before initiating the upload, potentially utilizing an additional 5GB of storage space?
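
For reference, I schedule the uploads roughly like this (a minimal sketch using Amplify Swift's Storage.uploadFile; the keys and file URLs are placeholders):

```swift
import Amplify

// Schedule uploads for 5 local files of ~1 GB each.
// Keys and file URLs are placeholders for illustration.
func scheduleUploads(fileURLs: [URL]) async throws {
    for (index, fileURL) in fileURLs.enumerated() {
        let uploadTask = Amplify.Storage.uploadFile(
            key: "large-file-\(index)",
            local: fileURL
        )
        let uploadedKey = try await uploadTask.value
        print("Finished uploading \(uploadedKey)")
    }
}
```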

phantumcode (Contributor) commented Oct 30, 2023

@tspop Thanks for submitting your question. When uploading files larger than 5 MB, Amplify does rely on temporary local caching: it splits the file being uploaded into smaller 5 MB chunks, so additional disk space is required to temporarily store the chunked parts while they are uploaded.
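
As a rough back-of-the-envelope illustration of the extra space involved, assuming a 5 MB part size and the example above of five 1 GB files:

```swift
let partSize: Int64 = 5 * 1024 * 1024      // 5 MB per part
let fileSize: Int64 = 1024 * 1024 * 1024   // 1 GB per file
let fileCount: Int64 = 5

let partsPerFile = (fileSize + partSize - 1) / partSize // ceiling division: 205 parts
let extraBytes = fileSize * fileCount                   // ~5 GB of temporary part files
print("parts per file: \(partsPerFile), temporary space: \(extraBytes) bytes")
```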

phantumcode added the question and pending-community-response labels on Oct 30, 2023

zamzamfp commented Sep 17, 2024

Hi @phantumcode, we are facing an issue with the AWS iOS SDK when using multipart upload for large files (e.g., 10 GB). The SDK chunks the entire file into smaller parts at the beginning, before the upload even starts. This behaviour requires the device to have enough additional storage space (e.g., an extra 10 GB) to accommodate the chunked copies. It also significantly delays the upload, as the chunking can take up to a minute to complete, with no clear indication that it is happening unless debug logging is enabled.

Questions:

  1. Is there a technical reason the chunking process has to occur entirely before the upload starts, rather than chunking parts on the fly (i.e., chunk the first parts, upload them, and then proceed with the rest, as sketched after this list)?
  2. Are there any plans to change this behavior in the future to improve efficiency?
  3. Is there a way to track the progress of the chunking process so we can communicate it in the UI?
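
To illustrate question 1, here is a rough sketch of what on-the-fly chunking could look like (hypothetical code using FileHandle; `uploadPart` is a placeholder for the S3 UploadPart call, not an actual SDK API, and the progress callback also covers question 3):

```swift
import Foundation

let partSize = 5 * 1024 * 1024 // 5 MB per part

// Hypothetical sketch: read and upload one part at a time instead of
// chunking the whole file up front. `uploadPart` stands in for the real
// S3 UploadPart call and is not part of the SDK's public API.
func uploadInParts(fileURL: URL,
                   uploadPart: (Data, Int) async throws -> Void,
                   onProgress: (Double) -> Void) async throws {
    let fileSize = (try FileManager.default
        .attributesOfItem(atPath: fileURL.path)[.size] as? NSNumber)?.intValue ?? 0
    let handle = try FileHandle(forReadingFrom: fileURL)
    defer { try? handle.close() }

    var partNumber = 1
    var bytesUploaded = 0
    // Each part is uploaded as soon as it is read, so at most one
    // part-sized buffer lives in memory and no full copy hits the disk.
    while let part = try handle.read(upToCount: partSize), !part.isEmpty {
        try await uploadPart(part, partNumber)
        bytesUploaded += part.count
        partNumber += 1
        onProgress(Double(bytesUploaded) / Double(fileSize)) // question 3: progress
    }
}
```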
