Add GCSStore
#1547
base: main
@SchahinRohani Here's the PR with the approach we discussed to add the GCS Store.
@asr2003 could you convert this to a Draft PR until it's ready?
Thanks a lot! Before we move on to a more in-depth review, please ensure the following:
- Don't import openssl or reqwest as they pull in too many dependencies and won't easily build into a statically linked executable. GCS supports gRPC, so let's go with a tonic implementation (I believe the protos are here: https://github.com/googleapis/googleapis/blob/master/google/storage/v2/storage.proto and there is infra for it in the nativelink-proto directory).
- Please ensure that we have test coverage for core functionality. The scope of this should be at least what the S3Store has.
- Regarding the failing CI jobs, something that might be helpful as well is to take a look at the https://www.nativelink.com/docs/contribute/guidelines#local-development-setup which comes with the exact versions and environment that our builds run in and the bazel test //... command which is what most of CI invokes. See also some details on toolchains etc here: https://www.nativelink.com/docs/contribute/bazel#test. Tbh I'm not sure whether the cargo build can generate protos at all 😅 My guess is that this part will only work with Bazel.
Reviewable status: 0 of 2 LGTMs obtained, and 0 of 11 files reviewed, and pending CI: Bazel Dev / ubuntu-24.04, Cargo Dev / ubuntu-22.04, Coverage, Installation / ubuntu-22.04, NativeLink.com Cloud / Remote Cache / ubuntu-24.04, Publish image, Publish nativelink-worker-init, Remote / lre-cc / large-ubuntu-22.04, Remote / lre-rs / large-ubuntu-22.04, docker-compose-compiles-nativelink (22.04), integration-tests (22.04), and 3 discussions need to be resolved
As for working with Bazel, I agree with you @aaronmondal.
Ahh, while the google-cloud-storage crate simplifies things, it seems the tonic/gRPC approach is a cleaner and more future-proof solution for interacting with GCS without pulling in unnecessary dependencies (reqwest/openssl). If Bazel and CI integration for tonic/protobuf generation is feasible, I will rewrite the implementation to use it. Can I go ahead with this @aaronmondal?
First of all, thank you for your effort @asr2003. |
5e4393d to 19074d1
19074d1 to 069bbc4
Dear reviewers, Now
These are my views, and any reviews and suggestions are much appreciated. Let me know if I am missing anything and I will incorporate it.
@asr2003 Looks much nicer than the previous implementation! Some things I'm noticing:
In the first post on the issue, there is an issue template. You should complete it like the other issues. |
Please include a summary of the changes and the related issue. Please also
include relevant motivation and context.
This is a placeholder for what you are supposed to add at the top of the issue.
@aaronmondal The gRPC/tonic GCS store is tightly coupled with the |
Description
Please include a summary of the changes and the related issue. Please also
include relevant motivation and context.
Add Google Cloud Storage Store
The GCS Store implementation closely mirrors the S3 version but adapts it to the GCS gRPC API and metadata structure. It aligns with S3's structure and behavior while leveraging GCS-specific features. It also eliminates manual generation of protobufs.
mpsc::channel and FuturesUnordered. GCS currently processes chunks sequentially in the provided implementation. GCS lacks native support for concurrent part uploads in a single upload session, which can be a performance bottleneck for very large files.
Fixes #659
/claim #659
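For readers following the description, the sequential chunk processing it mentions could look roughly like the sketch below. This is not the code from this PR: ChunkPlan, plan_sequential_chunks, upload_sequentially and the send_chunk callback are hypothetical names made up for illustration, and the real store talks to GCS over tonic rather than through a closure. It only shows the idea that each chunk carries its own write offset and that the next chunk is sent only after the previous one succeeded.

```rust
/// Hypothetical description of one chunk in a single upload session.
/// The field names are illustrative and not taken from this PR.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChunkPlan {
    /// Byte offset of this chunk within the whole object.
    pub write_offset: u64,
    /// Number of bytes in this chunk.
    pub len: usize,
    /// True only for the last chunk, which finalizes the object.
    pub finish_write: bool,
}

/// Splits an object of `total_len` bytes into consecutive chunks of at most
/// `chunk_size` bytes. Returns an empty plan for an empty object.
pub fn plan_sequential_chunks(total_len: usize, chunk_size: usize) -> Vec<ChunkPlan> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut plans = Vec::new();
    let mut offset = 0usize;
    while offset < total_len {
        let len = chunk_size.min(total_len - offset);
        plans.push(ChunkPlan {
            write_offset: offset as u64,
            len,
            finish_write: offset + len == total_len,
        });
        offset += len;
    }
    plans
}

/// Sends the chunks one after another through `send_chunk` (a stand-in for
/// the real network call) and stops at the first error. Returns the number
/// of bytes that were sent successfully.
pub fn upload_sequentially<F, E>(
    data: &[u8],
    chunk_size: usize,
    mut send_chunk: F,
) -> Result<u64, E>
where
    F: FnMut(&ChunkPlan, &[u8]) -> Result<(), E>,
{
    let mut committed = 0u64;
    for plan in plan_sequential_chunks(data.len(), chunk_size) {
        let start = plan.write_offset as usize;
        // Each chunk is sent only after the previous one has succeeded.
        send_chunk(&plan, &data[start..start + plan.len])?;
        committed += plan.len as u64;
    }
    Ok(committed)
}
```

Because every chunk depends on the offset committed by the one before it, the loop cannot simply be parallelized within a single session, which is the bottleneck the description points out for very large files.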
Type of change
Please delete options that aren't relevant.
How Has This Been Tested?
Please also list any relevant details for your test configuration
Checklist
- bazel test //... passes locally
- git amend see some docs