-
-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Support s3 as published collab storage #798
Conversation
95981e0
to
5938ff2
Compare
5938ff2
to
9c193be
Compare
tokio::spawn(async move { | ||
let result = bucket_client.put_blob(&object_key, &data).await; | ||
if let Err(err) = result { | ||
debug!("Failed to publish collab to S3: {}", err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can create a mut vec![]
to store all the handles returned by tokio::spawn
, then await them all. this will ensure that if the endpoint returns success, we have indeed put all the data in S3 and avoid situation where a success code is returned to client, but there's error putting in S3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, i was wondering whether we should do so. The downside of waiting for the request is, the time taken to make sure that the collab is written to both postgres and S3 will increase. Will need further discussion to determine whether the increase in latency is worth it or not, since we do have a fallback to postgres.
Ok(Some((metadata, resp.to_blob()))) | ||
}, | ||
Err(_) => { | ||
let result = match select_published_data_for_view_id(&self.pg_pool, view_id).await? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When sending collab state to S3, are we removing it from Postgres? Otherwise, I think we should check if we have Error::NotFound here. Without it, over transition period from Postgres→S3, we might get stale state from postgres.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, the collab state are written to both postgres and s3 on every publish, so both s3 and postgres will have the same data.
9c193be
to
15f9c44
Compare
This PR will introduce a new trait, PublishedCollabStore, which will handle the different operations for publish collabs. Two different implementation has been introduced: one write and read exclusively from postgres as per status quo, the other will write to both s3 and postgres, and read from s3 with postgres as fallback.