Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support s3 as published collab storage #798

Merged
merged 2 commits into from
Sep 11, 2024

Conversation

khorshuheng
Copy link
Collaborator

@khorshuheng khorshuheng commented Sep 6, 2024

This PR will introduce a new trait, PublishedCollabStore, which will handle the different operations for publish collabs. Two different implementation has been introduced: one write and read exclusively from postgres as per status quo, the other will write to both s3 and postgres, and read from s3 with postgres as fallback.

@khorshuheng khorshuheng marked this pull request as draft September 6, 2024 09:06
@khorshuheng khorshuheng force-pushed the supprot-s3-as-published-collab-storage branch from 95981e0 to 5938ff2 Compare September 9, 2024 08:59
@khorshuheng khorshuheng marked this pull request as ready for review September 9, 2024 09:03
@khorshuheng khorshuheng changed the title Draft: feat: Support s3 as published collab storage feat: Support s3 as published collab storage Sep 9, 2024
@khorshuheng khorshuheng force-pushed the supprot-s3-as-published-collab-storage branch from 5938ff2 to 9c193be Compare September 9, 2024 09:16
tokio::spawn(async move {
let result = bucket_client.put_blob(&object_key, &data).await;
if let Err(err) = result {
debug!("Failed to publish collab to S3: {}", err);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can create a mut vec![] to store all the handles returned by tokio::spawn, then await them all. this will ensure that if the endpoint returns success, we have indeed put all the data in S3 and avoid situation where a success code is returned to client, but there's error putting in S3.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, i was wondering whether we should do so. The downside of waiting for the request is, the time taken to make sure that the collab is written to both postgres and S3 will increase. Will need further discussion to determine whether the increase in latency is worth it or not, since we do have a fallback to postgres.

Ok(Some((metadata, resp.to_blob())))
},
Err(_) => {
let result = match select_published_data_for_view_id(&self.pg_pool, view_id).await? {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When sending collab state to S3, are we removing it from Postgres? Otherwise, I think we should check if we have Error::NotFound here. Without it, over transition period from Postgres→S3, we might get stale state from postgres.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, the collab state are written to both postgres and s3 on every publish, so both s3 and postgres will have the same data.

@khorshuheng khorshuheng force-pushed the supprot-s3-as-published-collab-storage branch from 9c193be to 15f9c44 Compare September 11, 2024 03:06
@appflowy appflowy merged commit abae8d2 into main Sep 11, 2024
9 of 10 checks passed
@appflowy appflowy deleted the supprot-s3-as-published-collab-storage branch September 11, 2024 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants