Successful writes return error when using concurrent writers #2279
Comments
@helanto thanks for the issue :) I am not sure, though, that the change you made in your repo is the best way to solve this. Ideally you update the state of the table to (commit version - 1), and then merge that state with the actions the write performed, so you end up with the version
Thanks for the quick reply! By "commit version" do you mean this? Maybe I'm misunderstanding, but I don't see how this will create a correct snapshot. We don't necessarily know which version is the newest, or which other actions have been committed since our successful
Thank you @ion-elgreco for your reply! This means we need to read the state from the
if let Some(mut snapshot) = this.snapshot {
    // In case the existing snapshot version is not the previous version
    if snapshot.version() != version - 1 {
        // Update the snapshot to version - 1
        snapshot.update(this.log_store.clone(), Some(version - 1)).await?;
    }
    // Then we are good to merge!
    snapshot.merge(actions, &operation, version)?;
    Ok(DeltaTable::new_with_state(this.log_store, snapshot))
} else {
    // The same
}
Yes, the version that is returned after doing the commit action. So I am saying you update the table snapshot to that version minus one, since only then will merging the state with the actions of that writer lead to the version of the commit action.
@helanto Yes, that should work, I believe.
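For readers following the thread, the "catch up to version - 1, then merge" idea can be sketched without any delta-rs dependency. Everything below is a hypothetical stand-in: `Snapshot`, `Log`, and `finalize_write` are illustrative names, not the delta-rs types or API, and the real `update` and `merge` calls are async and take different arguments. The sketch only shows why merging is safe once the snapshot sits exactly one version behind the commit.

```rust
// Hypothetical, simplified stand-ins; these are NOT the delta-rs types.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    version: i64,
    files: Vec<String>,
}

/// Stand-in for the commit log: index `i` holds the files added by commit version `i`.
struct Log {
    commits: Vec<Vec<String>>,
}

impl Snapshot {
    /// Replay commits from the log until the snapshot reaches `target`.
    fn update(&mut self, log: &Log, target: i64) -> Result<(), String> {
        while self.version < target {
            let next = self.version + 1;
            let actions = log
                .commits
                .get(next as usize)
                .ok_or_else(|| format!("missing commit {next}"))?;
            self.files.extend(actions.iter().cloned());
            self.version = next;
        }
        Ok(())
    }

    /// Apply the writer's own actions; only valid when they form the very next version.
    fn merge(&mut self, actions: &[String], version: i64) -> Result<(), String> {
        if version != self.version + 1 {
            return Err("Version mismatch".to_string());
        }
        self.files.extend(actions.iter().cloned());
        self.version = version;
        Ok(())
    }
}

/// Catch up to `version - 1` first, then merge the writer's own actions.
fn finalize_write(
    mut snapshot: Snapshot,
    log: &Log,
    actions: &[String],
    version: i64,
) -> Result<Snapshot, String> {
    if snapshot.version != version - 1 {
        snapshot.update(log, version - 1)?;
    }
    snapshot.merge(actions, version)?;
    Ok(snapshot)
}
```

With a snapshot loaded at version 0 and our commit landing as version 2, calling `merge` directly fails, whereas `finalize_write` first replays version 1 from the log and then merges cleanly.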
Environment
Delta-rs version:
deltalake 0.17.1
Binding:
rust
Environment:
S3
macOS Sonoma 14.2.1, using the Docker base image rust:1.76-bookworm
Bug
What happened:
I have two concurrent writers to the same table. Each writer performs an Append operation that inserts one row. I use S3DynamoDbLogStore to prevent data loss. One writer returns an error while the other succeeds. However, when querying the table, both entries have been inserted successfully.
What you expected to happen:
When the writer succeeds, I expect the corresponding data to be part of the resulting table.
When the writer fails, I expect the data corresponding to that write not to be part of the resulting table.
How to reproduce it:
First we set up S3DynamoDbLogStore. Then we use two parallel threads to insert two rows in parallel.
We get back:
When reading the table I get back both values (even though one of the writes returned an error).
More details:
Digging a bit deeper into the code, it seems that the error originates in the final part of the write operation. At the end of the write, the writer attempts to advance its in-memory snapshot so as to include the commit it has just made. However, the table was already advanced by a different process / writer that this writer is not aware of, so its snapshot is more than one version behind its own commit. We get back a DeltaTableError::Generic("Version mismatch"), even though the commit itself has already succeeded.
When making the change here the issue is fixed. However, I am not sure whether this is the correct path to follow.
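The failure mode described above can be modelled in a few lines, without S3 or DynamoDB. This is a minimal sketch under stated assumptions: `commit`, `advance_snapshot`, and `write` are hypothetical helpers rather than delta-rs functions, the log is a plain `Vec`, and the two writers run one after the other instead of on real threads so the outcome is deterministic.

```rust
/// Hypothetical stand-in for the atomic commit step: appends the row and
/// returns the version it was assigned (the empty table is version 0).
fn commit(log: &mut Vec<String>, row: &str) -> i64 {
    log.push(row.to_string());
    log.len() as i64
}

/// Naive post-commit step: assumes nobody else committed since the snapshot was loaded.
fn advance_snapshot(snapshot_version: i64, committed: i64) -> Result<i64, String> {
    if committed != snapshot_version + 1 {
        return Err("Version mismatch".to_string());
    }
    Ok(committed)
}

/// A write is "commit, then advance the in-memory snapshot".
/// The data is already durable when the second step runs.
fn write(log: &mut Vec<String>, snapshot_version: i64, row: &str) -> Result<i64, String> {
    let committed = commit(log, row);
    advance_snapshot(snapshot_version, committed)
}
```

If both writers loaded the table at version 0, the first `write` returns `Ok(1)`. The second commits its row as version 2 and then fails in `advance_snapshot`, so the caller sees an error although both rows are in the log.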