Generic DeltaTable error: Version mismatch with new schema merge functionality in AWS S3 #2262

liamphmurphy · 2024-03-07T21:52:17Z

Environment

Delta-rs version: python v0.16

Binding: Python

Environment:

Cloud provider: AWS S3 with DynamoDB locking

Bug

What happened:

To test the Rust engine, we cleared out all existing Delta tables in our nonprod environment and switched from the pyarrow engine to the Rust engine with schema merging, using this write_deltalake call:

 write_deltalake(s3_path, table, schema=pyarrow_schema, mode="append", engine="rust", partition_by=["Uid","date","hour"], schema_mode="merge", configuration={"delta.logRetentionDuration": "interval 7 day"})

Although this was a brand-new Delta table, and after some successful writes, the Lambdas eventually started failing with Generic DeltaTable error: Version mismatch. I believe the error is coming from here:

delta-rs/crates/core/src/table/state.rs

Line 192 in 3e6a4d6

return Err(DeltaTableError::Generic("Version mismatch".to_string()));

What you expected to happen:

Especially since we are testing with a fresh table, I'd expect all writes to work (and not just some) even with the new schema merge flag set.

How to reproduce it:
I was not able to reproduce this locally with a randomly generated dataset, so my guess is that it has more to do with the DynamoDB locking on S3. If you have thoughts on how I could test this better, please let me know.

Note that we have roughly 10 concurrent Lambdas that could potentially write to the table. However, before this change we had 50 writing with pyarrow and all was well.
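One possible way to exercise the concurrency locally is a small harness that calls a write function from several threads and collects any errors. This is only a sketch: `run_concurrent` and `write_once` are hypothetical names, and it assumes the failure depends on concurrent writers rather than on S3 itself, which is not confirmed here.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrent(write_once, writers=10, rounds=5):
    """Call write_once(writer_id, round_no) from `writers` threads.

    Returns a list of (writer_id, round_no, error_message) for failed calls.
    `write_once` is a hypothetical callable supplied by the caller.
    """

    def worker(writer_id):
        failures = []
        for round_no in range(rounds):
            try:
                write_once(writer_id, round_no)
            except Exception as exc:  # collect errors instead of aborting the run
                failures.append((writer_id, round_no, str(exc)))
        return failures

    with ThreadPoolExecutor(max_workers=writers) as pool:
        per_writer = list(pool.map(worker, range(writers)))

    # Flatten the per-writer lists into one list of failures.
    return [failure for failures in per_writer for failure in failures]
```

Here `write_once` would wrap the write_deltalake call above, pointed at a local path or a test bucket. Threads in one process are not the same as separate Lambdas, so a clean run does not rule the problem out.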


rtyler · 2024-03-09T01:10:38Z

Does this only manifest with the schema evolution? Or are you able to see errors with append or merge writes as well?

ion-elgreco · 2024-03-28T21:33:21Z

> Does this only manifest with the schema evolution? Or are you able to see errors with append or merge writes as well?

It happens with any operation when there is concurrency and the state gets updated at the end.
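To illustrate the kind of race described here, below is a toy model in plain Python. It is not delta-rs code: `ToyLog`, `ToyWriter`, and `VersionMismatch` are made-up names, and the check it performs (local snapshot version plus one must equal the committed version) is an assumption about what the linked line in state.rs guards against.

```python
class VersionMismatch(Exception):
    """Stand-in for the generic "Version mismatch" error (hypothetical)."""


class ToyLog:
    """Shared commit log: each commit receives the next version number."""

    def __init__(self):
        self.latest = -1  # no commits yet

    def commit(self):
        self.latest += 1
        return self.latest


class ToyWriter:
    """Writer that keeps a local snapshot version (assumed behavior)."""

    def __init__(self, log):
        self.log = log
        self.snapshot_version = log.latest

    def write(self):
        committed = self.log.commit()
        # After the commit, advance the local snapshot by exactly one
        # version and check that it matches the version just committed.
        if self.snapshot_version + 1 != committed:
            raise VersionMismatch(
                f"snapshot at {self.snapshot_version}, commit landed at {committed}"
            )
        self.snapshot_version = committed
        return committed


def demo():
    log = ToyLog()
    a, b = ToyWriter(log), ToyWriter(log)  # both start from the same snapshot
    results = [a.write()]                  # a commits version 0
    try:
        b.write()                          # b commits version 1 with a stale snapshot
    except VersionMismatch as exc:
        results.append(str(exc))
    return results
```

In this toy model the second writer's commit has already landed by the time the error is raised, so a blind retry of the write would append the same data twice.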

liamphmurphy added the bug Something isn't working label Mar 7, 2024

ion-elgreco added the storage/aws AWS S3 storage related label Mar 7, 2024

ion-elgreco mentioned this issue Mar 11, 2024

fix(rust): update snapshot when having concurrent writers #2280

Closed

ion-elgreco mentioned this issue Apr 7, 2024

feat(rust): advance state in post commit #2396

Merged

Blajda closed this as completed in #2396 Apr 27, 2024

Blajda closed this as completed in 28ad395 Apr 27, 2024
