Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI integration test fails randomly: test_restore_by_datetime #1925

Closed
ion-elgreco opened this issue Nov 30, 2023 · 4 comments
Closed

CI integration test fails randomly: test_restore_by_datetime #1925

ion-elgreco opened this issue Nov 30, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@ion-elgreco
Copy link
Collaborator

One of our tests (test_restore_by_datetime) in the CI - integration tests sometimes randomly fails: https://github.com/delta-io/delta-rs/actions/runs/7044053111/job/19170981661?pr=1924

@ion-elgreco ion-elgreco added the bug Something isn't working label Nov 30, 2023
@r3stl355
Copy link
Contributor

There are 3 writes to the table in that test, could all 3 be within the same millisecond? The test is picking up v1 timestamp then retrieving a version for it. The error can be thrown if the snapshot version also has the same timestamp

@r3stl355
Copy link
Contributor

Interesting results after some extra tracing within this test

All the history timestamps are [1703869558237, 1703869559278, 1703869560403, 1703869561411] showing that they are in sequence with 1 sec gap between them so I doubt the issue can be fixed by adding a gap between writes so I am going to remove the current delay between writes from the test

Test is looking for a timestamp 1703869559278, which is a correct value at index 1. I confirmed that is gets converted to a correct DateTime (converting it to timestamps gives a correct value: 1703869559278)

So looks like there is something wrong with the lookup logic

@r3stl355
Copy link
Contributor

Problem located - this is not a flaky team IMO but a bug.

It starts here with creating a new DeltaTable with no history/timestamps loaded: https://github.com/delta-io/delta-rs/blob/02e26e5ecd7a8a64609bd21e8cb30e9bad4cc17b/crates/deltalake-core/src/operations/restore.rs#L142C44-L142C44

The consequences surface here: https://github.com/delta-io/delta-rs/blob/02e26e5ecd7a8a64609bd21e8cb30e9bad4cc17b/crates/deltalake-core/src/table/mod.rs#L884C28-L884C49 in turn leading to

Consequences: If your writes are fast enough to millisecond between the start of write and commit to storage - you are good, otherwise not and the restore to DataTime will not works as expected, possibly resulting in an earlier version being restored.

I'll submit a fix in the next days

@rtyler
Copy link
Member

rtyler commented Jan 3, 2024

I believe @r3stl355 fixed this

@rtyler rtyler closed this as completed Jan 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants