_last_checkpoint size field is incorrect #1468

Closed
chelseajonesr opened this issue Jun 15, 2023 · 0 comments · Fixed by #1477
Labels
binding/rust Issues for the Rust crate bug Something isn't working good first issue Good for newcomers

Comments

@chelseajonesr

The `size` field in the `_last_checkpoint` file should be the number of actions stored in the checkpoint. The code instead writes the size of the checkpoint file in bytes (which could be written to the optional field `sizeInBytes`).

Field definitions:

| Field | Description |
|---|---|
| size | The number of actions that are stored in the checkpoint. |
| ... | ... |
| sizeInBytes | The number of bytes of the checkpoint. This field is optional. |

https://github.com/delta-io/delta-rs/blob/0dda99bbb080d5cf8a6ecc350d0a789e7fa56549/rust/src/action/checkpoints.rs#LL122C43-L122C43

    let size = parquet_bytes.len() as i64;
    let checkpoint = CheckPoint::new(version, size, None);
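
For illustration, here is a minimal sketch of the behavior the field definitions describe: `size` carries the action count and the byte length goes into the optional `sizeInBytes`. The `LastCheckpoint` struct, the `last_checkpoint` function, and the `num_actions` parameter are hypothetical names made up for this example; they are not the delta-rs API, and the actual fix may be shaped differently.

```rust
/// Hypothetical mirror of the `_last_checkpoint` fields discussed in this issue.
/// This is not the delta-rs `CheckPoint` type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LastCheckpoint {
    /// Table version the checkpoint was written at.
    pub version: i64,
    /// Number of actions stored in the checkpoint.
    pub size: i64,
    /// Number of bytes of the checkpoint (optional per the field definitions).
    pub size_in_bytes: Option<i64>,
}

/// Build the metadata from the action count and the serialized Parquet bytes.
/// `num_actions` is assumed to be tracked by the caller while writing the checkpoint.
pub fn last_checkpoint(version: i64, num_actions: usize, parquet_bytes: &[u8]) -> LastCheckpoint {
    LastCheckpoint {
        version,
        // The action count, not the file length, goes into `size`.
        size: num_actions as i64,
        // The file length goes into the optional byte-size field.
        size_in_bytes: Some(parquet_bytes.len() as i64),
    }
}
```

With this shape, a checkpoint holding 3 actions serialized to 2048 bytes would report `size = 3` and `size_in_bytes = Some(2048)`.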
@wjones127 wjones127 added bug Something isn't working binding/rust Issues for the Rust crate good first issue Good for newcomers labels Jun 15, 2023
cmackenzie1 added a commit to cmackenzie1-contrib/delta-rs that referenced this issue Jun 19, 2023
…actions

The `size` field should be the number of actions stored in the checkpoint
while `sizeInBytes` is used for the total size in bytes.

Added `CheckPointBuilder` to make the creation of these easier to use.

- Closes delta-io#1468
cmackenzie1 added a commit to cmackenzie1-contrib/delta-rs that referenced this issue Jun 20, 2023
wjones127 pushed a commit that referenced this issue Jun 20, 2023
…actions (#1477)

# Description
The `size` field should be the number of actions stored in the
checkpoint while `sizeInBytes` is used for the total size in bytes.

Added `CheckPointBuilder` to make the creation of these easier to use.

# Related Issue(s)
- Closes #1468 

# Documentation


https://github.com/delta-io/delta/blob/master/PROTOCOL.md#last-checkpoint-file
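
As a rough idea of what a builder such as the `CheckPointBuilder` mentioned in the description might look like, here is a hedged sketch. Only the type name `CheckPointBuilder` comes from the commit message; the fields, the method names (`new`, `with_size_in_bytes`, `build`), and the `CheckPoint` struct layout below are assumptions for illustration and may not match what #1477 actually added.

```rust
/// Assumed layout of the checkpoint metadata; illustrative only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CheckPoint {
    pub version: i64,
    /// Number of actions stored in the checkpoint.
    pub size: i64,
    /// Total size of the checkpoint in bytes, if known.
    pub size_in_bytes: Option<i64>,
}

/// Hypothetical builder: required fields are passed to `new`,
/// optional ones are set through chained methods.
#[derive(Debug, Clone)]
pub struct CheckPointBuilder {
    version: i64,
    size: i64,
    size_in_bytes: Option<i64>,
}

impl CheckPointBuilder {
    /// Start a builder with the table version and the number of actions.
    pub fn new(version: i64, size: i64) -> Self {
        Self { version, size, size_in_bytes: None }
    }

    /// Record the checkpoint's size in bytes (optional).
    pub fn with_size_in_bytes(mut self, size_in_bytes: i64) -> Self {
        self.size_in_bytes = Some(size_in_bytes);
        self
    }

    /// Consume the builder and produce the checkpoint metadata.
    pub fn build(self) -> CheckPoint {
        CheckPoint {
            version: self.version,
            size: self.size,
            size_in_bytes: self.size_in_bytes,
        }
    }
}
```

Keeping the two sizes as separately named builder inputs makes it harder to pass a byte count where an action count is expected, which is the mix-up this issue reports.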