Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adopt the delta kernel types #2489

Closed
rtyler opened this issue May 7, 2024 · 6 comments
Closed

Adopt the delta kernel types #2489

rtyler opened this issue May 7, 2024 · 6 comments
Assignees
Labels
binding/rust Issues for the Rust crate enhancement New feature or request

Comments

@rtyler
Copy link
Member

rtyler commented May 7, 2024

What feels like long ago, @roeap imported some kernel types manually into the kernel module from the early work on delta-kernel-rs. Now delta_kernel has been released with a 0.0.1 we can move to adopting those symbols and begin incrementally adopting functionality from the kernel.

This issue is just about the types. I'm not expecting to pull any of the log replay or snapshot code in

@rtyler rtyler added enhancement New feature or request binding/rust Issues for the Rust crate labels May 7, 2024
@ion-elgreco
Copy link
Collaborator

I noticed the types were incorrect in kernel for timestamps. Would be good to backport fixes from delta-rs into kernel, otherwise we will have to fix some bugs again ^^

@rtyler
Copy link
Member Author

rtyler commented May 7, 2024

@ion-elgreco we do indeed have some divergence 😱

How confident are you in our tests for timestamp support? 😄

@nicklan
Copy link

nicklan commented May 7, 2024

I noticed the types were incorrect in kernel for timestamps. Would be good to backport fixes from delta-rs into kernel, otherwise we will have to fix some bugs again ^^

We should definitely just fix those in kernel. @ion-elgreco , do you have links to the fixes by any chance? Or just the divergence so I can figure out what needs to change? I can probably figure it out but might be faster if you just point me at it.

Thanks!

@ion-elgreco
Copy link
Collaborator

@nicklan I fixed the decimal parsing to prevent decimals being parsed that are beyond 38 precision or scale, see here (https://github.com/delta-io/delta-rs/pull/2332/files):

if precision > DECIMAL_MAX_PRECISION || scale > DECIMAL_MAX_SCALE {
return Err(serde::de::Error::custom(format!(
"Precision or scale is larger than 38: {}, {}",
precision, scale
)));

and here:

pub fn decimal(precision: u8, scale: i8) -> Result<Self, ProtocolError> {
if precision > DECIMAL_MAX_PRECISION || scale > DECIMAL_MAX_SCALE {
return Err(ProtocolError::InvalidField(format!(
"decimal({},{})",
precision, scale
)));
}
Ok(DataType::Primitive(PrimitiveType::Decimal(

Here for timestampNtz:

Timestamp(val) => {
Arc::new(TimestampMicrosecondArray::from_value(*val, num_rows).with_timezone("UTC"))
}

and some parsing of multiple formats here: https://github.com/delta-io/delta-rs/pull/2383/files#diff-8e1a60f799cd1ad29923ee75871099a19c15ddfc00bef4617d82bdc02cbe3198

@roeap
Copy link
Collaborator

roeap commented May 8, 2024

take

@roeap
Copy link
Collaborator

roeap commented May 8, 2024

started work on this, we do require some updates on kernel first to be able to consume the types, but hoping that this will not be too invasive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants