This repository has been archived by the owner on Feb 18, 2024. It is now read-only.
Round Trip [Rust -> arrow2_convert -> Arrow -> Parquet -> Arrow -> Rust] #1376
Labels: bug (Something isn't working)
Hi! I've been cobbling together an end-to-end example of serializing deeply nested Rust structs through arrow2_convert to Parquet and back (using their complex example with the fixed-size buffers removed, since fixed-size types are not currently implemented in arrow2).
I've reproduced the relevant code from the arrow2_convert `complex_example.rs` example, without fixed-size types. I'm able to create a Vec, convert it to Arrow, and serialize the result to a buffer of Parquet bytes. Reading the metadata and statistics back out of the buffer appears fine, but when I iterate through the chunks, I get an `OutOfSpec` error.
I have a feeling this is just a serialization misuse on my end that's getting propagated downstream. Attempting to read the intermediate Parquet file with Pyarrow results in an `OSError: Malformed levels. min: 0 max: 3 out of range. Max Level: 2`, which again points to a serialization error. My suspicion is that I'm not defining the row groups (`RowGroupIterator`) correctly, specifically the `encodings`, which I have as a `vec![vec![Encoding::Plain; 25]]`.
I'm happy to contribute this back as a test case after we get it working; I think it is a common use case that'd be valuable to document.
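To make the suspicion about the encodings concrete: if the writer expects one inner vector per top-level field, with one entry per leaf (primitive) column under that field, then a single inner vector of 25 entries would only match a schema with exactly one top-level field that flattens to 25 leaves. That expected layout is an assumption here and should be checked against the arrow2 documentation. The sketch below shows the shape using a toy type tree; `Ty`, `Encoding`, `leaf_count`, and `plain_encodings` are hypothetical names for illustration, not arrow2 APIs.

```rust
/// Toy stand-in for a nested data type (hypothetical; not arrow2's `DataType`).
#[derive(Debug, Clone)]
pub enum Ty {
    Primitive,
    List(Box<Ty>),
    Struct(Vec<Ty>),
}

/// Toy stand-in for a Parquet encoding (hypothetical; not arrow2's `Encoding`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Encoding {
    Plain,
}

/// Number of leaf (primitive) columns a type flattens into.
pub fn leaf_count(ty: &Ty) -> usize {
    match ty {
        Ty::Primitive => 1,
        // A list adds nesting but no extra leaf of its own.
        Ty::List(inner) => leaf_count(inner),
        // A struct contributes the leaves of all of its children.
        Ty::Struct(children) => children.iter().map(leaf_count).sum(),
    }
}

/// One inner vector per top-level field, one `Plain` entry per leaf column.
/// Assumption: this is the layout the writer expects for `encodings`.
pub fn plain_encodings(fields: &[Ty]) -> Vec<Vec<Encoding>> {
    fields
        .iter()
        .map(|field| vec![Encoding::Plain; leaf_count(field)])
        .collect()
}
```

Under that assumption, a schema with several top-level fields would need several inner vectors whose lengths sum to the total leaf count, rather than one vector holding all of them.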
Thanks for the help!
Deeply Nested Structs & arrow2_convert implementation
Failing Test Case
Rust Backtrace