Describe the usage question you have. Please include as many useful details as possible.
I tried this with pyarrow 19:
I expected the files to be different (different compressed sizes), but they are byte-for-byte identical. As a consequence, the batch sizes are lost when reading the data back.
Am I right to assume that the file should consist of chunksize-long buffers for each column (one set per record batch), and that these buffers are compressed independently using lz4 or zstd?
Component(s)
Python, C++, Format