This repository has been archived by the owner on Feb 18, 2024. It is now read-only.
At the boundary of 349_526 rows, with 349_525 nulls and only the last value specified, the parquet file that is written is incorrect.
This also seems to be related to the row group size; see the original issue report: pola-rs/polars#6289
The most minimal example I could make is:
import io
import polars as pl

f = io.BytesIO()
# 349_525 nulls followed by a single non-null list value
df = pl.Series('a', [*[None]*349_525, [1, 2]], dtype=pl.List(pl.UInt32)).to_frame()
print(df.tail(1))
df.write_parquet(f)
f.seek(0)  # rewind the buffer before reading it back
print(pl.read_parquet(f).tail(1))  # we expect the same `[1, 2]` here, but we get `[null, null]`
The state of the df is:
When we use the pyarrow backend for writing, the output is as expected.