Polars version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Issue description
Under very specific conditions, apparently depending on whether row_group_size is a third of the DataFrame height, a pl.List(pl.UInt32) column is not written correctly to Parquet: the final list reads back as [null, null] instead of [1, 2].
Reproducible example
import polars as pl

df = pl.Series('a', [*[None]*900_000, [1, 2]], dtype=pl.List(pl.UInt32)).to_frame()
print(df.tail(1))  # all the output should be the same as this one
# shape: (1, 1)
# ┌──────────────┐
# │ a            │
# │ ---          │
# │ list[u32]    │
# ╞══════════════╡
# │ [1, 2]       │
# └──────────────┘

df.write_parquet('test.pq')
print(pl.read_parquet('test.pq').tail(1))  # not the same (not working)
# shape: (1, 1)
# ┌──────────────┐
# │ a            │
# │ ---          │
# │ list[u32]    │
# ╞══════════════╡
# │ [null, null] │
# └──────────────┘

df.write_parquet('test.pq', row_group_size=300_000)
print(pl.read_parquet('test.pq').tail(1))  # the same (working)
# shape: (1, 1)
# ┌──────────────┐
# │ a            │
# │ ---          │
# │ list[u32]    │
# ╞══════════════╡
# │ [1, 2]       │
# └──────────────┘

df.write_parquet('test.pq', row_group_size=300_001)
print(pl.read_parquet('test.pq').tail(1))  # not the same (not working)
# shape: (1, 1)
# ┌──────────────┐
# │ a            │
# │ ---          │
# │ list[u32]    │
# ╞══════════════╡
# │ [null, null] │
# └──────────────┘
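To make the three cases above easier to compare, here is a rough sketch of how the 900_001 rows might be split into row groups. The helper `row_group_sizes` is hypothetical and not part of Polars. It assumes the writer simply cuts the frame into contiguous chunks of at most `row_group_size` rows, and that the default is 512**2 rows; the real writer may behave differently.

```python
def row_group_sizes(height: int, row_group_size: int) -> list[int]:
    """Split `height` rows into contiguous chunks of at most `row_group_size` rows.

    Hypothetical helper: an assumption about the chunking, not the Polars implementation.
    """
    full, rest = divmod(height, row_group_size)
    return [row_group_size] * full + ([rest] if rest else [])


height = 900_001  # 900_000 nulls plus the single [1, 2] row from the example

# 512**2 is assumed to be the default row_group_size; the other two are the values tested above.
for size in (512**2, 300_000, 300_001):
    sizes = row_group_sizes(height, size)
    print(f"row_group_size={size}: groups={sizes}, last group has {sizes[-1]} row(s)")
```

Under these assumptions, row_group_size=300_000 is the only case where the [1, 2] row ends up alone in its own row group; in the two failing cases it shares the last row group with a long run of nulls. This is only an observation that may help narrow things down, not a confirmed cause.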
Expected behavior
Installed versions