If I run code like the following on a Parquet file that contains nulls, I get an error:
```python
import polars as pl

pqt_file = <path to a Parquet file containing nulls>
pl.scan_parquet(pqt_file).select(pl.col("*")).collect()
```
The error is `Any(ArrowError(NotYetImplemented("Reading Null from parquet still not implemented")))`.
If I instead read the data using PyArrow first, I get a different error:
```python
import polars as pl
import pyarrow.dataset as ds

pqt_file = <path to a Parquet file containing nulls>
data = ds.dataset(pqt_file)
df = pl.from_arrow(data.to_table())
df.lazy().select(pl.col("*")).collect()
```
In this case, the error is `InvalidArgumentError("all columns in a record batch must have the same length")`, though I suspect the underlying issue is the same. This only seems to happen when using the lazy API; if I read the file using `pl.read_parquet`, it seems to work fine.
Is reading nulls from Parquet expected to be implemented any time soon?