Panic when reading feather file with 16-bit floating point column #3533
Comments
Rust does not support float16 natively. Theoretically it could be implemented with https://docs.rs/half/latest/half/ .
Note to self. I should stop |
Not sure if this report is still 100% accurate, but it seems like pyarrow also can't do much with float16: https://issues.apache.org/jira/browse/ARROW-13762
import pyarrow as pa
import numpy as np
In [44]: pa.array(np.array([1, 2.0], dtype='float32')).cast(pa.float64())
Out[44]:
<pyarrow.lib.DoubleArray object at 0x7f8372f1ca00>
[
1,
2
]
In [45]: pa.array(np.array([1, 2.0], dtype='float32')).cast(pa.float16())
---------------------------------------------------------------------------
ArrowNotImplementedError Traceback (most recent call last)
<ipython-input-45-abe3f5f71404> in <module>
----> 1 pa.array(np.array([1, 2.0], dtype='float32')).cast(pa.float16())
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/array.pxi in pyarrow.lib.Array.cast()
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/compute.py in cast(arr, target_type, safe)
373 else:
374 options = CastOptions.unsafe(target_type)
--> 375 return call_function("cast", [arr], options)
376
377
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/_compute.pyx in pyarrow._compute.call_function()
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/_compute.pyx in pyarrow._compute.Function.call()
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/error.pxi in pyarrow.lib.pyarrow_internal_check_status()
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
ArrowNotImplementedError: Unsupported cast from float to halffloat using function cast_half_float
In [46]: pa.array(np.array([1, 2.0], dtype='float16')).cast(pa.float16())
Out[46]:
<pyarrow.lib.HalfFloatArray object at 0x7f8372f1f7c0>
[
15360,
16384
]
In [47]: pa.array(np.array([1, 2.0], dtype='float16')).cast(pa.float32())
---------------------------------------------------------------------------
ArrowNotImplementedError Traceback (most recent call last)
<ipython-input-47-ac152adc06f1> in <module>
----> 1 pa.array(np.array([1, 2.0], dtype='float16')).cast(pa.float32())
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/array.pxi in pyarrow.lib.Array.cast()
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/compute.py in cast(arr, target_type, safe)
373 else:
374 options = CastOptions.unsafe(target_type)
--> 375 return call_function("cast", [arr], options)
376
377
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/_compute.pyx in pyarrow._compute.call_function()
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/_compute.pyx in pyarrow._compute.Function.call()
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/error.pxi in pyarrow.lib.pyarrow_internal_check_status()
~/software/anaconda3/envs/polars_test/lib/python3.9/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
ArrowNotImplementedError: Unsupported cast from halffloat to float using function cast_float
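The values 15360 and 16384 printed for the HalfFloatArray above appear to be the raw 16-bit patterns of 1.0 and 2.0 rather than the decoded numbers. A minimal sketch to check this using only the Python standard library (the helper names are hypothetical and are not part of pyarrow or polars):

```python
import struct


def float_to_half_bits(value: float) -> int:
    """Return the IEEE 754 binary16 bit pattern of `value` as an unsigned int."""
    # "<e" packs a little-endian half-precision float, "<H" reads it back as uint16.
    return struct.unpack("<H", struct.pack("<e", value))[0]


def half_bits_to_float(bits: int) -> float:
    """Decode a binary16 bit pattern (given as an unsigned int) to a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


print(float_to_half_bits(1.0))  # 15360 (0x3C00)
print(float_to_half_bits(2.0))  # 16384 (0x4000)
print(half_bits_to_float(15360), half_bits_to_float(16384))  # 1.0 2.0
```

So the array itself holds the right data; it is only the display and the casts that are missing.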
For me it shows this with the package from pip:
In [3]: polars_df = pl.read_ipc("test.ftr", use_pyarrow=True)
thread '<unnamed>' panicked at 'internal error: entered unreachable code', /github/home/.cargo/git/checkouts/arrow2-945af624853845da/f7c3daf/src/datatypes/mod.rs:240:24
---------------------------------------------------------------------------
PanicException Traceback (most recent call last)
<ipython-input-3-469b61d6a200> in <module>
----> 1 polars_df = pl.read_ipc("test.ftr", use_pyarrow=True)
~/software/anaconda3/envs/ctxcore/lib/python3.8/site-packages/polars/io.py in read_ipc(file, columns, n_rows, use_pyarrow, memory_map, storage_options, row_count_name, row_count_offset, rechunk, **kwargs)
833
834 tbl = pa.feather.read_table(data, memory_map=memory_map, columns=columns)
--> 835 return DataFrame._from_arrow(tbl, rechunk=rechunk)
836
837 return DataFrame._read_ipc(
~/software/anaconda3/envs/ctxcore/lib/python3.8/site-packages/polars/internals/frame.py in _from_arrow(cls, data, columns, rechunk)
441 DataFrame
442 """
--> 443 return cls._from_pydf(arrow_to_pydf(data, columns=columns, rechunk=rechunk))
444
445 @classmethod
~/software/anaconda3/envs/ctxcore/lib/python3.8/site-packages/polars/internals/construction.py in arrow_to_pydf(data, columns, rechunk)
567 pydf = pli.DataFrame._from_pandas(tbl.to_pandas())._df
568 else:
--> 569 pydf = PyDataFrame.from_arrow_record_batches(tbl.to_batches())
570 else:
571 pydf = pli.DataFrame([])._df
PanicException: internal error: entered unreachable code
Hi @ritchie46, how difficult would FP16 integration in |
I don't want the extra code bloat and complexity of yet another type (that can be represented by f32/f64).
What language are you using?
Python.
What version of polars are you using?
The latest version as of this writing, 0.13.39.
What operating system are you using polars on?
MacOS ARM and Ubuntu ARM.
What language version are you using
Python 3.9.
Describe your bug.
Polars panics when reading a Feather file containing a 16-bit floating point column with polars.read_ipc().
What are the steps to reproduce the behavior?
What is the actual behavior?
Polars panics with the following exception:
What is the expected behavior?
It would be nice if polars automatically upcast float16 to float32, or let the user upcast explicitly, so that the column can be represented in polars.
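The requested upcast is lossless, since every float16 value is exactly representable as a float32. As a rough illustration of what such a widening involves at the bit level, here is a sketch in plain Python. It is not how polars or arrow2 implement anything, and the function name is hypothetical.

```python
import struct


def half_bits_to_float32_bits(h: int) -> int:
    """Widen a binary16 bit pattern to the binary32 pattern of the same value."""
    sign = (h >> 15) & 0x1
    exp = (h >> 10) & 0x1F   # 5-bit exponent, bias 15
    frac = h & 0x3FF         # 10-bit fraction

    if exp == 0x1F:
        # Infinity or NaN: all-ones exponent, fraction moved to the top bits.
        exp32, frac32 = 0xFF, frac << 13
    elif exp != 0:
        # Normal number: rebias the exponent from 15 to 127.
        exp32, frac32 = exp - 15 + 127, frac << 13
    elif frac == 0:
        # Signed zero.
        exp32, frac32 = 0, 0
    else:
        # Subnormal half: normalize it, because float32 can represent it
        # as a normal number.
        shift = 0
        while frac & 0x400 == 0:
            frac <<= 1
            shift += 1
        exp32 = 113 - shift          # (-14 - shift) + 127
        frac32 = (frac & 0x3FF) << 13

    return (sign << 31) | (exp32 << 23) | frac32


def upcast_column(half_bits):
    """Decode a sequence of binary16 bit patterns into Python floats via float32."""
    return [
        struct.unpack("<f", struct.pack("<I", half_bits_to_float32_bits(h)))[0]
        for h in half_bits
    ]


print(upcast_column([15360, 16384]))  # [1.0, 2.0]
```

The widening never rounds, which is why an automatic float16 to float32 upcast on read would not change any values.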