get_add_actions fails with deltalake 0.13 #1835

Closed
aersam opened this issue Nov 10, 2023 · 1 comment · Fixed by #1836
Labels
binding/python (Issues for the Python package), bug (Something isn't working)

aersam commented Nov 10, 2023

Environment

Delta-rs version: 0.13.0

Binding: Python

Environment:

  • Cloud provider:
  • OS: Windows Server / Windows 10 (both tested)
  • Other:

Bug

What happened:

get_add_actions fails with both flatten=True and flatten=False:

from deltalake import DeltaTable
dt = DeltaTable("tbl")
dt.get_add_actions()

gives:

thread '<unnamed>' panicked at crates\deltalake-core\src\table\state_arrow.rs:175:53:
called Option::unwrap() on a None value
stack backtrace:
   0:     0x7ffc6523beca - BrotliDecoderVersion
   1:     0x7ffc65263dbb - BrotliDecoderVersion
   2:     0x7ffc65237231 - BrotliDecoderVersion
   3:     0x7ffc6523bc4a - BrotliDecoderVersion
   4:     0x7ffc6523e6ba - BrotliDecoderVersion
   5:     0x7ffc6523e328 - BrotliDecoderVersion
   6:     0x7ffc6523ed6e - BrotliDecoderVersion
   7:     0x7ffc6523ec1a - BrotliDecoderVersion
   8:     0x7ffc6523cb89 - BrotliDecoderVersion
   9:     0x7ffc6523e960 - BrotliDecoderVersion
  10:     0x7ffc653a5365 - BrotliDecoderVersion
  11:     0x7ffc653a5412 - BrotliDecoderVersion
  12:     0x7ffc62951a14 - PyInit__internal
  13:     0x7ffc62746857 - BrotliDecoderSetParameter
  14:     0x7ffc6271b8ce - BrotliDecoderSetParameter
  15:     0x7ffc627377c1 - BrotliDecoderSetParameter
  16:     0x7ffca3f8961e - PyUnicode_ToDecimalDigit
  17:     0x7ffca3ed494b - PyObject_Vectorcall
  18:     0x7ffca3ed5da4 - PyEval_EvalFrameDefault
  19:     0x7ffca3ef2f73 - PyMapping_Check
  20:     0x7ffca3ef27db - PyEval_EvalCode
  21:     0x7ffca3f62582 - Py_SourceAsString
  22:     0x7ffca3f624fe - Py_SourceAsString
  23:     0x7ffca40a3df4 - PyThread_tss_is_created
  24:     0x7ffca4006b99 - PyRun_SimpleFileObject
  25:     0x7ffca40078e0 - PyRun_AnyFileObject
  26:     0x7ffca4007387 - PyDict_DelItemString
  27:     0x7ffca4007243 - PyDict_DelItemString
  28:     0x7ffca3fc0d80 - Py_RunMain
  29:     0x7ffca3fc0a2d - Py_RunMain
  30:     0x7ffca3ec81b5 - Py_Main
  31:     0x7ff62cf51230 - <unknown>
  32:     0x7ffcca5684d4 - BaseThreadInitThunk
  33:     0x7ffccbf61791 - RtlUserThreadStart
Traceback (most recent call last):
  File "k:\Software\AZCopy\CSVGetter\test_ds.py", line 5, in <module>
    dt.get_add_actions()
  File "K:\Software\AZCopy\CSVGetter\.venv\Lib\site-packages\deltalake\table.py", line 859, in get_add_actions
    return self._table.get_add_actions(flatten)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: called Option::unwrap() on a None value

What you expected to happen:
Well, it should not panic

How to reproduce it:

Repo is here: tester.zip

More details:

aersam added the bug label Nov 10, 2023
aersam commented Nov 10, 2023

Probably a duplicate of #1579.
However, it's worse now, since setting flatten=True no longer helps.

aersam added a commit to bmsuisse/delta-rs that referenced this issue Nov 10, 2023
ion-elgreco added the binding/python label Nov 22, 2023
roeap pushed a commit that referenced this issue Nov 24, 2023
# Description
get_actions wrongly assumes that partition_columns from schema and
partitionValues from log must be the same. This is not true since
partition_columns are logical column names while partitionValues are
physical column names.

Tests pending

# Related Issue(s)

- closes #1835

# Documentation

https://github.com/delta-io/delta/blob/master/PROTOCOL.md#writer-requirements-for-column-mapping
"Track partition values and column level statistics with the physical
name of the column in the transaction log."

---------

Co-authored-by: Will Jones <[email protected]>
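
The mismatch described in the commit message above can be illustrated with a small, self-contained sketch. It does not use the delta-rs API: the name mapping, the `col-...` physical names, and both helper functions are hypothetical and only model the idea. With column mapping enabled, `partitionValues` in the log is keyed by physical names, so a lookup by logical name finds nothing, which is the Python analogue of the `Option::unwrap()` on `None` reported in the panic.

```python
# Hypothetical model of a table with column mapping enabled.
# Logical column name -> physical column name (illustrative values only).
LOGICAL_TO_PHYSICAL = {
    "date": "col-1a2b",
    "country": "col-3c4d",
}

# Partition columns as listed in the table metadata: logical names.
PARTITION_COLUMNS = ["date", "country"]

# partitionValues of one add action as stored in the log: physical names.
ADD_ACTION_PARTITION_VALUES = {
    "col-1a2b": "2023-11-10",
    "col-3c4d": "CH",
}


def lookup_by_logical_name(partition_values, partition_columns):
    """Naive lookup that assumes the log keys equal the logical names."""
    return {name: partition_values.get(name) for name in partition_columns}


def lookup_by_physical_name(partition_values, partition_columns, mapping):
    """Translate each logical name to its physical name before the lookup.

    Falls back to the logical name when no mapping entry exists, which
    covers tables without column mapping.
    """
    return {
        name: partition_values.get(mapping.get(name, name))
        for name in partition_columns
    }


naive = lookup_by_logical_name(ADD_ACTION_PARTITION_VALUES, PARTITION_COLUMNS)
mapped = lookup_by_physical_name(
    ADD_ACTION_PARTITION_VALUES, PARTITION_COLUMNS, LOGICAL_TO_PHYSICAL
)

print(naive)   # every value is None: the logical names are not keys in the log
print(mapped)  # values resolved through the physical names
```

This is a sketch of the failure mode under the stated assumptions, not the change made in #1836.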
ion-elgreco pushed a commit to ion-elgreco/delta-rs that referenced this issue Nov 25, 2023
…#1836)
