Un-tracked columns give a JSON error when the pyarrow schema has a field with nullable=False and create_checkpoint is triggered #2675

Closed
sherlockbeard opened this issue Jul 16, 2024 · 0 comments · Fixed by #2680
Labels
bug Something isn't working

Comments

@sherlockbeard
Contributor

Environment

Delta-rs version:
0.18.2

Binding:
python

Environment:

  • Cloud provider:
  • OS: macOS (Apple M1)
  • Other:

Bug

What happened:
Un-tracked columns give a JSON error when the pyarrow schema has a field with nullable=False and create_checkpoint is triggered.

What you expected to happen:
The create_checkpoint call succeeds.

How to reproduce it:

import deltalake
import pyarrow as pa
from importlib.metadata import version

print(version("deltalake"))

pylist = [{'year': 2023, 'n_party': 0}, {'year': 2024, 'n_party': 1}]

# Both fields are declared non-nullable.
my_schema = pa.schema([
    pa.field('year', pa.int64(), nullable=False),
    pa.field('n_party', pa.int64(), nullable=False),
])

data = pa.Table.from_pylist(pylist, schema=my_schema)
print(data)

# Collect statistics for the first column only, so n_party is un-tracked.
deltalake.write_deltalake("temp5", data, configuration={"delta.dataSkippingNumIndexedCols": "1"})

print(deltalake.DeltaTable("temp5").to_pyarrow_table())

# This call raises the JSON error.
deltalake.DeltaTable("temp5").create_checkpoint()

print(deltalake.DeltaTable("temp5").to_pyarrow_table())

Error:

    self._table.create_checkpoint()
Exception: Json error: whilst decoding field 'add': whilst decoding field 'stats_parsed': whilst decoding field 'minValues': Encountered unmasked nulls in non-nullable StructArray child: Field { name: "n_party", data_type: Int64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }
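
To make the failure easier to see, here is a minimal stdlib-only sketch of what the message seems to describe. It assumes that with delta.dataSkippingNumIndexedCols set to "1" the add action carries statistics for year only, so minValues has no entry for n_party while the table schema marks n_party as non-nullable. The stats string and the helper missing_non_nullable are hand-written for illustration and are not taken from delta-rs.

```python
import json

# Hypothetical stats for one add action: only the first column is tracked.
stats_json = (
    '{"numRecords": 2, "minValues": {"year": 2023}, '
    '"maxValues": {"year": 2024}, "nullCount": {"year": 0}}'
)

# (name, nullable) pairs mirroring the schema in the reproduction script.
table_fields = [("year", False), ("n_party", False)]


def missing_non_nullable(stats, fields, key):
    """List non-nullable fields that have no value under stats[key]."""
    present = stats.get(key, {})
    return [name for name, nullable in fields if not nullable and name not in present]


stats = json.loads(stats_json)
problem = missing_non_nullable(stats, table_fields, "minValues")
print(problem)  # ['n_party']
```

A missing value decodes as null, and a null in a child declared non-nullable is what the error reports for n_party.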

More details:
See the Slack thread: https://delta-users.slack.com/archives/C013LCAEB98/p1721098853604139

@sherlockbeard sherlockbeard added the bug Something isn't working label Jul 16, 2024
@sherlockbeard sherlockbeard changed the title un-tracked columns are giving Json error when pyarrow schema have feild with nullable=False and create_checkpoint is trigged un-tracked columns are giving json error when pyarrow schema have feild with nullable=False and create_checkpoint is trigged Jul 16, 2024
rtyler pushed a commit that referenced this issue Jul 18, 2024
when creating min, max schema make sure the nullable is true
closes #2675
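
The idea in the commit message can be sketched as follows. This is not the actual patch: Field, stats_schema and can_decode are hypothetical names in plain Python, used only to show why marking every field of the min/max schema as nullable lets un-tracked columns decode as null.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Field:
    name: str
    nullable: bool


def stats_schema(table_schema):
    """Copy the table schema for minValues/maxValues with every field nullable."""
    return [replace(f, nullable=True) for f in table_schema]


def can_decode(values, schema):
    """A missing value is acceptable only for a nullable field."""
    return all(f.nullable or f.name in values for f in schema)


table_schema = [Field("year", False), Field("n_party", False)]
min_max_schema = stats_schema(table_schema)

min_values = {"year": 2023}  # n_party is un-tracked, so it has no minimum

print(can_decode(min_values, table_schema))    # False
print(can_decode(min_values, min_max_schema))  # True
```

The table schema itself stays unchanged; only the schema used for the statistics structs is relaxed.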

1 participant