What happened: When writing to a Delta table, the field metadata is compared as part of the check that decides whether the table can be written. I am using Unity Catalog from Databricks, which stores field-level comments in the field metadata. When I append new data to an existing table that I wrote comments for, the write fails because my local table does not carry the comment information in its field metadata.
What you expected to happen: The data is appended if the fields and datatypes of the fields match. If schema mode is not overwrite, then the existing metadata is unchanged and the new data is added to the table.
How to reproduce it:
1. Write a Delta table with metadata.
2. Try to write new data with mode="append" from a compatible local table.
3. You will see ValueError: Schema of data does not match table schema
More details:
I believe the check occurs here, where if there are any differences at all in the schema there is an error. I would think if the field names/types match then the data could be written.
Sorry, it seems that I was mistaken. When trying to reproduce this, I found that the error did not come from this function. Would you be open to a PR that updates the schema validation error message? That would make this scenario easier to debug.
if table:  # already exists
    if sort_arrow_schema(schema) != sort_arrow_schema(
        table.schema().to_pyarrow(as_large_types=large_dtypes)
    ) and not (mode == "overwrite" and schema_mode == "overwrite"):
        table_schema = table.schema().to_pyarrow(as_large_types=large_dtypes)
        table_fields = set(zip(table_schema.names, table_schema.types))
        data_fields = set(zip(schema.names, schema.types))
        missing = table_fields - data_fields
        extra = data_fields - table_fields
        raise ValueError(
            "Schema of data does not match table schema\n"
            f"Missing: {missing}\n Extra: {extra}"
        )
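For a feel of what the proposed message would look like, here is a standalone sketch of the same set-difference idea. It is not the delta-rs code: the helper name `schema_mismatch_message` is hypothetical, types are plain strings so that it runs without pyarrow, and the differences are sorted only to keep the output stable.

```python
def schema_mismatch_message(table_fields, data_fields):
    """Build an error message listing the fields that differ.

    Both arguments are iterables of (name, type) pairs. Types are plain
    strings here to keep the sketch free of pyarrow.
    """
    table_set = set(table_fields)
    data_set = set(data_fields)
    missing = sorted(table_set - data_set)  # in the table, not in the data
    extra = sorted(data_set - table_set)    # in the data, not in the table
    return (
        "Schema of data does not match table schema\n"
        f"Missing: {missing}\n Extra: {extra}"
    )


# Hypothetical example: the "id" column has a different integer width.
table_fields = [("id", "int64"), ("name", "string")]
data_fields = [("id", "int32"), ("name", "string")]

message = schema_mismatch_message(table_fields, data_fields)
print(message)
```

A field whose type differs shows up on both sides, once under Missing with the table's type and once under Extra with the data's type, which makes a type mismatch easy to spot.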
Environment
Delta-rs version: 0.16.4
Binding: python
Environment:
Bug
Referenced code: delta-rs/python/deltalake/writer.py, lines 347 to 354 in aa8f4d5