I'm trying to read a delta table from Azure, but it throws an error:
import deltalake

path = "az://..."
storage_options = {
    "AZURE_STORAGE_TENANT_ID": "...",
    "AZURE_STORAGE_CLIENT_ID": "...",
    "AZURE_STORAGE_CLIENT_SECRET": "...",
    "AZURE_STORAGE_ACCOUNT_NAME": "...",
}
dt = deltalake.DeltaTable(path, version=None, storage_options=storage_options)
# DeltaError: Delta protocol violation: Invalid action field: type for schemaString in metaData action should be string
Full stacktrace:
---> 17 dt = deltalake.DeltaTable(full_path, version=None, storage_options=credentials)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-a6639292-dd1e-4998-a05e-f09df9265cb7/lib/python3.10/site-packages/deltalake/table.py:238, in DeltaTable.__init__(self, table_uri, version, storage_options, without_files)
225 """
226 Create the Delta Table from a path with an optional version.
227 Multiple StorageBackends are currently supported: AWS S3, Azure Data Lake Storage Gen2, Google Cloud Storage (GCS) and local URI.
(...)
235 DeltaTable will be loaded with a significant memory reduction.
236 """
237 self._storage_options = storage_options
--> 238 self._table = RawDeltaTable(
239 str(table_uri),
240 version=version,
241 storage_options=storage_options,
242 without_files=without_files,
243 )
244 self._metadata = Metadata(self._table)
DeltaError: Delta protocol violation: Invalid action field: type for schemaString in metaData action should be string
I was able to get details on the table I am trying to read via Databricks delta package:
After some digging, it seems that our Databricks job first runs a DLT REFRESH operation, which creates version 0 of the table without any metadata. It then runs a DLT SETUP operation, which creates version 1 and adds the metadata, followed by a WRITE operation. Because version 0 has no metadata, it is invalid, and that appears to be what crashes deltalake.
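In case it helps anyone check the same thing on their own table, here is a rough sketch of how the commit files could be inspected. It assumes the table's `_delta_log` directory has been copied to a local folder, and it relies only on the log layout (one JSON action per line, files named by zero-padded version number). The function name `summarize_commits` and the output keys are made up for this example and are not part of deltalake.

```python
import json
from pathlib import Path


def summarize_commits(log_dir):
    """Summarize each JSON commit file in a local copy of a _delta_log directory.

    Hypothetical helper for illustration only; not part of the deltalake package.
    """
    summaries = []
    for commit in sorted(Path(log_dir).glob("*.json")):
        if not commit.stem.isdigit():
            continue  # skip anything that is not a <version>.json commit file
        actions = []
        # None means the commit has no metaData action at all;
        # "NoneType" means metaData exists but schemaString is null or missing.
        schema_string_type = None
        with commit.open() as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                action = json.loads(line)
                actions.extend(action.keys())
                if "metaData" in action:
                    value = action["metaData"].get("schemaString")
                    schema_string_type = type(value).__name__
        summaries.append(
            {
                "version": int(commit.stem),
                "actions": actions,
                "schema_string_type": schema_string_type,
            }
        )
    return summaries
```

Calling `summarize_commits("/tmp/_delta_log")` (the path is a placeholder) returns one entry per version, so it is easy to see which commit first carries a `metaData` action and whether its `schemaString` is actually a string.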
What you expected to happen:
I would expect to be able to read the table normally.
How to reproduce it:
I don't know how to help you reproduce this.
Environment
Delta-rs version: 0.10.0
Binding: Python
Environment: