Environment

Delta-rs version: 0.10.0
Binding: python
Environment:
Bug

What happened: deltalake cannot write to tables that were created with generated columns. dt.metadata() also returns nothing for such a table.
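A minimal sketch of the two failing calls, run against the table created by the repro script below (the path and without_files flag are taken from that script). Both protocol() and metadata() are existing deltalake APIs; my assumption, based on the Delta protocol spec, is that generated columns raise the table's minWriterVersion to 4, which the Python writer then refuses:

import deltalake as dl

dt = dl.DeltaTable("./spark-warehouse/my_table", without_files=True)

# Inspect the reader/writer versions recorded in the log,
# e.g. ProtocolVersions(min_reader_version=1, min_writer_version=4)
print(dt.protocol())

# The call reported as returning nothing for this table.
print(dt.metadata())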
What you expected to happen: The append should succeed against the pre-created table.
How to reproduce it:
The following script reproduces the issue.
from datetime import datetime

from pyspark.sql import SparkSession
import delta
from delta import configure_spark_with_delta_pip
import deltalake as dl
import polars as pl

builder = (
    SparkSession.builder.appName("delta")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config(
        "spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    )
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

# Create a partitioned table whose partition columns are generated from `timestamp`.
spark_dt = (
    delta.DeltaTable.createOrReplace(spark)
    .tableName("my_table")
    .location("./my_table")
    .addColumn("timestamp", "timestamp")
    .addColumn("__year__", "INT", generatedAlwaysAs="YEAR(timestamp)")
    .addColumn("__month__", "INT", generatedAlwaysAs="MONTH(timestamp)")
    .addColumn("__day__", "INT", generatedAlwaysAs="DAY(timestamp)")
    .partitionedBy("__year__", "__month__")
)
spark_dt.execute()
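To confirm what Spark recorded, the protocol action can be read straight out of the first commit in the transaction log. This is a sketch based on the Delta log layout; the path assumes the table landed under ./spark-warehouse/my_table as in the next step, and the expected output is an assumption from the protocol spec, not an observed result:

import json
from pathlib import Path

# Each commit is a JSON-lines file; the first commit carries the protocol action.
log_file = Path("./spark-warehouse/my_table/_delta_log/00000000000000000000.json")
for line in log_file.read_text().splitlines():
    action = json.loads(line)
    if "protocol" in action:
        # Expected for generated columns, per the protocol spec:
        # {"minReaderVersion": 1, "minWriterVersion": 4}
        print(action["protocol"])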
# Now try writing a dataframe to the table
dt = dl.DeltaTable(
    "./spark-warehouse/my_table",
    without_files=True,
)

df = pl.DataFrame(
    [
        {
            "timestamp": datetime.fromisoformat("2021-01-01T00:00:00"),
            "__year__": 2021,
            "__month__": 1,
            "__day__": 1,
        }
    ]
).with_columns(
    pl.col("__year__").cast(pl.Int32),
    pl.col("__month__").cast(pl.Int32),
    pl.col("__day__").cast(pl.Int32),
)
print(df.dtypes)

table = df.to_arrow()
dl.write_deltalake(
    "./spark-warehouse/my_table",
    table,
    mode="append",
)
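For completeness, a hedged guard around the failing append: compare the recorded writer version against a supported ceiling before writing. The ceiling of 2 is my assumption about the 0.10.0 Python writer, not something taken from an error message, and the block reuses `dt` and `table` from the script above:

MAX_SUPPORTED_WRITER_VERSION = 2  # assumption: ceiling of the deltalake 0.10.0 writer

protocol = dt.protocol()
if protocol.min_writer_version > MAX_SUPPORTED_WRITER_VERSION:
    # A table created with generated columns is expected to land here.
    print(f"append would be rejected: table requires writer version {protocol.min_writer_version}")
else:
    dl.write_deltalake("./spark-warehouse/my_table", table, mode="append")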
More details: It would also be useful to have a rough mapping of Delta protocol versions to the features they gate.
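On that last point, a rough sketch of such a mapping, transcribed from my reading of the Delta transaction protocol spec (PROTOCOL.md in delta-io/delta) rather than from delta-rs itself, so treat it as approximate:

# Minimum writer version per feature, per the Delta protocol spec (unverified here).
MIN_WRITER_VERSION = {
    "appendOnly / column invariants": 2,
    "CHECK constraints": 3,
    "generated columns / change data feed": 4,
    "column mapping": 5,
    "identity columns": 6,
    "table features": 7,
}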