Unable to merge column names starting from numbers #2230

Closed
t1g0rz opened this issue Feb 28, 2024 · 2 comments · Fixed by #2271
Labels
binding/python Issues for the Python package bug Something isn't working

Comments

@t1g0rz
Contributor

t1g0rz commented Feb 28, 2024

Environment

Delta-rs version:
deltalake==0.15.3
python 3.11

Binding:
python

Environment:

  • Cloud provider: Local
  • OS: MacOS
  • Other:

Bug

What happened:
Delta tables whose column names start with a digit are created without problems, but merging data into them fails:

DeltaError: Generic DeltaTable error: Schema error: No field named s. Valid fields are s.time, s."1inch", __delta_rs_source, t.time, t."1inch", t.__delta_rs_path, __delta_rs_target.

What you expected to happen:
Just merge the data into the table, or tell me that it's impossible. Merging through Spark works just fine.

How to reproduce it:

from deltalake import DeltaTable
import pandas as pd
import pyarrow as pa

DeltaTable.create(
    table_uri="/tmp/test_3",
    schema=pa.schema([
        pa.field("time", pa.timestamp("us"), nullable=False),
        pa.field("1inch", pa.float64(), nullable=False),
    ]),
    mode="overwrite"
)

dl3 = DeltaTable("/tmp/test_3")

print(dl3.to_pandas().columns)   # Index(['time', '1inch'], dtype='object')

dl3.merge(
    pd.DataFrame({
        "time": pd.to_datetime(pd.Series(["2021-01-01 00:00:00"])),
        "1inch": [1.0]
    }),
    predicate="s.time = t.time",
    source_alias='s',
    target_alias='t'
).when_not_matched_insert_all().execute()

More details:

@t1g0rz t1g0rz added the bug Something isn't working label Feb 28, 2024
@echai58

echai58 commented Mar 1, 2024

Looks to be the same underlying problem as #2167; you can work around it by passing the insert columns explicitly, quoted in backticks:

.when_not_matched_insert({
    "time": "s.`time`",
    "1inch": "s.`1inch`"
})
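For tables with many columns, the mapping above can be generated rather than typed out. The following is a minimal sketch, not part of the library: `quoted_insert_mapping` and `merge_insert_new_rows` are hypothetical helper names, and the `table` argument is assumed to be the `DeltaTable` from the report, with `source_df` the pandas DataFrame being merged.

```python
from typing import Dict, Iterable


def quoted_insert_mapping(columns: Iterable[str], source_alias: str = "s") -> Dict[str, str]:
    """Map each target column to a backtick-quoted source expression.

    Hypothetical helper: it only builds the dict used in the workaround.
    """
    mapping = {}
    for name in columns:
        if "`" in name:
            # Escaping an embedded backtick is not covered in this thread, so refuse.
            raise ValueError(f"column name contains a backtick: {name!r}")
        mapping[name] = f"{source_alias}.`{name}`"
    return mapping


def merge_insert_new_rows(table, source_df, predicate="s.time = t.time"):
    """Run the merge from the report with explicit, quoted insert expressions.

    Assumes `table` is a deltalake.DeltaTable and `source_df` a pandas DataFrame.
    """
    return (
        table.merge(
            source_df,
            predicate=predicate,
            source_alias="s",
            target_alias="t",
        )
        # Replaces when_not_matched_insert_all(), which fails on "1inch".
        .when_not_matched_insert(quoted_insert_mapping(source_df.columns, "s"))
        .execute()
    )
```

With the reproduction above, `merge_insert_new_rows(dl3, df)` would stand in for the failing `dl3.merge(...).when_not_matched_insert_all().execute()` chain, where `df` is the DataFrame from the report.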

@ion-elgreco
Collaborator

I think I can put in a fix to always wrap column names in backticks (``)

@rtyler rtyler added the binding/python Issues for the Python package label Mar 7, 2024
ion-elgreco added a commit that referenced this issue Mar 9, 2024
…ctions (#2271)

# Description
- Always encapsulates column names in backticks in the insert_all and
update_all calls.
- Added note that users need to add backticks for special column names
- Removed bigint cast, this was temporarily needed while we were still
relying on a physical plan

# Related Issue(s)
- closes #2230
- closes #2167
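The commit description notes that users still need to add backticks themselves for special column names in expressions they write by hand. As a rough sketch of what that could look like for a merge predicate, the hypothetical helpers below (`quote_column` and `equality_predicate` are illustrative names, not library functions) build a join condition with every column quoted. Whether a given deltalake version accepts backticks in the `predicate` string is an assumption based on that note and on the workaround earlier in the thread.

```python
from typing import Sequence


def quote_column(alias: str, name: str) -> str:
    """Return alias.`name`, the quoting style used in the workaround."""
    if "`" in name:
        # Embedded backticks are out of scope for this sketch.
        raise ValueError(f"column name contains a backtick: {name!r}")
    return f"{alias}.`{name}`"


def equality_predicate(
    keys: Sequence[str], source_alias: str = "s", target_alias: str = "t"
) -> str:
    """Build "s.`k` = t.`k` AND ..." over the given key columns."""
    if not keys:
        raise ValueError("at least one key column is required")
    return " AND ".join(
        f"{quote_column(source_alias, key)} = {quote_column(target_alias, key)}"
        for key in keys
    )


# Example: a predicate that also matches on the digit-leading column.
predicate = equality_predicate(["time", "1inch"])
```

The resulting string would be passed as `predicate=` to `merge(...)` in place of the hand-written `"s.time = t.time"`.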