Row order no longer preserved after merge operation #2165

Closed
stinodego opened this issue Feb 5, 2024 · 2 comments
Labels
bug Something isn't working

Comments

stinodego commented Feb 5, 2024

Environment

Delta-rs version: 0.15.2

Binding: Python

Environment:

  • Cloud provider: n/a
  • OS: WSL Ubuntu 22
  • Other: Python 3.12

Bug

What happened:

Row order is inconsistent after a merge operation.

What you expected to happen:

Keep existing row order.

How to reproduce it:

Run the script below a few times (it will fail randomly).

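import polars as pl
from polars.testing import assert_frame_equal

path = "my_table"
df = pl.DataFrame({"a": [1, 2, 3]})

# Create (or replace) the Delta table with three rows.
df.write_delta(path, mode="overwrite")

# Merge the same frame back into the table, matching rows on column "a".
merger = df.write_delta(
    path,
    mode="merge",
    delta_merge_options={
        "predicate": "s.a = t.a",
        "source_alias": "s",
        "target_alias": "t",
    },
)
# Delete matched target rows where a > 2, leaving rows 1 and 2.
merger.when_matched_delete(predicate="t.a > 2").execute()

table = pl.read_delta(path)

# Fails intermittently: the rows come back as [1, 2] or [2, 1].
assert_frame_equal(df.filter(pl.col("a") <= 2), table)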

More details:
This is a regression introduced in 0.15.2.

Example of the test failing in our CI:
https://github.com/pola-rs/polars/actions/runs/7784828865/job/21226108137

Blajda (Collaborator) commented Feb 6, 2024

@stinodego This isn't a bug since there is no requirement that merge will preserve row order.
You are likely seeing this occur now since Merge can now use multiple threads where it was limited to just one before.
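
Since merge makes no ordering guarantee, a test that only cares about which rows are present can compare the frames without regard to row order. The following is a minimal sketch, not the fix adopted in the Polars test suite: `table` is a hand-built stand-in for the result of `pl.read_delta(path)`, with its rows deliberately reversed, and `expected` is a name introduced here for illustration.

import polars as pl
from polars.testing import assert_frame_equal

# Rows the test expects to remain after the merge.
expected = pl.DataFrame({"a": [1, 2]})

# Assumption: stand-in for pl.read_delta(path); same rows, different order.
table = pl.DataFrame({"a": [2, 1]})

# Option 1: sort both sides on a key column before comparing.
assert_frame_equal(expected.sort("a"), table.sort("a"))

# Option 2: let the assertion itself ignore row order.
assert_frame_equal(expected, table, check_row_order=False)

Sorting is the more explicit choice when the table has a natural key; `check_row_order=False` avoids having to pick one.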

Blajda closed this as not planned Feb 6, 2024

stinodego (Author) commented

Thanks for clarifying 👍
