Rust writer not encoding correct URL for partitions in delta table #2634

Closed
gprashmi opened this issue Jun 28, 2024 · 10 comments · Fixed by #2705
Labels
binding/rust Issues for the Rust crate bug Something isn't working

Comments

@gprashmi

Environment

Delta-rs version: 0.17.4


Bug

We write data to a Delta table using delta-rs with the PyArrow engine, with DayHour as the partition column. However, when we run optimize.compact() on the table, the partition URLs are not properly encoded: as shown in the image below, it creates new partition paths that contain spaces (the .zstd.parquet files).

image

Can you please let me know how we can run optimize.compact() without creating partitions with spaces?
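For readers following along, here is a minimal sketch of what a percent-encoded partition directory name looks like. It uses only Python's standard library and is not delta-rs's actual implementation; the helper name `encode_partition_segment` and the sample DayHour value are hypothetical, chosen only because the report mentions a value containing a space.

```python
from urllib.parse import quote, unquote


def encode_partition_segment(column: str, value: str) -> str:
    """Build a Hive-style `column=value` directory name with the value percent-encoded."""
    # safe="" escapes every reserved character, including "/" and ":".
    return f"{column}={quote(value, safe='')}"


# Hypothetical DayHour value containing a space (an assumption, not from the report).
segment = encode_partition_segment("DayHour", "2024-06-28 10")
print(segment)  # DayHour=2024-06-28%2010

# Decoding the value part recovers the original partition value.
print(unquote(segment.split("=", 1)[1]))  # 2024-06-28 10
```

A writer that skips this step would produce a directory such as `DayHour=2024-06-28 10`, which is the kind of path described above.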

@gprashmi gprashmi added the bug Something isn't working label Jun 28, 2024
@gprashmi
Author

@g12-al

@g12-al

g12-al commented Jul 1, 2024

Confirmed that this also seems to happen in 0.18.0.

This breaks compatibility for our Trino connector to enable visualization on dashboards. Ideally, Trino wouldn't care about spaces (it has a lot of other issues like not being compatible with timezone-aware timestamps).
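One way to check whether a given table is affected, sketched under assumptions: scan the JSON commit files in `_delta_log` and list any `add` action whose `path` contains a literal space. The function name `find_unencoded_paths` is hypothetical, the sketch only handles a table on the local filesystem, and it ignores Parquet checkpoint files.

```python
import json
from pathlib import Path


def find_unencoded_paths(table_root: str) -> list[str]:
    """Return `add` action paths in the Delta log that contain a literal space."""
    suspicious = []
    log_dir = Path(table_root) / "_delta_log"
    # Commit files are newline-delimited JSON, one action per line.
    for commit in sorted(log_dir.glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            add = action.get("add")
            # Only `add` actions carry the data file paths we care about here.
            if add and " " in add.get("path", ""):
                suspicious.append(add["path"])
    return suspicious
```

Calling `find_unencoded_paths("/path/to/table")` before and after a compaction run would show whether the new files were written with unencoded partition paths.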

@ion-elgreco ion-elgreco changed the title Optimize compact not encoding correct URL for partitions in delta table Rust writer not encoding correct URL for partitions in delta table Jul 1, 2024
@ion-elgreco ion-elgreco added the binding/rust Issues for the Rust crate label Jul 1, 2024
@gprashmi
Author

gprashmi commented Jul 8, 2024

@ion-elgreco the issue we see here is with the Pyarrow engine

@gprashmi gprashmi changed the title Rust writer not encoding correct URL for partitions in delta table Pyarrow writer not encoding correct URL for partitions in delta table Jul 8, 2024
@gprashmi
Author

gprashmi commented Jul 15, 2024

@ion-elgreco Any update on this issue?

@ion-elgreco
Collaborator

@gprashmi sorry, I don't have time to look into it, unfortunately.

@gprashmi
Author

@ion-elgreco do you have any other suggestions on how to get correct URL encoding when optimizing the table?

@ion-elgreco ion-elgreco changed the title Pyarrow writer not encoding correct URL for partitions in delta table Rust writer not encoding correct URL for partitions in delta table Jul 23, 2024
@ion-elgreco
Collaborator

I'll take a look at this

@ion-elgreco
Collaborator

@gprashmi @g12-al a fix is on its way in #2705 and will be released in 0.18.3.

@g12-al

g12-al commented Oct 24, 2024

I'm late in responding, but thank you so much for addressing this, @ion-elgreco!

@gprashmi
Author

Thank you very much @ion-elgreco for addressing the issue.

3 participants