fix: don't re-encode paths #1613

Merged
merged 4 commits into from
Sep 11, 2023

Conversation

wjones127
Collaborator

@wjones127 wjones127 commented Sep 3, 2023

Description

In the delta log, paths are percent encoded. We decode them here:

pub fn path_decoded(self) -> Result<Self, ProtocolError> {
    decode_path(&self.path).map(|path| Self { path, ..self })
}

Which is good. But then we've been re-encoding them with Path::from. This PR switches to Path::parse where possible. Instead of propagating parse errors, we just fall back to Path::from for now. Read more here: https://docs.rs/object_store/0.7.0/object_store/path/struct.Path.html#encode
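To illustrate the difference, here is a minimal sketch in Python using only the standard library. It is not the object_store API: `encode_path`, `parse_path`, and `parse_or_encode` are hypothetical stand-ins for an encoding constructor, a strict parser, and the parse-then-fall-back behavior described above, and the log entry is made up.

```python
from urllib.parse import quote, unquote


def encode_path(raw: str) -> str:
    """Toy stand-in for an encoding constructor: percent-encodes the path."""
    return quote(raw, safe="/")


def parse_path(raw: str) -> str:
    """Toy stand-in for a strict parser: keeps the string as is.

    Rejecting empty segments is an assumption made for this sketch only.
    """
    if any(segment == "" for segment in raw.split("/")):
        raise ValueError(f"empty segment in {raw!r}")
    return raw


def parse_or_encode(raw: str) -> str:
    """Prefer parsing; fall back to encoding instead of propagating the error."""
    try:
        return parse_path(raw)
    except ValueError:
        return encode_path(raw)


logged = "part%20one/0-abc.parquet"   # hypothetical percent-encoded log entry
decoded = unquote(logged)             # decoded once: "part one/0-abc.parquet"

reencoded = encode_path(decoded)      # encoded again: "part%20one/0-abc.parquet"
parsed = parse_or_encode(decoded)     # left alone: "part one/0-abc.parquet"
fallback = parse_or_encode("a b//c")  # toy parser rejects it, so it gets encoded
```

In this sketch the re-encoded value no longer matches the on-disk name `part one/0-abc.parquet`, while the parsed value does.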

Related Issue(s)

Documentation

@github-actions github-actions bot added labels binding/python (Issues for the Python package), binding/rust (Issues for the Rust crate), rust Sep 3, 2023
@wjones127 wjones127 marked this pull request as ready for review September 3, 2023 21:33
@wjones127 wjones127 enabled auto-merge (squash) September 5, 2023 04:38
@wjones127 wjones127 added this to the Python v0.10.2 milestone Sep 5, 2023
rtyler previously approved these changes Sep 5, 2023
@mahinshaw

This also resolves the issue in #1228 based on my testing.

@wjones127 wjones127 merged commit 41efefd into delta-io:main Sep 11, 2023
@ABChristian

Testing with 0.10.2, I could still reproduce issue #1391.
Writing to a path containing spaces (due to a parent folder, i.e. "./deltalake/data") works as expected, but reading from the same location yields an error:

FileNotFoundError: Object at location C:\Users\Me\FirstPart%20-%20SecondPart\PyDev\deltalake\data\0-f8ffed19-9033-4ba6-8cbe-043295af18d1-0.parquet not found
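
A small, self-contained sketch of why a location like the one above cannot be found. It uses only the Python standard library, not deltalake; the file name `part-0.parquet` and the temporary directory are assumptions made for illustration. The folder on disk has literal spaces, so a lookup that uses the percent-encoded folder name misses it.

```python
import tempfile
from pathlib import Path
from urllib.parse import quote, unquote

with tempfile.TemporaryDirectory() as tmp:
    # A parent folder with spaces, similar in shape to the one in the report.
    table_dir = Path(tmp) / "FirstPart - SecondPart" / "data"
    table_dir.mkdir(parents=True)
    data_file = table_dir / "part-0.parquet"  # hypothetical file name
    data_file.write_bytes(b"")

    # The same location with the folder name percent-encoded,
    # as it appears in the error message.
    encoded_folder = quote("FirstPart - SecondPart")
    encoded_location = Path(tmp) / encoded_folder / "data" / "part-0.parquet"

    real_exists = data_file.exists()            # the file with literal spaces
    encoded_exists = encoded_location.exists()  # the "%20" variant
    decoded_name = unquote(encoded_folder)      # decoding recovers the real name
```

Decoding the folder name from the error message once gives back the real directory name, which suggests the reader is looking up an encoded path rather than the decoded one.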

@djouallah

yep, same problem here

@wjones127
Collaborator Author

@ABChristian Thanks for testing. I'll take a look at that soon.

@Phil-T1

Phil-T1 commented Sep 25, 2023

Using 0.10.2, I'm having this same issue where Windows paths cannot be decoded due to their spaceyness.

Writing only succeeds for paths that have no space characters. :(

@JvdH-NL

JvdH-NL commented Apr 17, 2024

Using 0.16.4, I am having the same issue. Running Python code from a path that contains spaces gives me 'object at location <FULL PATH, with spaces replaced by %20, .... parquet file> not found ....'
I have some sample data; the code below fails on the last statement. The path is on a corporate OneDrive and contains spaces.

from deltalake import DeltaTable

delta_table_path = 'deltaTable/'
dt = DeltaTable(delta_table_path)

# Read data from the Delta table
dt.to_pandas()

@JvdH-NL

JvdH-NL commented Apr 17, 2024

The Quickstart on the homepage fails with an error when reading (the writing part works). This is the same issue as in my previous post.

7 participants