[Python][FS][Azure] C++ Exceptions leaking from PyArrow #44269

Closed
Tjev opened this issue Sep 30, 2024 · 1 comment

Tjev commented Sep 30, 2024

Describe the bug, including details regarding any error messages, version, and platform.

There seem to be a few C++ exceptions leaking from PyArrow's implementation of pyarrow.fs.AzureFileSystem.

One example is Azure::Core::Http::TransportException: Timeout waiting for socket to read, which comes from the Azure SDK for C++.

Another example: if the user passes the AzureFileSystem constructor an account_key that is not a valid Base64-encoded string, the following exception is raised and kills the process:
libc++abi: terminating due to uncaught exception of type std::runtime_error: Unexpected character in Base64 encoded string

Would it be possible to catch C++ exceptions such as TransportException and raise them as Python exceptions of some kind?
That would let users decide how to handle such failures instead of having the whole application crash.
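
To illustrate the kind of handling this would make possible, here is a minimal sketch. It assumes the failure reaches Python as an ordinary exception; the exact exception class is not something this report can specify, so the sketch catches `Exception` broadly. The helper name `create_dir_or_report` is hypothetical, and `fs` can be any object with a `create_dir` method.

```python
def create_dir_or_report(fs, path):
    """Call fs.create_dir(path); return None on success, else an error string.

    Assumption: the filesystem reports failures as Python exceptions.
    The concrete exception type is unknown here, hence the broad catch.
    """
    try:
        fs.create_dir(path)
    except Exception as exc:
        # The caller decides what to do next: retry, log, fall back, or abort.
        return f"{type(exc).__name__}: {exc}"
    return None
```

With the behaviour described in this issue, no such handler can run, because the process is terminated before control returns to Python.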

To reproduce

example 1:

>>> import pyarrow.fs
>>> fs = pyarrow.fs.AzureFileSystem(account_name="doesntexist", account_key="bl==")
>>> fs.create_dir("bla/bla")
libc++abi: terminating due to uncaught exception of type Azure::Core::Http::TransportException: Fail to get a new connection for: https://doesnt_exist.dfs.core.windows.net. Couldn't resolve host name

example 2:

>>> import pyarrow.fs
>>> fs = pyarrow.fs.AzureFileSystem(account_name="doesntexist", account_key="bla")
>>> fs.create_dir("bla/bla")
libc++abi: terminating due to uncaught exception of type std::runtime_error: Unexpected character in Base64 encoded string
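
A possible stopgap for example 2 on affected versions is to check the key in Python before it is handed to the filesystem. This is only a sketch: the helper name `is_valid_base64` is hypothetical, and the assumption is that Python's strict Base64 validation rejects at least the inputs the Azure SDK rejects, which is not guaranteed.

```python
import base64
import binascii


def is_valid_base64(value: str) -> bool:
    """Return True if `value` decodes as Base64 under Python's strict validation."""
    try:
        # validate=True rejects characters outside the Base64 alphabet;
        # bad padding raises binascii.Error as well.
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

A caller could raise a `ValueError` when `is_valid_base64(account_key)` is false, and only construct `pyarrow.fs.AzureFileSystem` otherwise. This does nothing for the network failure in example 1.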

Versions

Python version: 3.12.2
PyArrow version: 17.0.0

Component(s)

Python

@Tjev Tjev added the Type: bug label Sep 30, 2024
@Tjev Tjev changed the title from "[Python][FS][Azure] C++ TransportException leaking from pyarrow" to "[Python][FS][Azure] C++ Exceptions leaking from PyArrow" Sep 30, 2024
kou added a commit to kou/arrow that referenced this issue Oct 1, 2024
…ort check

`Azure::Storage::Files::DataLake::DataLakeDirectoryClient` may throw
`Azure::Core::Http::TransportException` and `std::runtime_error`
exceptions, but they aren't caught. Arrow C++ reports errors through
`arrow::Status`, not C++ exceptions, so we must catch all exceptions
from the Azure SDK for C++.
kou added a commit that referenced this issue Oct 2, 2024
…eck (#44274)

### Rationale for this change

`Azure::Storage::Files::DataLake::DataLakeDirectoryClient` may throw `Azure::Core::Http::TransportException` and `std::runtime_error` exceptions, but they aren't caught. Arrow C++ reports errors through `arrow::Status`, not C++ exceptions, so we must catch all exceptions from the Azure SDK for C++.

### What changes are included in this PR?

Add `catch` clauses for `Azure::Core::Http::TransportException` and `std::exception`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: #44269

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
@kou kou added this to the 18.0.0 milestone Oct 2, 2024

kou commented Oct 2, 2024

Issue resolved by pull request 44274
#44274

@kou kou closed this as completed Oct 2, 2024