feat(python): handle PyCapsule interface objects in write_deltalake #2534
ACTION NEEDED: delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
77d03f0 to 1c805e1
@kylebarron can you fix the linting issues? Then we can merge it. I'm also wondering how we should type-hint this now, since an input may or may not have the __arrow_c_stream__ attribute.
I'm pretty packed but I can try to find some time soon.
You can use these type hints: https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html#protocol-typehints
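For reference, here is a minimal sketch of what protocol-style type hints for the PyCapsule interface can look like. It is adapted from the idea on the linked page rather than copied from this PR: the `runtime_checkable` decorators, the `ArrowExportable` alias, and the `describe` helper are assumptions added for illustration.

```python
from typing import Optional, Protocol, Tuple, Union, runtime_checkable


@runtime_checkable
class ArrowArrayExportable(Protocol):
    """Objects that export a single Arrow array or record batch."""

    def __arrow_c_array__(
        self, requested_schema: Optional[object] = None
    ) -> Tuple[object, object]: ...


@runtime_checkable
class ArrowStreamExportable(Protocol):
    """Objects that export a stream of Arrow record batches."""

    def __arrow_c_stream__(
        self, requested_schema: Optional[object] = None
    ) -> object: ...


# Hypothetical alias: an input may implement either protocol.
ArrowExportable = Union[ArrowArrayExportable, ArrowStreamExportable]


def describe(data: ArrowExportable) -> str:
    """Hypothetical helper: report which protocol an input implements."""
    if isinstance(data, ArrowStreamExportable):
        return "stream"
    if isinstance(data, ArrowArrayExportable):
        return "array"
    raise TypeError(f"{type(data).__name__} does not implement the PyCapsule interface")
```

Because the protocols are structural, an input does not need to inherit from anything; it only needs the matching dunder method, which is what makes a `Union` of the two a workable hint for an argument that may or may not have `__arrow_c_stream__`.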
@kylebarron ah nice, do you mind adding those type hints when you find the time?
I believe I fixed the lint and the type hinting. In the future, a more involved PR could remove pyarrow as a required dependency entirely by passing the C stream PyCapsule directly to Rust (arrow-rs has an example of how to do that).
@kylebarron that would be nice. We could potentially make pyarrow opt-in, since it would then only be needed for reading.
387b819 to dfe6e4a
Thanks for adding this!
Description
Adds support for the Arrow PyCapsule interface.
Since pyarrow is already a required dependency, this takes the minimal route of converting pycapsule interface objects into pyarrow objects. This requires pyarrow 15 or higher for the stream conversion (apache/arrow#39217).
This doesn't modify the existing hard-coded support for pyarrow and pandas.
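The conversion described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: `capsule_to_pyarrow` is a hypothetical helper name, and checking for the array method before the stream method is a choice made here. It relies on `pyarrow.record_batch` accepting objects that implement `__arrow_c_array__`, and on `pyarrow.RecordBatchReader.from_stream`, which is the pyarrow 15 addition the description refers to.

```python
def capsule_to_pyarrow(data):
    """Hypothetical helper: turn a PyCapsule-interface object into a pyarrow object."""
    has_array = hasattr(data, "__arrow_c_array__")
    has_stream = hasattr(data, "__arrow_c_stream__")
    if not (has_array or has_stream):
        raise TypeError(
            f"{type(data).__name__} does not implement the Arrow PyCapsule interface"
        )

    # Imported lazily so the attribute check above works without pyarrow installed.
    import pyarrow as pa

    if has_array:
        # A single array/record batch export becomes a pyarrow RecordBatch.
        return pa.record_batch(data)

    # Stream import needs RecordBatchReader.from_stream (pyarrow 15 or higher).
    if not hasattr(pa.RecordBatchReader, "from_stream"):
        raise ValueError(
            "pyarrow 15 or later is required to read a stream via the PyCapsule interface"
        )
    return pa.RecordBatchReader.from_stream(data)
```

The result is an ordinary pyarrow RecordBatch or RecordBatchReader, so it can flow through the same path as the existing pyarrow inputs.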