[Python] DataFrame Interchange Protocol for pyarrow Table #33346

Closed
Tracked by #33982
asfimport opened this issue Oct 25, 2022 · 1 comment

@asfimport
Collaborator

For the documentation and spec: https://data-apis.org/dataframe-protocol/latest/index.html
Blog post about it: https://data-apis.org/blog/dataframe_protocol_rfc/
 
Several DataFrame libraries now support this interchange protocol (pandas, modin, vaex, cudf).

Reporter: Alenka Frim / @AlenkaF
Assignee: Alenka Frim / @AlenkaF

PRs and other links:

Note: This issue was originally created as ARROW-18152. Please see the migration documentation for further details.

@asfimport asfimport added this to the 11.0.0 milestone Jan 11, 2023
jorisvandenbossche added a commit that referenced this issue Jan 13, 2023
…14804)

This PR implements the Dataframe Interchange Protocol for `pyarrow.Table`.
See: https://data-apis.org/dataframe-protocol/latest/index.html

Lead-authored-by: Alenka Frim <[email protected]>
Co-authored-by: Alenka Frim <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
@jorisvandenbossche
Member

Issue resolved by pull request 14804
#14804

Yicong-Huang added a commit to Texera/texera that referenced this issue May 5, 2023
This PR bumps Apache Arrow version from 10.0.0 to 11.0.0.

Main changes related to PyAmber:

## Java/Scala side:

- Distribute Apple M1 compatible JNI libraries via Maven Central
([#14472](apache/arrow#14472)).
- Improve performance by short-circuiting null checks when comparing
non-null field types ([#15106](apache/arrow#15106)).
- Extend Table copy functionality, and support returning copies of
individual vectors
([#14389](apache/arrow#14389)).
- Several enhancements to dictionary encoding
([#14891](apache/arrow#14891),
[#14902](apache/arrow#14902),
[#14874](apache/arrow#14874)).
- Extend Table to support additional vector types
([#14573](apache/arrow#14573)).
- Enhance and simplify handling of allocation management by integrating
C Data into allocator hierarchy
([#14506](apache/arrow#14506)).

## Python side:
- PyArrow now requires pandas >= 1.0
([ARROW-18173](https://issues.apache.org/jira/browse/ARROW-18173)).
- Added support for the [DataFrame Interchange
Protocol](https://data-apis.org/dataframe-protocol/latest/purpose_and_scope.html)
for pyarrow.Table
([GH-33346](apache/arrow#33346)).
- Support for custom metadata of record batches in the IPC read and
write APIs
([ARROW-16430](https://issues.apache.org/jira/browse/ARROW-16430)).
- The Time32Scalar, Time64Scalar, Date32Scalar and Date64Scalar classes
got a .value attribute to access the underlying integer value, similar
to the other date-time related scalars
([ARROW-18264](https://issues.apache.org/jira/browse/ARROW-18264)).
- Casting to string is now supported for duration
([ARROW-15822](https://issues.apache.org/jira/browse/ARROW-15822)) and
decimal
([ARROW-17458](https://issues.apache.org/jira/browse/ARROW-17458))
types, which also means those can now be written to CSV.

## Issues fixed:
- Do_action (from the Python server back to the Java client) now
properly returns a stream of results, and it raises an alert when the
client does not fully consume them. These results will be used to send
flow control credits back from the Python side. For now the stream is
limited to exactly one result, although it can carry more.
- Fix a bug in the Python proxy server: when an unregistered action is
invoked, the server should not parse and return the results.