Read date32, date64, decimal128 from Arrow datasets #829

Merged: 4 commits, Dec 4, 2019

@sc1f sc1f commented Dec 3, 2019

This PR adds support for reading date32, date64, and decimal128 columns from Arrow-serialized datasets. date32 and date64 columns are typed as date, while decimal128 columns are typed as int64.
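For context on the two date encodings: in the Arrow format, date32 stores days since the UNIX epoch as a 32-bit integer, and date64 stores milliseconds since the UNIX epoch as a 64-bit integer. The sketch below uses only the Python standard library to show that both encodings resolve to the same calendar date. The helper names are hypothetical and are not taken from this PR's implementation, which may convert the values differently.

```python
from datetime import date, datetime, timedelta

EPOCH = date(1970, 1, 1)


def date32_to_date(days: int) -> date:
    """Decode an Arrow date32 value (days since the UNIX epoch)."""
    return EPOCH + timedelta(days=days)


def date64_to_date(millis: int) -> date:
    """Decode an Arrow date64 value (milliseconds since the UNIX epoch)."""
    return (datetime(1970, 1, 1) + timedelta(milliseconds=millis)).date()


# The same calendar day expressed in both encodings.
days = (date(2019, 12, 3) - EPOCH).days
millis = days * 86_400_000  # milliseconds per day

print(date32_to_date(days), date64_to_date(millis))
```

Both helpers return a plain `datetime.date`, which mirrors the PR's statement that date32 and date64 columns are both typed as date.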

This PR also adds utility functions for loading Arrows from the filesystem (in JS tests) and for generating Arrow binaries using PyArrow (in Python tests). These make it quick to add new test cases for Arrow datasets in Perspective.

@sc1f sc1f requested a review from texodus December 3, 2019 17:42
@sc1f sc1f added C++ enhancement Feature requests or improvements JS Python labels Dec 3, 2019

@texodus texodus left a comment

Reviewed! Looks great - awesome test coverage for this feature as well. Some takeaways for future work:

  • Follow up on stream vs file arrows across 14 & 15 - we have both seen bugs in this and I suspect it is a 15 regression on file.
  • Partial update semantics are not well defined for Arrow yet.
  • Type promotion is not well defined.

texodus commented Dec 4, 2019

Thanks for the PR!

3 participants