Read date32, date64, decimal128 from Arrow datasets #829

sc1f · 2019-12-03T17:42:02Z

This PR adds support for reading date32, date64, and decimal128 columns from Arrow-serialized datasets. date32 and date64 columns are typed as date, while decimal128 columns are typed as int64.
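For readers unfamiliar with these Arrow types, here is a rough, standard-library-only sketch of what the stored values mean under the Arrow format: date32 is a day count since the UNIX epoch, date64 is a millisecond count since the epoch, and decimal128 is a 128-bit two's-complement integer paired with a scale. The helper names are hypothetical and are not taken from this PR, and the sketch does not show how Perspective itself maps decimal128 values to int64.

```python
import datetime
from decimal import Decimal

EPOCH = datetime.date(1970, 1, 1)
MS_PER_DAY = 86_400_000


def date32_to_date(days: int) -> datetime.date:
    """Interpret an Arrow date32 value (days since 1970-01-01)."""
    return EPOCH + datetime.timedelta(days=days)


def date64_to_date(millis: int) -> datetime.date:
    """Interpret an Arrow date64 value (milliseconds since 1970-01-01)."""
    # Floor to whole days; date64 values are expected to fall on day boundaries.
    days, _ = divmod(millis, MS_PER_DAY)
    return EPOCH + datetime.timedelta(days=days)


def decimal128_to_decimal(raw: bytes, scale: int) -> Decimal:
    """Interpret 16 little-endian bytes as a signed unscaled integer, then apply the scale."""
    if len(raw) != 16:
        raise ValueError("decimal128 values are exactly 16 bytes")
    unscaled = int.from_bytes(raw, byteorder="little", signed=True)
    return Decimal(unscaled).scaleb(-scale)
```

For example, `date32_to_date(18233)` and `date64_to_date(18233 * 86_400_000)` both describe 2019-12-03.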

Additionally, utility functions have been added for loading Arrows from the filesystem (in JS tests) and generating Arrow binaries using PyArrow (for Python tests). This allows us to quickly add test cases for Arrow datasets in Perspective.

texodus

Reviewed! Looks great - awesome test coverage for this feature as well. Some takeaways for future work:

Follow up on stream vs. file Arrows across 14 and 15: we have both seen bugs here, and I suspect a regression in 15 that affects the file format.
Partial update semantics are not well defined for Arrow yet.
Type promotion is not well defined.

texodus · 2019-12-04T19:14:10Z

Thanks for the PR!

sc1f added 3 commits December 2, 2019 23:14

support date32, date64, add comprehensive tests for python stream arrow

6a452d0

add test_arrows.js to test spec, fix JS arrow tests

2d3ccb3

fix broken JS tests

455f2bd

sc1f requested a review from texodus December 3, 2019 17:42

fix flaking python test

0bb967e

finos-admin added the cla-present label Dec 3, 2019

sc1f added the C++, enhancement, JS, and Python labels Dec 3, 2019

texodus approved these changes Dec 4, 2019


texodus merged commit 837336d into master Dec 4, 2019

texodus deleted the arrow-load-dates branch December 4, 2019 19:14

RandomFractals mentioned this pull request Dec 5, 2019

specify variable types when previewing files RandomFractals/vscode-data-preview#171

Closed
