Async sink interfaces for IPC and Parquet IO #876
Codecov Report
@@ Coverage Diff @@
## main #876 +/- ##
==========================================
+ Coverage 71.50% 71.65% +0.14%
==========================================
Files 335 338 +3
Lines 18147 18474 +327
==========================================
+ Hits 12976 13237 +261
- Misses 5171 5237 +66
Holy moly, amazing PR ❤️ IMO this is basically ready to merge; the API looks great, it is really easy to follow, and overall it is really great work. Some comments:
Sure, I'll split it up, move the tests, and re-submit!
Thanks a lot, @dexterduck 🙇 yes, for now it is ok to break APIs in |
Following up on discussion here: jorgecarleitao/parquet2#78.
This PR implements new types that expose the `futures::Sink` trait for Arrow IPC and Parquet writers. It also implements an async stream type for reading the IPC file format (since async reading is currently only available for the IPC stream format). Specific new types:

- `io::ipc::read::file_async::FileStream` - implements `futures::Stream` for IPC files.
- `io::ipc::write::file_async::FileSink` - implements `futures::Sink` for IPC files.
- `io::ipc::write::file_async::StreamSink` - implements `futures::Sink` for IPC streams.
- `io::parquet::write::FileSink` - implements `futures::Sink` for Parquet files.

Definitely interested in feedback on these changes!
In particular, I felt pretty unsure about the file structure and naming for the new types, so I'm happy to change those in any way that makes sense. I also added tests directly into some of the modules, but I wasn't sure whether there is a different place I should be putting them.
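For readers unfamiliar with the sink pattern, here is a minimal, std-only sketch of the contract the new writer types follow. Everything in it is an assumption made for illustration: the local `Sink` trait is a simplified stand-in that mirrors the four-method shape of `futures::Sink`, and `BatchSink`, `NoopWake`, and `write_batches` are hypothetical names, not types from this PR. The "batches" are plain `Vec<i32>` values rather than Arrow chunks.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Simplified local stand-in for `futures::Sink` (same four-method shape).
trait Sink<Item> {
    type Error;
    fn poll_ready(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
    fn start_send(self: Pin<&mut Self>, item: Item) -> Result<(), Self::Error>;
    fn poll_flush(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
    fn poll_close(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
}

/// Hypothetical sink: buffers batches in `start_send`, "writes" them on flush.
struct BatchSink {
    pending: Vec<Vec<i32>>,
    written: Vec<i32>,
    closed: bool,
}

impl Sink<Vec<i32>> for BatchSink {
    type Error = String;

    fn poll_ready(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Result<(), String>> {
        // This toy sink is always ready unless it has been closed.
        if self.closed {
            Poll::Ready(Err("sink is closed".to_string()))
        } else {
            Poll::Ready(Ok(()))
        }
    }

    fn start_send(mut self: Pin<&mut Self>, item: Vec<i32>) -> Result<(), String> {
        if self.closed {
            return Err("sink is closed".to_string());
        }
        self.pending.push(item);
        Ok(())
    }

    fn poll_flush(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Result<(), String>> {
        // `BatchSink` is `Unpin`, so a plain `&mut` can be taken out of the pin.
        let this = self.get_mut();
        for batch in this.pending.drain(..) {
            this.written.extend(batch);
        }
        Poll::Ready(Ok(()))
    }

    fn poll_close(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), String>> {
        // Closing flushes first, then refuses further items.
        match self.as_mut().poll_flush(cx) {
            Poll::Ready(Ok(())) => {
                self.closed = true;
                Poll::Ready(Ok(()))
            }
            other => other,
        }
    }
}

/// Waker that does nothing; enough to poll a sink that never returns `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drives the sink by hand: ready -> send for each batch, then close.
fn write_batches(batches: Vec<Vec<i32>>) -> Result<Vec<i32>, String> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut sink = BatchSink { pending: Vec::new(), written: Vec::new(), closed: false };
    let mut pinned = Pin::new(&mut sink);

    for batch in batches {
        match pinned.as_mut().poll_ready(&mut cx) {
            Poll::Ready(result) => result?,
            Poll::Pending => return Err("sink not ready".to_string()),
        }
        pinned.as_mut().start_send(batch)?;
    }
    match pinned.as_mut().poll_close(&mut cx) {
        Poll::Ready(result) => result?,
        Poll::Pending => return Err("close did not complete".to_string()),
    }
    Ok(sink.written)
}
```

With the real `futures` crate, the manual polling in `write_batches` would normally be replaced by `SinkExt::send` and `SinkExt::close` inside an async function; the hand-rolled loop above only exists to keep the sketch free of external dependencies.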