feat(rust): make PartitionWriter public #2525
ACTION NEEDED: delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
@adriangb I'm not opposed to the idea. Can you share more about how this can be used outside of the delta context? I'm curious.
This isn't necessarily for use outside of the delta context, just a way to hook in at a lower level and with less code churn. I have a Rust service that buffers data before writing to Delta. I control all of the writes going to Delta, so I want to avoid the overhead of checking that the schemas match, etc., which happens if you go through the public APIs. I also want to take control of the write retry loop, because I may have to switch to pessimistic concurrency control under high write load (take out a lock on a low-latency system like Redis only at the stage of writing the commit) to avoid write contention. Currently data is written out manually using ArrowWriter as hive-partitioned parquet. So if I can hook into PartitionWriter, it's relatively few code changes (it actually saves me a good bit of code doing the partitioning), and I can then implement the writing of the commits as a later step to convert to a delta table.
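To make the "lock only at the commit stage" idea concrete, here is a minimal sketch, not code from delta-rs. `CommitLog`, `Committer`, and `run_writers` are hypothetical names, the log is an in-memory stand-in for the table's commit log, and a `std::sync::Mutex` stands in for the external lock (e.g. Redis). Data files are assumed to have been written before `commit` is called, so only the commit itself is serialized.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical in-memory stand-in for a table's commit log.
/// The index of each entry is the commit version.
#[derive(Default)]
pub struct CommitLog {
    commits: Vec<Vec<String>>,
}

impl CommitLog {
    pub fn next_version(&self) -> usize {
        self.commits.len()
    }

    /// Put-if-absent semantics: reject the commit if `version` is not the
    /// next free version, returning the version that is actually next.
    pub fn try_commit(&mut self, version: usize, files: Vec<String>) -> Result<(), usize> {
        if version != self.commits.len() {
            return Err(self.commits.len());
        }
        self.commits.push(files);
        Ok(())
    }
}

/// Hypothetical writer handle. `commit_lock` stands in for an external,
/// low-latency lock; it is an assumption of this sketch.
#[derive(Clone)]
pub struct Committer {
    commit_lock: Arc<Mutex<()>>,
    log: Arc<Mutex<CommitLog>>,
}

impl Committer {
    pub fn new() -> Self {
        Committer {
            commit_lock: Arc::new(Mutex::new(())),
            log: Arc::new(Mutex::new(CommitLog::default())),
        }
    }

    /// Pessimistic commit: the lock is held only while the version is
    /// chosen and the commit entry is written, never during data writes.
    pub fn commit(&self, files: Vec<String>) -> usize {
        let _guard = self.commit_lock.lock().unwrap();
        let mut log = self.log.lock().unwrap();
        let version = log.next_version();
        log.try_commit(version, files)
            .expect("no conflict is possible while holding the commit lock");
        version
    }
}

/// Spawn `n` concurrent writers and return the sorted versions they got.
pub fn run_writers(n: usize) -> Vec<usize> {
    let committer = Committer::new();
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let c = committer.clone();
            thread::spawn(move || c.commit(vec![format!("part-{i}.parquet")]))
        })
        .collect();
    let mut versions: Vec<usize> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    versions.sort();
    versions
}
```

With optimistic concurrency, each writer would instead call `try_commit` without the lock and retry on `Err`; under heavy contention those retries are what the lock is meant to avoid.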
Sounds reasonable to me! I think this API is stable enough to make public in the next release, please run a |
Head branch was pushed to by a user without write access
Done, @rtyler! Seems like it dismissed your review :(
@rtyler @adriangb - recently I have been thinking about whether we should adopt the visibility crate to expose a larger API behind a feature gate, with fewer guarantees. We use this (so far successfully, I believe) in kernel ... There is also some writing work on my todo list, e.g. multi-part put from object_store, that may affect this.
@roeap not sure I understood fully. Are you saying that this should not be made public right now, or that there is an opportunity for a more thoughtful approach to a public API within the rust crates in the future?
@adriangb - I think we can make this public now, but maybe behind a feature gate. The visibility crate enables this. @ion-elgreco - yes, there are still some things to cross off first, and we'll work from where we are when I get to it 🙂.
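For readers unfamiliar with feature-gated visibility, a rough, dependency-free sketch of the idea follows. It uses a plain `cfg` re-export rather than the visibility crate so that it compiles on its own. The feature name `unstable-api` and the simplified `PartitionWriter` body are assumptions for illustration, not the actual delta-rs definitions.

```rust
// Private module: nothing in here is part of the crate's public API by default.
mod writer {
    /// Simplified, hypothetical stand-in for the real `PartitionWriter`.
    #[derive(Debug, Default)]
    pub struct PartitionWriter {
        buffered_rows: usize,
    }

    impl PartitionWriter {
        pub fn new() -> Self {
            Self::default()
        }

        /// Pretend to buffer `rows` rows for this partition.
        pub fn write(&mut self, rows: usize) {
            self.buffered_rows += rows;
        }

        pub fn buffered_rows(&self) -> usize {
            self.buffered_rows
        }
    }
}

// Re-exported only when the (hypothetical) `unstable-api` feature is enabled,
// so downstream crates must opt in and accept weaker stability guarantees.
// As I understand it, the visibility crate reaches a similar result with an
// attribute along the lines of
// `#[cfg_attr(feature = "unstable-api", visibility::make(pub))]`.
#[cfg(feature = "unstable-api")]
pub use writer::PartitionWriter;
```

Inside the crate the type is always usable as `writer::PartitionWriter`; a downstream crate would only see it after enabling the feature in its dependency declaration.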
Interesting, TIL!
I would like to use PartitionWriter within my own crate. Would you mind making this public so I don't have to vendor it and all of the associated code?