Question about design #15
Hi!
The const generic bool argument indicates whether or not the array can contain null values i.e. if a validity bitmap is allocated. I considered the following alternative methods to expose this:
I haven't measured this, but I'm assuming that the performance benefit of this is negligible.
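To illustrate what a const generic bool for nullability can look like, here is a minimal sketch. It is not this crate's actual implementation: the names `Nullable`, `ValidityKind`, and `PrimitiveArray`, and the use of `Vec<bool>` as a stand-in for a real validity bitmap, are all assumptions made for illustration.

```rust
/// Hypothetical marker type carrying the nullability flag.
pub struct Nullable<const N: bool>;

/// Hypothetical trait mapping the flag to a validity storage type.
pub trait ValidityKind {
    type Bitmap;
}

// Non-nullable arrays store no validity information (zero-sized type).
impl ValidityKind for Nullable<false> {
    type Bitmap = ();
}

// Nullable arrays store one flag per element.
// A real implementation would pack these into a bitmap.
impl ValidityKind for Nullable<true> {
    type Bitmap = Vec<bool>;
}

/// Hypothetical array type; `N` selects whether validity storage exists.
pub struct PrimitiveArray<T, const N: bool>
where
    Nullable<N>: ValidityKind,
{
    values: Vec<T>,
    validity: <Nullable<N> as ValidityKind>::Bitmap,
}

impl<T> PrimitiveArray<T, false> {
    pub fn from_values(values: Vec<T>) -> Self {
        Self { values, validity: () }
    }

    /// Every in-bounds element is valid, so no validity check is needed.
    pub fn get(&self, i: usize) -> Option<&T> {
        self.values.get(i)
    }
}

impl<T: Default> PrimitiveArray<T, true> {
    pub fn from_options(items: Vec<Option<T>>) -> Self {
        let mut values = Vec::with_capacity(items.len());
        let mut validity = Vec::with_capacity(items.len());
        for item in items {
            validity.push(item.is_some());
            // Null slots hold a default placeholder value.
            values.push(item.unwrap_or_default());
        }
        Self { values, validity }
    }

    /// Returns `None` for null or out-of-bounds elements.
    pub fn get(&self, i: usize) -> Option<&T> {
        if self.validity.get(i).copied().unwrap_or(false) {
            self.values.get(i)
        } else {
            None
        }
    }
}
```

In this sketch the non-nullable variant carries no validity field at runtime, and the nullable and non-nullable arrays are distinct types, so mixing them up is a compile-time error rather than a runtime check.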
I'm building this crate to support the more narrow use-case where array types are known at compile time. Whenever data enters the application (either through io or ffi) the expected array type is specified and the read schema is used to validate that it is compatible. An incompatible schema results in a runtime error.

    #[derive(Array, ...)]
    pub struct Foo {
        ...
    }

    fn read_ipc<T, const N: bool>(...) -> Result<StructArray<T, N>, Error> {
        ...
    }

    fn main() -> Result<(), Error> {
        // Read a non-nullable Foo array from an ipc file.
        // Returns an error when schema of ipc file is not compatible.
        let foo_array = read_ipc::<Foo, false>(...)?;
        ...
        Ok(())
    }

To more precisely answer your question:
When types are not known at compile time this crate is not useful and I would use arrow2. A compatibility layer can be added to make use of methods from other arrow implementations e.g.:
So sorry, for some reason your answer slipped under my radar. Thank you very much for this summary! I think it makes sense. If you think there is anything in arrow2 that you would benefit from, please let me know and we can try to move it out of arrow2 into a common crate. Also, an idea to further simplify here is jorgecarleitao/arrow2#385, which changes arrow2 to use rust default
Hey!
This is a really interesting approach; I have tried this before working on "arrow2", but hit some walls when working with nested data in IO boundaries, so I am excited to see someone else trying!
A couple of questions, since I am very curious about it:
bool. Why this decision? Is there a major performance difference between this and only being known at runtime?
the latter was what ultimately led me to abandon the static design and continue to work under a dyn design.