-
Notifications
You must be signed in to change notification settings - Fork 224
Conversation
@sundy-li here is a proposal. Would it work for you? You would need to match as such: match array.data_type() {
DataType::Extension("date16", UInt16, _) => {
let array = array.as_any().downcast_ref::<ExtensionArray>().unwrap();
let array: &UInt16Array = array.inner().as_any().downcast_ref().unwrap();
}
} |
Codecov Report
@@ Coverage Diff @@
## main #350 +/- ##
==========================================
- Coverage 81.01% 80.16% -0.85%
==========================================
Files 326 329 +3
Lines 21172 21507 +335
==========================================
+ Hits 17152 17242 +90
- Misses 4020 4265 +245
Continue to review full report at Codecov.
|
e68dd3e
to
e911fd8
Compare
Yes, this would work in my case, LGTM. So |
|
This pr looks good except for one thing: how does the If I just looked deep into the ClickHouse's array system, there is no logic type involved inside the array system. https://github.com/ClickHouse/ClickHouse/blob/master/src/Columns/ColumnVector.h#L232 |
AFIAK the compute cannot make many assumptions about extension type because an extension type may have different semantics. With that said, there are still a lot of things we can assume and forward the call to the corresponding function (e.g. we should still support filter, take, concat, etc). Are those the type of compute operations that you are thinking about? |
I have been thinking a lot about this and I am not sure this is the right way of doing this; I am starting to agree with you @sundy-li that we need the physical type idea here. I.e. not use a registry, but still use the |
Currently, this pr can meet our demand, so I decided to introduce custom logic types in Datafuse base on this pr. This may help in sharing some compute operations, like So I think this pr can be merged now because it did not break any old behaviors, let's take it as an experimental feature. I do agree that |
Let calculations follow specific actual types, not logical types. |
@ritchie46 , does polars use |
Polars keep track of the data type itself on the Logical type |
Datafuse used polars's series idea, the But So And polars did not use flight and IPC, so currently it's easy to have custom logic data types inside it. |
Closing in favor of #359 |
This PR adds support for Arrow's "Extension type".
This is represented by
DataType::Extension
andExtensionArray
, respectively.For now, extension arrays are only supported to be shared via IPC (i.e. not FFI, since metadata is still not supported in FFI).
This PR adds an example demonstrating how to use it.
All of this is pending passing integration tests, as usual, as well as more tests covering the feature.