-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Upstream Spice AI changes to datafusion-federation
#51
Conversation
* testing * testing2 * Rewrite all table scans to use federated table name * Wrap the table scan in a subquery alias
* Rewrite subquery table scans to point to remote table * Add tracing * more tracing * more debug logging * Handle subquery in binary expressions
* Fix the table scan rewrite to properly rewrite column relations * Handle more expressions for table scan rewrites
Update Support for multiple table occurrences in single expression Support for multiple different tables in single aggregation expression Document rewrite_column_name_in_expr logic Update Fix spelling
…uery when executing (#10) * keep original qualified_field when wrap_projections * only mutate the table reference when sending plan to execute * formatting * rename method
* do not over eager rewrite column when col relation is there * add more tests * Update sources/sql/src/lib.rs Co-authored-by: Phillip LeBlanc <[email protected]> --------- Co-authored-by: Phillip LeBlanc <[email protected]>
* Add List types parsing support to schema cast * Update to use arrow-json for lists parsing
* Add List types parsing support to schema cast * Update to use arrow-json for lists parsing * Add interval cast given original schema * Fix test --------- Co-authored-by: sgrebnov <[email protected]> Co-authored-by: Phillip LeBlanc <[email protected]>
* Add Structs parsing support to schema cast * Optimize memory space
Co-authored-by: Suriya Kandaswamy <[email protected]> Co-authored-by: Suriya Kandaswamy <[email protected]>
…r from datafusion-flight-sql-server"
…on-migrate-spiceai-41
…ceai-41 rebase spiceai-41 branch
datafusion-federation
datafusion-federation
edition = "2021" | ||
license = "MIT" | ||
license = "Apache-2.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I felt this would be better since DataFusion is an Apache project?
pub use table_provider::{FederatedTableProviderAdaptor, FederatedTableSource}; | ||
|
||
// TODO clean up this | ||
// TODO move schema_cast.rs to schema_cast directory | ||
pub mod schema_cast; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This schema_cast module is useful when the schema registered in DataFusion and the schema returned by the underlying provider don't match exactly. The schema_cast provides a mechanism to keep the schema aligned with what DataFusion expects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put TODO
there as reminder to clean up the module by moving schema_cast.rs to schema_cast/mod.rs
@@ -100,6 +113,481 @@ impl OptimizerRule for SQLFederationOptimizerRule { | |||
false | |||
} | |||
} | |||
|
|||
/// Rewrite table scans to use the original federated table name. | |||
fn rewrite_table_scans( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed to support the table name in the remote source being different from the table name registered in DataFusion. This required a lot of trial and error to get right, and we have a good integration test coverage of this in spiceai.
The rewriting does seem like good generic functionality so I'm fine to accept it in. |
This PR contains all of the changes that the spiceai project made to
datafusion-federation
to fix bugs and add features to make federation work better for our use-cases.The main two features that were added: