feat: add datafusion storage catalog #1381
/// Underlying object store
store: Arc<dyn ObjectStore>,
/// A map of table names to a fully qualified storage location
tables: DashMap<String, String>,
Is there an artifact of implementing this `SchemaProvider` that requires the use of `DashMap`? Since this adds a new dependency, I'm curious what forces this requirement, as it's not clear to me.
We could also go the "classic" route using `Arc<Mutex<...>>`. However, since datafusion internally switched to `DashMap` in most places, we already have it as a transitive dependency if we use datafusion.
It also simplifies all the code where we use this map.
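For comparison, here is a minimal sketch of what the "classic" route could look like using only the standard library. `TableRegistry` and its methods are hypothetical names for illustration, not code from this PR; the point is only to show the explicit locking that every access needs.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the catalog's table map (not the PR's code).
#[derive(Clone, Default)]
pub struct TableRegistry {
    /// Table name -> fully qualified storage location.
    tables: Arc<Mutex<HashMap<String, String>>>,
}

impl TableRegistry {
    /// Registers a table and returns the previous location, if any.
    pub fn register(&self, name: &str, location: &str) -> Option<String> {
        // Every access has to take the lock explicitly.
        let mut guard = self.tables.lock().expect("registry lock poisoned");
        guard.insert(name.to_string(), location.to_string())
    }

    /// Looks up the storage location registered for a table name.
    pub fn location(&self, name: &str) -> Option<String> {
        let guard = self.tables.lock().expect("registry lock poisoned");
        guard.get(name).cloned()
    }

    /// Returns all registered table names in sorted order.
    pub fn table_names(&self) -> Vec<String> {
        let guard = self.tables.lock().expect("registry lock poisoned");
        let mut names: Vec<String> = guard.keys().cloned().collect();
        names.sort();
        names
    }
}
```

With `DashMap`, calls such as `insert` and `get` take `&self` directly, so the lock acquisition and poisoning handling above disappear from the call sites, which is presumably the simplification meant here.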
parent = p;
}
}
for table in tables.into_iter() {
This bit of logic is a bit opaque to me, especially the series of chained calls to create `file_name`. I think it's a good candidate for refactoring into a private function with some unit tests, but that's at your discretion, since I know sometimes these objects are a PITA to scaffold for tests.
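As a rough idea of the shape such a helper could take, here is a sketch that works on plain path strings so it can be tested without scaffolding an object store. The function name, the return type, and the rule that the table name is the folder directly above `_delta_log` are all assumptions for illustration; the PR's actual chained calls are not reproduced here.

```rust
/// Hypothetical helper: given an object path that contains a `_delta_log`
/// segment, return `(table_name, table_root)`, where the table name is
/// assumed to be the folder directly above `_delta_log`.
///
/// Returns `None` when there is no `_delta_log` segment or when it sits at
/// the top level, so there is no parent folder to name the table after.
pub fn table_from_log_path(path: &str) -> Option<(String, String)> {
    // Split on '/' and drop empty segments caused by doubled or leading slashes.
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();

    // Position of the first `_delta_log` segment, if any.
    let idx = segments.iter().position(|s| *s == "_delta_log")?;
    if idx == 0 {
        return None;
    }

    let name = segments[idx - 1].to_string();
    let root = segments[..idx].join("/");
    Some((name, root))
}
```

Because the helper only takes a `&str`, unit tests reduce to a handful of path literals, for example a commit file under `warehouse/sales/_delta_log/` mapping to the table `sales` rooted at `warehouse/sales`.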
Description
Adds a datafusion `SchemaProvider` implementation based on discovering Delta tables in a storage location via `_delta_log` folders.
Related Issue(s)
Documentation