support select .. FROM 'parquet.file' in datafusion-cli #4838
Thank you for this PR @unconsolable -- I plan to review this PR tomorrow
Thank you so much @unconsolable! I think it is ok to merge without tests because we don't really have a good testing story for datafusion-cli at the moment. However, I think this feature could be useful for other users of datafusion (not just datafusion-cli), so I will file a ticket to add this feature to the core of datafusion as well.
I tried it out locally and it was so cool:
(arrow_dev) alamb@MacBook-Pro-8:~/Software/arrow-datafusion2/datafusion-cli$ CARGO_TARGET_DIR=/Users/alamb/Software/target-df2 cargo run --bin datafusion-cli
Finished dev [unoptimized + debuginfo] target(s) in 1.33s
Running `/Users/alamb/Software/target-df2/debug/datafusion-cli`
DataFusion CLI v15.0.0
❯ select * from '/Users/alamb/.influxdb_iox//1/8/1/6/e24b6549-f76c-4fc0-a4f4-152ed60eb4e3.parquet';
+---------+---------------------+------+---------+----------+---------+---------------------+-------+---------+---------+
| blocked | host                | idle | running | sleeping | stopped | time                | total | unknown | zombies |
+---------+---------------------+------+---------+----------+---------+---------------------+-------+---------+---------+
| 0       | MacBook-Pro-8.local | 0    | 2       | 697      | 0       | 2022-07-18T21:05:10 | 700   | 0       | 1       |
| 0       | MacBook-Pro-8.local | 0    | 2       | 696      | 0       | 2022-07-18T21:05:20 | 699   | 0       | 1       |
+---------+---------------------+------+---------+----------+---------+---------------------+-------+---------+---------+
2 rows in set. Query took 0.096 seconds.
❯
Benchmark runs are scheduled for baseline = 3d75bb8 and contender = f9b72f4. f9b72f4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Not sure if others understand the elegance of @unconsolable's solution here, but it works for other file types as well as directories of files:
I will add some documentation about this feature as it is very cool
proposed docs: #4851
Which issue does this PR close?
Closes #4580.
Rationale for this change
See #4580
What changes are included in this PR?
Follow up #4581
Add DynamicFileCatalog, DynamicFileCatalogProvider, and DynamicFileSchemaProvider, which can try to create a ListingTable when a table is not found.
Use DynamicFileCatalog in SessionContext.
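The fallback idea described above (look up a registered table first, and only if that fails try to treat the name as a file path) can be sketched with the standard library alone. This is an illustration of the pattern, not the code in this PR: `Table`, `SchemaLookup`, `StaticSchema`, `DynamicFileSchema`, and `listing_from_path` are hypothetical names, and the extension list and trailing-slash rule are assumptions made for the example.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical stand-in for a table handle; not a DataFusion type.
#[derive(Debug, Clone, PartialEq)]
pub enum Table {
    /// Registered explicitly under a name.
    Registered(String),
    /// Created on demand from a path, loosely analogous to a ListingTable.
    Listing { path: String, format: Option<String> },
}

/// Hypothetical lookup interface, standing in for a schema provider.
pub trait SchemaLookup {
    fn table(&self, name: &str) -> Option<Table>;
}

/// Plain name-to-table map with no fallback behavior.
pub struct StaticSchema {
    tables: HashMap<String, Table>,
}

impl StaticSchema {
    pub fn new(names: &[&str]) -> Self {
        let tables = names
            .iter()
            .map(|n| (n.to_string(), Table::Registered(n.to_string())))
            .collect();
        Self { tables }
    }
}

impl SchemaLookup for StaticSchema {
    fn table(&self, name: &str) -> Option<Table> {
        self.tables.get(name).cloned()
    }
}

/// Wrapper that retries a failed lookup by treating the name as a path.
pub struct DynamicFileSchema<S: SchemaLookup> {
    inner: S,
}

impl<S: SchemaLookup> DynamicFileSchema<S> {
    pub fn new(inner: S) -> Self {
        Self { inner }
    }
}

impl<S: SchemaLookup> SchemaLookup for DynamicFileSchema<S> {
    fn table(&self, name: &str) -> Option<Table> {
        // Registered tables take precedence over the path fallback.
        self.inner.table(name).or_else(|| listing_from_path(name))
    }
}

/// Assumed rule: a known extension selects a format; a trailing '/' means a directory.
fn listing_from_path(name: &str) -> Option<Table> {
    let ext = Path::new(name)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext.as_deref() {
        Some(e @ ("parquet" | "csv" | "json" | "avro")) => Some(Table::Listing {
            path: name.to_string(),
            format: Some(e.to_string()),
        }),
        None if name.ends_with('/') => Some(Table::Listing {
            path: name.to_string(),
            format: None,
        }),
        _ => None,
    }
}
```

Wrapping the existing lookup rather than replacing it means explicitly registered tables keep working unchanged, and the path interpretation only applies to names that would otherwise fail to resolve.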
Are these changes tested?
Manually tested.
Are there any user-facing changes?
The syntax select .. FROM 'abc.parquet' is supported in datafusion-cli.