Enable object_store reading for all the file types #6177
Comments
@winding-lines is there an open PR or any update on this?
@chitralverma I am still working on #6830 to fully integrate async in Python. After some dead ends I see a possible architecture that will marry the thread-heavy code in Polars with the async capabilities. See today's comment for my current thinking.
@winding-lines we have also been thinking about this problem. In particular, we are trying to understand what needs to be done by Polars itself for performance reasons and what could be done externally to avoid adding too much to Polars. Here is a table summarizing our understanding so far (✓ means supported, ? means unsupported, x means Polars does not need to support it):
The reason why We are also interested in streaming to object storage, both parquet and ipc. See #6178. For write operations we have this table:
I believe the "eager S3" operations can be implemented today using the underlying So, according to our (limited) understanding, the pending operations which must happen on the Polars side are not that many. Please let me know if I missed anything. Thank you!
Problem description
Right now the object_store crate is integrated for reading on the parquet streaming path. Enable cloud URL reading for all the file types.
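To make the request concrete, here is a minimal sketch of the dispatch that "cloud URL reading for all the file types" implies: decide whether a path is local or a cloud URL, then pick a reader by file type. It is not Polars code. The function name `classify`, the scheme set, and the extension table are all assumptions made up for illustration, and the schemes actually accepted would depend on how object_store is configured.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical scheme list for illustration only; the real set depends on
# which object_store backends are enabled.
CLOUD_SCHEMES = {"s3", "gs", "gcs", "az", "abfs", "abfss"}

# Hypothetical mapping from file extension to a format name.
FORMATS = {
    ".parquet": "parquet",
    ".ipc": "ipc",
    ".arrow": "ipc",
    ".feather": "ipc",
    ".csv": "csv",
    ".ndjson": "ndjson",
}


def classify(path: str) -> tuple[str, str]:
    """Return (location, file_format) for a local path or a cloud URL."""
    parsed = urlparse(path)
    # A recognised URL scheme means the bytes live in object storage.
    location = "cloud" if parsed.scheme in CLOUD_SCHEMES else "local"
    # The file type is taken from the extension of the path component.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    return location, FORMATS.get(suffix, "unknown")


if __name__ == "__main__":
    for p in ("s3://bucket/data/file.parquet", "/tmp/data.csv"):
        print(p, classify(p))
```

With a split like this, each file type would need only a cloud-aware byte source rather than its own URL handling, which is roughly the gap the issue describes for the non-parquet readers.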