feat(python): add parameter to DeltaTable.to_pyarrow_dataset() #2465
Conversation
@adriangb it might be better to just pass through …
They both seem useful, right? It seems like the …
@adriangb that's true. If you can fix the tests then we can merge.
It looks like the test only fails on older pyarrow versions, and only for the map type. How about I split it in two and skip the failing one on pyarrow < 10?
@adriangb can you fix the tests? Then we can merge it :)
@@ -1022,6 +1022,8 @@ def to_pyarrow_dataset(
         partitions: Optional[List[Tuple[str, str, Any]]] = None,
         filesystem: Optional[Union[str, pa_fs.FileSystem]] = None,
         parquet_read_options: Optional[ParquetReadOptions] = None,
+        schema: Optional[pyarrow.Schema] = None,
+        as_large_types: bool = False,
The doc description is missing for this param. I would also mention that if a schema is passed, it takes precedence over as_large_types.
done!
Thanks @adriangb!
Otherwise there is no way to union this with another dataset.