Fix names and nullability when creating RecordBatch from MapArray #1258
Thanks for the pull request! @balbok0 would you be able to take your failing example and turn that into a unit or integration test so we don't have this regression in the future?
I’d like for us to fix this, but I also hope we can find a way to simultaneously support older PyArrow versions in the Python bindings.
@wjones127 are there some specific pyarrow versions you have in mind? I wasn't able to find a specific arrow-cpp/pyarrow version that changed the naming scheme.
Added failing example as a test.
This generally looks good. I was worried about PyArrow and other languages because the Arrow standard specifies field names as "key" not "keys" and "value" not "values" (arrow-rs deviates from this). But the PyArrow C Data Interface import actually ignores these field names on map types anyways, so it shouldn't matter that they are different. So we can ignore that worry.
I have a few small nits on the tests, but otherwise looks good to go.
Oh and could you rebase? We just fixed some of the deprecation warnings.
2ab8ad7 to 17c9b2a
Thanks for the comments! I applied the changes and rebased. Lmk if there is anything else that needs to be done before merge.
Description
When creating a RecordBatch with one of the columns being a MapArray, there are issues with naming of sub-columns, as well as nullability of the "elements/key_value" sub-column.
Related Issue(s)
Documentation
See https://www.rustexplorer.com/b/o7bfm4 for a breaking example.
I initially thought that the issue was specific to arrays with 0-length maps, but the `nullable` mismatch happens even when all of the map elements have values.
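To make the kind of mismatch described here concrete, below is a minimal std-only sketch. It does not use arrow-rs: `ToyField`, `map_field`, and `diff_fields` are hypothetical names invented for illustration, and which side uses which sub-column name is an assumption. It only models the idea that a declared schema and an actual map column can disagree on nested field names and on the nullability of the entries sub-column, and that such differences are enough to make the two types unequal.

```rust
/// Toy stand-in for a nested field: a name, a nullability flag, and children.
/// This is NOT the arrow-rs `Field` type; it only models what gets compared.
#[derive(Debug, Clone, PartialEq)]
struct ToyField {
    name: String,
    nullable: bool,
    children: Vec<ToyField>,
}

impl ToyField {
    fn new(name: &str, nullable: bool, children: Vec<ToyField>) -> Self {
        ToyField { name: name.to_string(), nullable, children }
    }
}

/// Builds a hypothetical map-typed column: one entries struct holding a
/// key child (non-nullable) and a value child (nullable).
fn map_field(entries: &str, key: &str, value: &str, entries_nullable: bool) -> ToyField {
    let key_field = ToyField::new(key, false, vec![]);
    let value_field = ToyField::new(value, true, vec![]);
    let entries_field = ToyField::new(entries, entries_nullable, vec![key_field, value_field]);
    ToyField::new("map_col", true, vec![entries_field])
}

/// Walks both field trees in parallel and collects every name or
/// nullability difference, labelled with the declared path.
fn diff_fields(declared: &ToyField, actual: &ToyField, path: &str) -> Vec<String> {
    let here = format!("{}/{}", path, declared.name);
    let mut out = Vec::new();
    if declared.name != actual.name {
        out.push(format!("{}: name '{}' vs '{}'", here, declared.name, actual.name));
    }
    if declared.nullable != actual.nullable {
        out.push(format!(
            "{}: nullable {} vs {}",
            here, declared.nullable, actual.nullable
        ));
    }
    // Children are compared positionally; extra children on either side are ignored here.
    for (d, a) in declared.children.iter().zip(actual.children.iter()) {
        out.extend(diff_fields(d, a, &here));
    }
    out
}
```

Calling `diff_fields(&map_field("key_value", "key", "value", false), &map_field("elements", "keys", "values", true), "")` reports the renamed entries, key, and value fields plus the entries nullability difference, while comparing a field against itself yields an empty list.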