Read list field correctly #234

Merged
merged 1 commit into from
Apr 29, 2021
Contributor

@nevi-me nevi-me commented Apr 27, 2021

Which issue does this PR close?

Closes #167.

Rationale for this change

We have been creating incorrect list types when reading Parquet lists into Arrow. Given a list with the field type:

let list_field = Field::new("a", DataType::List(Box::new(Field::new("item", DataType::_, true))), true);

We would instead read this list in as:

let list_field = Field::new("a", DataType::List(Box::new(Field::new("a", DataType::_, true))), true);

Here the list child incorrectly takes the list's name, "a", instead of "item".
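To make the mismatch concrete, here is a minimal, self-contained sketch. It does not use the arrow crate: `Field` and `DataType` below are simplified stand-ins defined only for this illustration, `Int32` is an assumed element type (the snippets above leave it as `_`), and the helper names `list_child_name`, `expected_list_field`, and `misread_list_field` are hypothetical.

```rust
// Simplified stand-ins for Arrow's `Field` and `DataType` (assumption:
// these are NOT the real arrow crate types, only enough to show the bug).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32, // assumed element type for the example
    List(Box<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

impl Field {
    fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field {
            name: name.to_string(),
            data_type,
            nullable,
        }
    }
}

/// Returns the name of the list's child field, or `None` if `field` is not a list.
fn list_child_name(field: &Field) -> Option<&str> {
    match &field.data_type {
        DataType::List(child) => Some(child.name.as_str()),
        _ => None,
    }
}

/// The field as it should be read: list "a" with a child named "item".
fn expected_list_field() -> Field {
    let child = Field::new("item", DataType::Int32, true);
    Field::new("a", DataType::List(Box::new(child)), true)
}

/// The field as described in the bug: the child reuses the list's name "a".
fn misread_list_field() -> Field {
    let child = Field::new("a", DataType::Int32, true);
    Field::new("a", DataType::List(Box::new(child)), true)
}
```

Comparing `expected_list_field()` with `misread_list_field()` using `==` yields `false` in this sketch, because the child names differ even though the outer field name, element type, and nullability match. A schema check against the originally written schema would fail for the same reason.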

What changes are included in this PR?

This PR fixes the above issue.
It also fixes the order of the generated LevelInfo, a problem I identified while fixing the test case for the above issue.

Are there any user-facing changes?

No user-facing changes.

@codecov-commenter

Codecov Report

Merging #234 (2018f43) into master (111d5d6) will increase coverage by 0.02%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #234      +/-   ##
==========================================
+ Coverage   82.49%   82.51%   +0.02%     
==========================================
  Files         162      162              
  Lines       43621    43655      +34     
==========================================
+ Hits        35983    36022      +39     
+ Misses       7638     7633       -5     
| Impacted Files | Coverage Δ |
|---|---|
| parquet/src/arrow/array_reader.rs | 77.14% <100.00%> (+0.57%) ⬆️ |
| parquet/src/arrow/arrow_writer.rs | 98.32% <100.00%> (+0.09%) ⬆️ |
| arrow/src/record_batch.rs | 85.71% <0.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Member

@jorgecarleitao jorgecarleitao left a comment

LGTM. Thanks @nevi-me!

@jorgecarleitao jorgecarleitao added the parquet Changes to the parquet crate label Apr 28, 2021
@jorgecarleitao jorgecarleitao changed the title [Parquet] Read list field correctly Read list field correctly Apr 28, 2021
@alamb alamb merged commit 2121150 into apache:master Apr 29, 2021
Labels
parquet Changes to the parquet crate
Development

Successfully merging this pull request may close these issues.

Read list field correctly in <struct<list>>
4 participants