Improve list builders, iteration and construction #3419

Merged: 3 commits, May 18, 2022

@ritchie46 ritchie46 commented May 18, 2022

  • Greatly improves performance of the list builders via
    jorgecarleitao/arrow2#991 (remove accidental quadratic null_count)

  • List builders now also support nested dtypes like List and Struct

  • The Python DataFrame and Series constructors now handle nested dtype
    construction better

  • Fixes an incorrect implementation of agg_list for the Struct dtype
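
The first bullet refers to an "accidental quadratic null_count". The sketch below is not the arrow2 code; it is a minimal pure-Python illustration, under the assumption that the cost came from recounting nulls over the whole buffer on every push. The class names `RecountingListBuilder` and `CountingListBuilder` and the `scanned` counter are hypothetical and exist only to make the difference in work visible.

```python
from typing import List, Optional


class RecountingListBuilder:
    """Rescans every stored value on each push: O(n) per push, O(n^2) overall."""

    def __init__(self) -> None:
        self.values: List[Optional[int]] = []
        self.scanned = 0  # how many elements were inspected while counting nulls

    def push(self, value: Optional[int]) -> None:
        self.values.append(value)
        # Hypothetical bookkeeping step that asks for the null count after each push.
        self.null_count()

    def null_count(self) -> int:
        self.scanned += len(self.values)
        return sum(v is None for v in self.values)


class CountingListBuilder:
    """Keeps a running null counter: O(1) per push, O(n) overall."""

    def __init__(self) -> None:
        self.values: List[Optional[int]] = []
        self._nulls = 0
        self.scanned = 0  # one inspection per pushed element

    def push(self, value: Optional[int]) -> None:
        self.values.append(value)
        self.scanned += 1
        if value is None:
            self._nulls += 1

    def null_count(self) -> int:
        return self._nulls


def fill(builder, n: int):
    """Push n values, every third one being null."""
    for i in range(n):
        builder.push(None if i % 3 == 0 else i)
    return builder


slow = fill(RecountingListBuilder(), 1000)
fast = fill(CountingListBuilder(), 1000)
print(slow.scanned, fast.scanned)
```

Both builders end up with the same values and the same null count; only the amount of scanning differs, which is the kind of gap a fix like the linked one would close.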

fixes #3415
fixes #3418

ritchie46 added 2 commits May 17, 2022 10:55
- Greatly improves performance of the list builders
 by: jorgecarleitao/arrow2#991

- List builders now also support nested dtypes like List and Struct

- Python DataFrame and Series constructor now support better nested dtype
construction
@github-actions bot added the python (Related to Python Polars) and rust (Related to Rust Polars) labels May 18, 2022

codecov-commenter commented May 18, 2022

Codecov Report

Merging #3419 (7e8abd6) into master (50f7bd9) will decrease coverage by 0.01%.
The diff coverage is 65.43%.

@@            Coverage Diff             @@
##           master    #3419      +/-   ##
==========================================
- Coverage   61.61%   61.60%   -0.02%     
==========================================
  Files         395      396       +1     
  Lines       68428    68573     +145     
==========================================
+ Hits        42165    42241      +76     
- Misses      26263    26332      +69     
| Impacted Files | Coverage Δ |
|---|---|
| polars/polars-arrow/src/array/list.rs | 98.30% <ø> (ø) |
| ...olars/polars-core/src/chunked_array/builder/mod.rs | 39.04% <0.00%> (-0.55%) ⬇️ |
| polars/polars-core/src/chunked_array/cast.rs | 68.68% <0.00%> (-0.71%) ⬇️ |
| polars/polars-core/src/chunked_array/list/mod.rs | 62.50% <ø> (ø) |
| polars/polars-core/src/chunked_array/ndarray.rs | 36.71% <0.00%> (-0.59%) ⬇️ |
| polars/polars-core/src/fmt.rs | 32.93% <0.00%> (+0.62%) ⬆️ |
| .../polars-core/src/frame/groupby/aggregations/mod.rs | 64.31% <ø> (ø) |
| polars/polars-ops/src/chunked_array/list/mod.rs | 100.00% <ø> (ø) |
| polars/polars-time/src/chunkedarray/datetime.rs | 40.19% <ø> (ø) |
| py-polars/src/list_construction.rs | 27.16% <14.28%> (-1.42%) ⬇️ |

... and 28 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

assert pl.DataFrame(
    {
        "list_of_struct": [
            [{"a": 1, "b": 4}, {"a": 3, "b": 6}],

@cbilot, creating a list of structs is now a bit more ergonomic.

@ritchie46 ritchie46 merged commit 6e71b89 into master May 18, 2022
@ritchie46 ritchie46 deleted the list_struct branch May 18, 2022 06:54
moritzwilksch pushed a commit to moritzwilksch/polars that referenced this pull request May 29, 2022
* list namespace to polars-ops

* Improve list builders, iteration and construction

- Greatly improves performance of the list builders
 by: jorgecarleitao/arrow2#991

- List builders now also support nested dtypes like List and Struct

- Python DataFrame and Series constructor now support better nested dtype
construction

* fix tests and fix struct::agg_list