chore(perf): Improve perf of bulk loader with Reuse allocator and assigning tags to allocator (#7360) #7547

Merged
aman-bansal merged 3 commits into release/v20.11 from aman/bulk_loader_perf on Mar 23, 2021

Conversation

aman-bansal
Contributor

@aman-bansal aman-bansal commented Mar 11, 2021

This PR is a cherry-pick of #7360 and #7516. It fixes the slow reduce phase of the bulk loader, which was taking too long for two reasons: frequent GC (addressed by the allocator fix) and numerous runtime.Caller calls (addressed by the tags change).
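
As a rough illustration of the tags change, the sketch below contrasts deriving an allocator tag from the call site with passing the tag in explicitly. It is only a sketch under assumptions: `arena`, `newArenaAutoTag`, `newArenaWithTag`, and the tag string "Bulk.Reduce" are hypothetical names made up for this example and are not the z.Allocator API or the code in #7516. The only real library call is `runtime.Caller`.

```go
package main

import (
	"fmt"
	"runtime"
)

// arena is a hypothetical stand-in for a tagged allocator.
// It is NOT the z.Allocator type touched by this PR.
type arena struct {
	tag string
	buf []byte
}

// newArenaAutoTag derives the tag from the caller's file and line.
// runtime.Caller has to inspect the stack, so calling it once per
// allocator becomes expensive when allocators are created very often.
func newArenaAutoTag(size int) *arena {
	tag := "unknown"
	if _, file, line, ok := runtime.Caller(1); ok {
		tag = fmt.Sprintf("%s:%d", file, line)
	}
	return &arena{tag: tag, buf: make([]byte, 0, size)}
}

// newArenaWithTag takes the tag as an argument, so no stack
// inspection is needed at construction time.
func newArenaWithTag(size int, tag string) *arena {
	return &arena{tag: tag, buf: make([]byte, 0, size)}
}

func main() {
	auto := newArenaAutoTag(1024)
	explicit := newArenaWithTag(1024, "Bulk.Reduce") // hypothetical tag
	fmt.Println("auto tag:    ", auto.tag)
	fmt.Println("explicit tag:", explicit.tag)
}
```

The explicit variant trades a little convenience (the caller has to name the allocator) for a constructor that does no stack inspection.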

@github-actions github-actions bot added the area/bulk-loader Issues related to bulk loading. label Mar 11, 2021
manishrjain and others added 3 commits March 23, 2021 15:59
Instead of creating a new z.Allocator for every encoder, this PR reuses the allocator. On a 21M dataset, bulk loader takes 2m22s on master and only 1m35s on this PR. That is about a 33% reduction in run time (142s down to 95s), or roughly 1.5x faster.
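
The reuse pattern described in this commit message can be sketched roughly as follows. This is a minimal illustration under assumptions, not the bulk loader's code: `arena`, `encoder`, `encodeFresh`, and `encodeReused` are hypothetical names, and the toy bump allocator only stands in for z.Allocator.

```go
package main

import "fmt"

// arena is a hypothetical bump allocator used only for illustration.
// It is NOT z.Allocator.
type arena struct {
	buf []byte
	off int
}

func newArena(size int) *arena { return &arena{buf: make([]byte, size)} }

// allocate hands out n bytes from the backing buffer, growing it if needed.
func (a *arena) allocate(n int) []byte {
	if a.off+n > len(a.buf) {
		grown := make([]byte, 2*(a.off+n))
		copy(grown, a.buf[:a.off])
		a.buf = grown
	}
	out := a.buf[a.off : a.off+n : a.off+n]
	a.off += n
	return out
}

// reset makes the whole buffer available again without freeing it.
// Slices handed out before the reset must no longer be used.
func (a *arena) reset() { a.off = 0 }

// encoder is a hypothetical encoder that writes into an arena.
type encoder struct{ alloc *arena }

func (e *encoder) encode(s string) []byte {
	b := e.alloc.allocate(len(s))
	copy(b, s)
	return b
}

// encodeFresh creates a new arena for every batch, so every batch
// leaves a buffer behind for the garbage collector.
func encodeFresh(batches [][]string) (total, arenas int) {
	for _, batch := range batches {
		enc := &encoder{alloc: newArena(1 << 10)}
		arenas++
		for _, s := range batch {
			total += len(enc.encode(s))
		}
	}
	return total, arenas
}

// encodeReused creates one arena and resets it between batches.
func encodeReused(batches [][]string) (total, arenas int) {
	alloc := newArena(1 << 10)
	arenas = 1
	for _, batch := range batches {
		alloc.reset()
		enc := &encoder{alloc: alloc}
		for _, s := range batch {
			total += len(enc.encode(s))
		}
	}
	return total, arenas
}

func main() {
	batches := [][]string{{"a", "bc"}, {"def"}}
	t1, n1 := encodeFresh(batches)
	t2, n2 := encodeReused(batches)
	fmt.Println("fresh: ", t1, "bytes,", n1, "arenas")
	fmt.Println("reused:", t2, "bytes,", n2, "arenas")
}
```

Both variants encode the same bytes; the reused variant simply allocates one backing buffer instead of one per batch, which is the kind of change that reduces GC pressure.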
@aman-bansal aman-bansal force-pushed the aman/bulk_loader_perf branch from ebed678 to 304662e Compare March 23, 2021 10:34
Contributor

@NamanJain8 NamanJain8 left a comment

Would be good to mention the PRs it is cherry-picking. I believe we are doing 2 cherry-picks in this PR.

Contributor

@jarifibrahim jarifibrahim left a comment

Please run bulk loader locally to verify we're not breaking/affecting anything accidentally.

@aman-bansal aman-bansal merged commit 3ccc521 into release/v20.11 Mar 23, 2021
@aman-bansal aman-bansal deleted the aman/bulk_loader_perf branch March 23, 2021 13:14
@aman-bansal
Contributor Author

I have checked the basic bulk loader flow. I will ask QA for more comprehensive testing of the bulk loader.

Labels
area/bulk-loader Issues related to bulk loading.
4 participants