perf: reduce stack mem-move #1739

Closed
wants to merge 1 commit into from

Conversation

h-a-n-a
Collaborator

@h-a-n-a h-a-n-a commented Feb 2, 2023

Summary

Test Plan

Related issue (if exists)

How does Webpack handle this? (if exists)

Is this a workaround for the Webpack's implementation?

Check whether Webpack has the same feature but we are taking a workaround for it.

  • Yes. Issue for resolving the workaround:
  • No

Further reading

@changeset-bot

changeset-bot bot commented Feb 2, 2023

⚠️ No Changeset found

Latest commit: ba67ffe

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@h-a-n-a
Collaborator Author

h-a-n-a commented Feb 2, 2023

!bench

@github-actions
Contributor

github-actions bot commented Feb 2, 2023

Benchmark Results

group                                                 baseline                               pr
-----                                                 --------                               --
criterion_benchmark/css_heavy                         1.00     45.3±2.31ms        ? ?/sec    1.01     45.8±2.51ms        ? ?/sec
criterion_benchmark/lodash                            1.00     69.6±0.99ms        ? ?/sec    1.01     70.1±0.95ms        ? ?/sec
criterion_benchmark/ten_copy_of_threejs               1.00    729.4±5.53ms        ? ?/sec    1.01    738.1±5.61ms        ? ?/sec
high_cost_benchmark/ten_copy_of_threejs_production    1.00       4.7±0.01s        ? ?/sec    1.00       4.7±0.01s        ? ?/sec

@@ -34,7 +34,7 @@ impl Hash for Ast {
 impl Ast {
   pub fn new(root: SwcStylesheet, source_map: Arc<SourceMap>) -> Self {
     Self {
-      root,
+      root: box root,
Contributor

FYI: the box keyword / box pattern syntax is perma-unstable, see rust-lang/rust#49733 (comment)

Collaborator Author

I intended to use box to place the item on the heap directly. Some other places already use this syntax; if it changes, I will update them.
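
For reference, a minimal sketch of the same constructor on stable Rust, using Box::new instead of the unstable box keyword. Stylesheet and Ast here are hypothetical stand-ins, not the real types from the diff. Note that Box::new(root) semantically builds the value first and then moves it into the allocation; the optimizer frequently removes that intermediate copy, but this is not guaranteed.

```rust
// Hypothetical stand-in for the SWC stylesheet type used in the diff.
#[derive(Debug, Default, PartialEq)]
struct Stylesheet {
    rules: Vec<String>,
}

// Hypothetical stand-in for the `Ast` wrapper.
#[derive(Debug)]
struct Ast {
    // Boxed so that moving an `Ast` copies one pointer, not the whole node.
    root: Box<Stylesheet>,
}

impl Ast {
    fn new(root: Stylesheet) -> Self {
        // Stable spelling of `root: box root`.
        Self {
            root: Box::new(root),
        }
    }
}

fn main() {
    let ast = Ast::new(Stylesheet {
        rules: vec!["a { color: red }".to_string()],
    });
    println!("{} rule(s) in boxed root", ast.root.rules.len());
}
```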

@hyf0
Collaborator

hyf0 commented Feb 3, 2023

How do we measure the cost of stack mem-move?

@h-a-n-a
Collaborator Author

h-a-n-a commented Feb 3, 2023

> How do we measure the cost of stack mem-move?

@hyf0 In theory, it should improve performance for an AST that contains huge nodes.

@IWANABETHATGUY
Contributor

IWANABETHATGUY commented Feb 3, 2023

According to the Rust Performance Book:

> Furthermore, Rust types that are larger than 128 bytes are copied with memcpy rather than inline code. If memcpy shows up in non-trivial amounts in profiles, DHAT’s “copy profiling” mode will tell you exactly where the hot memcpy calls are and the types involved. Shrinking these types to 128 bytes or less can make the code faster by avoiding memcpy calls and reducing memory traffic.

This optimization is only meaningful for structs larger than 128 bytes. Would you mind double-checking the size of the struct?
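
One way to do that check is std::mem::size_of. The sketch below uses a hypothetical BigNode type and a hypothetical helper, exceeds_memcpy_threshold, purely for illustration; the 128-byte figure is taken from the quote above, and the real type to inspect would be the stylesheet root from the diff.

```rust
use std::mem::size_of;

// Hypothetical node type, used only to illustrate the check.
#[allow(dead_code)]
struct BigNode {
    payload: [u64; 32], // 32 * 8 = 256 bytes of inline data
    span: (u32, u32),
}

// Threshold quoted from the performance book passage above.
const MEMCPY_THRESHOLD: usize = 128;

// Hypothetical helper: true if moving a `T` by value copies more than
// the quoted threshold.
fn exceeds_memcpy_threshold<T>() -> bool {
    size_of::<T>() > MEMCPY_THRESHOLD
}

fn main() {
    println!("BigNode:      {} bytes", size_of::<BigNode>());
    println!("Box<BigNode>: {} bytes", size_of::<Box<BigNode>>());
    println!(
        "BigNode exceeds threshold: {}",
        exceeds_memcpy_threshold::<BigNode>()
    );
}
```

Swapping BigNode for the actual type shows whether boxing it could matter at all.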

@h-a-n-a
Collaborator Author

h-a-n-a commented Feb 3, 2023

> According to the Rust Performance Book:
>
> > Furthermore, Rust types that are larger than 128 bytes are copied with memcpy rather than inline code. If memcpy shows up in non-trivial amounts in profiles, DHAT’s “copy profiling” mode will tell you exactly where the hot memcpy calls are and the types involved. Shrinking these types to 128 bytes or less can make the code faster by avoiding memcpy calls and reducing memory traffic.
>
> This optimization is only meaningful for structs larger than 128 bytes. Would you mind double-checking the size of the struct?

@IWANABETHATGUY I missed the fact that almost every huge downstream node is already boxed, which makes this optimization trivial.
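
A small sketch of why that is the case, assuming hypothetical types (HugeRule, InlineNode, BoxedNode, Root) rather than the actual SWC AST: once the large children sit behind a Box, the parent stores only pointers and collection headers, so the parent itself ends up well under the 128-byte threshold and boxing it again saves little.

```rust
use std::mem::size_of;

// Hypothetical large leaf node; not an actual SWC type.
#[allow(dead_code)]
struct HugeRule {
    data: [u64; 64], // 512 bytes of inline data
}

// Payload stored inline: the enum is at least as large as its biggest variant.
#[allow(dead_code)]
enum InlineNode {
    Rule(HugeRule),
    Empty,
}

// Payload stored behind a pointer: the enum only holds the pointer.
#[allow(dead_code)]
enum BoxedNode {
    Rule(Box<HugeRule>),
    Empty,
}

// A root that owns already-boxed children holds just a Vec header and an id.
#[allow(dead_code)]
struct Root {
    nodes: Vec<BoxedNode>,
    id: u32,
}

fn main() {
    println!("InlineNode: {} bytes", size_of::<InlineNode>());
    println!("BoxedNode:  {} bytes", size_of::<BoxedNode>());
    println!("Root:       {} bytes", size_of::<Root>());
}
```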

@h-a-n-a h-a-n-a closed this Feb 3, 2023
@IWANABETHATGUY IWANABETHATGUY deleted the perf/stack branch February 3, 2023 08:22
4 participants