Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix lint perf regressions #105485

Merged
merged 3 commits into from
Dec 12, 2022
Merged

Conversation

nnethercote
Copy link
Contributor

#104863 caused small but widespread regressions in lint performance. I tried to improve things in #105291 and #105416 with minimal success, before fully understanding what caused the regression. This PR effectively reverts all of #105291 and part of #104863 to fix the perf regression.

r? @cjgillot

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 9, 2022
@nnethercote
Copy link
Contributor Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 9, 2022
@bors
Copy link
Contributor

bors commented Dec 9, 2022

⌛ Trying commit 33cc7df30e6e99df87ca5e82cd8877577717c872 with merge b6d121e1232514d0c98d173e0ab05c40684601ab...

@bors
Copy link
Contributor

bors commented Dec 9, 2022

☀️ Try build successful - checks-actions
Build commit: b6d121e1232514d0c98d173e0ab05c40684601ab (b6d121e1232514d0c98d173e0ab05c40684601ab)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (b6d121e1232514d0c98d173e0ab05c40684601ab): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.6% [-1.5%, -0.2%] 121
Improvements ✅
(secondary)
-0.9% [-2.5%, -0.1%] 95
All ❌✅ (primary) -0.6% [-1.5%, -0.2%] 121

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.8% [-1.0%, -0.6%] 2
All ❌✅ (primary) - - 0

Cycles

This benchmark run did not return any relevant results for this metric.

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 9, 2022
Comment on lines +303 to +305
struct RuntimeCombinedEarlyLintPass<'a> {
passes: &'a mut [EarlyLintPassObject],
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instezad of having separate branches for the builtin-only vs builtin+others, can something like this work?

Suggested change
struct RuntimeCombinedEarlyLintPass<'a> {
passes: &'a mut [EarlyLintPassObject],
}
struct RuntimeCombinedEarlyLintPass<'a, T: EarlyLintPass> {
builtin: T,
passes: &'a mut [EarlyLintPassObject],
}

This would require changing the macros, but we may have both the static dispatch for builtin lints and limit the complexity.

Copy link
Contributor Author

@nnethercote nnethercote Dec 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That won't help. Under the original structure, we had builtin_lints and passes, but passes is always for empty normal compilation. (It's non-empty when bootstrapping, because rustc has some rustc-specific lints, and it's also non-empty for clippy.)

I then combined builtin_lints into passes. This meant that even in the normal builtins-only cases, the code had to iterate over the passes slice for every check_* call. This is what caused the slowdown. It took me a while to work out because the iterator methods are inlined so it wasn't that obvious.

With this new design, which is very similar to the original design, if passes is empty then we avoid all iteration over the slice. This gives the speed wins.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pseudocode may make it clearer. Simplifying greatly, it's basically the difference between this:

for node in ast {
    builtin_lints.check_node(node);
}

and this:

let passes = vec![builtin_lints]
for node in ast {
    for pass in passes {
        pass.check_node(node);
    }
}

})*
}
)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an explanation why this commit has an effect on performance? Is this a matter of code size vs indirection vs inlining?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above: the last commit is the one that regains the performance, because it separates the builtin_lints handling from the passes handling, avoiding all slice iteration when passes is empty. The second last commit is preparatory work.

(And I forgot to mention that the first five commits are repeated from #105416, sorry.)

Comment on lines +403 to +409
if passes.is_empty() {
late_lint_crate_inner(tcx, context, builtin_lints);
} else {
passes.push(Box::new(builtin_lints));
let pass = RuntimeCombinedLateLintPass { passes: &mut passes[..] };
late_lint_crate_inner(tcx, context, pass);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the important bit: there are two different calls to late_lint_crate_inner. The builtins-only one is faster than the passes one, even if passes has just one pass, because it doesn't involve slice iteration for each check_foo call.

Another thing worth mentioning: this new structure is better than the old structure when passes is non-empty. The original structure involved two AST traversals, roughly:

late_lint_crate_inner(builtin_lints);
if !passes.is_empty() {
    late_lint_crate_inner(passes);
}

The new structure always involves one AST traversal:

if !passes.is_empty() {
    late_lint_crate_inner(builtin_lints);
} else {
    late_lint_crate_inner(bulitin_lints + passes);
}

So this should make bootstrapping and clippy slightly faster, though the difference may not be significant in practice.

@cjgillot
Copy link
Contributor

Thank you. r=me once #105416 lands.

This matches the name used in `late.rs`.
I removed these in rust-lang#105291, and subsequently learned they are necessary
for performance.

This commit reinstates them with the new and more descriptive names
`RuntimeCombined{Early,Late}LintPass`, similar to the existing passes
like `BuiltinCombinedEarlyLintPass`. It also adds some comments,
particularly emphasising how we have ways to combine passes at both
compile-time and runtime. And it moves some comments around.
This commit partly undoes rust-lang#104863, which combined the builtin lints pass
with other lints. This caused a slowdown, because often there are no
other lints, and it's faster to do a pass with a single lint directly
than it is to do a combined pass with a `passes` vector containing a
single lint.
@nnethercote nnethercote force-pushed the fix-lint-perf-regressions branch from 33cc7df to 4ff5a36 Compare December 11, 2022 22:48
@nnethercote
Copy link
Contributor Author

Thank you. r=me once #105416 lands.

#105416 has landed.

@bors r=cjgillot

@bors
Copy link
Contributor

bors commented Dec 11, 2022

📌 Commit 4ff5a36 has been approved by cjgillot

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 11, 2022
@bors
Copy link
Contributor

bors commented Dec 12, 2022

⌛ Testing commit 4ff5a36 with merge b397bc0...

@bors
Copy link
Contributor

bors commented Dec 12, 2022

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing b397bc0 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 12, 2022
@bors bors merged commit b397bc0 into rust-lang:master Dec 12, 2022
@rustbot rustbot added this to the 1.68.0 milestone Dec 12, 2022
@nnethercote nnethercote deleted the fix-lint-perf-regressions branch December 12, 2022 05:08
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (b397bc0): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.6% [-1.5%, -0.2%] 138
Improvements ✅
(secondary)
-0.8% [-2.5%, -0.2%] 93
All ❌✅ (primary) -0.6% [-1.5%, -0.2%] 138

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.4% [2.4%, 2.4%] 1
Regressions ❌
(secondary)
1.8% [1.8%, 1.8%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.4% [2.4%, 2.4%] 1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.2% [-1.6%, -0.9%] 2
Improvements ✅
(secondary)
-3.2% [-3.3%, -3.2%] 2
All ❌✅ (primary) -1.2% [-1.6%, -0.9%] 2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants