Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add assume into NonZeroIntX::get #119452

Merged

Conversation

AngelicosPhosphoros
Copy link
Contributor

@AngelicosPhosphoros AngelicosPhosphoros commented Dec 30, 2023

LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to #119422
Related to llvm/llvm-project#76628
Related to #49572

@rustbot
Copy link
Collaborator

rustbot commented Dec 30, 2023

r? @joshtriplett

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Dec 30, 2023
@AngelicosPhosphoros
Copy link
Contributor Author

AngelicosPhosphoros commented Dec 30, 2023

I would also like to check if we can just add valid range attribute to the function call.

It seems to work.

There is also a problem that get function gets inlined by MIR optimizer.

@AngelicosPhosphoros AngelicosPhosphoros force-pushed the make_nonzeroint_get_assume_nonzero branch from 5acf68b to b895a3e Compare December 30, 2023 20:50
@saethlin
Copy link
Member

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 30, 2023
@bors
Copy link
Contributor

bors commented Dec 30, 2023

⌛ Trying commit b895a3e with merge 5a3a8b4...

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 30, 2023
…get_assume_nonzero, r=<try>

Add assume into `NonZeroIntX::get`

LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang#119422
Related to llvm/llvm-project#76628
Related to rust-lang#49572
Comment on lines 114 to 119
if self.0 == 0 {
// SAFETY: It is an invariant of this type.
unsafe {
crate::hint::unreachable_unchecked()
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self.0 == 0 {
// SAFETY: It is an invariant of this type.
unsafe {
crate::hint::unreachable_unchecked()
}
}
unsafe { std::intrinsics::assume(self.0 != 0) };

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nilstrieb assume is not const-stable, so it can’t be called here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably should be const stable then, I don't see any problems with that (cc @rust-lang/wg-const-eval)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah seems fine to mark that intrinsic as callable from stable const fn

@bors
Copy link
Contributor

bors commented Dec 30, 2023

☀️ Try build successful - checks-actions
Build commit: 5a3a8b4 (5a3a8b4ee378f283b1e4916e12f60874e412486b)

@rust-timer

This comment has been minimized.

@scottmcm scottmcm self-assigned this Dec 31, 2023
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (5a3a8b4): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
10.1% [10.1%, 10.1%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-3.3% [-3.3%, -3.3%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 3.4% [-3.3%, 10.1%] 2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.4% [-2.6%, -2.1%] 2
All ❌✅ (primary) - - 0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.1% [0.0%, 0.3%] 5
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.8% [-2.0%, -0.1%] 3
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.2% [-2.0%, 0.3%] 8

Bootstrap: 669.431s -> 669.699s (0.04%)
Artifact size: 311.78 MiB -> 311.73 MiB (-0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 31, 2023
@Noratrieb
Copy link
Member

Doesn't make compilation or the compiler faster, but that of course doesn't mean it's not worth it. Seems reasonable to add this (with an assume) but looks like @scottmcm wants to do the final decision on that :)

@EFanZh
Copy link
Contributor

EFanZh commented Jan 1, 2024

@AngelicosPhosphoros: Could you also update the From<NonZero*> implementation to utilize this change?

@AngelicosPhosphoros AngelicosPhosphoros force-pushed the make_nonzeroint_get_assume_nonzero branch from b895a3e to 97f74e7 Compare January 2, 2024 16:23
@AngelicosPhosphoros
Copy link
Contributor Author

@EFanZh Done.

@scottmcm
Copy link
Member

scottmcm commented Jan 3, 2024

I agree that this should be using assume -- it'll save space in MIR, among other things. As Ralf said, if unreachable_unchecked is const-stable, there's no concern about assume being const-stable too.

And feel free to review if you notice it first, @Nilstrieb -- I just wanted to make sure it didn't end up waiting on a libs-api person who could be doing ACP reviews instead :)

@AngelicosPhosphoros
Copy link
Contributor Author

@Nilstrieb @RalfJung Do I need to run through whole stabilization process?
const_assume was started to be implemented in 2020 and seems to be inactive since: #76972

@Noratrieb
Copy link
Member

It's an intrinsic, just mark it as stable, that's it.

@RalfJung
Copy link
Member

RalfJung commented Jan 3, 2024

Yeah, the intrinsic is still unstable, just const-stable. So it cannot be called directly, just internally in stable const fn inside core/std. You can just add the attribute with wg-const-eval approval.

@scottmcm
Copy link
Member

scottmcm commented Jan 3, 2024

@AngelicosPhosphoros For context, the reason for this gating is to be careful that we don't accidentally expose something that couldn't be done in const fn before. But assume doesn't make anything possible that wasn't already possible with unreachable_unchecked, so there's no concerns for this particular one.

(The conversation would be more complicated if it was an intrinsic for getting a random number or reading a file or something.)

AngelicosPhosphoros added a commit to AngelicosPhosphoros/rust that referenced this pull request Jan 4, 2024
@AngelicosPhosphoros AngelicosPhosphoros marked this pull request as ready for review January 6, 2024 13:57
@scottmcm
Copy link
Member

The last perf run here looks good, but since the biggest reason this wasn't done before was perf concerns, I want to do another one as the code has changed, just from an abundance of caution.

@bors try @rust-timer queue

But r=me assuming the perf results are still good! 🤞

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 12, 2024
@bors
Copy link
Contributor

bors commented Jan 12, 2024

⌛ Trying commit 8f432d4 with merge 8a63333...

bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 12, 2024
…get_assume_nonzero, r=<try>

Add assume into `NonZeroIntX::get`

LLVM currently don't support range metadata for function arguments so it fails to optimize non zero integers using their invariant if they are provided using by-value function arguments.

Related to rust-lang#119422
Related to llvm/llvm-project#76628
Related to rust-lang#49572
@bors
Copy link
Contributor

bors commented Jan 12, 2024

☀️ Try build successful - checks-actions
Build commit: 8a63333 (8a63333b5c8e305f85f6968d7b857cc6bc94b154)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (8a63333): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.5% [0.4%, 0.6%] 3
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.7% [-0.7%, -0.7%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.2% [-0.7%, 0.6%] 4

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
3.0% [1.1%, 7.5%] 5
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-3.9% [-5.1%, -2.7%] 2
Improvements ✅
(secondary)
-0.9% [-0.9%, -0.9%] 1
All ❌✅ (primary) 1.1% [-5.1%, 7.5%] 7

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.6% [0.6%, 0.6%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.6% [-0.6%, -0.6%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.0% [-0.6%, 0.6%] 2

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 666.39s -> 663.51s (-0.43%)
Artifact size: 308.38 MiB -> 308.40 MiB (0.00%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jan 12, 2024
@AngelicosPhosphoros
Copy link
Contributor Author

Does that mean that using assume is different compared to unreachable_unchecked?

@scottmcm
Copy link
Member

scottmcm commented Jan 12, 2024

It's certainly different -- it changes the MIR, after all, such as fewer BBs by not having an if -- but to me these results are fine.

With more stuff in the methods, having the cycles being maybe slightly higher is, I think, ok. Looking at the overall wall time, including the non-significant ones, I think it's fair to call this roughly neutral. And 3 seconds saved on bootstrap is nothing to sneeze at!

@bors r+ rollup=never

@bors
Copy link
Contributor

bors commented Jan 12, 2024

📌 Commit 8f432d4 has been approved by scottmcm

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 12, 2024
@bors
Copy link
Contributor

bors commented Jan 12, 2024

⌛ Testing commit 8f432d4 with merge 2319be8...

@bors
Copy link
Contributor

bors commented Jan 12, 2024

☀️ Test successful - checks-actions
Approved by: scottmcm
Pushing 2319be8 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jan 12, 2024
@bors bors merged commit 2319be8 into rust-lang:master Jan 12, 2024
12 checks passed
@rustbot rustbot added this to the 1.77.0 milestone Jan 12, 2024
@AngelicosPhosphoros AngelicosPhosphoros deleted the make_nonzeroint_get_assume_nonzero branch January 12, 2024 23:25
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (2319be8): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.5% [0.4%, 0.9%] 4
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.7% [-0.7%, -0.7%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.3% [-0.7%, 0.9%] 5

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.7% [0.8%, 5.3%] 4
Regressions ❌
(secondary)
4.5% [4.5%, 4.5%] 1
Improvements ✅
(primary)
-1.0% [-1.3%, -0.5%] 3
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.1% [-1.3%, 5.3%] 7

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.5% [-0.7%, -0.4%] 3
Improvements ✅
(secondary)
-2.5% [-4.1%, -0.9%] 3
All ❌✅ (primary) -0.5% [-0.7%, -0.4%] 3

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 666.309s -> 665.459s (-0.13%)
Artifact size: 308.10 MiB -> 308.09 MiB (-0.00%)

@scottmcm
Copy link
Member

Instructions have a couple red in instruction counts for opt, but that's entirely reasonable for something intended to enable optimizations. Notably, the cycles are green, with no regressions. So I think this is fine.

@pnkfelix
Copy link
Member

  • scottmcm writes: "Instructions have a couple red in instruction counts for opt, but that's entirely reasonable for something intended to enable optimizations. Notably, the cycles are green, with no regressions. So I think this is fine."
  • marking as triaged

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Jan 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.