Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

missed optimization for if let, even though equivalent match is optimized #110097

Closed
davemilter opened this issue Apr 8, 2023 · 3 comments · Fixed by #120268
Closed

missed optimization for if let, even though equivalent match is optimized #110097

davemilter opened this issue Apr 8, 2023 · 3 comments · Fixed by #120268
Assignees
Labels
A-mir-opt Area: MIR optimizations C-bug Category: This is a bug. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@davemilter
Copy link

davemilter commented Apr 8, 2023

I tried this code:

pub fn change_value(val: Result<i32, ()>) -> Result<i32, ()> {
    if let Ok(x) = val {
        Ok(x * 2)
    } else {
        Err(())
    }
}

pub fn change_value2(val: Result<i32, ()>) -> Result<i32, ()> {
    match val {
        Ok(x) => Ok(x * 2),
        Err(()) => Err(())
    }
}

I expected to see this happen:
in release build this code would translated into exact the same assembly/llvm-ir

Instead, this happened:

change_value:
	xorl	%eax, %eax
	testl	%edi, %edi
	setne	%al
	leal	(%rsi,%rsi), %edx
	retq
change_value2:
	movl	%edi, %eax
	leal	(%rsi,%rsi), %edx
	retq

and the llvm-ir is different:

define { i32, i32 } @_ZN10playground12change_value17h0230bd4091fd6529E(i32 noundef %0, i32 %1) unnamed_addr #0 {
start:
  %2 = icmp eq i32 %0, 0
  %_4 = shl i32 %1, 1
  %not. = xor i1 %2, true
  %.sroa.0.0 = zext i1 %not. to i32
  %.sroa.3.0 = select i1 %2, i32 %_4, i32 undef
  %3 = insertvalue { i32, i32 } undef, i32 %.sroa.0.0, 0
  %4 = insertvalue { i32, i32 } %3, i32 %.sroa.3.0, 1
  ret { i32, i32 } %4
}
define { i32, i32 } @_ZN10playground13change_value217h3302fbc2853d0a41E(i32 noundef %0, i32 %1) unnamed_addr #0 {
start:
  %switch = icmp eq i32 %0, 0
  %_4 = shl i32 %1, 1
  %.sroa.3.0 = select i1 %switch, i32 %_4, i32 undef
  %2 = insertvalue { i32, i32 } undef, i32 %0, 0
  %3 = insertvalue { i32, i32 } %2, i32 %.sroa.3.0, 1
  ret { i32, i32 } %3
}

Meta

I tried thin in rust-playground for 1.68.2, beta and nightly.

See also https://internals.rust-lang.org/t/why-match-is-better-then-if/18636

@davemilter davemilter added the C-bug Category: This is a bug. label Apr 8, 2023
@jyn514
Copy link
Member

jyn514 commented Apr 9, 2023

To be clear, is the bug that they are different? or that the first one has a missed optimization? we don't guarantee at the language level that if is lowered to match.

@jyn514 jyn514 added I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 9, 2023
@steffahn
Copy link
Member

steffahn commented Apr 9, 2023

As far as I know from the linked IRLO discussion, the main point is that it's a missed optimization. Besides the already posted information, looking at the MIR is also instructional, as things like looking at the case where match is used with a _ => pattern instead of Err(()) =>, or comparing the behavior here (where a 2-variant enum discriminant is used) with how similar code using a bool would be optimized. I will not reproduce the whole discussion here, but it's worth a look.

@jyn514 jyn514 changed the title match and if converted to different llvm-ir missed optimization for if, even though equivalent match is optimized Apr 9, 2023
@jyn514 jyn514 changed the title missed optimization for if, even though equivalent match is optimized missed optimization for if let, even though equivalent match is optimized Apr 9, 2023
@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
@DianQK
Copy link
Member

DianQK commented Feb 28, 2024

@rustbot label +A-mir-opt
@rustbot claim

@rustbot rustbot added the A-mir-opt Area: MIR optimizations label Feb 28, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 28, 2024
…tchs, r=<try>

Replace the default branch with an unreachable branch If it is the last variant

Fixes rust-lang#119520. Fixes rust-lang#110097.

LLVM currently has limited ability to eliminate dead branches in switches, even with the patch of llvm/llvm-project#73446.

The main reasons are as follows:

- Additional costs are required to calculate the range of values, and there exist many scenarios that cannot be analyzed accurately.
- Matching values by bitwise calculation cannot handle odd branches, nor can it handle values like `-1, 0, 1`. See [SimplifyCFG.cpp#L5424](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L5424) and https://llvm.godbolt.org/z/qYMqhvMa8
- The current range information is continuous, even if the metadata for the range is submitted. See [ConstantRange.cpp#L1869-L1870](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/IR/ConstantRange.cpp#L1869-L1870).
- The metadata of the range may be lost in passes such as SROA. See https://rust.godbolt.org/z/e7f87vKMK.

Although we can make improvements, I think it would be more appropriate to put this issue to rustc first. After all, we can easily know the possible values.

Note that we've currently found a slow compilation problem in the presence of unreachable branches. See
llvm/llvm-project#78578.

r? compiler
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 29, 2024
…tchs, r=<try>

Replace the default branch with an unreachable branch If it is the last variant

Fixes rust-lang#119520. Fixes rust-lang#110097.

LLVM currently has limited ability to eliminate dead branches in switches, even with the patch of llvm/llvm-project#73446.

The main reasons are as follows:

- Additional costs are required to calculate the range of values, and there exist many scenarios that cannot be analyzed accurately.
- Matching values by bitwise calculation cannot handle odd branches, nor can it handle values like `-1, 0, 1`. See [SimplifyCFG.cpp#L5424](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L5424) and https://llvm.godbolt.org/z/qYMqhvMa8
- The current range information is continuous, even if the metadata for the range is submitted. See [ConstantRange.cpp#L1869-L1870](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/IR/ConstantRange.cpp#L1869-L1870).
- The metadata of the range may be lost in passes such as SROA. See https://rust.godbolt.org/z/e7f87vKMK.

Although we can make improvements, I think it would be more appropriate to put this issue to rustc first. After all, we can easily know the possible values.

Note that we've currently found a slow compilation problem in the presence of unreachable branches. See
llvm/llvm-project#78578.

r? compiler
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 29, 2024
…tchs, r=<try>

Replace the default branch with an unreachable branch If it is the last variant

Fixes rust-lang#119520. Fixes rust-lang#110097.

LLVM currently has limited ability to eliminate dead branches in switches, even with the patch of llvm/llvm-project#73446.

The main reasons are as follows:

- Additional costs are required to calculate the range of values, and there exist many scenarios that cannot be analyzed accurately.
- Matching values by bitwise calculation cannot handle odd branches, nor can it handle values like `-1, 0, 1`. See [SimplifyCFG.cpp#L5424](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L5424) and https://llvm.godbolt.org/z/qYMqhvMa8
- The current range information is continuous, even if the metadata for the range is submitted. See [ConstantRange.cpp#L1869-L1870](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/IR/ConstantRange.cpp#L1869-L1870).
- The metadata of the range may be lost in passes such as SROA. See https://rust.godbolt.org/z/e7f87vKMK.

Although we can make improvements, I think it would be more appropriate to put this issue to rustc first. After all, we can easily know the possible values.

Note that we've currently found a slow compilation problem in the presence of unreachable branches. See
llvm/llvm-project#78578.

r? compiler
bors added a commit to rust-lang-ci/rust that referenced this issue Mar 8, 2024
…tchs, r=<try>

Replace the default branch with an unreachable branch If it is the last variant

Fixes rust-lang#119520. Fixes rust-lang#110097.

LLVM currently has limited ability to eliminate dead branches in switches, even with the patch of llvm/llvm-project#73446.

The main reasons are as follows:

- Additional costs are required to calculate the range of values, and there exist many scenarios that cannot be analyzed accurately.
- Matching values by bitwise calculation cannot handle odd branches, nor can it handle values like `-1, 0, 1`. See [SimplifyCFG.cpp#L5424](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L5424) and https://llvm.godbolt.org/z/qYMqhvMa8
- The current range information is continuous, even if the metadata for the range is submitted. See [ConstantRange.cpp#L1869-L1870](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/IR/ConstantRange.cpp#L1869-L1870).
- The metadata of the range may be lost in passes such as SROA. See https://rust.godbolt.org/z/e7f87vKMK.

Although we can make improvements, I think it would be more appropriate to put this issue to rustc first. After all, we can easily know the possible values.

Note that we've currently found a slow compilation problem in the presence of unreachable branches. See
llvm/llvm-project#78578.

r? compiler
@bors bors closed this as completed in 14fbc3c Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-mir-opt Area: MIR optimizations C-bug Category: This is a bug. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants