Redundant branches with ctlz and cttz #47467
We expand the intrinsics in -codegenprepare, and I'm not sure where we would solve this. machine-cse seems like the most likely candidate, but it would require tracking eflags state across basic blocks. Not sure if we do that:
TEST64rr %5, %5, implicit-def $eflags
JCC_1 %bb.2, 4, implicit $eflags

IR going into SDAG:
define zeroext i1 @_ZN10playground20can_represent_as_f6417h8c9d47bab619cb5fE(i64 %x) unnamed_addr {
start:
%cmpz = icmp eq i64 %x, 0
br i1 %cmpz, label %cond.end, label %cond.false
cond.false: ; preds = %start
%0 = tail call i64 @llvm.ctlz.i64(i64 %x, i1 true)
br label %cond.end
cond.end: ; preds = %start, %cond.false
%ctz = phi i64 [ 64, %start ], [ %0, %cond.false ]
%1 = trunc i64 %ctz to i32
%cmpz3 = icmp eq i64 %x, 0
br i1 %cmpz3, label %cond.end2, label %cond.false1
cond.false1: ; preds = %cond.end
%2 = tail call i64 @llvm.cttz.i64(i64 %x, i1 true)
br label %cond.end2
cond.end2: ; preds = %cond.end, %cond.false1
%ctz4 = phi i64 [ 64, %cond.end ], [ %2, %cond.false1 ]
%3 = trunc i64 %ctz4 to i32
%_2 = add nuw nsw i32 %1, %3
%4 = icmp ugt i32 %_2, 10
ret i1 %4
}
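For reference, a Rust function of roughly the following shape would lower to IR like the above. This is a reconstruction inferred from the IR (the `ctlz`/`cttz` calls, the `add`, and the `ugt 10` comparison), not necessarily the reporter's exact source; the parameter type `u64` is assumed from the `i64` argument.

```rust
/// Sketch reconstructed from the IR: true when the leading and trailing
/// zero counts of `x` sum to more than 10, meaning the significant bits
/// of `x` span at most 53 positions (the precision of an f64 mantissa).
pub fn can_represent_as_f64(x: u64) -> bool {
    // Each call becomes a ctlz/cttz intrinsic. On targets without
    // lzcnt/tzcnt, codegenprepare guards each one with its own
    // `x == 0` check, which produces the two redundant branches.
    x.leading_zeros() + x.trailing_zeros() > 10
}
```

For example, `1 << 53` has 10 leading and 53 trailing zeros and passes, while `(1 << 53) + 1` has 10 leading and 0 trailing zeros and fails.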
Note that this should not be an issue when compiling for more recent x86:
After #102885 the branches are now avoided on any x86 target with cmov:
playground::can_represent_as_f64::h8c9d47bab619cb5f: # @playground::can_represent_as_f64::h8c9d47bab619cb5f
bsrq %rdi, %rax
movl $127, %ecx
cmovneq %rax, %rcx
xorl $63, %ecx
bsfq %rdi, %rax
movl $64, %edx
cmovneq %rax, %rdx
addl %ecx, %edx
cmpl $11, %edx
setae %al
retq
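The `movl $127` / `cmovneq` / `xorl $63` sequence can be read as follows: `bsrq` yields the index of the highest set bit, and XOR with 63 turns an index in 0..=63 into a leading-zero count. For a zero input, the substituted value 127 gives 127 ^ 63 = 64. A minimal Rust model of that arithmetic, where `ctlz_via_bsr` is a hypothetical name and `checked_ilog2` stands in for `bsrq` plus the zero check:

```rust
/// Models the branchless ctlz sequence from the assembly above.
/// `ctlz_via_bsr` is an illustrative name, not an LLVM or Rust API.
pub fn ctlz_via_bsr(x: u64) -> u32 {
    // checked_ilog2 returns the index of the highest set bit, or None
    // for zero; unwrap_or(127) plays the role of movl $127 + cmovneq.
    let idx = x.checked_ilog2().unwrap_or(127);
    // For idx in 0..=63, idx ^ 63 == 63 - idx; for 127 the result is 64.
    idx ^ 63
}
```

Under these assumptions the function agrees with `u64::leading_zeros` for every input, including zero.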
Extended Description
Rust code:
LLVM IR:
Assembly:
Instead of performing the comparison twice, the code should immediately branch to LBB0_4.