Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo@O2] Incorrect debug info for escaped variables after LowerDbgDeclare runs #48063

Open
OCHyams opened this issue Jan 11, 2021 · 5 comments
Labels
bugzilla Issues migrated from bugzilla debuginfo llvm:optimizations

Comments

@OCHyams
Copy link
Contributor

OCHyams commented Jan 11, 2021

Bugzilla Link 48719
Version trunk
OS All
Blocks #38116
CC @gregbedwell,@jmorse,@jdm

Extended Description

TL;DR

llvm's debug info model is ineffective for variables which cannot be promoted in the first round of SROA/mem2reg and we can see incorrect debug info being generated as a result.

Example

clang built at 1ecae1e (10th Jan 2021).

We can get incorrect debug info for variables which are promoted (or partially promoted) after InstCombine runs LowerDbgDeclare. I think this is a general problem with the debug info model which manifests in more than one place (see summary at end), but here is a specific example first:

$ cat -n test.c
     1	void esc(int* p);
     2	void fluff();
     3	int fun(int first, int second) {
     4	  if (first)
     5	    first = second;
     6	  fluff();
     7	  esc(&first); //< first is escaped here.
     8	  return 0;
     9	}

Compile it with the following command and look at the IR:

$ clang -O2 -g -emit-llvm -S -mllvm -print-after-all test.c -o -

    define dso_local i32 @&#8203;fun(i32 %first, i32 %second) local_unnamed_addr #&#8203;0 !dbg !&#8203;7 {
    entry:
      %first.addr = alloca i32, align 4
      call void @&#8203;llvm.dbg.value(metadata i32 %first, metadata !&#8203;12, metadata !DIExpression()), !dbg !&#8203;14
      call void @&#8203;llvm.dbg.value(metadata i32 %second, metadata !&#8203;13, metadata !DIExpression()), !dbg !&#8203;14
      %tobool.not = icmp eq i32 %first, 0, !dbg !&#8203;15
      %spec.select = select i1 %tobool.not, i32 0, i32 %second, !dbg !&#8203;17
      store i32 %spec.select, i32* %first.addr, align 4, !tbaa !&#8203;18
      tail call void (...) @&#8203;fluff() #&#8203;3, !dbg !&#8203;22
      call void @&#8203;llvm.dbg.value(metadata i32* %first.addr, metadata !&#8203;12, metadata !DIExpression(DW_OP_deref)), !dbg !&#8203;14
      call void @&#8203;esc(i32* nonnull %first.addr) #&#8203;3, !dbg !&#8203;23
      ret i32 0, !dbg !&#8203;24
    }
    ...
    !&#8203;12 = !DILocalVariable(name: "first", arg: 1, scope: !&#8203;7, file: !&#8203;1, line: 3, type: !&#8203;10)
    !&#8203;13 = !DILocalVariable(name: "second", arg: 2, scope: !&#8203;7, file: !&#8203;1, line: 3, type: !&#8203;10)

Notice that the second dbg.value for "first" (#12) unexpectedly comes after the call to "fluff()". If you run this example through a debugger you will see "first=5" on line 6 when you'd expect "first=20". The dbg.value should instead be positioned immediately after the select instruction.

Example walkthrough

In the example "first" is escaped so it is not promoted by the early SROA/mem2reg pass. InstCombine will partially promote it by merging the stores (the initial store and the store on the if-then branch) into their common successor by inserting a PHI and store. Later SimplifyCFG folds all of the blocks together, turning the PHI into a select. InstCombine doesn't insert a dbg.value after the PHI and SimplifyCFG doesn't insert one when converting the PHI to a select, so the merged value is untracked in debug info.

We can see what happened by looking at the -print-after-all output. SROA/mem2reg runs early and is not able to promote "first" because it is escaped. So "first"'s alloca and dbg.declare survive:

    *** IR Dump After SROA ***
    define dso_local i32 @&#8203;fun(i32 %first, i32 %second) #&#8203;0 !dbg !&#8203;7 {
    entry:
      %first.addr = alloca i32, align 4
      store i32 %first, i32* %first.addr, align 4, !tbaa !&#8203;14
      call void @&#8203;llvm.dbg.declare(metadata i32* %first.addr, metadata !&#8203;12, metadata !DIExpression()), !dbg !&#8203;18
      call void @&#8203;llvm.dbg.value(metadata i32 %second, metadata !&#8203;13, metadata !DIExpression()), !dbg !&#8203;19
      %0 = load i32, i32* %first.addr, align 4, !dbg !&#8203;20, !tbaa !&#8203;14
      %tobool = icmp ne i32 %0, 0, !dbg !&#8203;20
      br i1 %tobool, label %if.then, label %if.end, !dbg !&#8203;22

    if.then:                                          ; preds = %entry
      store i32 %second, i32* %first.addr, align 4, !dbg !&#8203;23, !tbaa !&#8203;14
      br label %if.end, !dbg !&#8203;24

    if.end:                                           ; preds = %if.then, %entry
      call void (...) @&#8203;fluff(), !dbg !&#8203;25
      call void @&#8203;esc(i32* %first.addr), !dbg !&#8203;26
      ret i32 0, !dbg !&#8203;27
    }
    ...
    !&#8203;12 = !DILocalVariable(name: "first", arg: 1, scope: !&#8203;7, file: !&#8203;1, line: 3, type: !&#8203;10)
    !&#8203;13 = !DILocalVariable(name: "second", arg: 2, scope: !&#8203;7, file: !&#8203;1, line: 3, type: !&#8203;10)

Later, InstCombine runs and before doing any optimisations it calls LowerDbgDeclare to convert the dbg.declare into a set of dbg.value intrinsics:

    ### IR after LowerDbgDeclare runs but before InstCombine starts combining things ###
    define dso_local i32 @&#8203;fun(i32 %first, i32 %second) local_unnamed_addr #&#8203;0 !dbg !&#8203;7 {
    entry:
      %first.addr = alloca i32, align 4
      call void @&#8203;llvm.dbg.value(metadata i32 %first, metadata !&#8203;12, metadata !DIExpression()), !dbg !&#8203;14
      store i32 %first, i32* %first.addr, align 4, !tbaa !&#8203;15
      call void @&#8203;llvm.dbg.value(metadata i32 %second, metadata !&#8203;13, metadata !DIExpression()), !dbg !&#8203;14
      %tobool = icmp ne i32 %first, 0, !dbg !&#8203;19
      br i1 %tobool, label %if.then, label %if.end, !dbg !&#8203;21

    if.then:                                          ; preds = %entry
      call void @&#8203;llvm.dbg.value(metadata i32 %second, metadata !&#8203;12, metadata !DIExpression()), !dbg !&#8203;14
      store i32 %second, i32* %first.addr, align 4, !dbg !&#8203;22, !tbaa !&#8203;15
      br label %if.end, !dbg !&#8203;23

    if.end:                                           ; preds = %if.then, %entry
      call void (...) @&#8203;fluff(), !dbg !&#8203;24
      call void @&#8203;llvm.dbg.value(metadata i32* %first.addr, metadata !&#8203;12, metadata !DIExpression(DW_OP_deref)), !dbg !&#8203;14
      call void @&#8203;esc(i32* %first.addr), !dbg !&#8203;25
      ret i32 0, !dbg !&#8203;26
    }
    ...
    !&#8203;12 = !DILocalVariable(name: "first", arg: 1, scope: !&#8203;7, file: !&#8203;1, line: 3, type: !&#8203;10)
    !&#8203;13 = !DILocalVariable(name: "second", arg: 2, scope: !&#8203;7, file: !&#8203;1, line: 3, type: !&#8203;10)

InstCombine merges the two stores to %first.addr ("first"'s alloca address) into the common successor "if.end" as a PHI and store. Notice how it does not insert a dbg.value to describe the merged value %storemerge.

*** IR Dump After Combine redundant instructions ***
    define dso_local i32 @&#8203;fun(i32 %first, i32 %second) local_unnamed_addr #&#8203;0 !dbg !&#8203;7 {
    entry:
      %first.addr = alloca i32, align 4
      call void @&#8203;llvm.dbg.value(metadata i32 %first, metadata !&#8203;12, metadata !DIExpression()), !dbg !&#8203;14
      call void @&#8203;llvm.dbg.value(metadata i32 %second, metadata !&#8203;13, metadata !DIExpression()), !dbg !&#8203;14
      %tobool.not = icmp eq i32 %first, 0, !dbg !&#8203;15
      br i1 %tobool.not, label %if.end, label %if.then, !dbg !&#8203;17

    if.then:                                          ; preds = %entry
      call void @&#8203;llvm.dbg.value(metadata i32 %second, metadata !&#8203;12, metadata !DIExpression()), !dbg !&#8203;14
      br label %if.end, !dbg !&#8203;18

    if.end:                                           ; preds = %if.then, %entry
      %storemerge = phi i32 [ %second, %if.then ], [ %first, %entry ]
      store i32 %storemerge, i32* %first.addr, align 4, !tbaa !&#8203;19
      call void (...) @&#8203;fluff() #&#8203;3, !dbg !&#8203;23
      call void @&#8203;llvm.dbg.value(metadata i32* %first.addr, metadata !&#8203;12, metadata !DIExpression(DW_OP_deref)), !dbg !&#8203;14
      call void @&#8203;esc(i32* nonnull %first.addr) #&#8203;3, !dbg !&#8203;24
      ret i32 0, !dbg !&#8203;25
    }
    ...
    !&#8203;12 = !DILocalVariable(name: "first", arg: 1, scope: !&#8203;7, file: !&#8203;1, line: 3, type: !&#8203;10)
    !&#8203;13 = !DILocalVariable(name: "second", arg: 2, scope: !&#8203;7, file: !&#8203;1, line: 3, type: !&#8203;10)

SimplifyCFG folds these blocks together, converting the PHI to a select, and leaves us with the incorrect debug info shown earlier:

    *** IR Dump After Simplify the CFG ***
    define dso_local i32 @&#8203;fun(i32 %first, i32 %second) local_unnamed_addr #&#8203;0 !dbg !&#8203;7 {
    entry:
      %first.addr = alloca i32, align 4
      call void @&#8203;llvm.dbg.value(metadata i32 %first, metadata !&#8203;12, metadata !DIExpression()), !dbg !&#8203;14
      call void @&#8203;llvm.dbg.value(metadata i32 %second, metadata !&#8203;13, metadata !DIExpression()), !dbg !&#8203;14
      %tobool.not = icmp eq i32 %first, 0, !dbg !&#8203;15
      %spec.select = select i1 %tobool.not, i32 %first, i32 %second, !dbg !&#8203;17
      store i32 %spec.select, i32* %first.addr, align 4, !tbaa !&#8203;18
      call void (...) @&#8203;fluff() #&#8203;3, !dbg !&#8203;22
      call void @&#8203;llvm.dbg.value(metadata i32* %first.addr, metadata !&#8203;12, metadata !DIExpression(DW_OP_deref)), !dbg !&#8203;14
      call void @&#8203;esc(i32* nonnull %first.addr) #&#8203;3, !dbg !&#8203;23
      ret i32 0, !dbg !&#8203;24
    }
    ...
    !&#8203;12 = !DILocalVariable(name: "first", arg: 1, scope: !&#8203;7, file: !&#8203;1, line: 3, type: !&#8203;10)
    !&#8203;13 = !DILocalVariable(name: "second", arg: 2, scope: !&#8203;7, file: !&#8203;1, line: 3, type: !&#8203;10)

Summary

This example shows that InstCombine and SimplifyCFG compose to produce incorrect debug info for escaped variables. However, the reason that I think this is a general problem rather than a specific bug, and why I am struggling to hold any one pass accountable, is because of the following inconsistencies:

If "first" in this example can be fully promoted by mem2reg (https://godbolt.org/z/Es89z7) then there is no issue because mem2reg will insert a dbg.value after the PHI which eventually becomes the select.

However, mem2reg can also cause the same bug because it only inserts dbg.values when promoting a variable if and only if it has a dbg.declare. It's possible for LowerDbgDeclare to remove the dbg.declare before mem2reg runs. For example, when a variable is escaped but the escaping function is later inlined. You can see this happening here https://godbolt.org/z/KrdjGs.

Lastly, we don't see this issue in any of these cases if there isn't any block folding because LiveDebugValues will merge live-out variable locations from preds (https://godbolt.org/z/sjcnbv).

As far as I can tell there are no rules or examples demonstrating how debug info should be updated when promoting or partially promoting variables after LowerDbgDeclare has removed the dbg.declare.

@OCHyams
Copy link
Contributor Author

OCHyams commented Jan 15, 2021

I've written a few of these cases as dexter tests in D94761. unused-merged-value.c in that patch demonstrates how we are able to run into this issue for non-escaped variables too.

@OCHyams
Copy link
Contributor Author

OCHyams commented Jan 29, 2021

Oops, I said:

If you run this example through a debugger you will see "first=5" on line 6 when you'd expect "first=20".
but the example isn't linkable and doesn't include a call to the troublesome function. Here is the executable reproducer which gives the necessary context:

$ cat test.c
attribute((optnone)) void esc(int* p) {}
attribute((optnone)) void fluff() {}
attribute((noinline)) int fun(int first, int second) {
if (first)
first = second;
fluff();
esc(&first);
return 0;
}
int main() {
return fun(5, 20);
}

$ clang -O2 -g test.c -o test

$ lldb test
(lldb) target create "test"
Current executable set to 'test' (x86_64).
(lldb) b fun
Breakpoint 1: where = test`fun + 1 at test.c:4:7, address = 0x00000000004004a1
(lldb) run
Process 13431 launched: 'test' (x86_64)
Process 13431 stopped

  • thread #​1, name = 'test', stop reason = breakpoint 1.1
    frame #​0: 0x00000000004004a1 test`fun(first=5, second=20) at test.c:4:7
    1 attribute((optnone)) void esc(int* p) {}
    2 attribute((optnone)) void fluff() {}
    3 attribute((noinline)) int fun(int first, int second) {
    -> 4 if (first)
    5 first = second;
    6 fluff();
    7 esc(&first);
    (lldb) s
    Process 13431 stopped
  • thread #​1, name = 'test', stop reason = step in
    frame #​0: 0x00000000004004aa test`fun(first=5, second=20) at test.c:6:3
    3 attribute((noinline)) int fun(int first, int second) {
    4 if (first)
    5 first = second;
    -> 6 fluff();
    7 esc(&first);
    8 return 0;
    9 }

Notice that we see first=5 on line 6, which is wrong. We should see first=20 because the condition (first != 0) guarding the assignment (first = second) was true.

@llvmbot
Copy link
Member

llvmbot commented Jul 5, 2024

@llvm/issue-subscribers-debuginfo

Author: Orlando Cazalet-Hyams (OCHyams)

| | | | --- | --- | | Bugzilla Link | [48719](https://llvm.org/bz48719) | | Version | trunk | | OS | All | | Blocks | llvm/llvm-project#38116 | | CC | @gregbedwell,@jmorse,@jdm |

Extended Description

TL;DR

llvm's debug info model is ineffective for variables which cannot be promoted in the first round of SROA/mem2reg and we can see incorrect debug info being generated as a result.

Example

clang built at 1ecae1e (10th Jan 2021).

We can get incorrect debug info for variables which are promoted (or partially promoted) after InstCombine runs LowerDbgDeclare. I think this is a general problem with the debug info model which manifests in more than one place (see summary at end), but here is a specific example first:

$ cat -n test.c
1 void esc(int* p);
2 void fluff();
3 int fun(int first, int second) {
4 if (first)
5 first = second;
6 fluff();
7 esc(&first); //< first is escaped here.
8 return 0;
9 }

Compile it with the following command and look at the IR:
$ clang -O2 -g -emit-llvm -S -mllvm -print-after-all test.c -o -

define dso_local i32 @&amp;#<!-- -->8203;fun(i32 %first, i32 %second) local_unnamed_addr #&amp;#<!-- -->8203;0 !dbg !&amp;#<!-- -->8203;7 {
entry:
  %first.addr = alloca i32, align 4
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32 %first, metadata !&amp;#<!-- -->8203;12, metadata !DIExpression()), !dbg !&amp;#<!-- -->8203;14
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32 %second, metadata !&amp;#<!-- -->8203;13, metadata !DIExpression()), !dbg !&amp;#<!-- -->8203;14
  %tobool.not = icmp eq i32 %first, 0, !dbg !&amp;#<!-- -->8203;15
  %spec.select = select i1 %tobool.not, i32 0, i32 %second, !dbg !&amp;#<!-- -->8203;17
  store i32 %spec.select, i32* %first.addr, align 4, !tbaa !&amp;#<!-- -->8203;18
  tail call void (...) @&amp;#<!-- -->8203;fluff() #&amp;#<!-- -->8203;3, !dbg !&amp;#<!-- -->8203;22
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32* %first.addr, metadata !&amp;#<!-- -->8203;12, metadata !DIExpression(DW_OP_deref)), !dbg !&amp;#<!-- -->8203;14
  call void @&amp;#<!-- -->8203;esc(i32* nonnull %first.addr) #&amp;#<!-- -->8203;3, !dbg !&amp;#<!-- -->8203;23
  ret i32 0, !dbg !&amp;#<!-- -->8203;24
}
...
!&amp;#<!-- -->8203;12 = !DILocalVariable(name: "first", arg: 1, scope: !&amp;#<!-- -->8203;7, file: !&amp;#<!-- -->8203;1, line: 3, type: !&amp;#<!-- -->8203;10)
!&amp;#<!-- -->8203;13 = !DILocalVariable(name: "second", arg: 2, scope: !&amp;#<!-- -->8203;7, file: !&amp;#<!-- -->8203;1, line: 3, type: !&amp;#<!-- -->8203;10)

Notice that the second dbg.value for "first" (#12) unexpectedly comes after the call to "fluff()". If you run this example through a debugger you will see "first=5" on line 6 when you'd expect "first=20". The dbg.value should instead be positioned immediately after the select instruction.

Example walkthrough

In the example "first" is escaped so it is not promoted by the early SROA/mem2reg pass. InstCombine will partially promote it by merging the stores (the initial store and the store on the if-then branch) into their common successor by inserting a PHI and store. Later SimplifyCFG folds all of the blocks together, turning the PHI into a select. InstCombine doesn't insert a dbg.value after the PHI and SimplifyCFG doesn't insert one when converting the PHI to a select, so the merged value is untracked in debug info.

We can see what happened by looking at the -print-after-all output. SROA/mem2reg runs early and is not able to promote "first" because it is escaped. So "first"'s alloca and dbg.declare survive:

*** IR Dump After SROA ***
define dso_local i32 @&amp;#<!-- -->8203;fun(i32 %first, i32 %second) #&amp;#<!-- -->8203;0 !dbg !&amp;#<!-- -->8203;7 {
entry:
  %first.addr = alloca i32, align 4
  store i32 %first, i32* %first.addr, align 4, !tbaa !&amp;#<!-- -->8203;14
  call void @&amp;#<!-- -->8203;llvm.dbg.declare(metadata i32* %first.addr, metadata !&amp;#<!-- -->8203;12, metadata !DIExpression()), !dbg !&amp;#<!-- -->8203;18
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32 %second, metadata !&amp;#<!-- -->8203;13, metadata !DIExpression()), !dbg !&amp;#<!-- -->8203;19
  %0 = load i32, i32* %first.addr, align 4, !dbg !&amp;#<!-- -->8203;20, !tbaa !&amp;#<!-- -->8203;14
  %tobool = icmp ne i32 %0, 0, !dbg !&amp;#<!-- -->8203;20
  br i1 %tobool, label %if.then, label %if.end, !dbg !&amp;#<!-- -->8203;22

if.then:                                          ; preds = %entry
  store i32 %second, i32* %first.addr, align 4, !dbg !&amp;#<!-- -->8203;23, !tbaa !&amp;#<!-- -->8203;14
  br label %if.end, !dbg !&amp;#<!-- -->8203;24

if.end:                                           ; preds = %if.then, %entry
  call void (...) @&amp;#<!-- -->8203;fluff(), !dbg !&amp;#<!-- -->8203;25
  call void @&amp;#<!-- -->8203;esc(i32* %first.addr), !dbg !&amp;#<!-- -->8203;26
  ret i32 0, !dbg !&amp;#<!-- -->8203;27
}
...
!&amp;#<!-- -->8203;12 = !DILocalVariable(name: "first", arg: 1, scope: !&amp;#<!-- -->8203;7, file: !&amp;#<!-- -->8203;1, line: 3, type: !&amp;#<!-- -->8203;10)
!&amp;#<!-- -->8203;13 = !DILocalVariable(name: "second", arg: 2, scope: !&amp;#<!-- -->8203;7, file: !&amp;#<!-- -->8203;1, line: 3, type: !&amp;#<!-- -->8203;10)

Later, InstCombine runs and before doing any optimisations it calls LowerDbgDeclare to convert the dbg.declare into a set of dbg.value intrinsics:

### IR after LowerDbgDeclare runs but before InstCombine starts combining things ###
define dso_local i32 @&amp;#<!-- -->8203;fun(i32 %first, i32 %second) local_unnamed_addr #&amp;#<!-- -->8203;0 !dbg !&amp;#<!-- -->8203;7 {
entry:
  %first.addr = alloca i32, align 4
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32 %first, metadata !&amp;#<!-- -->8203;12, metadata !DIExpression()), !dbg !&amp;#<!-- -->8203;14
  store i32 %first, i32* %first.addr, align 4, !tbaa !&amp;#<!-- -->8203;15
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32 %second, metadata !&amp;#<!-- -->8203;13, metadata !DIExpression()), !dbg !&amp;#<!-- -->8203;14
  %tobool = icmp ne i32 %first, 0, !dbg !&amp;#<!-- -->8203;19
  br i1 %tobool, label %if.then, label %if.end, !dbg !&amp;#<!-- -->8203;21

if.then:                                          ; preds = %entry
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32 %second, metadata !&amp;#<!-- -->8203;12, metadata !DIExpression()), !dbg !&amp;#<!-- -->8203;14
  store i32 %second, i32* %first.addr, align 4, !dbg !&amp;#<!-- -->8203;22, !tbaa !&amp;#<!-- -->8203;15
  br label %if.end, !dbg !&amp;#<!-- -->8203;23

if.end:                                           ; preds = %if.then, %entry
  call void (...) @&amp;#<!-- -->8203;fluff(), !dbg !&amp;#<!-- -->8203;24
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32* %first.addr, metadata !&amp;#<!-- -->8203;12, metadata !DIExpression(DW_OP_deref)), !dbg !&amp;#<!-- -->8203;14
  call void @&amp;#<!-- -->8203;esc(i32* %first.addr), !dbg !&amp;#<!-- -->8203;25
  ret i32 0, !dbg !&amp;#<!-- -->8203;26
}
...
!&amp;#<!-- -->8203;12 = !DILocalVariable(name: "first", arg: 1, scope: !&amp;#<!-- -->8203;7, file: !&amp;#<!-- -->8203;1, line: 3, type: !&amp;#<!-- -->8203;10)
!&amp;#<!-- -->8203;13 = !DILocalVariable(name: "second", arg: 2, scope: !&amp;#<!-- -->8203;7, file: !&amp;#<!-- -->8203;1, line: 3, type: !&amp;#<!-- -->8203;10)

InstCombine merges the two stores to %first.addr ("first"'s alloca address) into the common successor "if.end" as a PHI and store. Notice how it does not insert a dbg.value to describe the merged value %storemerge.

*** IR Dump After Combine redundant instructions ***
define dso_local i32 @&#8203;fun(i32 %first, i32 %second) local_unnamed_addr #&#8203;0 !dbg !&#8203;7 {
entry:
%first.addr = alloca i32, align 4
call void @&#8203;llvm.dbg.value(metadata i32 %first, metadata !&#8203;12, metadata !DIExpression()), !dbg !&#8203;14
call void @&#8203;llvm.dbg.value(metadata i32 %second, metadata !&#8203;13, metadata !DIExpression()), !dbg !&#8203;14
%tobool.not = icmp eq i32 %first, 0, !dbg !&#8203;15
br i1 %tobool.not, label %if.end, label %if.then, !dbg !&#8203;17

if.then:                                          ; preds = %entry
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32 %second, metadata !&amp;#<!-- -->8203;12, metadata !DIExpression()), !dbg !&amp;#<!-- -->8203;14
  br label %if.end, !dbg !&amp;#<!-- -->8203;18

if.end:                                           ; preds = %if.then, %entry
  %storemerge = phi i32 [ %second, %if.then ], [ %first, %entry ]
  store i32 %storemerge, i32* %first.addr, align 4, !tbaa !&amp;#<!-- -->8203;19
  call void (...) @&amp;#<!-- -->8203;fluff() #&amp;#<!-- -->8203;3, !dbg !&amp;#<!-- -->8203;23
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32* %first.addr, metadata !&amp;#<!-- -->8203;12, metadata !DIExpression(DW_OP_deref)), !dbg !&amp;#<!-- -->8203;14
  call void @&amp;#<!-- -->8203;esc(i32* nonnull %first.addr) #&amp;#<!-- -->8203;3, !dbg !&amp;#<!-- -->8203;24
  ret i32 0, !dbg !&amp;#<!-- -->8203;25
}
...
!&amp;#<!-- -->8203;12 = !DILocalVariable(name: "first", arg: 1, scope: !&amp;#<!-- -->8203;7, file: !&amp;#<!-- -->8203;1, line: 3, type: !&amp;#<!-- -->8203;10)
!&amp;#<!-- -->8203;13 = !DILocalVariable(name: "second", arg: 2, scope: !&amp;#<!-- -->8203;7, file: !&amp;#<!-- -->8203;1, line: 3, type: !&amp;#<!-- -->8203;10)

SimplifyCFG folds these blocks together, converting the PHI to a select, and leaves us with the incorrect debug info shown earlier:

*** IR Dump After Simplify the CFG ***
define dso_local i32 @&amp;#<!-- -->8203;fun(i32 %first, i32 %second) local_unnamed_addr #&amp;#<!-- -->8203;0 !dbg !&amp;#<!-- -->8203;7 {
entry:
  %first.addr = alloca i32, align 4
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32 %first, metadata !&amp;#<!-- -->8203;12, metadata !DIExpression()), !dbg !&amp;#<!-- -->8203;14
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32 %second, metadata !&amp;#<!-- -->8203;13, metadata !DIExpression()), !dbg !&amp;#<!-- -->8203;14
  %tobool.not = icmp eq i32 %first, 0, !dbg !&amp;#<!-- -->8203;15
  %spec.select = select i1 %tobool.not, i32 %first, i32 %second, !dbg !&amp;#<!-- -->8203;17
  store i32 %spec.select, i32* %first.addr, align 4, !tbaa !&amp;#<!-- -->8203;18
  call void (...) @&amp;#<!-- -->8203;fluff() #&amp;#<!-- -->8203;3, !dbg !&amp;#<!-- -->8203;22
  call void @&amp;#<!-- -->8203;llvm.dbg.value(metadata i32* %first.addr, metadata !&amp;#<!-- -->8203;12, metadata !DIExpression(DW_OP_deref)), !dbg !&amp;#<!-- -->8203;14
  call void @&amp;#<!-- -->8203;esc(i32* nonnull %first.addr) #&amp;#<!-- -->8203;3, !dbg !&amp;#<!-- -->8203;23
  ret i32 0, !dbg !&amp;#<!-- -->8203;24
}
...
!&amp;#<!-- -->8203;12 = !DILocalVariable(name: "first", arg: 1, scope: !&amp;#<!-- -->8203;7, file: !&amp;#<!-- -->8203;1, line: 3, type: !&amp;#<!-- -->8203;10)
!&amp;#<!-- -->8203;13 = !DILocalVariable(name: "second", arg: 2, scope: !&amp;#<!-- -->8203;7, file: !&amp;#<!-- -->8203;1, line: 3, type: !&amp;#<!-- -->8203;10)

Summary

This example shows that InstCombine and SimplifyCFG compose to produce incorrect debug info for escaped variables. However, the reason that I think this is a general problem rather than a specific bug, and why I am struggling to hold any one pass accountable, is because of the following inconsistencies:

If "first" in this example can be fully promoted by mem2reg (https://godbolt.org/z/Es89z7) then there is no issue because mem2reg will insert a dbg.value after the PHI which eventually becomes the select.

However, mem2reg can also cause the same bug because it only inserts dbg.values when promoting a variable if and only if it has a dbg.declare. It's possible for LowerDbgDeclare to remove the dbg.declare before mem2reg runs. For example, when a variable is escaped but the escaping function is later inlined. You can see this happening here https://godbolt.org/z/KrdjGs.

Lastly, we don't see this issue in any of these cases if there isn't any block folding because LiveDebugValues will merge live-out variable locations from preds (https://godbolt.org/z/sjcnbv).

As far as I can tell there are no rules or examples demonstrating how debug info should be updated when promoting or partially promoting variables after LowerDbgDeclare has removed the dbg.declare.

@jmorse
Copy link
Member

jmorse commented Jul 15, 2024

@OCHyams Can we close this now that assignment-tracking exists? LTO-mode notwithstanding

@OCHyams
Copy link
Contributor Author

OCHyams commented Jul 15, 2024

Hmm yes and no. The more fundamental issue has been fixed but it looks like issue that SimplifyCFG introduces still exists

https://godbolt.org/z/3xP5qjx73

define dso_local noundef i32 @fun(int, int)(i32 noundef %first, i32 noundef %second) local_unnamed_addr #0 !dbg !10 {
entry:
  %first.addr = alloca i32, align 4, !DIAssignID !18
    #dbg_assign(i1 undef, !16, !DIExpression(), !18, ptr %first.addr, !DIExpression(), !19)
    #dbg_assign(i32 %first, !16, !DIExpression(), !20, ptr %first.addr, !DIExpression(), !19)
    #dbg_value(i32 %second, !17, !DIExpression(), !19)
  %tobool.not = icmp eq i32 %first, 0, !dbg !21
; XXX
    #dbg_assign(i32 %second, !16, !DIExpression(), !20, ptr %first.addr, !DIExpression(), !19)
  %spec.select = select i1 %tobool.not, i32 0, i32 %second, !dbg !23
  store i32 %spec.select, ptr %first.addr, align 4, !tbaa !24, !DIAssignID !20
  tail call void @fluff()(), !dbg !28
  call void @esc(int*)(ptr noundef nonnull %first.addr), !dbg !29
  ret i32 0, !dbg !30
}

I think the dbg.assign after ; XXX wants to either have spec.select or undef as its value operand. The dbg.assign is linked to the store (good) meaning we'll produce a stack location from there, but between the dbg.assign and that store we'll emit an implicit location of the value %second, which is wrong (conditional assignment has become unconditional in debug info).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla debuginfo llvm:optimizations
Projects
None yet
Development

No branches or pull requests

4 participants