Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo@O2][Dexter] Misleading watch value of aggregate member at -O2 optimisation. #38112

Open
llvmbot opened this issue Aug 29, 2018 · 8 comments
Assignees
Labels
bugzilla Issues migrated from bugzilla wrong-debug

Comments

@llvmbot
Copy link
Member

llvmbot commented Aug 29, 2018

Bugzilla Link 38764
Version trunk
OS Linux
Blocks #38116
Reporter LLVM Bugzilla Contributor
CC @chrisjbris,@gregbedwell,@MaskRay,@CarlosAlbertoEnciso,@jmorse,@OCHyams,@pogo59

Extended Description

When the following example:


1 struct SROAThis {
2 SROAThis(int n_a, int n_b, int n_c, int n_d)
3 : a(n_a), b(n_b), c(n_c), d(n_d){}
4
5 int a, b, c, d;
6 };
7
8 int doesSroaStuff(SROAThis & aggr) {
9 aggr.a += 1;
10 aggr.b += 2;
11 aggr.c += 3;
12 aggr.d += 4;
13 aggr.a += aggr.b + aggr.c + aggr.d;
14 return aggr.a;
15 }
11
12 int main(int argc, const char ** argv) {
13 SROAThis aggr(argc, argc, argc, argc);
14
15 auto a = doesSroaStuff(aggr);
16
17 return a;
18 }


is compiled with the following command:

clang test.cpp -o test.elf -g -O2 -fno-inline

The line table entries/debug locations for doesSroaStuff no longer accurately reflect the source.

The disassembly for doesSroaStuff shows that the constant additions at the start of the function have been interleaved with movs, leas and adds:


// from objdump -M intel -d ./test.elf

0000000000400480 <_Z13doesSroaStuffR8SROAThis>:
400480: 8b 47 04 mov eax,DWORD PTR [rdi+0x4]
400483: 8b 4f 08 mov ecx,DWORD PTR [rdi+0x8]
400486: 83 c0 02 add eax,0x2
400489: 8b 17 mov edx,DWORD PTR [rdi]
40048b: 01 c2 add edx,eax
40048d: 89 47 04 mov DWORD PTR [rdi+0x4],eax
400490: 8d 41 03 lea eax,[rcx+0x3]
400493: 89 47 08 mov DWORD PTR [rdi+0x8],eax
400496: 8b 47 0c mov eax,DWORD PTR [rdi+0xc]
400499: 8d 70 04 lea esi,[rax+0x4]
40049c: 89 77 0c mov DWORD PTR [rdi+0xc],esi
40049f: 8d 0c 11 lea ecx,[rcx+rdx1]
4004a2: 83 c1 03 add ecx,0x3
4004a5: 8d 04 08 lea eax,[rax+rcx
1]
4004a8: 83 c0 05 add eax,0x5
4004ab: 89 07 mov DWORD PTR [rdi],eax
4004ad: c3 ret


A line table dump from llvm-dwarfdump shows the following:


0x0000000000400480 8 0 1 0 0 is_stmt 0x0000000000400480 10 10 1 0 0 is_stmt prologue_end 0x0000000000400483 11 10 1 0 0 is_stmt 0x0000000000400486 10 10 1 0 0 is_stmt

if we step to line 11 we will only have executed the following load instruction:

400480: 8b 47 04 mov eax,DWORD PTR [rdi+0x4]

placing our source mapping at the following line

8 int doesSroaStuff(SROAThis & aggr) {
9 aggr.a += 1;
10 aggr.b += 2;
11 aggr.c += 3; < < <

however, no additions have taken place.

When debugging using LLDB, I get the following watch values for aggr.a and aggr.b at line 11:


Watchpoint 1: addr = 0x7fffffffe4a0 size = 4 state = enabled type = w
declare @ '/home/clangbox/dev/temp/test/./sroa_temp.cpp:8'
watchpoint spec = 'aggr.a'
new value: 1
Watchpoint 2: addr = 0x7fffffffe4a4 size = 4 state = enabled type = w
declare @ '/home/clangbox/dev/temp/test/./sroa_temp.cpp:8'
watchpoint spec = 'aggr.b'
new value: 1


The user should expect to see aggr.a = 2 and aggr.b = 3 if the program is run with no options.

If re-implementing this example using none-aggregates but the same structure, such as :


int doesSroaStuff(int & a, int & b, int & c, int & d) {
a += 1;
b += 2;
c += 3;
d += 4;
a += b + c + d;
return a;
}

int main(int argc, const char ** argv) {

auto a = doesSroaStuff(argc, argc, argc, argc);

return a;
}


Then doesSroaStuff has the adds in the "correct" order despite being optimised.

0000000000400480 <Z13doesSroaStuffRiS_S_S>:
400480: 83 07 01 add DWORD PTR [rdi],0x1
400483: 83 06 02 add DWORD PTR [rsi],0x2
400486: 83 02 03 add DWORD PTR [rdx],0x3
400489: 8b 01 mov eax,DWORD PTR [rcx]
40048b: 83 c0 04 add eax,0x4
40048e: 89 01 mov DWORD PTR [rcx],eax
400490: 03 06 add eax,DWORD PTR [rsi]
400492: 03 02 add eax,DWORD PTR [rdx]
400494: 03 07 add eax,DWORD PTR [rdi]
400496: 89 07 mov DWORD PTR [rdi],eax
400498: c3 ret


@llvmbot
Copy link
Member Author

llvmbot commented Aug 29, 2018

assigned to @CarlosAlbertoEnciso

@llvmbot
Copy link
Member Author

llvmbot commented Aug 29, 2018

This was found using https://github.com/SNSystems/dexter

@CarlosAlbertoEnciso
Copy link
Member

The following modified test case reproduces the issue described in comment 0

-------->8--------
struct S {
S(int pa, int pb) : a(pa), b(pb) {}

int a;
int b;
};

int foo(S &aggr) {
aggr.a += 0x33;
aggr.b += 0x44;
aggr.a += aggr.b;
return aggr.a;
}

int main() {
S aggr(0x11, 0x22);
auto af = foo(aggr);
return af;
}
--------8<--------

@CarlosAlbertoEnciso
Copy link
Member

*** IR Dump Before Dead Store Elimination ***

define i32 @​_Z3fooR1S(%struct.S* %aggr) {
entry:
...
%a = getelementptr inbounds %struct.S, %struct.S* %aggr, i64 0, i32 0, !dbg !​24
%0 = load i32, i32* %a, align 4, !dbg !​24, !tbaa !​25
%add = add nsw i32 %0, 33, !dbg !​24
store i32 %add, i32* %a, align 4, !dbg !​24, !tbaa !​25

%b = getelementptr inbounds %struct.S, %struct.S* %aggr, i64 0, i32 1, !dbg !​30
%1 = load i32, i32* %b, align 4, !dbg !​30, !tbaa !​31
%add1 = add nsw i32 %1, 44, !dbg !​30
store i32 %add1, i32* %b, align 4, !dbg !​30, !tbaa !​31
...
}

Before DSE, it is clear that 'a' gets updated.

But during DSE:

DEAD: store i32 %add, i32* %a, align 4, !dbg !​24, !tbaa !​25

@CarlosAlbertoEnciso
Copy link
Member

*** IR Dump After Dead Store Elimination ***

define i32 @​_Z3fooR1S(%struct.S* %aggr) {
entry:
...
%a = getelementptr inbounds %struct.S, %struct.S* %aggr, i64 0, i32 0, !dbg !​24
%0 = load i32, i32* %a, align 4, !dbg !​24, !tbaa !​25
%add = add nsw i32 %0, 33, !dbg !​24

%b = getelementptr inbounds %struct.S, %struct.S* %aggr, i64 0, i32 1, !dbg !​30
%1 = load i32, i32* %b, align 4, !dbg !​30, !tbaa !​31
%add1 = add nsw i32 %1, 44, !dbg !​30
store i32 %add1, i32* %b, align 4, !dbg !​30, !tbaa !​31
...
}

And 'a' is only updated after 'b' is updated before returning.

%add4 = add nsw i32 %add1, %add, !dbg !​32
store i32 %add4, i32* %a, align 4, !dbg !​32, !tbaa !​25
ret i32 %add4, !dbg !​33

@MaskRay
Copy link
Member

MaskRay commented Jan 20, 2021

For the code in comment 2,

clang -O2 a.cc -g -gdwarf-5 -fno-legacy-pass-manager -fno-inline -o a

Address Line Column File ISA Discriminator Flags


0x0000000000401110 8 0 0 0 0 is_stmt
0x0000000000401110 9 10 0 0 0 is_stmt prologue_end
0x0000000000401112 10 10 0 0 0 is_stmt
0x000000000040111b 11 10 0 0 0 is_stmt
0x0000000000401122 12 3 0 0 0 is_stmt
0x0000000000401130 15 0 0 0 0 is_stmt
0x000000000040113a 16 5 0 0 0 is_stmt prologue_end
0x000000000040114c 17 13 0 0 0 is_stmt
0x0000000000401154 18 3 0 0 0 is_stmt
0x000000000040115a 18 3 0 0 0 is_stmt end_sequence
0x0000000000401160 2 0 0 0 0 is_stmt
0x0000000000401160 2 23 0 0 0 is_stmt prologue_end
0x0000000000401162 2 30 0 0 0
0x0000000000401165 2 37 0 0 0
0x0000000000401166 2 37 0 0 0 end_sequence

0000000000401110 <_Z3fooR1S>:
401110: 8b 07 movl (%rdi), %eax
401112: 8b 4f 04 movl 4(%rdi), %ecx
401115: 8d 51 44 leal 68(%rcx), %edx
401118: 89 57 04 movl %edx, 4(%rdi)
40111b: 01 c8 addl %ecx, %eax
40111d: 83 c0 77 addl $119, %eax
401120: 89 07 movl %eax, (%rdi)
401122: c3 retq
401123: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:(%rax,%rax)
40112d: 0f 1f 00 nopl (%rax)

DSEPass drops the store instructions. Then would will be a better representation making values less misleading?

@jmorse
Copy link
Member

jmorse commented Feb 5, 2021

I think we didn't realise it at the time, but the root of the problem is bug 34136. Eliminating dead stores to memory fundamentally re-orders the appearance of assignments, leading to misleading values in existing variable locations. Seeing how the first assignment to the 'a' field in comment 2 gets DSE'd, there's not much we can do about this. Most recent discussion on how to fix this:

https://lists.llvm.org/pipermail/llvm-dev/2020-October/145697.html

Although that's more about tracking what happens when stores get deleted, rather than how to represent it in the output.

Another facet of this is jumpy line tables -- Toms original reproducer featured a step backwards. Happily with trunk and the new pass manager, we only have steps forwards.

@jmorse
Copy link
Member

jmorse commented Nov 27, 2021

mentioned in issue #38116

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla wrong-debug
Projects
None yet
Development

No branches or pull requests

4 participants