-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DebugInfo@O2][Dexter] Misleading watch value of aggregate member at -O2 optimisation. #38112
Comments
assigned to @CarlosAlbertoEnciso |
This was found using https://github.com/SNSystems/dexter |
The following modified test case reproduces the issue described in comment 0 -------->8-------- int a; int foo(S &aggr) { int main() { |
*** IR Dump Before Dead Store Elimination *** define i32 @_Z3fooR1S(%struct.S* %aggr) { %b = getelementptr inbounds %struct.S, %struct.S* %aggr, i64 0, i32 1, !dbg !30 Before DSE, it is clear that 'a' gets updated. But during DSE: DEAD: store i32 %add, i32* %a, align 4, !dbg !24, !tbaa !25 |
*** IR Dump After Dead Store Elimination *** define i32 @_Z3fooR1S(%struct.S* %aggr) { %b = getelementptr inbounds %struct.S, %struct.S* %aggr, i64 0, i32 1, !dbg !30 And 'a' is only updated after 'b' is updated before returning. %add4 = add nsw i32 %add1, %add, !dbg !32 |
For the code in comment 2, clang -O2 a.cc -g -gdwarf-5 -fno-legacy-pass-manager -fno-inline -o a Address Line Column File ISA Discriminator Flags 0x0000000000401110 8 0 0 0 0 is_stmt 0000000000401110 <_Z3fooR1S>: DSEPass drops the store instructions. Then would will be a better representation making values less misleading? |
I think we didn't realise it at the time, but the root of the problem is bug 34136. Eliminating dead stores to memory fundamentally re-orders the appearance of assignments, leading to misleading values in existing variable locations. Seeing how the first assignment to the 'a' field in comment 2 gets DSE'd, there's not much we can do about this. Most recent discussion on how to fix this: https://lists.llvm.org/pipermail/llvm-dev/2020-October/145697.html Although that's more about tracking what happens when stores get deleted, rather than how to represent it in the output. Another facet of this is jumpy line tables -- Toms original reproducer featured a step backwards. Happily with trunk and the new pass manager, we only have steps forwards. |
mentioned in issue #38116 |
Extended Description
When the following example:
1 struct SROAThis {
2 SROAThis(int n_a, int n_b, int n_c, int n_d)
3 : a(n_a), b(n_b), c(n_c), d(n_d){}
4
5 int a, b, c, d;
6 };
7
8 int doesSroaStuff(SROAThis & aggr) {
9 aggr.a += 1;
10 aggr.b += 2;
11 aggr.c += 3;
12 aggr.d += 4;
13 aggr.a += aggr.b + aggr.c + aggr.d;
14 return aggr.a;
15 }
11
12 int main(int argc, const char ** argv) {
13 SROAThis aggr(argc, argc, argc, argc);
14
15 auto a = doesSroaStuff(aggr);
16
17 return a;
18 }
is compiled with the following command:
clang test.cpp -o test.elf -g -O2 -fno-inline
The line table entries/debug locations for doesSroaStuff no longer accurately reflect the source.
The disassembly for doesSroaStuff shows that the constant additions at the start of the function have been interleaved with movs, leas and adds:
// from objdump -M intel -d ./test.elf
0000000000400480 <_Z13doesSroaStuffR8SROAThis>:
400480: 8b 47 04 mov eax,DWORD PTR [rdi+0x4]
400483: 8b 4f 08 mov ecx,DWORD PTR [rdi+0x8]
400486: 83 c0 02 add eax,0x2
400489: 8b 17 mov edx,DWORD PTR [rdi]
40048b: 01 c2 add edx,eax
40048d: 89 47 04 mov DWORD PTR [rdi+0x4],eax
400490: 8d 41 03 lea eax,[rcx+0x3]
400493: 89 47 08 mov DWORD PTR [rdi+0x8],eax
400496: 8b 47 0c mov eax,DWORD PTR [rdi+0xc]
400499: 8d 70 04 lea esi,[rax+0x4]
40049c: 89 77 0c mov DWORD PTR [rdi+0xc],esi
40049f: 8d 0c 11 lea ecx,[rcx+rdx1]
4004a2: 83 c1 03 add ecx,0x3
4004a5: 8d 04 08 lea eax,[rax+rcx1]
4004a8: 83 c0 05 add eax,0x5
4004ab: 89 07 mov DWORD PTR [rdi],eax
4004ad: c3 ret
A line table dump from llvm-dwarfdump shows the following:
0x0000000000400480 8 0 1 0 0 is_stmt 0x0000000000400480 10 10 1 0 0 is_stmt prologue_end 0x0000000000400483 11 10 1 0 0 is_stmt 0x0000000000400486 10 10 1 0 0 is_stmt
if we step to line 11 we will only have executed the following load instruction:
400480: 8b 47 04 mov eax,DWORD PTR [rdi+0x4]
placing our source mapping at the following line
8 int doesSroaStuff(SROAThis & aggr) {
9 aggr.a += 1;
10 aggr.b += 2;
11 aggr.c += 3; < < <
however, no additions have taken place.
When debugging using LLDB, I get the following watch values for aggr.a and aggr.b at line 11:
Watchpoint 1: addr = 0x7fffffffe4a0 size = 4 state = enabled type = w
declare @ '/home/clangbox/dev/temp/test/./sroa_temp.cpp:8'
watchpoint spec = 'aggr.a'
new value: 1
Watchpoint 2: addr = 0x7fffffffe4a4 size = 4 state = enabled type = w
declare @ '/home/clangbox/dev/temp/test/./sroa_temp.cpp:8'
watchpoint spec = 'aggr.b'
new value: 1
The user should expect to see aggr.a = 2 and aggr.b = 3 if the program is run with no options.
If re-implementing this example using none-aggregates but the same structure, such as :
int doesSroaStuff(int & a, int & b, int & c, int & d) {
a += 1;
b += 2;
c += 3;
d += 4;
a += b + c + d;
return a;
}
int main(int argc, const char ** argv) {
auto a = doesSroaStuff(argc, argc, argc, argc);
return a;
}
Then doesSroaStuff has the adds in the "correct" order despite being optimised.
0000000000400480 <Z13doesSroaStuffRiS_S_S>:
400480: 83 07 01 add DWORD PTR [rdi],0x1
400483: 83 06 02 add DWORD PTR [rsi],0x2
400486: 83 02 03 add DWORD PTR [rdx],0x3
400489: 8b 01 mov eax,DWORD PTR [rcx]
40048b: 83 c0 04 add eax,0x4
40048e: 89 01 mov DWORD PTR [rcx],eax
400490: 03 06 add eax,DWORD PTR [rsi]
400492: 03 02 add eax,DWORD PTR [rdx]
400494: 03 07 add eax,DWORD PTR [rdi]
400496: 89 07 mov DWORD PTR [rdi],eax
400498: c3 ret
The text was updated successfully, but these errors were encountered: