Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo@O2][Dexter] Bad value reported for function argument #38300

Closed
jmorse opened this issue Sep 14, 2018 · 7 comments
Closed

[DebugInfo@O2][Dexter] Bad value reported for function argument #38300

jmorse opened this issue Sep 14, 2018 · 7 comments
Assignees
Labels
bugzilla Issues migrated from bugzilla llvm:codegen wrong-debug

Comments

@jmorse
Copy link
Member

jmorse commented Sep 14, 2018

Bugzilla Link 38952
Resolution FIXED
Resolved on Mar 07, 2019 07:34
Version trunk
OS Linux
Blocks #38116
CC @adrian-prantl,@dwblaikie,@gregbedwell,@CarlosAlbertoEnciso,@pogo59,@vedantk
Fixed by commit(s) 345996

Extended Description

The trivial program below causes an incorrect value of 'argc' to be reported to debuggers, when optimised, compiled "-O2 -g -fno-inline" with llvm/clang r341546 targeting x86_64.

Pretty simply, when launched in gdb or lldb, the first line of 'main' will report that 'argc' has the value zero, when in truth it's one. [Note that the value of argc changes depending on what you pass on the command line, but it should always be at least one with no arguments].

-------->8--------
#include <string.h>

#define BUFSZ 256

int foo[BUFSZ];

int
main(int argc, char **argv)
{
if (argc + 1 > BUFSZ)
return 0;

memset(foo, 0, argc * sizeof(int));

return foo[argc / 2];
}
--------8<--------

Looking at the location-data for 'argc' and the first few instructions of the program, when compiled with the options above:

llvm-dwarfdump-6.0 a.out --name=argc:
-------->8--------
0x00000073: DW_TAG_formal_parameter
DW_AT_location (0x00000000
0x0000000000000000 - 0x0000000000000003: DW_OP_reg5 RDI
0x0000000000000003 - 0x000000000000000d: DW_OP_reg3 RBX)
--------8<--------

Disassembly in gdb, having run "start" then "disassemble":
-------->8--------
Dump of assembler code for function main(int, char**):
0x0000000000400500 <+0>: push %rbx
0x0000000000400501 <+1>: xor %eax,%eax
=> 0x0000000000400503 <+3>: cmp $0xff,%edi
0x0000000000400509 <+9>: jg 0x400532 <main(int, char**)+50>
0x000000000040050b <+11>: mov %edi,%ebx
--------8<--------

In the disassembly, the mov at +0xb stashes argc to %ebx to save it over the call to memset. The location data believes this is happening earlier, at +0x3, and as a result the as-yet-unwritten contents of %ebx are reported as the value of argc.

Removing either the memset call, or the conditional statement, eliminates this problem. It's also mildly annoying that 'argc' isn't defined over more of the body of the program, despite being in a register the whole time, but that's for a different ticket. In case this was duplicate of another SimplifyCFG bug I've tried compiling with Carlos' latest patch from https://reviews.llvm.org/D51976 (165505) but that didn't make a difference.

This problem doesn't occur with clang-6.0 installed on Ubuntu 18, I don't have clang-7.0 builds handy.

@jmorse
Copy link
Member Author

jmorse commented Sep 14, 2018

assigned to @jmorse

@jmorse
Copy link
Member Author

jmorse commented Oct 16, 2018

I'll take a look at this. From a first glance, PostRA Machine Sink moves the store-to-ebx into a different block but doesn't move an associated DBG_VALUE, leading to the observed illegal value. However, it doesn't move the DBG_VALUE because LiveDebugVariables puts it back in the wrong place, i.e. not adjacent to the defining instruction. Before livedebugvars and regalloc:

--------8<--------
0B bb.0.entry:
successors: [blah]
liveins: $edi
DBG_VALUE debug-use $edi, debug-use $noreg, !"argc", [blah]
DBG_VALUE debug-use $rsi, debug-use $noreg, !"argv", [blah]
16B %2:gr32 = COPY $edi
DBG_VALUE debug-use %2:gr32, debug-use $noreg, !"argc", [blah]
32B %15:gr32 = MOV32r0 implicit-def dead $eflags
-------->8--------

After:
--------8<--------
0B bb.0.entry:
successors: [blah]
liveins: $edi
DBG_VALUE debug-use $edi, debug-use $noreg, !"argc", [blah]
DBG_VALUE debug-use $rsi, debug-use $noreg, !"argv", [blah]
16B renamable $ebx = COPY $edi
32B renamable $eax = MOV32r0 implicit-def dead $eflags
DBG_VALUE debug-use $ebx, debug-use $noreg, !"argc", [blah]
-------->8--------

Where the MOV32r0 has moved between the copy of EDI and it's DBG_VALUE.

@jmorse
Copy link
Member Author

jmorse commented Oct 16, 2018

OK, the reason why the DBG_VALUE gets moved further down, is because the lexical scope of argc is considered to start further down the function (at instructions I didn't copy into the previous comment). This raises a variety of questions for me; it's not obvious whether there's actually any bug, instead there seem to be two assumptions that conflict with each other:

  • DBG_VALUEs always follow the register def
  • DBG_VALUEs must be inside the lexical scope of the corresponding variabble

I'll be at LLVM-Dev this week, so will try and pick peoples brains about this.

@jmorse
Copy link
Member Author

jmorse commented Oct 16, 2018

For full clarity on where I believe those assumptions come from, the idea that DBG_VALUEs always follow definitions:

https://github.com/llvm-mirror/llvm/blob/42b54435076482d29015e5a6678d5b8691d48e40/lib/CodeGen/MachineInstr.cpp#L2088

i.e., when you collect debug values, stop looking after a non-debug-insn following the definition. Note that that function was originally in MachineSink and was moved by Carlos in r341025, so the assumption might not be prevalent elsewhere,

DBG_VALUE ranges getting trimmed:

https://github.com/llvm-mirror/llvm/blob/66586cc1b53a8f0af607636cda0a38b5a05c17e9/lib/CodeGen/LiveDebugVariables.cpp#L833

Which is what shifts the DBG_VALUE an instruction down in the reproducer for this ticket.

@jmorse
Copy link
Member Author

jmorse commented Oct 30, 2018

After chatting to Adrian at the conference, he seemed to feel the "DBG_VALUE immediately follows definition" assumption was wrong. That fits with possibility of a calculation being placed far from where it becomes a variable value.

I'll now look at what can be done about making collectDebugValues look further, without damaging compile-times too much.

@jmorse
Copy link
Member Author

jmorse commented Nov 1, 2018

Candidate patch: https://reviews.llvm.org/D53992

Assuming that solution is acceptable, I'd also like to adjust collectDebugValues to avoid this mistake in the future, for example by adding an assertion that it's not being called after register allocation (after which point it's DBG_VALUE-follows-immediately assumption doesn't hold).

@jmorse
Copy link
Member Author

jmorse commented Mar 7, 2019

Fix was committed.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla llvm:codegen wrong-debug
Projects
None yet
Development

No branches or pull requests

1 participant