Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo@O2] SelectionDAG invents its own debug use-before-defs #39773

Closed
jmorse opened this issue Jan 23, 2019 · 3 comments
Closed

[DebugInfo@O2] SelectionDAG invents its own debug use-before-defs #39773

jmorse opened this issue Jan 23, 2019 · 3 comments
Assignees
Labels
bugzilla Issues migrated from bugzilla llvm:codegen wrong-debug

Comments

@jmorse
Copy link
Member

jmorse commented Jan 23, 2019

Bugzilla Link 40427
Resolution FIXED
Resolved on Feb 05, 2019 02:02
Version trunk
OS Linux
Blocks #38102
CC @dwblaikie,@gregbedwell,@CarlosAlbertoEnciso,@pogo59
Fixed by commit(s) 352350

Extended Description

With the code below, llc command line "llc -start-after=codegenprepare -stop-before=expand-isel-pseudos input.ll -o -", using hot-off-the-press llvm r351956, SelectionDAG transforms the single dbg.value intrinsic into a use-before-free DBG_VALUE. The heavily contrived code:

--------8<--------
define i16 @​lolwat(i1 %spoons, i64 *%bees, i16 %yellow, i64 *%more) {
entry:
br i1 %spoons, label %trueb, label %falseb
trueb:
br label %block
falseb:
br label %block
block:
%foo = phi i64 *[%bees, %trueb], [%more, %falseb] ; ir line 1
%forks = bitcast i64 *%foo to i32 * ; ir line 2
%ret = load i32, i32 *%forks, !dbg !​6 ; ir line 3
%cast = trunc i32 %ret to i16, !dbg !​6 ; ir line 4
call void @​llvm.dbg.value(metadata i16 %cast, metadata !​1, metadata !DIExpression()), !dbg !​6
%orly2 = add i16 %yellow, 1 ; ir line 5
br label %bb1 ; ir line 6
bb1:
%cheese = add i16 %orly2, %cast
ret i16 %cheese, !dbg !​6
}

declare void @​llvm.dbg.value(metadata, metadata, metadata)

!llvm.module.flags = !{#4}
!llvm.dbg.cu = !{#2}
!​1 = !DILocalVariable(name: "bees", scope: !​5, type: null)
!​2 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !​3, producer: "beards", isOptimized: true, runtimeVersion: 4, emissionKind: FullDebug)
!​3 = !DIFile(filename: "bees.cpp", directory: "")
!​4 = !{i32 2, !"Debug Info Version", i32 3}
!​5 = distinct !DISubprogram(name: "nope", scope: !​2, file: !​3, line: 1, unit: !​2)
!​6 = !DILocation(line: 0, scope: !​5)
!​7 = !DILocalVariable(name: "flannel", scope: !​5, type: null)
-------->8--------

And the output of basic block 'block' after isel:

--------8<--------
bb.3.block:
successors: %bb.4(0x80000000)

%0:gr64 = PHI %6, %bb.2, %4, %bb.1
DBG_VALUE %1, $noreg, !&#8203;5, !DIExpression(), debug-location !&#8203;3
%1:gr16 = MOV16rm %0, 1, $noreg, 0, $noreg, debug-location !&#8203;3 :: (load 2 from %ir.forks, align 4)
%10:gr32 = IMPLICIT_DEF
%9:gr32 = INSERT_SUBREG %10, %7, %subreg.sub_16bit
%11:gr32 = INC32r %9, implicit-def dead $eflags
%2:gr16 = COPY %11.sub_16bit

-------->8--------

Observe that the DBG_VALUE of %1 appears before %1 is defined (there are no further definitions of %1 in the program).

This bug is caused by ScheduleDAGSDNode's ProcessSourceNode helper [0]. My understanding of what's going on in this function is that, when SDNodes are scheduled into a sequence of instructions:

  • Each SDNode has retained an IR line number, identifying the IR instruction the SDNode originates from,
  • A mapping of IR line number -> MachineInsts is built in the 'Orders' collection, identifying the first/earliest MachineInst of an IR line number
  • EmitSchedule later inserts DBG_VALUEs in front of the first instruction that comes after the source dbg.value intrinsic

Obviously with instructions being rescheduled around the place, there's a risk of DBG_VALUEs being re-ordered: this bug is only about the debug use-before-def demonstrated above.

The behaviour that ProcessSourceNode does not account for is that some SDNodes do not create instructions. ProcessSourceNode always takes the last instruction created as being the start of this IR line number (line 765 in [0]), pointing IR line numbers at instructions that are too early.

In the example copied above, we get the following SDNode schedule:

--------8<--------
SU(9): t2: i64,ch = CopyFromReg [ORD=2] [ID=9]# D:0 t0, Register:i64 %0# D:0
SU(8): t16: i16,ch = MOV16rm<Mem:(load 2 from %ir.forks, align 4)> [ORD=3] [ID=8]# D:0bees.cpp:0 DbgVal(0):"bees" t2, TargetConstant:i8<1>, Register:i64 $noreg# D:0, TargetConstant:i32<0>, Register:i32 $noreg# D:0, t0, bees.cpp:0
SU(7): t8: ch = CopyToReg [ORD=4] [ID=7]# D:0bees.cpp:0 t0, Register:i16 %1# D:0, t16, bees.cpp:0
SU(5): t10: i16,ch = CopyFromReg [ORD=5] [ID=5]# D:0 t0, Register:i16 %7# D:0
SU(6): t23: i32 = IMPLICIT_DEF [ORD=5] [ID=6]# D:0
SU(4): t17: i32 = INSERT_SUBREG [ORD=5] [ID=4]# D:0 IMPLICIT_DEF:i32 [ORD=5] [ID=6]# D:0, t10, TargetConstant:i32<4>
SU(3): t19: i32,i32 = INC32r [ORD=5] [ID=3]# D:0 t17
SU(2): t20: i16 = EXTRACT_SUBREG [ORD=5] [ID=2]# D:0 t19, TargetConstant:i32<4>
SU(1): t14: ch = CopyToReg [ORD=5] [ID=1]# D:0 t0, Register:i16 %2# D:0, t20
SU(0): t15: ch = TokenFactor [ORD=6] [ID=0]# D:0 t8, t14
-------->8--------

Where the ORD=$num numbers correspond to the "ir line $num" comments I've put in the IR paste above. Because Copy{From,To}Reg do not produce instructions, and ProcessSourceNode always takes the last instruction as the beginning of an IR line, we get the following mappings in the Orders collection, the targets being the instructions in the MIR paste above:

2 -> nullptr
3 -> %1:gr16 = MOV16rm
4 -> %1:gr16 = MOV16rm
5 -> %1:gr16 = MOV16rm

The order mapping for 3 is correct; it's 4 and 5 that are incorrect. We get these because SU(7) and SU(5) are the first SDNodes for each ir line respectively, two Copy nodes which do not create instructions and thus are mapped to the previous instruction, the load of %1. EmitSchedule then determines that for the dbg.value in the source IR on ir line 4, it should insert a DBG_VALUE before the instructions for ir line 5, which it sees as the load of %1. Thus, it places the DBG_VALUE before its operands definition.

~

There's substantial amounts of misery in ProcessSourceNode -- this specific issue and the "at start of block -> nullptr" mapping off-by-ones a variety of DBG_VALUEs. I also have a strong suspicion that the first conditional block of the function (which passes Order=0 to ProcessSDDbgValues, bypassing various checks) is re-ordering DBG_VALUEs where it could be avoided, although I haven't written a test for this yet.

This bug is not related to the placeDbgValues misery-train because the dbg.value for %cast immediately follows its definition -- however other CodeGenPrepare optimisations interfere with the test above, so we need -start-after=codegenprepare. I'm marking this as blocking bug 38754 because my patches for that greatly amplify this issue.

[0] https://github.com/llvm-mirror/llvm/blob/27f17bfee31bec92b918f4ca6a6f7a2a37e4d00c/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp#L742

@jmorse
Copy link
Member Author

jmorse commented Jan 23, 2019

assigned to @jmorse

@jmorse
Copy link
Member Author

jmorse commented Jan 23, 2019

Even after proof-reading, I missed that the "use-before-free" in the first paragraph should be "use-before-def". Curses.

@jmorse
Copy link
Member Author

jmorse commented Jan 24, 2019

Candidate patch: https://reviews.llvm.org/D57163

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla llvm:codegen wrong-debug
Projects
None yet
Development

No branches or pull requests

1 participant