You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
With the code below, llc command line "llc -start-after=codegenprepare -stop-before=expand-isel-pseudos input.ll -o -", using hot-off-the-press llvm r351956, SelectionDAG transforms the single dbg.value intrinsic into a use-before-free DBG_VALUE. The heavily contrived code:
--------8<--------
define i16 @lolwat(i1 %spoons, i64 *%bees, i16 %yellow, i64 *%more) {
entry:
br i1 %spoons, label %trueb, label %falseb
trueb:
br label %block
falseb:
br label %block
block:
%foo = phi i64 *[%bees, %trueb], [%more, %falseb] ; ir line 1
%forks = bitcast i64 *%foo to i32 * ; ir line 2
%ret = load i32, i32 *%forks, !dbg !6 ; ir line 3
%cast = trunc i32 %ret to i16, !dbg !6 ; ir line 4
call void @llvm.dbg.value(metadata i16 %cast, metadata !1, metadata !DIExpression()), !dbg !6
%orly2 = add i16 %yellow, 1 ; ir line 5
br label %bb1 ; ir line 6
bb1:
%cheese = add i16 %orly2, %cast
ret i16 %cheese, !dbg !6
}
Observe that the DBG_VALUE of %1 appears before %1 is defined (there are no further definitions of %1 in the program).
This bug is caused by ScheduleDAGSDNode's ProcessSourceNode helper [0]. My understanding of what's going on in this function is that, when SDNodes are scheduled into a sequence of instructions:
Each SDNode has retained an IR line number, identifying the IR instruction the SDNode originates from,
A mapping of IR line number -> MachineInsts is built in the 'Orders' collection, identifying the first/earliest MachineInst of an IR line number
EmitSchedule later inserts DBG_VALUEs in front of the first instruction that comes after the source dbg.value intrinsic
Obviously with instructions being rescheduled around the place, there's a risk of DBG_VALUEs being re-ordered: this bug is only about the debug use-before-def demonstrated above.
The behaviour that ProcessSourceNode does not account for is that some SDNodes do not create instructions. ProcessSourceNode always takes the last instruction created as being the start of this IR line number (line 765 in [0]), pointing IR line numbers at instructions that are too early.
In the example copied above, we get the following SDNode schedule:
Where the ORD=$num numbers correspond to the "ir line $num" comments I've put in the IR paste above. Because Copy{From,To}Reg do not produce instructions, and ProcessSourceNode always takes the last instruction as the beginning of an IR line, we get the following mappings in the Orders collection, the targets being the instructions in the MIR paste above:
The order mapping for 3 is correct; it's 4 and 5 that are incorrect. We get these because SU(7) and SU(5) are the first SDNodes for each ir line respectively, two Copy nodes which do not create instructions and thus are mapped to the previous instruction, the load of %1. EmitSchedule then determines that for the dbg.value in the source IR on ir line 4, it should insert a DBG_VALUE before the instructions for ir line 5, which it sees as the load of %1. Thus, it places the DBG_VALUE before its operands definition.
~
There's substantial amounts of misery in ProcessSourceNode -- this specific issue and the "at start of block -> nullptr" mapping off-by-ones a variety of DBG_VALUEs. I also have a strong suspicion that the first conditional block of the function (which passes Order=0 to ProcessSDDbgValues, bypassing various checks) is re-ordering DBG_VALUEs where it could be avoided, although I haven't written a test for this yet.
This bug is not related to the placeDbgValues misery-train because the dbg.value for %cast immediately follows its definition -- however other CodeGenPrepare optimisations interfere with the test above, so we need -start-after=codegenprepare. I'm marking this as blocking bug 38754 because my patches for that greatly amplify this issue.
Extended Description
With the code below, llc command line "llc -start-after=codegenprepare -stop-before=expand-isel-pseudos input.ll -o -", using hot-off-the-press llvm r351956, SelectionDAG transforms the single dbg.value intrinsic into a use-before-free DBG_VALUE. The heavily contrived code:
--------8<--------
define i16 @lolwat(i1 %spoons, i64 *%bees, i16 %yellow, i64 *%more) {
entry:
br i1 %spoons, label %trueb, label %falseb
trueb:
br label %block
falseb:
br label %block
block:
%foo = phi i64 *[%bees, %trueb], [%more, %falseb] ; ir line 1
%forks = bitcast i64 *%foo to i32 * ; ir line 2
%ret = load i32, i32 *%forks, !dbg !6 ; ir line 3
%cast = trunc i32 %ret to i16, !dbg !6 ; ir line 4
call void @llvm.dbg.value(metadata i16 %cast, metadata !1, metadata !DIExpression()), !dbg !6
%orly2 = add i16 %yellow, 1 ; ir line 5
br label %bb1 ; ir line 6
bb1:
%cheese = add i16 %orly2, %cast
ret i16 %cheese, !dbg !6
}
declare void @llvm.dbg.value(metadata, metadata, metadata)
!llvm.module.flags = !{#4}
!llvm.dbg.cu = !{#2}
!1 = !DILocalVariable(name: "bees", scope: !5, type: null)
!2 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !3, producer: "beards", isOptimized: true, runtimeVersion: 4, emissionKind: FullDebug)
!3 = !DIFile(filename: "bees.cpp", directory: "")
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = distinct !DISubprogram(name: "nope", scope: !2, file: !3, line: 1, unit: !2)
!6 = !DILocation(line: 0, scope: !5)
!7 = !DILocalVariable(name: "flannel", scope: !5, type: null)
-------->8--------
And the output of basic block 'block' after isel:
--------8<--------
bb.3.block:
successors: %bb.4(0x80000000)
-------->8--------
Observe that the DBG_VALUE of %1 appears before %1 is defined (there are no further definitions of %1 in the program).
This bug is caused by ScheduleDAGSDNode's ProcessSourceNode helper [0]. My understanding of what's going on in this function is that, when SDNodes are scheduled into a sequence of instructions:
Obviously with instructions being rescheduled around the place, there's a risk of DBG_VALUEs being re-ordered: this bug is only about the debug use-before-def demonstrated above.
The behaviour that ProcessSourceNode does not account for is that some SDNodes do not create instructions. ProcessSourceNode always takes the last instruction created as being the start of this IR line number (line 765 in [0]), pointing IR line numbers at instructions that are too early.
In the example copied above, we get the following SDNode schedule:
--------8<--------
SU(9): t2: i64,ch = CopyFromReg [ORD=2] [ID=9]# D:0 t0, Register:i64 %0# D:0
SU(8): t16: i16,ch = MOV16rm<Mem:(load 2 from %ir.forks, align 4)> [ORD=3] [ID=8]# D:0bees.cpp:0 DbgVal(0):"bees" t2, TargetConstant:i8<1>, Register:i64 $noreg# D:0, TargetConstant:i32<0>, Register:i32 $noreg# D:0, t0, bees.cpp:0
SU(7): t8: ch = CopyToReg [ORD=4] [ID=7]# D:0bees.cpp:0 t0, Register:i16 %1# D:0, t16, bees.cpp:0
SU(5): t10: i16,ch = CopyFromReg [ORD=5] [ID=5]# D:0 t0, Register:i16 %7# D:0
SU(6): t23: i32 = IMPLICIT_DEF [ORD=5] [ID=6]# D:0
SU(4): t17: i32 = INSERT_SUBREG [ORD=5] [ID=4]# D:0 IMPLICIT_DEF:i32 [ORD=5] [ID=6]# D:0, t10, TargetConstant:i32<4>
SU(3): t19: i32,i32 = INC32r [ORD=5] [ID=3]# D:0 t17
SU(2): t20: i16 = EXTRACT_SUBREG [ORD=5] [ID=2]# D:0 t19, TargetConstant:i32<4>
SU(1): t14: ch = CopyToReg [ORD=5] [ID=1]# D:0 t0, Register:i16 %2# D:0, t20
SU(0): t15: ch = TokenFactor [ORD=6] [ID=0]# D:0 t8, t14
-------->8--------
Where the ORD=$num numbers correspond to the "ir line $num" comments I've put in the IR paste above. Because Copy{From,To}Reg do not produce instructions, and ProcessSourceNode always takes the last instruction as the beginning of an IR line, we get the following mappings in the Orders collection, the targets being the instructions in the MIR paste above:
2 -> nullptr
3 -> %1:gr16 = MOV16rm
4 -> %1:gr16 = MOV16rm
5 -> %1:gr16 = MOV16rm
The order mapping for 3 is correct; it's 4 and 5 that are incorrect. We get these because SU(7) and SU(5) are the first SDNodes for each ir line respectively, two Copy nodes which do not create instructions and thus are mapped to the previous instruction, the load of %1. EmitSchedule then determines that for the dbg.value in the source IR on ir line 4, it should insert a DBG_VALUE before the instructions for ir line 5, which it sees as the load of %1. Thus, it places the DBG_VALUE before its operands definition.
~
There's substantial amounts of misery in ProcessSourceNode -- this specific issue and the "at start of block -> nullptr" mapping off-by-ones a variety of DBG_VALUEs. I also have a strong suspicion that the first conditional block of the function (which passes Order=0 to ProcessSDDbgValues, bypassing various checks) is re-ordering DBG_VALUEs where it could be avoided, although I haven't written a test for this yet.
This bug is not related to the placeDbgValues misery-train because the dbg.value for %cast immediately follows its definition -- however other CodeGenPrepare optimisations interfere with the test above, so we need -start-after=codegenprepare. I'm marking this as blocking bug 38754 because my patches for that greatly amplify this issue.
[0] https://github.com/llvm-mirror/llvm/blob/27f17bfee31bec92b918f4ca6a6f7a2a37e4d00c/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp#L742
The text was updated successfully, but these errors were encountered: