You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In this [0] review Bjorn suggested that introducing new dbg.value records that use DW_OP_deref might be sketchy (see inline comments). It turns out that adding DW_OP_deref is indeed unsafe, and already leads to poor debug experience on trunk (I'm using r352480). Compile the following with -O2 -g -fno-linine:
--------8<--------
int
foo(int *bar, int arg, int more)
{
int redundant = *bar;
int loaded = *bar;
arg &= more + loaded;
*bar = 0;
return more + *bar;
}
int
main() {
int lala = 987654;
return foo(&lala, 1, 2);
}
-------->8--------
Here, the two loads of *bar get CSE'd and "redundant" is salvaged. It picks up a DW_OP_deref to achieve this. There are two line numbers in "foo" that one can step onto with gdb: "*bar = 0" and the return statement. On the first, printing "redundant" produces 987654, on the second printing "redundant" produces 0. This isn't an accurate representation of the original program as the value of redundant should not change.
Off the top of my head I can't see any way to fix salvage operations that add DW_OP_deref: while we could terminate the location range to stop at the next write to memory, I doubt any LLVM passes that move memory instructions currently examine what intervening dbg.values do, meaning there's a risk of dbg.values seeing the wrong stored value if stores get moved.
IMHO the easiest quickest soundest fix is to not salvage such instructions: doing so on a clang-3.4 build gives a 1.1% drop in variable location coverage and 1% drop in scope bytes coverage. This would suck quite considerably, however AFAIK there are currently no guard-rails for this kind of DIExpression.
Fix review is in https://reviews.llvm.org/D57962, committed in r353824, I'm copy&pasting an example from the review here so that future archaeologists have examples all in one place:
[...] I think the greater issue is that with DW_OP_derefs present it's difficult to determine when code movement in optimisation passes affects dbg.values. Consider this example, compiling "-O2 -g -fno-inline -fno-unroll-loops" with clang@r353515:
static int qux[] = { 1, 2, 3, 4 };
int
foo(int baz, int *out)
{
int sum = 0;
for (int i = 0; i < 4; i++) {
int extra = *out;
sum += qux[i] * baz;
sum %= 4;
*out = sum;
}
return sum;
}
int
main(int argc, char **argv)
{
int out = 12;
return foo(argc, &out);
}
(The assign to 'extra' is contrived but eliminating loads is something LLVM does all the time I believe). In this code, LICM promotes '*out' to an SSA register for the body of the loop. However, before LICM the dbg.value for 'extra' has its load operand folded into a DW_OP_deref, and points at the variable in 'main'. Stepping through this program in gdb, 'extra' always reads as '12', rather than any of the values computed in the loop. Invalidating the dbg.value at the next memory store wouldn't help, as the location specified by the dbg.value never holds the value we're interested in.
My main point here is that not only would LICM need to be taught that moving the store to '*out' could invalidate some dbg.values, determining which dbg.values are affected would involve digging into DIExpressions looking for dereferences, potentially through GEPs that were folded in too, possibly even requiring alias analysis on any dbg.value with a deref. Which (as far as I'm aware) is a reasonably large amount of code complexity and computation. DeadStoreElimination would need to behave similarly (a dbg.value might try to point at the memory of a store that's later eliminated).
Extended Description
In this [0] review Bjorn suggested that introducing new dbg.value records that use DW_OP_deref might be sketchy (see inline comments). It turns out that adding DW_OP_deref is indeed unsafe, and already leads to poor debug experience on trunk (I'm using r352480). Compile the following with -O2 -g -fno-linine:
--------8<--------
int
foo(int *bar, int arg, int more)
{
int redundant = *bar;
int loaded = *bar;
arg &= more + loaded;
*bar = 0;
return more + *bar;
}
int
main() {
int lala = 987654;
return foo(&lala, 1, 2);
}
-------->8--------
Here, the two loads of *bar get CSE'd and "redundant" is salvaged. It picks up a DW_OP_deref to achieve this. There are two line numbers in "foo" that one can step onto with gdb: "*bar = 0" and the return statement. On the first, printing "redundant" produces 987654, on the second printing "redundant" produces 0. This isn't an accurate representation of the original program as the value of redundant should not change.
Off the top of my head I can't see any way to fix salvage operations that add DW_OP_deref: while we could terminate the location range to stop at the next write to memory, I doubt any LLVM passes that move memory instructions currently examine what intervening dbg.values do, meaning there's a risk of dbg.values seeing the wrong stored value if stores get moved.
IMHO the easiest quickest soundest fix is to not salvage such instructions: doing so on a clang-3.4 build gives a 1.1% drop in variable location coverage and 1% drop in scope bytes coverage. This would suck quite considerably, however AFAIK there are currently no guard-rails for this kind of DIExpression.
[0] https://reviews.llvm.org/D56788
The text was updated successfully, but these errors were encountered: