-
Notifications
You must be signed in to change notification settings - Fork 12.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DebugInfo@O2][Dexter] Illegal value appears in variable when conditional blocks folded #38102
Comments
assigned to @jmorse |
tl;dr: CodeGenPrepare is messing with the test in a broken way, but at the same time it protects against dbg.value's getting dropped for other reasons. At a basic level, it would appear that the assignment of "qux = read2 / 2" in the false branch of the first condition gets optimised away, and a dbg.value is inserted with an expression that calculates the value of qux from read2. Unfortunately CodeGenPrepare::placeDbgValues [0] then moves that dbg.value to be next to the definition of read2, outside the conditional block, making the assignment visible on a path where it never happens. placeDbgValues makes my mind boggle, due to its extremely simplistic movement of dbg.values. My understanding of dbg.value's is that their placement needs to be carefully considered, because their movement changes the order of variable valuations the developer sees. This can lead to developers observing states of the program that at best don't make sense, or worst mislead the developer. Simply relocating dbg.values to be adjacent to their Value operand ignores that concern, and causes this bug by making 'qux''s conditional value visible unconditionally. IMHO placeDbgValues should be killed off: it's not possible to know whether moving dbg.values between BBs is correct at that point in compilation, and doing so should be the responsibility of whatever moved the Value in the first place. Various comments say that CodeGenPrepare is very old, this particular function body originates from r132151 circa 2011 [1], seemingly before DIExpressions could be attached to dbg.values. It's no surprise it doesn't account for bugs such as this bugzilla ticket. I've experimented with just disabling placeDbgValues; this causes three regression tests to fail, two of which require investigation (they seem to expose other faults). A bigger question though is what happens to the debug experience by doing this: happily the Dexter tests (insert advertorial here) show up five more differences in behaviour, two positive three negative. One negative case is more or less spurious, one smells fixable, but the last seems to represent exactly what placeDbgValues is written to address: a dbg.value that's too far from it's def gets dropped. Curiously it's right next to a use of that Value, so the reason for dropping is unclear. Anyhoo, I plan on:
~ Analysis of the regression tests and dexter tests that change behaviour: Regression tests: LLVM :: DebugInfo/COFF/register-variables.ll LLVM :: DebugInfo/X86/#37234 .ll LLVM :: Transforms/CodeGenPrepare/X86/select.ll Dexter tests: these are located here [2] while I'm getting them merged into the main repo, the tests that are affected by the change: SimplifyCFG/FoldThreading: is this bug SimplifyCFG/CSEInPred2: is bug 38753, and disabling placeDbgValues fixes that bug report. GVN/NumGVNLoad: The load of "corge = *ptr" on line 23 is merged with earlier loads of *ptr, however the dbg.value is shifted upwards too, causing line 22 to see the following load early. Disabling placeDbgValues fixes this. Vectorize/TempLoopVar: The 'tmp' variable calculation in the first loop gets hoisted to the start of the function; with placeDbgValues disabled the dbg.value stays behind. This is normally fine, however some other bug means the later vectorized loops have a line number in the earlier loop: with placeDbgValues enabled "tmp" is visible across pretty much the whole program, but when disabled it's only visible when actually inside that loop. Dexter interprets this as a mild loss in debug experience. LICM/hoist: The visibility of "hoist2" is reduced, value is optimized out on line 21 (the call to getrandvar(i)). This seems to be because the loop-invariant value of hoist2 is hoisted, but without placeDbgValues its dbg.value is not, and falls behind the call somehow. SimplifyCFG/MergeConditionalStores: This whole function gets speculated into select insns, except for the store to "global", where the two stores are condensed into one. Unfortunately the control flow at the end of the function looks like this: dostore: end: placeDbgValues moves the dbg.value up to the assignment to %final_lolret, potentially earlier than linenumbers in the dostore block. Without placeDbgValues however the dbg.value remains the penultimate insn and seems to get deleted somewhere else, meaning the final value of lolret is not observable. [0] https://github.com/llvm-mirror/llvm/blob/47fe3c0d7915103a5b262e2a55a281013b4167e5/lib/CodeGen/CodeGenPrepare.cpp#L6713 |
There's some discussion here on the pros/cons of getting rid of placeDbgValues: https://reviews.llvm.org/D46100. To recap; Pro:
Con:
What I'd recommend is to hack up a domtree query in placeDbgValues. That's a ~1 line change: reorder only if !DT.dominates(defOfValue, debugUseOfValue). Next, run dexter, and/or check how many unique variables with locations we see in a stage2 clang build. If killing off the modified placeDbgValues doesn't look like a regression, I'd say go for it. |
Thanks for the info, good to know I'm not the only one looking at placeDbgValues. The domtree query triggers a move 8 million times in a stage2 build, I agree it doesn't sound feasible to make placeDbgValues just go away. In terms of searching for how many variable locations we get in a stage2 build of clang, I'm not incredibly familiar with the structure of DWARF, my numbers below are from searching by:
With no changes to clang I see 618797 location lists in a clang-8 binary, with the domtree query guarding placeDbgValues I see 580400 location lists. This is a 6% reduction, which is quite unfortunate. I believe the remaining location lists have shrunk in coverage too: plotting the number of bytes covered by each location list on a histogram gives https://i.imgur.com/hOyGMbF.png . The x axis is bins in multiples of roughly 1000 bytes of coverage each, y axis the number of locationlists in that bin. Blue series is normal clang, orange has the domtree patch. Having to use a log scale obscures any interesting detail, so focusing on the smallest location lists in https://i.imgur.com/u9WQb8o.png is a bit more revealing: IMO a reasonable number of location lists are getting smaller, leading to more bunching at the bottom of the x axis. I'm all for trading some variable coverage in exchange for fewer broken debug locations, but it's a nontrivial amount of coverage and it's unclear how many broken debug locations would be fixed (many would live on when the domtree query triggers). Bjorn suggests in D46100 that SelectionDAG limitations might be solvable: I'll try to get a better understanding of those limitations, to see how easy it is to sweeten the deal. |
Jeremy, check with Wolfgang, I think he has a python script for |
Just to put a number on this, what happens to scope coverage & # of unique variables with location with placeDbgValues entirely disabled?
Have you tried llvm-dwarfdump --statistics?
Could be a good thing if it means we're getting rid of bogus dbg.values. Thanks for looking into this.
|
Vedant wrote:
Cheers, the more you know... getting actually-correct numbers with that option then: Normal clang (r345852): Domtree query patch'd clang: No placeDbgValues at all: With the domtree patch then, we have 2% fewer vars with locations, and a little more than 1% reduction in covered bytes. There's relatively little difference between the domtree-querying version and no placeDbgValues at all. I didn't fully realise until now, however, that dbg.value use-before-def is legal in LLVM. A 2% reduction in variable coverage in exchange for eliminating known-broken behaviour is a worthy trade-off in my opinion: a small number of incorrect variable valuations damages confidence in all valuations, after all. Digging into SelectionDAG shows there's no low-hanging-fruit that recover some of the lost variables, so I doubt I can sweeten the deal further. I'll look at a random sample of changed variables to see if there're any obvious patterns, and the regression tests that fail. As it turns out that clang can cope without placeDbgValues at all, and the modified method doesn't salvage very many variables when left in, I'm leaning towards completely killing placeDbgValues, other opinions most welcome though. |
Based on this, it doesn't look like placeDbgValues really pays for its complexity. As you pointed out, it'd be good to double-check that the variables-with-location we lose by disabling placeDbgValues really are invalid (at least, more often than not).
Definitely not ideal. Flipping the switch and making it illegal is more work than I expected it to be.
|
tl;dr: There's actually some quite problematic behaviour in SelectionDAG going I've dived into the regression test failures, the deteriorating dexter Digging deeply through inlining proved labour intensive unfortunately, so
My feeling is that SelectionDAG still seems to make many mistakes when Obviously none of this is ideal: however I think we can consider it a gift. With The examples where very simple code has variables dropped (in particular
In the meantime I've got two WIP patches [3, 4] that improve SelectionDAGs [0] https://github.com/llvm-mirror/clang/blob/11fd5cf9fd813938a0d3140fb878851c22d67a65/lib/Sema/SemaObjCProperty.cpp#L381 |
Awesome research - one rather minor point: "An alternative would be using -start-after=codegenprepare for tests, although Generally, tests should be isolated down to a fairly narrow degree (like a single pass), like in the middle end. Backend tests have historically had larger, excessive scope due to the lack of MIR and control over specific backend passes. So testing more narrowly isn't a bug, it's desirable. (though I wouldn't mind a flag as well, but sounds like it may not be necessary) |
What David said. A couple of other points:
It sounded like you were taking the direction of not pursuing the I haven't looked at it myself, but based on past experience with other
We (Sony) had a request to make sure these were uniquely named, so Great work, thanks for pursuing these! |
Sitrep: I'm still drilling into the issues exposed by disabling placeDbgValues, and opening corresponding bugs that block this one. Using some ugly hacks to resolve the currently open bugs recovers about half of the lost variables. llvm-dwarfdump (I'm using a 6.0 binary) reports: "variables with location":17342168 for a recent build comparable to the numbers above. Of the remaining variables I'm looking at, some of the coverage is beginning to appear bogus. An excellent example is this [0] "Var" variable: the computation for Var gets either common'd or hoisted to be ahead of the surrounding switch statement. Using trunk (r346686, now slightly stale) placeDbgValues hoists the dbg.value to being outside of the switch too; in the output binary the location list ends up covering three x86 instructions at the start of the switch, and nothing else. Line numbers from the relevant switch case do make it into the output binary, none of which are covered. This, of course, is useless to a developer. This raises the possibility that some of the currently covered locations don't actually correspond to linenumbers that make sense, and where perhaps useful coverage is impossible. I'll take a shot at measuring this sometime soon, I reckon if I:
|
Sitrep: here are some statistics using clang builds that have:
plain clang: with changes and placeDbgValues disabled: with changes but placeDbgValues enabled: Thus, in terms of variable coverage there's no serious regression, although toggling placeDbgValues on and off does make a slight difference. The real benefit turns out to be the range that variables cover rather than the number of variables covered (aside from fixing placeDbgValues making conditional assignments appear unconditional, which was the original point of this ticket). For the example in comment 11, rather than covering 9 bytes at the top of the switch statement roughly 260 bytes are covered across the center of the function. Each change I've been working on is fairly independent, I'll start uploading patches sometime soon. Statistics blobs, note the absolute number of variables doesn't line up between them as my stage2 builds were of my current repo rather than a fixed checkout (alas). |
Candidate patch: https://reviews.llvm.org/D58453 (and dependencies). It turns out the statistics from my last comment were rubbish due to extra bugs I was introducing. As discussed in the linked patch, variable location coverage drops by 1.8% or so by disabling placeDbgValues, and I haven't found a way to fix the rest of the dropped variable locations. IMO, this is almost entirely because the dbg.value intrinsics are referring to Values at points in the program where they've ceased to be live. With placeDbgValues this almost never happens, because dbg.values are shifted to a position where they're guaranteed to be live (hence the re-ordering). Thus the final patch has a trade-off: to avoid mass-reordering of variable locations (as in the original bug report here), we have to accept other variables going out of liveness. Which IMO is a true-er reflection of the original program. |
placeDbgValues was significantly limited in 00e2388, and it looks like the patch is sticking. There's still some work to be done with debug use-before-defs (which are already an error-state), but the vast majority of faulty behaviour should now be fixed. Horray! (I'll clean up the other tickets this depends on, which fall into the category of "things that Jeremy saw and bothered him", at some other time). |
mentioned in issue #38116 |
mentioned in issue #39073 |
mentioned in issue #39102 |
mentioned in issue #39133 |
mentioned in issue #39213 |
mentioned in issue #39284 |
mentioned in issue #39287 |
mentioned in issue #39357 |
mentioned in issue #39773 |
Extended Description
With the test below, debuggers report "qux" as having the value two for the whole program when compiled -O2 -g. This is wrong, because not only does "qux" never have the value two, no variable in the program ever does either.
Using an up-to-date toolchain (r340912), and compiling with "clang++ test.cpp -g -O2 -o a.out" for x86_64, printing the "qux" variable with gdb and lldb yields two on all lines past quxes initialisation, wheras if compiled -O0 the correct value of zero then eight is seen.
I've tried to reduce this test further, but that led to the value reported for "qux" changing (but still being a fixed value). My feeling is that whatever calculates the DWARF expression for "qux" is interpreting the not-taken else-blocks wrong in some way, deleting the bodies of those blocks changes debug behaviour too. FWIW, I generated this test while trying to stimulate SimplifyCFG's jump threader (which tries to merge the two if conditions into one), SimplifyEqualityComparisonWithOnlyPredecessor.
Found using DExTer ( https://github.com/SNSystems/dexter ).
-------->8--------
int
main()
{
volatile int foo = 4;
int read = foo;
int read1 = foo;
int read2 = foo;
int baz = 0;
int bar = 0, qux = 0;
if (read == 4) {
baz = 4;
bar = read1 * 2;
qux = read2 * 2;
} else {
baz = read;
bar = read1 / 2;
qux = read2 / 2;
}
bar &= 0xffff;
if (baz == 4) {
bar += baz + qux;
} else {
bar -= baz + qux;
}
return bar;
}
--------8<--------
The text was updated successfully, but these errors were encountered: