Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo@O2][Dexter] Illegal value appears in variable when conditional blocks folded #38102

Closed
jmorse opened this issue Aug 29, 2018 · 24 comments
Assignees
Labels
bugzilla Issues migrated from bugzilla wrong-debug

Comments

@jmorse
Copy link
Member

jmorse commented Aug 29, 2018

Bugzilla Link 38754
Resolution FIXED
Resolved on Dec 11, 2019 03:07
Version trunk
OS Linux
Depends On #39102 #39213 #31226 #39073 #39133 #39284 #39287 #39357 #39773
Blocks #38116
CC @adrian-prantl,@bjope,@dwblaikie,@gregbedwell,@CarlosAlbertoEnciso,@JDevlieghere,@pogo59,@vedantk
Fixed by commit(s) 00e2388

Extended Description

With the test below, debuggers report "qux" as having the value two for the whole program when compiled -O2 -g. This is wrong, because not only does "qux" never have the value two, no variable in the program ever does either.

Using an up-to-date toolchain (r340912), and compiling with "clang++ test.cpp -g -O2 -o a.out" for x86_64, printing the "qux" variable with gdb and lldb yields two on all lines past quxes initialisation, wheras if compiled -O0 the correct value of zero then eight is seen.

I've tried to reduce this test further, but that led to the value reported for "qux" changing (but still being a fixed value). My feeling is that whatever calculates the DWARF expression for "qux" is interpreting the not-taken else-blocks wrong in some way, deleting the bodies of those blocks changes debug behaviour too. FWIW, I generated this test while trying to stimulate SimplifyCFG's jump threader (which tries to merge the two if conditions into one), SimplifyEqualityComparisonWithOnlyPredecessor.

Found using DExTer ( https://github.com/SNSystems/dexter ).

-------->8--------
int
main()
{
volatile int foo = 4;

int read = foo;
int read1 = foo;
int read2 = foo;
int baz = 0;
int bar = 0, qux = 0;

if (read == 4) {
baz = 4;
bar = read1 * 2;
qux = read2 * 2;
} else {
baz = read;
bar = read1 / 2;
qux = read2 / 2;
}

bar &= 0xffff;

if (baz == 4) {
bar += baz + qux;
} else {
bar -= baz + qux;
}

return bar;
}
--------8<--------

@jmorse
Copy link
Member Author

jmorse commented Aug 29, 2018

assigned to @jmorse

@jmorse
Copy link
Member Author

jmorse commented Nov 5, 2018

tl;dr: CodeGenPrepare is messing with the test in a broken way, but at the same time it protects against dbg.value's getting dropped for other reasons.

At a basic level, it would appear that the assignment of "qux = read2 / 2" in the false branch of the first condition gets optimised away, and a dbg.value is inserted with an expression that calculates the value of qux from read2. Unfortunately CodeGenPrepare::placeDbgValues [0] then moves that dbg.value to be next to the definition of read2, outside the conditional block, making the assignment visible on a path where it never happens.

placeDbgValues makes my mind boggle, due to its extremely simplistic movement of dbg.values. My understanding of dbg.value's is that their placement needs to be carefully considered, because their movement changes the order of variable valuations the developer sees. This can lead to developers observing states of the program that at best don't make sense, or worst mislead the developer. Simply relocating dbg.values to be adjacent to their Value operand ignores that concern, and causes this bug by making 'qux''s conditional value visible unconditionally.

IMHO placeDbgValues should be killed off: it's not possible to know whether moving dbg.values between BBs is correct at that point in compilation, and doing so should be the responsibility of whatever moved the Value in the first place. Various comments say that CodeGenPrepare is very old, this particular function body originates from r132151 circa 2011 [1], seemingly before DIExpressions could be attached to dbg.values. It's no surprise it doesn't account for bugs such as this bugzilla ticket.

I've experimented with just disabling placeDbgValues; this causes three regression tests to fail, two of which require investigation (they seem to expose other faults). A bigger question though is what happens to the debug experience by doing this: happily the Dexter tests (insert advertorial here) show up five more differences in behaviour, two positive three negative. One negative case is more or less spurious, one smells fixable, but the last seems to represent exactly what placeDbgValues is written to address: a dbg.value that's too far from it's def gets dropped. Curiously it's right next to a use of that Value, so the reason for dropping is unclear.

Anyhoo, I plan on:

  • Digging into the changes in regression test behaviour without placeDbgValues, they seem unrelated,
  • Digging into why SelectionDAG drops a dbg.value right next to a use of the Value
  • Seeing how people feel about killing placeDbgValues; hopefully I haven't missed something significant.

~

Analysis of the regression tests and dexter tests that change behaviour:

Regression tests:

LLVM :: DebugInfo/COFF/register-variables.ll
We no longer hoist dbg.values outside of the BB / scope they were defined in, so they no longer appear before "test %esi, esi". However, they also disappear from in front of "addl $1, %eax", and one insn later "a" incorrectly takes on the value of "b".

LLVM :: DebugInfo/X86/#37234 .ll
Mostly spurious changes: a duplicate DBG_VALUE become a single. A DBG_VALUE of a zerod-register is dropped but is covered by an earlier DBG_VALUE of const-zero with no change in value visible to the user. But then the return value is inexplicably DBG_VALUE'd to undef on the return instruction.

LLVM :: Transforms/CodeGenPrepare/X86/select.ll
Here, debugify directly tests for the behaviour I'm disabling, so it likely doesn't need to be considered.

Dexter tests: these are located here [2] while I'm getting them merged into the main repo, the tests that are affected by the change:

SimplifyCFG/FoldThreading: is this bug

SimplifyCFG/CSEInPred2: is bug 38753, and disabling placeDbgValues fixes that bug report.

GVN/NumGVNLoad: The load of "corge = *ptr" on line 23 is merged with earlier loads of *ptr, however the dbg.value is shifted upwards too, causing line 22 to see the following load early. Disabling placeDbgValues fixes this.

Vectorize/TempLoopVar: The 'tmp' variable calculation in the first loop gets hoisted to the start of the function; with placeDbgValues disabled the dbg.value stays behind. This is normally fine, however some other bug means the later vectorized loops have a line number in the earlier loop: with placeDbgValues enabled "tmp" is visible across pretty much the whole program, but when disabled it's only visible when actually inside that loop. Dexter interprets this as a mild loss in debug experience.

LICM/hoist: The visibility of "hoist2" is reduced, value is optimized out on line 21 (the call to getrandvar(i)). This seems to be because the loop-invariant value of hoist2 is hoisted, but without placeDbgValues its dbg.value is not, and falls behind the call somehow.

SimplifyCFG/MergeConditionalStores: This whole function gets speculated into select insns, except for the store to "global", where the two stores are condensed into one. Unfortunately the control flow at the end of the function looks like this:
%final_lolret = select [...]
br %condition, label %dostore, label %end

dostore:
%global_value = [some calculation]
store i32 %global_value, i32 *@global
br label %end

end:
llvm.dbg.value(%final_lolret, [metadata for lolret])
ret i32 %final_lolret

placeDbgValues moves the dbg.value up to the assignment to %final_lolret, potentially earlier than linenumbers in the dostore block. Without placeDbgValues however the dbg.value remains the penultimate insn and seems to get deleted somewhere else, meaning the final value of lolret is not observable.

[0] https://github.com/llvm-mirror/llvm/blob/47fe3c0d7915103a5b262e2a55a281013b4167e5/lib/CodeGen/CodeGenPrepare.cpp#L6713
[1] https://reviews.llvm.org/rL132151
[2] https://github.com/jmorse/dexter/tree/dextests/tests/nostdlib/llvm_passes

@vedantk
Copy link
Collaborator

vedantk commented Nov 5, 2018

There's some discussion here on the pros/cons of getting rid of placeDbgValues: https://reviews.llvm.org/D46100. To recap;

Pro:

  • placeDbgValues in particular (& CGP in general) is not amazingly principled, and may reorder things when it shouldn't
  • It's more principled to teach individual passes to move debug users to the right place

Con:

  • That's.. a lot of passes to update (anything that does I->moveAfter or similar), and CGP does enough work that we should be skeptical of it ever going away
  • We might get 80-90% of what we want by just teaching placeDbgValues to query a domtree before reordering dbg.values

What I'd recommend is to hack up a domtree query in placeDbgValues. That's a ~1 line change: reorder only if !DT.dominates(defOfValue, debugUseOfValue). Next, run dexter, and/or check how many unique variables with locations we see in a stage2 clang build.

If killing off the modified placeDbgValues doesn't look like a regression, I'd say go for it.

@jmorse
Copy link
Member Author

jmorse commented Nov 7, 2018

Thanks for the info, good to know I'm not the only one looking at placeDbgValues. The domtree query triggers a move 8 million times in a stage2 build, I agree it doesn't sound feasible to make placeDbgValues just go away.

In terms of searching for how many variable locations we get in a stage2 build of clang, I'm not incredibly familiar with the structure of DWARF, my numbers below are from searching by:

  • In every CU, in every subprogram,
  • Descending into all DW_TAG_lexical_block's and inlined_subroutine's,
  • Collecting all DW_TAG_variable's and DW_TAG_formal_parameter that have locations
    Please correct me if there's a limitation I've missed here.

With no changes to clang I see 618797 location lists in a clang-8 binary, with the domtree query guarding placeDbgValues I see 580400 location lists. This is a 6% reduction, which is quite unfortunate.

I believe the remaining location lists have shrunk in coverage too: plotting the number of bytes covered by each location list on a histogram gives https://i.imgur.com/hOyGMbF.png . The x axis is bins in multiples of roughly 1000 bytes of coverage each, y axis the number of locationlists in that bin. Blue series is normal clang, orange has the domtree patch. Having to use a log scale obscures any interesting detail, so focusing on the smallest location lists in https://i.imgur.com/u9WQb8o.png is a bit more revealing: IMO a reasonable number of location lists are getting smaller, leading to more bunching at the bottom of the x axis.

I'm all for trading some variable coverage in exchange for fewer broken debug locations, but it's a nontrivial amount of coverage and it's unclear how many broken debug locations would be fixed (many would live on when the domtree query triggers). Bjorn suggests in D46100 that SelectionDAG limitations might be solvable: I'll try to get a better understanding of those limitations, to see how easy it is to sweeten the deal.

@pogo59
Copy link
Collaborator

pogo59 commented Nov 7, 2018

Jeremy, check with Wolfgang, I think he has a python script for
measuring variable coverage in location lists.

@vedantk
Copy link
Collaborator

vedantk commented Nov 7, 2018

Thanks for the info, good to know I'm not the only one looking at
placeDbgValues. The domtree query triggers a move 8 million times in a
stage2 build, I agree it doesn't sound feasible to make placeDbgValues just
go away.

Just to put a number on this, what happens to scope coverage & # of unique variables with location with placeDbgValues entirely disabled?

In terms of searching for how many variable locations we get in a stage2
build of clang,

Have you tried llvm-dwarfdump --statistics?

I'm not incredibly familiar with the structure of DWARF, my
numbers below are from searching by:

  • In every CU, in every subprogram,
  • Descending into all DW_TAG_lexical_block's and inlined_subroutine's,
  • Collecting all DW_TAG_variable's and DW_TAG_formal_parameter that have
    locations
    Please correct me if there's a limitation I've missed here.

With no changes to clang I see 618797 location lists in a clang-8 binary,
with the domtree query guarding placeDbgValues I see 580400 location lists.
This is a 6% reduction, which is quite unfortunate.

Could be a good thing if it means we're getting rid of bogus dbg.values.

Thanks for looking into this.

I believe the remaining location lists have shrunk in coverage too: plotting
the number of bytes covered by each location list on a histogram gives
https://i.imgur.com/hOyGMbF.png . The x axis is bins in multiples of roughly
1000 bytes of coverage each, y axis the number of locationlists in that bin.
Blue series is normal clang, orange has the domtree patch. Having to use a
log scale obscures any interesting detail, so focusing on the smallest
location lists in https://i.imgur.com/u9WQb8o.png is a bit more revealing:
IMO a reasonable number of location lists are getting smaller, leading to
more bunching at the bottom of the x axis.

I'm all for trading some variable coverage in exchange for fewer broken
debug locations, but it's a nontrivial amount of coverage and it's unclear
how many broken debug locations would be fixed (many would live on when the
domtree query triggers). Bjorn suggests in D46100 that SelectionDAG
limitations might be solvable: I'll try to get a better understanding of
those limitations, to see how easy it is to sweeten the deal.

@jmorse
Copy link
Member Author

jmorse commented Nov 8, 2018

Vedant wrote:

Have you tried llvm-dwarfdump --statistics?

Cheers, the more you know... getting actually-correct numbers with that option then:

Normal clang (r345852):
"unique source variables":13953593
"source variables":1270949712
"variables with location":17517510
"scope bytes total":806358864
"scope bytes covered":354600297

Domtree query patch'd clang:
"unique source variables":13913574
"source variables":1248321854
"variables with location":17193356
"scope bytes total":796640899
"scope bytes covered":353297950

No placeDbgValues at all:
"unique source variables":13910639
"source variables":1234266567
"variables with location":17150695
"scope bytes total":795530393
"scope bytes covered":353621572

With the domtree patch then, we have 2% fewer vars with locations, and a little more than 1% reduction in covered bytes. There's relatively little difference between the domtree-querying version and no placeDbgValues at all. I didn't fully realise until now, however, that dbg.value use-before-def is legal in LLVM.

A 2% reduction in variable coverage in exchange for eliminating known-broken behaviour is a worthy trade-off in my opinion: a small number of incorrect variable valuations damages confidence in all valuations, after all. Digging into SelectionDAG shows there's no low-hanging-fruit that recover some of the lost variables, so I doubt I can sweeten the deal further.

I'll look at a random sample of changed variables to see if there're any obvious patterns, and the regression tests that fail. As it turns out that clang can cope without placeDbgValues at all, and the modified method doesn't salvage very many variables when left in, I'm leaning towards completely killing placeDbgValues, other opinions most welcome though.

@vedantk
Copy link
Collaborator

vedantk commented Nov 8, 2018

Vedant wrote:

Have you tried llvm-dwarfdump --statistics?

Cheers, the more you know... getting actually-correct numbers with that
option then:

Normal clang (r345852):
"unique source variables":13953593
"source variables":1270949712
"variables with location":17517510
"scope bytes total":806358864
"scope bytes covered":354600297

Domtree query patch'd clang:
"unique source variables":13913574
"source variables":1248321854
"variables with location":17193356
"scope bytes total":796640899
"scope bytes covered":353297950

No placeDbgValues at all:
"unique source variables":13910639
"source variables":1234266567
"variables with location":17150695
"scope bytes total":795530393
"scope bytes covered":353621572

Based on this, it doesn't look like placeDbgValues really pays for its complexity. As you pointed out, it'd be good to double-check that the variables-with-location we lose by disabling placeDbgValues really are invalid (at least, more often than not).

With the domtree patch then, we have 2% fewer vars with locations, and a
little more than 1% reduction in covered bytes. There's relatively little
difference between the domtree-querying version and no placeDbgValues at
all. I didn't fully realise until now, however, that dbg.value
use-before-def is legal in LLVM.

Definitely not ideal. Flipping the switch and making it illegal is more work than I expected it to be.

A 2% reduction in variable coverage in exchange for eliminating known-broken
behaviour is a worthy trade-off in my opinion: a small number of incorrect
variable valuations damages confidence in all valuations, after all. Digging
into SelectionDAG shows there's no low-hanging-fruit that recover some of
the lost variables, so I doubt I can sweeten the deal further.

I'll look at a random sample of changed variables to see if there're any
obvious patterns, and the regression tests that fail. As it turns out that
clang can cope without placeDbgValues at all, and the modified method
doesn't salvage very many variables when left in, I'm leaning towards
completely killing placeDbgValues, other opinions most welcome though.

@jmorse
Copy link
Member Author

jmorse commented Nov 19, 2018

tl;dr: There's actually some quite problematic behaviour in SelectionDAG going
on behind the scenes that causes painful regressions if we just delete
placeDbgValues right now. A more incremental approach is likely better.

I've dived into the regression test failures, the deteriorating dexter
tests, and the dropped variable coverage in clang: there are some WIP patches
below that improve those matters. On the topic of clang variable coverage,
there are mixed results. Examining variables found to be missing by
llvm-dwarfdump --statistics, fully a third of them are "this" variables
of tiny methods that have been heavily inlined.

Digging deeply through inlining proved labour intensive unfortunately, so
looking only at non-inlined functions where locations go missing, I noticed
a few themes:

  • In several locations dbg.values move only a single BB, typically from a
    loop pre-header or condition evaluation, but get dropped despite almost
    certainly still being alive. In [0] the call to getPropertyAttributes
    is made redundant / merged with a call in the if-condition, but the
    DBG_VALUE for "Attrs" is dropped for reasons unknown. In the same theme,
    in [1] the "vla" pointer is dropped, apparently because the dbg.value is
    in the conditional BB and the value definition is outside.
  • Roughly half of the not-inlined instances I looked at were named
    "__begin" / "__end" / "__range" variables, which I assume to be
    implementation details of range-based loops that no-one cares about
    too much,
  • In this [2] function a large amount of constructor and method are inlined,
    and the "DAG" pointer is never used with the declared type of DAG after
    the constructor finishes (the return statement manually bitcasts the
    return Value of new). When the dbg.value for "DAG" is left in the "right"
    location after the constructor, there are then no users to keep it alive,
    and it's never emitted. Previously with placeDbgValues enabled it was
    moved to immediately after "new", and has its lifetime extended much later
    in the pipeline.
  • Only two of the changes observed were due to use-before-def or
    non-dominated-use (out of about twenty changes fully examined). I didn't
    see any examples of bogus DIExpressions being eliminated at all.

My feeling is that SelectionDAG still seems to make many mistakes when
preserving locations across basic blocks (which placeDbgValues has been
masking); there's also something funky with pointer types that I haven't dug
far enough into yet.

Obviously none of this is ideal: however I think we can consider it a gift. With
placeDbgValues enabled we know that variable locations can be covered, but
with placeDbgValues disabled we drop variable locations almost certainly due to
bugs in SelectionDAG or thereabouts. This is thus a goldmine for bug hunting.

The examples where very simple code has variables dropped (in particular
[0, 1]) swings me back in the direction of not killing off placeDbgValues
immediately, they'd be quite irritating regressions to introduce. IMHO the most
stable path forwards would be:

  • Add some hidden option, --disable-place-dbg-values, which disables
    placeDbgValues in CGP
  • Start hunting & fixing regressions, building up a suite of tests with that
    option enabled
  • Switch over to just deleting placeDbgValues some time in the future, and
    remove --disable-place-dbg-values from tests
    An alternative would be using -start-after=codegenprepare for tests, although
    then we wouldn't be testing as much of LLVM as we could. Opinions on this
    approach / alternatives most welcome.

In the meantime I've got two WIP patches [3, 4] that improve SelectionDAGs
DBG_VALUE dropping behaviour a little, and two [5, 6] that work around
MachineInstr::collectDebugValue's assumption that DBG_VALUEs always follow
the relevant definition (which disabling placeDbgValues would violate).
I'll continue to look into missing locations uncovered by disabling
placeDbgValues.

[0] https://github.com/llvm-mirror/clang/blob/11fd5cf9fd813938a0d3140fb878851c22d67a65/lib/Sema/SemaObjCProperty.cpp#L381
[1] https://github.com/llvm-mirror/clang/blob/b6508aa4110621b4c5441dd67013cee4ec0d3036/lib/CodeGen/CGExpr.cpp#L3331
[2] https://github.com/llvm-mirror/llvm/blob/c8c80b0f2f4fecead57d6bfdb6afd894c59361cc/lib/CodeGen/MachineScheduler.cpp#L3246
[3] https://reviews.llvm.org/D54712
[4] https://reviews.llvm.org/D54715
[5] https://reviews.llvm.org/D54716
[6] https://reviews.llvm.org/D54717

@dwblaikie
Copy link
Collaborator

Awesome research - one rather minor point:

"An alternative would be using -start-after=codegenprepare for tests, although
then we wouldn't be testing as much of LLVM as we could. Opinions on this
approach / alternatives most welcome."

Generally, tests should be isolated down to a fairly narrow degree (like a single pass), like in the middle end. Backend tests have historically had larger, excessive scope due to the lack of MIR and control over specific backend passes.

So testing more narrowly isn't a bug, it's desirable.

(though I wouldn't mind a flag as well, but sounds like it may not be necessary)

@pogo59
Copy link
Collaborator

pogo59 commented Nov 19, 2018

What David said.

A couple of other points:

Examining variables found to be missing by
llvm-dwarfdump --statistics, fully a third of them are "this" variables
of tiny methods that have been heavily inlined.

It sounded like you were taking the direction of not pursuing the
inlining difficulties right now, which might mean those cases want to
be spun off to their own bug.

I haven't looked at it myself, but based on past experience with other
compilers, my guess is that we have a "this" identifier at multiple
scopes all trying to share the same register in the same range. My
preference would be to build a way for LLVM to recognize that truth,
and be able to associate all those identifiers with the register where
their value does in fact live. Right now I think we treat ranges as
always disjoint, and they are not.

  • Roughly half of the not-inlined instances I looked at were named
    "__begin" / "__end" / "__range" variables, which I assume to be
    implementation details of range-based loops that no-one cares about
    too much,

We (Sony) had a request to make sure these were uniquely named, so
somebody does actually care about these artificial variables. Loop
optimizations are yet another beast, and those probably are already
gone before you get to SDAG. If that's the case, again probably
best to spin those off to their own bug.

Great work, thanks for pursuing these!

@jmorse
Copy link
Member Author

jmorse commented Dec 3, 2018

Sitrep: I'm still drilling into the issues exposed by disabling placeDbgValues, and opening corresponding bugs that block this one. Using some ugly hacks to resolve the currently open bugs recovers about half of the lost variables. llvm-dwarfdump (I'm using a 6.0 binary) reports:

"variables with location":17342168

for a recent build comparable to the numbers above.

Of the remaining variables I'm looking at, some of the coverage is beginning to appear bogus. An excellent example is this [0] "Var" variable: the computation for Var gets either common'd or hoisted to be ahead of the surrounding switch statement. Using trunk (r346686, now slightly stale) placeDbgValues hoists the dbg.value to being outside of the switch too; in the output binary the location list ends up covering three x86 instructions at the start of the switch, and nothing else. Line numbers from the relevant switch case do make it into the output binary, none of which are covered. This, of course, is useless to a developer.

This raises the possibility that some of the currently covered locations don't actually correspond to linenumbers that make sense, and where perhaps useful coverage is impossible. I'll take a shot at measuring this sometime soon, I reckon if I:

  • Keep placeDbgValues enabled
  • Drop a constant-valued dbg.value where placeDbgValues moves
    instructions from
  • Examine the output location lists to see whether the constant-value
    range interrupts another live-range
    that will highlight circumstances where the current placeDbgValue behaviour correctly covers the intended portion of code (and thus dropping the variable is a likely bug), and those where it doesn't.

[0] https://github.com/llvm-mirror/clang/blob/aa528ab4a083268edddd038d0f251afe792cfa36/lib/CodeGen/CGBlocks.cpp#L1858

@jmorse
Copy link
Member Author

jmorse commented Jan 2, 2019

Sitrep: here are some statistics using clang builds that have:

  • Plain r350008
  • Same revision with my raft of changes and placeDbgValues disabled
  • Same revision with the raft of changes, but placeDbgValues unchanged
    The point being to compare current clang, the benefits of the changes, and how much a difference placeDbgValues being on and off makes. Full json blobs at the bottom, I've used the llvm-dwarfdump from trunk rather than the -6 release as it appears to generate saner numbers, so this isn't comparable to earlier stats. Headline figures:

plain clang:
Variable coverage: 61.6%
Scope bytes coverage: 41.9%

with changes and placeDbgValues disabled:
Variable coverage: 61.6%
Scope bytes coverage: 47.1%

with changes but placeDbgValues enabled:
Variable coverage: 61.7%
Scope bytes coverage: 46.3%

Thus, in terms of variable coverage there's no serious regression, although toggling placeDbgValues on and off does make a slight difference. The real benefit turns out to be the range that variables cover rather than the number of variables covered (aside from fixing placeDbgValues making conditional assignments appear unconditional, which was the original point of this ticket). For the example in comment 11, rather than covering 9 bytes at the top of the switch statement roughly 260 bytes are covered across the center of the function.

Each change I've been working on is fairly independent, I'll start uploading patches sometime soon.

Statistics blobs, note the absolute number of variables doesn't line up between them as my stage2 builds were of my current repo rather than a fixed checkout (alas).
Clang r350008: {"version":1,"file":"./plain-r350008-clang-8","format":"ELF64-x86-64","source functions":374444,"inlined functions":4642745,"unique source variables":1215521,"source variables":9593004,"variables with location":5915224,"call site entries":0,"scope bytes total":855512828,"scope bytes covered":359067613,"total function size":53947782,"total inlined function size":41194523}
Clang with changes: {"version":1,"file":"./curstatus-clang8","format":"ELF64-x86-64","source functions":374607,"inlined functions":4643563,"unique source variables":1211262,"source variables":9582673,"variables with location":5909594,"call site entries":0,"scope bytes total":851322505,"scope bytes covered":401823565,"total function size":53958093,"total inlined function size":41208557}
Clang with changes and placedbgvalues: {"version":1,"file":"./curstatus-wplacedbg-clang8","format":"ELF64-x86-64","source functions":374608,"inlined functions":4643679,"unique source variables":1211329,"source variables":9582954,"variables with location":5914589,"call site entries":0,"scope bytes total":852109102,"scope bytes covered":394982541,"total function size":53958959,"total inlined function size":41209076}

@jmorse
Copy link
Member Author

jmorse commented Feb 20, 2019

Candidate patch: https://reviews.llvm.org/D58453 (and dependencies).

It turns out the statistics from my last comment were rubbish due to extra bugs I was introducing. As discussed in the linked patch, variable location coverage drops by 1.8% or so by disabling placeDbgValues, and I haven't found a way to fix the rest of the dropped variable locations.

IMO, this is almost entirely because the dbg.value intrinsics are referring to Values at points in the program where they've ceased to be live. With placeDbgValues this almost never happens, because dbg.values are shifted to a position where they're guaranteed to be live (hence the re-ordering). Thus the final patch has a trade-off: to avoid mass-reordering of variable locations (as in the original bug report here), we have to accept other variables going out of liveness. Which IMO is a true-er reflection of the original program.

@jmorse
Copy link
Member Author

jmorse commented Dec 11, 2019

placeDbgValues was significantly limited in 00e2388, and it looks like the patch is sticking. There's still some work to be done with debug use-before-defs (which are already an error-state), but the vast majority of faulty behaviour should now be fixed. Horray!

(I'll clean up the other tickets this depends on, which fall into the category of "things that Jeremy saw and bothered him", at some other time).

@jmorse
Copy link
Member Author

jmorse commented Nov 27, 2021

mentioned in issue #38116

@jmorse
Copy link
Member Author

jmorse commented Nov 27, 2021

mentioned in issue #39073

@jmorse
Copy link
Member Author

jmorse commented Nov 27, 2021

mentioned in issue #39102

@jmorse
Copy link
Member Author

jmorse commented Nov 27, 2021

mentioned in issue #39133

@jmorse
Copy link
Member Author

jmorse commented Nov 27, 2021

mentioned in issue #39213

@jmorse
Copy link
Member Author

jmorse commented Nov 27, 2021

mentioned in issue #39284

@jmorse
Copy link
Member Author

jmorse commented Nov 27, 2021

mentioned in issue #39287

@jmorse
Copy link
Member Author

jmorse commented Nov 27, 2021

mentioned in issue #39357

@jmorse
Copy link
Member Author

jmorse commented Nov 27, 2021

mentioned in issue #39773

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla wrong-debug
Projects
None yet
Development

No branches or pull requests

4 participants