Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DebugInfo@O2] InstCombine pointer type-cast sinking needlessly reduces dbg location ranges #39133

Closed
jmorse opened this issue Nov 26, 2018 · 5 comments
Labels
bugzilla Issues migrated from bugzilla wrong-debug

Comments

@jmorse
Copy link
Member

jmorse commented Nov 26, 2018

Bugzilla Link 39786
Resolution FIXED
Resolved on Feb 18, 2019 03:53
Version trunk
OS Linux
Blocks #38116 #38102
CC @dwblaikie,@gregbedwell,@CarlosAlbertoEnciso,@pogo59
Fixed by commit(s) 353936

Extended Description

The example program at the bottom of this bug report, using llvm r346686 with options "-O2 -g", demonstrates an example of pointer-typing gratuitously reducing the availability of variable locations. Specifically, the "xyzzy" pointer of "main" isn't visible except for the line "((baz | qux) & 0xaaaaaa) % 199" in the "something" method, despite being alive all the way up to that point. The same for the "this" pointer of "something".

The example code is extremely cooked as I was trying to replicate a fault in bug 38754, when reading it ignore any arithmetic expressions (they're there to defeat SimplifyCFG speculation) and consider only accesses to memory / object members. I've also sprinkled "volatile" to prevent further optimisations of some parts.

Essentially: the whole file gets inlined into the main function, and optimisers determine that the object pointer that's been allocated isn't needed for the "foo" and "bar" members of "xyzzy", they spend their lifetimes in SSA registers. That pointer is then only needed for the writes and reads of "baz" and "qux" because they're volatile. This causes InstCombine to sink the i8* to %class.trains* bitcast instruction for "xyzzy" into the BB for the first return statement in "something", because nothing else uses it. It also sinks the dbg.value for "xyzzy", to prevent use-before-free of the bitcast.

This is a totally valid optimisation, and a totally valid prevention of use-before-free, but the net effect is that the developer can't inspect the object they just allocated (in the "main" function, or the start of the "something" function). Given that the pointer has to be live up til the middle of the "something" method, that's particularly upsetting.

As with bug 39726, the use of a typed pointer artificially makes it look like the dbg.value for xyzzy must be sunk, wheras at the machine level that's not the case. (More generally maybe InstCombine should be leaving a dbg.value(undef...) around to prevent earlier variable locations leaking, but I'll fry that fish some other day).

--------8<--------
#include <stdlib.h>

class trains {
public:
int foo;
int bar;
volatile int baz;
volatile int qux;
trains() {
foo = 1; bar = 2;
baz = 3; qux = 4;
}

attribute((always_inline)) int something() {
volatile int quux = foo + bar;
if (quux > 0)
return ((baz | qux) & 0xaaaaaa) % 199;
else {
if (quux > 10000)
abort();
return (foo * bar) % 200;
}
}
};

attribute((always_inline)) trains *alloc() {
volatile int foo = 0;
auto *p = new trains();
if (foo != 0)
abort();
return p;
}

int main() {
int toreturn = 0;
if (auto xyzzy = alloc()) {
toreturn = xyzzy->something();
}
return toreturn;
}
-------->8--------

@jmorse
Copy link
Member Author

jmorse commented Nov 26, 2018

Curses: I forgot my header saying that this fault only becomes visible when placeDbgValues in CodeGenPrepare gets disabled, see bug 38754. The analysis and example above are still completely valid, and InstCombine still does the problematic sinking regardless, it's just that right now placeDbgValues covers for it through its redistribution of dbg.values. So still a bug IMO, just not immediately visible with the revision I cited.

@dwblaikie
Copy link
Collaborator

Presumably finishing the typeless pointer work might help here

@jmorse
Copy link
Member Author

jmorse commented Nov 28, 2018

David wrote:

Presumably finishing the typeless pointer work might help here

I believe it would; on this topic, I'm aware of some past material on typeless pointers [0], but not a lot about the current status / activity on that topic, are there any mailing list posts / talk recordings / similar around that would help me catch up? My first impression is that it's a long term objective, rather than something that can be immediately applied to InstCombine.

There's an identical problem in MachineSink where a bitcast becomes a COPY, sinks five or six BBs picking up all the DBG_VALUEs along the way, which is more or less a typed-pointer problem too.

In the meantime, inserting some hacks to
a) Simplify dbg.value's through pointer bitcasts, and
b) To not sink dbg.value's with pointer bitcasts
Achieves the desired effect, and leaves dbg.values in the correct BB, but it's unpleasant. I'll be taking some time to work through all the other associated bugs and then determine how best to commit.

[0] https://llvm.org/devmtg/2015-10/slides/Blaikie-OpaquePointerTypes.pdf

@dwblaikie
Copy link
Collaborator

David wrote:

Presumably finishing the typeless pointer work might help here

I believe it would; on this topic, I'm aware of some past material on
typeless pointers [0], but not a lot about the current status / activity on
that topic, are there any mailing list posts / talk recordings / similar
around that would help me catch up? My first impression is that it's a long
term objective

Kind of yes/no.

I started on this work a few years back, but kind of lost my way/burnt out/lost my momentum & went to work on other things.

It'd be good to come back to/finish off, though I'd love some folks to help me with that at some point.

The status is basically: Types were added to most parts of the IR that need it (but required to match the types coming through pointer parameter types, etc) - notable missing pieces include byval/inalloca, which currently figure out the number of bytes from the pointee type, so instead the attribute would need to take a size (like the dereferencable attribute) or a type (like alloca) to determine the number of bytes to copy or whatnot.

After that (& any other missing IR pieces) it's about going through all the passes to make sure they don't depend on this type information (that they look at the explicit types on instructions, etc instead of pointee types of parameters) before removing non-address-space-pointer casts & removing type information from pointers.

@jmorse
Copy link
Member Author

jmorse commented Jan 17, 2019

Candidate patch: https://reviews.llvm.org/D56788

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla wrong-debug
Projects
None yet
Development

No branches or pull requests

2 participants