-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refine effects based on optimizer-derived information #50805
Conversation
76d19dc
to
2ab07cc
Compare
The current `:consistent` effect bit carries dual meanings: 1. "the return value is always consistent" 2. "this method does not cause any undefined behavior". This design makes the effect bit unclear and hard to manage. Specifically, the current design prevents a post-inference analysis (as discussed in #50805) from safely refining "consistent"-cy using post-optimization state IR. This is because it is impossible to tell whether the `:consistent`-cy has been influenced by the first or second meaning. To address this, this commit splits them into two distinct effect bits: `:consistent` for consistent return values and `:noub` for no undefined behavior.
The current `:consistent` effect bit carries dual meanings: 1. "the return value is always consistent" 2. "this method does not cause any undefined behavior". This design makes the effect bit unclear and hard to manage. Specifically, the current design prevents a post-inference analysis (as discussed in #50805) from safely refining "consistent"-cy using post-optimization state IR. This is because it is impossible to tell whether the `:consistent`-cy has been influenced by the first or second meaning. To address this, this commit splits them into two distinct effect bits: `:consistent` for consistent return values and `:noub` for no undefined behavior. This commit also introduces an override mechanism for `:noub` as it is necessary for `@assume_effects` to concrete-evaluate the annotated methods. While this might sound risky and not in line with the existing designs of `:nonoverlayed` and `:noinbounds`, where their overrides are prohibited, but we already have an override mechanism in place for `:consistent`, which implicitly overrides `:noub`. Given this precedent, the override for `:noub` should probably be justified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, there is a bit of a complication that we have generally so far taken the position that the optimizer may do non-IPO-safe optimizations, although in practice we never actually implemented any. This was a sensible choice, because we weren't really doing anything with the post-optimized IR other than feeding it into codegen anyway. However, with irinterp and this change, there's now two consumers of IPO-safely optimized IR. I do still think we may at some point want to run passes that allow IPO-unsafe optimizations, but we can always add them at the end of the pipeline.
We don't do as many as we should, but there is a long way between doing only a few and being prohibited from doing any ever. For a direct example, Base.inferencebarrier
is required to be optimized away, but must not improve IPO information afterwards. Other similar examples may include 1. the legality of removing boundscheck depends on having determined whether it is guaranteed to be not inlined (which premise is broken by this PR). 2. folding Expr(:new) with undef fields to a constant and other similar operations with undefined bits such as accessing .parameters
or doing math with floating point numbers assumes IPO cannot see that change to the implementation 3. DCE cannot mark a value as unused after all loads are removed (changing it to readnone), since later IPO effects may incorrectly assume this argument could be optimized away in the caller
You're not prohibited from doing any ever, you're just prohibited from doing them before the analysis pass runs. There's no problem doing more things afterwards.
Yes, that needed to be adjusted in this PR.
This PR is on the path of overhauling this mechanism. It's possible there's some extra tweaking that needs to happen here in this intermediate step.
I don't believe we do this in the current pass pipeline.
We don't currently derive such information interprocedurally. |
In particular though, this means the analysis pass must run before inlining, since inlining already ran all passes. Which seems hardly indistinguishable from not being able to do any useful optimizations?
I am talking about the default environment, which is not affected by that PR. In the default cases where it is not forced on or forced off, the effect of inlining on boundscheck branches is known to not be consistent, but that inconsistency is usually removed by the inlining pass. Though I suppose that falls into the larger bucket above that inlining is not allowed before the refinement analysis, so it might not matter as an independent point. |
The current `:consistent` effect bit carries dual meanings: 1. "the return value is always consistent" 2. "this method does not cause any undefined behavior". This design makes the effect bit unclear and hard to manage. Specifically, the current design prevents a post-inference analysis (as discussed in #50805) from safely refining `:consistent`-cy using post-optimization state IR. This is because it is impossible to tell whether the `:consistent`-cy has been tainted by the first or second meaning. To address this, this commit splits them into two distinct effect bits: `:consistent` for consistent return values and `:noub` for no undefined behavior. This commit also introduces an override mechanism for `:noub` as it is necessary for `@assume_effects` to concrete-evaluate the annotated methods. While this might sound risky and not in line with the existing designs of `:nonoverlayed` and `:noinbounds`, where their overrides are prohibited, but we already have an override mechanism in place for `:consistent`, which implicitly overrides `:noub`. Given this precedent, the override for `:noub` should probably be justified.
The current `:consistent` effect bit carries dual meanings: 1. "the return value is always consistent" 2. "this method does not cause any undefined behavior". This design makes the effect bit unclear and hard to manage. Specifically, the current design prevents a post-inference analysis (as discussed in #50805) from safely refining `:consistent`-cy using post-optimization state IR. This is because it is impossible to tell whether the `:consistent`-cy has been tainted by the first or second meaning. To address this, this commit splits them into two distinct effect bits: `:consistent` for consistent return values and `:noub` for no undefined behavior. This commit also introduces an override mechanism for `:noub` as it is necessary for `@assume_effects` to concrete-evaluate the annotated methods. While this might sound risky and not in line with the existing designs of `:nonoverlayed` and `:noinbounds`, where their overrides are prohibited, but we already have an override mechanism in place for `:consistent`, which implicitly overrides `:noub`. Given this precedent, the override for `:noub` should probably be justified.
The current `:consistent` effect bit carries dual meanings: 1. "the return value is always consistent" 2. "this method does not cause any undefined behavior". This design makes the effect bit unclear and hard to manage. Specifically, the current design prevents a post-inference analysis (as discussed in #50805) from safely refining `:consistent`-cy using post-optimization state IR. This is because it is impossible to tell whether the `:consistent`-cy has been tainted by the first or second meaning. To address this, this commit splits them into two distinct effect bits: `:consistent` for consistent return values and `:noub` for no undefined behavior. This commit also introduces an override mechanism for `:noub` as it is necessary for `@assume_effects` to concrete-evaluate the annotated methods. While this might sound risky and not in line with the existing designs of `:nonoverlayed` and `:noinbounds`, where their overrides are prohibited, but we already have an override mechanism in place for `:consistent`, which implicitly overrides `:noub`. Given this precedent, the override for `:noub` should probably be justified.
The current `:consistent` effect bit carries dual meanings: 1. "the return value is always consistent" 2. "this method does not cause any undefined behavior". This design makes the effect bit unclear and hard to manage. Specifically, the current design prevents a post-inference analysis (as discussed in #50805) from safely refining `:consistent`-cy using post-optimization state IR. This is because it is impossible to tell whether the `:consistent`-cy has been tainted by the first or second meaning. To address this, this commit splits them into two distinct effect bits: `:consistent` for consistent return values and `:noub` for no undefined behavior. This commit also introduces an override mechanism for `:noub` as it is necessary for `@assume_effects` to concrete-evaluate the annotated methods. While this might sound risky and not in line with the existing designs of `:nonoverlayed` and `:noinbounds`, where their overrides are prohibited, but we already have an override mechanism in place for `:consistent`, which implicitly overrides `:noub`. Given this precedent, the override for `:noub` should probably be justified.
As discussed today, we'll stop doing :boundscheck transforms in inlining and move those to codegen instead. Then we can cache the IR as IPO safe. At whatever point in the future we want to do IPO unsafe IR transforms at the julia level, we'll have a callback from codegen back into julia that does that. |
The current `:consistent` effect bit carries dual meanings: 1. "the return value is always consistent" 2. "this method does not cause any undefined behavior". This design makes the effect bit unclear and hard to manage. Specifically, the current design prevents a post-inference analysis (as discussed in #50805) from safely refining `:consistent`-cy using post-optimization state IR. This is because it is impossible to tell whether the `:consistent`-cy has been tainted by the first or second meaning. To address this, this commit splits them into two distinct effect bits: `:consistent` for consistent return values and `:noub` for no undefined behavior. This commit also introduces an override mechanism for `:noub` as it is necessary for `@assume_effects` to concrete-evaluate the annotated methods. While this might sound risky and not in line with the existing designs of `:nonoverlayed` and `:noinbounds`, where their overrides are prohibited, but we already have an override mechanism in place for `:consistent`, which implicitly overrides `:noub`. Given this precedent, the override for `:noub` should probably be justified.
The current `:consistent` effect bit carries dual meanings: 1. "the return value is always consistent" 2. "this method does not cause any undefined behavior". This design makes the effect bit unclear and hard to manage. Specifically, the current design prevents a post-inference analysis (as discussed in #50805) from safely refining "consistent"-cy using post-optimization state IR. This is because it is impossible to tell whether the `:consistent`-cy has been tainted by the first or second meaning. To address this, this commit splits them into two distinct effect bits: `:consistent` for consistent return values and `:noub` for no undefined behavior. This commit also introduces an override mechanism for `:noub` as it is necessary for `@assume_effects` to concrete-evaluate the annotated methods. While this might sound risky and not in line with the existing designs of `:nonoverlayed` and `:noinbounds`, where their overrides are prohibited, but we already have an override mechanism in place for `:consistent`, which implicitly overrides `:noub`. Given this precedent, the override for `:noub` should probably be justified. @nanosoldier `runbenchmarks("inference", vs=":master")`
2ab07cc
to
0aa6b47
Compare
How did we decide to deal with the fact this means that the result of codegen (e.g. validity of using concrete eval) will no longer be consistent with the IPO flags on the same CodeInfo object? |
You mean CodeInstance? They already have two sets of flags, but if something is determined for IPO, it should always be valid even after codegen transforms. |
@@ -509,7 +711,7 @@ function run_passes( | |||
@pass "compact 2" ir = compact!(ir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pass currently seems to not be IPO safe, since the access to ci.propagate_inbounds
(and :boundscheck
removal folding) is not currently IPO safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm working on that right now.
96b55e0
to
5f7d950
Compare
@@ -1647,9 +1647,9 @@ function infer_effects(@nospecialize(f), @nospecialize(types=default_tt(f)); | |||
for match in matches.matches | |||
match = match::Core.MethodMatch | |||
frame = Core.Compiler.typeinf_frame(interp, | |||
match.method, match.spec_types, match.sparams, #=run_optimizer=#false) | |||
match.method, match.spec_types, match.sparams, #=run_optimizer=#true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New optional argument post_optimization_analysis::Bool=true
might be useful for testing/debugging purposes. We can add it later when needed though.
mi = result.edge | ||
if mi !== nothing && is_foldable(effects) | ||
if mi !== nothing && is_foldable(effects, !stmt_taints_inbounds_consistency(sv)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need a similar change to concrete_eval_invoke
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I don't think it's required here at all. Both concrete eval and semi concrete eval are safe here independent of the state of the inbounds flag.
boundscheck = stmt.args[end] | ||
argextype(boundscheck, ir) === Bool || return false | ||
isa(boundscheck, SSAValue) || return false | ||
return true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Include isexpr(ir[boundscheck][:stmt], :boundscheck)
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is meant to check for having a boundscheck argument rather than the boundscheck expr, I'll rename the function.
base/compiler/optimize.jl
Outdated
else isa(stmt, GotoIfNot) | ||
bb = block_for_inst(ir, idx) | ||
sa, sb = ir.cfg.blocks[bb].succs | ||
for succ in iterated_dominance_frontier(ir.cfg, BlockLiveness((sa, sb), nothing), get!(lazydomtree)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for succ in iterated_dominance_frontier(ir.cfg, BlockLiveness((sa, sb), nothing), get!(lazydomtree)) | |
for succ in iterated_dominance_frontier(ir.cfg, BlockLiveness(ir.cfg.blocks[bb].succs, nothing), get!(lazydomtree)) |
and then we can keep typing BlockLiveness.def_bbs::Vector{Int}
.
@@ -1390,7 +1410,7 @@ function process_node!(compact::IncrementalCompact, result_idx::Int, inst::Instr | |||
if cfg_transforms_enabled | |||
# Rename phi node edges | |||
let bb_rename_pred=bb_rename_pred | |||
map!(i::Int32->bb_rename_pred[i], stmt.edges, stmt.edges) | |||
map!(i::Int32->i == 0 ? 0 : bb_rename_pred[i], stmt.edges, stmt.edges) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change necessary for this PR, or a fix for some related issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It fixed some test failure, but I don't think it's related to this PR other than by coincidence. There's a special i == 0
predecessor for entry into catch blocks.
base/compiler/optimize.jl
Outdated
if isexpr(stmt, :enter) | ||
# try/catch not yet modeled | ||
had_trycatch = true | ||
return true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return true | |
return false |
Shouldn't return false
to bail out the scan?
base/compiler/optimize.jl
Outdated
return | ||
end | ||
else | ||
callback(idx, stmt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
callback(idx, stmt) | |
callback(idx) |
stmt::PhiNode
seems to be never used.
Doesn't absint-based constprop' usually lead to better inference? I'm in favor of preserving sources for irinterp, but I want to assert my understanding that irinterp is faster but less precise than the absint constprop'. |
I don't think it's necessarily always true that irinterp is less precise than absint. For example, irinterp can in theory propagate through more complicated sroa patterns. I didn't track down exactly what was going on in this case, but the failing test was arrayops 1746. |
25cf427
to
1ed5f15
Compare
The optimizer may be able to derive information that is not available to inference. For example, it may SROA a mutable value to derive additional constant information. Additionally, some effects, like :consistent are path-dependent and should ideally be scanned once all optimizations are done. Now, there is a bit of a complication that we have generally so far taken the position that the optimizer may do non-IPO-safe optimizations, although in practice we never actually implemented any. This was a sensible choice, because we weren't really doing anything with the post-optimized IR other than feeding it into codegen anyway. However, with irinterp and this change, there's now two consumers of IPO-safely optimized IR. I do still think we may at some point want to run passes that allow IPO-unsafe optimizations, but we can always add them at the end of the pipeline. With these changes, the effect analysis is a lot more precise. For example, we can now derive :consistent for these functions: ``` function f1(b) if Base.inferencebarrier(b) error() end return b end function f3(x) @fastmath sqrt(x) return x end ``` and we can derive `:nothrow` for this function: ``` function f2() if Ref(false)[] error() end return true end ```
I was seeing a fair number of inference instabilities on master and tracked this down to us deleting the inferred code after codegen making it unavailable to irinterp.
Also adds some cosmetic changes.
Also adds some cosmetic changes.
Also adds some cosmetic changes.
Also adds some cosmetic changes.
Also adds some cosmetic changes.
Also adds some cosmetic changes.
Also adds some cosmetic changes.
Following the discussions and changes in #50805, we now consider post-inlining IR as IPO-valid. Revisiting EA, I've realized that running EA twice—once for computing IPO-valid escape cache and once for local optimization analysis—is redundant. This commit streamlines the EA process to perform the analysis just once on post-optimization IR, and caches that result. This change also removes all interprocedural EA code, which had significant overlap with inlining code.
Following the discussions and changes in #50805, we now consider post-inlining IR as IPO-valid. Revisiting EA, I've realized that running EA twice—once for computing IPO-valid escape cache and once for local optimization analysis—is redundant. This commit streamlines the EA process to perform the analysis just once on post-optimization IR, and caches that result. This change also removes all interprocedural EA code, which had significant overlap with inlining code.
Following the discussions and changes in #50805, we now consider post-inlining IR as IPO-valid. Revisiting EA, I've realized that running EA twice—once for computing IPO-valid escape cache and once for local optimization analysis—is redundant. This commit streamlines the EA process to perform the analysis just once on post-optimization IR, and caches that result. This change also removes all interprocedural EA code, which had significant overlap with inlining code.
Following the discussions and changes in #50805, we now consider post-inlining IR as IPO-valid. Revisiting EA, I've realized that running EA twice—once for computing IPO-valid escape cache and once for local optimization analysis—is redundant. This commit streamlines the EA process to perform the analysis just once on post-optimization IR, and caches that result. This change also removes all interprocedural EA code, which had significant overlap with inlining code.
Following the discussions and changes in #50805, we now consider post-inlining IR as IPO-valid. Revisiting EA, I've realized that running EA twice—once for computing IPO-valid escape cache and once for local optimization analysis—is redundant. This commit streamlines the EA process to perform the analysis just once on post-optimization IR, and caches that result. This change also removes all interprocedural EA code, which had significant overlap with inlining code.
…A` (#51318) Following the discussions and changes in #50805, we now consider post-inlining IR as IPO-valid. Revisiting EA, I've realized that running EA twice—once for computing IPO-valid escape cache and once for local optimization analysis—is redundant. This commit streamlines the EA process to perform the analysis just once on post-optimization IR, and caches that result. This change also removes all interprocedural EA code, which had significant overlap with inlining code. --------- Co-authored-by: Julian Samaroo <[email protected]>
…A` (#51318) Following the discussions and changes in #50805, we now consider post-inlining IR as IPO-valid. Revisiting EA, I've realized that running EA twice—once for computing IPO-valid escape cache and once for local optimization analysis—is redundant. This commit streamlines the EA process to perform the analysis just once on post-optimization IR, and caches that result. This change also removes all interprocedural EA code, which had significant overlap with inlining code. --------- Co-authored-by: Julian Samaroo <[email protected]>
This particular fix was part of JuliaLang#50805, but it wasn't included in version 1.10, leading to situations where an incorrect `:nothrow` could occur in 1.10 (JuliaLang#53062). This commit implements a minimal correction in 1.10 and also added some test cases. Fixes JuliaLang#53062.
The optimizer may be able to derive information that is not available to inference. For example, it may SROA a mutable value to derive additional constant information. Additionally, some effects, like :consistent are path-dependent and should ideally be scanned once all optimizations are done. Now, there is a bit of a complication that we have generally so far taken the position that the optimizer may do non-IPO-safe optimizations, although in practice we never actually implemented any. This was a sensible choice, because we weren't really doing anything with the post-optimized IR other than feeding it into codegen anyway. However, with irinterp and this change, there's now two consumers of IPO-safely optimized IR. I do still think we may at some point want to run passes that allow IPO-unsafe optimizations, but we can always add them at the end of the pipeline.
With these changes, the effect analysis is a lot more precise. For example, we can now derive :consistent for these functions:
and we can derive
:nothrow
for this function: