WIP: Generated function recompilation with edges #32774

@NHDaly NHDaly commented Aug 2, 2019

WIP: Generated function recompilation with edges

This PR changes the implementation of Generated Functions to always set a backedge from the generator to the staged function itself, so that if the body of the generator is invalidated, the generated function is recompiled. This allows us to stop freezing the world-age on generated function bodies, which -- among other things -- allows generated functions to be used for computing values, as part of generic interfaces.

In particular, the aim of this PR is to enable us to remove this restriction from Generated Functions:

  1. Instead of calculating something or performing some action, you return a quoted expression which, when evaluated, does what you want.

An example using julia built from this PR:

julia> @generated function zero_tuple(::Type{T}) where T<:Tuple
           Tuple(zero(t) for t in fieldtypes(T))
       end
zero_tuple (generic function with 1 method)

julia> @code_typed zero_tuple(Tuple{Int, Float32})
CodeInfo(
1 ─     return (0, 0.0f0)
) => Tuple{Int64,Float32}

julia> using FixedPointDecimals

julia> @code_typed zero_tuple(Tuple{Int, Float32, FixedDecimal{Int,2}})
CodeInfo(
1 ─     return (0, 0.0f0, FixedDecimal{Int64,2}(0.00))
) => Tuple{Int64,Float32,FixedDecimal{Int64,2}}

From what I understand, there were a lot of changes implemented in the compiler to fix the #265-style issues that plagued Cassette, culminating here: #32237.

My understanding is that the work done to fix #265 for julia at large was not able to extend to generated functions for a number of reasons, but that the huge amount of work done for Cassette, above, has largely alleviated those problems. I understand that #32237 now allows generated functions to set forward edges for methods that should trigger recompilation of the generated function if they are invalidated. This mechanism is opt-in, and was not applied to all generated functions by default.

In this PR, we simply use that mechanism to set a backedge from the generator function body to the staged function itself, so that if the generator is invalidated (because any functions it calls are invalidated), then the generated function will be re-generated the next time it's called.

This recompilation means that generated functions can participate in the same recompilation process as normal Julia functions, and therefore no longer need to have their world-age frozen. Lifting this restriction gives us these benefits:

  1. Generated functions can safely be used to stage the computation of values, because the values will be re-computed if their dependent computations change, so there shouldn't be surprising/mismatched results.
  2. Generated functions can call functions whose method-tables might be updated by users, allowing them to be part of generic interfaces, and perform reflection/inspection on user-defined types.
  3. Generated functions will behave the same at the REPL as they do inside precompiled packages, eliminating this confusing behavior.
  4. Interactive programming w/ generated functions becomes much easier, allowing you to experiment with generated functions at the REPL the same as you would with other functions.

Together, the first two benefits allow us to stage computation of values that depend on user-defined computations over user-defined types. There is currently no good way to express such computations, such that they are safe and guaranteed to compute at compile-time.


This PR is still WIP. I would not be surprised if there are still some lingering roadblocks that make this more challenging than just slapping a backedge onto the generated functions. But the benefits of this PR would be very substantial, and would lead to significantly simpler, and easier to reason about, code. So I think it's worth working through those issues to get this right! :)


¿¿Stretch goals??:

  1. ??? With this well-defined recompilation behavior, maybe we can remove the mystery around whether generated functions can "be recompiled arbitrarily often", and provide stronger guarantees about that performance?
  2. ??? Maybe this improvement/extension can be applied to @pure functions as well, so that users can write code that is intended to execute at compiletime, without the overhead of generating entire methodinstances just to make the result of a computation available to the compiler.

Motivating Examples

I discussed a number of examples for which the existing mechanisms for controlling compiletime execution are not sufficiently satisfying in my talk at JuliaCon 2019. You can reference those here:
https://github.com/NHDaly/juliaCon2019-If_Runtime_isn-t_Funtime-Slides

A number of those examples would be greatly simplified or solved by this PR. Here are a couple of them reiterated:

Calculating "magic numbers"

coefficient(::Type{FD{T, f}}) where {T, f} = T(10)^f

After this PR, this could be written as

@generated coefficient(::Type{FD{T, f}}) where {T, f} = T(10)^f

And if we were able to apply this same extension to @pure functions (see stretch goals above), this could become:

@pure coefficient(::Type{FD{T, f}}) where {T, f} = T(10)^f

I used the coefficient example because it's simple yet it doesn't const-fold. But a bigger issue is this function, which is quite complex, and one could really never expect it to const-fold:
https://github.com/JuliaMath/FixedPointDecimals.jl/blob/v0.3.0/src/FixedPointDecimals.jl#L465

Or, dearer to my heart, this example, which we want to add as part of JuliaMath/FixedPointDecimals.jl#45, where we want to precompute 2^64/10^f so that we can replace ÷ 100 with * 184467440737095516 >> 64, which computes the same value but much more cheaply:
https://github.com/JuliaMath/FixedPointDecimals.jl/blob/026513ffb3a09e7b1e4943a3c15ffec8e3181b42/src/fldmod_by_const.jl#L126
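To make the multiply-and-shift idea concrete, here is a minimal sketch; it is not the code from the linked file. It assumes unsigned 32-bit inputs and rounds the constant up (`cld`) rather than down, because with a rounded-down constant an exact multiple such as 100 would come out one too low and would need a correction step. `MAGIC_100` and `div100` are hypothetical names chosen for this illustration.

```julia
# Sketch only: hypothetical names, not the FixedPointDecimals.jl implementation.

# ceil(2^64 / 100), computed once. This is the "magic number" we would like
# to be guaranteed to compute at compile time.
const MAGIC_100 = cld(UInt128(2)^64, UInt128(100))

# Divide by 100 using one widening multiply and one shift.
# Restricted to UInt32 so that the rounding error in MAGIC_100 can never
# push the result across an integer boundary.
div100(x::UInt32) = ((UInt128(x) * MAGIC_100) >> 64) % UInt32

samples = UInt32[0, 99, 100, 12345, typemax(UInt32)]
results = [div100(x) for x in samples]
```

The point for this PR is the constant: it depends only on the divisor, so it is exactly the kind of value one would want to stage once per type rather than recompute on every call.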

Error-checking of types and compiler constants

The error-checking computation in this function only refers to compiler constants and types, but there is no safe way to stage that computation such that it occurs only at compiletime:
https://github.com/JuliaMath/FixedPointDecimals.jl/blob/07d24d994d67a6f0980ad127898f89c2b4767283/src/FixedPointDecimals.jl#L84-L98

t1(T) = typemax(T)
@test f_type2(1) == typemax(Int)
t1(T) = 3
@test_broken f_type2(1) == 3

@NHDaly NHDaly Aug 2, 2019


This is one of the currently broken examples. Somehow, when the generator (#s5#3 in this case) is compiled, it doesn't cause any backedges to be set from t1 to it.

This simple utility function (defined here) prints out all the backedges for all specializations of a function, and you can see that there are none for t1:

julia> NHDalyUtils.func_all_backedges(t1)
1-element Array{Pair{Any,Any},1}:
 :MethodTable => Any[]

Whereas the previous, working example does get such backedges set:

julia> NHDalyUtils.func_all_backedges(t)
2-element Array{Pair{Any,Array{Any,1}},1}:
                 :MethodTable => [Tuple{typeof(t),Any}, MethodInstance for #s5#4(::Any, ::Any)]
 Tuple{typeof(t),Type{Int64}} => [MethodInstance for #s5#4(::Any, ::Any)]


After some digging, I think I've found a MWE that explains why this case doesn't work, but it boggles my understanding of how julia recompiles functions...
As I show in this example, functions that call typemax don't get backedges set from typemax, despite the fact that they do inline the result.
But by some mysterious magic, they still get recompiled when typemax is updated, so somehow there's something other than backedges that can cause function recompilation??

julia> struct X x end

julia> Base.typemax(::Type{X}) = 8

julia> f1(t) = Base.typemax(t)
f1 (generic function with 1 method)

julia> f1(X)
8

julia> Base.method_instances(typemax, (Type{X},))[1].backedges
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getproperty(::Any, ::Symbol) at ./sysimg.jl:18

julia> f2(x::T) where T = Base.typemax(T)  # This function _does_ get a backedge added to typemax.
f2 (generic function with 1 method)

julia> f2(X(2))
8

julia> Base.method_instances(typemax, (Type{X},))[1].backedges
1-element Array{Any,1}:
 MethodInstance for f2(::X)

julia> Base.typemax(::Type{X}) = 100

julia> f1(X)  # wat! if there is no backedge to f1 from typemax, how is this getting recompiled?!?
100

julia> @code_typed f1(X)  # But we know it _is_ getting recompiled, because the value is inlined, here:
CodeInfo(
1 ─     return 100
) => Int64

julia> methods(typemax).mt.backedges  # And it's not hiding on the MethodTable, as we can see.
ERROR: UndefRefError: access to undefined reference

So my test is failing because there's no backedge from typemax to f1. And in my test case above, f1 is the generator body. So the generator body is never properly invalidated, and so my generated function is never re-generated.

But somehow typemax is able to cause f1 to recompile via some mystery voodoo that I don't understand. And so I guess I want to also apply that to my generator body?


@vtjnash you pointed to this test case as an example of why "this approach won't work". Is this what you were referring to? That sometimes there is a different mechanism besides backedges that is used to trigger function recompilation?


@NHDaly NHDaly Aug 12, 2019


This is now explained via my explanation below, #32774 (comment).

It turns out that -- despite what it looks like -- the call to typemax(t) in f1 above is triggering a dynamic dispatch, so as explained in that comment, the backedges mechanism implemented in this PR so far has no way to trigger regenerating the generated function. (It looks like the call to typemax is being inlined above, but this is actually due to @code_typed showing fully specialized code, even though it isn't actually fully specialized in reality: #32834.)

A simpler example that showcases why this doesn't work can be found here -- this example fails for the current implementation in this PR (as of f20d374):

julia> baz() = 2
baz (generic function with 2 methods)

julia> f() = Any[baz][1]()  # f calls `baz()` through a type-erased `Any` reference, and inference cannot de-virtualize it.
f (generic function with 2 methods)

julia> @generated foo() = f()
foo (generic function with 2 methods)

julia> foo()  # When computing the result for `foo()`, `baz()` is called via dynamic dispatch
2

julia> baz() = 4  # Updating baz does not trigger regenerating foo
baz (generic function with 2 methods)

julia> foo()  # So foo() still returns 2, instead of 4
2

NHDaly added 20 commits August 5, 2019 17:23
…to fix it for StagedFunctions, but not here yet...
…s need to be Any to get the right method!

Everything works now!!!! :'D
…e method

Now it fails in a different way, still needs to be investigated.
…l create specializations even if they don't exist. But it's not doing that either...
There are still some broken cases -- working through them slowly.
Instead of calling the generator via Core._apply_pure in julia, just
don't set a weird world-age
This reverts commit 4df638d.

It caused a weird error during julia's precompilation:
```
Generating precompile statements...┌ Error: Failed to precompile precompile(Tuple{getfield(Dates, Symbol("##s624#32")), Type{Tuple{Dates.DatePart{Char(0x59000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x6d000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x64000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x48000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x4d000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x53000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x73000000)}}}, Type{typeof(Dates.format)}, Type{Base.GenericIOBuffer{Array{UInt8, 1}}}, Type{Dates.DateTime}, Type{Dates.DateFormat{Symbol("YYYY-mm-dd\THH:MM:SS.s"), Tuple{Dates.DatePart{Char(0x59000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x6d000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x64000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x48000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x4d000000)}, Dates.Delim{Char,
1}, Dates.DatePart{Char(0x53000000)}, Dates.Delim{Char, 1}, Dates.DatePart{Char(0x73000000)}}}}})
└ @ Main.anonymous /Users/nathan.daly/src/julia2/contrib/generate_precompile.jl:167
ERROR: LoadError: LoadError: syntax: invalid escape sequence
```

Also, I'm not 100% sure _what_ the right world-age should be to run
these functions in. I _think_ it makes sense to just use the current
world-age, like a regular function, but it could also be reasonable to
run them always at the latest world age (which is close to what we're
currently doing) since they're currently run as pure functions.

But if we run them at the latest world-age, will this behave weirdly?:
```julia
julia> @generated foo(x) = baz(x)
foo (generic function with 1 method)

julia> baz(x) = 1
baz (generic function with 1 method)

julia> function f()
           @eval baz(x) = 2
           foo(1)
       end
f (generic function with 1 method)

julia> f()  # Should be 1, but if `foo()` is always run at latest world-age, will this be 2? Maybe this would work just-fine, actually?
```
```julia
@generated f(a::T, b, c...) where T<:Number = bar(T) + bar(a) + bar(b) + sum(bar(v) for v in c)
bar(x) = 2
@test f(2,3,4) == 8
bar(x) = 3
@test f(2,3,4) == 12
@test f(2,3,4,5) == 15
```
@NHDaly NHDaly force-pushed the generated_functions--recompilation-with-edges branch from 80fc029 to f20d374 Compare August 5, 2019 21:30

NHDaly commented Aug 12, 2019

Okay, so I talked with @vtjnash in person last week, and it was extremely enlightening. I'll try to record what we talked about here, to bring those reading along up-to-date.

@vtjnash posted in #32732 (comment) that "this approach won't work", and in short he is right: As currently written in this PR, this approach will not work. Because the compiler does not add backedges for every single function in a call-graph (it only adds them for methods that get devirtualized), we cannot rely on the existing backedges mechanism alone to trigger regeneration of the generated function. In order to achieve the goals in this PR, we would need to be able to record every function invoked while executing the generator body, and ensure there are edge-paths from all of them that trigger recompilation.

This PR originally took the approach of simply adding a backedge from the user's function that generates the Expr (what I've been calling the generator body, m.generator.gen) to the Generated Function, hoping to trigger re-generation every time any dependent function is updated. While this will work in some cases, it is not sufficient to cover all cases where the generator needs to be re-run, because in some cases (such as dynamic dispatch), methods do not leave backedges.

As far as I understand it, the back-edges mechanism in normal functions works like this:

  1. The default behavior for any function call is a dynamic dispatch. This means that, at runtime, the system will inspect the function's method table (at a given world-age) to determine which method to invoke for given runtime arguments.
  2. If, when compiling a function, inference is able to figure out exactly which method [specialization] will be invoked for a given function call, it can elide the dynamic dispatch and insert the correct method invocation into the function, which is significantly cheaper.
    • But if the compiler optimizes away the dynamic dispatch, it must leave a "backedge" from the callee to the caller, so that if the callee function is changed in future world-ages, it can "invalidate" the caller, so that the function will be recompiled and inference can compute the correct method to invoke now.
  3. This means that if the compiler leaves a dynamic dispatch in the function, there is no need to add a backedge from the callee to the caller, so it doesn't do so.
    • Consider: even if the callee is updated (a method is redefined or a new method is added), when the caller is invoked in the later world-age, the caller's dynamic dispatch will find the correct new method, so the behavior is not broken.
    • This means that a function that contains a dynamic dispatch does not need to be invalidated if its callee is modified.
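The observable effect of both paths can be checked at the REPL. In this minimal sketch (the function names are made up for illustration), one caller uses a direct call that inference can resolve, and the other goes through a type-erased `Any` reference, as in the `f` example earlier in this thread. Both see the new definition once `callee` is redefined; if the description above is right, the first gets there through backedge invalidation and the second through ordinary dynamic dispatch.

```julia
# Hypothetical names, for illustration only.
callee() = 1

# Direct call: inference can resolve the target, so (per the description
# above) the compiler records a backedge from `callee` to this caller.
static_caller() = callee()

# Call through a type-erased reference: the target is looked up at run time.
dynamic_caller() = Any[callee][1]()

before = (static_caller(), dynamic_caller())

# Redefine the callee at top level, which starts a new world age.
callee() = 2

after = (static_caller(), dynamic_caller())
```

A generated function has neither safety net today: its generator ran once in a fixed world, so the result stays stale unless something explicitly triggers re-generation.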

So our problem is that we cannot simply use backedge invalidation as our mechanism for triggering regeneration, because if any methods in our call-graph contain a dynamic dispatch, those methods would not be invalidated when their callee changes, so changing the callee would fail to trigger regeneration.


The good news is that this is a somewhat small, well-scoped shortcoming of this approach.

To circumvent it, we simply need to be able to ensure that all methods in our call-graph are able to invalidate the generated function. Since de-virtualized methods already get back-edges, I believe this means we only need to add edges for all dynamic dispatches encountered during execution of the generator.

Harmoniously, I think the ability to execute code and track all dynamic dispatches that occurred would also be useful for debugging. We have wanted this in the past for optimizing performance. So perhaps a mode could simply be added to Julia to execute a function and track all dispatches that occur, and that information could be used to implement generated function recompilation!

Thankfully, I was able to use Cassette.jl to quickly prototype this! I updated my example implementation to execute the generator in a Context that records all functions invoked, and adds edges to all of them to trigger recompilation!:
NHDaly/StagedFunctions.jl#1
(Thanks Cassette!)

This quick prototype is overkill -- it records edges for all function calls, instead of skipping devirtualized methods and relying on the existing backedges for that. But it shows that this mechanism will work to provide generated function recompilation! :)
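The bookkeeping this implies can be sketched in plain Julia without Cassette. This is only an illustration of the idea, with hypothetical names (`CallLog`, `tracked`, `needs_regeneration`); the prototype linked above records calls automatically through a Cassette context rather than through explicit wrappers like these.

```julia
# Hypothetical sketch: these names are not part of Julia, Cassette, or
# StagedFunctions.jl.

# Remembers every function that was called while the generator ran.
struct CallLog
    callees::Set{Any}
end
CallLog() = CallLog(Set{Any}())

# Call `f`, and record that the generator's result depended on it.
function tracked(log::CallLog, f, args...)
    push!(log.callees, f)
    return f(args...)
end

# The generated code must be regenerated if a recorded callee changed.
needs_regeneration(log::CallLog, changed) = changed in log.callees

baz() = 2
unrelated() = 0

log = CallLog()
generated_value = tracked(log, baz)   # the "generator" calls baz

depends_on_baz = needs_regeneration(log, baz)
depends_on_unrelated = needs_regeneration(log, unrelated)
```

A real implementation would attach edges to the recorded callees so the runtime performs this check, instead of asking the log by hand.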

So now the question is how to proceed: Is this something that we can implement in the built-in @generated functions in Julia? I think so. Or is it something that should remain in packages, like Cassette and potentially StagedFunctions?

@thautwarm

No. I'm sorry, but GG.jl doesn't help with any aspect of what this PR is addressing. Supporting closure definitions inside @generated functions has nothing to do with keeping the generators valid or recompiling the generated code.

However, I think I'm familiar with the problem this PR is addressing, and I do have an idea for a workaround that avoids adding backedges:

  1. Add one more integer argument to each @generated function definition, named sub world age. Like the world age, it's a counter. We should also store an integer counter somewhere that corresponds to the argument sub world age (hereafter, the variable sub world age); this is explained below.

  2. Find all the dependencies that the generator relies on, and share the variable sub world age with all of them, so that whenever one of the dependencies changes, the variable sub world age is incremented by 1.

  3. When calling a generated function, pass its corresponding variable sub world age as the argument sub world age. This could be achieved by wrapping the generated function in a mutable struct (or a struct with a Ref field):

struct SubWorldHooker{F}
     genfunc :: F
     subworldage :: Ref{Int}
end

As I know nothing about Julia implementation, my idea could be totally wrong.
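One way to read this proposal is sketched below. It is a rough illustration under stated assumptions, not a working design for @generated: a plain closure stands in for the generator, a `Ref` stands in for a dependency, and every name here (`SubWorldCache`, `bump!`) is hypothetical. The cached result is reused until the shared counter changes.

```julia
# Hypothetical sketch of the "sub world age" idea; none of these names exist
# in Julia or in GG.jl.
mutable struct SubWorldCache{F}
    compute::F                 # stands in for the generator body
    age::Base.RefValue{Int}    # shared counter, bumped when a dependency changes
    cached_age::Int            # counter value when `value` was computed
    value::Any
end

# Start with a cached_age that can never match, so the first call computes.
SubWorldCache(compute, age) = SubWorldCache(compute, age, -1, nothing)

function (c::SubWorldCache)()
    if c.cached_age != c.age[]     # a dependency changed since the last run
        c.value = c.compute()
        c.cached_age = c.age[]
    end
    return c.value
end

# Every dependency would have to call this whenever it is redefined.
bump!(age) = (age[] += 1)

age = Ref(0)
dependency = Ref(2)
cached = SubWorldCache(() -> dependency[] * 10, age)

first_value = cached()     # computed on first use
dependency[] = 4
stale_value = cached()     # counter unchanged, so the old value is reused
bump!(age)
fresh_value = cached()     # counter changed, so the value is recomputed
```

The sketch also shows the hard part of the proposal: it only works if every dependency reliably bumps the counter, which is the same "find all dependencies" problem the backedge approach runs into.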

Development

Successfully merging this pull request may close these issues.

automatic recompilation of dependent functions
4 participants