-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add runtime async design doc #75816
Add runtime async design doc #75816
Conversation
Adding an initial design document for runtime async. I'm not making a branch for this yet so I don't have to worry about keeping it up to date until we're actually ready to start implementation work; that should begin soon, but I think getting this document into `main` and starting collaboration with the runtime and debugger folks will help that proceed.
I'm not super sure you can correct me if I'm wrong, but as far as I understood we can keep generating debug information as they are already generated by the older async. |
@thaystg we can certainly keep those NOOPs, but it would be good to define precisely where you expect them to be emitted; that's one of the purposes of this document. For example, the awaiting a |
Compiler generated async state machines and runtime generated async share some of the same building blocks. Both need to have `await`s with in `catch` and `finally` blocks rewritten to pend the exceptions, | ||
perform the `await` outside of the `catch`/`finally` region, and then have the exceptions restored as necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it is worth to point it out -- but another shared building block is the hoisting of object references when an inner byref is accessed across a suspension point. E.g. in a case like
public static async Task FooAsync(C c)
{
// Address of c.S.X not saved across await like in sync code
c.S.X += await BarAsync();
}
class C
{
public S S;
}
struct S
{
public int X;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakobbotsch this is the equivalent straight-line code. I think you're saying that those ldflda
calls will need to be redone across the suspension point?
IL_0000: ldarg.0
IL_0001: ldflda valuetype S C::S
IL_0006: ldflda int32 S::X
IL_000b: dup
IL_000c: ldind.i4
IL_000d: call int32 Program::Bar()
IL_0012: add
IL_0013: stind.i4
IL_0014: ret
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I recall, a good way to see the ref scenario is localArray[B] += await C;
. In normal code, we save the ref from indexing into array, but in async code, we'll obtain that ref before the suspension for await
and again after resuming.
using System;
using System.Threading.Tasks;
public class C {
public async Task M()
{
int[] a = new int[] { };
// The indexing into array is evaluated twice
// since we need the ref before and after the suspension but cannot hoist it
a[M2()] += await Task<int>.FromResult(42);
}
public static int M2() { return 1; }
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakobbotsch this is the equivalent straight-line code. I think you're saying that those
ldflda
calls will need to be redone across the suspension point?IL_0000: ldarg.0 IL_0001: ldflda valuetype S C::S IL_0006: ldflda int32 S::X IL_000b: dup IL_000c: ldind.i4 IL_000d: call int32 Program::Bar() IL_0012: add IL_0013: stind.i4 IL_0014: ret
Right, exactly. There are more complex cases like @jcouv pointed out, or even impossible cases like SomethingThatReturnsRefInt() += await Foo()
(this raises a compiler error today).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. There is code somewhere in Roslyn that can decompose certain byrefs into constituent parts, which are not byrefs and also idempotent, thus can be evaluated twice with the illusion of a single evaluation. The end result is enabling cases like compound assignments with await in RHS and array access on the left and similar.
The code can handle fairly complex and recursive cases, but there are limitations - for example byref-returning call cannot be decomposed and the spec forbids to evaluate twice, so having that somewhere in LHS will result in an error.
Ideally async1 and async2 would support the same set of cases.
Unlike the rewrite of awaits in EH handlers, spilling of byrefs is done as a part of general async rewrite, if I remember correctly (not a separate pass), so it may not be completely straightforward to tease it out.
There should be good set of tests for these scenarios though.
IL_0014: ret | ||
} | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to take a note so we don't forget: the compiler implements some (imperfect) clearing of hoisted locals when exiting scopes. I'm currently looking at strengthening that. We should keep track of what we want to do about that in async2 (will the runtime handle or do we still want the compiler to handle)?
For reference, see MethodToStateMachineRewriter.AddVariableCleanup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think clearing hoisted locals is going to be interesting for runtime async.
- you would not know exactly what JIT has hoisted (could even be different between quick JIT and when re-jitted with more optimizations).
- you would be clearing actual locals, not the storage in a display type. To the JIT it will appear as dead stores - storing null values that are not subsequently used. Most likely the JIT will just optimize them away.
We use the following runtime flag to drive whether feature can be used. This flag must be defined in the same assembly that defines `object`, and the assembly cannot reference any other assemblies. In terms of | ||
CoreFX, this means it must be defined in the `System.Runtime` reference assembly. | ||
|
||
TODO: Determine whether just the presence of this flag will cause the compiler to generate in runtime async mode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My 2 cents: I think the driving scenario here is going to be multi-targeting. Consider for example if I have an application that targets net9.0
and net11.0
. The build needs to be functional when having this multi-target and an explicit langversion of 14.0. Ideally for the net9.0
target async v1 code generation occurs and for net11.0
we get async v2 generation.
Given that I believe that we should gate async v2 code generation on effectively
bool IsAsyncV2 => LangVersion >= 14 && IsRuntimeFeatureFlagPresent(Async);
Note: my comment here specifically uses net11.0
because that is where the feature flag will be present in the standard core lib. I would expect the experience with net10.0
+ use the async2 preview build flag matches net11.0
Note2: I suspect that in the short term we will want a feature flag to explicitly use async v1 code generation. Basically override the defaults. Having that will allow us to quickly toggle between async v1 and v2 generation. Suspect that will be useful for quickly ruling out / in whether a behavior is tied to the new model or not.
#Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm honestly not entirely sure why we'd take LangVersion
into consideration, given that this isn't a language change. Why should langver impact it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The async v2 model is going to be a highly compatible design but not a fully compatible one. There will be a few rough edges that will some segment of customers up. From that angle I can see this as being a language version issue because it allows a customer to separate out movement to a new runtime vs. a new language version of async
.
Irrespective of whether it's a language version feature, I do think we are going to want a feature flag that allows us to tweak the new codegen off. Particularly during net10.0
and early net11.0
it will be an effective way to help diagnose issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless of our goals I also suspect there will be bugs that will push for a flag to turn things on/off. I think an attribute like MethodImplOptions
at the method level might even be appropriate, although I haven't given it much thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an attribute like MethodImplOptions at the method level might even be appropriate
Yes, I am assuming we will have this, and it's included in this document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think we are going to want a feature flag that allows us to tweak the new codegen off. Particularly during net10.0 and early net11.0 it will be an effective way to help diagnose issues.
Oh, definitely. I figure for "turn off for the entire assembly", it'll be a Features
element, and for individual methods, we'll use the attribute I defined in this doc.
Compiler generated async state machines and runtime generated async share some of the same building blocks. Both need to have `await`s with in `catch` and `finally` blocks rewritten to pend the exceptions, | ||
perform the `await` outside of the `catch`/`finally` region, and then have the exceptions restored as necessary. | ||
|
||
TODO: Go over `IAsyncEnumerable` and confirm that the initial rewrite to a `Task`-based method produces code that can then be implemented with runtime async, rather than a full compiler state machine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I know this is not blocking early work, but for peace of mind I took a stab at this to show how it could work: https://gist.github.com/jcouv/aa5359db1b15522013cf19ef52dcc18f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks (iteration 7)
``` | ||
|
||
```il | ||
call modreq(class [System.Runtime]System.Threading.Tasks.Task`1<int32>) int32 C::M() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be just
modreq(class [System.Runtime]System.Threading.Tasks.Task`1) int32 . . .
I.E. in generic case the modreq uses an open type.
Otherwise we would need to think about mismatched combinations like modreq(Task<int>) long Foo()
.
Not specifying element type in modreq avoids the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the case, then we need to update the runtime document. It currently says this:
- Otherwise, the return type is the type argument of the return type (either
Task`1
orValueTask`1
) with the first custom modifier on the return type being the original type.
I interpret "the original type" to be the full original return type, which is also why I haven't filled in the Task<T>
generic scenario. If what we really want is the open type, that's perfectly fine, but the spec needs to be clear 🙂.
class C | ||
{ | ||
static Task<T> M<T>(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a regular case. Substituting the element type is not controversial. (unlike substituting the entire return type)
Looks good! With a couple of comments. |
Adding an initial design document for runtime async. I'm not making a branch for this yet so I don't have to worry about keeping it up to date until we're actually ready to start implementation work; that should begin soon, but I think getting this document into
main
and starting collaboration with the runtime and debugger folks will help that proceed.