Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefer interpreted execution for AsQueryable #79698

Closed
wants to merge 1 commit into from

Conversation

roji
Copy link
Member

@roji roji commented Dec 15, 2022

AsQueryable converts an IEnumerable to an IQueryable, allowing an abstraction over queries regardless of whether they're executed locally via LINQ to Objects (IEnumerable) or translated via a LINQ provider.

When the IEnumerable-based IQueryable is executed, we internally compile the expression tree and then invoke it; the delegate is then discarded (single-use). This practice is also being done in EF, and benchmarking it shows that interpretation is very significantly more performant than compilation followed by single use. This PR applies preferInterpretation: true for those cases.

Note that when trimming and using CoreCLR (not NativeAOT), this may cause a size increase since the interpreter is brought in where it could be trimmed (since by default, Compile() on CoreCLR doesn't use the interpreter). I have no idea if the linker is that smart though (on NativeAOT this wouldn't change anything since the interpreter must be used anyway).

/cc @jkotas @vitek-karas @ajcvickers

@ghost
Copy link

ghost commented Dec 15, 2022

Tagging subscribers to this area: @cston
See info in area-owners.md if you want to be subscribed.

Issue Details

AsQueryable converts an IEnumerable to an IQueryable, allowing an abstraction over queries regardless of whether they're executed locally via LINQ to Objects (IEnumerable) or translated via a LINQ provider.

When the IEnumerable-based IQueryable is executed, we internally compile the expression tree and then invoke it; the delegate is then discarded (single-use). This practice is also being done in EF, and benchmarking it shows that interpretation is very significantly more performant than compilation followed by single use. This PR applies preferInterpretation: true for those cases.

Note that when trimming and using CoreCLR (not NativeAOT), this may cause a size increase since the interpreter is brought in where it could be trimmed (since by default, Compile() on CoreCLR doesn't use the interpreter). I have no idea if the linker is that smart though (on NativeAOT this wouldn't change anything since the interpreter must be used anyway).

/cc @jkotas @vitek-karas @ajcvickers

Author: roji
Assignees: -
Labels:

area-System.Linq.Expressions

Milestone: -

@vitek-karas
Copy link
Member

Linker is not smart enough for this - it would require us to figure out all method callsites before we can process a given method, which we currently don't do. It's been discussed several times as there are other cases where this might be beneficial, but so far we didn't really invest in this.

@roji
Copy link
Member Author

roji commented Dec 15, 2022

Linker is not smart enough for this

@vitek-karas so you're saying that the interpreter is always preserved when there's a call to Compile(), regardless of the platform (CoreCLR vs. NativeAOT) or the parameter (preferInterpretation)?

I did a quick test, and the interpreter seems to add around 3.9MB. I'm not sure there's reason to invest in this specifically: for NativeAOT the interpreter is obviously required for Compile() in any case (so the current logic is fine), so this would be about trimming those additional 3.9MB on CoreCLR when the interpreter isn't needed (i.e. when the parameterless overload is used, or when preferInterpretation is false). I'm not yet sure whether that's significant for EF Core specifically (we're not necessarily prioritizing maximal trimming in the non-NativeAOT scenario at this point).

@vitek-karas
Copy link
Member

I looked into it and currently it is always preserved but for a different reason.
It's basically because of this line:

private static readonly MethodInfo s_expressionCompileMethodInfo = typeof(Expression<>).GetMethod("Compile", Type.EmptyTypes)!;

private static readonly MethodInfo s_expressionCompileMethodInfo = typeof(Expression<>).GetMethod("Compile", Type.EmptyTypes)!;

The problem is that trimming is currently not smart enough to recognize the EmptyTypes parameter and mark only the Compile method without parameters. It marks all Compile methods, so including the one with the boolean parameter, which has a direct dependency on the interpreter.

Otherwise the current AsQueryable only calls Compile() which doesn't have any dependency on the interpreter.

So if we wanted to fix this right now (before this change) - we could by either improving the trimmer to be smarter around GetMethod, or using something like the Func technique you have above to implement the methodof-like thing, which would reference the Compile method directly - the one without parameters.

That said, after your change this is a moot discussion as it will always go through Compile(bool) which will bring in the interpreter (plus - you actually want that). Maybe it would be possible to get rid of the ILEmit compiler then, but that would require the trimmer to track all callsites. And even that would not help because of the line above - which requires the Compile() to be kept, which will request the compiler implementation anyway. So we would also have to rewrite that code to only ask for this when needed (and it's still not clear if that would actually work - depends on the public APIs).

@MichalStrehovsky
Copy link
Member

Maybe it would be possible to get rid of the ILEmit compiler then

What might make sense is to add an <OptimizationPreference>Size</OptimizationPreference> option (we already have a similar one in NativeAOT). This option could flip the System.Linq.Expressions.CanCompileToIL feature switch to false and this allows getting rid of the ILEmit compiler. I.e. let the user decide whether they are willing to do extra potential perf tradeoffs for size. The feature switch already exists because that's how we get rid of the ILEmit on platforms that don't support it or don't want it. We just need a bit of MSBuild logic. Filed #79726 to track. I guess the work is going to be in the SDK repo.

@jkotas
Copy link
Member

jkotas commented Dec 16, 2022

benchmarking it shows that interpretation is very significantly more performant than compilation followed by single use

I do not expect that this is going to be the case for long-running queries. Once you get queries that run for more 10's milliseconds, I expect that the compilation is going to be more performant.

@roji
Copy link
Member Author

roji commented Dec 16, 2022

@jkotas right, it's a question of whether we expect most cases of AsQueryable queries to be complex/long or not; the current situation disadvantages short/simple ones, this change would shift it the other way. I can do a bit more benchmarking for various query complexities to see more or less where the top-off is, if that helps.

@jkotas
Copy link
Member

jkotas commented Dec 16, 2022

It helps as a data point, but I do not think it will help us to get confidence in this change. I am sure that some projects have long-running queries out there. If we merge this change and somebody reports blocking performance regression when upgrading to .NET 8, what are we going to tell them?

@roji
Copy link
Member Author

roji commented Dec 16, 2022

You're right of course.. The question is whether there's a use-case percentage where doing this change would be acceptable. For example, if 90% of uses would be optimized, would we accept it? 99%? Of course, if we don't want to regress perf for anyone - even if the vast majority of our users would benefit - we can abandon this.

Note that in the benchmarks I did, even with 1000 nodes in the expression tree, interpretation is still significantly more efficient.

@jkotas
Copy link
Member

jkotas commented Dec 16, 2022

For example, if 90% of uses would be optimized, would we accept it? 99%?

Changing the default would need to be treated as a breaking change (get breaking change template filled, etc.). Also, we would need to have mitigation for the breaking change. It would likely mean to make this configurable or introduce new APIs that allow you to pass in the config so that people can switch to old behavior if needed.

even with 1000 nodes in the expression tree, interpretation is still significantly more efficient.

Interpretation vs. compilation depends on the amount of data that the query processes. It does not depend on the amount of code in the query (modulo some noise).

@roji
Copy link
Member Author

roji commented Dec 16, 2022

OK, I'll close this then - thanks for giving it your time.

@roji roji closed this Dec 16, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Jan 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants