Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ResolveAssemblyReference is slow on .NET Core with many references #2015

Open
Tracked by #3139
rainersigwald opened this issue Apr 26, 2017 · 64 comments
Open
Tracked by #3139

Comments

@rainersigwald
Copy link
Member

ResolveAssemblyReferences runs unconditionally even on incremental builds (because the user may have installed a targeting pack or, for full-framework scenarios, GACed an assembly). This makes it performance-sensitive.

With the advent of .NET Core micro-assemblies, it has become common to pass many, many references to RAR. This exacerbates performance problems.

@rynowak was kind enough to run a profiler to see this:
image

@rainersigwald
Copy link
Member Author

One thing of note: MSBuild only calls that AssemblyName constructor because AssemblyName.Clone() isn't available in .NET Standard 1.x. One possible improvement here would be to move to .NET Core 2.0.

@rynowak
Copy link
Member

rynowak commented Apr 26, 2017

To add a little more detail we really have a project that's taking this long to build (referring to the 95sec spent in UnifyAssemblyNameVersions).

You may want to check out https://github.com/aspnet/Mvc/tree/dev/test/Microsoft.AspNetCore.Mvc.FunctionalTests in general as example of a really complicated dotnet core package that's updated to the bleeding edge.

AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this issue Apr 26, 2017
AssemblyName.Clone is not available in early versions of .NET Core.
Until we can upgrade MSBuild to use .NET Core 2.0, we can't call the
Clone method directly. This change will attempt to call .Clone using
reflection and if not available fallback to the previous .NET Core
behavior.

Potential fix for dotnet#2015 (when running on .NET Core 2.0).
AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this issue Apr 26, 2017
AssemblyName.Clone is not available in early versions of .NET Core.
Until we can upgrade MSBuild to use .NET Core 2.0, we can't call the
Clone method directly. This change will attempt to call .Clone using
reflection and if not available fallback to the previous .NET Core
behavior.

Potential fix for dotnet#2015 (when running on .NET Core 2.0).
AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this issue Apr 26, 2017
AssemblyName.Clone is not available in early versions of .NET Core.
Until we can upgrade MSBuild to use .NET Core 2.0, we can't call the
Clone method directly. This change will attempt to call .Clone using
reflection and if not available fallback to the previous .NET Core
behavior.

Potential fix for dotnet#2015 (when running on .NET Core 2.0).
AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this issue Apr 27, 2017
AssemblyName.Clone is not available in early versions of .NET Core.
Until we can upgrade MSBuild to use .NET Core 2.0, we can't call the
Clone method directly. This change will attempt to call .Clone using
reflection and if not available fallback to the previous .NET Core
behavior.

Potential fix for dotnet#2015 (when running on .NET Core 2.0).
AndyGerlicher added a commit to AndyGerlicher/msbuild that referenced this issue Apr 27, 2017
AssemblyName.Clone is not available in early versions of .NET Core.
Until we can upgrade MSBuild to use .NET Core 2.0, we can't call the
Clone method directly. This change will attempt to call .Clone using
reflection and if not available fallback to the previous .NET Core
behavior.

Potential fix for dotnet#2015 (when running on .NET Core 2.0).
AndyGerlicher added a commit that referenced this issue Apr 27, 2017
AssemblyName.Clone() is not available in early versions of .NET Core.
Until we can upgrade MSBuild to use .NET Core 2.0, we can't call the
Clone method directly. This change will attempt to call .Clone using
reflection and if not available fallback to the previous .NET Core
behavior.

Potential fix for #2015 (when running on .NET Core 2.0).
AndyGerlicher added a commit that referenced this issue May 1, 2017
AssemblyName.Clone() is not available in early versions of .NET Core.
Until we can upgrade MSBuild to use .NET Core 2.0, we can't call the
Clone method directly. This change will attempt to call .Clone using
reflection and if not available fallback to the previous .NET Core
behavior.

Potential fix for #2015 (when running on .NET Core 2.0).
@eerhardt
Copy link
Member

There's an ask from the ASP.NET team to not even run ResolveAssemblyReference in some cases. See dotnet/sdk#1193

@ericstj
Copy link
Member

ericstj commented May 22, 2017

I think you should look at a couple options:

  1. Caching immutable data (like metadata about a file inside a nuget package / targeting pack) to avoid recalculation from multiple processes / projects / rebuilds.
  2. Identifying a fast path to skip RAR entirely and populate its outputs from a cache file in OBJ.

I think #2 is doable if we look at the right set of inputs/outputs and potentially change AssemblySearchPaths for some project types.

@rainersigwald
Copy link
Member Author

There is a cache in RAR but we can't use it in Core because it depends on BinarySerialization which isn't accessible to us.

I'm fairly hesitant to build any assumptions about nuget immutability into RAR.

@DamianEdwards
Copy link
Member

Isn't binary serialization available in netcoreapp2.0 now, as part of netstandard2.0?

@rainersigwald
Copy link
Member Author

@DamianEdwards Yes, but we're not targeting that. We'd like to but it's a lot of work and we can't do it for 15.3.

@DamianEdwards
Copy link
Member

@rainersigwald can you not light-up on it in the same way you did for the change above? Or is not really factored to enable that easily right now?

@rainersigwald
Copy link
Member Author

@DamianEdwards I don't think so, because binary serialization requires source attributes, which can't be discovered via reflection.

@kieranmo
Copy link

What other serialization options do we have?

@rainersigwald
Copy link
Member Author

I'm going to spend tomorrow on some prototyping and profiling on this.

@rainersigwald rainersigwald self-assigned this May 22, 2017
@DamianEdwards
Copy link
Member

@rainersigwald thanks

@rainersigwald
Copy link
Member Author

@DamianEdwards what's the best scenario to hammer on? A dotnet new web project with sdk 2, and incremental dotnet builds? On my machine RAR doesn't seem like a significant part of that (1 ms of the 1 s build) while NuGet operations take the time:

        9 ms  CheckForImplicitPackageReferenceOverrides   1 calls
       20 ms  ProduceContentAssets                       1 calls
       99 ms  GenerateRuntimeConfigurationFiles          1 calls
      149 ms  ResolvePackageFileConflicts                1 calls
      239 ms  ResolvePackageDependencies                 1 calls

@DamianEdwards
Copy link
Member

@rainersigwald try dotnet new mvc. Here's the result on my machine (all from the CLI using dotnet build):

1st build

   9 ms  CoreGenerateAssemblyInfo                   1 calls
  32 ms  CheckForImplicitPackageReferenceOverrides   1 call
  32 ms  _CopyOutOfDateSourceItemsToOutputDirectory   1 cal
  33 ms  RunProduceContentAssets                    1 calls
  47 ms  _ComputeLockFileReferences                 1 calls
  67 ms  GenerateBuildRuntimeConfigurationFiles     1 calls
  81 ms  ResolveLockFileReferences                  1 calls
 108 ms  _ComputeLockFileCopyLocal                  1 calls
 123 ms  GenerateBuildDependencyFile                1 calls
 172 ms  RunResolvePackageDependencies              1 calls
 246 ms  _HandlePackageFileConflicts                1 calls
 905 ms  ResolveAssemblyReferences                  1 calls
1388 ms  CoreCompile                                1 calls

2nd build

  8 ms  CoreCompile                                1 calls
 11 ms  CheckForImplicitPackageReferenceOverrides   1 calls
 33 ms  RunProduceContentAssets                    1 calls
 52 ms  _ComputeLockFileReferences                 1 calls
 69 ms  GenerateBuildRuntimeConfigurationFiles     1 calls
 80 ms  ResolveLockFileReferences                  1 calls
108 ms  _ComputeLockFileCopyLocal                  1 calls
147 ms  _HandlePackageFileConflicts                1 calls
186 ms  RunResolvePackageDependencies              1 calls
960 ms  ResolveAssemblyReferences                  1 calls

3rd build

  8 ms  CoreCompile                                1 calls
 12 ms  CheckForImplicitPackageReferenceOverrides   1 calls
 34 ms  RunProduceContentAssets                    1 calls
 46 ms  _ComputeLockFileReferences                 1 calls
 66 ms  GenerateBuildRuntimeConfigurationFiles     1 calls
 79 ms  ResolveLockFileReferences                  1 calls
119 ms  _ComputeLockFileCopyLocal                  1 calls
140 ms  _HandlePackageFileConflicts                1 calls
182 ms  RunResolvePackageDependencies              1 calls
950 ms  ResolveAssemblyReferences                  1 calls

@rynowak
Copy link
Member

rynowak commented May 23, 2017

Or use this project if you want a really fun one 😆

https://github.com/aspnet/Mvc/tree/dev/test/Microsoft.AspNetCore.Mvc.FunctionalTests

@rainersigwald
Copy link
Member Author

Here's what I've figured out so far:

  • A dotnet new mvc project passes 307 assemblies to RAR. That's a lot.
  • It also checks for the existence of ~5000 files.
  • Passing FindDependencies=false reduces overall time in RAR by ~half.
  • Profiling indicates no glaring hotspots, though there's plenty of room for improvement. Time is roughly evenly split between
  • Using full-framework MSBuild (with the existing caching strategy) improves performance significantly (~30%) but not enough.

There are a couple of options to shorten this:

  • Spend a ton of time improving perf.
    • Looking around, I think there are plenty of things that can be rewritten or cached in a static.
    • But probably not (say) a 10x speedup.
  • Build a total-inputs:total-outputs cache into RAR.
    • This is very hard in the general case, because arbitrary machine state not directly input to RAR can affect outputs: changing the registry, GACing files, installing a .NET SDK.
  • See if we can avoid calling RAR entirely.

I spoke to a source who was involved with RAR in the past (you can't escape if you're still in DevDiv, but I can keep your name out of the press!) who thought that the best option might be to teach the NuGet-resolution targets to emit directly to the items that are RAR outputs (like @(ReferencePath)), instead of resolving from NuGet assets file -> items pointing to paths on disk -> RAR -> different set of items pointing to paths on disk.

RAR does basically three things, none of which are super interesting for NuGet references:

  • Find a file given a string.
    • But NuGet references are already resolved to an absolute path.
  • Find dependencies and satellite assemblies.
    • Should be described in the package.
  • Version unification and conflict resolution

We can't eliminate RAR altogether, because when targeting a full framework TFM you can use references that don't come from NuGet (like <Reference Include="System.Configuration" /> or direct reference to a path on disk). But we could filter the list to elide packages that come from NuGet.

I prototyped this with an injected task:

<Target Name="AdjustRAR" BeforeTargets="ResolveAssemblyReferences" Condition="'$(AdjustRAR)' != 'false'">
  <ItemGroup>
    <ReferencePath Include="@(Reference)" Condition="'%(NuGetSourceType)' == 'Package'" />
    <Reference Remove="@(Reference)"  Condition="'%(NuGetSourceType)' == 'Package'" />
  </ItemGroup>
</Target>

For an MVC template project, that completely eliminates RAR:

        9 ms  CoreCompile                                1 calls
       10 ms  AdjustRAR                                  1 calls
       10 ms  _CheckForUnsupportedTargetFramework        1 calls
       18 ms  CheckForImplicitPackageReferenceOverrides   1 calls
       45 ms  FindReferenceAssembliesForReferences       1 calls
       45 ms  RunProduceContentAssets                    1 calls
       65 ms  _ComputeLockFileReferences                 1 calls
       85 ms  GenerateBuildRuntimeConfigurationFiles     1 calls
      128 ms  ResolveLockFileReferences                  1 calls
      155 ms  _ComputeLockFileCopyLocal                  1 calls
      203 ms  _HandlePackageFileConflicts                1 calls
      267 ms  RunResolvePackageDependencies              1 calls

Ok, so great. What are the risks here? We need to sync with the SDK and NuGet teams about:

  • Are there situations where "the package paths are sufficient" aren't true? Can we detect them easily?
  • What metadata from RAR do we need to preserve (by teaching the NuGet package resolution logic about it)?

@DamianEdwards
Copy link
Member

@rainersigwald great info. That number of assemblies is much higher than expected (by about 67%). Could you share the list?

@DamianEdwards
Copy link
Member

Also, you say:

We can't eliminate RAR altogether, because when targeting a full framework TFM you can use references that don't come from NuGet (like or direct reference to a path on disk). But we could filter the list to elide packages that come from NuGet.

Could we eliminate it in the cases the project is not targeting a full framework TFM then?

@kzu
Copy link
Contributor

kzu commented Dec 13, 2018

I think if NuGet implemented NuGet/Home#7617, some global targets provided by them could hook into that and do that per-nuget immutability caching/optimization at restore time, ideally only once, without having MSBuild know anything about how that's done. For example, they could emit additional obj targets (but in the package install path, rather than the user's obj, since they would be immutable for a given package), so that even Rebuild would benefit from it.

It would also allow package authors to extend the same mechanism, as explained in that issue.

@rainersigwald
Copy link
Member Author

@kzu I don't think I'm seeing the connection between that comment and RAR performance; can you elaborate?

@kzu
Copy link
Contributor

kzu commented Dec 14, 2018

@rainersigwald basically, do what was suggested only once and at restore time:

who thought that the best option might be to teach the NuGet-resolution targets to emit directly to the items that are RAR outputs (like @(ReferencePath)), instead of resolving from NuGet assets file -> items pointing to paths on disk -> RAR -> different set of items pointing to paths on disk.

;). That part could leverage the same extension point for AfterRestore proposed in that issue.

@rainersigwald
Copy link
Member Author

I see. I don't think that's a viable option here because RAR needs to see all references in order to do version unification and generate binding redirects, and the inputs to that could change outside of Restore (via Reference item changes or transitive reference changes from ProjectReferences).

@rainersigwald
Copy link
Member Author

#3868 has a good perf improvement in RAR cache serialization, but we're not sure the Bond serialization is the best approach. We should consider reviving it at some point in the future.

@mfkl
Copy link
Contributor

mfkl commented Jun 19, 2020

we're not sure the Bond serialization is the best approach.

What are the downsides of it? Have you explored other alternatives?

@donJoseLuis
Copy link

@ladipro, heads up on this one. We should link it to our RAR user story.

@ViktorHofer
Copy link
Member

I just collected some data from a dotnet/runtime libraries build with ~1400 evaluations (outer + inner builds). I crossed out the tasks that are dotnet/runtime specific (a fix for ValidatePackage no-op build times is tracked via dotnet/sdk#23517).

image

Assuming RAR only runs in inner-builds, this should represent the time it takes RAR to serve ~1100 inner builds. I was looking into speeding up incremental builds in dotnet/runtime libs and I believe the highest impact contributors on the msbuild side are:

  • Project evaluation (55% of the overall fully incremental build). Once during static graph restore and then again during the actual build.
  • RAR (17%)
  • ResolveTargetingPackAssets (2.5%)
  • ResolvePackageFileConflicts (2%)

23% are dotnet/runtime specific and must be fixed by ourselves. Just wanted to share this data in case it helps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests