Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change behavior of copy action to not rewrite assemblies #1869

Merged
merged 35 commits into from
Apr 1, 2021

Conversation

mateoatr
Copy link
Contributor

@mateoatr mateoatr commented Mar 4, 2021

This PR changes the behavior of copy and copyused assembly actions. The assemblies will be analyzed and marked as usual but contents will not be rewritten under any circumpstance -- we will simply copy the assembly as is to the output directory. This will help us move forward with enabling the analysis of C++/CLI assemblies and issues such as #1669 (and, of no lesser importance, have action names that behave as one would expect).

This also takes over from the changes in #1854 : type forwarders which are referenced via type name strings are now kept, even when these exist inside linked assemblies.

@marek-safar
Copy link
Contributor

What is the benefit of adding yet another copy mode (we have 3 already and they are a source of great confusion) ?

@sbomer
Copy link
Member

sbomer commented Mar 4, 2021

This is trying to solve the problem where copy assemblies can be rewritten. We need to either guarantee that copy doesn't modify an assembly as part of #1854, or introduce new functionality that has such a guarantee. This will be useful to address:

@mateoatr
Copy link
Contributor Author

mateoatr commented Mar 4, 2021

Yes. Main motivation of this is to keep the semantics of the copy action untouched and have a way to tell the linker "don't modify anything, don't even rewrite removed references".

Copy link
Member

@sbomer sbomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the following cases still aren't handled:

  • copy assembly with a static reference to a forwarded type
    • This will show up whenever a non-trimmable assembly was built against a facade like netstandard.
  • a reference that we can't rewrite (copy or string reference) -> forwarder in copy assembly (which we also can't rewrite) -> forwarder (currently won't get marked) -> type
    • This might be rare, but to be correct, whenever we mark a forwarder, we'd need to walk the path to the resolved type, marking the target of forwarders until we see one not in a copy assembly - after that, any remaining forwarders in the chain can be rewritten. I'm not sure we need to handle this, but it would at least be good to add a testcase for it with a comment explaining the current behavior.

@mateoatr mateoatr changed the title Add copyoriginal assembly action Change behavior of copy and copyused action to not rewrite assemblies Mar 11, 2021
@mateoatr mateoatr changed the title Change behavior of copy and copyused action to not rewrite assemblies Change behavior of copy action to not rewrite assemblies Mar 12, 2021
@mateoatr mateoatr requested a review from sbomer March 15, 2021 17:22
src/linker/Linker.Steps/MarkStep.cs Outdated Show resolved Hide resolved
src/linker/Linker.Steps/MarkStep.cs Outdated Show resolved Hide resolved
@@ -121,7 +121,6 @@ void UpdateAssemblyReferencesToRemovedAssemblies (AssemblyDefinition assembly)

switch (action) {
case AssemblyAction.CopyUsed:
case AssemblyAction.Copy:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

src/linker/Linker.Steps/MarkStep.cs Outdated Show resolved Hide resolved
src/linker/Linker.Steps/SweepStep.cs Outdated Show resolved Hide resolved
@sbomer
Copy link
Member

sbomer commented Mar 25, 2021

edit: also, curious that the pdb is changing but the main dll isn't.

Copy link
Member

@sbomer sbomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the great work! Please hold off on merging until we figure out what to do with the size diff.

@mateoatr
Copy link
Contributor Author

The size decreases can only be caused by us no longer marking all exports that point to marked types, right?

Yes. I'd expect the decrease in size to be due to the marking we used to do in the while-loop in MarkStep.Process().

We should check that the size increases look expected, for example for Microsoft.JSInterop.dll. Some of them may again be cases where we ought to consider adding IsTrimmable if we don't already.

Agreed. I'll take a look at this.

dependencies-dump.zip

@mateoatr
Copy link
Contributor Author

Microsoft.AspNetCore.Authorization:

A descriptor file is created by the SDK: https://github.com/dotnet/sdk/blob/446589256a6fc128d1adffe83551a727ebf7f234/src/BlazorWasmSdk/Targets/Microsoft.NET.Sdk.BlazorWebAssembly.Current.targets#L430
This keeps all types in the above assembly, thus marking the exported types and keeping the whole assembly:

<assembly fullname="Microsoft.AspNetCore.Authorization">
    <type fullname="*" required="false" preserve="all" />
  </assembly>

Other new assemblies:

There are three assemblies which are passed to the linker with the copy action whenever publishing a blazorwasm app, these are: Microsoft.JSInterop, Microsoft.JSInterop.WebAssembly, and System.IO.Pipelines; they all have various static type references to exported types declared in the newly seen assemblies (System.Buffers, System.Diagnostics.Debug, System.Resources.ResourceManager, System.Runtime.Extensions, System.Runtime, System.Threading.Tasks, System.Threading, System.Threading.ThreadPool).

In order to avoid linker from adding these new assemblies to the output, we should investigate whether it's possible to mark the the three copied assemblies with IsTrimmable.

@marek-safar
Copy link
Contributor

This keeps all types in the above assembly, thus marking the exported types and keeping the whole assembly:

You should be able to tweak the descriptor to something like

<assembly fullname="Microsoft.AspNetCore.Authorization">
    <type fullname="Microsoft.AspNetCore.*" required="false" preserve="all" />
  </assembly>

There are three assemblies which are passed to the linker with the copy action whenever publishing a blazorwasm app, these are: Microsoft.JSInterop, Microsoft.JSInterop.WebAssembly, and System.IO.Pipelines;

That's odd. All these assemblies should have explicit metadata IsTrimmable set to true. Are you running a recent version?

Copy link
Contributor

@marek-safar marek-safar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

external/cecil changes need to be reverted
You are adding redundant formatDiagnosticLog.binlog

FileInfo fi = GetOriginalAssemblyFileInfo (assembly);
string target = Path.GetFullPath (Path.Combine (directory, fi.Name));
string source = fi.FullName;

if (assembly.MainModule.HasSymbols && !Context.LinkSymbols && assembly.MainModule.SymbolReader is EmbeddedPortablePdbReader)
Context.LogWarning ("Debug symbols cannot be modified in a copied assembly.", 2104, new MessageOrigin (source));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it a warning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used to allow symbols to be removed from copy assemblies; this would inform users that were doing this that the scenario is no longer supported. Although I think it's a fair point that it doesn't make sense to ask for removal in something tagged with "copy" -- I'll remove warning here.

@@ -271,42 +271,23 @@ FileInfo GetOriginalAssemblyFileInfo (AssemblyDefinition assembly)

protected virtual void CopyAssembly (AssemblyDefinition assembly, string directory)
{
// Special case. When an assembly has embedded pdbs, link symbols is not enabled, and the assembly's action is copy,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the warning. What "ask us to remove symbols for a "copy" assembly," should do? The assembly is copied so the symbols are same. What am I missing?

src/linker/Linker.Steps/MarkStep.cs Outdated Show resolved Hide resolved
@@ -11,12 +11,33 @@ public MarkingHelpers (LinkContext context)
_context = context;
}

public void MarkExportedType (ExportedType type, ModuleDefinition module, in DependencyInfo reason)
public void MarkMatchingExportedType (TypeDefinition typeToMatch, AssemblyDefinition assembly, in DependencyInfo reason)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this working when there are multiple typeforwarders in the chain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All copy assemblies will go through the MarkEntireAssembly method, which marks all exported types and the forwarded scope of every type reference (in particular, this will mark the scope of exported types -- which is the assembly to which the exported type ``points to''); this is enough for keeping all necessary forwarders in a chain.

e.g., if we have the following chain:

[copy] A -> [copy] B -> [link] C -> [link] D -> [implementation] E

and a static typeref to A, the logic would mark A and B (since these are exported types in a copy assembly), C would be marked by MarkForwardedScope(B), D would be trimmed and E would be marked since it's the type that C resolves to:

https://github.com/mono/linker/blob/ea3525008ab1496e304d90476e543565f2cc522e/src/linker/Linker.Steps/MarkStep.cs#L1344-L1350

@mateoatr
Copy link
Contributor Author

mateoatr commented Mar 29, 2021

Re measured size impacts using latest SDK preview. I can see that the descriptor file that caused linker to keep Microsoft.AspNetCore.Authorization because of some exported types declared in it is no longer included (see here). Impact now is much less noticeable. I'm adding the size differences bellow using the same color-codes as before, size are in bytes:

Total size: 2252 KB (used to be 2228 KB).

   17408 Microsoft.AspNetCore.Components.Web.dll
   48128 Microsoft.AspNetCore.Components.WebAssembly.dll
  121856 Microsoft.AspNetCore.Components.dll
    6144 Microsoft.Extensions.Configuration.Abstractions.dll
    8704 Microsoft.Extensions.Configuration.Json.dll
    7680 Microsoft.Extensions.Configuration.dll
   12800 Microsoft.Extensions.DependencyInjection.Abstractions.dll
   38400 Microsoft.Extensions.DependencyInjection.dll
   24064 Microsoft.Extensions.Logging.Abstractions.dll
   17408 Microsoft.Extensions.Logging.dll
   14848 Microsoft.Extensions.Options.dll
    8704 Microsoft.Extensions.Primitives.dll
    8704 Microsoft.JSInterop.WebAssembly.dll
   30720 Microsoft.JSInterop.dll
   17408 System.Collections.Concurrent.dll
   66560 System.Collections.Immutable.dll
!  22016 System.Collections.dll (before: 22528)
    4608 System.ComponentModel.dll
   13824 System.Console.dll
   25088 System.Linq.dll
!  17920 System.Memory.dll (before: 18432)
   10752 System.Net.Http.Json.dll
  138752 System.Net.Http.dll
    7168 System.Net.Primitives.dll
    7680 System.ObjectModel.dll
+1121280 System.Private.CoreLib.dll (before: 1117184)
   21504 System.Private.Runtime.InteropServices.JavaScript.dll
+  66048 System.Private.Uri.dll (before: 65536)
    7168 System.Runtime.CompilerServices.Unsafe.dll
-   5632 System.Runtime.dll
   29696 System.Text.Encodings.Web.dll
  239104 System.Text.Json.dll
   17408 foo.dll
+  21860 foo.pdb (before: 13032)

@eerhardt
Copy link
Member

eerhardt commented Mar 30, 2021

@mateoatr - what's the size difference in the .br compressed files? If it is ~24 KB uncompressed, I'd assume this change makes it ~8 KB .br compressed. That's still a pretty big size regression for Blazor WASM...

@vitek-karas
Copy link
Member

I cloned this branch and tried it on 6.0 Preview 4 Blazor temp app.
The difference compressed is only +906 bytes. Note that with these small numbers the uncompressed size is deceiving. The PE file sections are basically 512 aligned. So some changes are not visible and other changes are exaggerated. For example the app.dll uncompressed size is exactly the same before/after, but compressed it's +114 bytes - something changed in the file which is harder to compress. On the other hand corelib is +4096 uncompressed, but only +309 bytes compressed.

@eerhardt
Copy link
Member

The difference compressed is only +906 bytes.

That's good enough for me. Less than 1KB for more correctness seems like a fair trade.

@vitek-karas
Copy link
Member

Sorry - my bad - I missed that we've added a new file :-(
The difference is +2811 bytes which is borderline to me - it's not completely insignificant.

@marek-safar
Copy link
Contributor

Where is the difference coming from?

@vitek-karas
Copy link
Member

System.Runtime.dll - it's almost 2KB compressed. Uncompressed it's 5.5 KB. We did trim it (original is more like 30KB)...

@marek-safar
Copy link
Contributor

As this is also a correctness fix I'm fine with +2k increase for Blazor and we'll save it elsewhere if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants