Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactors LSP server extension assembly loading #71862

Merged
merged 6 commits into from
Feb 7, 2024

Conversation

dibarbet
Copy link
Member

@dibarbet dibarbet commented Jan 30, 2024

  1. Fixes bug where multiple ALCs were created if two extension dlls had the same parent directory.
  2. Provides better errors messages if extension paths are malformed.
  3. Normalizes which assembly contributes to the MEF composition if provided multiple times
  4. Loads analyzers provided by extensions into the same single extension ALC

Fixes #71100

Analyzers have same instance (and can load satellite assemblies):
image

Extension that does not ship VSTelemetry has same the instance as the host:
image

Two extensions have different instances of dependent assemblies:
image

Two extensions from the same extension directory have the same dependent assembly instance:
image

  1. Fixes bug where multiple ALCs were created if two extension dlls
     had the same parent directory.
  2. Provides better errors messages if extension paths are malformed.
  3. Normalizes which assembly contributes to the MEF composition if
     provided multiple times
  4. Loads analyzers provided by extensions into the same single extension ALC
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Jan 30, 2024
public Assembly? TryLoadAssemblyInExtensionContext(AssemblyName assemblyName)
{
// We don't know exactly which extension the assembly came from, so we'll try each extension load context.
foreach (var loadContext in _directoryLoadContexts.Values)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @jasonmalinowski - I wasn't sure if there was a better way to do this. Basically the MEF custom assembly loader calls into us with an AssemblyName. I'm not sure how to get the actual path it is trying to load to find the exact extension ALC to use. There is a codebase path on AssemblyName, but its obsolete.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could imagine that we could create an AssemblyName to path map in the constructor when you're initially walking through list of DLLs. Basically, you're restricting the list of potential assembly names that could be called to that list, but I think that's reasonable?

I'm not sure it strictly matters, but in some esoteric case like:

  1. Extension A ships A.dll, a MEF component.
  2. Extension B ships B.dll, a MEF component. It also ships A.dll, but not as a MEF component.

For the purposes of trying to get the MEF component for A.dll, you really want A, even if it's also shipped elsewhere. (and you'd want extension B getting it's own A.dll).

It's a bit unfortunate though that VS MEF can't give us a full path in this case?

Copy link
Member Author

@dibarbet dibarbet Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasonmalinowski hmm I could do the assembly name map assuming that the assembly name from extension a and b are different.

It's a bit unfortunate though that VS MEF can't give us a full path in this case?

The assembly name code base does have the full path in this case, but its marked obsolete. I could suppress that warning and check it anyway?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code base does have the full path in this case, but its marked obsolete. I could suppress that warning and check it anyway?

Oh I see: I was imagining you'd use the codebase given to the VS MEF IAssemblyLoader call that does have one, if it's available. But practically the only way we can ever make this work is if VS MEF is reliably able to tells the actual underlying path. If it's taking a model that you can't have more than one extension w/ the same assembly name then that might just be a limitation we have. And I can imagine that a least on VS for Windows, that's a reasonable limitation given the underlying .NET Framework limitations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MEF in general (VS-MEF or otherwise) makes a very huge assumption that type full names (without their assembly names to qualify them) are unique in the MEF catalog. By default (and this comprises 99% of the cases) an import or export will use the Type.FullName as the MEF contract name, which is how imports and exports are matched.
If the same assembly is included in the catalog twice (whether or not it's the same version of the assembly) and that assembly includes MEF parts, it will most likely produce many cases of export duplication, which would bring down everyone who expected just one export (i.e. [Import]) and those who use [ImportMany] would get an export from each assembly.
You should therefore be sure to only include one assembly of a given name in your MEF catalog. You might choose the newest version if there are multiple to choose from.
If you include one in the MEF catalog but you allow other copies to also load (in other ALCs), that shouldn't confuse MEF, but it could still produce some confusing errors around type non-equivalence, and some of those errors could show up with MEF on the callstack.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should therefore be sure to only include one assembly of a given name in your MEF catalog. You might choose the newest version if there are multiple to choose from.
If you include one in the MEF catalog but you allow other copies to also load (in other ALCs), that shouldn't confuse MEF, but it could still produce some confusing errors around type non-equivalence, and some of those errors could show up with MEF on the callstack.

Yeah that was my rough plan - we'll only include an assembly once into the catalog, but we'll allow an extension assembly to have other copies in its own ALC that do not contribute to the catalog.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you still want to do a map here so we're not having to do an O(n) loop?


// Create the extension assembly load context for the extension.
logger.LogTrace("Loading extension {assemblyFilePath}", assemblyFilePath);
var loadContext = new ExtensionAssemblyLoadContext(fileNameNoExt, directory, loggerFactory);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the name for the ALC and the logger, I'd say make it a directory name (or even something like "Extension ALC for ...") so that way it's really clear where loads fail in the _logger.Trace() calls in the ALC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(you can also then get rid of fileNameNoExt in that case)

Copy link
Member Author

@dibarbet dibarbet Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logger is created with the ALC name, so its already pre-pended there. Currently the ALC name is the file name without the extension, but I should change that to the directory name. Will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So tested this out a bit - using the assembly name for the ALC name seems the best to me. Using the directory name can get confusing if you have multiple extension from the 'bin' folder for example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I was still wanting the full directory name for a few reasons:

  1. Using the first file name found more or less means we're depending on potentially nondetermnistic behavior -- if I have two DLLs like A.dll and B.dll, the ALC name is more or less arbitrary at that point.
  2. Full directory name would also include things like extension name or extension version on the directory path, which might be good so if we find a problem with a random extension named "foo.dll" we have a bit more info to track it down.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the directory name can get confusing if you have multiple extension from the 'bin' folder for example.

To me this is the critical bit: if you have more than one extension in a bin folder, that's just one ALC!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well in my case, I had two different extensions, both using a bin folder. So using just the directory name wouldn't tell me which was which. I could do the full path though.

return assembly;
}

return _defaultLoader.LoadFromPath(fullPath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this won't reuse it, right? Because this is going to give it a file path and then reload that again in the default implementation's ALC?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if we do not find the assembly in an extension ALC, it will fallback to the current behavior (which is a separate analyzer ALC).

@dibarbet dibarbet force-pushed the analyzer_assembly_loading_vscode branch 2 times, most recently from 4d9a0a1 to 3a54eb2 Compare January 31, 2024 00:23
  1. Ensure an assembly only contributes to the MEF catalog once
  2. Prefer the code base to load MEF assemblies
  3. Load devkit dlls via hooking AssemblyLoadContext.Default.Resolve
  4. Misc feedback
@dibarbet dibarbet force-pushed the analyzer_assembly_loading_vscode branch from 3a54eb2 to fe8e648 Compare January 31, 2024 01:44
@dibarbet dibarbet marked this pull request as ready for review January 31, 2024 19:00
@dibarbet dibarbet requested a review from a team as a code owner January 31, 2024 19:00
Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only change I'd make here is for the load-by-assembly-name that doesn't have a codebase, just do a map rather than just loading through all the ALCs as you're doing.


// Create the extension assembly load context for the extension.
logger.LogTrace("Loading extension {assemblyFilePath}", assemblyFilePath);
var loadContext = new ExtensionAssemblyLoadContext(fileNameNoExt, directory, loggerFactory);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I was still wanting the full directory name for a few reasons:

  1. Using the first file name found more or less means we're depending on potentially nondetermnistic behavior -- if I have two DLLs like A.dll and B.dll, the ALC name is more or less arbitrary at that point.
  2. Full directory name would also include things like extension name or extension version on the directory path, which might be good so if we find a problem with a random extension named "foo.dll" we have a bit more info to track it down.


// Create the extension assembly load context for the extension.
logger.LogTrace("Loading extension {assemblyFilePath}", assemblyFilePath);
var loadContext = new ExtensionAssemblyLoadContext(fileNameNoExt, directory, loggerFactory);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the directory name can get confusing if you have multiple extension from the 'bin' folder for example.

To me this is the critical bit: if you have more than one extension in a bin folder, that's just one ALC!

public Assembly? TryLoadAssemblyInExtensionContext(AssemblyName assemblyName)
{
// We don't know exactly which extension the assembly came from, so we'll try each extension load context.
foreach (var loadContext in _directoryLoadContexts.Values)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you still want to do a map here so we're not having to do an O(n) loop?

  1. Move duplicate checking to ExtensionAssemblyManager
  2. Store mapping of assembly full name to load context to allow MEF to
     load an assembly by name from the correct load context.
  3. Move devkit assembly resolution out of ExtensionAssemblyManager
  4. Misc feedback
Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

I worry any further comments I make will only make this worse when I later realize I was wrong.

Comment on lines +56 to 59
if (codeBasePath is not null)
{
return LoadAssemblyFromCodeBase(assemblyName, codeBasePath);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random thought: if we have a codebase path and we know the path isn't our application direectory, should have just done this first? Rather than that earlier load that we could have predicted would fail?

Comment on lines +51 to +52
var directoryLoadContexts = new Dictionary<string, AssemblyLoadContext>(StringComparer.Ordinal);
var assemblyFullNameToLoadContext = new Dictionary<string, AssemblyLoadContext>(StringComparer.Ordinal);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using ImmutableDictionary builders, but doesn't really matter much.

@dibarbet dibarbet merged commit c3d3595 into dotnet:main Feb 7, 2024
28 checks passed
@dibarbet dibarbet deleted the analyzer_assembly_loading_vscode branch February 7, 2024 00:55
@ghost ghost added this to the Next milestone Feb 7, 2024
@jjonescz jjonescz modified the milestones: Next, 17.10 P2 Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Language server extension dlls with analyzers and MEF parts get loaded twice into different ALCs
4 participants