Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run phases of code cleanup in parallel #73386

Merged
merged 36 commits into from
May 8, 2024

Conversation

CyrusNajmabadi
Copy link
Member

Followup to #73383.

Prior to this PR, the way FixAll worked was that it would first call into every provider to produce the fixed document roots for all relevant files. This was done in parallel and would produce an updated solution.

We'd then go (in parallel) across all those fixed documents and woudl then clean them. This was inefficient because hte cleaning phase per document would fork the solution 4 times. First, to add missing imports for the doc. Second, to simplify any expanded constructs. Third, to case-correct VB files. Fourth, to format hte files (though this wouldn't incur compilation costs as it's purely syntactic). Each time it was forked, and the next cleaning operation was done on that document, that would incur all the costs to create compilations (including generators, skeletons, and any other semantic work).

This change fixes cleaning to be much smarter. Instead of 4 forks per document, tehre are 4 forks total. Specifically, we have the starting solution, we then add imports to all changed documents in parallel against that solution. We use the results of all those documents to produce the next solution. To that, we then simplify expanded constructs in parallel across all files, using htat to produce the next solution. We then finally do that for VB-case correction.

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels May 8, 2024
@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review May 8, 2024 18:01
@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner May 8, 2024 18:01
@CyrusNajmabadi CyrusNajmabadi marked this pull request as draft May 8, 2024 18:37
@@ -16,7 +16,7 @@ namespace Microsoft.CodeAnalysis.IntroduceVariable;

internal partial class AbstractIntroduceVariableService<TService, TExpressionSyntax, TTypeSyntax, TTypeDeclarationSyntax, TQueryExpressionSyntax, TNameSyntax>
{
internal abstract class AbstractIntroduceVariableCodeAction : CodeAction
private abstract class AbstractIntroduceVariableCodeAction : CodeAction
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge of #73392 into this PR. once that goes in, this will disappear from this PR.

@@ -248,7 +250,7 @@ public Task<ImmutableArray<CodeActionOperation>> GetOperationsAsync(Cancellation

if (operations != null)
{
return await this.PostProcessAsync(originalSolution, operations, cancellationToken).ConfigureAwait(false);
return await PostProcessAsync(originalSolution, operations, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

became a static method.


internal async Task<Solution> PostProcessChangesAsync(
internal static async Task<ImmutableArray<(DocumentId documentId, CodeCleanupOptions codeCleanupOptions)>> GetDocumentIdsAndOptionsAsync(
Solution solution, CodeCleanupOptionsProvider optionsProvider, ImmutableArray<DocumentId> documentIds, CancellationToken cancellationToken)
Copy link
Member Author

@CyrusNajmabadi CyrusNajmabadi May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

helper used here (for code actions that produce a solution (as opposed to a doc), and also in the FixAll-InSolution code. We want hte parallel benefits for both codepaths.

{
// We do this in 4 serialized passes. This allows us to process all documents in parallel, while only forking
// the solution 4 times *total* (instead of 4 times *per* document).
progress.AddItems(documentIdsAndOptions.Length * 4);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the meat of hte change. 4 defined passes on the solution. not 4 passes per document.

}
// Then do a pass where we cleanup semantics.
var semanticCleanedSolution = await CleanupSemanticsAsync(
syntaxCleanedSolution, documentIdsAndOptions, CodeAnalysisProgress.None, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for features that produce a new solution (as oppposed to new document) they now get the benefit of this cleanup in parallel, and only 4 passes as well across all documents.

@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review May 8, 2024 18:54
progressTracker.Report(CodeAnalysisProgress.Clear());
progressTracker.Report(CodeAnalysisProgress.Description(WorkspacesResources.Running_code_cleanup_on_fixed_documents));

var cleanedSolution = await CodeAction.CleanSyntaxAndSemanticsAsync(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defers to the single helper to do the syntax and semantic cleanup passes.

if (!document.SupportsSyntaxTree)
return document;

var cleanedSolution = await CleanupSyntaxAndSemanticsAsync(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleaning a document just defers to the main N-pass helper, just processing this single document.

// good state. The semantic cleanup passes may have introduced new nodes with elastic trivia that have to be
// cleaned.
static (document, options, cancellationToken) => CleanupSyntaxAsync(document, options, cancellationToken),
];
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i really liked this. it emphasizes that we're doing a series of passes in a specific order.

var solutionChanges = changedSolution.GetChanges(originalSolution);
var documentIds = solutionChanges
.GetProjectChanges()
.SelectMany(p => p.GetChangedDocuments(onlyGetDocumentsWithTextChanges: true).Concat(p.GetAddedDocuments()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetChangedDocuments

We don't need to worry about additional or analyzerconfig docs, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not, these probably only affect code docs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope. those are never cleaned up. 'cleaning' is a specific C#/VB concept where we do final passes, applying formatting, and going and simplifying code that a code fixer annotated in the syntax trees.

// Grab all the cleaned roots and produce the new solution snapshot from that.
var currentSolution = args.solution;
await foreach (var (documentId, newRoot) in stream)
currentSolution = currentSolution.WithDocumentSyntaxRoot(documentId, newRoot);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithDocumentSyntaxRoot

FYI, I've got it in my mind that we should eventually be batching these solution updates. I wonder if all these forks are still visible in the profiles with all the changes that have happened since I last measured.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. we can totally do that. it's orthogonal to the work here, but i'm fully in support of it.

primarily, these prs ensure that we're not forking the solution N times and also getting compilatinos for each fork.

i will do this in a followup :)

Copy link
Contributor

@ToddGrun ToddGrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

Copy link
Contributor

@ToddGrun ToddGrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@CyrusNajmabadi CyrusNajmabadi merged commit c0ef482 into dotnet:main May 8, 2024
25 checks passed
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone May 8, 2024
@CyrusNajmabadi CyrusNajmabadi deleted the cleanupParallel branch May 8, 2024 21:36
@CyrusNajmabadi
Copy link
Member Author

@jasonmalinowski For review when you get back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants