Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync solution contents consistently #72860

Merged
merged 19 commits into from
Jul 3, 2024

Conversation

CyrusNajmabadi
Copy link
Member

@CyrusNajmabadi CyrusNajmabadi commented Apr 3, 2024

Prior to this PR our solution-syncing logic had two different codepaths it would go through. First, if there was no stored-primary-solution on the OOP side, it would do a 'full solution sync', (where 'full solution sync' wasn't actually necessarily full!). Second, if there was a stored primary solution, we'd then do a delta sync, which is the codepath we've hardened and validated properly syncs all data we need correctly.

This PR gets rid of the first codepath and makes it operate by going down the second. Basically, if there is no primary solution stored, we create an empty one, and then do a delta sync from it to the final state.

--

This works great, but ended up revealing a subtle bug in how we do OOP source-generator execution. This was never discovered before because the test was always benefitting from the 'fresh full sync' (which was actually missing data), instead of going through the delta path which does a proper sync and ensured the same data on both the host and oop side.

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Apr 3, 2024
@CyrusNajmabadi
Copy link
Member Author

Need to go over why this is failing with @jasonmalinowski . I believe it's due to a very odd test that itself may be wrong. But it relates to frozen SG docs, so i need to talk to him to understand that better.

@@ -34,20 +34,6 @@ internal partial class RemoteWorkspace
private readonly AssetProvider _assetProvider = assetService;
private readonly Solution _baseSolution = baseSolution;

public async Task<bool> IsIncrementalUpdateAsync(Checksum newSolutionChecksum, CancellationToken cancellationToken)
Copy link
Member Author

@CyrusNajmabadi CyrusNajmabadi Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to inlined helper in the containing type.


// if either solution id or file path changed, then we consider it as new solution
return currentSolution.Id == newSolutionInfo.Id && currentSolution.FilePath == newSolutionInfo.FilePath;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was jsut a move.

// If not, have to create a new, fresh, solution instance to update.
var solutionInfo = await assetProvider.CreateSolutionInfoAsync(solutionChecksum, cancellationToken).ConfigureAwait(false);
return CreateSolutionFromInfo(solutionInfo);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simple logic. determine if we're updating the existing solution (and return it if so), or creating a new solution (and create it if so).

}
// Now, bring that solution in line with the snapshot defined by solutionChecksum.
var updater = new SolutionCreator(Services.HostServices, assetProvider, solutionToUpdate);
return await updater.CreateSolutionAsync(solutionChecksum, cancellationToken).ConfigureAwait(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once we get the solution to get update, now we actually do the work to ensure it is in sync with teh host side.

@CyrusNajmabadi
Copy link
Member Author

@jasonmalinowski Let me try to explain what's going on wrong (as far as the test is concerned). Note, i do not know if the test is wrong, or if the product behavior is wrong. So i'll def need your input here.

First, let's talk about what teh change is that this PR is making. Previously, the way we would sync with oop would be the following:

  1. send checksum over.
  2. rehydrate solution-id and solution-filepath and compare that to RemoteWorkspace's CurrentSolution.
  3. if different, we'd make a new solution, hydrating it from teh checksum.
  4. if teh same same, we'd take the CurrentSolution, get its local checksum, and do a diff/sync of it vs the checksum passed in, syncing over missing/changed data and making the necessary changes.

Now: consider what would happen in tests. In lots of tests, we would never go into the '4' path (even when making local changes to the solution). That's because we never sent the message from host to oop to "set the primary solution" in the first place. So for every call to OOP, we'd always go into the '3' path, always creating a new fresh solution instance and operating on that.

Why does this matter? Well paths '3' and '4' operate in a subtle different fashion. While both (should be) the same vis-a-vis getting the same sets of projects, documents, options, references, and analyzers. '3' differs from '4' in that it does not hydrate the "frozen source generated documents" into the solution it just created.

In the tests in question that fails SolutionWithSourceGeneratorTests.FreezingSourceGeneratedDocumentsWorks this has the impact that there's no prior frozen generated document when freezing the second document, and so only 1 document is found here:

var frozenTree = Assert.Single((await frozenWithSingleDocument.Project.GetRequiredCompilationAsync(CancellationToken.None)).SyntaxTrees);

Ok. So that explains what was going on, and why the test passed before. So what happened with my change. Well, with my change, i changed '3' and '4'. Now the new logic is:

  1. send checksum over.
  2. rehydrate solution-id and solution-filepath and compare that to RemoteWorkspace's CurrentSolution.
  3. if different, we'd make a new solution, hydrating it from teh checksum.
  4. regardless if same or different, we'd take the existing solution (or freshly hydrated new solution), get its local checksum, and do a diff/sync of it vs the checksum passed in, syncing over missing/changed data and making the necessary changes.

With thsi new logic we always do the diff/sync (to be as sure as possible the oop solution matches the host one). In the test in question, that means that after the initial sync, we see the host side had a "frozen source generated documents", and hten it adds another one (that's jut what the test does). This then means that we think we have two frozen generated documents as we keep the old syntax tree around, and add the new one in. Causing the test to fail.

I'm honestly not sure how the "frozen source generated documents" piece is supposed to work. And i'm worried the test is trying to tell me about something important :) So i don't want to just change it. So i def need your understanding/info here to move forward on this.

Note: i really like the overall concept of the change. I think it's very sound to just make a fresh solution when needed, but otherwise have both codepaths do the same full diff sync in both cases to ensure you always get the same oop snapshot no matter how you started.

Put another way, i view it as a strong bug/consistency-issue that you could ever get different solution-snapshots on the OOP side depending on if it was starting from scratch, or if it was updating something it already had. Thsi PR attempts to fix that, but runs afoul of some of the stuff you know much more about.

Thanks!

@CyrusNajmabadi
Copy link
Member Author

@jasonmalinowski this is still a priority :)

AssetPathKind.SolutionSourceGeneratorExecutionVersionMap, newSolutionCompilationChecksums.SourceGeneratorExecutionVersionMap, cancellationToken).ConfigureAwait(false);

solution = solution.UpdateSpecificSourceGeneratorExecutionVersions(newVersions);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: this logic intentionally disappears. because we now just have it fall out from calling into the normal CreateSolutionAsync copdepath which does the appropraite checksum diffing/syncing. (It also does a validation sanity check in debug, which this codepath never did).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review June 24, 2024 22:26
@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner June 24, 2024 22:26
@CyrusNajmabadi CyrusNajmabadi requested a review from ToddGrun June 24, 2024 22:26
@CyrusNajmabadi
Copy link
Member Author

@jasonmalinowski Def need eyes on #72860 (comment). Please review soon. Thanks.

@jasonmalinowski
Copy link
Member

@CyrusNajmabadi: so if I'm following your question correctly I think this is the bit where things are breaking down:

we see the host side had a "frozen source generated documents", and hten it adds another one (that's jut what the test does). This then means that we think we have two frozen generated documents as we keep the old syntax tree around, and add the new one in. Causing the test to fail.

Note that the AssertFrozen helper there is static and not reassigning back to the "project" local. Instead it's doing a test that the freezing works with a regular snapshot, and then doing a second assert it works with a project that we removed the analyzer from. But the frozen snapshot should have been thrown away since it was just a fork at that point. Put another way, this test is conceptually two tests, just with two asserts to avoid duplicating the setup.

It's not obvious why this is failing to me, but it would indicate there's a bug in our code somewhere. I would expect that at the point of failure that this map here:

public TextDocumentStates<SourceGeneratedDocumentState>? FrozenSourceGeneratedDocumentStates { get; }

would contain only a single entry (since in each case where we're doing a test we're calling WithFrozenSourceGeneratedDocument we only had a single document in there. And I'm not sure how there'd be two, since the Ids of both cases would be the same, so the key -> value map really can't have two unless somehow the Ids were different.

What I'm guessing is happening is this: when the first AssertFrozen() is being ran, a compilation is being created, and that fork is being thrown away. But somehow that compilation is making it's way to the second use in AssertFrozen, where the compilation is being taken and "reused" by the generated document being added a second time, somehow. So we're parsing the contents a second time.

@jasonmalinowski
Copy link
Member

I had forgotten you had a question here so my signoff was before I saw that (which explained to me why I was asked to review -- the code looks so simple!) I'll leave the signoff since it seems at least what has been written looks fine, and the bug exists elsewhere.

@CyrusNajmabadi CyrusNajmabadi mentioned this pull request Jul 2, 2024
@CyrusNajmabadi
Copy link
Member Author

@jasonmalinowski That helps a lot. Thanks!

@CyrusNajmabadi CyrusNajmabadi changed the title WIP: Sync solution contents consistently Sync solution contents consistently Jul 2, 2024
internal readonly record struct RegularCompilationTrackerSourceGenerationInfo(
[property: DataMember(Order = 0)] SourceGeneratedDocumentIdentity DocumentIdentity,
[property: DataMember(Order = 1)] SourceGeneratedDocumentContentIdentity ContentIdentity,
[property: DataMember(Order = 2)] DateTime GenerationDateTime);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extracted out a helper type as this was just causing a massively long signature line.

/// Should only be called by the "RegularCompilationTracker", and should only return data from its view of the
/// world. Not from the view of a "GeneratedFileReplacingCompilationTracker".
/// </remarks>
ValueTask<ImmutableArray<RegularCompilationTrackerSourceGenerationInfo>> GetRegularCompilationTrackerSourceGenerationInfoAsync(
Checksum solutionChecksum, ProjectId projectId, CancellationToken cancellationToken);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this helps fix the sublte bug i mentioned in the OP. When teh host calls to OOP to get the generated docs, it is calling from its "regular compilation tracker" (not the "generated file replacing compilation tracker"). As such, the call should go through to that same tracker on the OOP side, not the outermost wrapped tracker like we were doing before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method rename seems a little weird to me. I wonder if it would be a bit nicer from the caller's side to use the original method name, and pass in an enum indicating the type of compilation tracker that is to be used?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be best to chat in person. Maybe a better way I could have phrased the thought is could CompilationTracker have an identifier that could be passed across to determine the tracker to use?

@@ -27,7 +27,7 @@ internal partial class SolutionCompilationState
/// compilation for that project. As the compilation is being built, the partial results are
/// stored as well so that they can be used in the 'in progress' workspace snapshot.
/// </summary>
private partial class CompilationTracker : ICompilationTracker
private sealed partial class RegularCompilationTracker : ICompilationTracker
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename. Didn't change file name to prevent bad diffs.

@CyrusNajmabadi
Copy link
Member Author

CyrusNajmabadi commented Jul 2, 2024

@jasonmalinowski THat helped a bunch. Especially realizing that each fork was independent.

What was happening is that on both the host and oop side we were now CORRECTLY setting up the trackers to have:

Host:
GeneratedSourceReplacingTracker->CompilationTracker

OOP:
GeneratedSourceReplacingTracker->CompilationTracker

Previously, due to the bugs in syncing (that this PR fixes) we weren't doing that and were gettitng:

Host:
GeneratedSourceReplacingTracker->CompilationTracker

OOP:
CompilationTracker

This revealed a bug in OOP source generators. Specifically, the host side's side normal CompilatinoTracker would ask the oop side for generated sources. But the OOP side wouldn't ask its normal CompilationTracker for this, but instead its GeneratedSourceReplacingTracker. So OOP would say "i have one generated document". the host would tehn then go "ok... great, my regular compilation tracker has one generated doc, and i'll now add an additional generated doc on top of that in my GeneratedSourceReplacingTracker".

Basically we had:

Host:                                                         OOP:
GeneratedSourceReplacingTracker->CompilationTracker    ->     GeneratedSourceReplacingTracker->CompilationTracker

Now we have:

Host:                                                         OOP:
                                                   /--------------------->--------------------\
GeneratedSourceReplacingTracker->CompilationTracker           GeneratedSourceReplacingTracker->CompilationTracker

@CyrusNajmabadi
Copy link
Member Author

@jasonmalinowski I know you already signed off. But you should take another look with the additional changes this motivated.

@CyrusNajmabadi
Copy link
Member Author

@ToddGrun ptal.

@CyrusNajmabadi CyrusNajmabadi merged commit 7c90644 into dotnet:main Jul 3, 2024
25 checks passed
@CyrusNajmabadi CyrusNajmabadi deleted the solutionSync branch July 3, 2024 17:36
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone Jul 3, 2024
@RikkiGibson RikkiGibson modified the milestones: Next, 17.12 P1 Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants