Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace the algorithm used to calculate semantic diff #2126

Merged
merged 11 commits into from
Jun 30, 2020

Conversation

ryanbrandenburg
Copy link
Contributor

It turns out there are cases which cause the algorithm seen here to fail (Index out of bounds exception). So we need a new way to diff our old and new results. Thankfully @ajaybhargavb wrote his TextDiffer with an excelent abstraction which let me plug my only-kinda-related scenario in and re-use it with only a relatively thin shim.

I tested this in VSCode and VS. VSCode worked without trouble, but VS ran into an issue (When I would remove one token actualy 5 get remove). Talking with @gundermanc we think it's a problem on their end that he's already sent a fix for, but we might want to double check that that's true before merging this.

This is the draftiest PR. I was "finishing up" right at EOD Friday, so feel free to tear it apart.

@ryanbrandenburg
Copy link
Contributor Author

Oh, and I know it will fail some tests, I need to change their expectations on Monday.

@gundermanc
Copy link
Member

VS implementation is still being stabilized, so there are some rough edges.

Here's the fix for the bug in question: https://devdiv.visualstudio.com/DevDiv/_git/VSLanguageServerClient/pullrequest/257407

Copy link
Contributor

@ajaybhargavb ajaybhargavb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some suggestion for better factoring. Otherwise the design LGTM. Ready for the non-draft version 👍

@gundermanc
Copy link
Member

gundermanc commented Jun 27, 2020

@ryanbrandenburg I think I found a bug:

image

I'm getting a response in the VS client that is inserting 11 integers in the diff (2 tokens and 1 superfluous int) but not deleting anything.

My understanding/expectation would be that in order to insert 11 integers, you'd have to have a complimentary deleteCount of 1 to ensure that the patched array is a multiple of 5 (entirely composed of whole tokens).

Does that sound right?

Repro:
Repro3

@gundermanc
Copy link
Member

gundermanc commented Jun 27, 2020

Ah, nevermind, I see what's going on: your edits aren't self contained. One edit is deleting integers and the other is inserting new partial tokens... both at the same location.

If I actually use an array instead of the incremental TagSpan tree updates we use in the VS editor, you'd get the desired result.

Does VS Code handle this scenario correctly? On the one hand, I'd expect edits at a particular location should combine inserts and deletes, and on the other hand, if it's a scenario VS Code supports gracefully, then VS should as well.

Edit
I see @ajaybhargavb called this out above.

This is currently returning 1 edit per diff even though contiguous ones can be collated. The plan is to do that as a follow up.

I'm curious as to whether this is something we anticipate being an important scenario for compatibility or collation is something that's implied/required by the protocol.

@alexdima, @aeschli does the protocol specify one way or another what is/isn't allowed?

@ajaybhargavb
Copy link
Contributor

ajaybhargavb commented Jun 27, 2020

I'm curious as to whether this is something we anticipate being an important scenario for compatibility or collation is something that's implied/required by the protocol.

The requirement is for the edits to not overlap. Collation shouldn't be required but is definitely something we should do.

@ryanbrandenburg
Copy link
Contributor Author

Does VS Code handle this scenario correctly?

I believe that VSCode handles this correctly. I do intend to make Delete's and "new data" coalesce if they start at the same position, but I should point out that the algo I'm using gives zero credence to the "i % 5 == 0" barrier, so it's entirely likely (and even accidentally encountered in one of my test cases) that a single-character edit will produce multiple SemanticTokensEdits.

@ryanbrandenburg ryanbrandenburg marked this pull request as ready for review June 29, 2020 22:42
switch (diff.Operation)
{
case DiffEdit.Type.Delete:
if (current != null &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly break out the logic in each case into a separate method? If you prefer it this way, that's fine too :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That wouldn't be my preference, the function fits on my screen in one go, is accomplishing a single task, and has interaction with the loop ("current" or results.Add). I will admit it's pushing it though, if it got much larger I'd probably agree with you.

Copy link
Contributor

@ajaybhargavb ajaybhargavb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 . Just minor comments

using System;
using System.Collections.Generic;
using Microsoft.AspNetCore.Razor.LanguageServer.Semantic.Models;
using System.Linq;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort

@@ -58,7 +58,12 @@ public override SemanticTokensOrSemanticTokensEdits GetSemanticTokensEdits(Razor
}
var newTokens = ConvertSemanticRangesToSemanticTokens(semanticRanges, codeDocument);

var semanticEdits = SyntaxTokenToSemanticTokensEditHelper.ConvertSyntaxTokensToSemanticEdits(newTokens, previousResults);
if (previousResults is null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove this if if you move line 59 above line 50

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if I convert it to a conditional expression instead? previousResultId == null isn't the only way to get previousResults == null, it also happens if the previousResult was evicted from the cache (most likely scenario for that is if someone was working on a different document for a while).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Didn't realize _semanticTokensCache.Get(previousResultId); could return null. In case you touch it in the future, making that a TryGet would be more inline with how other cache APIs work.

{
throw new ArgumentNullException();
}
if (newArray is null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: newline

{
if (oldArray is null)
{
throw new ArgumentNullException();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw new ArgumentNullException();
throw new ArgumentNullException(nameof(oldArray));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto below

var results = new List<SemanticTokensEdit>();
foreach (var diff in diffs)
{
var current = results.Any() ? results.Last() : null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd avoid using Linq when possible because this is a hot path,

Suggested change
var current = results.Any() ? results.Last() : null;
var current = results.Count > 0 ? results[results.Count - 1] : null;

@gundermanc
Copy link
Member

Another possible bug? Doing a move lines (Alt+Up) in VS triggers a textDocument/didChange notification and an /edits request, but it looks like it comes back empty, so the transplanted line isn't colorized.

RzBug

@ryanbrandenburg
Copy link
Contributor Author

@gundermanc I can repro that, but I feel like I should deal with it in a separate PR so we can get this one moving. I suspect it's a syncing issue rather than a problem with the diff algo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants