Improve error recovery after missing ">" token #32124

YairHalberstadt · 2019-01-03T14:15:33Z

Previously we assumed that when a > or , token was missing, the missing token was a ,.

This PR assumes that the missing token is >, unless a one-token lookahead suggests otherwise.

A fix to #24642

CyrusNajmabadi · 2019-01-03T16:52:24Z

src/Compilers/CSharp/Portable/Parser/LanguageParser.cs

@@ -5455,6 +5467,12 @@ private void ParseTypeArgumentList(out SyntaxToken open, SeparatedSyntaxListBuil
            }

            close = this.EatToken(SyntaxKind.GreaterThanToken);
+


please place a 'return' here to help separate the normal code flow from the helper extension methods.

CyrusNajmabadi · 2019-01-03T16:53:25Z

src/Compilers/CSharp/Portable/Parser/LanguageParser.cs

+                else if (this.IsPossibleType())
+                {
+                    var nextToken = this.PeekToken(1);
+                    if (nextToken.Kind == SyntaxKind.GreaterThanToken || nextToken.Kind == SyntaxKind.CommaToken) // presumably missing a comma


comment is not really useful. It would be better to have a deeper explanation of what's going on here.

CyrusNajmabadi · 2019-01-03T16:53:59Z

src/Compilers/CSharp/Portable/Parser/LanguageParser.cs

+            {
+                types.AddSeparator(this.EatToken(SyntaxKind.CommaToken));
+                types.Add(this.ParseTypeArgument());
+            }


i don't see a lot of value with this helper. i think it would be easier to understand if it was just inlined.

CyrusNajmabadi · 2019-01-03T16:54:08Z

src/Compilers/CSharp/Test/Syntax/Parsing/ParsingErrorRecoveryTests.cs

@@ -6274,6 +6274,46 @@ class C
            Assert.Equal((int)ErrorCode.ERR_IdentifierExpected, file.Errors()[0].Code);
        }

+        [Fact]


needs WorkItem attribute.

What are the parameters to the WorkItem attribute?

WorkItem(24642, "https://github.com/dotnet/roslyn/issues/24642")

CyrusNajmabadi · 2019-01-03T16:56:05Z

src/Compilers/CSharp/Test/Syntax/Parsing/ParsingErrorRecoveryTests.cs

+                    .Select(d => ((IFormattable)d).ToString(null, EnsureEnglishUICulture.PreferredOrNull)));
+
+            Assert.Equal(16, syntaxTree.FindNodeOrTokenByKind(SyntaxKind.MethodDeclaration).Position);
+        }


we need a lot more tests of this new behavior, especially where there might be a > but we now assume that the lack of a , causes us to terminate the list earlier. I would particularly like a PR that first adds those tests (showing the current errors), then makes the parser change, and has to update those tests.

i.e. to see if this causes a regressive behavior in some cases.

also, we should have tests that validate the actual shape of the tree. See several other parsing test files for examples of this. For example, nothing about this test helps me know if the parser really understood this was a method with a generic return type.

@CyrusNajmabadi
That's fair enough and I will attempt to do so.
I think, but have not yet verified, that error recovery here, although not perfect, is strictly better than it was, i.e. there are no cases where error recovery is worse than it was previously.

The exception is a case where two commas in the type argument list are missing, but that is rare enough that I will not worry about it too much.

CyrusNajmabadi · 2019-01-03T20:17:47Z

src/Compilers/CSharp/Portable/Parser/LanguageParser.cs

+                }
+                else if (this.IsPossibleType())
+                {
+                    var nextToken = this.PeekToken(1);


i don't understand how peeking 1 token is sufficient. the 'possible type' may be unbounded in length. are you only testing simple 1-token types?

@CyrusNajmabadi
You are right.
If lookahead is very quick per token, how far ahead is it performant to look?

I think a sensible algorithm might be to say that a GenericTypeArgumentList will always appear somewhere before one of:

; = { =>

We look ahead till the next such token, and if we are able to find a closing > token (counting opening < along the way) we assume we are missing a comma.

Else we assume we are missing a > token.

If we get past some constant number of tokens, we give up and assume it's a ,

Does that sound sensible?

At the cost of making the function longer (but since it's a switch, presumably not much less performant) we could add all the myriad SyntaxKinds that can't appear in a Generic Type Argument List, but I think the above syntax kinds will be sufficient.
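The proposal above can be sketched roughly as follows. This is a toy model only: tokens are plain strings rather than Roslyn SyntaxTokens, and the names MissingToken, TypeArgumentRecovery, Guess and maxLookahead are hypothetical, not part of LanguageParser. It assumes the scan starts inside exactly one unclosed < and that the input is already split into tokens (so => is a single token and >> arrives as two > tokens).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy model: which token do we guess is missing?
public enum MissingToken { Comma, GreaterThan }

public static class TypeArgumentRecovery
{
    // Tokens that, per the proposal, end the region in which a type argument list can appear.
    private static readonly HashSet<string> Terminators =
        new HashSet<string> { ";", "=", "{", "=>" };

    // 'start' is the index of the first token after the point where ',' or '>' was expected.
    // The caller is assumed to be inside exactly one unclosed '<'.
    public static MissingToken Guess(IReadOnlyList<string> tokens, int start, int maxLookahead = 32)
    {
        int depth = 1; // the '<' that opened the current type argument list
        for (int i = start; i < tokens.Count; i++)
        {
            // Past the lookahead budget: give up and fall back to assuming a comma.
            if (i - start >= maxLookahead) return MissingToken.Comma;

            string t = tokens[i];

            // Reached a terminator without closing the list: assume '>' is missing.
            if (Terminators.Contains(t)) return MissingToken.GreaterThan;

            if (t == "<") depth++;
            // A '>' that balances the original '<' means the list does close: assume ','.
            else if (t == ">" && --depth == 0) return MissingToken.Comma;
        }

        // End of input without a balancing '>': assume '>' is missing.
        return MissingToken.GreaterThan;
    }
}
```

For example, for the tokens of `Dictionary<int int> M();` with start at the second int, the sketch guesses a missing comma, while for `List<int M() { }` with start at M it guesses a missing >.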

CyrusNajmabadi · 2019-01-03T23:57:49Z

You are right.
If lookahead is very quick per token, how far ahead is it performant to look?

Performing more expensive computation in error cases is not really a problem (as long as it's not ridiculously expensive). That's because error cases are rare most of the time. i.e. in practice, files are 99+% correct 99+% of the time. Errors come in a very localized fashion, but are then corrected by the user very quickly.

So, instead of just looking one token ahead, we should really be scanning a full type ahead in these error cases and looking to see what is after the next full type we see. We should also be testing with a wealth of interesting type-cases (including nested incomplete generics).
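A minimal sketch of that idea, assuming a toy scanner over string tokens rather than the real parser: scan one complete type, then apply the PR's existing check (is the next token > or ,) to whatever follows it. FullTypeLookahead, TryScanType and LooksLikeMissingComma are hypothetical names, and the scanner only understands identifiers with optional, fully closed type argument lists.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy scanner over string tokens; it is not the parser's own type scanning.
public static class FullTypeLookahead
{
    private static bool IsIdentifier(string t) =>
        t.Length > 0 && (char.IsLetter(t[0]) || t[0] == '_');

    // Advances 'pos' past one complete type such as "int" or "List<Dictionary<int, string>>".
    // Returns false if no well-formed type starts at 'pos' ('pos' is then unspecified).
    public static bool TryScanType(IReadOnlyList<string> tokens, ref int pos)
    {
        if (pos >= tokens.Count || !IsIdentifier(tokens[pos])) return false;
        pos++; // consume the type name

        if (pos < tokens.Count && tokens[pos] == "<")
        {
            pos++; // consume '<'
            while (true)
            {
                if (!TryScanType(tokens, ref pos)) return false;
                if (pos < tokens.Count && tokens[pos] == ",") { pos++; continue; }
                break;
            }
            if (pos >= tokens.Count || tokens[pos] != ">") return false;
            pos++; // consume '>'
        }
        return true;
    }

    // 'start' is the first token after the point where ',' or '>' was expected.
    // If a full type starts there and is followed by ',' or '>', the enclosing
    // list evidently continues, so the missing token was probably a comma.
    public static bool LooksLikeMissingComma(IReadOnlyList<string> tokens, int start)
    {
        int pos = start;
        if (!TryScanType(tokens, ref pos) || pos >= tokens.Count) return false;
        return tokens[pos] == "," || tokens[pos] == ">";
    }
}
```

With the tokens of `Dictionary<int List<int>> M()` and start at List, the whole of `List<int>` is skipped and the following > suggests a missing comma. With `List<int M()` and start at M, the token after the scanned type is an open paren, so the sketch instead suggests a missing >.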

YairHalberstadt · 2019-01-05T21:09:30Z

including nested incomplete generics
My current approach wouldn't work for

Dictionary<int List<int> M()
// this should be parsed as 
Dictionary<int, List<int>> M()
// but is instead parsed as 
Dictionary<int> List<int> M()

But I can't think of any way to get that to work that isn't extremely complex.

CyrusNajmabadi · 2019-01-06T00:03:53Z

Why can't you speculatively scan to see if you have a type argument?

YairHalberstadt · 2019-01-06T02:55:38Z

Why can't you speculatively scan to see if you have a type argument?

Then you end up in trouble here:

void M(List<int list, string str)

You'd parse it as

void M(List<int, list, string> str)

You'd have to have different rules depending on where you were, which would very quickly get very complex.

CyrusNajmabadi · 2019-01-06T03:41:46Z

You'd parse it as

What's wrong with that?

YairHalberstadt · 2019-01-06T03:43:52Z

That requires assuming two tokens are missing rather than one, so is likely to be wrong more frequently

YairHalberstadt · 2019-01-06T04:29:54Z

Why can't you speculatively scan to see if you have a type argument?

It's also difficult to distinguish between

Dictionary<int int M()

// Which should be parsed as 

Dictionary<int int> M()

And

new Dictionary<int int M()

// Which should be parsed as

new Dictionary<int int M>()

CyrusNajmabadi · 2019-01-06T05:13:01Z

I don't see what's difficult about distinguishing those. One is followed by an open paren, and thus looks like a name, and not a type-arg itself.

CyrusNajmabadi · 2019-01-06T05:14:46Z

That requires assuming two tokens are missing rather than one, so is likely to be wrong more frequently

Why? I'm not understanding the relation here. It also seems like you may be overthinking this space. As mentioned previously, errors are rare. They only pop in occasionally as the user is in the middle of editing something. Heck, even wanting to change parsing here isn't something i'm strongly in favor of, since it doesn't actually improve things substantively while also potentially negatively impacting things.

YairHalberstadt · 2019-01-06T08:03:33Z

I don't see what's difficult about distinguishing those. One is followed by an open paren, and thus looks like a name, and not a type-arg itself

In both cases they are followed by an open paren, but in one case (a new expression) it is a type argument and in the other (a return type) it isn't.

YairHalberstadt · 2019-01-06T16:26:45Z

Anyway, I'm going to close this for now, until I have something more concrete.

I feel that to make error recovery easier, the entire LanguageParser API needs to be reconsidered to make look-ahead easier. Currently all the various helper methods, such as IsEndOfNamespace, use the CurrentToken. Would it be acceptable to create overloads for all these methods which take a SyntaxToken as a parameter?

Improve error recovery after missing ">" token

fa833f2

YairHalberstadt requested a review from a team as a code owner January 3, 2019 14:15

CyrusNajmabadi reviewed Jan 3, 2019

vatsalyaagrawal added the Area-Compilers label Jan 3, 2019

sharwell added the Community label (the pull request was submitted by a contributor who is not a Microsoft employee) Jan 3, 2019

CyrusNajmabadi reviewed Jan 3, 2019

YairHalberstadt closed this Jan 6, 2019

YairHalberstadt deleted the ParserErrorRecoveryInGenerics branch January 6, 2019 16:27

YairHalberstadt mentioned this pull request Jan 10, 2019

Improve error recovery in type argument list #32351

Closed
