-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve error recovery after missing ">" token #32124
Improve error recovery after missing ">" token #32124
Conversation
@@ -5455,6 +5467,12 @@ private void ParseTypeArgumentList(out SyntaxToken open, SeparatedSyntaxListBuil | |||
} | |||
|
|||
close = this.EatToken(SyntaxKind.GreaterThanToken); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please place a 'return' here to help separate out the normal code flow, and hte helper extension methods.
else if (this.IsPossibleType()) | ||
{ | ||
var nextToken = this.PeekToken(1); | ||
if (nextToken.Kind == SyntaxKind.GreaterThanToken || nextToken.Kind == SyntaxKind.CommaToken) // presumably missing a comma |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment is not really useful. It would be better to have a deeper explanation of what's going on here.
{ | ||
types.AddSeparator(this.EatToken(SyntaxKind.CommaToken)); | ||
types.Add(this.ParseTypeArgument()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't see a lot of value with this helper. i think it would be easier to understand if it was just inlined.
@@ -6274,6 +6274,46 @@ class C | |||
Assert.Equal((int)ErrorCode.ERR_IdentifierExpected, file.Errors()[0].Code); | |||
} | |||
|
|||
[Fact] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs WorkItem attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the parameters to the WorkItem attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WorkItem(24642, "https://github.com/dotnet/roslyn/issues/24642")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
.Select(d => ((IFormattable)d).ToString(null, EnsureEnglishUICulture.PreferredOrNull))); | ||
|
||
Assert.Equal(16, syntaxTree.FindNodeOrTokenByKind(SyntaxKind.MethodDeclaration).Position); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need a lot more tests of this new behavior. especially where tehre might be a >
but we now assume that the lack of a ,
causes us to terminate the list earlier.I would particularly like a PR that first adds those tests (and has the errors currently), then makes the parser change, and has to update those tests.
i.e. to see if this causes a regressive behavior in some cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, we should have tests that validate the actual shape of the tree. See several other parsing test files for examples of this. For example, nothing about this test helps me know if the parser really understood this was a method with a generic return type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CyrusNajmabadi
That's fair enough and I will attempt to do so.
I think, but have not yet verified that error recovery here, although not perfect, is absolutely better than it was. i.e.there are no cases where error recovery is worse than it was previously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excluding a case where two commas in the type argument list are missed, but that is rare enough that I will not worry about it too much
} | ||
else if (this.IsPossibleType()) | ||
{ | ||
var nextToken = this.PeekToken(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't understand how peeking 1 token is sufficient. the 'possible type' may be unbounded in length. are you only testing simple 1-token types
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CyrusNajmabadi
You are right.
If lookahead is very quick per token, how far ahead is it performant to lookahead?
I think a sensible algorithm might be to say that a GenericTypeArgumentList will always appear somewhere before one of:
;
=
{
=>
We look ahead till the next such token, and if we are able to find a closing >
token (counting opening <
along the way) we assume we are missing a comma.
Else we assume we are missing a >
token.
If we get past some contant number of tokens, we give up and assume it's a ,
Does that sound sensible?
At the cost of making the function longer (but since it's a switch, presumably not much less performant) we could add all the miriad SytntaxKinds that can't appear in a Generic Type Argument List, but I think the above syntax kinds will be sufficient.
Performing more expensive computation in error cases is not really a problem (as long as it's not ridiculously expensive). That's because error cases are rare most of the time. i.e. in practice, files are 99+% correct 99+% of the time. Errors come in in a very localized fashion, but are then corrected by the user very quickly. So, instead of just looking one token ahead, we should really be scanning a full type ahead in these error cases and looking to see what is after the next full type we see. We should also be testing with a wealth of interesting type-cases (including nested incomplete generics). |
Dictionary<int List<int> M()
// this should be parsed as
Dictionary<int, List<int>> M()
// but is instead parsed as
Dictionary<int> List<int> M() But i can't think of anyway to get that to work that isn't extremely complex. |
Why can't you speculatively scan to to see if you have a type argument? |
Then you end up in trouble here: void M(List<int list, string str) You'd parse it as void M(List<int, list, string> str) You'd have to have different rules depending on where you were which would very quickly get very complex |
What's wrong with that? |
That requires assuming two tokens are missing rather than one, so is likely to be wrong more frequently |
It's also difficult to distinguish between Dictionary<int int M()
// Which should be parsed as
Dictionary<int int> M() And new Dictionary<int int M()
// Which should be parsed as
new Dictionary<int int M>() |
I don't see what's difficult about distinguishing those. One is followed by an open paren, and thus looks like a name, and not a type-arg itself. |
Why? I'm not understanding the relation here. It also seems like you may be overthinking this space. As mentioned previously, errors are rare. The only pop in occasionally as the user is in the middle of editing something. Heck, even wanting to change parsing here isn't something i'm strongly in favor of due to it not actually improving things substantively, while also potentially negatively impacting things. |
In both cases they are followed by an open Paran, but in one case (a new expression) it is a type argument and in one (a return type) it isn't |
Anyway, I'm going to close this for now, until I have something more concrete. I feel like to make error recovery easier, the entire LanguageParser API needs to be reconsidered to make look-ahead easier. Currently all the various helper methods use the |
Previously we assumed that when a
>
or,
token was missing, the missing token was a,
.This PR assumes that the missing token is
>
, unless a lookahead of one suggests otherwise.A fix to #24642