Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error recovery when encountering errant close-paren tokens. #69482

Merged
merged 21 commits into from
Aug 16, 2023

Conversation

CyrusNajmabadi
Copy link
Member

@CyrusNajmabadi CyrusNajmabadi commented Aug 12, 2023

Fixes #58529

Our recursive parsing system was errantly adding close-paren tokens into the set of tokens used to determine when to bail out of a scope. This is not an appropriate use of that system as it should only look for tokens known to absolutely terminate a construct, or tokens known to absolutely start another sibling or parent construct.

Fixed this to no longer be part of the error recovery system, and instead just treat the close-paren specially for the 3 parsing constructs that cared about it.

Note: the majority of this PR is deletions. THe addition of a large test though makes it seem otherwise.

@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner August 12, 2023 02:46
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead labels Aug 12, 2023
@@ -8406,7 +8412,7 @@ private FixedStatementSyntax ParseFixedStatement(SyntaxList<AttributeListSyntax>

var saveTerm = _termState;
_termState |= TerminatorState.IsEndOfFixedStatement;
var decl = ParseVariableDeclaration();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed this to make it clear which caller contexts it is expected to be used in. The old name was highly misleading and could make people think they should they should use this in any circumstance they want to parse a variable-declaration (like in a field for example).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

importantly, all of these are the constructs that do want to stop parsing the variable declaration once they hit an errant close paren, as their construct themselves want to consume the close paren into themselves.

else if (stopOnCloseParen && this.CurrentToken.Kind == SyntaxKind.CloseParenToken)
{
break;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is half of the fix. Callers who are parsing a list of declarators say how close parens should be treated.

@@ -9784,7 +9800,6 @@ private bool IsEndOfDeclarationClause()
switch (this.CurrentToken.Kind)
{
case SyntaxKind.SemicolonToken:
case SyntaxKind.CloseParenToken:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the other half. we don't mix 'close paren' into the 'bail out set' for error recovery.

@@ -2034,33 +2034,21 @@ public void StaticLambdaFunctionPointer()
Diagnostic(ErrorCode.ERR_SemicolonExpected, "{").WithLocation(1, 58));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

view with whitespace off.

// (1,32): error CS1002: ; expected
// delegate*<void> ptr = &static () => { };
Diagnostic(ErrorCode.ERR_SemicolonExpected, ")").WithLocation(1, 32));
Diagnostic(ErrorCode.ERR_SyntaxError, "static").WithArguments(",").WithLocation(1, 24));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

virtually all cases got better. with many fewer error recovery errors than before.

@@ -2103,7 +2091,7 @@ void verify()
}
}
}
M(SyntaxKind.SemicolonToken);
N(SyntaxKind.SemicolonToken);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots less missing tokens as error recovery now does a better job not preemptively bailing out.

@CyrusNajmabadi
Copy link
Member Author

@dotnet/roslyn-compiler ptal. This helps a lot of situations where a simple errant close paren can cause huge cascading failures in parsing.

@CyrusNajmabadi
Copy link
Member Author

@RikkiGibson @cston ptal.

@@ -4528,22 +4528,11 @@ private void ParseParameterModifiers(SyntaxListBuilder modifiers, bool isFunctio

var type = this.ParseType();

var saveTerm = _termState;
_termState |= TerminatorState.IsEndOfFieldDeclaration;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved this into the single ParseFieldDeclarationVariableDeclarators helper.

var variables = _pool.AllocateSeparated<VariableDeclaratorSyntax>();
try
{
this.ParseVariableDeclarators(type, VariableFlags.Fixed, variables, parentKind);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed ParseVariableDeclarators to ParseFieldDeclarationVariableDeclarators since that's the only case it is used for. this allowed a lot of simplification.

// (1,18): error CS1022: Type or namespace definition, or end-of-file expected
// [return: A, B].C();
Diagnostic(ErrorCode.ERR_EOFExpected, ")").WithLocation(1, 18));
Diagnostic(ErrorCode.ERR_SyntaxError, "]").WithArguments(",").WithLocation(1, 14));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we recover much faster in practically all cases now. no cases get worse. a small handful stay around the same.

N(SyntaxKind.GlobalStatement);
{
N(SyntaxKind.EmptyStatement);
{
N(SyntaxKind.SemicolonToken);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing like thinking you're missing a semicolon, only to be immediatley followed by an unmatched semicolon that makes its own statement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's nice to see missing tokens "restored" and a cleaner parse.

{
N(SyntaxKind.IdentifierToken, "M");
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

old code basically tried to guess what was going on after the ), getting is quite wrong. new paths basically go into error recovery and state that we can't handle things until we get to the next semicolon.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also means a lot less broken constructs and a lot less error messages in these bogus cases.

@CyrusNajmabadi
Copy link
Member Author

@RikkiGibson @cston ptal.

@RikkiGibson RikkiGibson self-assigned this Aug 14, 2023
src/Compilers/CSharp/Portable/Parser/LanguageParser.cs Outdated Show resolved Hide resolved
N(SyntaxKind.GlobalStatement);
{
N(SyntaxKind.EmptyStatement);
{
N(SyntaxKind.SemicolonToken);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's nice to see missing tokens "restored" and a cleaner parse.

@CyrusNajmabadi
Copy link
Member Author

@jjonescz @cston ptal :)

_syntaxFactory.VariableDeclaration(type, _pool.ToListAndFree(variables)),
_syntaxFactory.VariableDeclaration(
type,
this.ParseFieldDeclarationVariableDeclarators(type, VariableFlags.Const, parentKind)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParseFieldDeclarationVariableDeclarators

It looks like ParseConstantFieldDeclaration() is now indirectly saving and restoring _termState when parsing variable declarators. What is the impact of that change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

honestly, it's likely better there so that parsing a constant field is the same as parsing a normal field. TBH, it's kinda strange that we even have this method instead of just consuming 'const' in the caller and calling into normal field decl parsing.

The only real difference here is that ; is mixed into the terminator tokens for this construct. but that's totally fine and desirable for constant fields.

@CyrusNajmabadi CyrusNajmabadi merged commit 2b3c173 into dotnet:main Aug 16, 2023
27 checks passed
@ghost ghost added this to the Next milestone Aug 16, 2023
@CyrusNajmabadi CyrusNajmabadi deleted the closeParenParsing branch August 16, 2023 19:08
@dibarbet dibarbet modified the milestones: Next, 17.8 P2 Aug 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extra closing parenthesis in local function breaks parsing for the rest of the file
4 participants