Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug while compiling #regions on SDK 8.0.400. #10737

Closed
mauriciocirelli opened this issue Aug 14, 2024 · 13 comments · Fixed by #11086
Closed

Bug while compiling #regions on SDK 8.0.400. #10737

mauriciocirelli opened this issue Aug 14, 2024 · 13 comments · Fixed by #11086
Labels
area-compiler Umbrella for all compiler issues bug Something isn't working
Milestone

Comments

@mauriciocirelli
Copy link

Dear,

Today we tried to build a aspnet 8.0 project, which used to build just fine, and noticed that it pulled a new container version, moving sdk from 8.0.303 to 8.0.400, and we started to get some strange errors like this:

error CS0019: Operator '>' cannot be applied to operands of type 'string' and 'IHtmlContent'

The strange part is that the line that is pointed by the error message does not contain such a comparison.
We could see this same error in several other cshtml files and the common thing is that the line pointed by the error message is one line after an #endregion instruction.

Removing the #region / #endregion pairs fixed the build issue.

Am I missing something?

@MichaelSimons
Copy link
Member

@dotnet/roslyn-compiler - This looks like a potential compiler issue? Should I transfer to Roslyn?

@chsienki
Copy link
Contributor

@mauriciocirelli Is this occurring in a C# file or a Razor file?

@mauriciocirelli
Copy link
Author

mauriciocirelli commented Aug 14, 2024 via email

@chsienki
Copy link
Contributor

@MichaelSimons Can you transfer this to https://github.com/dotnet/razor ? Thanks.

@MichaelSimons MichaelSimons transferred this issue from dotnet/dotnet-docker Aug 14, 2024
@chsienki
Copy link
Contributor

@mauriciocirelli Can you share any repro? A simplified version of the syntax you're using would be great if possible.

@chsienki
Copy link
Contributor

/cc @333fred

@333fred
Copy link
Member

333fred commented Aug 14, 2024

Directives aren't well handled by razor today; there's no explicit testing whatsoever, so it doesn't surprise me that some change broke them. The new tokenizer will have much better support for these scenarios, with explicit testing as well.

@333fred 333fred added area-compiler Umbrella for all compiler issues and removed area-infrastructure labels Aug 14, 2024
@mauriciocirelli
Copy link
Author

mauriciocirelli commented Aug 15, 2024

@mauriciocirelli Can you share any repro? A simplified version of the syntax you're using would be great if possible.

Sure. Something as simple as this:

@{
    #region
    ViewData["Title"] = "Title";
    #endregion

    <h3>@ViewData["Title"]</h3>
    <hr />
}

In my case, I could also fix the docker build by moving the html tags out of the @{} block. But it is worth to point out that the original code builds fine in VS2022, but fails in docker.

@chsienki
Copy link
Contributor

Great, thanks for the code, I can repro on my end. It works on 8.0.303 and fails on 8.0.400

@333fred @janjones It looks like we might be parsing something differently that leads to us treating HTML code as C#, which then gets output verbatim into the code:

43c43
< #line (12,3)-(18,5) "C:\projects\scratch\WebApplication4\WebApplication4\Pages\Index.cshtml"
---
> #line (12,3)-(20,1) "C:\projects\scratch\WebApplication4\WebApplication4\Pages\Index.cshtml"
50c50,51
<     
---
>     <h3> @ViewData["Title"] </h3>
>     <hr />
56,66d56
<             WriteLiteral("<h3> ");
<             Write(
< #nullable restore
< #line (18,11)-(18,28) "C:\projects\scratch\WebApplication4\WebApplication4\Pages\Index.cshtml"
< ViewData["Title"]
< 
< #line default
< #line hidden
< #nullable disable
<             );
<             WriteLiteral(" </h3>\r\n    <hr />\r\n");

Any guesses what might have changed here?

@jjonescz
Copy link
Member

I have investigated a bit and I think it previously worked just by accident - we never parsed #region correctly, but we got to the first @ transition that follows it and because the @ wasn't followed by < nor : we didn't parse that but instead rolled back to before the #region and parsed C# until the first < only (and then parsed HTML):

{
Context.Source.Position = bookmark;
NextToken();
AcceptUntil(SyntaxKind.LessThan, SyntaxKind.LeftBrace, SyntaxKind.RightBrace);
return;
}

That was changed in #10232 - we now always parse the @ transition.

That this never have really worked can be seen if you construct an example without the @ - e.g., this never worked:

@{
    #region
    ViewData["Title"] = "Title";
    #endregion

    <h3>title</h3>
    <hr />
}

@chsienki chsienki added this to the Backlog milestone Aug 15, 2024
@mauriciocirelli
Copy link
Author

mauriciocirelli commented Aug 15, 2024

Well, intended or not, I have been using this code for about 5 years... since .net core 2.0 at least and it has always worked hehehe
I was really surprised by this compilation issue. And more surprised by it happening only in Docker builds... If it was highlighted in VS2022, I would have had it fixed.

@lbussell lbussell removed the status in .NET Docker Aug 15, 2024
@lbussell lbussell removed this from .NET Docker Aug 15, 2024
@jjonescz
Copy link
Member

And more surprised by it happening only in Docker builds... If it was highlighted in VS2022, I would have had it fixed.

That means you are using a different .NET SDK version in your Docker builds than in VS. If you update your local SDK, you should see the errors in VS as well.

@333fred
Copy link
Member

333fred commented Oct 22, 2024

Hi all, we've gotten a number of hits on this bug, so I wanted to summarize the current status.

Bug Summary

Razor's lexer and parser does not have a good understanding of C# preprocessor directives, and never has. When the preprocessor directive is followed by a normal C# statement, something that ends a C# statement (;), or something that starts or ends block (such as a {, }), the directive is usually interpreted correctly. When the preprocessor directive is followed by HTML, the result is much harder to predict. If the HTML contains no C# transitions, an error has been reported at least as far back as .NET 6. If the HTML contains C# transitions such as @, the behavior used to be that the HTML was recognized somewhat correctly. An error recovery mode would roll back parsing and usually end up in the right place. However, this handling of @ in C# contexts was generally inconsistent and broken; we fixed that in #10232. That fix meant that we no longer entered a fallback error recovery mode for C# preprocessor directives, and resulted in the various preprocessor parsing bugs that have been shown here.

Flavors

There are a few different flavors of this bug that we've identified:

  1. The preprocessor directive is directly before HTML. This flavor looks something like this:
@{
    #region R
    <h3>@ViewData["Title"]</h3>
    #endregion
}
  1. There is valid C# between the preprocessor directive and the html, but it doesn't have a character that tells the parser to end parsing before the HTML. This is a variation of 1, and can occur with things like switch statements:
@{
    switch (true)
    {
        #region R
        case true:
            <div>@(1 + 1)</div>
            break;
    }
}
  1. Breaks that do not have any C# transitions in the HTML that follows the preprocessor directive. As I mentioned earlier, we do not have any confirmation that there is a breaking change here: all the scenarios we've seen have broken for a long time. If you have a confirmation of something in this category that did break with 8.0.400, please let us know!

Actions

This is an unfortunate bug that isn't particularly easy to solve; all possible paths have different negative outcomes.

  1. We make Razor's understanding of preprocessor directives match C#'s. This has the unfortunate side effect of making existing Razor, such as @{ #if DEBUG } (all on one line) invalid. We are planning on doing this with the eventual switch to using the Roslyn lexer to drive Razor, rather than a hand-rolled lexer, but this will cause more existing code to break.
  2. Have the razor parser understand C# preprocessor directives, plus the additional rules that Razor allows today, such as being allowed in the single-line case I mentioned in 1. This is unfortunately not going to be compatible with using the Roslyn lexer, and has other existing problems today, such as incorrect interpretation of disabled sections of #if/#else blocks. It will also cause further divergence with runtime compilation, as that parser will not be updated to understand the new syntax, so users of runtime compilation will run into the scenario where their code compiles, but then fails at runtime because the runtime razor compiler does not understand the syntax.
  3. We leave the razor parser's understanding of preprocessor directives as-is, and recommend workarounds such as either putting a ; at the end of preprocessor directives that support it, like #region, or putting the HTML that follows the directive into a nested block, surrounding it with {}.

We've decided on paths 1 and 3 here; we've already been heavily working on the new roslyn-based tokenizer, and hope to have feature branch merged by the end of the month. That will be an opt-in change as we start testing on more real-world scenarios, but it has shown promising compatibility with existing major Razor repos and will bring lots of other improvements, such as better string support and making it easier to keep Razor's understanding of C# up to date. In the meantime, we don't think adding a 3rd incorrect preprocessor parsing path serves anyone; the workarounds are valid Razor and will continue to be so for the forseeable future, and we're very concerned about the potential for the runtime split confusing people further.

Going forward, we will also be creating a document on breaking changes in Razor, which will document things of this nature, and we will also be working to get it published on Microsoft Learn, like similar documents are published from the Roslyn and Runtime projects. We're sorry for the inconvenience caused here.

Workarounds

To explicitly spell out the workarounds, here's a few different examples of how to workaround the issue:

Surround the HTML in a block

The HTML can be surrounded with braces. This works in every case that we've seen reported; if there is some case that this doesn't cover, please let us know.

@{
    #if  DEBUG
    {
        <h3>@ViewData["Title"]</h3>
    }
    #endif
}

Add a semicolon to the directive

Directives such as #region and #endregion allow putting a semicolon after the directive. This will effectively work around the issue.

@{
    #region R ;
    <h3>@ViewData["Title"]</h3>
    #endregion
}

Add a semicolon after the directive

Directives such as #if and #endif do not allow semicolons after the directive condition, but one can be placed on the next line to make an empty statement.

@{
    #if  DEBUG
    ;
    <h3>@ViewData["Title"]</h3>
    #endif
}

333fred added a commit that referenced this issue Oct 25, 2024
This adds the new roslyn-based tokenizer, off by default, which uses
Roslyn, instead of a native tokenizer, to tokenize C# code. This is a
major change and will be enabled by putting
`<Features>use-roslyn-tokenizer</Features>` in the project file.

Closes #10737
Closes #10568
Closes #7084
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-compiler Umbrella for all compiler issues bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants