Incomplete syntax handling for proc-macros produces invalid TokenStream with {/} chars as a Punct rather than a Group #18244

Open
Veetaha opened this issue Oct 5, 2024 · 4 comments
Labels
A-proc-macro proc macro C-bug Category: bug

Comments

@Veetaha
Contributor

Veetaha commented Oct 5, 2024

rust-analyzer version: 0.3.2129-standalone

rustc version: rustc 1.81.0 (eeb90cda1 2024-09-04)

editor or extension: VSCode v0.3.2129

I've stumbled on a bug when using my proc-macro #[bon::builder].

The minimal reproduction of this is:

#[my_lovely_macro]
fn example() {
    if 1
}

where my_lovely_macro is defined as this

use proc_macro::{TokenStream, TokenTree};

#[proc_macro_attribute]
pub fn my_lovely_macro(_params: TokenStream, item: TokenStream) -> TokenStream {
    let tokens: Vec<_> = item.clone().into_iter().collect();
    let TokenTree::Group(fn_block) = &tokens[3] else {
        unreachable!();
    };
    let fn_block_tokens: Vec<_> = fn_block.stream().into_iter().collect();

    if fn_block_tokens.len() > 2 {
        // This is the case where RA inserted a dummy {} block
        panic!("{fn_block_tokens:#?}");
    }

    item
}

If you take a look at the panic message from RA hints, it looks like this:

proc-macro panicked: [
    Ident {
        ident: "if",
        span: SpanData { range: 38..40, anchor: SpanAnchor(FileId(16777216), 2), ctx: SyntaxContextId(0) },
    },
    Literal {
        kind: Integer,
        symbol: "1",
        suffix: None,
        span: SpanData { range: 41..42, anchor: SpanAnchor(FileId(16777216), 2), ctx: SyntaxContextId(0) },
    },
    Punct {
        ch: '{',
        spacing: Alone,
        span: SpanData { range: 0..0, anchor: SpanAnchor(FileId(16777216), 4294967294), ctx: SyntaxContextId(0) },
    },
    Punct {
        ch: '}',
        spacing: Alone,
        span: SpanData { range: 0..0, anchor: SpanAnchor(FileId(16777216), 4294967294), ctx: SyntaxContextId(0) },
    },
]

Notice how there are two Punct items for the { and } braces in this list? This is an invalid token tree. syn fails to parse this syntax because curly braces are never meant to be Punct characters: they must always be balanced, and they are only expected as the delimiters of a Group token tree.
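
To make the invariant concrete, here is a minimal sketch of a check for delimiter characters that show up as bare punctuation. The `Tree` enum and `stray_delimiters` function are hypothetical stand-ins written for this illustration (the real `proc_macro::TokenTree` can only be used inside a proc-macro crate), so treat this as a model of the idea rather than code from bon, syn, or rust-analyzer.

```rust
// Simplified stand-in for proc_macro::TokenTree (hypothetical, illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum Tree {
    Ident(String),
    Literal(String),
    Punct(char),
    // A delimited group: the opening delimiter character plus its inner trees.
    Group(char, Vec<Tree>),
}

/// Returns every delimiter character that appears as a bare `Punct`,
/// searching nested groups as well. A well-formed stream yields an empty Vec.
pub fn stray_delimiters(trees: &[Tree]) -> Vec<char> {
    let mut found = Vec::new();
    for tree in trees {
        match tree {
            // Delimiters must only ever appear as part of a Group.
            Tree::Punct(ch) if "(){}[]".contains(*ch) => found.push(*ch),
            Tree::Group(_, inner) => found.extend(stray_delimiters(inner)),
            _ => {}
        }
    }
    found
}
```

Applied to a model of the function body above (`if`, `1`, `Punct('{')`, `Punct('}')`), this reports both braces, while the same body with an empty brace `Group` in their place reports nothing. A macro could use a check along these lines to return its input unchanged instead of panicking.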

This produced a panic in my bon::builder macro visible to the user writing the code if they have an incomplete if somewhere while writing the function body (the entire function is underlined in red and no IDE hints are provided). I'll fix that panic in bon separately to make my macro more bug-resilient to situations like this, but RA should also fix this.

@Veetaha Veetaha added the C-bug Category: bug label Oct 5, 2024
@Veetaha Veetaha changed the title from "Incomplete syntax handling for proc-macros produces invalid TokenStream with { as a Punct rather than a Group" to "Incomplete syntax handling for proc-macros produces invalid TokenStream with {/} chars as a Punct rather than a Group" Oct 5, 2024
@ChayimFriedman2
Contributor

This probably comes from parser recovery. We need to either make recovery translate correctly into the macro token tree or, better, not insert non-existing tokens at all.

@ChayimFriedman2 ChayimFriedman2 added the A-proc-macro proc macro label Oct 5, 2024
@flodiebold
Member

flodiebold commented Oct 5, 2024

It's not parser recovery; it's this:

// FIXME: THis should be a subtree no?
Leaf::Punct(Punct {
    char: '{',
    spacing: Spacing::Alone,
    span: fake_span(node_range)
}),
Leaf::Punct(Punct {
    char: '}',
    spacing: Spacing::Alone,
    span: fake_span(node_range)
}),

(Original code by me, FIXME added by @Veykril and he's of course correct ;) )
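
For readers unfamiliar with the distinction the FIXME points at, the following is a rough sketch of the two shapes. All types here (`TokenTree`, `Leaf`, `Subtree`, `Delimiter`) are hypothetical, heavily simplified models and not rust-analyzer's real `tt` types; spans and spacing are left out entirely.

```rust
// Hypothetical, heavily simplified token tree model (not rust-analyzer's real types).
#[derive(Debug, Clone, PartialEq)]
pub enum TokenTree {
    Leaf(Leaf),
    Subtree(Subtree),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Leaf {
    Punct(char),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Delimiter {
    Brace,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Subtree {
    pub delimiter: Delimiter,
    pub token_trees: Vec<TokenTree>,
}

/// Mirrors the shape of the quoted snippet: two unrelated punctuation leaves.
pub fn missing_block_as_puncts() -> Vec<TokenTree> {
    vec![
        TokenTree::Leaf(Leaf::Punct('{')),
        TokenTree::Leaf(Leaf::Punct('}')),
    ]
}

/// The shape the FIXME asks for: a single, empty, brace-delimited subtree.
pub fn missing_block_as_subtree() -> Vec<TokenTree> {
    vec![TokenTree::Subtree(Subtree {
        delimiter: Delimiter::Brace,
        token_trees: Vec::new(),
    })]
}
```

In this model the second form produces one tree whose delimiters are balanced by construction, which is what a consumer such as syn expects to see for a `{}` block.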

@Veetaha
Contributor Author

Veetaha commented Oct 5, 2024

Some more context.

While digging a bit more I also uncovered a bug in proc_macro2 crate (dtolnay/proc-macro2#470).

The standard library's proc_macro::Punct::new() explicitly panics if you pass an incorrect character to that method (see the code here). I suppose RA creates the proc_macro::Punct somehow unsafely by just relying on the Punct ABI layout, otherwise it's impossible to create a Punct with { and } characters.

The problem is worsened by the bug in proc_macro2 (dtolnay/proc-macro2#470), which doesn't panic early when it encounters an invalid Punct.
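
As an illustration of the kind of early validation being discussed, here is a small sketch of a checked punctuation constructor. `CheckedPunct` and `LEGAL_PUNCT_CHARS` are names invented for this example; the character list is written from memory of the set accepted by `proc_macro::Punct::new` and should be treated as an assumption, although the delimiters `(){}[]` are definitely not part of it.

```rust
// Characters accepted as punctuation in this sketch. Assumption: this matches
// the set accepted by `proc_macro::Punct::new`; it is reproduced from memory
// and is not an authoritative copy of the standard library's list.
const LEGAL_PUNCT_CHARS: &[char] = &[
    '=', '<', '>', '!', '~', '+', '-', '*', '/', '%', '^', '&', '|', '@', '.', ',', ';', ':',
    '#', '$', '?', '\'',
];

/// Hypothetical wrapper that can only hold a character from the list above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CheckedPunct {
    ch: char,
}

impl CheckedPunct {
    /// Rejects anything outside the list, including the delimiters `(){}[]`.
    /// Returns an error instead of panicking so callers can decide how to react.
    pub fn try_new(ch: char) -> Result<Self, String> {
        if LEGAL_PUNCT_CHARS.contains(&ch) {
            Ok(Self { ch })
        } else {
            Err(format!("unsupported punctuation character {ch:?}"))
        }
    }

    pub fn as_char(&self) -> char {
        self.ch
    }
}
```

With a constructor like this, `CheckedPunct::try_new('{')` fails right where the bad token is created, rather than much later when a parser trips over it.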

@ChayimFriedman2 ChayimFriedman2 self-assigned this Oct 5, 2024
@ChayimFriedman2
Contributor

@rustbot release-assignment

This turned out way more nightmare-y than I thought and I have other things to do.

If anybody is interested in taking my work, my almost complete branch is on GitHub: https://github.com/ChayimFriedman2/rust-analyzer/tree/punct-brace. It passes all tests, but has a bug as the commit message explains. A simple fix may be to never omit a delimiter (and thus make bugs like this still possible, just more rare).
