Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider whether repeated macro matchers need separator disambiguation for futureproofing #57603

Open
alercah opened this issue Jan 14, 2019 · 4 comments
Labels
A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..)

Comments

@alercah
Copy link
Contributor

alercah commented Jan 14, 2019

The following macro is valid, and can be used:

macro_rules! ex {
    ($($i:expr),* , $($j:ty),*) => { };
}

fn main() {
    ex!(a, dyn Copy);
}

However, if it's invoked in an ambiguous way, such as with ex!(a, b), then it gives an error (no lookahead is performed to ) to determine an unambiguous parse; examples can be constructed that avoid this unambiguity anyway).

Thus, while the original example compiles now, were dyn Copy to become a legal expression at some point in the future, it would become ambiguous. The issue here is that the separator , is re-used. In line with other futureproofing efforts like rust-lang/rfcs#550 and #56575, this should possibly be forbidden. The follow-set rule is insufficient to detect these cases as it only exists to allow us to promise in advance where a match is guaranteed to end; it's not sufficient to disambiguate between two potential parses starting from the same point.

My first attempt at rules to forbid this are for the situation where we have ... $(tt ...) SEP? OP uu ..., with uu ... possibly empty. In these, FIRST* is a variant on FIRST from RFC 550: it represents all terminal tokens that could appear in a valid match. Thus, FIRST*($t:ty) would be any can_begin_type token, plus any token we worry might be legally allowed to begin a type someday, and so on.

  1. If SEP is present, we must have we must have SEP \not\in FIRST*(uu ...), insisting that the seperator token in repetition will never be ambiguous.
  2. When OP is not +, we must haveFIRST*(tt ...) and FIRST*(uu ...) be disjoint, insisting that the question of whether we're repeating at all will not be ambiguous.

For unseparated Kleene repeats, the second rule above, combined with the follow-set rule, are sufficient to disambiguate.

I have no idea how frequently these potential ambiguities arise in practice. It might be a lot, it might be a little. I expect that the first rule, for separators, might be safe to add because most macros that would risk running into them cannot be invoked except with an empty repetition due to existing ambiguity (the first macro can be invoked as ex!(,a), for instance, with an empty repetition for $i eliminating the ambiguity) and therefore could likely mostly be removed. The second rule, however, might be more commonly violated.

cc @estebank @alexreg

@alexreg
Copy link
Contributor

alexreg commented Jan 14, 2019

I don't see FIRST defined in RFC 550, but I think I understand what you're saying anyway. (Nit: I think you meant to write FIRST\*($t:ty) and not FIRST*($t:ty).)

Both rules seem sensible to me, and should minimally eliminate ambiguity, unless I'm missing something. It sounds like a Crater run would be sensible at some point soon. @estebank knows more than me about this topic though.

@alercah
Copy link
Contributor Author

alercah commented Jan 14, 2019

@alexreg
Copy link
Contributor

alexreg commented Jan 14, 2019

@alercah Ah, it wasn't in the original PR text, must have been added after. Thanks.

@ghost
Copy link

ghost commented Jan 14, 2019

@alexreg apparently added in this commit rust-lang/rfcs@3c4f0f6

@Centril Centril added the A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) label Jan 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..)
Projects
None yet
Development

No branches or pull requests

3 participants