You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
macro_rules! ex {($($i:expr),*, $($j:ty),*) => {};}fnmain(){ex!(a, dyn Copy);}
However, if it's invoked in an ambiguous way, such as with ex!(a, b), then it gives an error (no lookahead is performed to ) to determine an unambiguous parse; examples can be constructed that avoid this unambiguity anyway).
Thus, while the original example compiles now, were dyn Copy to become a legal expression at some point in the future, it would become ambiguous. The issue here is that the separator , is re-used. In line with other futureproofing efforts like rust-lang/rfcs#550 and #56575, this should possibly be forbidden. The follow-set rule is insufficient to detect these cases as it only exists to allow us to promise in advance where a match is guaranteed to end; it's not sufficient to disambiguate between two potential parses starting from the same point.
My first attempt at rules to forbid this are for the situation where we have ... $(tt ...) SEP? OP uu ..., with uu ... possibly empty. In these, FIRST* is a variant on FIRST from RFC 550: it represents all terminal tokens that could appear in a valid match. Thus, FIRST*($t:ty) would be any can_begin_type token, plus any token we worry might be legally allowed to begin a type someday, and so on.
If SEP is present, we must have we must have SEP \not\in FIRST*(uu ...), insisting that the seperator token in repetition will never be ambiguous.
When OP is not +, we must haveFIRST*(tt ...) and FIRST*(uu ...) be disjoint, insisting that the question of whether we're repeating at all will not be ambiguous.
For unseparated Kleene repeats, the second rule above, combined with the follow-set rule, are sufficient to disambiguate.
I have no idea how frequently these potential ambiguities arise in practice. It might be a lot, it might be a little. I expect that the first rule, for separators, might be safe to add because most macros that would risk running into them cannot be invoked except with an empty repetition due to existing ambiguity (the first macro can be invoked as ex!(,a), for instance, with an empty repetition for $i eliminating the ambiguity) and therefore could likely mostly be removed. The second rule, however, might be more commonly violated.
I don't see FIRST defined in RFC 550, but I think I understand what you're saying anyway. (Nit: I think you meant to write FIRST\*($t:ty) and not FIRST*($t:ty).)
Both rules seem sensible to me, and should minimally eliminate ambiguity, unless I'm missing something. It sounds like a Crater run would be sensible at some point soon. @estebank knows more than me about this topic though.
The following macro is valid, and can be used:
However, if it's invoked in an ambiguous way, such as with
ex!(a, b)
, then it gives an error (no lookahead is performed to)
to determine an unambiguous parse; examples can be constructed that avoid this unambiguity anyway).Thus, while the original example compiles now, were
dyn Copy
to become a legal expression at some point in the future, it would become ambiguous. The issue here is that the separator,
is re-used. In line with other futureproofing efforts like rust-lang/rfcs#550 and #56575, this should possibly be forbidden. The follow-set rule is insufficient to detect these cases as it only exists to allow us to promise in advance where a match is guaranteed to end; it's not sufficient to disambiguate between two potential parses starting from the same point.My first attempt at rules to forbid this are for the situation where we have
... $(tt ...) SEP? OP uu ...
, withuu ...
possibly empty. In these,FIRST*
is a variant onFIRST
from RFC 550: it represents all terminal tokens that could appear in a valid match. Thus,FIRST*($t:ty)
would be anycan_begin_type
token, plus any token we worry might be legally allowed to begin a type someday, and so on.SEP
is present, we must have we must haveSEP \not\in FIRST*(uu ...)
, insisting that the seperator token in repetition will never be ambiguous.OP
is not+
, we must haveFIRST*(tt ...)
andFIRST*(uu ...)
be disjoint, insisting that the question of whether we're repeating at all will not be ambiguous.For unseparated Kleene repeats, the second rule above, combined with the follow-set rule, are sufficient to disambiguate.
I have no idea how frequently these potential ambiguities arise in practice. It might be a lot, it might be a little. I expect that the first rule, for separators, might be safe to add because most macros that would risk running into them cannot be invoked except with an empty repetition due to existing ambiguity (the first macro can be invoked as
ex!(,a)
, for instance, with an empty repetition for$i
eliminating the ambiguity) and therefore could likely mostly be removed. The second rule, however, might be more commonly violated.cc @estebank @alexreg
The text was updated successfully, but these errors were encountered: