Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change behavior of ? as a macro separator and Kleene op in 2018 edition #51934

Closed
mark-i-m opened this issue Jun 30, 2018 · 38 comments
Closed

Change behavior of ? as a macro separator and Kleene op in 2018 edition #51934

mark-i-m opened this issue Jun 30, 2018 · 38 comments
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@mark-i-m
Copy link
Member

mark-i-m commented Jun 30, 2018

This is a request for FCP on PR #51587, according to #51587 (comment).

Tracking issue: #48075 (? macro repetition)

Current behavior:

  • ? is a macro separator, not a Kleene op.

Proposed new behavior (implemented in #51587):

  • Edition 2015: ? is a macro separator, not a kleene op, but using it as a separator triggers a migration lint.
  • Edition 2018: ? is not a valid separator. ? is a valid Kleene op.

EDIT: To clarify: Under this proposal, $(a)?+ matches + and a+ unambiguously.

Why we chose this behavior:

  • It's the cleanest choice both in terms of end-user experience and implementation. See the alternatives.

Alternatives: The main problem to address is ambiguity. Currently, ? is a valid separator, so there are a lot of ambiguous possibilities like $(a)?+. Is the ? a separator for 1 or more iterations OR is it a zero-or-one kleene op followed by a + token? The solutions fall into 2 categories: make breaking changes or add fallbacks.

  1. Fallbacks

    • The original implementation (currently on nightly behind feature gate macro_at_most_once_rep):
      • The ? kleene op is allowed to take a separator which can never be used (we don't allow trailing separators and ? allows at most one repetition).
      • When parsing a macro, we disambiguate by looking after the separator: if ? is followed by +, *, or ?, we treat the ? as a separator and the second token as a kleene op.
    • On the tracking issue, we decided that it is confusing for ? to take a separator, so the rules would be that ? followed by + or * is a separator, but ? followed by ? or any other token would be a ? kleene op followed by the other token.
      • This is a viable implementation, but it seems like a lot of special cases and the code gets messy. This led us more towards the breaking changes options.
    • There are probably lots of variations of fallbacks we could do, but they all have weird corner cases.
  2. Breaking changes

    • Make a breaking change in the 2015 edition (implemented in Update ? repetition disambiguation. #49719, accidentally merged without FCP, and rolled back).
      • Disallow ? as a separator in 2015 edition.
      • Disallow ? kleene op from taking a separator.
      • This means that $(a)?+ matches + and a+ unambiguously. However, it is a breaking change. This was deemed not worthy of a breaking change in the 2015 edition.
    • Make a breaking change in the 2018 edition and lint in the 2015 edition. No change to behavior in 2015. This is the current proposal implemented in 2018 edition ? Kleene operator #51587 and is the subject of this FCP.
      • The primary downside here is that it's not clear what happens if you import a macro from 2015 into 2018 that uses ? as a separator. Currently, you would just get a compile error.
      • I believe the breakage is likely to be extremely rare. A crater run showed no breakage (Update ? repetition disambiguation. #49719 (comment)). However, we did found out later that there was apparently some breakage elsewhere (Update ? repetition disambiguation. #49719 (comment)).
      • The implementation is significantly cleaner and more maintainable and the behavior is more consistent for end-users.

cc @nikomatsakis
cc @petrochenkov Reviewed the second attempt
cc @durka @alexreg @clarcharr @Centril @kennytm @ExpHP Who participated extensively in the original design discussions (EDIT: apologies if I missed anyone; there were a lot of people who came in at various points)
cc @oli-obk Who brought up clippy breakage
cc @sgrif @pietroalbini Who wanted to see an FCP

Whew... that's a lot of people :)

@alexreg
Copy link
Contributor

alexreg commented Jun 30, 2018

Nice work, @mark-i-m. I think we finally have a fully appropriate solution in all cases, at least as I understand it. LTGM!

@petrochenkov
Copy link
Contributor

petrochenkov commented Jun 30, 2018

I think this is too much hassle over nothing.
This is a very useful feature and crater found zero regressions, we should make a "breaking change" implement it on edition 2015.

The one symbol lookup checking for another Kleene op and reducing the breakage from zero to absolute zero also doesn't look like too much complexity or special casing to me.

Also, the separator should be allowed on both left-hand and right-hand sides of a macro (this is the first thing I tried to use when trying ? in 399da7b), or conservatively accepted and reported as a non-fatal error.

@pietroalbini
Copy link
Member

I don't really see a point in making breaking changes like this in the current edition, especially now that we have editions and the next one is right around the corner.

@kennytm
Copy link
Member

kennytm commented Jun 30, 2018

What would happen for macro definitions across editions?

// edition (4033 - A)
macro_rules! define_macro {
    ($($t:tt)*) => { 
        macro_rules! use_macro {
            ($($t)*) => {}
        }
    }
}

// edition A
define_macro!($(a)?*);

use_macro!(a?a?a);  // ???
use_macro!(a*);     // ???

@mark-i-m
Copy link
Member Author

@kennytm In the current design, if you use ? as a separator in 2018, you get an error. Perhaps this is a rare enough case that that's ok? The fix is frankly trivial: use any other separator than ?...

@ExpHP
Copy link
Contributor

ExpHP commented Jun 30, 2018

@kennytm In the current design, if you use ? as a separator in 2018, you get an error. Perhaps this is a rare enough case that that's ok? The fix is frankly trivial: use any other separator than ?...

@mark-i-m but can a 2018 crate use a macro defined in a 2015 crate that uses ? as a separator?

(going even further down the rabbit hole, there is the possibility of a 2015 crate with a macro that generates a macro_rules! definition that uses ? as a separator. ISTM that this clearly would not be callable from 2018... however, the odds of any such macro existing seem to be a trillion to one)

@ExpHP
Copy link
Contributor

ExpHP commented Jun 30, 2018

Also, it is unclear to me from the summary, but:

This means that $(a)?+ matches + and a+ unambiguously.

Is this describing the new proposed behavior in 2018, or the behavior of an alternative design? It seems desirable to me.

@mark-i-m
Copy link
Member Author

@ExpHP

@mark-i-m but can a 2018 crate use a macro defined in a 2015 crate that uses ? as a separator?

Sorry, perhaps I misunderstanding. What is the distinction from @kennytm's point?

Yes, in 2018 (according to the current plan) if you were to use ? as a separator (even coming from a 2015 crate), you would get an error. The expectation is that this case is unlikely enough and easy enough to fix that we are ok with the breakage across an edition boundary.

This means that $(a)?+ matches + and a+ unambiguously.

Is this describing the new proposed behavior in 2018, or the behavior of an alternative design? It seems desirable to me.

Yes, this describes the proposed behavior (implemented in #51587).

@ExpHP
Copy link
Contributor

ExpHP commented Jun 30, 2018

What is the distinction from @kennytm's point?

None that I know of, it just sounded to me like you and I interpreted his question differently, especially due to the quote

The fix is frankly trivial: use any other separator than ?...

which is not something a 2018 consumer of a 2015 macro crate can do.

@mark-i-m
Copy link
Member Author

which is not something a 2018 consumer of a 2015 macro crate can do.

Yes, that's true. However, I believe the fact that the fix is easy mitigates this a bit, since you can often make a PR to fix the problem, and a PR with this fix ought to be easy to review.

Again, I agree that this is not a super great situation, but I expect it to be rare.

@nikomatsakis
Copy link
Contributor

Ignoring the state of the current implementation: I believe the expected behavior for the cross-crate case would be that a macro defined in the 2015 Edition crate would not treat ? as "0 or 1" but rather as separator. Put another way, the behavior of macros across crates is a somewhat orthogonal issue. (We have an open issue or two on that topic, and it does need attention and resolution, don't get me wrong.)

@nikomatsakis
Copy link
Contributor

Regarding regressions, I'm curious: @petrochenkov asserted that a crater run found zero regressions, but I believe that @mark-i-m cited some regressions to me. Which is correct?

@nikomatsakis nikomatsakis added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Jul 2, 2018
@nikomatsakis
Copy link
Contributor

@rfcbot fcp merge

I'm going to propose that we accept this proposal. I feel like using an edition here for backwards compatibility is the right thing to do and it's an easy enough for us to do.

The one thing that gives me pause here is that we are effectively losing expressiveness: there is no (convenient, at least) way in Rust 2018 to get the equivalent of $(foo)?* -- but I agree with @petrochenkov that this is quite the edge-case and we can live with it. I'm not sure anyone cares.

At worst, I suppose, you could isolate the macro to Rust 2015 crate that you use as a dependency. =)

@rfcbot
Copy link

rfcbot commented Jul 2, 2018

Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jul 2, 2018
@alexreg
Copy link
Contributor

alexreg commented Jul 2, 2018

The one thing that gives me pause here is that we are effectively losing expressiveness

Escaping could be the way forwards, if we decide we really need this...

@kennytm
Copy link
Member

kennytm commented Jul 2, 2018

there is no (convenient, at least) way in Rust 2018 to get the equivalent of $(foo)?*

There's no way to express $(foo)++ in any Rust edition either 😄 (and that's one reason we get the weird syntax where T: X + Y +) (rust-lang/rfcs#991). When declarative-macros 2.0 has a solution for using + and * as a separator it could be easily extensible to cover ? as well if anyone cares.

@mark-i-m
Copy link
Member Author

mark-i-m commented Jul 2, 2018

Regarding regressions, I'm curious: @petrochenkov asserted that a crater run found zero regressions, but I believe that @mark-i-m cited some regressions to me. Which is correct?

Both. Crater found no regressions, but we still broke clippy.

@mark-i-m
Copy link
Member Author

mark-i-m commented Jul 2, 2018

Ignoring the state of the current implementation: I believe the expected behavior for the cross-crate case would be that a macro defined in the 2015 Edition crate would not treat ? as "0 or 1" but rather as separator. Put another way, the behavior of macros across crates is a somewhat orthogonal issue. (We have an open issue or two on that topic, and it does need attention and resolution, don't get me wrong.)

So IIUC, the current implementation would not change, but a different edition would be passed to the macros parser? So the macro parser would think it was in edition 2015 even though we are in a 2018 crate?

@nikomatsakis
Copy link
Contributor

@mark-i-m I think the right way to think of it is that one can ask what Edition a given span is in.

@nikomatsakis
Copy link
Contributor

There wouldn't really be a notion of "current edition" for the most part

@petrochenkov
Copy link
Contributor

petrochenkov commented Jul 4, 2018

  1. Fallbacks
    This is a viable implementation, but it seems like a lot of special cases and the code gets messy.

As you can see in #51587 the edition-specific logic is much worse than previously implemented fallback.

I don't understand what kind of priorities you need to have to end up preferring such edition split and removal of a useful feature to introducing one-symbol lookup logic.

@mark-i-m
Copy link
Member Author

mark-i-m commented Jul 4, 2018

As you can see in #51587 the edition-specific logic is much much worse than previously implemented fallback.

I was surprised to read this comment. I find this implementation significantly cleaner than what was rolled back. The 2015 logic is similar to what on nightly today, and the 2018 logic very small without the feature gate. This clean 2018 logic is what well be updating in the future, which I consider a benefit.

removal of a useful feature to introducing one-symbol lookup logic.

Not removing; delaying until the edition in a couple of months...

@sgrif
Copy link
Contributor

sgrif commented Jul 4, 2018

Mostly a tangent, but I would absolutely love to have a solution to $(...)++ (Which presumably covers this case as well), its pretty commonly needed for me.

@alexreg
Copy link
Contributor

alexreg commented Jul 4, 2018

Using +, *, and ? as separators we seem to have determined is an extremely rare requirement, and not worth supporting right now. Perhaps in the future it might get supported via escape characters.

@mark-i-m
Copy link
Member Author

mark-i-m commented Jul 4, 2018

Hmm... Perhaps we should explore escaping tokens as separators? Will open a thread on internals.

@mark-i-m
Copy link
Member Author

mark-i-m commented Jul 4, 2018

@nrc nrc mentioned this issue Jul 11, 2018
15 tasks
@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Jul 12, 2018
@rfcbot
Copy link

rfcbot commented Jul 12, 2018

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Jul 12, 2018
@mark-i-m
Copy link
Member Author

I think the current proposal is forward compatible with separator escaping, right?

@alexreg
Copy link
Contributor

alexreg commented Jul 13, 2018

@mark-i-m I don't think so, no. Because any future escape char chosen would be recognised as a separator char under the current behaviour (or if $ were chosen, that would screw things up too). It would require a new edition lint. Maybe best to reserve an escape char now?

@mark-i-m
Copy link
Member Author

Hmm... It looks like currently \ is not allowed (the lexer errors out). There are also other combinations that are not allowed (e.g. the suggestion from the internals thread $(a)(?)). So it doesn't seem like too much of a stretch to think that we could find a reasonable escaping scheme after finishing this...

@alexreg
Copy link
Contributor

alexreg commented Jul 14, 2018

Ah, that's better than I thought.

@alexreg
Copy link
Contributor

alexreg commented Jul 14, 2018

@mark-i-m BTW, could you kindly expand the bit in the book about lints with the knowledge you picked up from writing this PR? :-)

@mark-i-m
Copy link
Member Author

Funny you should ask :)

rust-lang/rustc-dev-guide#170

@rfcbot
Copy link

rfcbot commented Jul 22, 2018

The final comment period, with a disposition to merge, as per the review above, is now complete.

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Jul 22, 2018
@alexreg
Copy link
Contributor

alexreg commented Jul 22, 2018

@mark-i-m Great. Maybe r? @nikomatsakis now?

@nikomatsakis
Copy link
Contributor

The corresponding PR has been merged, closing this issue.

@varkor
Copy link
Member

varkor commented Aug 15, 2018

@mark-i-m: isn't using the operator in 2015 as a separator supposed to trigger a migration lint? It doesn't seem to at the moment.

@mark-i-m
Copy link
Member Author

@varkor You're not using it as a separator. You are using it as a Kleene op. And it does actually give the correct error:

error: expected `*` or `+`
 --> src/main.rs:7:33
  |
7 |   (do $b1:block $(and $b2:block)?) => {
  |                                 ^
  |
  = note: `?` is not a macro repetition operator

If you want the migration lint, you will need to add #![warn(question_mark_macro_sep)] or #![warn(rust_2018_compatibility)]:

warning: using `?` as a separator is deprecated and will be a hard error in an upcoming edition
 --> src/main.rs:8:33
  |
8 |   (do $b1:block $(and $b2:block)?*) => {
  |                                 ^
  |
note: lint level defined here
 --> src/main.rs:5:9
  |
5 | #![warn(rust_2018_compatibility)]
  |         ^^^^^^^^^^^^^^^^^^^^^^^
  = note: #[warn(question_mark_macro_sep)] implied by #[warn(rust_2018_compatibility)]
  = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in the 2018 edition!
  = note: for more information, see issue #48075 <https://github.com/rust-lang/rust/issues/48075>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants