Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove support for 1-token lookahead from the lexer #62329

Merged
merged 9 commits into from
Jul 6, 2019

Conversation

matklad
Copy link
Member

@matklad matklad commented Jul 3, 2019

StringReader maintained peek_token and peek_span_src_raw for look ahead.

peek_token was used only by rustdoc syntax coloring. After moving peeking logic into highlighter, I was able to remove peek_token from the lexer. I tried to use iter::Peekable, but that wasn't as pretty as I hoped, due to buffered fatal errors. So I went with hand-rolled peeking.

After that I've noticed that the only peeking behavior left was for raw tokens to test tt jointness. I've rewritten it in terms of trivia tokens, and not just spans.

After that it became possible to simplify the awkward constructor of the lexer, which could return Err if the first peeked token contained error.

@matklad
Copy link
Member Author

matklad commented Jul 3, 2019

r? @petrochenkov

@rust-highfive rust-highfive assigned petrochenkov and unassigned eddyb Jul 3, 2019
@matklad
Copy link
Member Author

matklad commented Jul 3, 2019

cc @GuillaumeGomez for highlight.rs change I guess (first commit)

self.real_token();
let is_joint = raw.hi() == self.string_reader.peek_span_src_raw.lo()
&& self.token.is_op();
let is_joint = self.joint_to_prev == Joint && self.token.is_op();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.token.is_op() makes me think that perhaps we should check that the previous token is also an op?

That is, we currently can say Join for an identifier followed by (, for example.

Copy link
Contributor

@petrochenkov petrochenkov Jul 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the observable behavior is represented by the Spacing enum returned by Punct::spacing.

It has pretty specific documentation - we are "revealing" the jointness knowledge only for Punct + Punct pairs and additionally for Punct(') + Ident (for lifetimes).
Any other jointness is not revealed, Ident + ( in particular is not revealed because ( is not is_op, so the implementation looks correct.

Perhaps we should do this hiding entirely at the proc macro interface border though, and keep the full knowledge internally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in particular is not revealed because ( is not is_op, so the implementation looks correct.

Ah, indeed, it needs to be Ident-, for example.

Yeah, I think we should either handle this completely on proc-macro layer, or completely in the TokenTreesReader, but this is unrelated to PR at hand

@matklad matklad force-pushed the no-peeking branch 3 times, most recently from 8b2cfd2 to f1f8def Compare July 3, 2019 15:57
@petrochenkov
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Jul 3, 2019

📌 Commit f1f8def5e9f1ee88222a2dce3d4da007e0350556 has been approved by petrochenkov

@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jul 3, 2019
@matklad
Copy link
Member Author

matklad commented Jul 3, 2019

@bors r=petrochenkov

@bors
Copy link
Contributor

bors commented Jul 3, 2019

📌 Commit 07a9e4dcd2a518f7916a9db0487a1b950fa50e01 has been approved by petrochenkov

@bors
Copy link
Contributor

bors commented Jul 4, 2019

☔ The latest upstream changes (presumably #62355) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jul 4, 2019
matklad added 5 commits July 4, 2019 09:01
The reader itself doesn't need ability to peek tokens, so it's better
if clients implement this functionality.

This hopefully becomes especially easy once we use iterator interface
for lexer, but this is not too easy at the moment, because of buffered
errors.
@matklad
Copy link
Member Author

matklad commented Jul 4, 2019

@bors r=petrochenkov

@bors
Copy link
Contributor

bors commented Jul 4, 2019

📌 Commit 3e362a4 has been approved by petrochenkov

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 4, 2019
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Jul 4, 2019
Remove support for 1-token lookahead from the lexer

`StringReader` maintained `peek_token` and `peek_span_src_raw` for look ahead.

`peek_token` was used only by rustdoc syntax coloring. After moving peeking logic into highlighter, I was able to remove `peek_token` from the lexer. I tried to use `iter::Peekable`, but that wasn't as pretty as I hoped, due to buffered fatal errors. So I went with hand-rolled peeking.

After that I've noticed that the only peeking behavior left was for raw tokens to test tt jointness. I've rewritten it in terms of trivia tokens, and not just spans.

After that it became possible to simplify the awkward constructor of the lexer, which could return `Err` if the first peeked token contained error.
@GuillaumeGomez
Copy link
Member

A bit late but looks good to me as well (for the rustdoc part at least).

Centril added a commit to Centril/rust that referenced this pull request Jul 5, 2019
Remove support for 1-token lookahead from the lexer

`StringReader` maintained `peek_token` and `peek_span_src_raw` for look ahead.

`peek_token` was used only by rustdoc syntax coloring. After moving peeking logic into highlighter, I was able to remove `peek_token` from the lexer. I tried to use `iter::Peekable`, but that wasn't as pretty as I hoped, due to buffered fatal errors. So I went with hand-rolled peeking.

After that I've noticed that the only peeking behavior left was for raw tokens to test tt jointness. I've rewritten it in terms of trivia tokens, and not just spans.

After that it became possible to simplify the awkward constructor of the lexer, which could return `Err` if the first peeked token contained error.
bors added a commit that referenced this pull request Jul 5, 2019
Rollup of 8 pull requests

Successful merges:

 - #60260 (Add support for UWP targets)
 - #62151 (Update linked OpenSSL version)
 - #62245 (Miri engine: support extra function (pointer) values)
 - #62257 (forward read_c_str method from Memory to Alloc)
 - #62264 (Fix perf regression from Miri Machine trait changes)
 - #62296 (request at least ptr-size alignment from posix_memalign)
 - #62329 (Remove support for 1-token lookahead from the lexer)
 - #62377 (Add test for ICE #62375)

Failed merges:

r? @ghost
Centril added a commit to Centril/rust that referenced this pull request Jul 6, 2019
Remove support for 1-token lookahead from the lexer

`StringReader` maintained `peek_token` and `peek_span_src_raw` for look ahead.

`peek_token` was used only by rustdoc syntax coloring. After moving peeking logic into highlighter, I was able to remove `peek_token` from the lexer. I tried to use `iter::Peekable`, but that wasn't as pretty as I hoped, due to buffered fatal errors. So I went with hand-rolled peeking.

After that I've noticed that the only peeking behavior left was for raw tokens to test tt jointness. I've rewritten it in terms of trivia tokens, and not just spans.

After that it became possible to simplify the awkward constructor of the lexer, which could return `Err` if the first peeked token contained error.
bors added a commit that referenced this pull request Jul 6, 2019
Rollup of 7 pull requests

Successful merges:

 - #62151 (Update linked OpenSSL version)
 - #62245 (Miri engine: support extra function (pointer) values)
 - #62257 (forward read_c_str method from Memory to Alloc)
 - #62264 (Fix perf regression from Miri Machine trait changes)
 - #62296 (request at least ptr-size alignment from posix_memalign)
 - #62329 (Remove support for 1-token lookahead from the lexer)
 - #62377 (Add test for ICE #62375)

Failed merges:

r? @ghost
@bors bors merged commit 3e362a4 into rust-lang:master Jul 6, 2019
@matklad matklad deleted the no-peeking branch July 6, 2019 06:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants