-
Notifications
You must be signed in to change notification settings - Fork 866
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Java21 string template #1010
Add support for Java21 string template #1010
Conversation
won't work because the output tokens of `com.sun.tools.javac.parser.Scanner` are not sorted by position
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!
I had started working on this in parallel before seeing this PR, and had merged a very simple initial handling of string templates in b5feefe. It just passes the entire string template through without formatting it. This approach is more complete, and I would like to merge most of what you have here.
@@ -104,6 +123,8 @@ public static ImmutableList<RawTok> getTokens( | |||
break; | |||
} | |||
if (last < t.pos) { | |||
/* If the current token is not immediately following the previous one, | |||
treat the gap as a token */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the intent here was to only add tokens for these 'gaps' corresponding to whitespace. I think with this change there's also a gap for the \
that escapes {
in the string templates, and this logic is adding in that token.
One alternative I was thinking about was to include the \
in the preceding string fragment token. That also solves the problem I mentioned below where an empty token is leading to a crash in JavaInput.buildToks
.
I have a demo of that approach here, what do you think? https://github.com/google/google-java-format/compare/master...cushon:google-java-format:stringtemplate?expand=1#diff-a1167950838171d2ee18097c833af4c3155bfededea60870f677cb8a72738aa1R161
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your demo will work without problem.
There's one thing I'm not very sure, will that too tricky to split \
and {
into two tokens? For example, can we guarantee that {
will follow \
immediately in output?
After some test of your demo, I can't find a negative test case to illustrate this. Even for very long string fragment, \
and {
are stay together perfectly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that should be OK, and that \{
is handled as an escape sequence where it wouldn't be valid to have any separate between the two characters.
/** Lex the input and return a list of {@link RawTok}s. */ | ||
public static ImmutableList<RawTok> getTokens( | ||
String source, Context context, Set<TokenKind> stopTokens) { | ||
private static ImmutableList<Token> readAllToken(String source, Context context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the test cases that were added for I981
, I am seeing an issue where this returns tokens with empty string values for some string templates, which crash JavaInput.buildToks
. I think the issue is that for string templates like "foo\{X}\{Y}bar"
there are three string fragment tokens, and the middle one is empty.
java.lang.StringIndexOutOfBoundsException: Index 0 out of bounds for length 0
...
at java.base/java.lang.String.charAt(String.java:1555)
at com.google.googlejavaformat.java.JavaInput.buildToks(JavaInput.java:385)
at com.google.googlejavaformat.java.JavaInput.buildToks(JavaInput.java:335)
at com.google.googlejavaformat.java.JavaInput.<init>(JavaInput.java:277)
at com.google.googlejavaformat.java.Formatter.getFormatReplacements(Formatter.java:270)
at com.google.googlejavaformat.java.Formatter.formatSource(Formatter.java:257)
at com.google.googlejavaformat.java.Formatter.formatSource(Formatter.java:223)
|
||
String frag = fragIt.next(); | ||
if (!fragIt.hasNext()) { | ||
token(frag + "\""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this handles unicode escapes. The approach in the other change was to just pass through the token for string fragments:
Line 89 in b5feefe
token(builder.peekToken().get()); |
which is similar to how string literals worked previously:
google-java-format/core/src/main/java/com/google/googlejavaformat/java/JavaInputAstVisitor.java
Line 1658 in b5feefe
String sourceForNode = getSourceForNode(node, getCurrentPath()); |
Thank you for this detailed reviewing! |
Should I close this PR and wait for yours to merge? There are some details in the original PR that weren't considered, and I found they're already handled in your fork. |
Thanks for taking a look! If the changes I made look OK I will go ahead and merge that version (and credit you). |
The initial implementation passed through the entire string unmodified, this allows formatting the Java expressions inside the `\{...}`. See #1010 Co-authored-by: butterunderflow <[email protected]> PiperOrigin-RevId: 592889073
The initial implementation passed through the entire string unmodified, this allows formatting the Java expressions inside the `\{...}`. See #1010 Co-authored-by: butterunderflow <[email protected]> PiperOrigin-RevId: 592889073
The initial implementation passed through the entire string unmodified, this allows formatting the Java expressions inside the `\{...}`. See #1010 Co-authored-by: butterunderflow <[email protected]> PiperOrigin-RevId: 592940163
This has been merged in 38de9c4. Thanks again for the contribution! |
86390d0: Support for string templates and corresponding tests. But won't work, because output tokens of
com.sun.tools.javac.parser.Scanner
are not sorted by their position, and then cause the verification here failed.86390d0: Read all tokens from Scanner, and sort them if jdk version >= 21.
Some links may be useful:
How jdk parse string template from tokens: https://github.com/openjdk/jdk/blob/3d9d353edb64dd364925481d7b7c8822beeaa117/src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavacParser.java#L695-L746
How tokens of string templates are builded: https://github.com/openjdk/jdk/blob/3d9d353edb64dd364925481d7b7c8822beeaa117/src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavaTokenizer.java#L342-L420