From 3bad767f4c7edd303a93301b6add176bed8d464a Mon Sep 17 00:00:00 2001
From: Lee Byron
Date: Wed, 21 Jun 2017 17:31:26 -0700
Subject: [PATCH 1/2] RFC: Multi-line String

This RFC adds a new lexed token, the multi-line string, similar to that found in Python and Scala. A multi-line string starts and ends with a triple-quote:

```
"""This is a triple-quoted string
and it can contain multiple lines"""
```

Multi-line strings are useful for typing literal bodies of text where new lines should be interpreted literally. In fact, the only escape sequence used is `\"""` and `\` is otherwise allowed unescaped. This is beneficial when writing documentation within strings which may reference the backslash often:

```
"""
In a multi-line string \n and C:\\ are unescaped.
"""
```

The primary value of multi-line strings is to write long-form input directly in query text, in tools like GraphiQL, and as a prerequisite to another pending RFC to allow docstring style documentation in the Schema Definition Language.

---
 spec/Appendix B -- Grammar Summary.md |  8 ++++--
 spec/Section 2 -- Language.md         | 36 +++++++++++++++++++++------
 2 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/spec/Appendix B -- Grammar Summary.md b/spec/Appendix B -- Grammar Summary.md
index d3b59f131..97e82faf4 100644
--- a/spec/Appendix B -- Grammar Summary.md
+++ b/spec/Appendix B -- Grammar Summary.md
@@ -69,14 +69,18 @@ ExponentIndicator :: one of `e` `E`
 Sign :: one of + -
 
 StringValue ::
-  - `""`
-  - `"` StringCharacter+ `"`
+  - `"` StringCharacter* `"`
+  - `"""` MultiLineStringCharacter* `"""`
 
 StringCharacter ::
   - SourceCharacter but not `"` or \ or LineTerminator
   - \u EscapedUnicode
   - \ EscapedCharacter
 
+MultiLineStringCharacter ::
+  - SourceCharacter but not `"""` or `\"""`
+  - `\"""`
+
 EscapedUnicode :: /[0-9A-Fa-f]{4}/
 
 EscapedCharacter :: one of `"` \ `/` b f n r t
diff --git a/spec/Section 2 -- Language.md b/spec/Section 2 -- Language.md
index 05f4f145d..62f2aaee3 100644
--- a/spec/Section 2 -- Language.md
+++ b/spec/Section 2 -- Language.md
@@ -694,14 +694,18 @@ The two keywords `true` and `false` represent the two boolean values.
 ### String Value
 
 StringValue ::
-  - `""`
-  - `"` StringCharacter+ `"`
+  - `"` StringCharacter* `"`
+  - `"""` MultiLineStringCharacter* `"""`
 
 StringCharacter ::
   - SourceCharacter but not `"` or \ or LineTerminator
   - \u EscapedUnicode
   - \ EscapedCharacter
 
+MultiLineStringCharacter ::
+  - SourceCharacter but not `"""` or `\"""`
+  - `\"""`
+
 EscapedUnicode :: /[0-9A-Fa-f]{4}/
 
 EscapedCharacter :: one of `"` \ `/` b f n r t
@@ -714,16 +718,34 @@ Note: Unicode characters are allowed within String value literals, however
 GraphQL source must not contain some ASCII control characters so escape
 sequences must be used to represent these characters.
 
-**Semantics**
+**Multi-line Strings**
 
-StringValue :: `""`
+Multi-line strings are sequences of characters wrapped in triple-quotes (`"""`).
+White space, line terminators, and quote and backslash characters may all be
+used unescaped, enabling freeform text. Characters must all be valid
+{SourceCharacter} to ensure printable source text. If non-printable ASCII
+characters need to be used, escape sequences must be used within standard
+double-quote strings.
 
-  * Return an empty Unicode character sequence.
+**Semantics**
 
-StringValue :: `"` StringCharacter+ `"`
+StringValue :: `"` StringCharacter* `"`
 
   * Return the Unicode character sequence of all {StringCharacter}
-    Unicode character values.
+    Unicode character values (which may be empty).
+
+StringValue :: `"""` MultiLineStringCharacter* `"""`
+
+  * Return the Unicode character sequence of all {MultiLineStringCharacter}
+    Unicode character values (which may be empty).
+
+MultiLineStringCharacter :: SourceCharacter but not `"""` or `\"""`
+
+  * Return the character value of {SourceCharacter}.
+
+MultiLineStringCharacter :: `\"""`
+
+  * Return the character sequence `"""`.
 
 StringCharacter :: SourceCharacter but not `"` or \ or LineTerminator
 
From 979a1f25e80592a0aa41cfbe4befa34385fac132 Mon Sep 17 00:00:00 2001
From: Lee Byron
Date: Wed, 21 Jun 2017 23:57:54 -0700
Subject: [PATCH 2/2] Include language to remove common indentation from multi-line strings

---
 spec/Appendix B -- Grammar Summary.md |  13 ++--
 spec/Section 2 -- Language.md         | 107 ++++++++++++++++++++------
 2 files changed, 91 insertions(+), 29 deletions(-)

diff --git a/spec/Appendix B -- Grammar Summary.md b/spec/Appendix B -- Grammar Summary.md
index 97e82faf4..c6c2edaab 100644
--- a/spec/Appendix B -- Grammar Summary.md
+++ b/spec/Appendix B -- Grammar Summary.md
@@ -70,21 +70,24 @@ Sign :: one of + -
 
 StringValue ::
   - `"` StringCharacter* `"`
-  - `"""` MultiLineStringCharacter* `"""`
+  - `"""` BlockStringCharacter* `"""`
 
 StringCharacter ::
   - SourceCharacter but not `"` or \ or LineTerminator
   - \u EscapedUnicode
   - \ EscapedCharacter
 
-MultiLineStringCharacter ::
-  - SourceCharacter but not `"""` or `\"""`
-  - `\"""`
-
 EscapedUnicode :: /[0-9A-Fa-f]{4}/
 
 EscapedCharacter :: one of `"` \ `/` b f n r t
 
+BlockStringCharacter ::
+  - SourceCharacter but not `"""` or `\"""`
+  - `\"""`
+
+Note: Block string values are interpreted to exclude blank initial and trailing
+lines and uniform indentation with {BlockStringValue()}.
+
 
 ## Query Document
 
diff --git a/spec/Section 2 -- Language.md b/spec/Section 2 -- Language.md
index 62f2aaee3..3b6da6d97 100644
--- a/spec/Section 2 -- Language.md
+++ b/spec/Section 2 -- Language.md
@@ -695,57 +695,72 @@ The two keywords `true` and `false` represent the two boolean values.
 
 StringValue ::
   - `"` StringCharacter* `"`
-  - `"""` MultiLineStringCharacter* `"""`
+  - `"""` BlockStringCharacter* `"""`
 
 StringCharacter ::
   - SourceCharacter but not `"` or \ or LineTerminator
   - \u EscapedUnicode
   - \ EscapedCharacter
 
-MultiLineStringCharacter ::
-  - SourceCharacter but not `"""` or `\"""`
-  - `\"""`
-
 EscapedUnicode :: /[0-9A-Fa-f]{4}/
 
 EscapedCharacter :: one of `"` \ `/` b f n r t
 
+BlockStringCharacter ::
+  - SourceCharacter but not `"""` or `\"""`
+  - `\"""`
+
 Strings are sequences of characters wrapped in double-quotes (`"`).
 (ex. `"Hello World"`). White space and other otherwise-ignored characters are
 significant within a string value.
 
 Note: Unicode characters are allowed within String value literals, however
-GraphQL source must not contain some ASCII control characters so escape
+{SourceCharacter} must not contain some ASCII control characters so escape
 sequences must be used to represent these characters.
 
-**Multi-line Strings**
+**Block Strings**
 
-Multi-line strings are sequences of characters wrapped in triple-quotes (`"""`).
-White space, line terminators, and quote and backslash characters may all be
-used unescaped, enabling freeform text. Characters must all be valid
-{SourceCharacter} to ensure printable source text. If non-printable ASCII
-characters need to be used, escape sequences must be used within standard
-double-quote strings.
+Block strings are sequences of characters wrapped in triple-quotes (`"""`).
+White space, line terminators, quote, and backslash characters may all be
+used unescaped to enable verbatim text. Characters must all be valid
+{SourceCharacter}.
 
-**Semantics**
+Since block strings represent freeform text often used in indented
+positions, the string value semantics of a block string excludes uniform
+indentation and blank initial and trailing lines via {BlockStringValue()}.
 
-StringValue :: `"` StringCharacter* `"`
+For example, the following operation containing a block string:
 
-  * Return the Unicode character sequence of all {StringCharacter}
-    Unicode character values (which may be empty).
+```graphql
+mutation {
+  sendEmail(message: """
+    Hello,
+      World!
+
+    Yours,
+      GraphQL.
+  """)
+}
+```
 
-StringValue :: `"""` MultiLineStringCharacter* `"""`
+Is identical to the standard quoted string:
 
-  * Return the Unicode character sequence of all {MultiLineStringCharacter}
-    Unicode character values (which may be empty).
+```graphql
+mutation {
+  sendEmail(message: "Hello,\n  World!\n\nYours,\n  GraphQL.")
+}
+```
 
-MultiLineStringCharacter :: SourceCharacter but not `"""` or `\"""`
+Note: If non-printable ASCII characters are needed in a string value, a standard
+quoted string with appropriate escape sequences must be used instead of a
+block string.
 
-  * Return the character value of {SourceCharacter}.
+**Semantics**
 
-MultiLineStringCharacter :: `\"""`
+StringValue :: `"` StringCharacter* `"`
 
-  * Return the character sequence `"""`.
+  * Return the Unicode character sequence of all {StringCharacter}
+    Unicode character values (which may be an empty sequence).
 
 StringCharacter :: SourceCharacter but not `"` or \ or LineTerminator
 
@@ -771,6 +786,50 @@ StringCharacter :: \ EscapedCharacter
 | `r`               | U+000D                       | carriage return              |
 | `t`               | U+0009                       | horizontal tab               |
 
+StringValue :: `"""` BlockStringCharacter* `"""`
+
+  * Let {rawValue} be the Unicode character sequence of all
+    {BlockStringCharacter} Unicode character values (which may be an empty
+    sequence).
+  * Return the result of {BlockStringValue(rawValue)}.
+
+BlockStringCharacter :: SourceCharacter but not `"""` or `\"""`
+
+  * Return the character value of {SourceCharacter}.
+
+BlockStringCharacter :: `\"""`
+
+  * Return the character sequence `"""`.
+
+BlockStringValue(rawValue):
+
+  * Let {lines} be the result of splitting {rawValue} by {LineTerminator}.
+  * Let {commonIndent} be {null}.
+  * For each {line} in {lines}:
+    * If {line} is the first item in {lines}, continue to the next line.
+    * Let {length} be the number of characters in {line}.
+    * Let {indent} be the number of leading consecutive {WhiteSpace} characters
+      in {line}.
+    * If {indent} is less than {length}:
+      * If {commonIndent} is {null} or {indent} is less than {commonIndent}:
+        * Let {commonIndent} be {indent}.
+  * If {commonIndent} is not {null}:
+    * For each {line} in {lines}:
+      * If {line} is the first item in {lines}, continue to the next line.
+      * Remove {commonIndent} characters from the beginning of {line}.
+  * While the first item {line} in {lines} contains only {WhiteSpace}:
+    * Remove the first item from {lines}.
+  * While the last item {line} in {lines} contains only {WhiteSpace}:
+    * Remove the last item from {lines}.
+  * Let {formatted} be the empty character sequence.
+  * For each {line} in {lines}:
+    * If {line} is the first item in {lines}:
+      * Append {formatted} with {line}.
+    * Otherwise:
+      * Append {formatted} with a line feed character (U+000A).
+      * Append {formatted} with {line}.
+  * Return {formatted}.
+
 
 ### Null Value
 
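
Review note: the {BlockStringValue()} steps added in PATCH 2/2 can be tried out with a minimal Python sketch like the one below. It is only an illustration for checking the algorithm against sample input, not part of the patch or any reference implementation. The names `block_string_value`, `_LINE_TERMINATOR`, and `_WHITESPACE` are hypothetical, and the sketch assumes {WhiteSpace} means tab or space and {LineTerminator} means `\r\n`, `\n`, or a lone `\r`.

```python
import re

# Assumption: LineTerminator is CR LF, LF, or a lone CR.
_LINE_TERMINATOR = re.compile(r"\r\n|\n|\r")
# Assumption: WhiteSpace is horizontal tab (U+0009) or space (U+0020).
_WHITESPACE = " \t"


def block_string_value(raw_value: str) -> str:
    """Hypothetical helper mirroring the BlockStringValue(rawValue) steps."""
    lines = _LINE_TERMINATOR.split(raw_value)

    # Find the smallest indentation among non-blank lines after the first.
    common_indent = None
    for line in lines[1:]:
        indent = len(line) - len(line.lstrip(_WHITESPACE))
        if indent < len(line) and (common_indent is None or indent < common_indent):
            common_indent = indent

    # Remove that indentation from every line except the first.
    if common_indent is not None:
        lines = lines[:1] + [line[common_indent:] for line in lines[1:]]

    # Drop leading and trailing lines that contain only whitespace.
    # The emptiness guard is an addition so an all-blank value terminates.
    while lines and not lines[0].strip(_WHITESPACE):
        lines.pop(0)
    while lines and not lines[-1].strip(_WHITESPACE):
        lines.pop()

    # Join the remaining lines with a line feed (U+000A).
    return "\n".join(lines)
```

Applied to the raw contents of the `sendEmail` block string in the example from the patch, this sketch returns the same text as the standard quoted string shown there. Note that the first line is never dedented, which matches the "continue to the next line" step for the first item.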