From c88e58928295ca13b5a92066c77adeadc1ed5a64 Mon Sep 17 00:00:00 2001
From: Michael Dyck
Date: Sun, 29 Nov 2020 21:16:35 -0500
Subject: [PATCH] Normative: Make B.1.2 "String Literals" normative.

(Part of Annex B reform, see PR #1595.)

B.1.2 makes 2 changes to the EscapeSequence production:
(1) It adds the rhs `NonOctalDecimalEscapeSequence`.
(2) It replaces the rhs:
        `0` [lookahead <! DecimalDigit]
    with:
        LegacyOctalEscapeSequence
    where the latter nonterminal generates `0` among lots of other things.

We want to continue to disallow such syntax in strict mode and templates,
but the mechanism to do so must change. Formerly, the spec would say that
in such contexts, it's forbidden to extend the syntax in this way. But
since (with this PR), this is no longer an extension, we instead use early
error rules to say that in such contexts, occurrences of the 'new' parts
of the syntax are Syntax Errors.

For change 1, making it a Syntax Error is fairly straightforward.

But for change 2, we can't simply say that LegacyOctalEscapeSequence is a
Syntax Error in strict mode, because strict mode still has to allow the
restricted syntax. Instead, we say that if we're in strict mode code, an
instance of LegacyOctalEscapeSequence is a Syntax Error *unless* it's an
instance of the restricted syntax. To express the latter condition, we use
the cover grammar machinery. (It could be done in other ways, but I think
this is clearest.)
---
 spec.html | 202 +++++++++++++++++++++++++++++-------------------------
 1 file changed, 109 insertions(+), 93 deletions(-)

diff --git a/spec.html b/spec.html
index abf1c5c8515..4575ac8e96c 100644
--- a/spec.html
+++ b/spec.html
@@ -11327,12 +11327,11 @@

Syntax

         EscapeSequence ::
           CharacterEscapeSequence
-          `0` [lookahead <! DecimalDigit]
+          LegacyOctalEscapeSequence
+          NonOctalDecimalEscapeSequence
           HexEscapeSequence
           UnicodeEscapeSequence
-
-

A conforming implementation, when processing strict mode code, must not extend the syntax of |EscapeSequence| to include |LegacyOctalEscapeSequence| or |NonOctalDecimalEscapeSequence| as described in B.1.2.

- + CharacterEscapeSequence :: SingleEscapeCharacter NonEscapeCharacter @@ -11349,6 +11348,21 @@

Syntax

           `x`
           `u`
 
+        LegacyOctalEscapeSequence ::
+          OctalDigit [lookahead <! OctalDigit]
+          ZeroToThree OctalDigit [lookahead <! OctalDigit]
+          FourToSeven OctalDigit
+          ZeroToThree OctalDigit OctalDigit
+
+        ZeroToThree :: one of
+          `0` `1` `2` `3`
+
+        FourToSeven :: one of
+          `4` `5` `6` `7`
+
+        NonOctalDecimalEscapeSequence :: one of
+          `8` `9`
+
         HexEscapeSequence ::
           `x` HexDigit HexDigit
 
@@ -11364,6 +11378,36 @@

Syntax

<LF> and <CR> cannot appear in a string literal, except as part of a |LineContinuation| to produce the empty code points sequence. The proper way to include either in the String value of a string literal is to use an escape sequence such as `\\n` or `\\u000A`.

+

Supplemental Syntax

+

When processing an instance of the production LegacyOctalEscapeSequence :: OctalDigit the following production is used to refine the interpretation of |LegacyOctalEscapeSequence|.

+
+          StrictZeroEscapeSequence ::
+            `0` [lookahead <! DecimalDigit]
+
+
+

Static Semantics: Early Errors

+ + EscapeSequence :: LegacyOctalEscapeSequence + +
    +
  • It is a Syntax Error if |EscapeSequence| is not covering a |StrictZeroEscapeSequence| and either the source code matching this production is strict mode code or |EscapeSequence| is contained within a |TemplateCharacter|.
  • +
+ + EscapeSequence :: NonOctalDecimalEscapeSequence + +
    +
  • It is a Syntax Error if the source code matching this production is strict mode code or |EscapeSequence| is contained within a |TemplateCharacter|.
  • +
+ In non-strict code, this syntax is allowed, but deprecated. + +
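The two early error rules above can be modelled as a small predicate. This is only a sketch for reading the patch, not spec text: `escapeIsEarlyError`, its `rest` parameter (the source text immediately after the backslash), and the `strict` / `inTemplate` options are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper, not part of the spec: reports whether the escape that
// begins at `rest` (the text right after the backslash) would be an early
// error under the two rules above.
function escapeIsEarlyError(rest, { strict = false, inTemplate = false } = {}) {
  const restricted = strict || inTemplate;
  const first = rest[0];

  // EscapeSequence :: NonOctalDecimalEscapeSequence (`8` or `9`)
  if (first === "8" || first === "9") return restricted;

  // EscapeSequence :: LegacyOctalEscapeSequence (starts with an octal digit)
  if (first >= "0" && first <= "7") {
    // StrictZeroEscapeSequence :: `0` [lookahead is not a DecimalDigit]
    const next = rest[1];
    const nextIsDecimalDigit = next !== undefined && next >= "0" && next <= "9";
    const coversStrictZero = first === "0" && !nextIsDecimalDigit;
    return restricted && !coversStrictZero;
  }

  // Any other escape is outside the scope of these two rules.
  return false;
}

console.log(escapeIsEarlyError("0", { strict: true }));     // false
console.log(escapeIsEarlyError("08", { strict: true }));    // true
console.log(escapeIsEarlyError("7", { inTemplate: true })); // true
console.log(escapeIsEarlyError("7"));                       // false
```

The `"08"` case shows why the rule is phrased in terms of covering: `\0` followed by `8` still matches |LegacyOctalEscapeSequence|, but it does not cover a |StrictZeroEscapeSequence| because of the lookahead restriction.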

It is possible for string literals to precede a Use Strict Directive that places the enclosing code in strict mode, and implementations must take care to enforce the above rules for such literals. For example, the following source text contains a Syntax Error:

+

+            function invalid() { "\7"; "use strict"; }
+          
+
+
+
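One way to observe these rules in an existing engine is to ask it to parse small function bodies and note which ones are rejected. The sketch below assumes an environment where `new Function` is permitted (for example Node.js without a restrictive policy); the helper name `parses` is hypothetical. The function bodies are only parsed, never called.

```javascript
// Returns true if `body` parses as a function body, false on a SyntaxError.
function parses(body) {
  try {
    new Function(body);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Each body is itself a JS string, so "\\7" denotes the source characters \7.
const results = {
  sloppyOctal: parses('"\\7";'),
  strictOctal: parses('"use strict"; "\\7";'),
  // Same shape as the example in the note: the literal precedes the directive.
  octalBeforeDirective: parses('"\\7"; "use strict";'),
  strictZero: parses('"use strict"; "\\0";'),
  strictZeroThenDigit: parses('"use strict"; "\\08";'),
  templateOctal: parses('`\\7`;'),
};
console.log(results);
```

Under the rules in this patch, a conforming implementation would be expected to accept only `sloppyOctal` and `strictZero` and to reject the other four bodies.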

Static Semantics: StringValue

@@ -11377,7 +11421,7 @@

Static Semantics: StringValue

- +

Static Semantics: SV

A string literal stands for a value of the String type. The String value (SV) of the literal is described in terms of String values contributed by the various parts of the string literal. As part of this process, some Unicode code points within the string literal are interpreted as having a mathematical value (MV), as described below or in .

    @@ -11417,9 +11461,6 @@

    Static Semantics: SV

  • The SV of SingleStringCharacter :: LineContinuation is the empty String.
  • -
  • - The SV of EscapeSequence :: `0` is the String value consisting of the code unit 0x0000 (NULL). -
  • The SV of CharacterEscapeSequence :: SingleEscapeCharacter is the String value consisting of the code unit whose value is determined by the |SingleEscapeCharacter| according to .
  • @@ -11574,6 +11615,15 @@

    Static Semantics: SV

  • The SV of NonEscapeCharacter :: SourceCharacter but not one of EscapeCharacter or LineTerminator is the result of performing UTF16EncodeCodePoint on the code point value of |SourceCharacter|.
  • +
  • + The SV of EscapeSequence :: LegacyOctalEscapeSequence is the String value consisting of the code unit whose value is the MV of |LegacyOctalEscapeSequence|. +
  • +
  • + The SV of NonOctalDecimalEscapeSequence :: `8` is the String value consisting of the code unit 0x0038 (DIGIT EIGHT). +
  • +
  • + The SV of NonOctalDecimalEscapeSequence :: `9` is the String value consisting of the code unit 0x0039 (DIGIT NINE). +
  • The SV of HexEscapeSequence :: `x` HexDigit HexDigit is the String value consisting of the code unit whose value is the MV of |HexEscapeSequence|.
  • @@ -11589,6 +11639,39 @@

    Static Semantics: SV

    Static Semantics: MV

      +
    • + The MV of LegacyOctalEscapeSequence :: ZeroToThree OctalDigit is (8 times the MV of |ZeroToThree|) plus the MV of |OctalDigit|. +
    • +
    • + The MV of LegacyOctalEscapeSequence :: FourToSeven OctalDigit is (8 times the MV of |FourToSeven|) plus the MV of |OctalDigit|. +
    • +
    • + The MV of LegacyOctalEscapeSequence :: ZeroToThree OctalDigit OctalDigit is (64 (that is, 8<sup>2</sup>) times the MV of |ZeroToThree|) plus (8 times the MV of the first |OctalDigit|) plus the MV of the second |OctalDigit|. +
    • +
    • + The MV of ZeroToThree :: `0` is 0. +
    • +
    • + The MV of ZeroToThree :: `1` is 1. +
    • +
    • + The MV of ZeroToThree :: `2` is 2. +
    • +
    • + The MV of ZeroToThree :: `3` is 3. +
    • +
    • + The MV of FourToSeven :: `4` is 4. +
    • +
    • + The MV of FourToSeven :: `5` is 5. +
    • +
    • + The MV of FourToSeven :: `6` is 6. +
    • +
    • + The MV of FourToSeven :: `7` is 7. +
    • The MV of HexEscapeSequence :: `x` HexDigit HexDigit is (16 times the MV of the first |HexDigit|) plus the MV of the second |HexDigit|.
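As a worked illustration of the MV rules for |LegacyOctalEscapeSequence| added in this hunk, the arithmetic can be sketched as follows. The helper names `legacyOctalMV` and `legacyOctalSV` are hypothetical, and the input is assumed to be text the grammar has already matched, so the lookahead restrictions are not re-checked here.

```javascript
// Hypothetical helper: `digits` is the text of a LegacyOctalEscapeSequence,
// i.e. one octal digit, two octal digits, or ZeroToThree plus two octal digits.
function legacyOctalMV(digits) {
  if (!/^(?:[0-7]|[0-7]{2}|[0-3][0-7]{2})$/.test(digits)) {
    throw new RangeError(`not a LegacyOctalEscapeSequence: ${digits}`);
  }
  // Each step multiplies the running value by 8, so three digits d0 d1 d2
  // yield 64 * d0 + 8 * d1 + d2, matching the rules above.
  let mv = 0;
  for (const d of digits) {
    mv = mv * 8 + (d.charCodeAt(0) - 0x30);
  }
  return mv;
}

// The SV is the String consisting of the single code unit whose value is the MV.
function legacyOctalSV(digits) {
  return String.fromCharCode(legacyOctalMV(digits));
}

console.log(legacyOctalMV("101")); // 65
console.log(legacyOctalSV("101")); // "A"
console.log(legacyOctalMV("377")); // 255
console.log(legacyOctalMV("47"));  // 39
```

A three-digit input such as `"477"` is rejected because the grammar only allows a third digit after |ZeroToThree|; in source text, `\477` would be the escape `\47` followed by the character `7`.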
    • @@ -11734,10 +11817,12 @@

      Syntax

      CodePoint :: HexDigits[~Sep] [> but only if MV of |HexDigits| ≤ 0x10FFFF] -

      A conforming implementation must not use the extended definition of |EscapeSequence| described in B.1.2 when parsing a |TemplateCharacter|.

      |TemplateSubstitutionTail| is used by the |InputElementTemplateTail| alternative lexical goal.

      + +

      Instances of the production TemplateCharacter :: `\` EscapeSequence are restricted by early error rules in .

      +

      Static Semantics: TV and TRV

      @@ -11792,7 +11877,7 @@

      Static Semantics: TV and TRV

      The TRV of TemplateCharacter :: `\` NotEscapeSequence is the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS) and the TRV of |NotEscapeSequence|.
    • - The TRV of EscapeSequence :: `0` is the String value consisting of the code unit 0x0030 (DIGIT ZERO). + The TRV of EscapeSequence :: LegacyOctalEscapeSequence is the String value consisting of the code unit 0x0030 (DIGIT ZERO).
    • The TRV of NotEscapeSequence :: `0` DecimalDigit is the string-concatenation of the code unit 0x0030 (DIGIT ZERO) and the TRV of |DecimalDigit|. @@ -24740,9 +24825,6 @@

      Forbidden Extensions

    • The Syntactic Grammar must not be extended in any manner that allows the token `:` to immediately follow source text that matches the |BindingIdentifier| nonterminal symbol.
    • -
      • - |TemplateCharacter| must not be extended to include |LegacyOctalEscapeSequence| or |NonOctalDecimalEscapeSequence| as defined in B.1.2. -
    • When processing strict mode code, the extensions defined in , , , and must not be supported.
    • @@ -41605,9 +41687,16 @@

      Lexical Grammar

      + + + + +

      When processing an instance of the production LegacyOctalEscapeSequence :: OctalDigit the following production is used to refine the interpretation of |LegacyOctalEscapeSequence|.

      + +

       

      @@ -41958,86 +42047,13 @@

      Numeric Literals

      String Literals

      -

      The syntax and semantics of is extended as follows except that this extension is not allowed for strict mode code:

      -

      Syntax

      - - EscapeSequence :: - CharacterEscapeSequence - LegacyOctalEscapeSequence - NonOctalDecimalEscapeSequence - HexEscapeSequence - UnicodeEscapeSequence - - LegacyOctalEscapeSequence :: - OctalDigit [lookahead <! OctalDigit] - ZeroToThree OctalDigit [lookahead <! OctalDigit] - FourToSeven OctalDigit - ZeroToThree OctalDigit OctalDigit - - ZeroToThree :: one of - `0` `1` `2` `3` - - FourToSeven :: one of - `4` `5` `6` `7` +

      The following syntax from , and its associated semantics, used to be normative optional:

      + + EscapeSequence :: LegacyOctalEscapeSequence - NonOctalDecimalEscapeSequence :: one of - `8` `9` + EscapeSequence :: NonOctalDecimalEscapeSequence -

      This definition of |EscapeSequence| is not used in strict mode or when parsing |TemplateCharacter|.

      - -

      It is possible for string literals to precede a Use Strict Directive that places the enclosing code in strict mode, and implementations must take care to not use this extended definition of |EscapeSequence| with such literals. For example, attempting to parse the following source text must fail:

      -
      
      -          function invalid() { "\7"; "use strict"; }
      -        
      -
      - - -

      Static Semantics

      -
        -
      • - The SV of EscapeSequence :: LegacyOctalEscapeSequence is the String value consisting of the code unit whose value is the MV of |LegacyOctalEscapeSequence|. -
      • -
      • - The MV of LegacyOctalEscapeSequence :: ZeroToThree OctalDigit is (8 times the MV of |ZeroToThree|) plus the MV of |OctalDigit|. -
      • -
      • - The MV of LegacyOctalEscapeSequence :: FourToSeven OctalDigit is (8 times the MV of |FourToSeven|) plus the MV of |OctalDigit|. -
      • -
      • - The MV of LegacyOctalEscapeSequence :: ZeroToThree OctalDigit OctalDigit is (64 (that is, 8<sup>2</sup>) times the MV of |ZeroToThree|) plus (8 times the MV of the first |OctalDigit|) plus the MV of the second |OctalDigit|. -
      • -
      • - The SV of NonOctalDecimalEscapeSequence :: `8` is the String value consisting of the code unit 0x0038 (DIGIT EIGHT). -
      • -
      • - The SV of NonOctalDecimalEscapeSequence :: `9` is the String value consisting of the code unit 0x0039 (DIGIT NINE). -
      • -
      • - The MV of ZeroToThree :: `0` is 0. -
      • -
      • - The MV of ZeroToThree :: `1` is 1. -
      • -
      • - The MV of ZeroToThree :: `2` is 2. -
      • -
      • - The MV of ZeroToThree :: `3` is 3. -
      • -
      • - The MV of FourToSeven :: `4` is 4. -
      • -
      • - The MV of FourToSeven :: `5` is 5. -
      • -
      • - The MV of FourToSeven :: `6` is 6. -
      • -
      • - The MV of FourToSeven :: `7` is 7. -
      • -
      -
      +

      and the productions for |LegacyOctalEscapeSequence|, |ZeroToThree|, and |FourToSeven|.

      @@ -42244,7 +42260,7 @@

      Static Semantics: CharacterValue

      CharacterEscape :: LegacyOctalEscapeSequence - 1. Return the MV of |LegacyOctalEscapeSequence| (see ). + 1. Return the MV of |LegacyOctalEscapeSequence| (see ).
      @@ -43226,7 +43242,7 @@

      The Strict Mode of ECMAScript

      A conforming implementation, when processing strict mode code, must disallow instances of the productions NumericLiteral :: LegacyOctalIntegerLiteral and DecimalIntegerLiteral :: NonOctalDecimalIntegerLiteral.
    • - A conforming implementation, when processing strict mode code, may not extend the syntax of |EscapeSequence| to include |LegacyOctalEscapeSequence| or |NonOctalDecimalEscapeSequence| as described in B.1.2.
      + A conforming implementation, when processing strict mode code, must disallow instances of the production EscapeSequence :: LegacyOctalEscapeSequence that do not cover a |StrictZeroEscapeSequence|, and instances of the production EscapeSequence :: NonOctalDecimalEscapeSequence.
    • Assignment to an undeclared identifier or otherwise unresolvable reference does not create a property in the global object. When a simple assignment occurs within strict mode code, its |LeftHandSideExpression| must not evaluate to an unresolvable Reference. If it does a *ReferenceError* exception is thrown (). The |LeftHandSideExpression| also may not be a reference to a data property with the attribute value { [[Writable]]: *false* }, to an accessor property with the attribute value { [[Set]]: *undefined* }, nor to a non-existent property of an object whose [[Extensible]] internal slot has the value *false*. In these cases a `TypeError` exception is thrown ().