Editorial: Merge 'ExtendedAtom' and 'Atom'
Note that:

ExtendedAtom was only ever 'invoked' under [~U],
and when the merged production is invoked with [~U],
it exactly reproduces the RHSs of ExtendedAtom.

Atom was only ever invoked under [+U],
and when the merged production is invoked with [+U],
it reproduces the RHSs of former Atom
except for the placement of the PatternCharacter RHS,
but that's okay, because former Atom wasn't an order-disambiguated production.
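
The two claims above can be checked mechanically. The following is a minimal sketch, not part of the commit: it transcribes the right-hand sides from the diff below as plain strings, and the names `MERGED_ATOM`, `FORMER_ATOM`, `FORMER_EXTENDED_ATOM`, and `expand` are hypothetical helpers introduced only for illustration. It models a guard such as `[~UnicodeMode]` as a filter and `?UnicodeMode` as a textual substitution, which is an assumption about how to read the grammar notation, not spec tooling.

```python
# Merged production from this commit, as (guard, right-hand side) pairs.
# A guard of None means the alternative is unguarded.
MERGED_ATOM = [
    (None, "`.`"),
    (None, r"`\` AtomEscape[?UnicodeMode, ?N]"),
    ("~UnicodeMode", r"`\` [lookahead == `c`]"),
    (None, "CharacterClass[?UnicodeMode]"),
    ("+UnicodeMode", "`(` GroupSpecifier[?UnicodeMode] Disjunction[?UnicodeMode, ?N] `)`"),
    ("~UnicodeMode", "`(` Disjunction[?UnicodeMode, ?N] `)`"),
    (None, "`(` `?` `:` Disjunction[?UnicodeMode, ?N] `)`"),
    ("~UnicodeMode", "InvalidBracedQuantifier"),
    ("+UnicodeMode", "PatternCharacter"),
    ("~UnicodeMode", "ExtendedPatternCharacter"),
]

# Former Atom (no guards; PatternCharacter came first).
FORMER_ATOM = [
    (None, "PatternCharacter"),
    (None, "`.`"),
    (None, r"`\` AtomEscape[?UnicodeMode, ?N]"),
    (None, "CharacterClass[?UnicodeMode]"),
    (None, "`(` GroupSpecifier[?UnicodeMode] Disjunction[?UnicodeMode, ?N] `)`"),
    (None, "`(` `?` `:` Disjunction[?UnicodeMode, ?N] `)`"),
]

# Former ExtendedAtom, which was already written out for ~UnicodeMode.
FORMER_EXTENDED_ATOM = [
    "`.`",
    r"`\` AtomEscape[~UnicodeMode, ?N]",
    r"`\` [lookahead == `c`]",
    "CharacterClass[~UnicodeMode]",
    "`(` Disjunction[~UnicodeMode, ?N] `)`",
    "`(` `?` `:` Disjunction[~UnicodeMode, ?N] `)`",
    "InvalidBracedQuantifier",
    "ExtendedPatternCharacter",
]


def expand(production, unicode_mode):
    """Specialize a production for [+UnicodeMode] or [~UnicodeMode].

    Alternatives whose guard does not match are dropped, and every
    `?UnicodeMode` is replaced by the concrete parameter value.
    The order of the surviving alternatives is preserved.
    """
    flag = "+UnicodeMode" if unicode_mode else "~UnicodeMode"
    return [
        rhs.replace("?UnicodeMode", flag)
        for guard, rhs in production
        if guard is None or guard == flag
    ]


if __name__ == "__main__":
    # Under [~U]: same alternatives, same order as former ExtendedAtom.
    print(expand(MERGED_ATOM, False) == FORMER_EXTENDED_ATOM)
    # Under [+U]: same alternatives as former Atom, ignoring order.
    print(sorted(expand(MERGED_ATOM, True)) == sorted(expand(FORMER_ATOM, True)))
```

The second comparison sorts both lists because, under `[+U]`, the merged production lists `PatternCharacter` last rather than first. As the note says, that reordering is harmless because former Atom was not an order-disambiguated (`::!`) production.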
jmdyck committed Sep 14, 2021
1 parent 53c48e5 commit 8e986fa
Showing 1 changed file with 30 additions and 41 deletions.
spec.html: 71 changes (30 additions & 41 deletions)
@@ -34048,8 +34048,8 @@ <h2>Patterns</h2>
[+UnicodeMode] Atom[+UnicodeMode, ?N]
[~UnicodeMode] QuantifiableAssertion[~UnicodeMode, ?N] Quantifier
[~UnicodeMode] Assertion[~UnicodeMode, ?N]
[~UnicodeMode] ExtendedAtom[?N] Quantifier
[~UnicodeMode] ExtendedAtom[?N]
[~UnicodeMode] Atom[~UnicodeMode, ?N] Quantifier
[~UnicodeMode] Atom[~UnicodeMode, ?N]

Assertion[UnicodeMode, N] ::
`^`
@@ -34076,23 +34076,17 @@ <h2>Patterns</h2>
`{` DecimalDigits[~Sep] `,` `}`
`{` DecimalDigits[~Sep] `,` DecimalDigits[~Sep] `}`

ExtendedAtom[N] ::!
`.`
`\` AtomEscape[~UnicodeMode, ?N]
`\` [lookahead == `c`]
CharacterClass[~UnicodeMode]
`(` Disjunction[~UnicodeMode, ?N] `)`
`(` `?` `:` Disjunction[~UnicodeMode, ?N] `)`
InvalidBracedQuantifier
ExtendedPatternCharacter

Atom[UnicodeMode, N] ::
PatternCharacter
Atom[UnicodeMode, N] ::!
`.`
`\` AtomEscape[?UnicodeMode, ?N]
[~UnicodeMode] `\` [lookahead == `c`]
CharacterClass[?UnicodeMode]
`(` GroupSpecifier[?UnicodeMode] Disjunction[?UnicodeMode, ?N] `)`
[+UnicodeMode] `(` GroupSpecifier[?UnicodeMode] Disjunction[?UnicodeMode, ?N] `)`
[~UnicodeMode] `(` Disjunction[?UnicodeMode, ?N] `)`
`(` `?` `:` Disjunction[?UnicodeMode, ?N] `)`
[~UnicodeMode] InvalidBracedQuantifier
[+UnicodeMode] PatternCharacter
[~UnicodeMode] ExtendedPatternCharacter

InvalidBracedQuantifier ::
`{` DecimalDigits[~Sep] `}`
@@ -34283,7 +34277,7 @@ <h2>Escapes</h2>
<emu-grammar>
Term ::! QuantifiableAssertion Quantifier

ExtendedAtom ::! `\` [lookahead == `c`]
Atom ::! `\` [lookahead == `c`]

ClassAtomNoDash ::! `\` [lookahead == `c`]

@@ -34314,7 +34308,7 @@ <h1>Static Semantics: Early Errors</h1>
It is a Syntax Error if the MV of the first |DecimalDigits| is larger than the MV of the second |DecimalDigits|.
</li>
</ul>
<emu-grammar>ExtendedAtom ::! InvalidBracedQuantifier</emu-grammar>
<emu-grammar>Atom ::! InvalidBracedQuantifier</emu-grammar>
<ul>
<li>
It is a Syntax Error if any source text matches this rule.
@@ -34682,7 +34676,7 @@ <h1>Notation</h1>
_InputLength_ is the number of characters in _Input_.
</li>
<li>
_NcapturingParens_ is the total number of left-capturing parentheses (i.e. the total number of <emu-grammar>Atom :: `(` GroupSpecifier Disjunction `)`</emu-grammar> Parse Nodes) in the pattern. A left-capturing parenthesis is any `(` pattern character that is matched by the `(` terminal of the <emu-grammar>Atom :: `(` GroupSpecifier Disjunction `)`</emu-grammar> production.
_NcapturingParens_ is the total number of left-capturing parentheses (i.e. the total number of <emu-grammar>Atom ::! `(` GroupSpecifier Disjunction `)`</emu-grammar> Parse Nodes) in the pattern. A left-capturing parenthesis is any `(` pattern character that is matched by the `(` terminal of the <emu-grammar>Atom ::! `(` GroupSpecifier Disjunction `)`</emu-grammar> production.
</li>
<li>
_DotAll_ is *true* if the RegExp object's [[OriginalFlags]] internal slot contains *"s"* and otherwise is *false*.
@@ -34831,18 +34825,16 @@ <h1>Term</h1>
1. Evaluate |Atom| with argument _direction_ to obtain a Matcher _m_.
1. Evaluate |Quantifier| to obtain the three results: a non-negative integer _min_, a non-negative integer (or +&infin;) _max_, and Boolean _greedy_.
1. Assert: _min_ &le; _max_.
1. Let _parenIndex_ be the number of left-capturing parentheses in the entire regular expression that occur to the left of this |Term|. This is the total number of <emu-grammar>Atom :: `(` GroupSpecifier Disjunction `)`</emu-grammar> Parse Nodes prior to or enclosing this |Term|.
1. Let _parenCount_ be the number of left-capturing parentheses in |Atom|. This is the total number of <emu-grammar>Atom :: `(` GroupSpecifier Disjunction `)`</emu-grammar> Parse Nodes enclosed by |Atom|.
1. Let _parenIndex_ be the number of left-capturing parentheses in the entire regular expression that occur to the left of this |Term|. This is the total number of <emu-grammar>Atom ::! `(` GroupSpecifier Disjunction `)`</emu-grammar> Parse Nodes prior to or enclosing this |Term|.
1. Let _parenCount_ be the number of left-capturing parentheses in |Atom|. This is the total number of <emu-grammar>Atom ::! `(` GroupSpecifier Disjunction `)`</emu-grammar> Parse Nodes enclosed by |Atom|.
1. Return a new Matcher with parameters (_x_, _c_) that captures _m_, _min_, _max_, _greedy_, _parenIndex_, and _parenCount_ and performs the following steps when called:
1. Assert: _x_ is a State.
1. Assert: _c_ is a Continuation.
1. Return ! RepeatMatcher(_m_, _min_, _max_, _greedy_, _x_, _c_, _parenIndex_, _parenCount_).
</emu-alg>
<p>----</p>
<p>In the above algorithm, references to <emu-grammar>Atom :: `(` GroupSpecifier Disjunction `)`</emu-grammar> are to be interpreted as meaning <emu-grammar>Atom :: `(` GroupSpecifier Disjunction `)`</emu-grammar> or <emu-grammar>ExtendedAtom ::! `(` Disjunction `)`</emu-grammar> .</p>
<p>In the above algorithm, references to <emu-grammar>Atom ::! `(` GroupSpecifier Disjunction `)`</emu-grammar> are to be interpreted as meaning <emu-grammar>Atom ::! `(` GroupSpecifier Disjunction `)`</emu-grammar> or <emu-grammar>Atom ::! `(` Disjunction `)`</emu-grammar> .</p>
<p>The production <emu-grammar>Term ::! QuantifiableAssertion Quantifier</emu-grammar> evaluates the same as the production <emu-grammar>Term ::! Atom Quantifier</emu-grammar> but with |QuantifiableAssertion| substituted for |Atom|.</p>
<p>The production <emu-grammar>Term ::! ExtendedAtom Quantifier</emu-grammar> evaluates the same as the production <emu-grammar>Term ::! Atom Quantifier</emu-grammar> but with |ExtendedAtom| substituted for |Atom|.</p>
<p>The production <emu-grammar>Term ::! ExtendedAtom</emu-grammar> evaluates the same as the production <emu-grammar>Term ::! Atom</emu-grammar> but with |ExtendedAtom| substituted for |Atom|.</p>

<emu-clause id="sec-runtime-semantics-repeatmatcher-abstract-operation" type="abstract operation">
<h1>
@@ -35097,37 +35089,31 @@ <h1>Quantifier</h1>
<emu-clause id="sec-atom">
<h1>Atom</h1>
<p>With parameter _direction_.</p>
<p>The production <emu-grammar>Atom :: PatternCharacter</emu-grammar> evaluates as follows:</p>
<emu-alg>
1. Let _ch_ be the character matched by |PatternCharacter|.
1. Let _A_ be a one-element CharSet containing the character _ch_.
1. Return ! CharacterSetMatcher(_A_, *false*, _direction_).
</emu-alg>
<p>The production <emu-grammar>Atom :: `.`</emu-grammar> evaluates as follows:</p>
<p>The production <emu-grammar>Atom ::! `.`</emu-grammar> evaluates as follows:</p>
<emu-alg>
1. Let _A_ be the CharSet of all characters.
1. If _DotAll_ is not *true*, then
1. Remove from _A_ all characters corresponding to a code point on the right-hand side of the |LineTerminator| production.
1. Return ! CharacterSetMatcher(_A_, *false*, _direction_).
</emu-alg>
<p>The production <emu-grammar>Atom :: `\` AtomEscape</emu-grammar> evaluates as follows:</p>
<p>The production <emu-grammar>Atom ::! `\` AtomEscape</emu-grammar> evaluates as follows:</p>
<emu-alg>
1. Return the Matcher that is the result of evaluating |AtomEscape| with argument _direction_.
</emu-alg>
<p>The production <emu-grammar>ExtendedAtom ::! `\` [lookahead == `c`]</emu-grammar> evaluates as follows:</p>
<p>The production <emu-grammar>Atom ::! `\` [lookahead == `c`]</emu-grammar> evaluates as follows:</p>
<emu-alg>
1. Let _A_ be the CharSet containing the single character `\\` U+005C (REVERSE SOLIDUS).
1. Return ! CharacterSetMatcher(_A_, *false*, _direction_).
</emu-alg>
<p>The production <emu-grammar>Atom :: CharacterClass</emu-grammar> evaluates as follows:</p>
<p>The production <emu-grammar>Atom ::! CharacterClass</emu-grammar> evaluates as follows:</p>
<emu-alg>
1. Evaluate |CharacterClass| to obtain a CharSet _A_ and a Boolean _invert_.
1. Return ! CharacterSetMatcher(_A_, _invert_, _direction_).
</emu-alg>
<p>The production <emu-grammar>Atom :: `(` GroupSpecifier Disjunction `)`</emu-grammar> evaluates as follows:</p>
<p>The production <emu-grammar>Atom ::! `(` GroupSpecifier Disjunction `)`</emu-grammar> evaluates as follows:</p>
<emu-alg>
1. Evaluate |Disjunction| with argument _direction_ to obtain a Matcher _m_.
1. Let _parenIndex_ be the number of left-capturing parentheses in the entire regular expression that occur to the left of this |Atom|. This is the total number of <emu-grammar>Atom :: `(` GroupSpecifier Disjunction `)`</emu-grammar> Parse Nodes prior to or enclosing this |Atom|.
1. Let _parenIndex_ be the number of left-capturing parentheses in the entire regular expression that occur to the left of this |Atom|. This is the total number of <emu-grammar>Atom ::! `(` GroupSpecifier Disjunction `)`</emu-grammar> Parse Nodes prior to or enclosing this |Atom|.
1. Return a new Matcher with parameters (_x_, _c_) that captures _direction_, _m_, and _parenIndex_ and performs the following steps when called:
1. Assert: _x_ is a State.
1. Assert: _c_ is a Continuation.
@@ -35148,18 +35134,22 @@ <h1>Atom</h1>
1. Return _c_(_z_).
1. Return _m_(_x_, _d_).
</emu-alg>
<p>The production <emu-grammar>Atom :: `(` `?` `:` Disjunction `)`</emu-grammar> evaluates as follows:</p>
<p>The production <emu-grammar>Atom ::! `(` `?` `:` Disjunction `)`</emu-grammar> evaluates as follows:</p>
<emu-alg>
1. Return the Matcher that is the result of evaluating |Disjunction| with argument _direction_.
</emu-alg>
<p>The production <emu-grammar>ExtendedAtom ::! ExtendedPatternCharacter</emu-grammar> evaluates as follows:</p>
<p>The production <emu-grammar>Atom ::! ExtendedPatternCharacter</emu-grammar> evaluates as follows:</p>
<emu-alg>
1. Let _ch_ be the character represented by |ExtendedPatternCharacter|.
1. Let _A_ be a one-element CharSet containing the character _ch_.
1. Return ! CharacterSetMatcher(_A_, *false*, _direction_).
</emu-alg>
<p>----</p>
<p>The evaluation rules for the |Atom| productions except for <emu-grammar>Atom :: PatternCharacter</emu-grammar> are also used for the |ExtendedAtom| productions, but with |ExtendedAtom| substituted for |Atom|.</p>
<p>The production <emu-grammar>Atom ::! PatternCharacter</emu-grammar> evaluates as follows:</p>
<emu-alg>
1. Let _ch_ be the character matched by |PatternCharacter|.
1. Let _A_ be a one-element CharSet containing the character _ch_.
1. Return ! CharacterSetMatcher(_A_, *false*, _direction_).
</emu-alg>

<emu-clause id="sec-runtime-semantics-charactersetmatcher-abstract-operation" type="abstract operation">
<h1>
@@ -35489,7 +35479,7 @@ <h1>AtomEscape</h1>
<emu-alg>
1. Search the enclosing |Pattern| for an instance of a |GroupSpecifier| containing a |RegExpIdentifierName| which has a CapturingGroupName equal to the CapturingGroupName of the |RegExpIdentifierName| contained in |GroupName|.
1. Assert: A unique such |GroupSpecifier| is found.
1. Let _parenIndex_ be the number of left-capturing parentheses in the entire regular expression that occur to the left of the located |GroupSpecifier|. This is the total number of <emu-grammar>Atom :: `(` GroupSpecifier Disjunction `)`</emu-grammar> Parse Nodes prior to or enclosing the located |GroupSpecifier|, including its immediately enclosing |Atom|.
1. Let _parenIndex_ be the number of left-capturing parentheses in the entire regular expression that occur to the left of the located |GroupSpecifier|. This is the total number of <emu-grammar>Atom ::! `(` GroupSpecifier Disjunction `)`</emu-grammar> Parse Nodes prior to or enclosing the located |GroupSpecifier|, including its immediately enclosing |Atom|.
1. Return ! BackreferenceMatcher(_parenIndex_, _direction_).
</emu-alg>

@@ -46362,7 +46352,6 @@ <h1>Regular Expressions</h1>
<emu-prodref name="QuantifiableAssertion"></emu-prodref>
<emu-prodref name="Quantifier"></emu-prodref>
<emu-prodref name="QuantifierPrefix"></emu-prodref>
<emu-prodref name="ExtendedAtom"></emu-prodref>
<emu-prodref name="Atom"></emu-prodref>
<emu-prodref name="InvalidBracedQuantifier"></emu-prodref>
<emu-prodref name="ExtendedPatternCharacter"></emu-prodref>