Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misc editorial #1053

Merged
merged 48 commits into from
Feb 13, 2018
Merged

Misc editorial #1053

merged 48 commits into from
Feb 13, 2018

Conversation

jmdyck
Copy link
Collaborator

@jmdyck jmdyck commented Dec 21, 2017

In the first commit, add a [lookahead <! HexDigit] restriction to these
two right-hand sides of NotEscapeSequence:

    u { NotCodePoint
    u { CodePoint

to force them to consume all the HexDigits there.
Otherwise, the NotEscapeSequence could end 'early',
causing 'TemplateCharacters' to be ambiguous.

(I think the TRV is the same regardless which parse you pick, but presumably we want to avoid ambiguity anyway.)


Fix some typos in a recent commit.


Resolve a technical (though not substantive) ambiguity raised in issue #1059.


Attn @mathiasbynens re recent commits (starting at "consistify grammar params in defining prodns").

@jmdyck jmdyck changed the title Editorial: add more lookahead-restrictions to NotEscapeSequence Editorial: add more lookahead-restrictions to NotEscapeSequence (plus tweaks to recent commits) Jan 4, 2018
@jmdyck jmdyck changed the title Editorial: add more lookahead-restrictions to NotEscapeSequence (plus tweaks to recent commits) Misc editorial Jan 5, 2018
@jmdyck jmdyck force-pushed the editorial branch 3 times, most recently from 7663d7f to 77bfec2 Compare January 26, 2018 00:09
@mathiasbynens
Copy link
Member

This needs a rebase, but the changes look great — thank you @jmdyck!

@jmdyck
Copy link
Collaborator Author

jmdyck commented Jan 26, 2018

Working on the rebase now.

@jmdyck
Copy link
Collaborator Author

jmdyck commented Jan 26, 2018

rebase done

spec.html Outdated
@@ -30395,7 +30395,7 @@ <h1>Runtime Semantics: UnicodeMatchProperty ( _p_ )</h1>
<p>Implementations must only recognize the property aliases listed in <emu-xref href="#table-nonbinary-unicode-properties"></emu-xref> and <emu-xref href="#table-binary-unicode-properties"></emu-xref>.</p>
<p>Implementations must only recognize the property value aliases and canonical property value names listed in <emu-xref href="#table-unicode-general-category-values"></emu-xref> and <emu-xref href="#table-unicode-script-values"></emu-xref>.</p>
<emu-note>
<p>For example, `Script_Extensions` (property name) and `scx` (property alias) are valid, but `script_extensions` or `Scx` arent.</p>
<p>For example, `Script_Extensions` (property name) and `scx` (property alias) are valid, but `script_extensions` or `Scx` aren't.</p>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typographically, is correct here (and ' isn't); why these changes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistency with the rest of the spec. (scan for /[a-z]'[a-z]/)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, that's unfortunate.

spec.html Outdated
@@ -30415,7 +30415,7 @@ <h1>Runtime Semantics: UnicodeMatchPropertyValue ( _p_, _v_ )</h1>
</emu-alg>
<p>Only the canonical property values and property value aliases listed in <emu-xref href="#table-unicode-general-category-values"></emu-xref> and <emu-xref href="#table-unicode-script-values"></emu-xref> must be recognized.</p>
<emu-note>
<p>For example, `Xpeo` and `Old_Persian` are valid `Script_Extension` values, but `xpeo` and `Old Persian` arent.</p>
<p>For example, `Xpeo` and `Old_Persian` are valid `Script_Extension` values, but `xpeo` and `Old Persian` aren't.</p>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(also here)

spec.html Outdated
@@ -30549,7 +30549,7 @@ <h1>CharacterClassEscape</h1>
1. Return the CharSet containing all Unicode code points whose character database definition includes the property &ldquo;General_Category&rdquo; with value _LoneUnicodePropertyNameOrValue_.
1. Let _p_ be ! UnicodeMatchProperty(_LoneUnicodePropertyNameOrValue_).
1. Assert: _p_ is a binary Unicode property or binary property alias listed in the &ldquo;Property name and aliases&rdquo; column of <emu-xref href="#table-binary-unicode-properties"></emu-xref>.
1. Return the CharSet containing all Unicode code points whose character database definition includes the property _p_ with value |True|.
1. Return the CharSet containing all Unicode code points whose character database definition includes the property _p_ with value &ldquo;True&rdquo;.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the value true be lowercased?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current casing is correct. The binary property values according to the Unicode Standard are N/No/F/False and Y/Yes/T/True.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha, I read it as "JS value" :-) thanks for clarifying.

Copy link
Member

@ljharb ljharb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but def needs more eyes

@jmdyck
Copy link
Collaborator Author

jmdyck commented Jan 27, 2018

added two more commits re lookbehind assertions, attn @mathiasbynens

@jmdyck jmdyck force-pushed the editorial branch 2 times, most recently from a2bcffb to e6f8436 Compare February 2, 2018 02:02
jmdyck added 11 commits February 8, 2018 11:46
Specifically, add a [lookahead <! HexDigit] restriction to the
two right-hand sides:
    u { NotCodePoint
    u { CodePoint
to force them to consume all the HexDigits there.
Otherwise, the NotEscapeSequence could end 'early',
causing 'TemplateCharacters' to be ambiguous.
(Could go either way. I chose to resolve it in the direction of smaller diff.)
... because that RHS made it ambiguous.

(Resolves issue tc39#1059.)
jmdyck added 23 commits February 8, 2018 11:46
... because it's already derived by UnicodeIDContinue RHS.

(See issue tc39#1059.)
... so that it doesn't look like PromiseResolve
is a property of the Promise Constructor.
... in prodns for AtomEscape and ClassEscape.

(Please check that [?U] is correct, and not [+U] or [~U].)
... from non-defining CharacterClassEscape prodns.
... because backtick is for ECMAScript code.
... because pipe is for nonterminals.
... from UnicodeMatchProperty to UnicodeMatchPropertyValue.
... for UnicodeMatchProperty and UnicodeMatchPropertyValue.
(And make it parallel between the two.)
... for UnicodeMatchProperty and UnicodeMatchPropertyValue
(as in typeof, DateString, GetSubstitution)
(The spec tends to put a nonterminal's definition
after its right-hand-side uses.)
_UnicodePropertyName_, _UnicodePropertyValue_, and _LoneUnicodePropertyNameOrValue_
are invalid because UnicodePropertyName etc are nonterminals,
not metavariables.

And simply changing them to |UnicodePropertyName| etc
wouldn't be valid, because that would be passing a Parse Node
to UnicodeMatchProperty/UnicodeMatchPropertyValue,
which isn't what they're expecting.

So instead, use the SourceText operation to 'extract'
the List of code points for each.
... to resolve BackreferenceMatcher's reference to _direction_ at step 1.f
The note's content was marked up as if the code sample
would be rendered inline, but it's a <pre>, so it'll be a block.
... so that the <td> element doesn't have both inline and block content.
@jmdyck
Copy link
Collaborator Author

jmdyck commented Feb 8, 2018

Added 3 markup commits re #890, attn @littledan.

(I should have caught these in the branch review I did 5 days ago, but I guess I only looked at the "reparse" -> "covering" changes.)

@littledan
Copy link
Member

Markup commits for #890 LGTM

@@ -30298,12 +30298,12 @@ <h1>Atom</h1>
1. Let _xe_ be _x_'s _endIndex_.
1. Let _ye_ be _y_'s _endIndex_.
1. If _direction_ is equal to +1, then
1. Assert: _xe_ &lte; _ye_.
1. Let _s_ be a fresh List whose characters are the characters of _Input_ at indices _xe_ (inclusive) through _ye_ (exclusive).
1. Assert: _xe_ &le; _ye_.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the intention here was less than or equal to. While gte/lte are not entities, &le;= would preserve the intent. Is < a better assert? /cc @littledan

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never mind I don't know HTML entities.

@bterlson bterlson merged commit 528eb61 into tc39:master Feb 13, 2018
@jmdyck
Copy link
Collaborator Author

jmdyck commented Feb 13, 2018

Yay! Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants