Editorial: Stylize strings as values. #1733
 <p><emu-xref href="#sec-date.prototype.toisostring"></emu-xref>: If the year cannot be represented using the Date Time String Format specified in <emu-xref href="#sec-date-time-string-format"></emu-xref> a RangeError exception is thrown. Previous editions did not specify the behaviour for that case.</p>
 <p><emu-xref href="#sec-date.prototype.tostring"></emu-xref>: Previous editions did not specify the value returned by `Date.prototype.toString` when this time value is *NaN*. ECMAScript 2015 specifies the result to be the String value *"Invalid Date"*.</p>
-<p><emu-xref href="#sec-regexp-pattern-flags"></emu-xref>, <emu-xref href="#sec-escaperegexppattern"></emu-xref>: Any LineTerminator code points in the value of the `"source"` property of a RegExp instance must be expressed using an escape sequence. Edition 5.1 only required the escaping of `"/"`.</p>
+<p><emu-xref href="#sec-regexp-pattern-flags"></emu-xref>, <emu-xref href="#sec-escaperegexppattern"></emu-xref>: Any LineTerminator code points in the value of the *"source"* property of a RegExp instance must be expressed using an escape sequence. Edition 5.1 only required the escaping of `/`.</p>
Note: The removal of quotes around the `/` here is just to (hopefully?) simplify conflict resolution with #1724.
I'd say you should prefer a coherent commit over avoiding merge conflicts.
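For readers following the hunk above, here is a minimal sketch of the runtime behaviour those Annex notes describe, assuming an ES2015-or-later engine. The variable names are illustrative only and do not come from the spec.

```javascript
// A Date whose time value is NaN.
const invalid = new Date(NaN);

// ES2015 pins this result down as the String value "Invalid Date".
const invalidString = invalid.toString();

// toISOString throws a RangeError when the time value cannot be formatted.
let isoError = null;
try {
  invalid.toISOString();
} catch (e) {
  isoError = e;
}

// The "source" property escapes "/" (already required by Edition 5.1) ...
const slashSource = new RegExp("/").source;

// ... and must not contain a raw LineTerminator code point.
const newlineSource = new RegExp("\n").source;
const hasRawLineTerminator = /[\n\r\u2028\u2029]/.test(newlineSource);

console.log(invalidString, isoError instanceof RangeError, slashSource, hasRawLineTerminator);
```

Note that `"source"` here is a property key (a String value), while `/` is a code point in the pattern text, which is the distinction the styling change is about.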
Apart from the three spots noted, this looks good to me.
spec.html
Outdated
@@ -4758,7 +4758,7 @@ <h1>Runtime Semantics: MV</h1>
 The MV of <emu-grammar>StrUnsignedDecimalLiteral ::: DecimalDigits ExponentPart</emu-grammar> is the MV of |DecimalDigits| times 10<sub>ℝ</sub><sup>_e_</sup>, where _e_ is the MV of |ExponentPart|.
 </li>
 </ul>
-<p>Once the exact MV for a String numeric literal has been determined, it is then rounded to a value of the Number type. If the MV is 0, then the rounded value is *+0* unless the first non white space code point in the String numeric literal is `"-"`, in which case the rounded value is *-0*. Otherwise, the rounded value must be the Number value for the MV (in the sense defined in <emu-xref href="#sec-ecmascript-language-types-number-type"></emu-xref>), unless the literal includes a |StrUnsignedDecimalLiteral| and the literal has more than 20 significant digits, in which case the Number value may be either the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit or the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit and then incrementing the literal at the 20th digit position. A digit is significant if it is not part of an |ExponentPart| and</p>
+<p>Once the exact MV for a String numeric literal has been determined, it is then rounded to a value of the Number type. If the MV is 0, then the rounded value is *+0* unless the first non white space code point in the String numeric literal is *"-"*, in which case the rounded value is *-0*. Otherwise, the rounded value must be the Number value for the MV (in the sense defined in <emu-xref href="#sec-ecmascript-language-types-number-type"></emu-xref>), unless the literal includes a |StrUnsignedDecimalLiteral| and the literal has more than 20 significant digits, in which case the Number value may be either the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit or the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit and then incrementing the literal at the 20th digit position. A digit is significant if it is not part of an |ExponentPart| and</p>
This is a code point, not a String value, so should be changed to one of:
`-`
U+002D (HYPHEN-MINUS)
`-` U+002D (HYPHEN-MINUS)
(There's precedent for each.)
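A small sketch of the rule under discussion, assuming an engine with `String.prototype.trimStart` (ES2019). The helper `firstCodePoint` is hypothetical and only illustrates why the spec text is talking about a code point rather than a String value.

```javascript
// U+002D (HYPHEN-MINUS) as a code point, not as the String "-".
const HYPHEN_MINUS = 0x002d;

// Hypothetical helper: first code point after leading white space.
const firstCodePoint = (text) => text.trimStart().codePointAt(0);

// An MV of 0 rounds to -0 only when the first non white space
// code point of the String numeric literal is U+002D.
const negativeZero = Number("  -0");
const positiveZero = Number("  0");

console.log(
  firstCodePoint("  -0") === HYPHEN_MINUS,
  Object.is(negativeZero, -0),
  Object.is(positiveZero, 0)
);
```

`Object.is` is used because `===` does not distinguish *+0* from *-0*.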
spec.html
Outdated
@@ -19025,7 +19025,7 @@ <h2>Syntax</h2>
 <emu-clause id="sec-directive-prologues-and-the-use-strict-directive">
 <h1>Directive Prologues and the Use Strict Directive</h1>
 <p>A <dfn id="directive-prologue">Directive Prologue</dfn> is the longest sequence of |ExpressionStatement|s occurring as the initial |StatementListItem|s or |ModuleItem|s of a |FunctionBody|, a |ScriptBody|, or a |ModuleBody| and where each |ExpressionStatement| in the sequence consists entirely of a |StringLiteral| token followed by a semicolon. The semicolon may appear explicitly or may be inserted by automatic semicolon insertion. A Directive Prologue may be an empty sequence.</p>
-<p>A <dfn id="use-strict-directive">Use Strict Directive</dfn> is an |ExpressionStatement| in a Directive Prologue whose |StringLiteral| is either the exact code unit sequences `"use strict"` or `'use strict'`. A Use Strict Directive may not contain an |EscapeSequence| or |LineContinuation|.</p>
+<p>A <dfn id="use-strict-directive">Use Strict Directive</dfn> is an |ExpressionStatement| in a Directive Prologue whose |StringLiteral| is either the exact code unit sequences *"use strict"* or *'use strict'*. A Use Strict Directive may not contain an |EscapeSequence| or |LineContinuation|.</p>
This sentence is odd, in particular the phrase:
> whose |StringLiteral| is either the exact code unit sequences ...
A |StringLiteral| is a Parse Node, or (by extension) the source text that it matches, but it's not a code unit sequence. So I recommend:
- change `code unit` to `code point` (or `code unit sequence` to `source text`),
- revert the backtick-to-asterisk change, and
- maybe insert `of` after `either`.
Sounds good; this is the line I was most conflicted over myself.
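To illustrate why the directive is a matter of source text rather than of the resulting String value, here is a hedged sketch. The `probe` helper is hypothetical; it relies on the `Function` constructor, whose bodies are strict only if they contain their own Use Strict Directive.

```javascript
// Hypothetical helper: runs `directive` as the first statement of a new
// function body and reports whether that body executed in strict mode
// (in strict code, `this` stays undefined for a plain call).
const probe = (directive) =>
  new Function(`${directive}; return this === undefined;`)();

// Both quote styles are Use Strict Directives.
const doubleQuoted = probe('"use strict"');
const singleQuoted = probe("'use strict'");

// Same String value, but the source text contains an EscapeSequence,
// so it is not a Use Strict Directive.
const escaped = probe('"use\\x20strict"');

console.log(doubleQuoted, singleQuoted, escaped);
```

The third case evaluates to the same String value as the first, which is why comparing String values would be the wrong criterion here.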
spec.html
Outdated
@@ -31624,7 +31624,7 @@ <h1>Runtime Semantics: Canonicalize ( _ch_ )</h1>
 <pre><code class="javascript">["baaabaac", "ba", undefined, "abaac"]</code></pre>
 </emu-note>
 <emu-note>
-<p>In case-insignificant matches when _Unicode_ is *true*, all characters are implicitly case-folded using the simple mapping provided by the Unicode standard immediately before they are compared. The simple mapping always maps to a single code point, so it does not map, for example, `"ß"` (U+00DF) to `"SS"`. It may however map a code point outside the Basic Latin range to a character within, for example, `"ſ"` (U+017F) to `"s"`. Such characters are not mapped if _Unicode_ is *false*. This prevents Unicode code points such as U+017F and U+212A from matching regular expressions such as `/[a-z]/i`, but they will match `/[a-z]/ui`.</p>
+<p>In case-insignificant matches when _Unicode_ is *true*, all characters are implicitly case-folded using the simple mapping provided by the Unicode standard immediately before they are compared. The simple mapping always maps to a single code point, so it does not map, for example, *"ß"* (U+00DF) to *"SS"*. It may however map a code point outside the Basic Latin range to a character within, for example, *"ſ"* (U+017F) to *"s"*. Such characters are not mapped if _Unicode_ is *false*. This prevents Unicode code points such as U+017F and U+212A from matching regular expressions such as `/[a-z]/i`, but they will match `/[a-z]/ui`.</p>
These are code points, not string values.
This particular line is some remaining confusion from #1724, as I thought we decided not to remove the quotes, but I ought to have asked that more explicitly over there.
I don't think we made an explicit decision on this case in #1724, I just failed to flag it as code points there.
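The behaviour described in the note quoted in this thread can be checked directly. The following is a minimal sketch with illustrative constant names, assuming an engine that supports the `u` flag.

```javascript
const LONG_S = "\u017F";      // LATIN SMALL LETTER LONG S
const KELVIN_SIGN = "\u212A"; // KELVIN SIGN

const results = {
  // Without the u flag these code points are not mapped into Basic Latin.
  longSWithoutU: /[a-z]/i.test(LONG_S),
  kelvinWithoutU: /[a-z]/i.test(KELVIN_SIGN),

  // With the u flag, simple case folding maps them to "s" and "k".
  longSWithU: /[a-z]/ui.test(LONG_S),
  kelvinWithU: /[a-z]/ui.test(KELVIN_SIGN),

  // Simple folding maps to a single code point, so U+00DF never matches "SS".
  sharpSMatchesSS: /^\u00DF$/ui.test("SS"),
};

console.log(results);
```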
Force-pushed from 0b43d49 to ca7f823.
The net effect is good now, but I think, for clarity, it would be better to restructure the PR as two commits:
@ljharb, what do you think?
PRs are generally squash-merged though, right? So in that sense, it might be better to move the third commit here over to #1724 (since the merge button hasn't been hit on it yet) and ensure that those lines aren't touched in this PR. Edit: I now see that #1734 is an example of a PR where we didn't squash, so I suppose either way should work.
Ah! I totally forgot that #1724 was still open. Moving those cases there makes sense.
Force-pushed from ca7f823 to 242d2cf.
Moved third commit to #1724 (and dropped the problematic lines from the first commit here).
Almost there!
spec.html
Outdated
@@ -31793,7 +31793,7 @@ <h1>CharacterClassEscape</h1>
 <p>The production <emu-grammar>UnicodePropertyValueExpression :: LoneUnicodePropertyNameOrValue</emu-grammar> evaluates as follows:</p>
 <emu-alg>
 1. Let _s_ be SourceText of |LoneUnicodePropertyNameOrValue|.
-1. If ! UnicodeMatchPropertyValue(`"General_Category"`, _s_) is identical to a List of Unicode code points that is the name of a Unicode general category or general category alias listed in the “Property value and aliases” column of <emu-xref href="#table-unicode-general-category-values"></emu-xref>, then
+1. If ! UnicodeMatchPropertyValue(*"General_Category"*, _s_) is identical to a List of Unicode code points that is the name of a Unicode general category or general category alias listed in the “Property value and aliases” column of <emu-xref href="#table-unicode-general-category-values"></emu-xref>, then
Found another! UnicodeMatchPropertyValue
"takes two parameters ..., each of which is a List of Unicode code points". So here, "General_Category" represents a List of code points, not a String value. (Ditto the occurrence on the next line.) This case should probably be handled in #1724.
Good catch! I don't think the next line applies though, as it's referring to this (that's also the reason for “True” three lines after that).
> I don't think the next line applies though
Okay, I can go along with that.
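For context, the algorithm step in this hunk is what lets a lone General_Category value stand in for the full `General_Category=` form in a property escape. A short sketch, assuming an engine with Unicode property escape support (ES2018); the regex variable names are illustrative.

```javascript
// Lone value: resolved against General_Category by the step quoted above.
const lone = /^\p{Lu}$/u;

// Explicit property name with the same value.
const explicit = /^\p{General_Category=Lu}$/u;

// Property alias and long-form value name.
const aliased = /^\p{gc=Uppercase_Letter}$/u;

const samples = ["A", "a"];
const table = samples.map((ch) => [
  ch,
  lone.test(ch),
  explicit.test(ch),
  aliased.test(ch),
]);

console.log(table);
```

All three patterns should agree on every input, since they name the same general category.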
Force-pushed from 242d2cf to a0bcce7.
Removed those two changes from this PR.
Are there plans to fix up all the active proposals?
Force-pushed from a0bcce7 to 75f27e8.
Surely not for proposal spec text. The old convention doesn't impede readability and understanding. But of course we should take care to apply the new convention when merging proposals into master.
More a rubberstamp than review.
Force-pushed from 75f27e8 to 55707d0.
#1725 included a new section on Value Notation; this PR enacts it by updating `"foo"` → *"foo"* throughout the spec. This means that all ECMAScript language values shall now be stylized as such.