Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editorial: Extract operation ParseText #2013

Merged
merged 2 commits into from
Dec 1, 2020
Merged

Conversation

jmdyck
Copy link
Collaborator

@jmdyck jmdyck commented May 25, 2020

(This is the PR foreshadowed here. Attn @michaelficarra)

It's only a draft pull request so far, because:
(a) Some of the prose is first draft.
(b) I haven't used ParseText in IsValidRegularExpressionLiteral and RegExpInitialize yet, because that would just raise merge conflicts when #1866 is merged. Simpler to just wait.

@jmdyck jmdyck force-pushed the ParseText branch 2 times, most recently from 1e2b425 to b1a3492 Compare May 26, 2020 14:07
spec.html Outdated
Comment on lines 10462 to 10463
1. Attempt to parse _sourceText_ using _goalSymbol_ as the goal symbol, and analyse the parse result for any Early Error conditions.
1. NOTE: In the preceding step, parsing and Early Error detection may be interleaved in an implementation-dependent manner.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would combine these lines.

Suggested change
1. Attempt to parse _sourceText_ using _goalSymbol_ as the goal symbol, and analyse the parse result for any Early Error conditions.
1. NOTE: In the preceding step, parsing and Early Error detection may be interleaved in an implementation-dependent manner.
1. Attempt to parse _sourceText_ using _goalSymbol_ as the goal symbol, and analyse the parse result for any Early Error conditions. Parsing and Early Error detection may be interleaved in an implementation-dependent manner.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup.

@michaelficarra
Copy link
Member

I think the capitalisation of Early Errors is inconsistent here.

spec.html Outdated
1. Let _strict_ be ContainsUseStrict of _body_.
1. If any static semantics errors are detected for _parameters_ or _body_, throw a *SyntaxError* exception. If _strict_ is *true*, the Early Error rules for <emu-grammar>UniqueFormalParameters : FormalParameters</emu-grammar> are applied.
1. If _strict_ is *true*, apply the Early Error rules for <emu-grammar>UniqueFormalParameters : FormalParameters</emu-grammar>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. If _strict_ is *true*, apply the Early Error rules for <emu-grammar>UniqueFormalParameters : FormalParameters</emu-grammar>.
1. If _strict_ is *true*, apply the Early Error rules for <emu-grammar>UniqueFormalParameters : FormalParameters</emu-grammar> to _parameters_.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's interesting, the status quo doesn't specify what the rules should be applied to. There's similar wording elsewhere in the spec, but it's always in the context of an Early Errors clause, where the 'target' is presumably the instance of the header production that's currently being examined.

So, yup, I'll add this.

spec.html Outdated
@@ -32512,11 +32532,12 @@ <h1>Static Semantics: ParsePattern ( _patternText_, _u_ )</h1>
<p>The abstract operation ParsePattern takes arguments _patternText_ (a sequence of Unicode code points) and _u_ (a Boolean). It performs the following steps when called:</p>
<emu-alg>
1. If _u_ is *true*, then
1. Parse _patternText_ using the grammars in <emu-xref href="#sec-patterns"></emu-xref>. The goal symbol for the parse is |Pattern[+U, +N]|.
1. Let _parseResult_ be ParseText(_patternText_, |Pattern[+U, +N]|).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just return directly here. Similarly below.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried out the direct returns and found I didn't like the result.

@bakkot
Copy link
Contributor

bakkot commented May 26, 2020

@jmdyck Ecmarkup is complaining because the step

1. Let _constructorText_ be the source text
  <pre><code class="javascript">constructor() {}</code></pre>

ends in a tag. It expects all steps to end in text. (Previously it ended in "using the syntactic grammar with the goal symbol |MethodDefinition[~Yield, ~Await]|.", i.e. a period, which is text.)

I guess adding a trailing period there would be pretty awkward, so I'll add an exception to Ecmarkup for this particular case.

@ljharb
Copy link
Member

ljharb commented May 26, 2020

Why not add a period to the end?

@bakkot
Copy link
Contributor

bakkot commented May 26, 2020

Because it would be on its own line, which would look pretty weird.

@michaelficarra
Copy link
Member

I'm okay with that.

@jmdyck
Copy link
Collaborator Author

jmdyck commented May 26, 2020

@jmdyck Ecmarkup is complaining because the step

1. Let _constructorText_ be the source text
  <pre><code class="javascript">constructor() {}</code></pre>

ends in a tag. It expects all steps to end in text.

Ah, got it. Is that why you need to special-case embedded <figure> elements?

I guess adding a trailing period there would be pretty awkward, so I'll add an exception to Ecmarkup for this particular case.

Would it help to end the previous line with a colon? (As the quasi-signal that something indented and possibly unusual follows.)

@bakkot
Copy link
Contributor

bakkot commented May 26, 2020

Ah, got it. Is that why you need to special-case embedded <figure> elements?

Yup.

Would it help to end the previous line with a colon?

Not as currently implemented. I could make an allowance for that, but I think I'd prefer to just not make assertions about lines which end in pre, if we're tweaking ecmarkup. But, for now, can you try just adding a . after the </pre>, as @michaelficarra wants, so we can see what that looks like?

@bakkot
Copy link
Contributor

bakkot commented May 26, 2020

Yeah, that does indeed look pretty weird. @michaelficarra thoughts?

@jmdyck
Copy link
Collaborator Author

jmdyck commented May 26, 2020

But, for now, can you try just adding a . after the </pre>, as @michaelficarra wants, so we can see what that looks like?

Done.

(I added the period on its own line in the source. Not sure if that's wanted. I suppose the whole thing could be one line:

1. Let _constructorText_ be the source text <pre>...</pre>.

and it would probably render about the same.)

@michaelficarra
Copy link
Member

@bakkot I'm okay either way.

@bakkot
Copy link
Contributor

bakkot commented May 28, 2020

@jmdyck #2016 has been merged, which should allow you to end lines in <pre> tags, as you'd originally had. If you rebase on master and delete the periods, it should pass this time around.

Apologies for the churn.

@jmdyck
Copy link
Collaborator Author

jmdyck commented Oct 15, 2020

I think the capitalisation of Early Errors is inconsistent here.

Yup, because the status quo spec is inconsistent. I'll use lower-case consistently in the lines that this PR touches.

spec.html Outdated
@@ -38154,7 +38177,9 @@ <h1>JSON.parse ( _text_ [ , _reviver_ ] )</h1>
1. Let _jsonString_ be ? ToString(_text_).
1. Parse ! UTF16DecodeString(_jsonString_) as a JSON text as specified in ECMA-404. Throw a *SyntaxError* exception if it is not a valid JSON text as defined in that specification.
1. Let _scriptString_ be the string-concatenation of *"("*, _jsonString_, and *");"*.
1. Let _completion_ be the result of parsing and evaluating ! UTF16DecodeString(_scriptString_) as if it was the source text of an ECMAScript |Script|. The extended PropertyDefinitionEvaluation semantics defined in <emu-xref href="#sec-__proto__-property-names-in-object-initializers"></emu-xref> must not be used during the evaluation.
1. Let _script_ be ParseText(! UTF16DecodeString(_scriptString_), |Script|).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. Let _script_ be ParseText(! UTF16DecodeString(_scriptString_), |Script|).
1. Let _script_ be ParseText(! UTF16DecodeString(_scriptString_), |Script|). The Early Error defined in <emu-xref href="#sec-__proto__-property-names-in-object-initializers"></emu-xref> must not be applied during the parse.

The original text also fails to mention that that early error mustn't be applied, which is a bug, but we should fix it here.

(The extended PropertyDefinitionEvaluation semantics also need not to apply, which is covered, but the error needs its own mention, not just the evaluation semantics.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the bugfix qualify as editorial?

Copy link
Contributor

@bakkot bakkot Oct 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, that's tricky. I guess we should leave this change out for now and the editors will make the change as a normative commit (but probably without asking for consensus) once this PR lands.

Edit: opened #2206.

@jmdyck
Copy link
Collaborator Author

jmdyck commented Oct 15, 2020

force-pushed to resolve merge conflicts and:

  • join two steps in ParseText;
  • use lower-case "early error" in lines touched by this PR;
  • add to _parameters_ in CreateDynamicFunction;
  • remove the two single-period lines; and
  • tighten up the wording of the 'If' step in ParseText.

@jmdyck jmdyck marked this pull request as ready for review October 15, 2020 21:04
@ljharb ljharb requested review from syg, ljharb and a team and removed request for ljharb October 15, 2020 21:50
@ljharb ljharb self-assigned this Oct 15, 2020
Copy link
Contributor

@syg syg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, nice refactoring

…2013)

PR tc39#1188 deleted the text "but using the alternative definition of |DoubleStringCharacter| provided below" from step 4 of JSON.parse's algorithm.

At that point, the accompanying Note's phrase "as modified by Step 4 above" became obsolete.
@ljharb ljharb merged commit ff0b206 into tc39:master Dec 1, 2020
@jmdyck jmdyck deleted the ParseText branch December 1, 2020 23:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants