-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RegExp named group tests #998
Conversation
Tests against the Stage 3 named capture groups proposal https://tc39.github.io/proposal-regexp-named-groups These tests are mostly by @schuay, from V8's tests for the feature
|
||
// @@replace with a callable replacement argument (no named captures). | ||
{ | ||
let result = "abcd".replace(/(.)(.)/u, (match, fst, snd, offset, str) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the {}
wrappers are a perfect indication these tests should be split into multiple files. There's already the description for each part.
It would be great to provide a info
tag containing the spec parts allowing each cases. It's a valuable reference to match the validity for each assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I broke up some tests further and added an info tag to some tests.
features: [regexp-named-groups] | ||
---*/ | ||
|
||
assert.throws(SyntaxError, () => eval("/(?<>a)/u"), "Empty name"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for each assertion message.
assert.sameValue("a", /(?<π>a)/.exec("bab").groups.\u03C0); | ||
assert.sameValue("a", /(?<$>a)/.exec("bab").groups.$); | ||
assert.sameValue("a", /(?<_>a)/.exec("bab").groups._); | ||
assert.sameValue("a", /(?<$𐒤>a)/.exec("bab").groups.$𐒤); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be an early syntax error, because 𐒤 is parsed as two separate code units in non-Unicode mode.
https://tc39.github.io/ecma262/#sec-pattern-semantics
If a BMP pattern contains a non-BMP SourceCharacter the entire pattern is encoded using UTF-16 and the individual code units of that encoding are used as the elements of the List.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anba Good catch; do you think we should enforce this, or change RegExpIdentifierStart and RegExpIdentifierContinue to allow surrogate pairs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we allow 𐒤 in BMP patterns, we probably also need to allow \ud801\udca4
(and maybe even \u{104a4}
) for consistency reasons, because it'd strange to accept a surrogate pair only if it's written in non-escaped form. (If we want to keep it simple, we can forbid 𐒤 for the time being, and if there's any user demand, we can later revisit this decision and allow 𐒤. This won't cause any compatibility issues, because we'd only change an error into a non-error.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, fixed the test
- Broke up some tests - Added more frontmatter From here, it would still be possible to add more frontmatter, break up tests more, and add more Proxy tests for interaction with RegExp subclassing
Thanks @anba for pointing this out
Tests against the Stage 3 named capture groups proposal
https://tc39.github.io/proposal-regexp-named-groups
These tests are mostly by @schuay, from V8's tests for the feature