EncodeForRegExpEscape should not return results that require particular flags #69
Comments
Whitespace is escaped to leave room for /x mode regexps in the future.
So to make sure I understand the issue properly, this would be solved if done by code units, and not code points?
Yes, but I think there is a possibility that a
ljharb added a commit that referenced this issue on Mar 27, 2024
EncodeForRegExpEscape step 4.e (which would be reached if input c were a Space_Separator supplementary code point in [U+10000, U+10FFFF]) results in a return value like `\u{…}`. The interpretation of such pattern text depends on regular expression flags. Specifically, it is interpreted as a |RegExpUnicodeEscapeSequence| that matches the code point with the contained hexadecimal value in the presence of a "u" or "v" flag, but otherwise is interpreted as either a syntax error or (only in a host supporting Annex B and only when the hexadecimal representation of the code point consists only of decimal digits) as a quantified |ExtendedAtom| "u" with the specified decimal count of repetitions (e.g., `/^\u{10000}$/.test("u".repeat(10000))` is true).

Rather than returning results subject to conditional interpretation, EncodeForRegExpEscape should return a `\u…\u…` surrogate pair |RegExpUnicodeEscapeSequence| for such inputs. That form works in both Unicode and non-Unicode regular expressions, e.g. `/^\uD834\uDF06$/u.test("𝌆")` and `/^\uD834\uDF06$/v.test("𝌆")` and `/^\uD834\uDF06$/.test("𝌆")` are all true.

Or alternatively (and preferably IMO), EncodeForRegExpEscape should not escape all white space. I'm not certain why it does so right now, but looking back I suspect it is due to a misinterpretation of #30 (which requests escaping of control characters, and even more specifically line terminators, and even that isn't necessary).
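
To make the flag dependence concrete, here is a minimal sketch of the surrogate-pair approach. The helper names `escapeCodeUnit`, `escapeAsCodeUnits`, and `matchesWithFlags` are hypothetical and are not taken from the proposal text. The sketch only illustrates escaping by UTF-16 code unit rather than by code point, and it leaves out the "v" flag because older engines reject it.

```javascript
// Hypothetical helper: format one UTF-16 code unit as \uXXXX pattern text.
function escapeCodeUnit(unit) {
  return "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
}

// Hypothetical helper: escape a code point as one \uXXXX sequence per code unit,
// so a supplementary code point becomes a surrogate pair of escapes.
function escapeAsCodeUnits(codePoint) {
  const s = String.fromCodePoint(codePoint);
  let out = "";
  for (let i = 0; i < s.length; i++) {
    out += escapeCodeUnit(s.charCodeAt(i));
  }
  return out;
}

// Hypothetical helper: anchor the pattern text and test it under the given flags.
// A pattern that fails to compile is treated as "does not match".
function matchesWithFlags(patternText, flags, input) {
  try {
    return new RegExp("^" + patternText + "$", flags).test(input);
  } catch (e) {
    return false; // e.g. a SyntaxError in a host without Annex B
  }
}

const input = String.fromCodePoint(0x1D306); // U+1D306, the "𝌆" from the examples

// Surrogate-pair form: the same pattern text matches with and without "u".
const escaped = escapeAsCodeUnits(0x1D306);
const surrogateResults = ["", "u"].map((flags) =>
  matchesWithFlags(escaped, flags, input)
);

// Brace form: matches the code point only when the "u" flag is present.
const braceForm = "\\u{1D306}";
const braceWithU = matchesWithFlags(braceForm, "u", input);
const braceWithoutFlag = matchesWithFlags(braceForm, "", input);

console.log(escaped, surrogateResults, braceWithU, braceWithoutFlag);
```

Escaping per code unit means the escaped text never has to know which flags the eventual regular expression will use, which is the property this issue asks for.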