Fix compiling v set notation to u with unicode properties #70
Transforming unicodeFlag requires transforming unicodePropertyEscapes, because a unicode property escape is valid only in /u and /v mode. In other words, it is impossible to transform the unicode flag to an ES5 regex while preserving the unicode property escape.
However, this is not the case for the /v flag. For example, /\p{ASCII}/v and /\p{ASCII}/u are equivalent. I would expect regexpu to transform the former into the latter, i.e. the current behavior, so that the output can run on targets with native unicode property escape support. However, this PR further transforms /\p{ASCII}/v to /[\x00-\x7f]/. Although that could solve the linked Babel issue, we could do better, e.g. only transform unicode property escapes when they are involved in set notation, such as /[\p{ASCII}&&\p{Decimal_Number}]/v.
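To make the comparison in this comment concrete, here is a minimal sketch in plain JavaScript (it is not regexpu-core code). It checks that \p{ASCII} under /u matches the same code points as [\x00-\x7f], and that the intersection [\p{ASCII}&&\p{Decimal_Number}] reduces to the digits 0-9. The helper name matchedCodePoints is hypothetical, the scan is limited to a small code point range for speed, and the /v pattern is built with new RegExp inside a try block because not every engine supports that flag.

```javascript
// Hypothetical helper: list the code points in [0, max] that a regex matches.
function matchedCodePoints(regex, max = 0xfff) {
  const out = [];
  for (let cp = 0; cp <= max; cp++) {
    if (regex.test(String.fromCodePoint(cp))) out.push(cp);
  }
  return out;
}

// \p{ASCII} needs the u (or v) flag; the explicit range works without it.
const viaPropertyEscape = matchedCodePoints(/^\p{ASCII}$/u);
const viaExplicitRange = matchedCodePoints(/^[\x00-\x7f]$/);
const sameSet =
  viaPropertyEscape.length === viaExplicitRange.length &&
  viaPropertyEscape.every((cp, i) => cp === viaExplicitRange[i]);

// Set intersection has no direct u-mode syntax. A lookahead emulates it here,
// which shows why a plain v-to-u flag swap is not enough for set notation.
const digitsViaLookahead = matchedCodePoints(/^(?=\p{ASCII})\p{Decimal_Number}$/u);

// Native set notation, only where the engine accepts the v flag.
let viaSetNotation = null;
try {
  const nativeV = new RegExp('^[\\p{ASCII}&&\\p{Decimal_Number}]$', 'v');
  viaSetNotation = matchedCodePoints(nativeV);
} catch (error) {
  // The v flag is not supported by this engine; skip the native comparison.
}

console.log(sameSet, digitsViaLookahead.map((cp) => String.fromCodePoint(cp)).join(''));
```

The lookahead form is only an illustration of the semantics; the PR itself expands the property escapes into explicit ranges.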
I see what you mean, but I haven't figured out how to do it yet. I plan to put it in the next PR.
regexpu-core/rewrite-pattern.js Line 412 in 3515c6b
It seems to me we can mark the character escape as transformed when the (config.transform.unicodeSetsFlag && (nestedData.maybeIncludesStrings || characterClassItem.kind !== "union"))
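The suggested condition can be read as a small predicate. The sketch below is an illustration only: the function name mustExpandCharacterEscape and its parameters are hypothetical stand-ins for the identifiers quoted above, and the class kind value "intersection" is an assumption about how the parser labels set operations.

```javascript
// Hypothetical predicate mirroring the suggested condition.
// transformUnicodeSetsFlag ~ config.transform.unicodeSetsFlag
// maybeIncludesStrings     ~ nestedData.maybeIncludesStrings
// classKind                ~ characterClassItem.kind
function mustExpandCharacterEscape(transformUnicodeSetsFlag, maybeIncludesStrings, classKind) {
  return Boolean(
    transformUnicodeSetsFlag && (maybeIncludesStrings || classKind !== 'union')
  );
}

// /[\p{ASCII}]/v: a plain union without strings, so \p{ASCII} can stay as is
// and only the flag needs to change from v to u.
const plainUnion = mustExpandCharacterEscape(true, false, 'union');

// /[\p{ASCII}&&\p{Decimal_Number}]/v: set notation has no u-mode equivalent,
// so the escapes inside it have to be expanded. ("intersection" is assumed.)
const setOperation = mustExpandCharacterEscape(true, false, 'intersection');

// The v flag is not being transformed at all: nothing to expand.
const flagKept = mustExpandCharacterEscape(false, false, 'intersection');

console.log(plainUnion, setOperation, flagKept);
```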
@JLHwung Thanks! This is amazing.
Since these two changes are relatively small, I put them in a PR. 🙂
@@ -700,7 +700,7 @@ const rewritePattern = (pattern, flags, options) => {
  config.transform.unicodeSetsFlag = config.flags.unicodeSets && transform(options, 'unicodeSetsFlag');

  // unicodeFlag: 'transform' implies unicodePropertyEscapes: 'transform'
- config.transform.unicodePropertyEscapes = config.flags.unicode && (
+ config.transform.unicodePropertyEscapes = (config.flags.unicode || config.flags.unicodeSets) && (
I am still not convinced that we should always transform unicode property escapes if the unicodeSets flag is set. Can you add a new test case? /[\p{ASCII}]/v should be transformed to /[\p{ASCII}]/u. Of course, if the user wants to further transform /u, then we should transform \p{ASCII}, too.
Ah I see, there is a transform(options, 'unicodePropertyEscapes') check later, so that unicodePropertyEscapes: false will still preserve the property escapes.
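A minimal sketch of how that combined check could behave, under stated assumptions: the diff above is cut off after the opening parenthesis, so the second operand here is inferred from the code comment and from this remark, not copied from regexpu-core. The transform helper is assumed to test for the string 'transform', and shouldTransformPropertyEscapes is a hypothetical name.

```javascript
// Assumed helper: an option is active only when it is set to 'transform'.
const transform = (options, name) => options[name] === 'transform';

// Hypothetical standalone version of the changed assignment.
function shouldTransformPropertyEscapes(flags, options) {
  return Boolean(
    (flags.unicode || flags.unicodeSets) &&
      // Assumed continuation: unicodeFlag: 'transform' implies
      // unicodePropertyEscapes: 'transform'.
      (transform(options, 'unicodeFlag') || transform(options, 'unicodePropertyEscapes'))
  );
}

const vFlags = { unicode: false, unicodeSets: true };

// Only the v flag is transformed: \p{ASCII} is preserved, so
// /[\p{ASCII}]/v can become /[\p{ASCII}]/u.
const keepEscapes = shouldTransformPropertyEscapes(vFlags, {
  unicodeSetsFlag: 'transform',
  unicodePropertyEscapes: false,
});

// The user also asks for the u flag to be transformed: escapes must go too.
const dropEscapes = shouldTransformPropertyEscapes(vFlags, {
  unicodeSetsFlag: 'transform',
  unicodeFlag: 'transform',
});

console.log(keepEscapes, dropEscapes);
```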
Can you check if this PR also fixes babel/babel#15193 (comment)?
Yes, I originally opened this PR to fix babel/babel#15193 (comment).
If so can you add And also the union of two property escape, e.g.
I added it here, should be the same?
Not quite the same,
✅ pending test case updates.
Co-authored-by: Huáng Jùnliàng <[email protected]>
Thanks!
Fixes: babel/babel#15193 (comment)
Since the v and u flags are mutually exclusive, and I read in the V8 blog that there is no reason not to replace u with v, I guess they can maintain similar behavior here. (I'm not familiar with regular expressions, sorry if I'm wrong.)