-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Non-ASCII character support #45736
Non-ASCII character support #45736
Conversation
test const isLiteralSymbol = (char) => {
const code = char.charCodeAt(0);
if (code > 127) {
return true;
}
if (char >= '0' && char <= 9) {
return false;
}
if (code === 35) {
return false;
}
return true;
} output isLiteralSymbol('أهلا')
true |
@anonrig I applied the changes, thank you very much for the review 🚀 |
@manekinekko Can you review this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the change LGTM, but this seems to break quite a few tests
assert.strictEqual(isLiteralSymbol('#'), false); | ||
assert.strictEqual(isLiteralSymbol('أ'), true); | ||
assert.strictEqual(isLiteralSymbol('ت'), true); | ||
assert.strictEqual(isLiteralSymbol('ث'), true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you:
- group these into 2 groups: literals (
true
) and non-literals (false
)? - add more samples of non-Latin characters? you can cherry-pick from this list here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@manekinekko thank you so much i will fly here 🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@manekinekko would that be healthy
{
const literals = ['A', 'a', '-', '+', 'أ', 'ت', 'ث', '讲', '演', '講'];
const nonLiterals = ['0', '#', '\\', '+', '-'];
literals.forEach((literal) => {
assert.strictEqual(isLiteralSymbol(literal), true);
});
nonLiterals.forEach((nonLiteral) => {
assert.strictEqual(isLiteralSymbol(nonLiteral), false);
});
}
I'm so sorry I triggered you all to review, sorry for the extra notifications :/ |
@manekinekko I sent the edits thank you very much |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we really accept zero width characters as acceptable input? I would skip all of them. We also already have a function to check for these:
node/lib/internal/util/inspect.js
Lines 2388 to 2398 in ab064d1
const isZeroWidthCodePoint = (code) => { | |
return code <= 0x1F || // C0 control codes | |
(code >= 0x7F && code <= 0x9F) || // C1 control codes | |
(code >= 0x300 && code <= 0x36F) || // Combining Diacritical Marks | |
(code >= 0x200B && code <= 0x200F) || // Modifying Invisible Characters | |
// Combining Diacritical Marks for Symbols | |
(code >= 0x20D0 && code <= 0x20FF) || | |
(code >= 0xFE00 && code <= 0xFE0F) || // Variation Selectors | |
(code >= 0xFE20 && code <= 0xFE2F) || // Combining Half Marks | |
(code >= 0xE0100 && code <= 0xE01EF); // Variation Selectors | |
}; |
I didn't accept zero-width characters @BridgeAR i will update the code like this if (typeof char !== 'string' || util.inspect.isZeroWidthCodePoint(char)) {
return false;
} can I do it like this? |
@mertcanaltin I think that would be ok. |
78c70a1
to
4697163
Compare
e05f17f
to
eeb6f02
Compare
CC @nodejs/test_runner @manekinekko additional reviews will be appreciated |
@anonrig can you please dismiss your review/approve? |
Landed in 3c6547f |
🎉 |
PR-URL: nodejs#45736 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>
PR-URL: #45736 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>
PR-URL: #45736 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>
PR-URL: nodejs#45736 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>
PR-URL: nodejs#45736 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>
PR-URL: nodejs#45736 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>
PR-URL: #45736 Backport-PR-URL: #46839 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>
PR-URL: #45736 Backport-PR-URL: #46839 Reviewed-By: Yagiz Nizipli <[email protected]> Reviewed-By: Moshe Atlow <[email protected]>
Fixes #45706
Fixes #46508