-
-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance - Drastically improve worst case regex performance #51
Conversation
Please see for context: |
index.js
Outdated
@@ -42,7 +42,7 @@ const wrapWord = (rows, word, columns) => { | |||
|
|||
if (ESCAPES.has(character)) { | |||
isInsideEscape = true; | |||
isInsideLinkEscape = characters.slice(index + 1).join('').startsWith(ANSI_ESCAPE_LINK); | |||
isInsideLinkEscape = word.slice(index + 1).startsWith(ANSI_ESCAPE_LINK); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.slice()
operates on code units while the [...word]
splits on codepoints. That's a subtle difference.
> '🌍'.slice(0, 2).length
< 2
> [...'🌍'].length
< 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, not too familiar with unicode/string particulars.
Will add a test to this effect and update PR
Converting from array to string repeatedly in a loop leads to very poor performance in some contexts.
Updated PR to respect unicode character lengths |
Missing the test |
I would argue the test is "missing" bc basically every other test confirms the correct behaviour, isn't it? @sindresorhus |
My profiler shows me that a significant amount of time is also spent in the |
Completely tangential but @adam-arthur can you please tell me what tool you're using for the per-line timings in #51 (comment) ? |
Bump :) |
Problem:
Very poor performance when testing certain regexes due to unnecessary conversion to/from string/array.
Impact:
Some of the reporters in Vitest that depend on this code are extremely slow. (See: vitest-dev/vitest#2602)
Notes:
While time complexity of both operations is theoretically the same, seems the VM is optimizing string specific operations much better than the conversion back/forth.
Future:
In this case can run a single global regex on the entire string up front rather than re-allocating a new substring at each index. But for now this is simple and solves the issue.