faster wcwidth with lookup table #798

Merged
merged 11 commits into from
Jul 18, 2017

Conversation

jerch
Member

@jerch jerch commented Jul 16, 2017

This PR optimizes wcwidth by using a lookup table for BMP characters. For a BMP character it boils down to a single table lookup plus some bit shifting, which packs the width data (2 bits per code point) into 16 KB. The speedup is at least 10x. Equality tests against the old implementation are included.
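
The packing described in this comment can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: `slowWidth` is a hypothetical stand-in for the original range-based width function, and its ranges are deliberately incomplete. Each BMP code point gets 2 bits, so four widths fit into one byte and 65536 / 4 = 16384 bytes cover the whole BMP.

```javascript
// Hypothetical stand-in for the original (slow) width function.
// Only a handful of ranges are covered; the real tables are much larger.
function slowWidth(cp) {
  if (cp < 32 || (cp >= 0x7f && cp < 0xa0)) return 0; // NUL and control chars
  if (cp >= 0x0300 && cp <= 0x036f) return 0;         // combining diacritical marks
  if (cp >= 0x1100 && cp <= 0x115f) return 2;         // Hangul Jamo initial consonants
  if (cp >= 0x4e00 && cp <= 0x9fff) return 2;         // CJK unified ideographs
  return 1;
}

const BMP_SIZE = 65536;
const WIDTHS_PER_BYTE = 4; // 2 bits per width, 8 bits per byte

// Precompute the width of every BMP code point, 4 widths per byte.
function buildTable() {
  const table = new Uint8Array(BMP_SIZE / WIDTHS_PER_BYTE);
  for (let cp = 0; cp < BMP_SIZE; ++cp) {
    // cp >> 2 selects the byte, (cp & 3) * 2 the bit offset inside it.
    table[cp >> 2] |= slowWidth(cp) << ((cp & 3) * 2);
  }
  return table;
}

// Reverse of the packing step: shift the 2-bit slot down and mask it.
function lookupWidth(table, cp) {
  return (table[cp >> 2] >> ((cp & 3) * 2)) & 3;
}

const table = buildTable();
console.log(table.length, lookupWidth(table, 0x41), lookupWidth(table, 0x4e2d));
```

Two bits are enough because the only widths that occur are 0, 1 and 2.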

@coveralls

Coverage Status

Coverage decreased (-0.1%) to 70.039% when pulling b965d0a on jerch:faster_wcwidth into 48dab49 on sourcelair:master.

@coveralls

Coverage Status

Coverage decreased (-0.1%) to 70.039% when pulling dd0fcb6 on jerch:faster_wcwidth into 48dab49 on sourcelair:master.

@coveralls

Coverage Status

Coverage increased (+0.4%) to 70.581% when pulling 1722248 on jerch:faster_wcwidth into 48dab49 on sourcelair:master.

@coveralls

coveralls commented Jul 16, 2017

Coverage Status

Coverage increased (+0.5%) to 70.601% when pulling 6733c6c on jerch:faster_wcwidth into 48dab49 on sourcelair:master.

@coveralls

coveralls commented Jul 16, 2017

Coverage Status

Coverage increased (+0.5%) to 70.601% when pulling 5d87b54 on jerch:faster_wcwidth into 48dab49 on sourcelair:master.

Member

@Tyriar Tyriar left a comment

Nice ☺️

}
return function (num) {
num = num | 0;
return (num < 65536) ? (table[num >> 2] >> ((num & 3) * 2)) & 3 : wcwidthHigh(num);
Member

Should this have an early return for ASCII characters, as they are the vast majority, or would that not boost perf much at this point?

Member

Could do with a comment explaining the bit shifting part

Member Author

@jerch jerch Jul 16, 2017

@Tyriar Well, it gives a 5% speedup to exclude the ASCII range from the shifting & lookup. So yeah, gonna change it.
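
An early return for ASCII, as discussed in this thread, might look like the sketch below. The table contents are placeholder data (every slot set to width 1), the treatment of control characters as width 0 is an assumption based on the `{nul: 0, control: 0}` options that appear in the test file, and the non-BMP branch is only a placeholder for the `wcwidthHigh` call in the PR.

```javascript
// Placeholder table: every 2-bit slot holds width 1 (binary 01 01 01 01).
// These are not real width data, just enough to exercise the code path.
const table = new Uint8Array(16384).fill(0b01010101);

function widthWithAsciiFastPath(num) {
  num = num | 0;
  if (num < 128) {
    // Fast path: no table access and no shifting for ASCII.
    // Assumption: C0 controls and DEL have width 0.
    return (num < 32 || num === 127) ? 0 : 1;
  }
  if (num < 65536) {
    return (table[num >> 2] >> ((num & 3) * 2)) & 3;
  }
  return 1; // placeholder for the non-BMP path (wcwidthHigh in the PR)
}

console.log(widthWithAsciiFastPath(0x41), widthWithAsciiFastPath(0x0a));
```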

return 1;
}
let table = new Uint8Array(16384);
for (let i = 0; i < 16384; ++i) {
Member

Does this increase startup in a noticeable way?

Member Author

@jerch jerch Jul 16, 2017

It adds 10 to 16 ms.

Member

Is that 10-16ms when the first terminal is launched or when the script is loaded?

Member Author

When the script loads, it is done during import.

Member

Created a follow up 😃 #799

}
return 1;
}
let table = new Uint8Array(16384);
Member

A named variable/comment explaining 16384 would be good ((65536-4)/4+1?).
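
For reference, the arithmetic behind 16384 can be spelled out with named constants. The constant names below are made up for illustration and are not taken from the PR.

```javascript
// Hypothetical names; only the numbers come from the code under review.
const BMP_CODE_POINTS = 65536; // U+0000 .. U+FFFF
const BITS_PER_WIDTH = 2;      // widths 0, 1, 2 fit into 2 bits
const BITS_PER_BYTE = 8;
const WIDTHS_PER_BYTE = BITS_PER_BYTE / BITS_PER_WIDTH; // 4
const TABLE_BYTES = BMP_CODE_POINTS / WIDTHS_PER_BYTE;  // 16384

// The highest code point lands in the last byte of the table.
const lastIndex = (BMP_CODE_POINTS - 1) >> 2; // 16383

// The formula suggested in the comment gives the same size.
const suggested = (65536 - 4) / 4 + 1; // 16384

console.log(TABLE_BYTES, lastIndex, suggested);
```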

Member

Also, this needs to be conditional and fall back to a regular array for browsers without typed array support.
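
One possible shape for such a conditional is sketched below. `createTable` is a hypothetical helper name, not something from the PR. The plain array has to be zero-filled by hand, because the packing step ORs bits into existing entries and `undefined | x` would only work by accident of coercion.

```javascript
// Returns a zero-filled table, preferring a typed array when available.
function createTable(size) {
  if (typeof Uint8Array !== 'undefined') {
    return new Uint8Array(size); // typed arrays start out zero-filled
  }
  // Fallback for environments without typed arrays.
  const table = new Array(size);
  for (let i = 0; i < size; ++i) {
    table[i] = 0;
  }
  return table;
}

const table = createTable(16384);
console.log(table.length, table[0]);
```

Both variants support the same indexed reads and writes, so the lookup code does not need to know which one it got.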

- commenting the bit shifts
- switching to Uint32Array
- conditional for TypedArray
- fixing Math.floor slowdown in bisearch
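
The last item refers to the binary search over code point ranges. The commit itself is not shown here, so the following is only a guess at the kind of change involved: computing the midpoint with an integer shift instead of `Math.floor`. The range table is a heavily truncated, illustrative sample.

```javascript
// Truncated sample of [first, last] ranges, sorted ascending.
const COMBINING = [
  [0x0300, 0x036f],
  [0x0483, 0x0486],
  [0x0591, 0x05bd]
];

// Returns true if ucs falls inside one of the ranges in data.
function bisearch(ucs, data) {
  let min = 0;
  let max = data.length - 1;
  if (ucs < data[0][0] || ucs > data[max][1]) {
    return false;
  }
  while (max >= min) {
    const mid = (min + max) >> 1; // integer midpoint, no Math.floor call
    if (ucs > data[mid][1]) {
      min = mid + 1;
    } else if (ucs < data[mid][0]) {
      max = mid - 1;
    } else {
      return true;
    }
  }
  return false;
}

console.log(bisearch(0x0301, COMBINING), bisearch(0x0041, COMBINING));
```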
@coveralls

Coverage Status

Coverage increased (+0.5%) to 70.616% when pulling 1232c00 on jerch:faster_wcwidth into 48dab49 on sourcelair:master.

@coveralls

coveralls commented Jul 16, 2017

Coverage Status

Coverage increased (+0.4%) to 70.555% when pulling 6ed5c9e on jerch:faster_wcwidth into 48dab49 on sourcelair:master.

@jerch
Member Author

jerch commented Jul 16, 2017

@Tyriar Changed the table to Uint32Array, which speeds up the creation time (down to 6ms), but reading gets slightly slower (by 20%) compared to Uint8Array, dunno why. About the lazy init: I don't know where to hook it in. Is there something like a terminal-init event?

@Tyriar
Member

Tyriar commented Jul 16, 2017

@jerch this would mean doing it on either terminal creation (new Terminal()) or on first request of character width (might cause async issues with the way writing works?).
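
The second option (building the table on the first width request) could be sketched like this. Everything here is illustrative: `slowWidth` is a hypothetical stand-in for the original implementation, and `buildCount` exists only to make the laziness observable. Because the table is built synchronously inside the call, the caller never sees a half-initialized state.

```javascript
// Hypothetical stand-in for the original slow width function.
function slowWidth(cp) {
  return (cp >= 0x4e00 && cp <= 0x9fff) ? 2 : 1;
}

let table = null;   // not built at load time
let buildCount = 0; // instrumentation for this example only

function initTable() {
  const t = new Uint8Array(16384);
  for (let cp = 0; cp < 65536; ++cp) {
    t[cp >> 2] |= slowWidth(cp) << ((cp & 3) * 2);
  }
  buildCount++;
  return t;
}

function lazyWidth(num) {
  num = num | 0;
  if (num < 128) {
    // ASCII never touches the table, so pure-ASCII output never pays for it.
    return (num < 32 || num === 127) ? 0 : 1;
  }
  if (num >= 65536) {
    return slowWidth(num);
  }
  if (table === null) {
    table = initTable(); // one-time cost on the first non-ASCII BMP char
  }
  return (table[num >> 2] >> ((num & 3) * 2)) & 3;
}

console.log(lazyWidth(0x41), buildCount);
console.log(lazyWidth(0x4e2d), buildCount);
```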

@jerch
Member Author

jerch commented Jul 17, 2017

Addressed the lazy loading with the last commit: the lookup table is now created at the first call with a char > 127. Benchmarks with Node (V8) on my machine are now:

  • old wcwidth ASCII range (51200 times): 6.5MB in 120ms
  • new wcwidth ASCII range (51200 times): 6.5MB in 60ms
  • old wcwidth 0-65536 range (100 times): 6.5MB in 850ms
  • new wcwidth 0-65536 range (100 times): 6.5MB in 100ms
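
The benchmark script is not part of this conversation. A harness along the following lines would produce comparable measurements, assuming that "6.5MB" refers to the number of width calls: 128 × 51200 and 65536 × 100 both equal 6,553,600. The function name `bench` and its result shape are made up for this sketch.

```javascript
// Calls fn for every code point in [0, rangeEnd), repeated `repeats` times.
function bench(label, fn, rangeEnd, repeats) {
  const start = Date.now();
  let sum = 0; // accumulate results so the calls cannot be optimized away
  for (let r = 0; r < repeats; ++r) {
    for (let cp = 0; cp < rangeEnd; ++cp) {
      sum += fn(cp);
    }
  }
  const ms = Date.now() - start;
  const calls = rangeEnd * repeats;
  console.log(label + ': ' + calls + ' calls in ' + ms + 'ms');
  return { calls: calls, ms: ms, sum: sum };
}

// Trivial stand-in; a real run would pass the old and new wcwidth.
const dummyWidth = function (cp) { return 1; };

const asciiRun = bench('ASCII range', dummyWidth, 128, 51200);
const bmpRun = bench('0-65536 range', dummyWidth, 65536, 100);
```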

@coveralls

coveralls commented Jul 17, 2017

Coverage Status

Coverage increased (+0.4%) to 70.584% when pulling 598a729 on jerch:faster_wcwidth into 48dab49 on sourcelair:master.

Member

@Tyriar Tyriar left a comment

Looks good to me, I'm planning on pulling this into its own file #800 and doing a clean up so the code style is a little closer to the rest of the codebase 👍

})({nul: 0, control: 0}); // configurable options

describe('wcwidth', () => {
it('same as old implementation for BMP and individual higher', () => {
Member

Can't get more confident about no functional changes than this 😉

Member Author

haha :D
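
The idea behind the equality test discussed in this thread is an exhaustive comparison over the BMP plus a few hand-picked higher code points. The PR's test is written with `describe`/`it`; the sketch below uses a plain function instead so it runs without a test framework. `findMismatches` and both width functions are hypothetical and only demonstrate the approach.

```javascript
// Compares two width functions over the whole BMP and some extra code points.
function findMismatches(oldFn, newFn, extraCodePoints) {
  const mismatches = [];
  for (let cp = 0; cp < 65536; ++cp) {
    if (oldFn(cp) !== newFn(cp)) mismatches.push(cp);
  }
  for (const cp of extraCodePoints) {
    if (oldFn(cp) !== newFn(cp)) mismatches.push(cp);
  }
  return mismatches;
}

// Hypothetical "old" implementation and a cached "new" one built from it.
const oldWidth = function (cp) {
  return (cp >= 0x4e00 && cp <= 0x9fff) ? 2 : 1;
};
const cache = new Uint8Array(65536);
for (let cp = 0; cp < 65536; ++cp) cache[cp] = oldWidth(cp);
const newWidth = function (cp) {
  return cp < 65536 ? cache[cp] : oldWidth(cp);
};

const mismatches = findMismatches(oldWidth, newWidth, [0x1f600, 0x20000]);
console.log(mismatches.length);
```

Collecting the mismatching code points instead of stopping at the first one makes a failure easier to diagnose.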

@Tyriar Tyriar added this to the 2.9.0 milestone Jul 17, 2017
@coveralls

Coverage Status

Coverage increased (+0.4%) to 70.564% when pulling 20048d3 on jerch:faster_wcwidth into 48dab49 on sourcelair:master.

@Tyriar
Member

Tyriar commented Jul 17, 2017

@jerch CI failed on macOS:

  181 passing (5s)
  1 failing
  1) wcwidth same as old implementation for BMP and individual higher:
     Error: Timeout of 2000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves.

Might need to increase the timeout of that test? https://travis-ci.org/sourcelair/xterm.js/jobs/254546569

@jerch
Member Author

jerch commented Jul 17, 2017

Hmm yeah, seems the Travis OSX boxes are pretty busy from time to time and tend to time out on longer-running tests. The test runs in under a second here...

@coveralls

coveralls commented Jul 18, 2017

Coverage Status

Coverage increased (+0.4%) to 70.589% when pulling cd5827e on jerch:faster_wcwidth into 48dab49 on sourcelair:master.

3 participants