Buffer.asciiWrite is 3x slower than Buffer.set #169
You are caching the text encoding. Here is a fixed example:

import { run, bench, summary } from 'mitata';

summary(() => {
  bench('Buffer.set($size)', function* (state) {
    const str = 'x'.repeat(state.get('size'));
    // buf() re-encodes the string on every call, so the conversion is measured too.
    const buf = () => Buffer.from(str, 'ascii');
    const scratch = Buffer.allocUnsafe(state.get('size'));
    yield () => scratch.set(buf(), 0);
  }).compact().range('size', 1, 1024);

  bench('Buffer.asciiWrite($size)', function* (state) {
    const str = 'x'.repeat(state.get('size'));
    const scratch = Buffer.allocUnsafe(state.get('size'));
    // Writes the string directly into the scratch buffer.
    yield () => scratch.asciiWrite(str, 0);
  }).compact().range('size', 1, 1024);

  bench('for loop ($size)', function* (state) {
    const str = 'x'.repeat(state.get('size'));
    const scratch = Buffer.allocUnsafe(state.get('size'));
    // Baseline: copy one char code at a time in plain JavaScript.
    yield () => {
      for (let i = 0; i < str.length; i++) {
        scratch[i] = str.charCodeAt(i);
      }
    };
  }).compact().range('size', 1, 1024);
});

await run();

clk: ~3.25 GHz
cpu: Apple M2 Pro
runtime: node 22.9.0 (arm64-darwin)
benchmark avg (min … max) p75 p99 (min … top 1%)
-------------------------------------- -------------------------------
Buffer.set(1) 44.55 ns/iter 44.90 ns 65.30 ns █▄▂▁▁▁▁▁▁▁▁
Buffer.set(8) 46.47 ns/iter 47.04 ns 67.72 ns █▆▃▂▁▁▁▁▁▁▁
Buffer.set(64) 81.05 ns/iter 82.12 ns 105.02 ns ▃██▃▂▁▁▁▁▁▁
Buffer.set(512) 128.24 ns/iter 132.17 ns 156.47 ns ▁▂▂▇█▆▄▂▂▂▁
Buffer.set(1024) 162.07 ns/iter 170.16 ns 198.58 ns ▂▂▃▃▄█▅▃▂▃▁
Buffer.asciiWrite(1) 31.07 ns/iter 31.39 ns 35.41 ns ▁▁▁▁▁▁▁▁█▃▁
Buffer.asciiWrite(8) 31.87 ns/iter 32.00 ns 36.10 ns ▁▁▁▁▁▁▁▁█▃▁
Buffer.asciiWrite(64) 34.58 ns/iter 34.82 ns 40.47 ns █▁▂▂▁▁▁▁▁▁▁
Buffer.asciiWrite(512) 43.15 ns/iter 43.71 ns 49.75 ns █▂▂▂▁▁▁▁▁▁▁
Buffer.asciiWrite(1024) 52.29 ns/iter 52.75 ns 59.33 ns █▂▂▂▂▁▁▁▁▁▁
for loop (1) 1.18 ns/iter 1.15 ns 2.15 ns █▁▁▁▁▁▁▁▁▁▁
for loop (8) 6.10 ns/iter 5.98 ns 8.69 ns █▁▁▁▁▁▁▁▁▁▁
for loop (64) 158.66 ns/iter 160.59 ns 170.77 ns █▇█▄▄▃▂▂▂▁▁
for loop (512) 1.22 µs/iter 1.24 µs 1.27 µs ▂▆▅▄▅▇▅█▄▄▁
for loop (1024) 2.39 µs/iter 2.40 µs 2.43 µs ▂█▂▆▇▃▅▅▄▃▁
summary
Buffer.asciiWrite($size)
3.1…1.43x faster than Buffer.set($size)
45.72…-26.43x faster than for loop ($size)

That's on purpose. The conversion to buffer should not be part of the benchmark.
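The distinction drawn here, between setup work and measured work, can be illustrated without a benchmark harness. The following is only a sketch under assumptions, not the benchmark from the issue: encode, conversionInside, and conversionOutside are hypothetical names introduced here, and the counter exists purely to make the number of conversions visible.

```javascript
// Counts how often the string -> Buffer conversion actually runs.
let conversions = 0;

// Hypothetical wrapper around the conversion so it can be counted.
function encode(str) {
  conversions++;
  return Buffer.from(str, 'ascii');
}

// Conversion happens inside the measured closure: every call pays for it.
function conversionInside(size) {
  const str = 'x'.repeat(size);
  const scratch = Buffer.alloc(size);
  return { scratch, run: () => scratch.set(encode(str), 0) };
}

// Conversion happens once during setup: the closure only copies bytes.
function conversionOutside(size) {
  const str = 'x'.repeat(size);
  const src = encode(str);
  const scratch = Buffer.alloc(size);
  return { scratch, run: () => scratch.set(src, 0) };
}

const insideCase = conversionInside(8);
for (let i = 0; i < 100; i++) insideCase.run();
console.log('conversions with encode inside:', conversions);

conversions = 0;
const outsideCase = conversionOutside(8);
for (let i = 0; i < 100; i++) outsideCase.run();
console.log('conversions with encode outside:', conversions);
```

Both variants leave the same bytes in the scratch buffer. They differ only in how much work each measured call performs, which is the point under dispute in this thread.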
for loop and

It does not. When the string has a "one-byte" representation, it can be memcpy'd. Though there is indeed something wrong with the benchmark. It should be
Now it makes much more sense what you are trying to compare (found it). In this case the difference comes either from V8 doing something with the string when passing it to the fast function (checking whether the string is latin1? might depend on length? making a unique string copy for C++?), or from Uint8Array.set and memcpy being different.

Uint8Array memcpy 14256 bytes 181.47 ns/iter 187.24 ns 220.88 ns ▂▆█▆▄▃▃▂▂▁▁
asciiWrite(14256 bytes [lati.. 292.18 ns/iter 306.63 ns 319.10 ns ▁▁▁▁▁▁▁▁▁█▂
asciiWrite(14259 bytes [lati.. 4.20 µs/iter 4.22 µs 4.23 µs ▃▁▃▅▅▃█▁▄█▂
summary
Uint8Array memcpy 14256 bytes
1.61x faster than asciiWrite(14256 bytes [latin1])
23.16x faster than asciiWrite(14259 bytes [latin1 + 1 unicode])
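The gap between the two asciiWrite rows suggests that the string's contents matter: a single character outside latin1 appears to push the write onto a much slower path. V8's internal string representation is not observable from JavaScript, so the sketch below only checks the property the comment guesses at, namely whether every UTF-16 code unit fits in one byte. fitsInOneByte is a hypothetical helper, not a Node.js or V8 API, and the Euro sign is an arbitrary choice of non-latin1 character.

```javascript
// Hypothetical helper: true when every UTF-16 code unit is <= 0xFF,
// i.e. the string *could* be stored with one byte per character.
// It does not inspect V8's actual internal representation.
function fitsInOneByte(str) {
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) > 0xff) return false;
  }
  return true;
}

const latin1Only = 'x'.repeat(14256);
const withUnicode = latin1Only + '\u20ac'; // one code unit outside latin1

console.log('latin1 only fits in one byte:', fitsInOneByte(latin1Only));
console.log('with unicode fits in one byte:', fitsInOneByte(withUnicode));

// The documented buf.write() with 'latin1' still accepts both strings;
// the assumption here is that only the speed of the copy differs.
const scratch = Buffer.alloc(withUnicode.length);
console.log('bytes written (latin1 only):', scratch.write(latin1Only, 0, 'latin1'));
console.log('bytes written (with unicode):', scratch.write(withUnicode, 0, 'latin1'));
```

A check like this could be used to split benchmark inputs into the two groups shown in the table, without relying on how the runtime happens to store each string.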
I would assume that Buffer.set and Buffer.asciiWrite should be roughly the same (they are both essentially a memcpy). However, this is not the case.

Given that the speed stays roughly the same even as the string size grows, I would assume that most of the time goes to call overhead.
Refs: #168
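The reasoning above has two checkable parts: that both calls leave the same bytes in the destination, and that a roughly constant time across sizes points to fixed per-call overhead. A rough sketch of both checks follows. It uses the documented buf.write(str, offset, 'latin1') in place of the asciiWrite method from the issue. nsPerCall and copyBothWays are hypothetical helpers, and the timer is deliberately crude, so its numbers are only indicative compared with a harness such as mitata.

```javascript
// Hypothetical, crude timer: average nanoseconds per call of fn.
function nsPerCall(fn, iterations = 100000) {
  for (let i = 0; i < 1000; i++) fn(); // brief warm-up for the JIT
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn();
  const elapsed = process.hrtime.bigint() - start;
  return Number(elapsed) / iterations;
}

// Copies the same payload through both paths and returns the results.
function copyBothWays(size) {
  const str = 'x'.repeat(size);
  const src = Buffer.from(str, 'latin1'); // converted once, up front
  const viaSet = Buffer.alloc(size);
  const viaWrite = Buffer.alloc(size);
  viaSet.set(src, 0);
  viaWrite.write(str, 0, 'latin1');
  return { str, src, viaSet, viaWrite };
}

for (const size of [1, 64, 1024]) {
  const { str, src, viaSet, viaWrite } = copyBothWays(size);
  const setNs = nsPerCall(() => viaSet.set(src, 0));
  const writeNs = nsPerCall(() => viaWrite.write(str, 0, 'latin1'));
  console.log(
    `size=${size} same bytes=${viaSet.equals(viaWrite)} ` +
    `set=${setNs.toFixed(1)} ns write=${writeNs.toFixed(1)} ns`
  );
}
```

If the per-call times barely change between size 1 and size 1024, the fixed cost of the call dominates the copy itself, which matches the call-overhead assumption.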