-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
async_hooks: use typed array stack as fast path #17780
Conversation
- Communicate the current async stack length through a typed array field rather than a native binding method - Add a new fixed-size `async_ids_fast_stack` typed array that contains the async ID stack up to a fixed limit. This increases performance noticeably, since most of the time the async ID stack will not be more than a handful of levels deep. - Make the JS `pushAsyncIds()` and `popAsyncIds()` functions do the same thing as the native ones if the fast path is applicable. Benchmarks: $ ./node benchmark/compare.js --new ./node --old ./node-master --runs 10 --filter next-tick process | Rscript benchmark/compare.R [00:03:25|% 100| 6/6 files | 20/20 runs | 1/1 configs]: Done improvement confidence p.value process/next-tick-breadth-args.js millions=4 19.72 % *** 3.013913e-06 process/next-tick-breadth.js millions=4 27.33 % *** 5.847983e-11 process/next-tick-depth-args.js millions=12 40.08 % *** 1.237127e-13 process/next-tick-depth.js millions=12 77.27 % *** 1.413290e-11 process/next-tick-exec-args.js millions=5 13.58 % *** 1.245180e-07 process/next-tick-exec.js millions=5 16.80 % *** 2.961386e-07
2ce172b
to
6bed72a
Compare
I think we should just remove the slow path. Having more than 32 nested |
@AndreasMadsen I know that @trevnorris made the switch from using only a typed array to a I agree that the slow path is not likely to be taken but I don’t see why we shouldn’t support it just because of that |
It is code complexity for an edge case that is unlikely to ever happen. |
Possibly naive question: can't you grow the typed array when it's full? Seems simpler than a separate data structure for the fallback case. |
@bnoordhuis You mean, replace by a larger typed array? Yeah, I guess you could. :) @AndreasMadsen Would you be okay with that approach? |
@addaleax Yes. I was thinking the same. |
@AndreasMadsen @bnoordhuis I’ve switched this to a resizable typed array … it doesn’t seem to affect the perf improvements here. :) I’m totally fine with this approach, but I have to admit I don’t see a huge benefit over the previous code… |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments that you are welcome to ignore if I'm completely off-mark. :-)
src/env.h
Outdated
// Stores the ids of the current execution context stack. | ||
std::stack<async_context> async_ids_stack_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove the #include <stack>
now, I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, done!
async_hook_fields[kStackLength]++; | ||
async_id_fields[kExecutionAsyncId] = asyncId; | ||
async_id_fields[kTriggerAsyncId] = triggerAsyncId; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd take a slightly different approach here. Growing the stack can be done in JS: allocate a new typed array, copy over the elements, then call into C++ to update the AliasedBuffer's reference. That probably also obviates the need to overload operator=(AliasedBuffer&&)
.
If you wrote it this way because of how PromiseHook() calls env->async_hooks()->push_async_ids()
, perhaps you can work around that by ensuring there are always a few free slots? I think that should work although admittedly I didn't look too closely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bnoordhuis It’s not just PromiseHook()
, MakeCallback()
also does that – I think for your suggestion to work in a generic fashion you’d have to call that growing function from C++, which kind of just moves the problem around
lib/internal/async_hooks.js
Outdated
// This is the equivalent of the native push_async_ids() call. | ||
function pushAsyncIds(asyncId, triggerAsyncId) { | ||
const stackLength = async_hook_fields[kStackLength]; | ||
if (stackLength >= async_hook_fields[kStackCapacity]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed? You can just check async_ids_stack.length
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… right, yes. :) Done!
- Communicate the current async stack length through a typed array field rather than a native binding method - Add a new fixed-size `async_ids_fast_stack` typed array that contains the async ID stack up to a fixed limit. This increases performance noticeably, since most of the time the async ID stack will not be more than a handful of levels deep. - Make the JS `pushAsyncIds()` and `popAsyncIds()` functions do the same thing as the native ones if the fast path is applicable. Benchmarks: $ ./node benchmark/compare.js --new ./node --old ./node-master --runs 10 --filter next-tick process | Rscript benchmark/compare.R [00:03:25|% 100| 6/6 files | 20/20 runs | 1/1 configs]: Done improvement confidence p.value process/next-tick-breadth-args.js millions=4 19.72 % *** 3.013913e-06 process/next-tick-breadth.js millions=4 27.33 % *** 5.847983e-11 process/next-tick-depth-args.js millions=12 40.08 % *** 1.237127e-13 process/next-tick-depth.js millions=12 77.27 % *** 1.413290e-11 process/next-tick-exec-args.js millions=5 13.58 % *** 1.245180e-07 process/next-tick-exec.js millions=5 16.80 % *** 2.961386e-07 PR-URL: #17780 Reviewed-By: James M Snell <[email protected]>
Landed in 83e5215 |
This does not land cleanly on v9.x, could someone please manually backport |
This should apply cleanly or at least almost cleanly after #18079 lands |
- Communicate the current async stack length through a typed array field rather than a native binding method - Add a new fixed-size `async_ids_fast_stack` typed array that contains the async ID stack up to a fixed limit. This increases performance noticeably, since most of the time the async ID stack will not be more than a handful of levels deep. - Make the JS `pushAsyncIds()` and `popAsyncIds()` functions do the same thing as the native ones if the fast path is applicable. Benchmarks: $ ./node benchmark/compare.js --new ./node --old ./node-master --runs 10 --filter next-tick process | Rscript benchmark/compare.R [00:03:25|% 100| 6/6 files | 20/20 runs | 1/1 configs]: Done improvement confidence p.value process/next-tick-breadth-args.js millions=4 19.72 % *** 3.013913e-06 process/next-tick-breadth.js millions=4 27.33 % *** 5.847983e-11 process/next-tick-depth-args.js millions=12 40.08 % *** 1.237127e-13 process/next-tick-depth.js millions=12 77.27 % *** 1.413290e-11 process/next-tick-exec-args.js millions=5 13.58 % *** 1.245180e-07 process/next-tick-exec.js millions=5 16.80 % *** 2.961386e-07 PR-URL: nodejs#17780 Backport-PR-URL: nodejs#18179 Reviewed-By: James M Snell <[email protected]>
- Communicate the current async stack length through a typed array field rather than a native binding method - Add a new fixed-size `async_ids_fast_stack` typed array that contains the async ID stack up to a fixed limit. This increases performance noticeably, since most of the time the async ID stack will not be more than a handful of levels deep. - Make the JS `pushAsyncIds()` and `popAsyncIds()` functions do the same thing as the native ones if the fast path is applicable. Benchmarks: $ ./node benchmark/compare.js --new ./node --old ./node-master --runs 10 --filter next-tick process | Rscript benchmark/compare.R [00:03:25|% 100| 6/6 files | 20/20 runs | 1/1 configs]: Done improvement confidence p.value process/next-tick-breadth-args.js millions=4 19.72 % *** 3.013913e-06 process/next-tick-breadth.js millions=4 27.33 % *** 5.847983e-11 process/next-tick-depth-args.js millions=12 40.08 % *** 1.237127e-13 process/next-tick-depth.js millions=12 77.27 % *** 1.413290e-11 process/next-tick-exec-args.js millions=5 13.58 % *** 1.245180e-07 process/next-tick-exec.js millions=5 16.80 % *** 2.961386e-07 Backport-PR-URL: #18179 PR-URL: #17780 Reviewed-By: James M Snell <[email protected]>
hm, it still isn't cherry-picking cleanly. Anyone have time to backport? Thanks! |
- Communicate the current async stack length through a typed array field rather than a native binding method - Add a new fixed-size `async_ids_fast_stack` typed array that contains the async ID stack up to a fixed limit. This increases performance noticeably, since most of the time the async ID stack will not be more than a handful of levels deep. - Make the JS `pushAsyncIds()` and `popAsyncIds()` functions do the same thing as the native ones if the fast path is applicable. Benchmarks: $ ./node benchmark/compare.js --new ./node --old ./node-master --runs 10 --filter next-tick process | Rscript benchmark/compare.R [00:03:25|% 100| 6/6 files | 20/20 runs | 1/1 configs]: Done improvement confidence p.value process/next-tick-breadth-args.js millions=4 19.72 % *** 3.013913e-06 process/next-tick-breadth.js millions=4 27.33 % *** 5.847983e-11 process/next-tick-depth-args.js millions=12 40.08 % *** 1.237127e-13 process/next-tick-depth.js millions=12 77.27 % *** 1.413290e-11 process/next-tick-exec-args.js millions=5 13.58 % *** 1.245180e-07 process/next-tick-exec.js millions=5 16.80 % *** 2.961386e-07 PR-URL: nodejs#17780 Reviewed-By: James M Snell <[email protected]>
- Communicate the current async stack length through a typed array field rather than a native binding method - Add a new fixed-size `async_ids_fast_stack` typed array that contains the async ID stack up to a fixed limit. This increases performance noticeably, since most of the time the async ID stack will not be more than a handful of levels deep. - Make the JS `pushAsyncIds()` and `popAsyncIds()` functions do the same thing as the native ones if the fast path is applicable. Benchmarks: $ ./node benchmark/compare.js --new ./node --old ./node-master --runs 10 --filter next-tick process | Rscript benchmark/compare.R [00:03:25|% 100| 6/6 files | 20/20 runs | 1/1 configs]: Done improvement confidence p.value process/next-tick-breadth-args.js millions=4 19.72 % *** 3.013913e-06 process/next-tick-breadth.js millions=4 27.33 % *** 5.847983e-11 process/next-tick-depth-args.js millions=12 40.08 % *** 1.237127e-13 process/next-tick-depth.js millions=12 77.27 % *** 1.413290e-11 process/next-tick-exec-args.js millions=5 13.58 % *** 1.245180e-07 process/next-tick-exec.js millions=5 16.80 % *** 2.961386e-07 Backport-PR-URL: #18290 PR-URL: #17780 Reviewed-By: James M Snell <[email protected]>
Notable changes: * deps: * update V8 to 6.2.414.46 (Michaël Zasso) [#16413](#16413) * revert ABI breaking changes in V8 6.2 (Anna Henningsen) [#16413](#16413) * upgrade libuv to 1.19.1 (cjihrig) [#18260](#18260) * re land npm 5.6.0 (Myles Borins) [#18625](#18625) * ICU 60 bump (Steven R. Loomis) [#16876](#16876) * crypto: * Support both OpenSSL 1.1.0 and 1.0.2 (David Benjamin) [#16130](#16130) * warn on invalid authentication tag length (Tobias Nießen) [#17566](#17566) * async_hooks: * update defaultTriggerAsyncIdScope for perf (Anatoli Papirovski) [#18004](#18004) * use typed array stack as fast path (Anna Henningsen) [#17780](#17780) * use scope for defaultTriggerAsyncId (Andreas Madsen) [#17273](#17273) * separate missing from default context (Andreas Madsen) [#17273](#17273) * rename initTriggerId (Andreas Madsen) [#17273](#17273) * deprecate undocumented API (Andreas Madsen) [#16972](#16972) * add destroy event for gced AsyncResources (Sebastian Mayr) [#16998](#16998) * add trace events to async_hooks (Andreas Madsen) [#15538](#15538) * set HTTPParser trigger to socket (Andreas Madsen) [#18003](#18003) * add provider types for net server (Andreas Madsen) [#17157](#17157) * n-api: * add helper for addons to get the event loop (Anna Henningsen) [#17109](#17109) * cli: * add --stack-trace-limit to NODE_OPTIONS (Anna Henningsen) [#16495](#16495) * console: * add support for console.debug (Benjamin Zaslavsky) [#17033](#17033) * module: * add builtinModules (Jon Moss) [#16386](#16386) * replace default paths in require.resolve() (cjihrig) [#17113](#17113) * src: * add helper for addons to get the event loop (Anna Henningsen) [#17109](#17109) * add process.ppid (cjihrig) [#16839](#16839) * http: * support generic `Duplex` streams (Anna Henningsen) [#16267](#16267) * add rawPacket in err of `clientError` event (XadillaX) [#17672](#17672) * better support for IPv6 addresses (Mattias Holmlund) [#14772](#14772) * net: * remove ADDRCONFIG DNS hint on Windows (Bartosz Sosnowski) [#17662](#17662) * process: * fix reading zero-length env vars on win32 (Anna Henningsen) [#18463](#18463) * tls: * unconsume stream on destroy (Anna Henningsen) [#17478](#17478) * process: * improve unhandled rejection message (Madara Uchiha) [#17158](#17158) * stream: * remove usage of *State.highWaterMark (Calvin Metcalf) [#12860](#12860) * trace_events: * add executionAsyncId to init events (Andreas Madsen) [#17196](#17196) PR-URL: #18336
Notable changes: * deps: * update V8 to 6.2.414.46 (Michaël Zasso) [#16413](#16413) * revert ABI breaking changes in V8 6.2 (Anna Henningsen) [#16413](#16413) * upgrade libuv to 1.19.1 (cjihrig) [#18260](#18260) * re land npm 5.6.0 (Myles Borins) [#18625](#18625) * ICU 60 bump (Steven R. Loomis) [#16876](#16876) * crypto: * Support both OpenSSL 1.1.0 and 1.0.2 (David Benjamin) [#16130](#16130) * warn on invalid authentication tag length (Tobias Nießen) [#17566](#17566) * async_hooks: * update defaultTriggerAsyncIdScope for perf (Anatoli Papirovski) [#18004](#18004) * use typed array stack as fast path (Anna Henningsen) [#17780](#17780) * use scope for defaultTriggerAsyncId (Andreas Madsen) [#17273](#17273) * separate missing from default context (Andreas Madsen) [#17273](#17273) * rename initTriggerId (Andreas Madsen) [#17273](#17273) * deprecate undocumented API (Andreas Madsen) [#16972](#16972) * add destroy event for gced AsyncResources (Sebastian Mayr) [#16998](#16998) * add trace events to async_hooks (Andreas Madsen) [#15538](#15538) * set HTTPParser trigger to socket (Andreas Madsen) [#18003](#18003) * add provider types for net server (Andreas Madsen) [#17157](#17157) * n-api: * add helper for addons to get the event loop (Anna Henningsen) [#17109](#17109) * cli: * add --stack-trace-limit to NODE_OPTIONS (Anna Henningsen) [#16495](#16495) * console: * add support for console.debug (Benjamin Zaslavsky) [#17033](#17033) * module: * add builtinModules (Jon Moss) [#16386](#16386) * replace default paths in require.resolve() (cjihrig) [#17113](#17113) * src: * add helper for addons to get the event loop (Anna Henningsen) [#17109](#17109) * add process.ppid (cjihrig) [#16839](#16839) * http: * support generic `Duplex` streams (Anna Henningsen) [#16267](#16267) * add rawPacket in err of `clientError` event (XadillaX) [#17672](#17672) * better support for IPv6 addresses (Mattias Holmlund) [#14772](#14772) * net: * remove ADDRCONFIG DNS hint on Windows (Bartosz Sosnowski) [#17662](#17662) * process: * fix reading zero-length env vars on win32 (Anna Henningsen) [#18463](#18463) * tls: * unconsume stream on destroy (Anna Henningsen) [#17478](#17478) * process: * improve unhandled rejection message (Madara Uchiha) [#17158](#17158) * stream: * remove usage of *State.highWaterMark (Calvin Metcalf) [#12860](#12860) * trace_events: * add executionAsyncId to init events (Andreas Madsen) [#17196](#17196) PR-URL: #18336
Notable changes: * deps: * update V8 to 6.2.414.46 (Michaël Zasso) [nodejs#16413](nodejs#16413) * revert ABI breaking changes in V8 6.2 (Anna Henningsen) [nodejs#16413](nodejs#16413) * upgrade libuv to 1.19.1 (cjihrig) [nodejs#18260](nodejs#18260) * re land npm 5.6.0 (Myles Borins) [nodejs#18625](nodejs#18625) * ICU 60 bump (Steven R. Loomis) [nodejs#16876](nodejs#16876) * crypto: * Support both OpenSSL 1.1.0 and 1.0.2 (David Benjamin) [nodejs#16130](nodejs#16130) * warn on invalid authentication tag length (Tobias Nießen) [nodejs#17566](nodejs#17566) * async_hooks: * update defaultTriggerAsyncIdScope for perf (Anatoli Papirovski) [nodejs#18004](nodejs#18004) * use typed array stack as fast path (Anna Henningsen) [nodejs#17780](nodejs#17780) * use scope for defaultTriggerAsyncId (Andreas Madsen) [nodejs#17273](nodejs#17273) * separate missing from default context (Andreas Madsen) [nodejs#17273](nodejs#17273) * rename initTriggerId (Andreas Madsen) [nodejs#17273](nodejs#17273) * deprecate undocumented API (Andreas Madsen) [nodejs#16972](nodejs#16972) * add destroy event for gced AsyncResources (Sebastian Mayr) [nodejs#16998](nodejs#16998) * add trace events to async_hooks (Andreas Madsen) [nodejs#15538](nodejs#15538) * set HTTPParser trigger to socket (Andreas Madsen) [nodejs#18003](nodejs#18003) * add provider types for net server (Andreas Madsen) [nodejs#17157](nodejs#17157) * n-api: * add helper for addons to get the event loop (Anna Henningsen) [nodejs#17109](nodejs#17109) * cli: * add --stack-trace-limit to NODE_OPTIONS (Anna Henningsen) [nodejs#16495](nodejs#16495) * console: * add support for console.debug (Benjamin Zaslavsky) [nodejs#17033](nodejs#17033) * module: * add builtinModules (Jon Moss) [nodejs#16386](nodejs#16386) * replace default paths in require.resolve() (cjihrig) [nodejs#17113](nodejs#17113) * src: * add helper for addons to get the event loop (Anna Henningsen) [nodejs#17109](nodejs#17109) * add process.ppid (cjihrig) [nodejs#16839](nodejs#16839) * http: * support generic `Duplex` streams (Anna Henningsen) [nodejs#16267](nodejs#16267) * add rawPacket in err of `clientError` event (XadillaX) [nodejs#17672](nodejs#17672) * better support for IPv6 addresses (Mattias Holmlund) [nodejs#14772](nodejs#14772) * net: * remove ADDRCONFIG DNS hint on Windows (Bartosz Sosnowski) [nodejs#17662](nodejs#17662) * process: * fix reading zero-length env vars on win32 (Anna Henningsen) [nodejs#18463](nodejs#18463) * tls: * unconsume stream on destroy (Anna Henningsen) [nodejs#17478](nodejs#17478) * process: * improve unhandled rejection message (Madara Uchiha) [nodejs#17158](nodejs#17158) * stream: * remove usage of *State.highWaterMark (Calvin Metcalf) [nodejs#12860](nodejs#12860) * trace_events: * add executionAsyncId to init events (Andreas Madsen) [nodejs#17196](nodejs#17196) PR-URL: nodejs#18336
typed array field rather than a native binding method
async_ids_fast_stack
typed arraythat contains the async ID stack up to a fixed limit.
This increases performance noticeably, since most of the time
the async ID stack will not be more than a handful of
levels deep.
pushAsyncIds()
andpopAsyncIds()
functionsdo the same thing as the native ones if the fast path
is applicable.
Benchmarks:
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)
async_hooks