-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build: speed up startup with V8 code cache #21405
Conversation
src/node.cc
Outdated
@@ -1581,6 +1582,10 @@ static void GetBinding(const FunctionCallbackInfo<Value>& args) { | |||
} else if (!strcmp(*module_v, "natives")) { | |||
exports = Object::New(env->isolate()); | |||
DefineJavaScript(env, exports); | |||
} else if (!strcmp(*module_v, "code_cache")) { | |||
// internalBinding('code_cache') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant to add this to the internalBinding
but it is now in the legacy binding for debugging purposes.
what are those process properties for exactly |
Makefile
Outdated
CODE_CACHE_FILE ?= $(CODE_CACHE_DIR)/node_code_cache.cc | ||
|
||
.PHONY: with-code-cache | ||
with-code-cache: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be possible to put this into the gyp file, but given that gyp has not been supported by V8 and we are trying to migrate into a new build system, it's easier to put the two-pass build step into Makefile (but to support that on Windows we would need to port to vcbuild.bat
)
It may also be possible to generate the file using an executable compiled from C++ that includes node_javascript.cc
for the sources and calls V8 APIs to generate the cache instead of using a JS script, like the mk*
targets in V8, but to do that we would still need to refactor the gyp file as well so I picked the easier route.
@devsnek Right now there are three things added to the process object, all of them for debugging purposes:
I can move the two lists somewhere else, or only expose them in debug builds, or just delete them, suggestions are definitely welcomed. Mind that we should avoid doing the bookkeeping by controlling the behavior with any env variables or flags because if we want to introduce the V8 snapshot in the future, we will need |
tools/generate_code_cache.js
Outdated
|
||
for (const key of Object.keys(natives)) { | ||
if (key === 'config') continue; | ||
const wrapper = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently there are no way to get the native module wrapper in the user land so this is just copy-pasted. We could put the wrapper in a file and transform that in the loader using macros in js2c and read it off the disk here, or just put a comment here. In any cases if the source do not match, the module would appear in process.cacheRejectedList
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
require('module).wrapper
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benjamingr That's the wrapper we use for user land modules. The native modules uses NativeModule.wrapper
which is internal and subject to change. We have tests to make sure the internal loaders and their properties could not be leaked to user land.
tools/generate_code_cache.js
Outdated
|
||
const script = new vm.Script(code, { | ||
filename: `${key}.js`, | ||
produceCachedData: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to use the old API to produce cache right after compile. This either compiles everything if specified to compiled eagerly, or only the toplevel function.
With the new API for which @devsnek is writing the binding, you could precisely include the functions you require for startup.
Maybe something to think about for the future :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hashseed Yeah this is also why I think it's a good idea to return error information in that PR, for example if I run this script with --inspect-brk
and open the devtools I would not be able to create any cache at all. I know it's because it's in debug mode but that's only because I've looked into the API before. If we want to open up tooling opportunities to users it would be nice to be able to tell then why the code cache cannot be created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This either compiles everything if specified to compiled eagerly, or only the toplevel function.
FYI: We do not actually support eager compilation through the vm.Script
API, so this is only the top-level function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TimothyGu We should probably add support for the full range of CompileOptions
(and possibly NoCacheReason
) to ContextifyScript
/vm.Script
(in another PR of course)
lib/internal/bootstrap/loaders.js
Outdated
source, this.filename, 0, 0, | ||
codeCache[this.id], false, undefined | ||
); | ||
if (!codeCache[this.id] || script.cachedDataRejected) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should happen. If it does, something bad must have happened. Probably best to crash here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hashseed I assume you are talking about the second condition? The first happens when the cache is not built into the binary. Throwing on the second one (if (codeCache[this.id] && script.cachedDataRejected)
) sounds reasonable to me.
tools/generate_code_cache.js
Outdated
} else { | ||
const { def, initializer } = getInitalizer(key, Buffer.alloc(0)); | ||
cacheDefs.push(def); | ||
initializers.push(initializer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we are still adding the definition and the initializer to the output source file when the cached data failed to generate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TimothyGu I think it makes sense to make the build step more lenient. If the cache data cannot be created it'll just be empty and in the current implementation it's going to be added to cacheRejectedList
when the loader fails to compile a module with the cache. If we are going to throw on bad caches we can skip the cache if it's an empty buffer, or just fail at the build phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Thanks for the explanation.
src/node.cc
Outdated
exports = InitModule(env, mod, module); | ||
} else { | ||
return ThrowIfNoSuchModule(env, *module_v); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a functional difference here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TimothyGu I was adding the code_cache
binding to internalBindnig
but later moved that to process.binding
for debugging purposes. This is the legacy of that cut-and-paste. I think we should move it back in the final incarnation of this PR..
tools/generate_code_cache.js
Outdated
const fs = require('fs'); | ||
let count = 0; | ||
|
||
function human(num) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we give this a better name?
tools/generate_code_cache.js
Outdated
|
||
namespace node { | ||
|
||
${cacheDefs.join('\n\n')} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the double newline? Just debugging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The double new line will result in an empty new line between definitions, which is what we generally have (one after the current definition, one empty line). Alternatively we could add the newline into the elements of cacheDefs
itself.
This patch speeds up the startup time and reduce the startup memory footprint by using V8 code cache when comiling builtin modules. The current approach is demonstrated in the `with-code-cache` Makefile target (no corresponding Windows target at the moment). 1. Build the binary normally (`src/node_code_cache_stub.cc` is used), by now `internalBinding('code_cache')` is an empty object 2. Run `tools/generate_code_cache.js` with the binary, which generates the code caches by reading source code of builtin modules off source code exposed by `require('internal/bootstrap/cache').builtinSource` and then generate a C++ file containing static char arrays of the code cache, using a format similar to `node_javascript.cc` 3. Run `configure` with the `--code-cache-path` option so that the newly generated C++ file will be used when compiling the new binary. The generated C++ file will put the cache into the `internalBinding('code_cache')` object with the module ids as keys 4. The new binary tries to read the code cache from `internalBinding('code_cache')` and use it to compile builtin modules. If the cache is used, it will put the id into `require('internal/bootstrap/cache').compiledWithCache` for bookkeeping, otherwise the id will be pushed into `require('internal/bootstrap/cache').compiledWithoutCache` This patch also added tests that verify the code cache is generated and used when compiling builtin modules. The binary with code cache: - Is ~1MB bigger than the binary without code cahe - Consumes ~1MB less memory during start up - Starts up about 60% faster
I have updated this PR based on the feedback:
Note that this patch still does not affect normal workflows and releases. To use the code cache by default we'll need to make changes to I think it's a good start to land this first and iterate on the build files later (and ideally |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super neat!
lib/internal/bootstrap/loaders.js
Outdated
@@ -184,7 +223,7 @@ | |||
if (id === loaderId) { | |||
return false; | |||
} | |||
return NativeModule.exists(id); | |||
return id === cacheId || NativeModule.exists(id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can it be handled as any other internal (in its own file)? That way it avoids conditional checks NativeModule.require
and NativeModule.nonInternalExists
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jdalton Good idea, I'll do that
test/code-cache/code-cache.status
Outdated
@@ -0,0 +1,21 @@ | |||
prefix v8-updates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefix code-cache
? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, copy paste error :P
tools/generate_code_cache.js
Outdated
v8::Local<v8::Uint8Array> ${defName}_array = | ||
v8::Uint8Array::New(${defName}_ab, 0, ${cache.length}); | ||
target->Set(context, | ||
OneByteString(isolate, "${key}"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use FIXED_ONE_BYTE_STRING
since the length is known at compile time?
`${cache.join(',')}\n};`; | ||
const initializer = ` | ||
v8::Local<v8::ArrayBuffer> ${defName}_ab = | ||
v8::ArrayBuffer::New(isolate, ${defName}_raw, ${cache.length}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, but maybe as a future optimization we could use a single array buffer for all of these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax Do you mean creating one buffer ane using offsets on it when creating the Uint8Arrays? I think that's doable but I am not sure how big a performance impact that would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that’s what I mean – I wouldn’t expect it to make a huge difference, and it’s definitely not something that needs to be thought about in this PR :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joyeecheung For a reference, whenever it's tackled, you can check out v8-compile-cache which uses a single buffer and an offset map.
Makefile
Outdated
@@ -91,6 +91,22 @@ $(NODE_G_EXE): config.gypi out/Makefile | |||
$(MAKE) -C out BUILDTYPE=Debug V=$(V) | |||
if [ ! -r $@ -o ! -L $@ ]; then ln -fs out/Debug/$(NODE_EXE) $@; fi | |||
|
|||
CODE_CACHE_DIR ?= out/$(BUILDTYPE)/obj.target | |||
CODE_CACHE_FILE ?= $(CODE_CACHE_DIR)/node_code_cache.cc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think most all generated code is in out/$(BUILDTYPE)/obj/gen
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax Yes, not sure why I picked obj.target
...thanks for pointing that out, I'll fix it
@jdalton @addaleax Thanks for the reviews, updated.
EDIT: forgot to exclude the |
Relying on V8's code caching sanity checks to catch this is a bad idea. The only check V8 performs wrt source string is that its length is as expected. You could modify the source but keep the length, and V8 will happily accept the cache data. You will end up with confused developers who cannot figure out why their changes to something in Please perform some more surgery to the build files, with this PR or later :) |
@hashseed Thanks for the reminder, I'll add a FIXME there. I think one way to solve the mismatch issue is to calculate checksums for the source both in js2c and in |
// NativeModule.prototype.compile | ||
const script = new vm.Script(code, { | ||
filename: `${key}.js`, | ||
produceCachedData: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Script#createCachedData has landed so i wonder if maybe there's another way to do this so we can snapshot them after initial evaluation. (maybe build node with a --produce-snapshots flag?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devsnek Did you mean snapshot or code cache? I am not sure what initial evaluation
means, right now to make enough progress on the snapshot front we still need some proper refactoring of node.cc
and node.js
and pick out the parts that can be snapshotted. Code cache is context-free so it's a lower hanging fruit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah sorry for using some confusing terms. I mean we should see if we can produce code caches right after the js runs but before we run user code or smth similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devsnek Does that make a difference though? The options passed to ScriptCompiler
would probably still be the same and the script is still unbound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joyeecheung we would make a code cache from after the code first evaluated, which as i understand it is essentially a "fast-forward" to that state when you use the code cache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devsnek Did you mean creating the code cache out of a Function
? Otherwise I cannot see a difference if the unbound scripts are produced by passing the same arguments to CompileUnboundScript
? I have not checked but I'd expect the code cache produced here and the code cache produced in the loaders to be the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joyeecheung right now this pr creates the caches before evaluation. i'm suggesting we create them after evaluation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devsnek Did you mean we can create the code cache after unbound_script->BindToCurrentContext()
and bounded_script->Run()
? I am still not sure what difference that makes and how to get the core to spit the cache out to a file without relying on any environment variable or flags in loaders.js
...maybe we can leave that for a future PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me clear up some confusion :)
This current PR creates the code cache right after each script compiles. At that time, V8 only compiled the toplevel function. Any inner functions that are called during bootstrapping needs to be compiled as we go. We call this lazy compilation. Lazily compiled functions are not part of the code cache.
Script#createCachedData introduces a way to create the code cache at arbitrary time, as long as you have the v8::UnboundScript object. The benefit is that if you call it after you have executed the script for bootstrapping, lazily compiled functions during that execution are now also included in the code cache. Therefore, when bootstrapping with code cache, you no longer need to compile previously lazily compiled functions.
Neither of the two ways of code caching captures state (to fast-forward to). Please do not confuse code cache with startup snapshot :)
CI before landing https://ci.nodejs.org/job/node-test-pull-request/15643/ |
Landed in 4750ce2, thanks! |
This patch speeds up the startup time and reduce the startup memory footprint by using V8 code cache when comiling builtin modules. The current approach is demonstrated in the `with-code-cache` Makefile target (no corresponding Windows target at the moment). 1. Build the binary normally (`src/node_code_cache_stub.cc` is used), by now `internalBinding('code_cache')` is an empty object 2. Run `tools/generate_code_cache.js` with the binary, which generates the code caches by reading source code of builtin modules off source code exposed by `require('internal/bootstrap/cache').builtinSource` and then generate a C++ file containing static char arrays of the code cache, using a format similar to `node_javascript.cc` 3. Run `configure` with the `--code-cache-path` option so that the newly generated C++ file will be used when compiling the new binary. The generated C++ file will put the cache into the `internalBinding('code_cache')` object with the module ids as keys 4. The new binary tries to read the code cache from `internalBinding('code_cache')` and use it to compile builtin modules. If the cache is used, it will put the id into `require('internal/bootstrap/cache').compiledWithCache` for bookkeeping, otherwise the id will be pushed into `require('internal/bootstrap/cache').compiledWithoutCache` This patch also added tests that verify the code cache is generated and used when compiling builtin modules. The binary with code cache: - Is ~1MB bigger than the binary without code cahe - Consumes ~1MB less memory during start up - Starts up about 60% faster PR-URL: #21405 Reviewed-By: John-David Dalton <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Gus Caplan <[email protected]>
This patch speeds up the startup time and reduce the startup memory footprint by using V8 code cache when comiling builtin modules. The current approach is demonstrated in the `with-code-cache` Makefile target (no corresponding Windows target at the moment). 1. Build the binary normally (`src/node_code_cache_stub.cc` is used), by now `internalBinding('code_cache')` is an empty object 2. Run `tools/generate_code_cache.js` with the binary, which generates the code caches by reading source code of builtin modules off source code exposed by `require('internal/bootstrap/cache').builtinSource` and then generate a C++ file containing static char arrays of the code cache, using a format similar to `node_javascript.cc` 3. Run `configure` with the `--code-cache-path` option so that the newly generated C++ file will be used when compiling the new binary. The generated C++ file will put the cache into the `internalBinding('code_cache')` object with the module ids as keys 4. The new binary tries to read the code cache from `internalBinding('code_cache')` and use it to compile builtin modules. If the cache is used, it will put the id into `require('internal/bootstrap/cache').compiledWithCache` for bookkeeping, otherwise the id will be pushed into `require('internal/bootstrap/cache').compiledWithoutCache` This patch also added tests that verify the code cache is generated and used when compiling builtin modules. The binary with code cache: - Is ~1MB bigger than the binary without code cahe - Consumes ~1MB less memory during start up - Starts up about 60% faster PR-URL: #21405 Reviewed-By: John-David Dalton <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Gus Caplan <[email protected]>
Notable changes: * build: * Node.js should now be about 60% faster to startup than the previous version, thanks to the use V8's code cache feature for core modules. [#21405](#21405) * dns: * An experimental promisified version of the dns module is now available. Give it a try with `require('dns').promises`. [#21264](#21264) * fs: * `fs.lchown` has been undeprecated now that libuv supports it. [#21498](#21498) * lib: * `Atomics.wake` is being renamed to `Atomics.notify` in the ECMAScript specification ([reference](tc39/ecma262#1220)). Since Node.js now has experimental support for worker threads, we are being proactive and added a `notify` alias, while emitting a warning if `wake` is used. [#21413](#21413) [#21518](#21518) * n-api: * Add API for asynchronous functions. [#17887](#17887) * util: * `util.inspect` is now able to return a result instead of throwing when the maximum call stack size is exceeded during inspection. [#20725](#20725) * vm: * Add `script.createCachedData()`. This API replaces the `produceCachedData` option of the `Script` constructor that is now deprecated. [#20300](#20300) * worker: * Support for relative paths has been added to the `Worker` constructor. Paths are interpreted relative to the current working directory. [#21407](#21407) PR-URL: #21629
build: speed up startup with V8 code cache
This patch speeds up the startup time and reduce the startup memory
footprint by using V8 code cache when comiling builtin modules.
The current approach is demonstrated in the
with-code-cache
Makefile target (no corresponding Windows target at the moment).
src/node_code_cache_stub.cc
is used),by now
internalBinding('code_cache')
is an empty objecttools/generate_code_cache.js
with the binary, which generatesthe code caches by reading source code of builtin modules off source
code exposed by
require('internal/bootstrap/cache').builtinSource
and then generate a C++ file containing static char arrays of the
code cache, using a format similar to
node_javascript.cc
configure
with the--code-cache-path
option so thatthe newly generated C++ file will be used when compiling the
new binary. The generated C++ file will put the cache into
the
internalBinding('code_cache')
object with the moduleids as keys
internalBinding('code_cache')
and use it to compilebuiltin modules. If the cache is used, it will put the id
into
require('internal/bootstrap/cache').compiledWithCache
for bookkeeping, otherwise the id will be pushed into
require('internal/bootstrap/cache').compiledWithoutCache
This patch also added tests that verify the code cache is
generated and used when compiling builtin modules.
The binary with code cache:
perf stat ./node -e ";"
was ~80ms and is now ~40ms.Performance stats
OS: Darwin Kernel Version 17.5.0
Compiler: Apple LLVM version 9.1.0 (clang-902.0.39.1)
CPU: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
Benchmark result:
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes