src: add snapshot support for embedder API #45888
👍 to see this happening! Before I dive deeper into the code: the snapshot blob we currently have is really a per-process thing. It contains a template for the vm contexts, a template for creating worker instances, and a template for creating the main instance. For now it probably makes more sense to make it a parameter to something like |
@joyeecheung Aren't snapshots really a per-Isolate concept, though? What would be the downside of embedders potentially passing two different snapshots to different Isolates, for example?
Just to be clear, this PR does allow embedders to consume snapshots built with |
Force-pushed: af893f5 to 5e487f1
Yes, and our SnapshotData is...currently not. That's a TODO.
Because they'd be passing two snapshot blobs that are actually not of the same structure. Currently they are passing a pointer to a per-process snapshot blob to a per-isolate API, so if we add per-isolate snapshots later (which is probably a building block of the per-process snapshot), we need to figure out which type the blob actually is. It's also somewhat confusing that there is one
Yes and...no. It's a user-land snapshot but it's not per-isolate, so we build it as a user-land snapshot and consume it as a built-in snapshot. That is fine for now because we technically only support one isolate (the one of the main instance) in the snapshot, and there is currently no way for users to pass any more snapshots into any other isolate that might be in the process.
That's actually where the problem is because technically you'd want to pass a snapshot data blob to |
So ... let me rephrase things a bit: Currently, the Node.js snapshots, both builtin and userland, are per-Node.js-Isolate-tree, where "Node.js Isolate tree" includes a main Isolate and (at least potentially) the Isolates of worker threads spawned from that main Isolate. Embedders don't really know about those "child" Isolates, so from their perspective, the snapshot is per-Isolate -- that's what I mean here. It probably makes sense to be more specific with the wording here. But as there is no true per-process data associated with snapshots right now (and I don't think there could ever really be, given their nature), I don't think it makes sense to say they are per-process.
It's always easy enough to distinguish different types of snapshots from this API starting point -- there could just be two different subclasses of
Then maybe it could help to clarify, what is the eventual plan for user-land snapshots (from a non-embedder point of view) here? In a non-embedder context, users can only create new Isolates through the Worker API or addons, but it doesn't seem to me like those would lead to a deviation in APIs for embedders.
I think that might be useful in the future, yes.
As mentioned above -- it does seem to be that both of these things are really per-Node.js-Isolate-tree, not per-process. |
OK, I think it makes a bit more sense if we conceptualize the snapshot as per-isolate-tree. Are we supposed to support multiple isolate trees though? I don't think the snapshot is really ready for that yet... BTW,
Is there any plan to support creating workers in the embedder API? Or are the embedders only supposed to build workers on top of existing APIs? I think the snapshot support in the JS-land worker API should be blob-based, not file-based, as that's programmable unlike the startup of the main instance, and technically, the embedder API should be blob-based too, as that's also programmable. The intermediate file seems to be a weird distraction to me, we only do that for the main instance because when you start Node.js from the command line there's not really a better alternative to specify the blob. |
I guess the question would be, in what ways are those actually interdependent? As long as there is no shared global state that is affected, this should be fine.
That’s not necessarily my plan… we can do it, and I don’t really see anything speak against it. At the same time, for regular users, I think you already brought up better alternatives: For Workers, there should likely be a more programmatic approach than CLI flags, and for embedders, creating a snapshot through an API method may be convenient. For the application that this patch was evaluated against, it seems like a perfectly fine approach to build a snapshot with a regular Node.js binary (using the same version and configure flags of course), then compile Node.js again with the generated snapshot embedded.
Great point – I’m happy to provide a PR for making the Line 460 in b66ae39
Well, Workers are already kind of embedded Node.js instances… :) I don’t see any reason to change the embedder API in this regard.
Totally agree – the main reason I kept using |
As discussed in nodejs#45888, using a global `BuiltinLoader` instance is probably undesirable in a world in which embedders are able to create Node.js Environments with different sources and therefore mutually incompatible code caching properties. This PR makes it so that `BuiltinLoader` is no longer a global singleton and instead only shared between `Environment`s that have a direct relation to each other, and addresses a few thread safety issues along with that.
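The ownership change described in that commit message can be sketched in plain C++. This is only an illustration under stated assumptions: `BuiltinLoaderSketch`, `EnvironmentSketch` and `SpawnChild()` are hypothetical names invented here, not Node.js's actual classes or API. The idea shown is that a root environment owns a fresh loader, environments derived from it share that loader through a `std::shared_ptr`, and unrelated environments never see each other's code cache.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a builtin loader; not Node.js's real class.
class BuiltinLoaderSketch {
 public:
  // Record a code-cache entry for a builtin id. A mutex guards the map
  // because related environments may run on different threads.
  void StoreCache(const std::string& id, const std::string& cache) {
    std::lock_guard<std::mutex> lock(mutex_);
    code_cache_[id] = cache;
  }

  bool HasCache(const std::string& id) const {
    std::lock_guard<std::mutex> lock(mutex_);
    return code_cache_.count(id) != 0;
  }

 private:
  mutable std::mutex mutex_;
  std::map<std::string, std::string> code_cache_;
};

// Hypothetical stand-in for an Environment.
class EnvironmentSketch {
 public:
  // A root environment gets its own loader instead of a global singleton.
  EnvironmentSketch() : loader_(std::make_shared<BuiltinLoaderSketch>()) {}

  // A directly related environment (for example a worker) reuses the
  // loader of the environment it was spawned from.
  EnvironmentSketch SpawnChild() const { return EnvironmentSketch(loader_); }

  BuiltinLoaderSketch& loader() const { return *loader_; }

  bool SharesLoaderWith(const EnvironmentSketch& other) const {
    return loader_ == other.loader_;
  }

 private:
  explicit EnvironmentSketch(std::shared_ptr<BuiltinLoaderSketch> loader)
      : loader_(std::move(loader)) {}

  std::shared_ptr<BuiltinLoaderSketch> loader_;
};
```

With this shape, a cache entry stored through one root environment is visible to its children but not to a second, independently created root, which is the isolation property the commit message is after.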
@joyeecheung Fwiw, I’ve opened #45942 to take us one further step there regarding the |
SGTM, I don't think that's necessarily a blocker for this PR though if the APIs here assert that the embedders cannot try to create isolates from multiple Node.js snapshots in the same process (otherwise, that's unsafe without isolation of the builtin code cache). I feel more strongly about the file-based API though... it just seems weird to me that the users would use a per-process CLI flag to generate the blob and then consume it from the API with a |
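One way to picture the kind of assertion mentioned in this comment is a process-wide guard that remembers the first snapshot used and rejects any different one. The sketch below is an assumption-laden illustration, not code from the PR: `SnapshotSketch` and `SingleSnapshotGuard` are hypothetical names, and an atomic compare-exchange is simply one possible way to make the check thread-safe.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical placeholder for a snapshot object; identity is all that matters.
struct SnapshotSketch {
  int id;
};

// Hypothetical guard: accepts the first snapshot registered in the process
// and any later registration of that same snapshot, rejects all others.
class SingleSnapshotGuard {
 public:
  bool Register(const SnapshotSketch* snapshot) {
    const SnapshotSketch* expected = nullptr;
    // If nothing is registered yet, atomically claim the slot.
    if (active_.compare_exchange_strong(expected, snapshot)) return true;
    // Otherwise `expected` now holds the snapshot registered earlier.
    return expected == snapshot;
  }

 private:
  std::atomic<const SnapshotSketch*> active_{nullptr};
};
```

An embedder-facing setup function could call `Register()` and fail an assertion when it returns `false`; whether the real API does anything like this is not stated in the thread.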
As discussed in #45888, using a global `BuiltinLoader` instance is probably undesirable in a world in which embedders are able to create Node.js Environments with different sources and therefore mutually incompatible code caching properties. This PR makes it so that `BuiltinLoader` is no longer a global singleton and instead only shared between `Environment`s that have a direct relation to each other, and addresses a few thread safety issues along with that. PR-URL: #45942 Reviewed-By: Joyee Cheung <[email protected]>
Force-pushed: 5e487f1 to d2746af
@joyeecheung Sooo … I did just rebase this PR with what is essentially what you are suggesting, in a way that I don’t think requires adding too much new code, see d2746af. This doesn’t currently work, because compiling the main script in the embedder test uses the code path that compiles it as a builtin, rather than what the |
@addaleax IIUC the issue is essentially that the snapshot preparation process is done in |
PR-URL: #45888 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Joyee Cheung <[email protected]>
This is a shared follow-up to 06bb6b4 and a466fea now that both have been merged. PR-URL: #46491 Refs: #45888 Refs: #46463 Reviewed-By: Joyee Cheung <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Darshan Sen <[email protected]> Reviewed-By: Tobias Nießen <[email protected]>
nodejs#45888 took the environment creation code out of the scope covered by the v8::TryCatch that we use to print early failures during environment creation. So e.g. when adding something that would fail in node.js, we get

```
node:internal/options:554: Uncaught Error: Should not query options before bootstrapping is done
```

This patch restores that by adding another v8::TryCatch for it:

```
node:internal/options:20
  ({ options: optionsMap } = getCLIOptions());
  ^
Error: Should not query options before bootstrapping is done
    at getCLIOptionsFromBinding (node:internal/options:20:32)
    at getOptionValue (node:internal/options:45:19)
    at node:internal/bootstrap/node:433:29
```
Add experimental support for loading snapshots in the embedder API by adding a public opaque wrapper for our `SnapshotData` struct and allowing embedders to pass it to the relevant setup functions. Where applicable, use these helpers to deduplicate existing code in Node.js’s startup path. This has shown a 40 % startup performance increase for a real-world application, even with the somewhat limited current support for built-in modules. The documentation includes a note about no guarantees for API or ABI stability for this feature while it is experimental. PR-URL: #45888 Reviewed-By: James M Snell <[email protected]> Reviewed-By: Joyee Cheung <[email protected]>
Notable changes:

deps:
* upgrade npm to 9.5.0 (npm team) #46673
* add ada as a dependency (Yagiz Nizipli) #46410

doc:
* add debadree25 to collaborators (Debadree Chatterjee) #46716
* add deokjinkim to collaborators (Deokjin Kim) #46444

doc,lib,src,test:
* rename --test-coverage (Colin Ihrig) #46017

lib:
* (SEMVER-MINOR) add aborted() utility function (Debadree Chatterjee) #46494

src:
* (SEMVER-MINOR) add initial support for single executable applications (Darshan Sen) #45038
* (SEMVER-MINOR) allow optional Isolate termination in node::Stop() (Shelley Vohr) #46583
* (SEMVER-MINOR) allow blobs in addition to `FILE*`s in embedder snapshot API (Anna Henningsen) #46491
* (SEMVER-MINOR) allow snapshotting from the embedder API (Anna Henningsen) #45888
* (SEMVER-MINOR) make build_snapshot a per-Isolate option, rather than a global one (Anna Henningsen) #45888
* (SEMVER-MINOR) add snapshot support for embedder API (Anna Henningsen) #45888
* (SEMVER-MINOR) allow embedder control of code generation policy (Shelley Vohr) #46368

stream:
* (SEMVER-MINOR) add abort signal for ReadableStream and WritableStream (Debadree Chatterjee) #46273

test_runner:
* add initial code coverage support (Colin Ihrig) #46017

url:
* replace url-parser with ada (Yagiz Nizipli) #46410

PR-URL: TODO
Notable changes: deps: * upgrade npm to 9.5.0 (npm team) #46673 * add ada as a dependency (Yagiz Nizipli) #46410 doc: * add debadree25 to collaborators (Debadree Chatterjee) #46716 * add deokjinkim to collaborators (Deokjin Kim) #46444 doc,lib,src,test: * rename --test-coverage (Colin Ihrig) #46017 lib: * (SEMVER-MINOR) add aborted() utility function (Debadree Chatterjee) #46494 src: * (SEMVER-MINOR) add initial support for single executable applications (Darshan Sen) #45038 * (SEMVER-MINOR) allow optional Isolate termination in node::Stop() (Shelley Vohr) #46583 * (SEMVER-MINOR) allow blobs in addition to `FILE*`s in embedder snapshot API (Anna Henningsen) #46491 * (SEMVER-MINOR) allow snapshotting from the embedder API (Anna Henningsen) #45888 * (SEMVER-MINOR) make build_snapshot a per-Isolate option, rather than a global one (Anna Henningsen) #45888 * (SEMVER-MINOR) add snapshot support for embedder API (Anna Henningsen) #45888 * (SEMVER-MINOR) allow embedder control of code generation policy (Shelley Vohr) #46368 stream: * (SEMVER-MINOR) add abort signal for ReadableStream and WritableStream (Debadree Chatterjee) #46273 test_runner: * add initial code coverage support (Colin Ihrig) #46017 url: * replace url-parser with ada (Yagiz Nizipli) #46410 PR-URL: #46725
#45888 took the environment creation code out of the scope covered by the v8::TryCatch that we use to print early failures during environment creation. So e.g. when adding something that would fail in node.js, we get ``` node:internal/options:554: Uncaught Error: Should not query options before bootstrapping is done ``` This patch restores that by adding another v8::TryCatch for it: ``` node:internal/options:20 ({ options: optionsMap } = getCLIOptions()); ^ Error: Should not query options before bootstrapping is done at getCLIOptionsFromBinding (node:internal/options:20:32) at getOptionValue (node:internal/options:45:19) at node:internal/bootstrap/node:433:29 ``` PR-URL: #46533 Reviewed-By: Chengzhong Wu <[email protected]> Reviewed-By: Anna Henningsen <[email protected]>
@addaleax do you mind opening a backport PR for this to land in v18? Thanks! |
The first commits here are #45885 + #45886 + #45887.
src: add snapshot support for embedder API
Add experimental support for loading snapshots in the embedder API
by adding a public opaque wrapper for our `SnapshotData` struct and
allowing embedders to pass it to the relevant setup functions.
Where applicable, use these helpers to deduplicate existing code
in Node.js’s startup path.
This has shown a 40 % startup performance increase for a real-world
application, even with the somewhat limited current support for
built-in modules.
The documentation includes a note about no guarantees for API or
ABI stability for this feature while it is experimental.
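
The "public opaque wrapper" approach from the description can be illustrated with a generic C++ pattern. The following is a minimal sketch under stated assumptions, not the API added by this PR: `OpaqueSnapshot`, `OpaqueSnapshotDeleter`, `LoadSnapshotFromBlob`, `SetupEnvironment` and `InternalSnapshotData` are all hypothetical names. It shows a forward-declared type whose layout stays private, a `std::unique_ptr` with a custom deleter so destruction happens inside the library, and a setup function that accepts an optional snapshot pointer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// ---- What a public header might expose (hypothetical) ----
class OpaqueSnapshot;  // layout intentionally hidden from embedders

struct OpaqueSnapshotDeleter {
  void operator()(const OpaqueSnapshot* data) const;
};
using OpaqueSnapshotPtr =
    std::unique_ptr<const OpaqueSnapshot, OpaqueSnapshotDeleter>;

// Returns nullptr when the blob is empty.
OpaqueSnapshotPtr LoadSnapshotFromBlob(const std::vector<char>& blob);

// Setup function taking an optional snapshot; nullptr means a cold start.
std::string SetupEnvironment(const OpaqueSnapshot* snapshot);

// ---- What stays inside the library (hypothetical) ----
struct InternalSnapshotData {
  std::vector<char> bytes;
};

class OpaqueSnapshot {
 public:
  explicit OpaqueSnapshot(InternalSnapshotData data) : data_(std::move(data)) {}
  const InternalSnapshotData& data() const { return data_; }

 private:
  InternalSnapshotData data_;
};

void OpaqueSnapshotDeleter::operator()(const OpaqueSnapshot* data) const {
  delete data;  // deletion happens where the full type is known
}

OpaqueSnapshotPtr LoadSnapshotFromBlob(const std::vector<char>& blob) {
  if (blob.empty()) return nullptr;
  return OpaqueSnapshotPtr(new OpaqueSnapshot(InternalSnapshotData{blob}));
}

std::string SetupEnvironment(const OpaqueSnapshot* snapshot) {
  if (snapshot == nullptr) return "bootstrapped from scratch";
  return "deserialized " + std::to_string(snapshot->data().bytes.size()) +
         " bytes";
}
```

Because embedders only ever hold a pointer to the incomplete type, the internal struct can change layout without touching the public signatures, which fits the description's note that API and ABI stability are not guaranteed while the feature is experimental.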