modules: significantly improve require performance #25362
1) Add more benchmark options to properly verify the gains. This makes sure the benchmark also tests requiring the same module again instead of loading each module only once.
2) Remove dead code: the array check is obsolete, as this function will only be called internally with prepared data, which is always an array.
3) Simpler code: it was possible to use more direct logic to avoid some branches.
4) Inline try / catch: the separate function is not required anymore, since V8 is able to produce performant code with it.
5) var -> let / const and fewer lines.
6) Update the require.extensions description: the comment was outdated.
7) Improve extension handling: this is a performance optimization to prevent loading the extensions on each uncached require call. It uses proxies to intercept changes and receives the necessary information by doing that.
9576b54 to 9c54e43
// Do not expose this to user land even with --expose-internals.
const loaderId = 'internal/bootstrap/loaders';

// Create a native module map to tell if a module is internal or not.
Heads up: I think this does something similar but conflicts with #25352?
When I added that parameter it was because there were Google (IIRC) projects using this function and were expecting the old return value type. To avoid breaking their (and anyone else's) projects, I added the new parameter for backwards compatibility. I do not remember which projects they were, but just wanted to put that out there.
CI: https://ci.nodejs.org/job/node-test-pull-request/19953/
@mscdex thanks for pointing that out. I'll have a look at the modules again in the next few days. First I want CITGM to be happy and I already stumbled upon two minor things that I just fixed.
There are some issues involving the proxy. This tries to circumvent the issue by not making any copy of the newly set object.
I just marked this as work in progress. I am keeping it open to pull out smaller chunks, but I do not plan on pushing everything in one PR. I already moved out a small part, and for v12 I'll only pull out a couple of non-semver-major changes. All semver-major parts should come up for v13 instead.
This publicly documents that adding native module names will resolve the added entry instead of the native module. It also updates the description why extensions are deprecated. PR-URL: nodejs#26971 Refs: nodejs#25362 Reviewed-By: Gus Caplan <[email protected]> Reviewed-By: Vse Mozhet Byt <[email protected]>
PR-URL: nodejs#26970 Refs: nodejs#25362 Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
Add more benchmark options to properly verify the gains. This makes sure the benchmark also tests requiring the same module again instead of only loading each module only once. PR-URL: nodejs#26970 Refs: nodejs#25362 Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
Moving `try / catch` into separate functions is not necessary anymore due to V8 optimizations. PR-URL: nodejs#26970 Refs: nodejs#25362 Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
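
To illustrate the change this commit describes, here is a minimal sketch of the two patterns. The function names are hypothetical and are not taken from the Node.js source. It only shows the shape of the refactoring, not the actual loader code.

```js
// Hypothetical helper names, for illustration only.

// Older pattern: keep try / catch in its own small function so the
// caller itself contains no try / catch.
function tryParse(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    // JSON.parse never returns undefined, so it works as a failure marker.
    return undefined;
  }
}

function parseWithHelper(text) {
  const parsed = tryParse(text);
  return parsed === undefined ? {} : parsed;
}

// Inlined pattern: the try / catch lives directly in the caller.
function parseInline(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    return {};
  }
}
```

Both functions behave the same for valid and invalid input. The inlined version simply has one function fewer to maintain.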
This adds the `path` property to the module object. It contains the current directory as path. That is necessary to add an extra caching layer. It also makes sure the `id` uses a default in case it's not set. Otherwise the `path.dirname(id)` command could fail. PR-URL: nodejs#26970 Refs: nodejs#25362 Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
This adds an extra modules caching layer that operates on the parent's `path` property and the current require argument. That together can be used as unique identifier to speed up loading the same module more than once. It is a cache on top of the current modules cache. It has the nice feature that this cache does not only work in the same file but it works for the whole current directory. So if the same file is loaded in any other file from the same directory, it will also hit this cache instead of having to resolve the file again. To keep it backwards compatible with the old modules cache, it detects invalidation of that cache. PR-URL: nodejs#26970 Refs: nodejs#25362 Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
This publicly documents that adding native module names will resolve the added entry instead of the native module. It also updates the description why extensions are deprecated. PR-URL: #26971 Refs: #25362 Reviewed-By: Gus Caplan <[email protected]> Reviewed-By: Vse Mozhet Byt <[email protected]>
This removes a lot of code that has no functionality anymore. All Node.js internal code calls `_resolveLookupPaths` with two arguments. The code that validates `index.js` is not required at all as we check for these files anyway, so it's just redundant code that should be removed. PR-URL: nodejs#26983 Refs: nodejs#25362 Reviewed-By: Jan Krems <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Rich Trott <[email protected]> Reviewed-By: Michaël Zasso <[email protected]>
This publicly documents that adding native module names will resolve the added entry instead of the native module. It also updates the description why extensions are deprecated. PR-URL: #26971 Refs: #25362 Reviewed-By: Gus Caplan <[email protected]> Reviewed-By: Vse Mozhet Byt <[email protected]> Signed-off-by: Beth Griggs <[email protected]>
PR-URL: #26970 Refs: #25362 Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Matteo Collina <[email protected]> Signed-off-by: Beth Griggs <[email protected]>
Add more benchmark options to properly verify the gains. This makes sure the benchmark also tests requiring the same module again instead of only loading each module only once. PR-URL: #26970 Refs: #25362 Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Matteo Collina <[email protected]> Signed-off-by: Beth Griggs <[email protected]>
Moving `try / catch` into separate functions is not necessary anymore due to V8 optimizations. PR-URL: #26970 Refs: #25362 Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Matteo Collina <[email protected]> Signed-off-by: Beth Griggs <[email protected]>
This adds the `path` property to the module object. It contains the current directory as path. That is necessary to add an extra caching layer. It also makes sure the `id` uses a default in case it's not set. Otherwise the `path.dirname(id)` command could fail. PR-URL: #26970 Refs: #25362 Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Matteo Collina <[email protected]> Signed-off-by: Beth Griggs <[email protected]>
This adds an extra modules caching layer that operates on the parent's `path` property and the current require argument. That together can be used as unique identifier to speed up loading the same module more than once. It is a cache on top of the current modules cache. It has the nice feature that this cache does not only work in the same file but it works for the whole current directory. So if the same file is loaded in any other file from the same directory, it will also hit this cache instead of having to resolve the file again. To keep it backwards compatible with the old modules cache, it detects invalidation of that cache. PR-URL: #26970 Refs: #25362 Reviewed-By: Guy Bedford <[email protected]> Reviewed-By: Matteo Collina <[email protected]> Signed-off-by: Beth Griggs <[email protected]>
Closing due to age and most important improvements already landed.
This is a significant performance boost for require in all cases but mainly for loaded modules. This came up recently as a bottleneck for @jasnell and @mcollina as far as I know.
Loading relative or absolute paths now has the same performance profile and loading node modules will be significantly faster than before.
The native startup won't be impacted by this, but the startup time of an actual application should improve. It is now also possible to use require lazily without sacrificing performance.
I introduced a new cache which sits on top of the former main cache and caches the filename of already loaded modules. This is then used to check for lazy require calls and for identical require calls from different files in the same directory.
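
A minimal sketch of such a layered cache, assuming the key is the requiring file's directory plus the require argument (as the related commit message describes). `mainCache`, `resolveFilename`, and `load` are hypothetical stand-ins and not the real loader internals. The resolver here is deliberately naive.

```js
// Sketch only: all names below are hypothetical stand-ins.
const mainCache = new Map();            // filename -> module object
const relativeResolveCache = new Map(); // `${parentDir}\0${request}` -> filename
let resolveCalls = 0;                   // counts the (expensive) resolution steps

// Stand-in for the expensive filename resolution.
function resolveFilename(parentDir, request) {
  resolveCalls++;
  return `${parentDir}/${request.replace(/^\.\//, '')}.js`;
}

function load(parentDir, request) {
  const key = `${parentDir}\x00${request}`;
  const cachedFilename = relativeResolveCache.get(key);
  if (cachedFilename !== undefined) {
    const cachedModule = mainCache.get(cachedFilename);
    if (cachedModule !== undefined) {
      // Fast path: no resolution needed.
      return cachedModule.exports;
    }
    // The entry was removed from the main cache, so drop the stale key.
    relativeResolveCache.delete(key);
  }

  const filename = resolveFilename(parentDir, request);
  let mod = mainCache.get(filename);
  if (mod === undefined) {
    // Stand-in for actually loading and evaluating the file.
    mod = { filename, exports: { loadedFrom: filename } };
    mainCache.set(filename, mod);
  }
  relativeResolveCache.set(key, filename);
  return mod.exports;
}
```

Because the key uses the directory rather than the requiring file, two files in the same directory that call `load` with the same request share one entry. Deleting a module from `mainCache` makes the next call resolve again.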
I repaired the stat cache which was broken due to a minor mistake in a former commit. While doing that I just used a hard limit instead of the actual former implementation.
I removed multiple redundant path.resolve calls and refactored code for simplicity and fewer code branches where possible.
The extensions are now behind a proxy, as otherwise it's necessary to recompute them each time a file cannot be required without the extension (which is the default for most calls).
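
The idea can be sketched with a plain `Proxy` that recomputes the list of extension names only when the object is modified. `createExtensionRegistry` and `getKeys` are hypothetical names used for illustration. The actual change in this PR may differ.

```js
// Sketch only: `createExtensionRegistry` and `getKeys` are hypothetical.
function createExtensionRegistry(initial) {
  const target = Object.assign(Object.create(null), initial);
  // Computed once up front, then only when the object changes.
  let cachedKeys = Object.keys(target);

  const extensions = new Proxy(target, {
    set(obj, prop, value) {
      obj[prop] = value;
      cachedKeys = Object.keys(obj);
      return true;
    },
    deleteProperty(obj, prop) {
      delete obj[prop];
      cachedKeys = Object.keys(obj);
      return true;
    }
    // Note: Object.defineProperty() on the proxy is not intercepted here
    // and would need its own trap.
  });

  return { extensions, getKeys: () => cachedKeys };
}
```

Readers of the key list pay nothing per lookup, while writers pay the recomputation cost. That matches the assumption that extensions change rarely and are consulted on many require calls.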
`Module._pathCache` is now going to cache all requests in combination with the path, including failed requests. At the same time I removed the package cache, which is now obsolete due to that.
I tried to stay backwards compatible, while some improvements required me to change `DEP0019` (`require('.')` resolved outside directory) to end-of-life (this was already the case but that was reverted due to a mistake in the implementation). I also slightly changed the `_resolveLookupPaths` and `_resolveFilename` implementations. Both got a new argument in 2017, while these have not been used so far in the wild as far as I can tell (I checked `Gzemnid` and could not find any hits about these arguments). This is clearly semver-major, and even though I am able to backport some parts of this, there's always a chance that even different caching might have a slightly different behavior for users than the one before.
I also documented an existing behavior which was not yet documented: placing new entries in the `require.cache`.
I plan on splitting this up in smaller commits but I wanted to get some general feedback first.
One thing we might want to have a look at again is how much we cache and if we want to clear some caches after each require has fully resolved (including the child require calls) or if we are just fine keeping everything in the cache.
Checklist
- `make -j4 test` (UNIX), or `vcbuild test` (Windows) passes