-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fs: initial experimental promisified API #15485
Conversation
Introduces an experimental promises API for the fs module. The API is accessible via `fs.async`, e.g. ```js const { async:fs } = require('fs'); fs.access('.') .then(console.log) .catch(console.error); ``` Most fs functions are supported. There are some variations from the base fs module (for instance, `fs.async.realpath()` uses the faster libuv realpath implementation rather than the slower js implementation. This does not yet include documentation updates.
Note that these are not |
This seems confusing given that non-
Why? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Note that these are not
util.promisify
wrappers. These are implementations of the fs functions with Promises without any wrapping.
In that case, the [util.promifiy.custom]
properties on the original functions should still be set to the “proper” async implementation wherever that makes sense
req); | ||
} catch (err) { | ||
promiseReject(promise, err); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tried making all of these functions async function
s to have implicit try/catch+reject? I think unless you’re having any await
in it that’s all there they are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, was considering that already :-)
configurable: true, | ||
get() { | ||
return require('internal/async/fs'); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not value
+ writable: false
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to avoid the experimental warning from printing until someone explicitly goes to use the new API. We can switch this later once it's non-experimental
src/node_file.cc
Outdated
if (promise->IsPromise()) { | ||
if (promise.As<Promise>()->State() != Promise::kPending) return; | ||
Local<Promise::Resolver> resolver = promise.As<Promise::Resolver>(); | ||
resolver->Resolve(context, arg).FromJust(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yes, thanks for the reminder :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw, how and where to use InternalCallbackScope
is about as clear as mud right now :-( ... any chance of having some docs written up on it? Even if only in the form of some code comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, sure. From node.h
:
* `MakeCallback()` is a wrapper around this class as well as
* `Function::Call()`. Either one of these mechanisms needs to be used for
* top-level calls into JavaScript (i.e. without any existing JS stack).
do you have any specific suggestions for how to improve that? should resolving promise state be mentioned explicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll work up some language.
src/node_file.cc
Outdated
Local<Value> promise = | ||
object()->Get(context, env->oncomplete_string()).ToLocalChecked(); | ||
if (promise->IsPromise()) { | ||
if (promise.As<Promise>()->State() != Promise::kPending) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is this not true? Can you make this a check or, alternatively, add a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 .. yeah, was already looking at making this a CHECK. It should always be true.
src/node_file.cc
Outdated
Local<Value> argv[2]; | ||
argv[0] = Null(env->isolate()); | ||
argv[1] = arg; | ||
MakeCallback(env->oncomplete_string(), 2, argv); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arraysize(argv)
:)
Yeah, I'm thinking I'll just change it to
Performance mainly. And lack of backwards compatibility concerns. The key reason we ended up having to revert back to the old realpath implementation is because changing broke existing code. We don't have that issue here. Also, if we kept the js implementation, we wouldn't really gain any performance advantages over simply doing |
Fwiw, I think it might be cleaner to introduce |
const getPathFromURL = internalURL.getPathFromURL; | ||
|
||
const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the issue with accessing internalFs.
For example realpathCacheKey
is only used in one location, why create a module level variable to manage instead of being explicit and accessing it from internalFs
the one time?
@@ -283,35 +190,41 @@ fs.access = function(path, mode, callback) { | |||
throw new errors.TypeError('ERR_INVALID_CALLBACK'); | |||
} | |||
|
|||
if (handleError((path = getPathFromURL(path)), callback)) | |||
return; | |||
handleError(path = getPathFromURL(path)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, my first thought when I saw the change to handle errors sync was that it is a good idea but after having a second thought I am not so sure about that anymore. The path could definitely be passed in by users and therefore handling the error in the callback would probably be the best thing to stick to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, argument validation error handling is inconsistent, with some errors resulting in a throw and others being passed to the callback. I'm good with passing them to the callback, it just needs to be consistent (e.g. all validation errors always going to the callback or always throwing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that they should be consistent but as long as it is not a "programmers" error I think it should land in the callback.
|
||
if (!nullCheck(path, callback)) | ||
return; | ||
if (typeof path !== 'string' && !Buffer.isBuffer(path)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually accept all TypedArrays right now if I am not mistaken. Did you want to limit it to buffers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll double check
req.oncomplete = makeCallback(callback); | ||
binding.access(pathModule._makeLong(path), mode, req); | ||
}; | ||
|
||
fs.accessSync = function(path, mode) { | ||
handleError((path = getPathFromURL(path))); | ||
fs.accessSync = function(path, mode = fs.F_OK) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the validation of the mode is later it would be better performance wise to check for undefined (the Number.isInteger
check is obsolete in that case). But this might be over optimizing stuff.
@@ -329,9 +242,9 @@ Object.defineProperty(fs.exists, internalUtil.promisify.custom, { | |||
|
|||
|
|||
fs.existsSync = function(path) { | |||
handleError(path = getPathFromURL(path)); | |||
nullCheck(path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -827,117 +808,192 @@ fs.ftruncate = function(fd, len, callback) { | |||
if (typeof len === 'function') { | |||
callback = len; | |||
len = 0; | |||
} else if (len === undefined) { | |||
} else if (len == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the explicit version is actually better performance wise. But do we really want to support using null at all? I did not check but I guess you just ported that over from the c++ layer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, this simplifies an additional check that was being done in C. There's an existing comment in the C code about the inconsistency here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think == null is optimized anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benjamingr that is currently not correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #15178.
return; | ||
if (!nullCheck(path, callback)) return; | ||
var req = new FSReqWrap(); | ||
mode = modeNum(0o777); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The modeNum wrapping here is obsolete. I think you actually wanted to do
if (typeof mode === 'function') {
callback = mode;
mode = 0o777;
} else {
mode = modeNum(mode, 0o777);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same applies to other functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
||
if (typeof path !== 'string' && !Buffer.isBuffer(path)) { | ||
throw new errors.TypeError('ERR_INVALID_ARG_TYPE', 'path', | ||
['string', 'Buffer', 'URL']); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be great if all the errors could also contain the actual value.
}); | ||
bench.start(); | ||
for (let i = 0; i < n; i++) { | ||
access(__filename, () => countdown.dec()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced that running 2e5 calls in the same tick is actually a good way of benchmarking this. I think it would be better to either use a recursive strategy to see the individual call length or a batched recursive call that batches up to lets say 500 calls together.
This applies to all these benchmarks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, ideally if we can get a more applicative benchmark that'd be the best
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, the promises version always queue the callback to run later every time here which would be disproportionately represented in the above benchmark (I think)
@nodejs/tsc for awareness |
nit: in the PR abstract |
re: fs.fastRealpath please refer to these references about age old security vulns found in realpath(3). The potential off by one sec vuln was why it wasn't built on top of originally. I landed a fix into libuv that should guard against this, but we should likely have a larger discussion about this before introducing the api |
Can we get a benchmark comparing the two? |
@benjamingr ... there are some in the original post at the top. e.g.
The second entry in each run uses |
cc @bmeurer |
@@ -259,7 +260,9 @@ E('ERR_NO_CRYPTO', 'Node.js is not compiled with OpenSSL crypto support'); | |||
E('ERR_NO_ICU', '%s is not supported on Node.js compiled without ICU'); | |||
E('ERR_NO_LONGER_SUPPORTED', '%s is no longer supported'); | |||
E('ERR_OUT_OF_RANGE', 'The "%s" argument is out of range'); | |||
E('ERR_OUTOFBOUNDS', 'The "%s" argument is out of bounds'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe better ERR_OUT_OF_BOUNDS (to be consistent with prev line)?
Why is this labelled |
It makes a number of breaking changes including changes to error handling in the existing fs apis |
|
||
if (!Number.isInteger(len) || len < 0) { | ||
throw new errors.TypeError('ERR_INVALID_ARG_TYPE', 'len', | ||
'unsigned integer'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't a lot of these checks be done in c++? it would be a nice performance win.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulled many of these out of C++ in order to get better consistency with the move to internal/errors and consistency with where the type checking is happening. As it is, the type checking is rather... inconsistent. Will look at it again, tho.
Work continued in #17739 |
Introduces an experimental promises API for the fs module. The API is accessible via
fs.async
, e.g.This is intended to land as Experimental.
Most fs functions are supported. There are some variations from
the base fs module (for instance,
fs.async.realpath()
uses thefaster libuv realpath implementation rather than the slower js
implementation.
This does not yet include documentation updates.
Semver-major because this makes substantive changes to error handling in the existing
fs
module.Benchmarks are included. For instance,
access
copyFile
As illustrated by the benchmarks, the promise versions can be faster than the traditional callbacks, but the results are inconsistent. This appears largely to do with gc (there are a lot of promise objects created that need to be cleaned up).
Having the Promise version does not obsolete the callback version, as the creation and management of the promise objects can be fairly expensive.
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)