Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs: add recursive opendir/readdir #41439

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 35 additions & 0 deletions doc/api/fs.md
Original file line number Diff line number Diff line change
Expand Up @@ -1214,6 +1214,9 @@ a colon, Node.js will open a file system stream, as described by
<!-- YAML
added: v12.12.0
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/41439
description: Added `recursive` option.
- version:
- v13.1.0
- v12.16.0
Expand All @@ -1227,6 +1230,8 @@ changes:
* `bufferSize` {number} Number of directory entries that are buffered
internally when reading from the directory. Higher values lead to better
performance but higher memory usage. **Default:** `32`
* `recursive` {boolean} Resolved `Dir` will be an {AsyncIterable}
containing all sub files and directories. **Default:** `false`
Comment on lines +1233 to +1234
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably applies to all documented APIs?

* Returns: {Promise} Fulfills with an {fs.Dir}.

Asynchronously open a directory for iterative scanning. See the POSIX
Expand Down Expand Up @@ -1260,6 +1265,9 @@ closed after the iterator exits.
<!-- YAML
added: v10.0.0
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/41439
description: Added `recursive` option.
- version: v10.11.0
pr-url: https://github.com/nodejs/node/pull/22020
description: New option `withFileTypes` was added.
Expand All @@ -1269,6 +1277,7 @@ changes:
* `options` {string|Object}
* `encoding` {string} **Default:** `'utf8'`
* `withFileTypes` {boolean} **Default:** `false`
* `recursive` {boolean} **Default:** `false`
* Returns: {Promise} Fulfills with an array of the names of the files in
the directory excluding `'.'` and `'..'`.

Expand Down Expand Up @@ -3390,6 +3399,9 @@ const { openAsBlob } = require('node:fs');
<!-- YAML
added: v12.12.0
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/41439
description: Added `recursive` option.
- version: v18.0.0
pr-url: https://github.com/nodejs/node/pull/41678
description: Passing an invalid callback to the `callback` argument
Expand All @@ -3408,6 +3420,7 @@ changes:
* `bufferSize` {number} Number of directory entries that are buffered
internally when reading from the directory. Higher values lead to better
performance but higher memory usage. **Default:** `32`
* `recursive` {boolean} **Default:** `false`
* `callback` {Function}
* `err` {Error}
* `dir` {fs.Dir}
Expand Down Expand Up @@ -3526,6 +3539,9 @@ above values.
<!-- YAML
added: v0.1.8
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/41439
description: Added `recursive` option.
- version: v18.0.0
pr-url: https://github.com/nodejs/node/pull/41678
description: Passing an invalid callback to the `callback` argument
Expand Down Expand Up @@ -3555,6 +3571,7 @@ changes:
* `options` {string|Object}
* `encoding` {string} **Default:** `'utf8'`
* `withFileTypes` {boolean} **Default:** `false`
* `recursive` {boolean} **Default:** `false`
* `callback` {Function}
* `err` {Error}
* `files` {string\[]|Buffer\[]|fs.Dirent\[]}
Expand Down Expand Up @@ -5525,6 +5542,9 @@ object with an `encoding` property specifying the character encoding to use.
<!-- YAML
added: v12.12.0
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/41439
description: Added `recursive` option.
- version:
- v13.1.0
- v12.16.0
Expand All @@ -5538,6 +5558,7 @@ changes:
* `bufferSize` {number} Number of directory entries that are buffered
internally when reading from the directory. Higher values lead to better
performance but higher memory usage. **Default:** `32`
* `recursive` {boolean} **Default:** `false`
* Returns: {fs.Dir}

Synchronously open a directory. See opendir(3).
Expand Down Expand Up @@ -5581,6 +5602,9 @@ this API: [`fs.open()`][].
<!-- YAML
added: v0.1.21
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/41439
description: Added `recursive` option.
- version: v10.10.0
pr-url: https://github.com/nodejs/node/pull/22020
description: New option `withFileTypes` was added.
Expand All @@ -5594,6 +5618,7 @@ changes:
* `options` {string|Object}
* `encoding` {string} **Default:** `'utf8'`
* `withFileTypes` {boolean} **Default:** `false`
* `recursive` {boolean} **Default:** `false`
* Returns: {string\[]|Buffer\[]|fs.Dirent\[]}

Reads the contents of the directory.
Expand Down Expand Up @@ -6447,6 +6472,16 @@ The file name that this {fs.Dirent} object refers to. The type of this
value is determined by the `options.encoding` passed to [`fs.readdir()`][] or
[`fs.readdirSync()`][].

#### `dirent.path`

<!-- YAML
added: REPLACEME
-->

* {string}

The base path that this {fs.Dirent} object refers to.

### Class: `fs.FSWatcher`

<!-- YAML
Expand Down
47 changes: 47 additions & 0 deletions lib/fs.js
Original file line number Diff line number Diff line change
Expand Up @@ -1404,6 +1404,36 @@ function mkdirSync(path, options) {
}
}

// TODO(Ethan-Arrowood): Make this iterative too
function readdirSyncRecursive(path, origPath, options) {
nullCheck(path, 'path', true);
const ctx = { path };
const result = binding.readdir(pathModule.toNamespacedPath(path),
options.encoding, !!options.withFileTypes, undefined, ctx);
handleErrorFromBinding(ctx);
return options.withFileTypes ?
getDirents(path, result).flatMap((dirent) => {
return [
dirent,
...(dirent.isDirectory() ?
readdirSyncRecursive(
pathModule.join(path, dirent.name),
origPath,
options,
) : []),
];
}) :
result.flatMap((ent) => {
const innerPath = pathModule.join(path, ent);
const relativePath = pathModule.relative(origPath, innerPath);
const stat = binding.internalModuleStat(innerPath);
return [
relativePath,
...(stat === 1 ? readdirSyncRecursive(innerPath, origPath, options) : []),
];
});
}

/**
* Reads the contents of a directory.
* @param {string | Buffer | URL} path
Expand All @@ -1421,6 +1451,14 @@ function readdir(path, options, callback) {
callback = makeCallback(typeof options === 'function' ? options : callback);
options = getOptions(options);
path = getValidatedPath(path);
if (options.recursive != null) {
validateBoolean(options.recursive, 'options.recursive');
}

if (options.recursive) {
callback(null, readdirSyncRecursive(path, path, options));
return;
}

const req = new FSReqCallback();
if (!options.withFileTypes) {
Expand All @@ -1444,12 +1482,21 @@ function readdir(path, options, callback) {
* @param {string | {
* encoding?: string;
* withFileTypes?: boolean;
* recursive?: boolean;
* }} [options]
* @returns {string | Buffer[] | Dirent[]}
*/
function readdirSync(path, options) {
options = getOptions(options);
path = getValidatedPath(path);
if (options.recursive != null) {
validateBoolean(options.recursive, 'options.recursive');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to be strict about typeof options.recursive === 'boolean', or we can allow any truthy/falsy value, similar to options.withFileTypes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not being strict with boolean type checks create issues where you also have to check 1. I've worked on something similar and without introducing a breaking change, I had to add Booelan(value) to pass it. If in the future we move this function to C++, it would be a lot easier to validate the truthy of the value. So, I recommend being as strict as possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not being strict with boolean type checks create issues where you also have to check 1.

I'm not sure if I follow. Loose typecheck on positional arguments can be error-prone in some scenarios, but I don't see an issue in this form. Can you provide an example?
If the logic will be moved to C++ layer, we can easily pass !!recursive to it, similar to how it's done for withFileTypes in the same function:

node/lib/fs.js

Lines 1496 to 1498 in 611a616

const result = binding.readdir(pathModule.toNamespacedPath(path),
options.encoding, !!options.withFileTypes,
undefined, ctx);

If strict check is preferable, perhaps we shouldn't allow null value either, and align other boolean parameters with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I recommend @Ethan-Arrowood to follow up with this on a different pull request, if it's ok with you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a part of stable API, so changing it separately might be semver-major.

That said, I'm not blocking this PR in current form, but it would be nice to have consensus on what we want to have after follow-ups before landing this one.
Since recursive readdir can take a long time, I think ideally we can move validation/coercion to getOptions() and make it return a copy rather than original object. This will make it unaffected by mutations of options while it's still running, and align other methods with same behavior.

}

if (options.recursive) {
return readdirSyncRecursive(path, path, options);
}

const ctx = { path };
const result = binding.readdir(pathModule.toNamespacedPath(path),
options.encoding, !!options.withFileTypes,
Expand Down
93 changes: 77 additions & 16 deletions lib/internal/fs/dir.js
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,7 @@

const {
ArrayPrototypePush,
ArrayPrototypeSlice,
ArrayPrototypeSplice,
ArrayPrototypeShift,
FunctionPrototypeBind,
ObjectDefineProperty,
PromiseReject,
Expand Down Expand Up @@ -99,13 +98,21 @@ class Dir {
}

if (this[kDirBufferedEntries].length > 0) {
const { 0: name, 1: type } =
ArrayPrototypeSplice(this[kDirBufferedEntries], 0, 2);
if (maybeSync)
process.nextTick(getDirent, this[kDirPath], name, type, callback);
else
getDirent(this[kDirPath], name, type, callback);
return;
try {
const dirent = ArrayPrototypeShift(this[kDirBufferedEntries]);

if (this[kDirOptions].recursive && dirent.isDirectory()) {
this.readSyncRecursive(dirent);
}

if (maybeSync)
process.nextTick(callback, null, dirent);
else
callback(null, dirent);
return;
} catch (error) {
return callback(error);
}
}

const req = new FSReqCallback();
Expand All @@ -120,8 +127,16 @@ class Dir {
return callback(err, result);
}

this[kDirBufferedEntries] = ArrayPrototypeSlice(result, 2);
getDirent(this[kDirPath], result[0], result[1], callback);
try {
this.processReadResult(this[kDirPath], result);
const dirent = ArrayPrototypeShift(this[kDirBufferedEntries]);
if (this[kDirOptions].recursive && dirent.isDirectory()) {
this.readSyncRecursive(dirent);
}
callback(null, dirent);
} catch (error) {
callback(error);
}
};

this[kDirOperationQueue] = [];
Expand All @@ -132,6 +147,45 @@ class Dir {
);
}

processReadResult(path, result) {
for (let i = 0; i < result.length; i += 2) {
Ethan-Arrowood marked this conversation as resolved.
Show resolved Hide resolved
ArrayPrototypePush(
this[kDirBufferedEntries],
getDirent(
pathModule.join(path, result[i]),
MoLow marked this conversation as resolved.
Show resolved Hide resolved
result[i],
result[i + 1],
),
);
}
}

// TODO(Ethan-Arrowood): Review this implementation. Make it iterative.
// Can we better leverage the `kDirOperationQueue`?
readSyncRecursive(dirent) {
const ctx = { path: dirent.path };
const handle = dirBinding.opendir(
pathModule.toNamespacedPath(dirent.path),
this[kDirOptions].encoding,
undefined,
ctx,
);
handleErrorFromBinding(ctx);
const result = handle.read(
this[kDirOptions].encoding,
this[kDirOptions].bufferSize,
undefined,
ctx,
);

if (result) {
this.processReadResult(dirent.path, result);
Ethan-Arrowood marked this conversation as resolved.
Show resolved Hide resolved
}

handle.close(undefined, ctx);
handleErrorFromBinding(ctx);
}

readSync() {
if (this[kDirClosed] === true) {
throw new ERR_DIR_CLOSED();
Expand All @@ -142,9 +196,11 @@ class Dir {
}

if (this[kDirBufferedEntries].length > 0) {
const { 0: name, 1: type } =
ArrayPrototypeSplice(this[kDirBufferedEntries], 0, 2);
return getDirent(this[kDirPath], name, type);
const dirent = ArrayPrototypeShift(this[kDirBufferedEntries]);
if (this[kDirOptions].recursive && dirent.isDirectory()) {
this.readSyncRecursive(dirent);
}
return dirent;
}

const ctx = { path: this[kDirPath] };
Expand All @@ -160,8 +216,13 @@ class Dir {
return result;
}

this[kDirBufferedEntries] = ArrayPrototypeSlice(result, 2);
return getDirent(this[kDirPath], result[0], result[1]);
this.processReadResult(this[kDirPath], result);

const dirent = ArrayPrototypeShift(this[kDirBufferedEntries]);
if (this[kDirOptions].recursive && dirent.isDirectory()) {
this.readSyncRecursive(dirent);
}
return dirent;
}

close(callback) {
Expand Down
Loading