walk_dir uses too many file descriptors #23715

Closed
hugoduncan opened this issue Mar 25, 2015 · 6 comments

Comments

@hugoduncan

The current implementation of fs::walk_dir is breadth-first and holds a file descriptor for each directory that has been found but not yet traversed. This can lead to "Too many open files" errors.

On my Mac, the default limit for file descriptors is 256. A .git/objects directory can contain 257 directories, so a breadth-first search with a queue of ReadDir objects (each of which holds a DIR) can easily hit this limit.

I can see two possible solutions: changing the queue to hold paths rather than ReadDir objects, or switching to depth-first traversal.
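
A minimal sketch of the first option, assuming a standalone helper rather than the actual fs::walk_dir internals: the queue stores PathBuf values, and a ReadDir handle is opened only while its directory is being listed. The name walk_paths is hypothetical and the code uses only stable std::fs calls.

```rust
use std::collections::VecDeque;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Breadth-first walk that queues pending directories as paths
/// (hypothetical helper, not the std implementation).
/// Only one `ReadDir` handle is open at any time.
fn walk_paths(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = VecDeque::new();
    pending.push_back(root.to_path_buf());

    while let Some(dir) = pending.pop_front() {
        // The handle is opened here and dropped once this loop finishes.
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            // `file_type` does not follow symlinks, so symlinked
            // directories are listed but not descended into.
            if entry.file_type()?.is_dir() {
                pending.push_back(path.clone());
            }
            found.push(path);
        }
    }
    Ok(found)
}

fn main() {
    match walk_paths(Path::new(".")) {
        Ok(paths) => {
            for p in paths {
                println!("{}", p.display());
            }
        }
        Err(e) => println!("ERROR {}", e),
    }
}
```

The trade-off is memory: every directory that is pending is kept as a full path instead of an open descriptor.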

I hit this error running the following on a directory tree that contains a .git directory and has a maximum directory depth of 9.

```rust
#![feature(fs_walk)]

use std::fs;
use std::io;
use std::path::Path;

fn main() {
    match walk() {
        Ok(_) => (),
        Err(e) => println!("ERROR {}", e)
    }
}

fn walk() -> Result<(), io::Error> {
    for f in try!(fs::walk_dir(&Path::new("."))) {
        let f = try!(f);
        println!("copy_tree {:?}", f.path());
    }
    Ok(())
}
```

@nagisa
Member

nagisa commented Mar 25, 2015

Exactly the same issue exists with depth-first traversal as well, whenever the directory structure is deeper than RLIMIT_NOFILE. I do not think I’d want to see paths being stored in memory either.
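
To make the depth argument concrete, here is a rough sketch (not the std implementation) of a depth-first walk that keeps an explicit stack of open ReadDir handles and records how many were open at once. The name walk_dfs and the peak counter are illustrative assumptions.

```rust
use std::fs::{self, ReadDir};
use std::io;
use std::path::Path;

/// Depth-first walk with an explicit stack of open `ReadDir` handles
/// (hypothetical helper). Returns the number of entries seen and the
/// peak number of handles that were open at the same time.
fn walk_dfs(root: &Path) -> io::Result<(usize, usize)> {
    let mut stack: Vec<ReadDir> = vec![fs::read_dir(root)?];
    let mut entries: usize = 0;
    let mut peak: usize = 1;

    while let Some(top) = stack.last_mut() {
        match top.next() {
            Some(entry) => {
                let entry = entry?;
                entries += 1;
                // Descend immediately; the parent handle stays open below it.
                if entry.file_type()?.is_dir() {
                    stack.push(fs::read_dir(entry.path())?);
                    peak = peak.max(stack.len());
                }
            }
            // Directory exhausted: popping drops (closes) its handle.
            None => {
                stack.pop();
            }
        }
    }
    Ok((entries, peak))
}

fn main() {
    match walk_dfs(Path::new(".")) {
        Ok((entries, peak)) => println!("{} entries, peak of {} open handles", entries, peak),
        Err(e) => println!("ERROR {}", e),
    }
}
```

The peak equals the deepest nesting level reached, counting the root, so a tree nested more deeply than the descriptor limit would still fail.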

@nagisa
Member

nagisa commented Mar 25, 2015

A hybrid approach similar to the one taken by ftw(3) could be implemented. Adding a parameter to specify a limit on descriptors would be a breaking change.
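
One way such a hybrid could look, as a sketch under stated assumptions rather than a description of how ftw(3) works internally: walk depth-first, and when a hypothetical max_open limit is reached, drain the shallowest open listing into memory so its descriptor can be closed. The names walk_bounded, Listing and max_open are all illustrative.

```rust
use std::fs::{self, ReadDir};
use std::io;
use std::path::{Path, PathBuf};

/// A directory listing that is either still streaming from an open
/// handle or has been drained into memory so the handle could be closed.
enum Listing {
    Open(ReadDir),
    Buffered(std::vec::IntoIter<PathBuf>),
}

/// Depth-first walk that keeps at most `max_open` handles open
/// (hypothetical helper). Returns the paths found and the peak
/// number of simultaneously open handles.
fn walk_bounded(root: &Path, max_open: usize) -> io::Result<(Vec<PathBuf>, usize)> {
    let max_open = max_open.max(1); // at least one handle is needed to read anything
    let mut stack = vec![Listing::Open(fs::read_dir(root)?)];
    let mut open: usize = 1;
    let mut peak: usize = 1;
    let mut found = Vec::new();

    while let Some(top) = stack.last_mut() {
        let next = match top {
            Listing::Open(rd) => match rd.next() {
                Some(entry) => Some(entry?.path()),
                None => None,
            },
            Listing::Buffered(it) => it.next(),
        };
        let path = match next {
            Some(p) => p,
            None => {
                // Listing exhausted; an open one gives its descriptor back.
                if let Some(Listing::Open(_)) = stack.pop() {
                    open -= 1;
                }
                continue;
            }
        };
        // `symlink_metadata` does not follow symlinks.
        let is_dir = fs::symlink_metadata(&path)?.file_type().is_dir();
        found.push(path.clone());
        if !is_dir {
            continue;
        }
        if open == max_open {
            // Free a descriptor by buffering the shallowest open listing.
            if let Some(i) = stack.iter().position(|l| matches!(l, Listing::Open(_))) {
                let old = std::mem::replace(&mut stack[i], Listing::Buffered(Vec::new().into_iter()));
                if let Listing::Open(rd) = old {
                    let rest = rd.map(|e| e.map(|e| e.path())).collect::<io::Result<Vec<_>>>()?;
                    stack[i] = Listing::Buffered(rest.into_iter());
                    open -= 1;
                }
            }
        }
        stack.push(Listing::Open(fs::read_dir(&path)?));
        open += 1;
        peak = peak.max(open);
    }
    Ok((found, peak))
}

fn main() {
    match walk_bounded(Path::new("."), 8) {
        Ok((paths, peak)) => println!("{} entries, at most {} handles open", paths.len(), peak),
        Err(e) => println!("ERROR {}", e),
    }
}
```

Paths are held in memory only for listings that had to be closed early, so a shallow tree walked with a generous limit buffers nothing.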

@alexcrichton
Member

Oh dear, I definitely didn't intend for this to happen! I intended for it to keep an active list of directories proportional only to the current depth. I do agree with @nagisa that a DFS wouldn't solve the problem, but I suspect directories are more often wide than deep.

I also would love to add tons of configuration to Walk. I sketched out an idea or two in the RFC issue with a WalkOptions structure which may affect this as well.

@clee

clee commented May 7, 2015

Yeah, this sounds like the problem I'm seeing for sure. 👍

@BurntSushi
Member

This is fixed in an external crate: http://burntsushi.net/rustdoc/walkdir/

@steveklabnik
Member

The in-tree walk_dir is now deprecated in favor of @BurntSushi's crate.
