Closes #535 --prune option #546
- Added --prune option, which will not descend into directories that match the pattern.
- Added test to cover --prune option.
Thank you very much for your contribution.
If we are going to introduce this new option in fd, I would definitely like to see it more thoroughly tested.
I certainly believe that the current implementation does not really work in all cases. We need to check possible interference with other command line options.
For example, consider what happens if we use --prune in combination with --type <filetype>. If a user specifies --type file, only files will be matched. With the current implementation, this would mean that --prune would not work anymore (I think).
There are also other command-line flags where we should check for interference:
- -p, --full-path
- -e, --extension <ext>...
- and other filters such as -S, --size <size>..., --changed-within <date|dur>, ...
- -d, --max-depth <depth>
- -E, --exclude <pattern>...
- How does this all work with symlinks? With and without -L, --follow.
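One way to picture the --type concern is to keep the traversal decision (descend or not) separate from the output filter (print or not). The following is only a minimal sketch under stated assumptions: `Node`, `TypeFilter`, `walk`, `search` and `sample_tree` are hypothetical names over an in-memory tree, substring matching stands in for fd's regex matching, and none of this is taken from the PR's diff.

```rust
// Hypothetical in-memory directory tree; fd itself walks the real file system.
enum Node {
    File(&'static str),
    Dir(&'static str, Vec<Node>),
}

// Stand-in for `--type`: either report everything or report files only.
#[derive(Clone, Copy, PartialEq)]
enum TypeFilter {
    Any,
    FilesOnly,
}

fn join(prefix: &str, name: &str) -> String {
    if prefix.is_empty() {
        name.to_string()
    } else {
        format!("{}/{}", prefix, name)
    }
}

// Collects matching paths. The prune decision only looks at whether the
// directory name matches, never at the type filter.
fn walk(node: &Node, prefix: &str, pattern: &str, filter: TypeFilter, prune: bool, out: &mut Vec<String>) {
    match node {
        Node::File(name) => {
            if name.contains(pattern) {
                out.push(join(prefix, name));
            }
        }
        Node::Dir(name, children) => {
            let path = join(prefix, name);
            let matched = name.contains(pattern);
            // Output filter: a matching directory is reported only if
            // directories are allowed by the type filter.
            if matched && filter == TypeFilter::Any {
                out.push(path.clone());
            }
            // Traversal decision: independent of the output filter.
            if prune && matched {
                return;
            }
            for child in children {
                walk(child, &path, pattern, filter, prune, out);
            }
        }
    }
}

fn search(roots: &[Node], pattern: &str, filter: TypeFilter, prune: bool) -> Vec<String> {
    let mut out = Vec::new();
    for root in roots {
        walk(root, "", pattern, filter, prune, &mut out);
    }
    out
}

// Illustrative tree: foo/{foo.txt, bar/foo.rs} and baz/foo.md.
fn sample_tree() -> Vec<Node> {
    vec![
        Node::Dir("foo", vec![
            Node::File("foo.txt"),
            Node::Dir("bar", vec![Node::File("foo.rs")]),
        ]),
        Node::Dir("baz", vec![Node::File("foo.md")]),
    ]
}
```

In this sketch, `search(&sample_tree(), "foo", TypeFilter::FilesOnly, true)` still skips everything below `foo`, even though the directory itself is filtered out of the output. Whether that is the behavior fd should have is exactly the open question in this review.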
Thanks for the feedback!
You are correct, I just checked this case against the current implementation and
I will exercise these combinations and include some #test cases for them.
- Fixed logic for --prune to consider other command-line flags.
- Added additional test cases to cover combinations of --prune and other command-line flags.
In diff2, I attempted to cover all of these items.
(Rust noob here.)
// when pruning, don't descend into matching dirs
// note that prune requires pattern
let walk_action = if config.prune
    && pattern.as_str().len() != 0
I don't really understand why this is necessary?
I added it so that prune is only evaluated if a pattern is specified, thinking that the prune option should only be used with a pattern (and I added a test case for this). But I suppose it could be allowed, and then the expected behavior would be that no directories would be descended into. What do you think/prefer?
Hm, "not having a pattern" does not only refer to cases where --type is used, right? It could also be an unconstrained search, a --size search or a --changed-{within,before} search.
I agree that --prune maybe doesn't make sense in these cases, but we should take a closer look, I think. We could use the find behavior as a guideline.
If two options in combination do not make sense, we should also think about disallowing that via .conflicts_with.
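The "disallow meaningless combinations" idea can be illustrated with a hand-rolled check. This is only a sketch under stated assumptions: `Flags` and `validate` are hypothetical names, the single rule (prune needs a non-empty pattern) is taken from this discussion and is not settled behavior, and a real CLI would more likely declare such rules in its argument parser (for example via clap's `conflicts_with` or `requires`) than check them by hand.

```rust
// Hypothetical flag set; the field names are illustrative and not fd's
// real configuration struct.
#[derive(Debug, Default)]
struct Flags {
    prune: bool,
    pattern: Option<String>,
}

// Rejects the one combination this sketch treats as meaningless:
// `--prune` without a non-empty search pattern (an assumption, see above).
fn validate(flags: &Flags) -> Result<(), String> {
    let has_pattern = flags.pattern.as_deref().map_or(false, |p| !p.is_empty());
    if flags.prune && !has_pattern {
        return Err("--prune requires a search pattern".to_string());
    }
    Ok(())
}
```

Rejecting the combination up front would make the runtime pattern check in the walker unnecessary, at the cost of deciding now which combinations count as invalid.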
> Hm, "not having a pattern" does not only refer to cases where --type is used, right? It could also be an unconstrained search, a --size search or a --changed-{within,before} search.
Good point! I hadn't considered that...
> I agree that --prune maybe doesn't make sense in these cases, but we should take a closer look, I think. We could use the find behavior as a guideline.
Let's see. If we are using find as an example, it seems like the -prune option actually takes an argument on what to prune.
(I found this explanation helpful: https://stackoverflow.com/questions/1489277/how-to-use-prune-option-of-find-in-sh)
Currently this implementation uses the <pattern> as what to also prune. But I suppose we could make --prune take an argument stating what to prune. In this case, the user has more control over what to prune. For example:
$ fd foo --prune=bar
will look for matches with foo but not descend into dirs matching bar.
This would eliminate the need to do the check for pattern on line 369.
What do you think?
Yet it almost sounds like the functionality of --exclude <pattern> but specific to directories. Hmm... are they too similar?
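The proposed `fd foo --prune=bar` form can be sketched with the standard library alone. This is a rough illustration under stated assumptions: `--prune=<pattern>` is only a proposal in this thread, `walk` is a hypothetical helper, and plain substring matching stands in for fd's regex and ignore-file handling.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Recursively collects entries under `dir` whose file name contains `search`.
// A directory whose name contains `prune` can still be reported as a match,
// but its contents are never read.
fn walk(dir: &Path, search: &str, prune: Option<&str>, out: &mut Vec<PathBuf>) -> io::Result<()> {
    let mut entries = fs::read_dir(dir)?
        .map(|entry| entry.map(|e| e.path()))
        .collect::<io::Result<Vec<PathBuf>>>()?;
    entries.sort(); // deterministic order for the example

    for path in entries {
        let name = path
            .file_name()
            .map(|n| n.to_string_lossy().into_owned())
            .unwrap_or_default();

        if name.contains(search) {
            out.push(path.clone());
        }

        // symlink_metadata does not follow symlinks, so this sketch never
        // descends into a symlinked directory.
        let is_dir = fs::symlink_metadata(&path)?.is_dir();
        let pruned = prune.map_or(false, |p| name.contains(p));
        if is_dir && !pruned {
            walk(&path, search, prune, out)?;
        }
    }
    Ok(())
}
```

With a separate prune pattern, the search pattern no longer doubles as the prune condition, which is why the pattern-length check discussed above would become unnecessary in this design.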
> If two options in combination do not make sense, we should also think about disallowing that via .conflicts_with.
That is good to know about!
After reading your comment, and looking at the find documentation (http://man7.org/linux/man-pages/man1/find.1.html):
- The find -prune seems to conflict with -depth, so that could be something to consider: mark it as conflicts_with on --max-depth?
- I'm not convinced we'd need to mark prune as conflicting with --size or --changed-{}, if we added an argument to it as I proposed above. Someone could want to see files of a certain size but omitting some directories, maybe? But if we don't go with adding an argument to prune, marking it as conflicting with --size and --changed-{} might be a better way to enforce the need for a pattern to prune on.
> Yet it almost sounds like the functionality of --exclude but specific to directories. hmmmm... are they too similar?
FWIW, the -E option uses ripgrep's gitignore-style excludes. To ignore a directory, -E foo/ should work. E.g.:
[squirt ~/tmp/tests]
$ tree -p
.
├── [drwxr-xr-x] bar
│   └── [drwxr-xr-x] lower_dir
│       └── [-rw-r--r--] bar
└── [drwxr-xr-x] foo
    └── [drwxr-xr-x] lower_dir
        └── [-rw-r--r--] bar

4 directories, 2 files
[squirt ~/tmp/tests]
$ fd -E bar/
foo
foo/lower_dir
foo/lower_dir/bar
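The difference between excluding a directory and pruning at a match can be made concrete on the same tree. This is a minimal sketch under stated assumptions: `sample`, `exclude_dir` and `prune_search` are hypothetical helpers over a flat listing, matching is by exact name or substring rather than gitignore globs or regexes, and the prune semantics follow the help text proposed in this PR (the matched directory is shown, its contents are not).

```rust
// Flat listing of the tree shown above as (path, is_dir) pairs.
fn sample() -> Vec<(&'static str, bool)> {
    vec![
        ("bar", true),
        ("bar/lower_dir", true),
        ("bar/lower_dir/bar", false),
        ("foo", true),
        ("foo/lower_dir", true),
        ("foo/lower_dir/bar", false),
    ]
}

// Exclude semantics (like `-E bar/`): hide every directory named `dir_name`
// together with everything beneath it. Files with that name stay visible.
fn exclude_dir(entries: &[(&'static str, bool)], dir_name: &str) -> Vec<&'static str> {
    let mut shown = Vec::new();
    for &(path, is_dir) in entries {
        let comps: Vec<&str> = path.split('/').collect();
        let last = comps.len() - 1;
        // A component is a directory if it is an ancestor, or if it is the
        // final component of a directory entry.
        let hidden = comps
            .iter()
            .enumerate()
            .any(|(i, c)| (i < last || is_dir) && *c == dir_name);
        if !hidden {
            shown.push(path);
        }
    }
    shown
}

// Prune semantics as described in this PR: report entries whose name
// contains `pattern`, but nothing located below a matching directory.
fn prune_search(entries: &[(&'static str, bool)], pattern: &str) -> Vec<&'static str> {
    let mut shown = Vec::new();
    for &(path, _is_dir) in entries {
        let comps: Vec<&str> = path.split('/').collect();
        let (name, ancestors) = comps
            .split_last()
            .expect("split always yields at least one component");
        let below_match = ancestors.iter().any(|c| c.contains(pattern));
        if name.contains(pattern) && !below_match {
            shown.push(path);
        }
    }
    shown
}
```

Here `exclude_dir(&sample(), "bar")` reproduces the three lines printed by `fd -E bar/` above, while `prune_search(&sample(), "bar")` keeps the top-level `bar` directory in the results and drops only what lies beneath it. Both functions filter a finished listing, so they illustrate the visible result, not the traversal savings of real pruning.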
Since this adds a new condition in the "hot path" of
I followed the instructions for generating the report (attached)
doc!(h, "prune"
    , "Do not descend into matched directories"
    , "When a directory matches the pattern, fd does not descend into it but still shows \
       all matches at the same level as that directory.");
To me, this is slightly ambiguous for people who aren't familiar with find's -prune option. In particular, I think it should be more explicitly stated that the matched directories are included in the output. Maybe something like:
"Do not descend into matched directories"
"When a directory matches the pattern, fd does not descend into it. The directory itself is not excluded."
I've tested this locally, and it works according to my expectations. So @neuronull I would really encourage you to stick it out with this PR.
For reference this is what I ran.
fd 'node_modules' . -t d --prune -I -E Library
And to help with clarity, perhaps @sharkdp could make clear what still needs to be done on this PR to get this merged? I would love to see this in.
This is a complex new feature. What is much more important for me than getting this in early is that we make sure that
(1) it doesn't break anything,
(2) it doesn't make any existing functionality slower,
(3) it works as expected and the design is well thought-out such that we don't have to rewrite it immediately after shipping it to the first users,
(4) it plays nicely with all kinds of existing features.
Also, this PR needs to be in a mergeable state.
The four points listed are prioritized. We can merge this PR as soon as I have a feeling that we have enough evidence for all the listed points.
Any help is very much appreciated. Examples:
- For point (1): this is probably the hardest. We have a large test coverage but things can always go wrong. Thinking hard about how the new code could break things and writing new, sensible tests is one thing to help with this point.
- For point (2): We have some benchmark results but we may need to repeat them after the PR is in a mergeable state again.
- For points (3) and (4): What do we actually want from this feature? How do we expect it to behave in different situations? How do we expect it to behave in combination with other features? We have posed some of these questions here (and not all of them have been answered), but I'm sure there are more.
See also: #613
Closing for now (cleaning up list of open PRs).