fd performs worse than find on NFS #432
Comments
It seems that even
|
Your
It may be worth putting The other thing that might explain the difference between As far as |
@sbocq fd may not be faster than GNU find for small and mid-size folders (~50K) or with the -j1 option. fd shines if you have a large folder hierarchy with more than 100K files and folders. See these benchmark results for more information: https://github.com/hungptit/ioutils/blob/master/benchmark.md. |
Indeed, thanks for the correction.
Yes, changing the order doesn't make a difference.
Exactly! |
@hungptit Maybe it can improve it in other benchmarks as well. For example, here is another one in the same environment where
|
Thank you very much for the detailed report, analysis and all the answers. I haven't had time to look into this so far, but I'm planning to do so in the next few days. |
…data

This should partially address sharkdp#432 by decreasing the number of stat() calls:

    $ strace -c -f ./fd-before '.h$' -j1 /usr -S +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
    100.00    3.700169           3    983938     38022 stat

    $ strace -c -f ./fd-after '.h$' -j1 /usr -S +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.671723           4    162052     38021 stat

Though it's not as good as possible:

    $ strace -c -f find /usr -name '*.h' -size +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     18.75    0.449866           3    136199           newfstatat

    $ strace -c -f bfs /usr -name '*.h' -size +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     15.01    0.216024           3     60211         1 statx

Performance is much better when metadata is required:

    $ hyperfine ./fd-{before,after}" '.h$' -j1 /usr -S +1k"
    Benchmark #1: ./fd-before '.h$' -j1 /usr -S +1k
      Time (mean ± σ):      2.707 s ±  0.042 s    [User: 890.8 ms, System: 1939.7 ms]
      Range (min … max):    2.659 s …  2.786 s    10 runs

    Benchmark #2: ./fd-after '.h$' -j1 /usr -S +1k
      Time (mean ± σ):      1.562 s ±  0.034 s    [User: 726.2 ms, System: 957.9 ms]
      Range (min … max):    1.536 s …  1.648 s    10 runs

    Summary
      './fd-after '.h$' -j1 /usr -S +1k' ran
        1.73 ± 0.05 times faster than './fd-before '.h$' -j1 /usr -S +1k'

While remaining the same when it's not:

    tavianator@graviton $ hyperfine ./fd-{before,after}" '.h$' -j1 /usr"
    Benchmark #1: ./fd-before '.h$' -j1 /usr
      Time (mean ± σ):      1.341 s ±  0.016 s    [User: 664.3 ms, System: 761.2 ms]
      Range (min … max):    1.309 s …  1.361 s    10 runs

    Benchmark #2: ./fd-after '.h$' -j1 /usr
      Time (mean ± σ):      1.338 s ±  0.012 s    [User: 684.1 ms, System: 741.1 ms]
      Range (min … max):    1.310 s …  1.350 s    10 runs

    Summary
      './fd-after '.h$' -j1 /usr' ran
        1.00 ± 0.02 times faster than './fd-before '.h$' -j1 /usr'
…data

This should partially address sharkdp#432 by decreasing the number of stat() calls:

    $ strace -c -f ./fd-before '\.h$' /usr -j1 -S +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     15.71    8.831948           7   1192279     46059 stat

    $ strace -c -f ./fd-after '\.h$' /usr -j1 -S +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
      7.92    1.972474          10    183907     46046 stat

Though it's not as few as possible:

    $ strace -c -f find /usr -iname '*.h' -size +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     19.01    0.946500           5    161649           newfstatat

    $ strace -c -f bfs /usr -iname '*.h' -size +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     13.73    0.406565           5     69005           statx

Performance is much better when metadata is required:

    $ hyperfine ./fd-{before,after}" '\.h$' /usr -j1 -S +1k"
    Benchmark #1: ./fd-before '\.h$' /usr -j1 -S +1k
      Time (mean ± σ):      4.623 s ±  0.154 s    [User: 1.465 s, System: 3.354 s]
      Range (min … max):    4.327 s …  4.815 s    10 runs

    Benchmark #2: ./fd-after '\.h$' /usr -j1 -S +1k
      Time (mean ± σ):      2.650 s ±  0.058 s    [User: 1.258 s, System: 1.592 s]
      Range (min … max):    2.568 s …  2.723 s    10 runs

    Summary
      './fd-after '\.h$' /usr -j1 -S +1k' ran
        1.74 ± 0.07 times faster than './fd-before '\.h$' /usr -j1 -S +1k'

While remaining the same when it's not:

    $ hyperfine ./fd-{before,after}" '.h$' /usr -j1"
    Benchmark #1: ./fd-before '.h$' /usr -j1
      Time (mean ± σ):      2.314 s ±  0.052 s    [User: 1.185 s, System: 1.291 s]
      Range (min … max):    2.260 s …  2.441 s    10 runs

    Benchmark #2: ./fd-after '.h$' /usr -j1
      Time (mean ± σ):      2.316 s ±  0.040 s    [User: 1.162 s, System: 1.315 s]
      Range (min … max):    2.263 s …  2.375 s    10 runs

    Summary
      './fd-before '.h$' /usr -j1' ran
        1.00 ± 0.03 times faster than './fd-after '.h$' /usr -j1'
…data

This should partially address #432 by decreasing the number of stat() calls:

    $ strace -c -f ./fd-before '\.h$' /usr -j1 -S +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     15.71    8.831948           7   1192279     46059 stat

    $ strace -c -f ./fd-after '\.h$' /usr -j1 -S +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
      7.92    1.972474          10    183907     46046 stat

Though it's not as few as possible:

    $ strace -c -f find /usr -iname '*.h' -size +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     19.01    0.946500           5    161649           newfstatat

    $ strace -c -f bfs /usr -iname '*.h' -size +1k >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     13.73    0.406565           5     69005           statx

Performance is much better when metadata is required:

    $ hyperfine ./fd-{before,after}" '\.h$' /usr -j1 -S +1k"
    Benchmark #1: ./fd-before '\.h$' /usr -j1 -S +1k
      Time (mean ± σ):      4.623 s ±  0.154 s    [User: 1.465 s, System: 3.354 s]
      Range (min … max):    4.327 s …  4.815 s    10 runs

    Benchmark #2: ./fd-after '\.h$' /usr -j1 -S +1k
      Time (mean ± σ):      2.650 s ±  0.058 s    [User: 1.258 s, System: 1.592 s]
      Range (min … max):    2.568 s …  2.723 s    10 runs

    Summary
      './fd-after '\.h$' /usr -j1 -S +1k' ran
        1.74 ± 0.07 times faster than './fd-before '\.h$' /usr -j1 -S +1k'

While remaining the same when it's not:

    $ hyperfine ./fd-{before,after}" '\.h$' /usr -j1"
    Benchmark #1: ./fd-before '\.h$' /usr -j1
      Time (mean ± σ):      2.382 s ±  0.038 s    [User: 1.221 s, System: 1.286 s]
      Range (min … max):    2.325 s …  2.433 s    10 runs

    Benchmark #2: ./fd-after '\.h$' /usr -j1
      Time (mean ± σ):      2.362 s ±  0.034 s    [User: 1.193 s, System: 1.294 s]
      Range (min … max):    2.307 s …  2.422 s    10 runs

    Summary
      './fd-after '\.h$' /usr -j1' ran
        1.01 ± 0.02 times faster than './fd-before '\.h$' /usr -j1'
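To put the strace counts from the commit message above side by side, here is a small sketch that computes how many times more stat-family calls each tool makes relative to another. The counts are copied from the output above; the dictionary keys and the `call_ratio` helper are illustrative names only and not part of fd.

```python
# stat-family call counts copied from the `strace -c` output in the commit message
STAT_CALLS = {
    "fd-before": 1192279,  # stat
    "fd-after": 183907,    # stat
    "find": 161649,        # newfstatat
    "bfs": 69005,          # statx
}


def call_ratio(a, b, counts=STAT_CALLS):
    """Return how many times more stat-family calls tool `a` makes than tool `b`."""
    return counts[a] / counts[b]


if __name__ == "__main__":
    for a, b in [("fd-before", "fd-after"), ("fd-after", "find"), ("fd-after", "bfs")]:
        print(f"{a} makes {call_ratio(a, b):.2f}x the stat-family calls of {b}")
```

With these numbers, the patched fd makes roughly 6.5 times fewer `stat` calls than before, which puts it close to `find` but still well above `bfs`.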
There have been significant performance improvements in v7.4.0, notably the ones by @tavianator, which should directly address this issue. I'm going to close this ticket for now, but it would be great if we could get some updated benchmark results on this. I'm happy to re-open it if anyone thinks that this is not yet resolved. |
I have a script that scrubs broken transfer files on a live FTP stored on an NFS share using `find` like this:

As this can take quite some time, I decided to put `fd` to the test. Unfortunately, its performance is worse in my use case (I hope I didn't make any mistake in the translation), and I think this might be amplified by NFS, which makes some system calls more expensive than others.

Here is a first comparison on CentOS 7 using `fd-v7.3.0-x86_64-unknown-linux-musl.tar.gz`:

I tried with the option `-j 1` a while later to be more equivalent to `find`, which is single-threaded, but it didn't help:

As you can see, benchmarking on NFS is a bit wild depending on whether metadata is cached on the client or server side. But I went on collecting more evidence using `strace` with and without tracing the calls to `futex`:

The calls to `futex` were a bit unexpected since `fd` is invoked with `-j 1`, and they seem to have quite some impact on the timings. Otherwise, I think that `find` behaves smarter here 1) by eliminating some calls to `stat` by filtering first on the names returned by `getdents`, and then 2) by using a better allocation strategy that reduces the calls to `brk` and `getdents`.
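The first point, filtering on names before calling `stat`, can be sketched as follows. This is only an illustration of the idea in Python, not how `find` or `fd` is actually implemented; the function name `find_large_files`, its parameters and the `stat_calls` counter are hypothetical and exist only for this example.

```python
import fnmatch
import os


def find_large_files(root, pattern="*.h", min_size=1024):
    """Return (paths, stat_calls) for files under root whose name matches
    pattern and whose size exceeds min_size bytes.

    The name test uses only the directory listing, so stat is requested
    solely for entries that already passed the name filter.
    """
    matches = []
    stat_calls = 0  # explicit stat requests made by this function
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            entries = os.scandir(directory)
        except OSError:
            continue  # unreadable directory: skip it
        with entries:
            for entry in entries:
                # The entry type usually comes from the directory listing
                # itself, so this check normally needs no extra system call.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                    continue
                # Cheap filter first: compare the name only.
                if not fnmatch.fnmatchcase(entry.name, pattern):
                    continue
                # Expensive step last: fetch metadata for name matches only.
                stat_calls += 1
                try:
                    size = entry.stat(follow_symlinks=False).st_size
                except OSError:
                    continue
                if size > min_size:
                    matches.append(entry.path)
    return matches, stat_calls


if __name__ == "__main__":
    paths, calls = find_large_files("/usr")
    print(f"{len(paths)} matches using {calls} stat requests")
```

On a network filesystem, where every metadata lookup may be a round trip, ordering the tests this way means non-matching names cost nothing beyond reading the directory.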