

Faster file transfer and less CPU consumption #55

Open
wants to merge 1 commit into master
Conversation

m0nstermind

yajsync currently transfers files at up to 800-900 Mbit/s, which is not very fast when syncing files to SSD/NVMe drives or memory disks.

During a file transfer, usage of a single CPU core goes to 100%.

I did some profiling and identified two causes of the high CPU usage:

  1. MD5 checksum calculation consumes about 60% of a single core on the receiver side, and a bit less on the sender.
  2. unnecessary byte-buffer copies on both receiver and sender, when sending/receiving to/from sockets as well as when calculating MD5 checksums.

This PR tries to solve these problems by:

  1. Introducing alternative checksums. MD5 remains the default, for compatibility with rsync; an alternative can be selected with the --checksum-choice client option. In addition to MD5 I implemented xxHash from Zero-Allocation-Hashing. According to the Zero-Allocation-Hashing README, xxHash is the fastest option, reaching up to 9.5 GB/s; on my test setup it is at least 10x faster than MD5.
  2. Eliminating extra byte copying by using memory-mapped file I/O on both sender and receiver. This is also a step toward making it possible to sync hugetlbfs filesystems (which disallow conventional I/O in favor of mmap only).
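To illustrate what a pluggable --checksum-choice could look like, here is a minimal sketch in plain JDK Java. The `Checksum` interface, `Md5Checksum` class, and `forName` method are hypothetical names for illustration, not yajsync's actual API; the xxHash variant from Zero-Allocation-Hashing is only indicated in a comment to keep the sketch dependency-free.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical checksum abstraction, as a --checksum-choice option implies.
interface Checksum {
    byte[] digest(byte[] data);
}

// Default implementation: MD5, for wire compatibility with rsync.
final class Md5Checksum implements Checksum {
    @Override
    public byte[] digest(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("MD5 is guaranteed by the JDK", e);
        }
    }
}

public final class ChecksumChoice {
    // Select an implementation by name, as the command-line flag would.
    static Checksum forName(String name) {
        switch (name) {
            case "md5": return new Md5Checksum();
            // case "xxhash": return new XxHashChecksum(); // via Zero-Allocation-Hashing
            default: throw new IllegalArgumentException("unknown checksum: " + name);
        }
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) {
        Checksum c = forName("md5");
        System.out.println(hex(c.digest("hello".getBytes(StandardCharsets.UTF_8))));
    }
}
```

The point of the indirection is that the receiver and sender only agree on a checksum name at session start; everything downstream digests through the interface.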

Together these changes speed transfers up to 2 Gbit/s at 50-60% of a single core. The main bottleneck is now net/fs/io in kernel space on the receiver.
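The zero-copy effect of the memory-mapped I/O described above can be sketched with plain JDK primitives: `MessageDigest.update(ByteBuffer)` consumes a `MappedByteBuffer` directly, so checksumming never stages file contents in an intermediate heap `byte[]`. This is a standalone demonstration of the technique, not yajsync's actual code; `digestMapped` is a hypothetical name.

```java
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.util.Arrays;

public final class MmapDigest {
    // Digest a file through a memory mapping: the digest reads the mapped
    // pages directly, with no read() into an intermediate heap buffer.
    static byte[] digestMapped(Path file, String algorithm) throws Exception {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            MessageDigest md = MessageDigest.getInstance(algorithm);
            md.update(map); // consumes the mapping directly
            return md.digest();
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("mmap-demo", ".bin");
        byte[] payload = new byte[1 << 20];
        Arrays.fill(payload, (byte) 0x5a);
        Files.write(tmp, payload);

        byte[] viaMap  = digestMapped(tmp, "MD5");
        byte[] viaHeap = MessageDigest.getInstance("MD5").digest(payload);
        System.out.println(Arrays.equals(viaMap, viaHeap)); // both paths agree
        Files.delete(tmp);
    }
}
```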

I also have more PRs to submit:

  1. hardlink preservation for rsync protocol versions > 28
  2. a -B, --block-size=SIZE option, so hugetlbfs filesystems can be transferred. hugetlbfs forces writes in blocks of the hugepage size (2M or 1G depending on system setup), while the block size is currently hard-coded to 8k.
  3. -W, --whole-file: copy files whole (without the delta-xfer algorithm).

but these are based on the refactorings made in this PR, so I am holding them back for now to avoid unnecessary rebases onto current master.
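The --block-size idea above can be sketched as a block-at-a-time copy over memory mappings. On hugetlbfs the block size would have to equal the hugepage size (2 MiB or 1 GiB); the 64 KiB used here is just for the demo. `copyInBlocks` is a hypothetical helper for illustration, not code from this PR.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.Random;

public final class BlockCopy {
    // Copy src to dst one fixed-size block at a time, mapping each block of
    // the source instead of reading it into a heap buffer. On hugetlbfs the
    // block size must match the hugepage size configured on the system.
    static void copyInBlocks(Path src, Path dst, int blockSize) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            for (long off = 0; off < size; off += blockSize) {
                long len = Math.min(blockSize, size - off);
                MappedByteBuffer block = in.map(FileChannel.MapMode.READ_ONLY, off, len);
                while (block.hasRemaining()) {
                    out.write(block);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("blk-src", ".bin");
        Path dst = Files.createTempFile("blk-dst", ".bin");
        byte[] data = new byte[200_000]; // deliberately not a multiple of the block size
        new Random(42).nextBytes(data);
        Files.write(src, data);

        copyInBlocks(src, dst, 64 * 1024);
        System.out.println(Arrays.equals(data, Files.readAllBytes(dst)));
        Files.delete(src);
        Files.delete(dst);
    }
}
```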

* --checksum-choice=xxhash option for digesting files much faster than MD5
* eliminated extra byte copying by using memory-mapped file I/O on sender and receiver; this also makes it possible to sync hugetlbfs filesystems