[CopyJob] Increase the amount of data sendfile can copy at once
AbandonedPublic

Authored by meven on Oct 30 2019, 3:51 PM.

Details

Reviewers
dfaure
davidedmundson
Group Reviewers
Frameworks
Maniphest Tasks
T11627: Improve KIO asynchronicity
Summary

We had a bad value of 32,768 bytes to copy at once.
But Linux sendfile can support 2,147,479,552 bytes at once.
Increase the amount of data we copy in one sendfile syscall to 5MB so that we still have some copy progress reporting.

CCBUG: 342056
CCBUG: 402276

Test Plan

Tested on my system a fast drive : nvme ssd.

ll test.mkv
.rw-rw-r--  4,3G meven 30 oct.  16:16 test.mkv

Before:

time KDE_FORK_SLAVES=1 /usr/bin/kioclient5 cp file:///home/meven/test.mkv file:///home/meven/testa.mkv 

real    0m19,583s
user    0m0,084s
sys     0m0,060s

After:

time KDE_FORK_SLAVES=1  /home/meven/kde/usr/bin/kioclient5 cp file:///home/meven/test.mkv file:///home/meven/testa.mkv 

real    0m18,488s
user    0m0,075s
sys     0m0,046s

For reference cp:

time cp test.mkv testa.mkv

real    0m18,707s
user    0m0,062s
sys     0m7,017s

We can be as fast as cp with this, a 5.5% spead increase of large file copy.
Results depend on your system.

Diff Detail

Repository
R241 KIO
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 18367
Build 18385: arc lint + arc unit
meven created this revision.Oct 30 2019, 3:51 PM
Restricted Application added a project: Frameworks. · View Herald TranscriptOct 30 2019, 3:51 PM
Restricted Application added a subscriber: kde-frameworks-devel. · View Herald Transcript
meven requested review of this revision.Oct 30 2019, 3:51 PM
meven edited the test plan for this revision. (Show Details)Oct 30 2019, 3:53 PM
meven updated this revision to Diff 69064.Oct 30 2019, 3:53 PM

amend commit message

apol added a subscriber: apol.EditedOct 30 2019, 3:57 PM

This is really interesting.

src/ioslaves/file/file_unix.cpp
55

Where did you get this number from? it could be useful for when we reevaluate life.

272

Why keep the if it's going to always be true?

meven updated this revision to Diff 69065.Oct 30 2019, 4:07 PM

Update quote from man sendfile

sitter added a subscriber: sitter.Oct 30 2019, 4:13 PM

TBH, I would move away from the hardcoding in general.... try to sendfile size_of_file and if n < size use n as new count for the next cycle. The second cycle will use the maximum regardless of what the specific kernel version has as maximum.
The maximum is an implementation detail of the kernel that may change every release, so hardcoding anything seems wrong here.

meven added a comment.Oct 30 2019, 4:22 PM

The bug scope is rather large.
But this will help and I suspect https://phabricator.kde.org/D23523 will too.

src/ioslaves/file/file_unix.cpp
55
272

It is used when sendfile call fails to stop using it in next iteration.

line 272 - 276 and later

meven updated this revision to Diff 69075.Oct 30 2019, 6:05 PM

Limit sendfile chunk size to 5MB so that users can still follow progress while enjoying faster copy

meven marked 2 inline comments as done.Oct 30 2019, 6:17 PM
meven added inline comments.
src/ioslaves/file/file_unix.cpp
58

I have reduced the value to 5 MB so that users can still follow progress, otherwise the progress bar would move every 2,147,479,552 bytes copied.
This value is quite arbitrary. I suspect I should use a lower value.

The best value we could have depends on the actual speed of the drive we are writing to, so we can have regular user feedback and maximum speed.
We would need an adaptative value, so that the chunk size would match our desired progress report frequency, for instance each half second.
An algorithm could be:
Start with a low chunk value size.
Compute speed = copied bytes/second obtained, if chunk < speed, double chunk size.
If chunk > speed, decrease a little chunk.

In the meantime this patch should help things already.
I may work on such an algorithm and implementation.

meven edited the summary of this revision. (Show Details)Oct 31 2019, 8:55 AM
ngraham edited the summary of this revision. (Show Details)Oct 31 2019, 1:51 PM
meven marked 3 inline comments as done.Nov 11 2019, 8:12 AM
meven added inline comments.
src/ioslaves/file/file_unix.cpp
267

This prevented using sendfile for file bigger than 2 GB.
Our code has no reason to have this restriction.

meven marked an inline comment as done.Nov 11 2019, 8:13 AM
meven abandoned this revision.Apr 4 2020, 12:21 PM

In favor of D28555