Refactor KFileProtocol::copy
Open, NormalPublic

Description

Our man file copy function has some glaring issues that I'd like to tackle in a few steps:

  • Split the sendfile and read/write code path to function, to make evolution easier
  • Then Make those function io calls (read, write, sendfile) O_NONBLOCK and use select/poll freeing user space CPU time to for instance notify the user of progressand and reducing chances of blocking in both ways (file io blocking or socket/ksmserver blocking)
  • Then make the sendFile/write operation O_DIRECT and O_SYNC (https://linux.die.net/man/2/open), perhaps by default, or option or depending on the fs destination. Fixing https://bugs.kde.org/show_bug.cgi?id=281270 Inconsistent notifications during/after file operations (copying, moving, deleting, compressing, extracting) depending on amount of data/items involved (another to fix this is to mount usb sticks with the sync mount option)

Then if needed :

  • D28555 Tune the parameters of the sendFile code path to copy data by chunks bigger than 32kB, I am thinking 128/512 to keep granular progress reporting,
  • Tune the read/write (it reads and write by chunks of 32kB at the moment

Bonus steps

All of those steps should make the code more async, maintainable and performant, and fix a few longstanding bugs.

It should help on
*Ridiculously slow file copy (multiple small files) https://bugs.kde.org/show_bug.cgi?id=342056

meven created this task.Feb 3 2020, 2:55 AM
meven triaged this task as Normal priority.
meven added a comment.Feb 12 2020, 6:39 PM

@dfaure does this makes sense to you ?

I don't really understand the relation to ksmserver, surely kioslaves are not part of session management.

I also don't really understand what would happen in the new event loop (select/poll) of the kioslave while the non-blocking I/O is happening. It's not like the slave can handle other commands in parallel. Slaves are in a separate process exactly so that they can block.

Regarding O_DIRECT, I have some doubts. The man page you linked to, says

In summary, O_DIRECT is a potentially powerful tool that should be used with caution. It is recommended that applications treat use of O_DIRECT as a performance option which is disabled by default.
"The thing that has always disturbed me about O_DIRECT is that the whole interface is just stupid, and was probably designed by a deranged monkey on some serious mind-controlling substances."--Linus"`

Are you sure you want to use that? :-)
More precisely, it says it will break with NFS (in some cases).
And it sounds like a bad buffer size with O_DIRECT will slow things down (too many direct writes instead of buffering).
Overall, I'd say -- did you check what the cp source code does? That's what we're competing against :-)

meven updated the task description. (Show Details)Apr 4 2020, 6:58 PM
meven added a comment.Oct 1 2021, 8:03 AM

copy https://github.com/coreutils/coreutils/blob/master/src/copy.c#L301 use https://man7.org/linux/man-pages/man2/copy_file_range.2.html or https://github.com/coreutils/gnulib/blob/master/lib/full-write.h
Did not know about copy_file_range, it was introduced in kernel 4.5 and improved/reworked in 5.3 apparently.
It is simply newer than sendfile (but older than io_uring), it seems to be just almost a drop-in replacement for sendfile with reflink aka COW support for supporting fs built-in.
That should be the next area for investigation, it is way simpler than our current reflink support, but it is Linux only.

meven added a comment.Oct 13 2021, 2:16 PM

I have a first step : split to two loops use copy_file_range

https://invent.kde.org/frameworks/kio/-/merge_requests/602

meven added a comment.Oct 18 2021, 8:14 AM

Found some hints how to help the kernel manage its buffers and page cache efficiently when writing
http://lkml.iu.edu/hypermail/linux/kernel/1005.2/01845.html
http://lkml.iu.edu/hypermail/linux/kernel/1005.2/01953.html

Also additional fadvise might be worse testing.

(void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
(void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
(void)posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);

In any case, I desperately need a benchmark tool as kioclient can't really play that role.
Something that would link directly with file IOSLAVE.