Support for xattrs on kio copy/move
Needs ReviewPublic

Authored by cochise on Dec 27 2018, 9:38 AM.

Details

Reviewers
dfaure
Summary

This patch adds support to KIO to preserve xattr of files and directories on copy or move.
Working for copy/move including overwrite and rename cases.
Not working for "write into" when coping folders have name conflict. I'm against adding to this user case as the name of the action don't imply overwrite.

Tested only on Linux.

FEATURE: 116617

Diff Detail

Repository
R241 KIO
Branch
xattr-copy-support (branched from master)
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 7103
Build 7121: arc lint + arc unit
There are a very large number of changes, so older changes are hidden. Show Older Changes
cochise updated this revision to Diff 48384.Dec 30 2018, 5:25 AM

As using qDebug supressed the error, I changed to qCWarning and managed to track the memory corruption. On my system, after getting the list of keys on source file, the local variable holding this file path gets corrupted. I'm using a code almost identical to official example [1], and enlarging the buffer, or allocating it on the heap didn't resoled the issue.
A hackish workaround is to not store the file path after conversion (const char*), and using QUrl.path().toLocal8Bit().data(); in all nine calls.
I'm not happy with this solution, but I think I reached my limit here. Commiting this hackish version for review, and going to make some tests.

1 - http://man7.org/linux/man-pages/man2/listxattr.2.html

cochise updated this revision to Diff 48410.Dec 30 2018, 5:46 PM
cochise marked an inline comment as done.

Bug resolved in a proper way with help of Tomaz Canabrava

Some other small fixes.

pino added a subscriber: pino.Dec 30 2018, 6:49 PM

general notes:

  • NULL -> nullptr
  • there is not just glibc
  • the changes to file_unix.cpp seem unrelated to you patch now, so better split them in an own patch
  • use constData() instead of data() every time the data needed is read-only
src/core/config-kiocore.h.cmake
8

HAVE_SYS_XATTR_H is available here as side-effect of using the FindACL.cmake module.
Better explicitly look for sys/xattr.h, like src/ioslaves/file/CMakeLists.txt does.

src/core/copyjob.cpp
24–30

this #include block has a slightly messy logic:

  • if sys/attr.h is available, just include it directly with no other checks
  • sys/extattr.h needs its own cmake check, including it if found
2182–2184

if HAVE_SYS_XATTR_H is not defined, the first instruction in copyXattrs will be a return, and some compilers may warn/error out because the rest of the function is unreachable; better stub out the whole function instead

2186
  • you don't need to call data() here, the return value of toLocal8Bit() is already a QByteArray
  • toLocal8Bit() is the wrong function here: please use QFile::encodeName(), which does the right job for QString -> local filesystem paths
2192

NULL -> nullptr

2194

NULL -> nullptr

2197

there is not just glibc; also, better check for errno as ENOTSUP, because that means the source filesystem does not support xattrs, and thus you can directly skip the rest of the function (as it will not work anyway)

2218

NULL -> nullptr

2220

NULL -> nullptr

2248–2250

as noted above: checking that errno is ENOTSUP means that the destination filesystem does not support xattrs, so there is little point trying to set the other attributes

src/ioslaves/file/file_unix.cpp
42 ↗(On Diff #48410)

HAVE_SYS_XATTR_H from config-kioslave-file.h can be used here

44 ↗(On Diff #48410)

better check for sys/extattr.h instead

In D17816#384056, @pino wrote:
  • NULL -> nullptr
  • there is not just glibc

I'm following the pattern in Baloo, that keeps NULL on Mac and *BSD. I don't have any of these systems to test, so I didn't touch it.
Didn't searched yet about compatibility of these functions on libc alternatives. Frameworks officially supports a subset of them I should check?

  • the changes to file_unix.cpp seem unrelated to you patch now, so better split them in an own patch

OK, will do it.

  • use constData() instead of data() every time the data needed is read-only

OK, will do it.

abika added a subscriber: abika.Jan 1 2019, 12:58 PM

Only 116617 needs the BUG: keyword, since the other ones are all duplicates. Those should get CCBUG: at most, or nothing.

cochise updated this revision to Diff 48586.Jan 3 2019, 7:24 AM

Tests added, includes and ifdefs reworked

Initial tests. Not crossplatform of extensive yet.
The includes and ifdefs were reworked, and I think they are more concise and simple, but some feedback is needed, as they were not tested outside linux.
The debug messages were slighly improved, various @pino sugestions included and changes to file_unix reverted for a posterior PR.

cochise retitled this revision from Initial support for xattrs on kio copy/move to Support for xattrs on kio copy/move.Jan 3 2019, 7:36 AM
cochise edited the summary of this revision. (Show Details)
cochise edited the summary of this revision. (Show Details)
pino added a comment.Jan 3 2019, 1:37 PM

Nice progresses, thanks for the fixes. I added few more notes, just mentioning the first occurrence of each.
One more thing is to print errno (and possibly its string representation using strerror/strerror_r) on failure, so that the debugging is easier.

autotests/jobtest.cpp
454

extra empty line

533

you can use QStandardPaths::findExecutable() to check whether setfattr is available, and QSKIP the test if not

539

please use the QProcess::start(QString,QStringList) variant, splitting the arguments; this way there is no need to manual quoting the paths, and thus avoid problems with spaces

539

check using QVERIFY/QCOMPARE that the process was started correctly

540

check using QVERIFY/QCOMPARE that the process ended correctly

src/core/CMakeLists.txt
5–6

good checks, although they fit better in ConfigureChecks.cmake (which is there for a reasons ;) )

src/core/copyjob.cpp
2185

destiny is another thing ;) "destination" in this case

2198–2202

this is more of a personal taste rather than an issue: IMHO you can avoid the "if .. else if ... else" that adds a nesting level more, especially when all the checks lead to early return; so something like:

if (listlen < 0) {
  if (errno != ENOTSUP) {
    qCWarning(KIO_COPYJOB_DEBUG) << "failed to extract xattr from" << xattrsrc;
  }
  return;
} 
if (listlen == 0) {
  qCDebug(KIO_COPYJOB_DEBUG) << "File" << xattrsrc << "doesn't have any xattr.";
  return;
}
[etc..]
2249

IMHO this is not a warning worth situation

ngraham edited the summary of this revision. (Show Details)Jan 3 2019, 8:36 PM

Hello, and thanks to everyone involved for working on this! If I am not too late to the discussion, I would also like to point out that another instance when we lose xattr attributes (tags, comments, rating) is when we open a file in Kate or KWrite and then we save it.

Test case:

  1. create a "testfile.txt"
  2. add a tag to the file
  3. open the file with Kate or KWrite and add some text
  4. save the file
  5. the tag that we added in step 2 is gone!

I suspect that the reason this happens is that Kate and KWrite probably create a copy of the original file, but without including the xattr data from the original file.

Perhaps the new KIO copy that is being implemented will address this scenario as well? If that is the case, then I will release the bounty @cochise once I see it working in KDE Neon Developer edition!

Looking forward to it!

dhaumann added a subscriber: dhaumann.EditedJan 4 2019, 2:32 PM

@funkybomber: Kate and KWrite stopped using QSaveFile since KF 5.50, see D14890 and commit 681cafb74607d9fdb93e3bd9a90ea20b861de896. So what you write about Kate/KWrite is not true anymore, this should indeed work again.

@dhaumann: I just tested it on KDE Neon. You are right, it is fixed! :)

On a related note, is it possible that the "QSaveFile" is also responsible for a similar behaviour in Ark?

If I add tags to a "test.zip" file and then I open it with Ark and add/remove a file, the tags on "test.zip" are gone. That is on the latest KDE Neon.

Would it help if I filed a bug?

Sorry everyone, I won't derail this thread any further.

cochise updated this revision to Diff 48731.EditedJan 5 2019, 12:53 PM

Tests refactored, but no multiplatform. Other small fixes.

The tests aren't a big mess anymore, and some groundwork for multiplatform tests are in place, but as I don't have a non linux system, I can't finish this work.
Include checks moved to ConfigureChecks.cmake.
Warning about destination FS not supporting xattrs demoted to debug.
As listxattr function returns 0 if the source FS don't support xattrs, no ENOTSUP check is made at this stage.
Nesting reduced using a switch for listlen. After some tests, I think this is the option with better readability.

The tests showed that the feature don't work is KIO:file_copy is used, only KIO::copy. As it's working on Dolphin and kioclient copy/move this seems to not be a big issue, but I will research a fix.

cochise edited the summary of this revision. (Show Details)Jan 5 2019, 12:54 PM
cochise updated this revision to Diff 48733.Jan 5 2019, 1:04 PM

small cleanup

phidrho added a subscriber: phidrho.Jan 5 2019, 1:28 PM

On a related note, is it possible that the "QSaveFile" is also responsible for a similar behaviour in Ark?

Probably yes: https://github.com/KDE/ark/search?q=qsavefile&unscoped_q=qsavefile

Many KDE applications use this class (the most prominent ones being Krita and Ark), and it provides a useful feature:

QSaveFile is an I/O device for writing text and binary files, without losing existing data if the writing operation fails.
While writing, the contents will be written to a temporary file, and if no error happened, commit() will move it to the final file.

But as it causes loss of ACLs and xattrs as side effect, I think it's use deserves a proper discussion in a dedicated topic.

On the support for KIO::file_copy:

Trying to add support on the low level KIO::file_copy I found that it would be hard without code duplication or exposing the function to copy xattrs that are currently on KIO::CopyJobPrivate, but this would change the API, adding a non virtual method, what I think wont break the ABI.

I think the best way to do this is:

  1. Put copyXattr as a public method of FileCopyJob.
  2. Call copyXattr for files in FileCopyJob::file_copy, because CopyJob::copy uses it internally.
  3. Call copyXattr for directories on CopyJob, because it's the lowest abstraction level that knows that the mkdir have a source.

I can't find uses of file_copy related to user files in a very superficial search, only backups, config files and network stuff. So, in theory, the current patch is good enough and we don't need to change the API, if this is a problem.

autotests/jobtest.cpp
533

On linux, the command line bin are list/get/setfattr, on BSD list/get/setxattr, on mac are xattr -l/-p/-w, so this part will need a lot of ifdefs too, to work on many platforms.

src/core/config-kiocore.h.cmake
8

Didn't get it defined until added the snippet. Will research src/ioslaves/file/CMakeLists.txt to see if I can simplify the include for *BSD too.

src/core/copyjob.cpp
2182–2184

I think looks better this way, but if it throw a warning I will revert to encapsulating the whole function.

2186

I need data(), because the compiler complains about non private cast. but using encodeName() should be better either way.

2197

If a error is found at this point, all the rest of the function will not run, because it is on the a statement, but doing a early check for destination filesystem support of xattr is a good idea.

dfaure added a comment.Jan 5 2019, 6:44 PM

I'll be happy to review a patch for QSaveFile to preserve ACLs and xattrs.

bruns added a subscriber: bruns.Jan 6 2019, 1:16 AM
bruns added inline comments.
src/core/copyjob.cpp
2192

you can (typically) avoid the double (syscall, i.e. expensive) by preallocating the array and only reallocating if you get errno == ERANGE. Dito for getxattr.

2214

... always ...

2218

There should probably be a if (!xattrkey.startsWith("user.")) continue; here.

2219

getxattr is called on the same file again and again, thus it is much more efficient to use fgetxattr.

cochise updated this revision to Diff 48959.Jan 8 2019, 1:13 PM
cochise marked an inline comment as done.

All use cases working, added a new KIO to copy xattr

Working on copy and move with or without overwrite for calls of
KIO::copy, KIO::copyAs and KIO::file_copy of files and directories.
Due the last problems with the method to copy xattrs being private, I
created a new KIO::copy_xattr. Not very async, but the operation should
be fast enough.
Some more work on the tests, removed the ones that don't make sense for
the case and added checks to preent breackage if xattrs ar not supported.

cochise updated this revision to Diff 48960.Jan 8 2019, 1:21 PM

For some reason the added kio was not included

bruns added inline comments.Jan 8 2019, 7:00 PM
src/core/copyxattrjob.cpp
115

indentation

src/core/copyxattrjob.h
34

"... to another."

60

There are no params

cfeck added a subscriber: cfeck.Jan 9 2019, 6:15 AM
cfeck added inline comments.
src/core/filecopyjob.cpp
519

This waits (i.e. spawns a separate event loop) until the job is finished. Should use a subjob.

cfeck added inline comments.Jan 9 2019, 6:17 AM
src/core/copyjob.cpp
1119

Here, too.

cochise updated this revision to Diff 49487.Jan 14 2019, 9:22 PM
cochise marked 3 inline comments as done.

Use subjobs on file_copy and other small fixes

Not using subjobs in KIO::copy, as it breakes the tests.

cfeck added a comment.Jan 17 2019, 3:48 AM

I am still not fond of having a local event loop in KIO. The whole point of KIO is that it should work asynchronously.

If the tests fail, could they be adapted? Or is the reason why the tests fail unfixable?

cochise marked an inline comment as done.Jan 17 2019, 12:59 PM
This comment was removed by cochise.
src/core/copyjob.cpp
1119

I tried various ways to call this as a subjob, but all of them leads to breakage of non xattr related tests. Maybe some major changes to the class are needed.

But the call can be asynchronous with little effort. All tests still pass if start() is called, instead of exec().

2192

In theory, xattrs have unlimited size on some filesystems. Ext limits it to a fs block (4 Kb on most end user setups). Allocating full 4 kb is a overkill, as most of xattr ae small.

But as most of files don't have xattr, and the function will return here, we have to pay the cost of the second call only if we will use it. If we preallocate memory, we have to pay a cost for every file.

2214

On Linux, at least. Each item gets a \0 terminator and the list itself too. Not tested on other systems.

2218

Doing some research and test...

Currently, the security, system, trusted, and user extended attribute classes are defined as described below. Additional classes may be added in the future.

If a copy a file with kioclient as user, any attribute outside user. is lost and I get a warning for the ones I'm unable to preserve, but if I copy as root, all are preserved.

cochise added a comment.EditedJan 17 2019, 12:59 PM

I'm not sure why the tests fail, and the tests failing are unrelated to xattrs. I think there is some problem in queuing the subjobs and ensuring they all have finished.
After the copy job is finished, the copyLocalDirectory test can't find the file inside the copied test directory. Manually testing recursive copy I can't find problems, but the autotest fail. So, I think some parts of the copy job are executed after the copyjob finishes if I add a subjob.
As I said, we can use start, instead of exec, to be asynchronous, but using a subjob seems to need some refactor, maybe adding a new state for the xattr subjob.

Can we ship it, or using a subjob on KIO::copy is mandatory here?

Using exec() from a job implementation is a big no no, it creates an unexpected re-entrancy, source of all sorts of problems. This cannot be shipped as is.

See KCompositeJob's documentation for how to add a subjob - and/or see DirectorySizeJob for a simple example.

Where are we with this? Would be really nice to finally get this bug fixed for our users. :)

My vacancies ended and I'm making the annual planning for my classes, so I will be busy for at least two weeks, but I plan to work on the subjob bug as soon my schedule allows it.

Thanks, that would be appreciated. This looks like it's sooooo close...

bruns added a comment.Mar 5 2019, 3:25 PM

Does this work correctly when the source is a symlink? - getxattr vs lgetxattr.

src/core/copyxattrjob.cpp
134

if errno=ERANGE, the list has grown after you called listxattr(file, nullptr, 0). This is completetely valid, see man 2 listxattr:

But, bear in mind that there is a possibility that the set of extended attributes may change between the two calls, so that it is still necessary to check the return status from the second call.

Same applies for getxattr.

Also, you can cut the number of syscalls by half by using an opportunistic buffer size. Also saves you from allocating a buffer for each attribute value. As said, you have to check for "ERANGE" anyway.