[baloo/KInotify] Notify if folder was moved from unwatched place
ClosedPublic

Authored by poboiko on Feb 3 2019, 3:44 PM.

Details

Summary

If a folder was moved from an unwatched place, KInotify will receive an EventMoveTo event,
which doesn't have an EventMoveFrom counterpart, and thus it will emit only created signal
for the moved directory, but not its contents.
It also won't install watches for the directory (as it does in EventCreate).

Instead use FilteredDirIterator to emit created() signal for all the contents as well, and add inotify watches.

It should also now handle the race condition if a directory was created
and files were moved inside it before an inotify watch for this directory was installed.

Note that it's not really realted to symbolic links, one just has to move a folder from excluded place to included.

BUG: 342224

Test Plan

Added a test case for KInotifyTest, similar to one described in Bug 342224.
It passes now.

Diff Detail

Repository
R293 Baloo
Branch
add-watch-moved (branched from master)
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 8334
Build 8352: arc lint + arc unit
poboiko created this revision.Feb 3 2019, 3:44 PM
Restricted Application added projects: Frameworks, Baloo. · View Herald TranscriptFeb 3 2019, 3:44 PM
poboiko requested review of this revision.Feb 3 2019, 3:44 PM
ngraham accepted this revision.Feb 4 2019, 12:25 AM
ngraham added a subscriber: ngraham.

This makes sense to me. Thanks for the test, too.

This revision is now accepted and ready to land.Feb 4 2019, 12:25 AM
bruns added a subscriber: bruns.Feb 4 2019, 1:39 AM

To which "file1"/"file2" are you referring in the summary? The commit log should be self contained.

src/file/kinotify.cpp
447

QFile:decodeName(path) called twice for the same path

447

This also traverses the directory tree twice, once with all items (here), and once for directories (in KInotify::addWatch(...)).
Calling if (it.fileInfo().isDir()) { d->addWatch(it.filePath()) } in the loop should have the same effect.

bruns requested changes to this revision.Feb 4 2019, 1:40 AM
This revision now requires changes to proceed.Feb 4 2019, 1:40 AM
poboiko updated this revision to Diff 50832.Feb 4 2019, 11:43 AM
poboiko edited the summary of this revision. (Show Details)

Explained the race condition in summary, expanded test to check if watches were installed correctly.

I've encountered some pretty weird issue: the watch for the moved folder is not installed; but it is installed for all its subfolders.
The reason is that it.fileInfo() returns empty QFileInfo() for the moved folder. But it works fine for all the subentries.
That is not the issue with the previous version, because KInotify::addWatch doesn't check for fileInfo().isDir() explicitly: it just relies on DirsOnly flag for the iterator.

There is a workaround: if I create QFileInfo by hand, using QFileInfo fi(it.filePath()) instead of it.fileInfo(), everything seems to be working fine.
(the test fails if I use it.fileInfo(), but it passes if I use this workaround)

poboiko marked 2 inline comments as done.Feb 4 2019, 11:43 AM
bruns requested changes to this revision.Feb 4 2019, 7:18 PM
bruns added inline comments.
src/file/kinotify.cpp
446

FilteredDirIterator is somewhat odd, as it returns the traversed folder itself as the first element (if not filtered/excluded by the config).

The folder itself has no fileInfo, as it is not backed by a QDirIterator item. Unfortunately, this is not really trivial to fix.

Instead of creating another QFileInfo for each item, you can just do the following:

if (event->mask & IN_ISDIR) {
    Baloo::FilteredDirIterator it(d->config, QFile::decodeName(path));
    if (!it.next().isEmpty()) {
            // add folder itself, it not excluded
            d->addWatch(it.filePath());
    }
    while (!it.next().isEmpty()) {
        Q_EMIT created(it.filePath(), it.fileInfo().isDir());
        if (it.fileInfo().isDir()) {
            d->addWatch(it.filePath());
        }
    }
}
Q_EMIT created(QFile::decodeName(path), event->mask & IN_ISDIR);
This revision now requires changes to proceed.Feb 4 2019, 7:18 PM
bruns added a comment.Feb 4 2019, 7:21 PM

Explained the race condition in summary, expanded test to check if watches were installed correctly.

I am not sure if I understood your description correctly, but I am quite sure this race condition does not exist - the files/folders inside the moved folder are not created/moved one by one, but the containing folder ist just "renamed" - it is unlinked from the old parent and linked into the new one, atomically.

I am not sure if I understood your description correctly, but I am quite sure this race condition does not exist - the files/folders inside the moved folder are not created/moved one by one, but the containing folder ist just "renamed" - it is unlinked from the old parent and linked into the new one, atomically.

Sure. That concern corresponded only to the second note - if moving from another device, system has to do actual copy/move.

FilteredDirIterator is somewhat odd, as it returns the traversed folder itself as the first element (if not filtered/excluded by the config).
The folder itself has no fileInfo, as it is not backed by a QDirIterator item. Unfortunately, this is not really trivial to fix.

Damn. I've though it's behavior is inherited from QDirIterator. Didn't look into FilteredDirIterator enough. Thanks!

poboiko updated this revision to Diff 50932.Feb 5 2019, 11:25 AM

Added code to work with first entry that pops from FilteredDirIterator (that is the directory itself)
Test still works; but we should emit created() signal for it as well.

poboiko marked an inline comment as done.Feb 5 2019, 11:26 AM
bruns added a comment.Feb 5 2019, 4:10 PM

Added code to work with first entry that pops from FilteredDirIterator (that is the directory itself)
Test still works; but we should emit created() signal for it as well.

Thats the reason I moved the Q_EMIT created(...) out of the if(mask & IN_ISDIR) branch.

Thinking about it, it should be before the branch, otherwise baloo may receive "created" events for the children before getting a "created" for the parent.

bruns added a comment.EditedFeb 5 2019, 4:12 PM

I am not sure if I understood your description correctly, but I am quite sure this race condition does not exist - the files/folders inside the moved folder are not created/moved one by one, but the containing folder ist just "renamed" - it is unlinked from the old parent and linked into the new one, atomically.

Sure. That concern corresponded only to the second note - if moving from another device, system has to do actual copy/move.

But when "moving" across devices, we should get regular "created" events for each directory and file, as copying/moving across devices is done one by one. This is definitely not of concern for the "MovedTo" case.

For the "Created" case, we may be actually missing events, although it is possible the baloo core itself syncs its state when getting a directory create event. Haven't checked it though.

In general, inotify is racy, but the code as written here would not be, even for "created" events:

  1. For each directory, we add a watch, while traversing the parent directory.
  2. Each directory is also pushed to m_paths in FilteredDirIterator - https://phabricator.kde.org/source/baloo/browse/master/src/file/filtereddiriterator.cpp$80
  3. After traversing the parent, we descend into the first found directory - https://phabricator.kde.org/source/baloo/browse/master/src/file/filtereddiriterator.cpp$67
  4. The subdirectory is traversed - note, the watch has been installed already
poboiko updated this revision to Diff 50967.Feb 5 2019, 4:59 PM

Cosmetics

bruns added a comment.Feb 5 2019, 5:13 PM

Addendum:
I have just checked, https://phabricator.kde.org/source/baloo/browse/master/src/file/filewatch.cpp$144

filewatch relies on the code here to emit created events for all children, so we need the traversal code not only for "moved", but also for "created", as otherwise we may miss some "created" events.

Duplicate "created" events is completely fine, these are merged in the "PendingFileQueue".

So I would like you to do the following:

  • take the newly added code and create a function from it, e.g. handleCreateRecursive(event, path)
  • move the Q_EMIT created(...) to the beginning of the function
  • call this function for the "EventMoveTo && noCookie)" case
  • call this function also for "EventCreated"

Probably also add a comment to the function why the manual traversal is needed:

// Files/directories inside the new directory may be created before the watch
// is installed. Ensure created events for all children are issued at least once
bruns added inline comments.Feb 5 2019, 5:17 PM
src/file/kinotify.cpp
445

This is broken now, as you fail to emit created for any child.

bruns added inline comments.Feb 5 2019, 5:26 PM
src/file/kinotify.cpp
445

Ignore, looked at the wrong line ...

poboiko updated this revision to Diff 51062.Feb 6 2019, 10:24 PM
poboiko edited the summary of this revision. (Show Details)

Added recursive iteration over all contents for Create event as well

Something like that? I've decided not to emit created signal from inside the function, just to have a bit less branching in the code (and documented this behavior, since it might be a bit confusing)

Actually, I think this race condition is now handled properly, i.e. I can't think even of the case when we got created signal twice for the same file.

bruns added a comment.Feb 7 2019, 3:21 AM

Something like that? I've decided not to emit created signal from inside the function, just to have a bit less branching in the code (and documented this behavior, since it might be a bit confusing)

Actually, I think this race condition is now handled properly, i.e. I can't think even of the case when we got created signal twice for the same file.

Yes, looks almost fine now.

It is still possible to get two "created" signals for a new file, but as said, two signals are the race we handle fine, only zero signals would be an issue:

= a =
it->next()   -- adds "foobar" to it.m_paths
addWatch("foobar")
= b =
it.next(), it.next(), ... it.next()   -- siblings of "foobar"
= c =
it.next() -- take "foobar" from it.m_paths, instantiate QDirIterator("foobar")
it.next(), ... it.next()  -- contents of "foobar"
it.next() -- destruct "foobar" iterator
= d =

Any file created after =b= and before =c= will emit "created" twice, once from the already created watcher, and once during the traversal
Files created before =a= will be missed by the watch (thus we need the iteration), files created after =d= will only be picked up by the watcher.

src/file/kinotify.cpp
391

QFile::decodeName(path) twice ...

442

dito ...

bruns added inline comments.Feb 7 2019, 3:25 AM
src/file/kinotify.h
198

Can you add a note created events are only emitted if the file/directory is not exclude by the config, likewise for installed watches?

bruns requested changes to this revision.Feb 9 2019, 12:12 AM
This revision now requires changes to proceed.Feb 9 2019, 12:12 AM
poboiko added inline comments.Feb 12 2019, 11:12 AM
src/file/kinotify.cpp
391

I'm a bit lost, why is it a problem? Or do you mean it's a costly operation and suggest to do the following?

const QString& fname = QFile::decodeName(path);
Q_EMIT created(fname);
[...]
handleDirCreated(fname);
bruns added inline comments.Feb 12 2019, 3:47 PM
src/file/kinotify.cpp
391

yes, but just const QString fname = ...

poboiko added inline comments.Feb 12 2019, 3:57 PM
src/file/kinotify.cpp
391

Right, sorry, misprinted.
Actually, since there are a lot of decodeName calls around, probably it would be better to decode it just once, right before event->mask matching then, and then pass it everywhere?
(less code duplication & mutliple calls if mask matches several events...)

bruns added inline comments.Feb 15 2019, 3:11 AM
src/file/kinotify.cpp
391

Yes, but such a cleanup should go into a different review.

poboiko updated this revision to Diff 51739.Feb 15 2019, 11:52 AM

Updated comment, removed duplicated QFile::decodeName

bruns accepted this revision.Feb 15 2019, 2:10 PM

Thanks!

This revision is now accepted and ready to land.Feb 15 2019, 2:10 PM
poboiko updated this revision to Diff 51762.Feb 15 2019, 3:41 PM

Forgot to define fname inside EventMoveTo

poboiko closed this revision.Feb 15 2019, 3:42 PM