Always skip trailing slashes in FilderedDirIterator
Needs ReviewPublic

Authored by poboiko on May 27 2019, 10:44 AM.

Details

Reviewers
bruns
ngraham
Group Reviewers
Frameworks
Baloo
Summary

I've encountered weird regression, introduced by D18830: Handle folders matching substrings of included/excluded folders correctly. Here is the description:

  1. Since some time, config->includeFolders() contains folders with trailing slashes.
  2. First entry that QDirIterator::path() returns matches exactly its argument, i.e. if we feed it with path with trailing slash, it will return exactly it. All the other paths don't have trailing slashes though.
  3. Because of that, inside UnindexedFileIndexer::run(), when we check whether path has changed, we perform comparison it.filePath() == tr.documentUrl(id), and it fails becaue documentUrl() always returns path without trailing slash.
  4. So we call DocumentUrlDB::replace(), which removes old entry and calls DocumentUrlDB::put().
  5. Finally, for some reason, IdTreeDB don't want to work with paths with trailing slashes, so it gets silently ignored.

So in the end of the day we get entry for includeFolder removed from DB. Which corrupts IdTreeDB --- it can no longer resolve paths.
The simplest solution proposed here is to make sure our DirIterator always returns paths without trailing slash, including first call.

BUG: 409257

Test Plan

It compiles. As far as I can see, it also fixes the regression.
Although, for some reason, this regression it's not always reproducible, so I cannot be completely sure. Would appreciate if someone else looked into it.

Diff Detail

Repository
R293 Baloo
Branch
trailingSlash
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 12187
Build 12205: arc lint + arc unit
poboiko created this revision.May 27 2019, 10:44 AM
Restricted Application added projects: Frameworks, Baloo. · View Herald TranscriptMay 27 2019, 10:44 AM
Restricted Application added a subscriber: kde-frameworks-devel. · View Herald Transcript
poboiko requested review of this revision.May 27 2019, 10:44 AM
poboiko edited the summary of this revision. (Show Details)May 27 2019, 10:44 AM
poboiko edited the summary of this revision. (Show Details)

Ping.

I've found a way to reproduce a related issue:

$ mkdir ~/test
$ balooctl config add includeFolders ~/test
$ balooctl stop
<make some changes with ~/test, i.e. add a tag>
$ balooctl start

This prints an error:

replace called with invalid arguments, docId: <docid> url: "~/test/"

The problem is the same: DocumentUrlDB returns a path without trailing slash, but FilteredDirIterator returns a path with one.
WriteTransaction thinks the path has changed, tries to replace it, calls DocumentUrlDB::replace, which fails because it doesn't want to work with path which has trailing slash.

In general, it's not a serious issue: we have problems only for folders that are inside includeFolders. If such folder wasn't renamed, then we don't care about DocUrlDB::replace failure anyway.
If it was renamed, most likely it's not inside includeFolders anymore. However, we can do something like

$ mkdir ~/test1
$ mkdir ~/test2
$ balooctl config add includeFolders ~/test{1,2}
$ touch ~/test1/somefile
$ balooctl stop
$ rm -rf ~/test2
$ mv ~/test2 ~/test1
$ balooctl start

the rename then gets silently ignored (file somefile doesn't pop up in searches; if we do balooshow -x <docid>, it returns an invalid path).

Not entirely sure, but bug 409257 might be caused by that (at least in my case, it looked the same).

poboiko edited the summary of this revision. (Show Details)Jun 30 2019, 12:09 PM
poboiko edited the summary of this revision. (Show Details)Jul 11 2019, 7:04 AM

Ping!

Apparently, it does fix bug 409257, which is pretty serious one (db corruption, after all).

Ping!

Apparently, it does fix bug 409257, which is pretty serious one (db corruption, after all).

The DB corruption is already fixed in KF5.60.

Ping!

Apparently, it does fix bug 409257, which is pretty serious one (db corruption, after all).

The DB corruption is already fixed in KF5.60.

I saw it in master and was wondering if it ended up in 5.60 (and whether it's still possible to corrupt it in any other way).

OK, DB corruption apart, I believe this patch is still relevant - we rely on paths not having trailing slash in different parts of the code anyways. And it clearly doesn't work as expected.