Baloo engine: treat every non-success code as a failure
ClosedPublic

Authored by valeriymalov on Feb 2 2019, 11:26 AM.

Details

Summary

Treating only MDB_NOTFOUND as an error leads to use of uninitliazed
pointers and handle IDs in other cases (e.g. when get fails with
MDB_BAD_TXN) and wreaks havoc in the application.

CCBUG: 361186
CCBUG: 390823
CCBUG: 372880
CCBUG: 395888
CCBUG: 367480
CCBUG: 403720

Diff Detail

Repository
R293 Baloo
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
valeriymalov created this revision.Feb 2 2019, 11:26 AM
Restricted Application added projects: Frameworks, Baloo. · View Herald TranscriptFeb 2 2019, 11:26 AM
Restricted Application added subscribers: Baloo, kde-frameworks-devel. · View Herald Transcript
valeriymalov requested review of this revision.Feb 2 2019, 11:26 AM
  • unbreak documentdatadb::contains, oops :(
ngraham added subscribers: bruns, ngraham.

Conceptually this seems like it's definitely better than nothing or the status quo. Only checking for MDB_NOTFOUND and ignoring other error statuses seems like really bad code hygiene. @bruns?

Nice! I like it, it's definitely much better than Q_ASSERT_X macros that are just silently ignored in non-debug builds.

Have some nitpicks though:

  1. Are we actually sure this is gonna fix all those crashes? Otherwise I would suggest using CCBUG instead of BUG inside the commit message.
  2. Looks like this patch is composed of two parts - introducing new logging category and and improving error handling. It would be also nice to split those into two separate patches.
  3. For now, logging is the only way to know if there's something bad going on. In that case I would suggest to at least increase severity of those messages - it would increase chances users will notice it. For example, use qCCritical (but this would also require additional check for okayish/non-critical return codes, such as MDB_NOTFOUND)
  4. There are a lot of redundant Q_ASSERT_X left, which could be removed. I suggest just grep'ing over the code to catch'em all. I've started marking them here, but then I gave up - too many of them.
  5. There are also several unchecked return codes as well, such as inside *DB::size() calls. Those can also be found by grep'ing over Q_ASSERT_X.
src/engine/documentdatadb.cpp
148–149

This is not needed now, I guess

src/engine/documentdb.cpp
147

This can also return something weird.

src/engine/postingdb.cpp
133–134

This assert is not necessary here then?

valeriymalov planned changes to this revision.Feb 3 2019, 9:43 PM
  1. Are we actually sure this is gonna fix all those crashes? Otherwise I would suggest using CCBUG instead of BUG inside the commit message.

I've chosen those bugs based on backtraces I could get myself but without core dumps I can't vouch for that, even with core dumps it'd be hard to tell. Alternatively I can CCBUG and after this gets into a release, ask in those bugs people if they still experience the crash and close them if they won't get back in a month or so.

  1. For now, logging is the only way to know if there's something bad going on. In that case I would suggest to at least increase severity of those messages - it would increase chances users will notice it. For example, use qCCritical (but this would also require additional check for okayish/non-critical return codes, such as MDB_NOTFOUND)

These errors have good potential to flood users log files. I get around 30 megabytes of those on each boot. I'd rather see them disabled by default.
It might make sense to add a catch-all error message somewhere on higher level, that would get printed out once, but I haven't looked yet where such message could be placed.

  1. Looks like this patch is composed of two parts - introducing new logging category and and improving error handling. It would be also nice to split those into two separate patches.
  2. There are a lot of redundant Q_ASSERT_X left, which could be removed. I suggest just grep'ing over the code to catch'em all. I've started marking them here, but then I gave up - too many of them.
  3. There are also several unchecked return codes as well, such as inside *DB::size() calls. Those can also be found by grep'ing over Q_ASSERT_X.

I'll look into these this week.

bruns added a comment.Feb 4 2019, 1:48 AM

In general, these changes seem worthwhile.
But as already said, splitting these into two parts - QDebug etc vs. Error handling - is a must.

I think this is a step into the right direction.
Thought I am still not sure if we not should go away from a own storage db for all this and port that over to e.g. use the tracker stuff.
Given that fixing the issues will be more or less a complete rewrite, too.
(I have some partly working port in my aged https://cgit.kde.org/clones/baloo/cullmann/tbaloo.git/)

valeriymalov marked 3 inline comments as done.
  • clean up rest of the asserts
bruns added inline comments.Feb 9 2019, 3:40 PM
src/engine/documentdatadb.cpp
42

Warning

54

warning

76

warning

90–93

please keep MDB_NOTFOUND as "ok" here, any other return code != 0 -> qCWarning

109–110

see get

125

see get

src/engine/documentdb.cpp
44

same rules for warning/debug as in documentdatadb

src/engine/postingdb.cpp
142–144

if (rc != MDB_NOTFOUND) ...

235

critical - You are leaking the cursor now

src/engine/transaction.cpp
283

warning

src/file/fileindexscheduler.cpp
186

add a ", " after battery

valeriymalov marked 10 inline comments as done.
  • update logging per review
valeriymalov added inline comments.Feb 9 2019, 6:09 PM
src/engine/documentdatadb.cpp
109–110

Trying to delete a non-existent entry seems like an error to me

bruns added inline comments.Feb 9 2019, 6:58 PM
src/engine/documentdatadb.cpp
109–110

Deleting is an idempotent operation, so e.g. doing it twice is fine. There may be multiple events queued leading to the deletion of a database entry, i.e. deleting a file and then deleting a parent directory recursively.

src/engine/postingdb.cpp
235

not done ... you have to always mdb_cursor_close

Also, MDB_NOTFOUND is not an error here.

valeriymalov marked 4 inline comments as done.
  • fix ::del error logging criteria, don't return without closing cursor in
bruns added a comment.Feb 10 2019, 4:07 PM

looks almost good to me, can someone else please crosscheck?

src/engine/documenturldb.h
140

indentation

src/engine/mtimedb.cpp
110–113

if (rc != MDB_NOTFOUND) for the message

src/engine/postingdb.cpp
235

For the debug message, level warning is sufficient ;-)

src/engine/transaction.cpp
304

This should also be a warning IMHO

bruns added a comment.Feb 10 2019, 4:34 PM

Also, these should only be CCBUG, as it is only an assumption it fixes any of these.

Actually, I have seen the database return corrupt data, not caused by a failed call, but because the entry did contain garbage.

bruns added a comment.Feb 10 2019, 4:35 PM

Also, remove the second paragraph from the summary - a commit message is not a TODO list.

valeriymalov edited the summary of this revision. (Show Details)Thu, Feb 21, 4:25 PM
valeriymalov marked 4 inline comments as done.
  • review warning fixes
bruns requested changes to this revision.Thu, Feb 21, 6:06 PM
bruns added inline comments.
src/engine/mtimedb.cpp
110–113

for the message - you now only retrieve the first entry, spew an error message, and break. In case the key does not exist, you loop infinitely.

This revision now requires changes to proceed.Thu, Feb 21, 6:06 PM
  • fix MTimeDB::get loop
bruns added a comment.Thu, Feb 21, 7:40 PM

I think this looks good now, but I would prefer an OK from another developer, as this has become quite large.

Readability good profit from if (rc) -> if (rc != MDB_SUCCESS) resp. if (rc == 0) -> if (rc == MDB_SUCCESS), but I am not sure about this.

I've looked through the patch (quite large indeed), apart from the single note I think it's good to go.

src/engine/postingdb.cpp
238

Can it happen that rc == MDB_NOTFOUND just after some iterations of MDB_NEXT operation (i.e. we've reached the end of DB), but keys which popped up previously are perfectly valid, and we're just ignoring them here?
(shouldn't it read rc != 0 && rc != MDB_NOTFOUND, just like above?)

bruns requested changes to this revision.Sat, Feb 23, 4:09 PM
bruns added inline comments.
src/engine/postingdb.cpp
238

Yes, this is indeed wrong - it only works if the loop is left via 'if (!arr.startsWith(prefix)) break`, but not when all entries in the key range have a matching prefix.

If an additional rc check here is correct is hard to say. We are in an error condition when rc != {NOTFOUND, SUCCESS}, but all the values in termIterators are valid.

I vote for removing the rc check here again.

This revision now requires changes to proceed.Sat, Feb 23, 4:09 PM
bruns added a comment.Sat, Mar 9, 4:43 PM

@valeriymalov - are you going to update this?

valeriymalov marked 3 inline comments as done.
  • do not ignore results of PostingDB::iter in case of an error/end of db
bruns accepted this revision.Mon, Mar 11, 12:48 AM
This revision is now accepted and ready to land.Mon, Mar 11, 12:48 AM
This revision was automatically updated to reflect the committed changes.