[Balooshow] Avoid out-of-bounds access when accessing corrupt db data
ClosedPublic

Authored by bruns on Sep 29 2018, 12:31 AM.

Details

Summary

Looping over word without bounds check may cause illegal memory accesses,
potentially crashing balooshow. Add sanity checks for required lengths
and provide feedback in case an error has occured.

Invalid data may occur when the DB has beend corrupted.

Test Plan

corrupt database
run balooshow -x <file>

Diff Detail

Repository
R293 Baloo
Branch
db_robustness2
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 3329
Build 3347: arc lint + arc unit
bruns created this revision.Sep 29 2018, 12:31 AM
Restricted Application added projects: Frameworks, Baloo. · View Herald TranscriptSep 29 2018, 12:31 AM
Restricted Application added a subscriber: kde-frameworks-devel. · View Herald Transcript
bruns requested review of this revision.Sep 29 2018, 12:31 AM
anthonyfieroni added inline comments.
src/tools/balooshow/main.cpp
217 ↗(On Diff #42530)

posOfNonNumeric < 0 or == -1 ?

bruns updated this revision to Diff 42547.Sep 29 2018, 12:52 PM
bruns marked an inline comment as done.

fix "not found" condition for word.indexOf(...)

bruns added inline comments.Sep 29 2018, 12:54 PM
src/tools/balooshow/main.cpp
217 ↗(On Diff #42530)

Thou shall not code after bedtime ... Thx.

I never experienced such corruption, though, but sanity check shouldn't hurt.

src/tools/balooshow/main.cpp
204

I'm not sure this check is needed - the other one (posOfNonNumeric < 0) seem to be covering this case.
There shouldn't be an empty QByteArray, right?

211

Same note here. The fewer code to maintain - the better :)

218

I think we should i18n() those messages as well

bruns added inline comments.Oct 7 2018, 2:43 PM
src/tools/balooshow/main.cpp
204

See word[0] access directly after.
I have fixed to many "shouldn't be" errors/crashes due to corrupt DB values in the past.

211

This is the exact case I have had - "X<garbage">.
The code after (word.indexOf('-', 2)) requires a check for length >= 3 here (code correctness), semantics require >= 4.

218

I am not really sure:
balooctl -x is a quite low level debug tool. This is a diagnostic message only printed in case of DB errors. Translated strings make search on the web harder.

Maybe we should print also a suggestion to user, something like (maybe rephrase it better)

WARNING:
Looks like your index is corrupted.
We suggest you to run `balooctl disable && balooctl disable` to wipe it and rebuild from scratch

so they won't have to google what to do (there's still not much can be done in that case)

BTW, does it also wipe tags and other user-provided metadata?

src/tools/balooshow/main.cpp
204

OK, you're right, I guess.

211

But still, if the term is short (namely, length < 4), we will either won't have "-" (this corresponds to posOfNonNumeric < 0, and that's X<garbage>), or it will be the last symbol (something like X1- - which is posOfNonNumeric+1 == word.length()).

218

On the one hand, you're right. On the other hand, the output should be consistent. Other messages here are translated.

bruns added inline comments.Oct 8 2018, 9:20 PM
src/tools/balooshow/main.cpp
211

but you are not allowed to access word[2] if word.length() < 3.

218

None of the other messages is an error message, but just regular output. Hm, maybe i18n("Internal Error: %1", message).

poboiko added inline comments.Oct 8 2018, 9:40 PM
src/tools/balooshow/main.cpp
211

We don't access it directly, and indexOf performs internal checks. For example, QStringLiteral("ab").indexOf('c', 5) seem to be perfectly valid code, which returns -1.
On one hand, it makes sense, and I think we can rely on it. On the other - it's not actually stated directly in the documentation...

218

OK, I think that's better

bruns added a comment.Oct 8 2018, 10:24 PM

BTW, does it also wipe tags and other user-provided metadata?

Tags and comments are store in XAttrs, i.e. with the file.

bruns updated this revision to Diff 43176.Oct 8 2018, 11:29 PM

make (introduction of) error message translatable

I'm not entirely sure how translating system works, but won't it pop up as 4 identical lines in i.e. Lokalize, causing frustration to our translators?
It would also mean that if we would want to change the message, we would have to do it in 4 different lines.

I suggest introducing something like an QString errorMessage and printing it only once.

bruns updated this revision to Diff 44224.Oct 25 2018, 8:37 PM

use KLocalizedString for deferred i18n substitution

bruns marked 11 inline comments as done.Oct 28 2018, 3:45 PM
bruns marked 3 inline comments as done.

@poboiko - good to go?

poboiko accepted this revision.Oct 30 2018, 12:26 PM

Yep, fine by me

This revision is now accepted and ready to land.Oct 30 2018, 12:26 PM
This revision was automatically updated to reflect the committed changes.