Differential D12047

Avoid crash when reading corrupt data from document terms db
Closed · Public

Authored by bruns on Apr 8 2018, 2:48 PM.

Details

Reviewers

michaelh
ngraham
dhaumann

Group Reviewers

Baloo
Frameworks

Commits

R293:e1d1b7e87ff1: Avoid crash when reading corrupt data from document terms db

Summary

The terms db contains terms, where each term is stored either independently
(terminated with 0) or as a suffix to the previous term (terminated with
1).
In case of corrupted data, the first terminator seen may be a 1, which
leads to a crash when trying to access the previous term with
QVector<>::last().
Show a debug message, to give a hint about the bad data, which can be
fixed by reindexing the relevant file.
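
The failure mode described above can be illustrated with a minimal sketch. This is not the code from this revision: `decodeTermsSketch` is a hypothetical name, and it uses `std::string` and `std::vector` instead of the Qt types so that it is self-contained. It assumes only the format stated in the summary (byte 0 ends an independent term, byte 1 ends a suffix of the previous term) and reports corrupt input instead of reading the last element of an empty container.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper for illustration; not the actual Baloo codec.
// Returns std::nullopt when a suffix terminator (byte 1) is seen before
// any complete term, i.e. when there is no previous term to extend.
std::optional<std::vector<std::string>> decodeTermsSketch(const std::string& data)
{
    std::vector<std::string> terms;
    std::string pending;  // bytes read since the last terminator

    for (char c : data) {
        if (c == '\0') {
            // Independent term: store it as is.
            terms.push_back(pending);
            pending.clear();
        } else if (c == '\x01') {
            // Suffix of the previous term: only valid if one exists.
            if (terms.empty()) {
                return std::nullopt;  // corrupt data, caller can log a hint
            }
            terms.push_back(terms.back() + pending);
            pending.clear();
        } else {
            pending.push_back(c);
        }
    }
    // Bytes after the last terminator are ignored in this sketch.
    return terms;
}
```

A caller could treat the empty result as the signal to print the debug message and skip the document, rather than crash.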

BUG: 392878
CCBUG: 392877

Test Plan

Corrupt the database
Run balooshow -x <affected file(s)>

Diff Detail

Repository

R293 Baloo

Lint

Automatic diff as part of commit; lint not applicable.

Unit

Automatic diff as part of commit; unit tests not applicable.

bruns created this revision. · Apr 8 2018, 2:48 PM

Restricted Application added projects: Frameworks, Baloo. · Apr 8 2018, 2:48 PM

Restricted Application added a subscriber: Frameworks.

bruns requested review of this revision. · Apr 8 2018, 2:48 PM

bruns added a reviewer: ngraham. · Apr 8 2018, 3:37 PM

Corrupt the database

As described in BUG: 392877?

In D12047#246715, @michaelh wrote:

Corrupt the database

As described in BUG: 392877?

yes.

bruns added a reviewer: Frameworks. · Apr 26 2018, 6:32 PM

Kind request to review ...

Restricted Application added a subscriber: kde-frameworks-devel. · May 15 2018, 2:38 PM

If no one is willing to review, I will push this tomorrow.

Rebase

If the format is really such that a term must appear before any Suffix, then this patch is already better than before.

Could it happen to have e.g.: a\0b1c1

If so, this code would extend the Suffix b with Suffix c. Would that be correct? Or can that never happen? Or should c be a Suffix for a? If so, this code should be improved.

This revision is now accepted and ready to land. · May 29 2018, 2:28 AM

In D12047#269976, @dhaumann wrote:

If the format is really such that a term must appear before any Suffix, then this patch is already better than before.

Could it happen to have e.g.: a\0b1c1

If so, this code would extend the Suffix b with Suffix c. Would that be correct? Or can that never happen? Or should c be a Suffix for a? If so, this code should be improved.

a\x00b\x01c\x01 would be decoded as "a", "ab", "abc".

"the", "their", "theirs", "there" is encoded as "the\x00ir\x01s\x01there\x00".
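
For readers who want to check the example above, here is a rough sketch of an encoder that yields that byte sequence. It is an assumption built from the format described in this thread, not the encoder in src/codecs/doctermscodec.cpp. `encodeTermsSketch` is a hypothetical name, and the sketch assumes the terms arrive sorted so that a term extending its predecessor directly follows it.

```cpp
#include <string>
#include <vector>

// Hypothetical helper for illustration; not the actual Baloo codec.
// A term that starts with the previous term is written as its suffix
// followed by byte 1; any other term is written in full followed by byte 0.
std::string encodeTermsSketch(const std::vector<std::string>& terms)
{
    std::string out;
    const std::string* prev = nullptr;  // no previous term for the first entry

    for (const std::string& term : terms) {
        const bool extendsPrev = prev != nullptr
            && term.size() >= prev->size()
            && term.compare(0, prev->size(), *prev) == 0;

        if (extendsPrev) {
            // Only the part after the shared prefix is stored.
            out.append(term, prev->size(), std::string::npos);
            out.push_back('\x01');
        } else {
            out += term;
            out.push_back('\0');
        }
        prev = &term;
    }
    return out;
}
```

Because the first term never has a predecessor, well-formed output from this sketch always begins with a 0-terminated term, which is the invariant the corrupt data violates.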

Ok, then please commit.

dhaumann added inline comments. · May 29 2018, 12:46 PM

src/engine/documentdb.cpp
Line 101: Ah, maybe this should be a qWarning()? Feel free to decide as you wish.

Closed by commit R293:e1d1b7e87ff1: Avoid crash when reading corrupt data from document terms db (authored by bruns). · May 29 2018, 11:52 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents
Changeset List

			Path	Packages
M			src/codecs/doctermscodec.cpp (5 lines)
M			src/engine/documentdb.cpp (6 lines)

Diff	ID	Base	Description	Created	Lint	Unit
Base			Base
Diff 1	31672	b94a91a		Apr 8 2018, 2:48 PM	★	★
Diff 2	35069	0030e88	Rebase	May 29 2018, 12:19 AM	★	★
Diff 3	35152	0030e88	R293:e1d1b7e87ff1e8ce6a7e03ecdf2902322cb8624a	May 29 2018, 11:47 PM	★	★
