Diffusion Baloo 8fcd690fe853

Simplify orPostingIterator and make it faster

Authored by bruns on Mar 30 2018, 6:21 PM.

Description

Simplify orPostingIterator and make it faster

Summary:
Trivial searches (e.g. baloosearch foo) are expanded to a large list
of ORed small result sets (e.g. 80 terms with 2 to 5 entries each), so
speeding up the OR iterator is quite beneficial.

Currently, iterators which no longer return any entries are deleted
and replaced with nullptrs, thus the value has to be checked on each
iteration. Remove exhausted iterators from the vector instead. The
saved instructions more than amortize the cost of moving the remaining
elements in the vector.
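
A minimal sketch of the compaction idea, not the actual Baloo code: it uses
the standard library instead of Qt containers, and the PostingIterator
interface, ListIterator and advanceAndCompact() are hypothetical stand-ins
invented for illustration. An ID of 0 is assumed to mean "no more entries".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the posting iterator interface.
// next() advances and returns the new ID, or 0 when exhausted.
class PostingIterator {
public:
    virtual ~PostingIterator() = default;
    virtual std::uint64_t next() = 0;
    virtual std::uint64_t docId() const = 0;
};

// Illustrative leaf iterator over a sorted list of non-zero IDs.
class ListIterator : public PostingIterator {
public:
    explicit ListIterator(std::vector<std::uint64_t> ids) : m_ids(std::move(ids)) {}
    std::uint64_t next() override {
        m_current = (m_pos < m_ids.size()) ? m_ids[m_pos++] : 0;
        return m_current;
    }
    std::uint64_t docId() const override { return m_current; }
private:
    std::vector<std::uint64_t> m_ids;
    std::size_t m_pos = 0;
    std::uint64_t m_current = 0;
};

// Advance every iterator once and drop the exhausted ones in the same pass.
// Survivors are moved towards the front, so the vector never holds a
// nullptr and later loops need no per-element null check.
void advanceAndCompact(std::vector<std::unique_ptr<PostingIterator>>& iterators)
{
    std::size_t kept = 0;
    for (std::size_t i = 0; i < iterators.size(); ++i) {
        assert(iterators[i] != nullptr); // invariant, compiled out in release builds
        if (iterators[i]->next() == 0) {
            continue; // exhausted: destroyed when overwritten or erased below
        }
        if (kept != i) {
            iterators[kept] = std::move(iterators[i]);
        }
        ++kept;
    }
    iterators.erase(iterators.begin() + static_cast<std::ptrdiff_t>(kept),
                    iterators.end());
}
```

With three iterators over {1, 2}, {5} and an empty list, the first call
leaves two entries in the vector, the second call one, and the third none.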

The or operator has to return the smallest ID of the combined sets.
Instead of doing a traversal on each next() call, determine the smallest
ID on the first call and update it when checking if the iterators have
to be advanced.

Keep the docId in a local variable, as the virtual function call to
(PostingIterator*)->docId() is somewhat expensive.
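
The two points above (tracking the smallest ID while advancing, and keeping
the docId in a local) can be sketched as follows. This is an illustration
under assumptions, not the code from this commit: it uses the standard
library rather than Qt, and PostingIterator, ListIterator and OrIterator are
hypothetical names. An ID of 0 is assumed to mean "no more entries".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the posting iterator interface.
class PostingIterator {
public:
    virtual ~PostingIterator() = default;
    virtual std::uint64_t next() = 0;        // advance; 0 when exhausted
    virtual std::uint64_t docId() const = 0; // current ID
};

// Illustrative leaf iterator over a sorted list of non-zero IDs.
class ListIterator : public PostingIterator {
public:
    explicit ListIterator(std::vector<std::uint64_t> ids) : m_ids(std::move(ids)) {}
    std::uint64_t next() override {
        m_current = (m_pos < m_ids.size()) ? m_ids[m_pos++] : 0;
        return m_current;
    }
    std::uint64_t docId() const override { return m_current; }
private:
    std::vector<std::uint64_t> m_ids;
    std::size_t m_pos = 0;
    std::uint64_t m_current = 0;
};

class OrIterator : public PostingIterator {
public:
    explicit OrIterator(std::vector<std::unique_ptr<PostingIterator>> children)
        : m_children(std::move(children))
    {
        // Position every child on its first entry. Because m_docId is still 0,
        // step() advances all of them and records the smallest ID once.
        step();
    }
    std::uint64_t next() override {
        m_docId = m_nextId; // smallest ID was already found by the previous pass
        step();
        return m_docId;
    }
    std::uint64_t docId() const override { return m_docId; }
private:
    // Advance children that sit on m_docId, drop exhausted ones, and compute
    // the next smallest ID in the same loop (no separate min traversal).
    void step() {
        m_nextId = 0;
        std::size_t kept = 0;
        for (std::size_t i = 0; i < m_children.size(); ++i) {
            assert(m_children[i] != nullptr);
            // One virtual call; the value is reused through the local below.
            std::uint64_t id = m_children[i]->docId();
            if (id <= m_docId) {
                id = m_children[i]->next();
                if (id == 0) continue; // exhausted child is dropped
            }
            if (m_nextId == 0 || id < m_nextId) m_nextId = id;
            if (kept != i) m_children[kept] = std::move(m_children[i]);
            ++kept;
        }
        m_children.erase(m_children.begin() + static_cast<std::ptrdiff_t>(kept),
                         m_children.end());
    }
    std::vector<std::unique_ptr<PostingIterator>> m_children;
    std::uint64_t m_docId = 0;
    std::uint64_t m_nextId = 0;
};
```

For children over {1, 4, 7}, {2, 4} and an empty list, successive next()
calls on this sketch yield 1, 2, 4, 7 and then 0.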

According to valgrind, typical execution cost of Baloo::Query::exec()
is reduced by 25% to 40%.

Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>

Test Plan: valgrind --tool=callgrind baloosearch foo OR bar

Reviewers: Baloo, Frameworks, poboiko, ngraham

Reviewed By: Baloo, ngraham

Subscribers: ngraham, fvogt, kde-frameworks-devel, Frameworks

Tags: Frameworks, Baloo

Differential Revision: https://phabricator.kde.org/D11828

Details

Committed
bruns, Apr 11 2019, 9:47 PM
Reviewer
Baloo
Differential Revision
D11828: Simplify orPostingIterator and make it faster
Parents
R293:899ed35c6872: Ensure QFileInfo is valid for the first FilteredDirIterator entry