Avoid manipulation of lists with quadratic complexity
ClosedPublic

Authored by bruns on Apr 15 2018, 8:43 PM.

Details

Summary

When removing items from lists Baloo uses an implementation with quadratic complexity,
i.e. O(m*n) - m: removed items, n: items in list.
Use std::remove_if/std::partition and erase instead.

Test Plan

make test

Diff Detail

Repository
R293 Baloo
Branch
remove-if (branched from master)
Lint
No Linters Available
Unit
No Unit Test Coverage
michaelh created this revision.Apr 15 2018, 8:43 PM
Restricted Application added projects: Frameworks, Baloo. · View Herald TranscriptApr 15 2018, 8:43 PM
Restricted Application added a subscriber: Frameworks. · View Herald Transcript
michaelh requested review of this revision.Apr 15 2018, 8:43 PM

If this is correct, I can extend.

bruns added a subscriber: bruns.Apr 15 2018, 9:27 PM
bruns added inline comments.
src/file/fileindexscheduler.cpp
148

You can put the lambda into the remove_if invocation:

auto tail = std::remove_if(list.begin(), list.end(),
                           [&dir](const QString& file) {
                               return file.startsWith(dir);
                           });
150

tail instead of it ?

src/file/pendingfilequeue.cpp
75

You are missing the m_pendingFiles.remove(...) and m_recentlyEmitted.remove(...) calls here.

Variant A:
You can use std::partition instead of std::remove_if here (which keeps the elements between tail and .end() intact), and iterate over the tail to remove the elements from m_pendingFiles and m_recentlyEmitted. Last, do the erase(tail, m_cache.end()).

Variant B:
Use std::remove_if, but call m_pendingFiles.remove() (dito m_recentlyEmitted) inside the lambda.

Variant A is definitely the much cleaner approach.

michaelh updated this revision to Diff 32285.Apr 16 2018, 1:37 PM
  • Include omissions
  • Const!
michaelh added inline comments.Apr 16 2018, 1:42 PM
src/file/fileindexscheduler.cpp
148

I suppose developers on my level will appreciate code to be most self-explanatory. I'm not objecting though, we might al well go "all the way":

list.erase(std::remove_if(list.begin(), list.end(), [&dir](const QString& file) {
        return file.startsWith(dir);
    }), list.end());
src/file/pendingfilequeue.cpp
75

Ooops.

bruns added inline comments.Apr 16 2018, 5:10 PM
src/file/fileindexscheduler.cpp
148

Following Qt coding style, the [] () { should go on a separate line, as written above

I think having list.erase as a separate statement makes reading simpler.

michaelh updated this revision to Diff 32322.Apr 16 2018, 7:28 PM
  • Apply suggested change
michaelh marked 4 inline comments as done.Apr 16 2018, 7:28 PM
michaelh marked 2 inline comments as done.
bruns added inline comments.Apr 18 2018, 12:15 AM
src/file/fileindexscheduler.cpp
148

indent by 4, please. And align }); with [&dir]

jtamate added inline comments.
src/file/pendingfilequeue.cpp
68 ↗(On Diff #32285)

According to std::partition
what matches isDescendant should be from m_cache.begin() up to startRemoving, unless you want to remove what does not match isDescendant, isn't it?

bruns requested changes to this revision.Apr 20 2018, 12:44 PM
bruns added inline comments.
src/file/pendingfilequeue.cpp
68 ↗(On Diff #32285)

Good catch, std::remove_if places the elements where the predicate returned false into the first "partition", std::partition shows the opposite behavior.

As we have to discard the second part (for O(1) behavior), the correct fix is to inverse the return value of the predicate.

Probably rename the the predicate to "preserveFile" or something like that.

This revision now requires changes to proceed.Apr 20 2018, 12:44 PM
bruns added a comment.May 15 2018, 2:15 PM

Apparently @michaelh is MIA - how do we proceed here?

Restricted Application edited subscribers, added: kde-frameworks-devel; removed: Frameworks. · View Herald TranscriptMay 15 2018, 2:15 PM

You could commandeer it and finish it yourself I guess. :-(

I'll see if I can contact Michael.

bruns updated this revision to Diff 35068.May 29 2018, 12:12 AM

Fix remaining issue (inverse condition in std::partition), cleanup

bruns commandeered this revision.May 29 2018, 12:13 AM
bruns edited reviewers, added: michaelh; removed: bruns.
bruns marked 2 inline comments as done.

Looks good to me.

src/file/pendingfilequeue.cpp
69 ↗(On Diff #32285)

++it is faster.

bruns edited the summary of this revision. (Show Details)May 30 2018, 12:24 AM
bruns added inline comments.May 30 2018, 12:44 AM
src/file/pendingfilequeue.cpp
69 ↗(On Diff #32285)

GCC generates identical code for both ...

I would give a ship-it - but maybe another review would be good? @ngraham maybe?

src/file/pendingfilequeue.cpp
69 ↗(On Diff #32285)

Did you check? This is certainly correct for i++ if i is an int. But for iterators, which are classes, this is often not the case, but I may be wrong here. In any case, this is a minor nitpick you can simply ignore, so please go ahead :-)

bruns added inline comments.May 30 2018, 3:31 PM
src/file/pendingfilequeue.cpp
69 ↗(On Diff #32285)

Yes, objdump output is identical.
It is likely true for any decent compiler when the iterator implementation is fully contained in the headers. The compiler is aware the iterator copy is unused and can drop the respective code.

I would give a ship-it - but maybe another review would be good? @ngraham maybe?

I'm afraid I don't really consider myself qualified to review this kind of change. I'm more of a front-end developer, and code like this makes my head hurt! :)

Still waiting for a final review ...

Still waiting for a final review ...

Hello! Anyone at home?

bruns added a comment.Jul 14 2018, 2:36 PM

Still waiting for a final review ...

Hello! Anyone at home?

Another 2 weeks ...

mpyne accepted this revision.Jul 15 2018, 5:51 PM
mpyne added a subscriber: mpyne.

I don't use Baloo directly but I've taken a look from a code sanity perspective and agree with @dhaumann that the change is a proper port of the existing logic.

bruns marked an inline comment as done.
bruns marked 4 inline comments as done.Jul 15 2018, 9:29 PM
bruns removed a reviewer: Baloo.
This revision is now accepted and ready to land.Jul 15 2018, 9:30 PM
Restricted Application added a subscriber: Baloo. · View Herald TranscriptJul 15 2018, 9:30 PM
bruns edited reviewers, added: Baloo; removed: michaelh.Jul 15 2018, 9:31 PM
bruns added a comment.Jul 15 2018, 9:33 PM

Sorry for the noise - apparently, if one requests changes, adopt the revision, does the changes, the "Changes requested" flag is not cleared and can not be cleared ...

This revision was automatically updated to reflect the committed changes.