KDirectoryContentsCounter: Avoid scanning twice the same dir, prioritise path not in cache
AbandonedPublic

Authored by meven on May 9 2020, 2:25 PM.

Details

Summary

Use QVector instead of QQueue to avoid adding to the queue twice the same dir for subsequent scanning.

Add a priority queue for path that were not already present in cache.

Diff Detail

Repository
R318 Dolphin
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 26616
Build 26634: arc lint + arc unit
meven created this revision.May 9 2020, 2:25 PM
Restricted Application added a project: Dolphin. · View Herald TranscriptMay 9 2020, 2:25 PM
Restricted Application added a subscriber: kfm-devel. · View Herald Transcript
meven requested review of this revision.May 9 2020, 2:25 PM
ngraham accepted this revision.May 11 2020, 2:44 PM
This revision is now accepted and ready to land.May 11 2020, 2:44 PM

Use QVector instead of QQueue to avoid adding to the queue twice the same dir for subsequent scanning.

Add a priority queue for path that were not already present in cache.

Can you split these two things into different commits? Or are they related?

meven added a comment.May 12 2020, 5:26 AM

Use QVector instead of QQueue to avoid adding to the queue twice the same dir for subsequent scanning.

Add a priority queue for path that were not already present in cache.

Can you split these two things into different commits? Or are they related?

They are not directly related, but one is dependent on the other because of the lines it touches.
Separating the two would mean undoing the code and is not just about selecting the right.
But they still relate to the same thing : improve the queue of dirs to compute the size of.

And given this patch is small, I'd rather not split and spend my time doing other things.

If you insist, I will comply.

Personally I don't really see the need.

feverfew added a subscriber: feverfew.EditedMay 13 2020, 2:50 PM

Hmmmm, this looks like a poor man's priority queue, would std::priority_queue not be appropriate here? (I'm not blocking here, idm tbh but it just seems more natural here to me). Also means that you don't have to explicitly enforce it all the time the queues are used, the priority is decided only in one place.

meven added a comment.May 13 2020, 4:54 PM

Hmmmm, this looks like a poor man's priority queue, would std::priority_queue not be appropriate here? (I'm not blocking here, idm tbh but it just seems more natural here to me). Also means that you don't have to explicitly enforce it all the time the queues are used, the priority is decided only in one place.

It is not a poor man std::priority_queue, because a std::priority_queue isn't really needed: the value we separate on is just a bit.
And It is not directly usable since I would need to keep this bit somewhere, and has not the required API : it does not have a contains function.

@meven Yes please. The patch may be small, but it's not trivial. Smaller atomic changes are always easier to review.

src/kitemviews/private/kdirectorycontentscounter.cpp
186–194

So if I'm not wrong, this is 2*O(n) + another O(n) for the insert. Previously it was just an amortized O(1) insert.

Are we sure we are not slowing down?

meven planned changes to this revision.May 18 2020, 6:43 AM
meven marked an inline comment as done.
meven added inline comments.
src/kitemviews/private/kdirectorycontentscounter.cpp
186–194

This won't slow anything down : those queues contain filename to stat on, i.e calling syscalls that are 10 times slower than those complexity changes.

I privileged convenience here as it is in fact not performant critical, and lacking better alternative as data structure.
In fact it makes me realize I should use QLinkedList instead of QVector, since I only ever insert and pop.
But again this won't be noticeable in the end.