Change qSort to std::sort
ClosedPublic

Authored by jtamate on Feb 26 2018, 12:11 PM.

Details

Summary

qSort is depecreated in Qt5 in favor of std::sort.
In some workloads qSort can be more than 200 times slower than std::sort,
in many others qSort is a little bit faster.

Test Plan

Select 50.000 small files and press Shift-Supr.
With qSort 11% of cpu


And times around 300.000 miliseconds.

Select the same 50.000 small files and press Shift-Supr, cancel and press Shift-Supr again.
With std::sort 3.3% of cpu


And times around 900 miliseconds.

Diff Detail

Repository
R241 KIO
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
jtamate created this revision.Feb 26 2018, 12:11 PM
Restricted Application added a project: Frameworks. · View Herald TranscriptFeb 26 2018, 12:11 PM
jtamate requested review of this revision.Feb 26 2018, 12:11 PM
apol added a subscriber: apol.Feb 26 2018, 1:10 PM

+1
If there's 11, we better change them all at once?

markg added a subscriber: markg.Feb 26 2018, 4:36 PM

+1, also for what @apol said.

dfaure requested changes to this revision.Feb 26 2018, 6:48 PM

I'm not opposed to the idea, but measuring CPU usage is a rather misleading indicator. What if it takes 10 times longer, because it's progressing much more slowly? ;)

Please at least measure with QElapsedTimer around the sorting (not to be committed, just to gather numbers about the actual performance of this from a user's point of view).
I'm interested in the result ;)

This revision now requires changes to proceed.Feb 26 2018, 6:48 PM

I'm not opposed to the idea, but measuring CPU usage is a rather misleading indicator. What if it takes 10 times longer, because it's progressing much more slowly? ;)

Please at least measure with QElapsedTimer around the sorting (not to be committed, just to gather numbers about the actual performance of this from a user's point of view).
I'm interested in the result ;)

The results are strange. All the results are measured sorting 50.000 small files:

Both Qt are the same version from opensuse.

In an i5, why the difference is so big? recent cpu bugs?
qSort in i3
274764, 276060 (with 3 directories), 365878, 424506 (without directories)
std::sort in i3
940, 995 (with 3 directories), 2472, 2539 (without directories)

In AMD the results are closer, qSort wins
qSort in AMD
658, 726, 695, 683, 676, 666, 649, 684, 666 (without directories)
std:sort in AMD
843, 839, 878, 896, 925 (without directories)

apol added a comment.Feb 27 2018, 12:05 PM

I'd say the bottom line is qSort is deprecated in favor of std::sort.

@jtamate make sure you are profiling release builds.

I'm not opposed to the idea, but measuring CPU usage is a rather misleading indicator. What if it takes 10 times longer, because it's progressing much more slowly? ;)

Please at least measure with QElapsedTimer around the sorting (not to be committed, just to gather numbers about the actual performance of this from a user's point of view).
I'm interested in the result ;)

The results are strange. All the results are measured sorting 50.000 small files:

Both Qt are the same version from opensuse.

In an i5, why the difference is so big? recent cpu bugs?
qSort in i3
274764, 276060 (with 3 directories), 365878, 424506 (without directories)
std::sort in i3
940, 995 (with 3 directories), 2472, 2539 (without directories)

In AMD the results are closer, qSort wins
qSort in AMD
658, 726, 695, 683, 676, 666, 649, 684, 666 (without directories)
std:sort in AMD
843, 839, 878, 896, 925 (without directories)

That surprises me a lot!
I "thought" qSort was just a template or alias to std::sort, but it isn't: http://code.qt.io/cgit/qt/qtbase.git/tree/src/corelib/tools/qalgorithms.h#n340
But it isn't which kinda makes my other expectations moot (qSort == std::sort... it isn't)

So i did some tests as well.
50000 filenames sorting them (computer sorting, not the natural one) with std::string and QString (yes, it matters)
Filenames, well, sorta.. Made them up in a loop :)

QString version: https://p.sc2.nl/r1JTFaf_z
qSort: ~220ms
std::sort ~276ms

std::string version: https://p.sc2.nl/HJ69Y6G_G
qSort: ~100ms
std::sort ~130ms

Note that std::string might not be the fair comparison as it's 8 bit/char. QString is 16.
My compiler (in this rare case) Visual Studio 2017 on an Intel CPU.

I still think it's wise to replace all qSort, if only for it being deprecated. But it does suck a little that qSort beats std::sort.

jtamate updated this revision to Diff 28274.Feb 28 2018, 3:32 PM
jtamate edited the summary of this revision. (Show Details)
jtamate edited the test plan for this revision. (Show Details)

Changed all qSort in Qlist to std::sort.
At least the people with Intel(R) Core(TM) i5-4200U (model 69) will see a big improvement in time.

I tried with KDE Neon with a kernel without patches for the cpu, and with the dolphin included in Neon and OpenSuse, the results are the same: very slow using qSort.

markg accepted this revision.Feb 28 2018, 5:51 PM

+1, but wait till someone else had a look as well.

src/widgets/kdirmodel.cpp
42

Don't you need to keep that one?
Nothing seems to be including <algorithm> (for std::sort at the very least) so i'm not sure why it would work at the moment without that header.

dfaure accepted this revision.EditedFeb 28 2018, 7:09 PM

" it does suck a little that qSort beats std::sort."
I was curious whether that was true, so I ran Mark's benchmark, with a few fixes: one more zero in the number of items for the vectors, because there was too much variation in numbers when measuring something that lasts around 500ms, and I looped over the two sorting methods, to avoid having to run the benchmark multiple times (because I'm lazy, and because the first iteration seems to have some cold cache effect, the later ones are faster). http://www.davidfaure.fr/2018/qsort_performance.cpp

That gives me results that are not clearly in favour of any of the two algorithms, in fact.
In release mode:

"qSort took: 4942 ms"
"std::sort took: 5080 ms"

"qSort took: 4853 ms"
"std::sort took: 4673 ms"

"qSort took: 4656 ms"
"std::sort took: 4782 ms"

"qSort took: 4606 ms"
"std::sort took: 4693 ms"

"qSort took: 4766 ms"
"std::sort took: 4659 ms"

"qSort took: 4656 ms"
"std::sort took: 5106 ms"

I ran both sorting algos (in RelWithDebInfo mode) in perf+hotspot to find out where the difference comes from, but I see nothing conclusive. std::sort goes via swap<QString,QString> which calls QString::swap(QString&) which calls qSwap(pointers) while qSort is able to call qSwap(pointers) directly, but that's all inline, shouldn't make a difference, I would expect. So it must be the sorting algorithm itself. Oh well, not much we can do, apart from, well, comparing libstdc++ and libc++, another day :)

(Curiously, in debug mode, qSort is much slower than std::sort (9s vs 7.6s).)

This revision is now accepted and ready to land.Feb 28 2018, 7:09 PM
jtamate planned changes to this revision.Mar 1 2018, 8:24 AM

I've run David test, it is not a CPU problem:

g++ -fPIC -O2 qsort_performance.cpp -I /usr/include/qt5/QtCore -I/usr/include/qt5 -l Qt5Core
./a.out
"qSort took: 3335 ms"
"std::sort took: 3452 ms"
"qSort took: 3285 ms"
"std::sort took: 3464 ms"
"qSort took: 3582 ms"
"std::sort took: 3681 ms"
"qSort took: 3338 ms"
"std::sort took: 3646 ms"
"qSort took: 3635 ms"
"std::sort took: 3826 ms"
"qSort took: 3747 ms"
"std::sort took: 3510 ms"
"qSort took: 3637 ms"
"std::sort took: 3590 ms"
"qSort took: 3418 ms"
"std::sort took: 3559 ms"
"qSort took: 3493 ms"
"std::sort took: 3750 ms"
"qSort took: 3445 ms"
"std::sort took: 3545 ms"

Even with a modified test simulating the input data in the call in simpliedUrlList, the results are normal


g++ -fPIC -O2 qsort_performance_test.cpp -I /usr/include/qt5/QtCore -I/usr/include/qt5 -l Qt5Core
./a.out
"qSort took: 786 ms"
"std::sort took: 930 ms"
"qSort took: 987 ms"
"std::sort took: 1143 ms"
"qSort took: 1006 ms"
"std::sort took: 1188 ms"
"qSort took: 1039 ms"
"std::sort took: 1161 ms"
"qSort took: 1005 ms"
"std::sort took: 1167 ms"
"qSort took: 1012 ms"
"std::sort took: 1171 ms"
"qSort took: 1026 ms"
"std::sort took: 1197 ms"
"qSort took: 1014 ms"
"std::sort took: 1158 ms"
"qSort took: 1010 ms"
"std::sort took: 1162 ms"
"qSort took: 1012 ms"
"std::sort took: 1236 ms"

Therefore I'll try to discover why it is so slow, there must be something wrong anywhere, before changing the sort method that hides the problem.

I've found the problem. The problem is qSort itself.

Chaning qSort to qStableSort I've got normal times in the test case, select 50.000 files and pressing Shift+Supr.

Also in the small test, the times, running under callgrind are:
qSort: between 33148 and 45745
std::sort between 53231 and 66780
qStableSort: between 11232 and 15116
std::stable_sort: between 22581 and 25957

Therefore, before patching, should I change all of them to std::sort or std::stable_sort or qStableSort?

dfaure added a comment.Mar 2 2018, 8:51 AM

That is amazing, I could have sworn that the whole point of sort vs stable_sort was that stable_sort was potentially slower, which was the reason for sort to exist (when you don't care about the "stability" of equal items)...

dfaure added a comment.Mar 2 2018, 9:04 AM

I cannot confirm that stable_sort is faster, on the contrary. http://www.davidfaure.fr/2018/qsort_performance.cpp updated, says (repeatedly)
"std::sort took: 5003 ms"
"std::stable_sort took: 7490 ms"

Maybe on specific data (the actual filenames you're testing this with), stable_sort ends up being faster, but this isn't true in general (with random data).

I cannot confirm that stable_sort is faster, on the contrary. http://www.davidfaure.fr/2018/qsort_performance.cpp updated, says (repeatedly)
"std::sort took: 5003 ms"
"std::stable_sort took: 7490 ms"

Maybe on specific data (the actual filenames you're testing this with), stable_sort ends up being faster, but this isn't true in general (with random data).

http://www.cplusplus.com/reference/algorithm/stable_sort/
If enough extra memory is available, linearithmic in the distance between first and last: Performs up to N*log2(N) element comparisons (where N is this distance), and up to that many element moves.
Otherwise, polyloglinear in that distance: Performs up to N*(log2(N))^2 element comparisons, and up to that many element swaps.

http://www.cplusplus.com/reference/algorithm/sort/
On average, linearithmic in the distance between first and last: Performs approximately N*log2(N) (where N is this distance) comparisons of elements, and up to that many element swaps (or moves).

So, std::sort then?

markg added a comment.Mar 2 2018, 11:25 AM

I cannot confirm that stable_sort is faster, on the contrary. http://www.davidfaure.fr/2018/qsort_performance.cpp updated, says (repeatedly)
"std::sort took: 5003 ms"
"std::stable_sort took: 7490 ms"

Maybe on specific data (the actual filenames you're testing this with), stable_sort ends up being faster, but this isn't true in general (with random data).

"qSort took: 3308 ms"
"std::sort took: 3383 ms"
"std::stable_sort took: 5829 ms"

My results. A direct copy-paste from your benchmark, no edits.
Release mode and linux this time.

@jtamate std::sort it is. Feel free to commit :)

jtamate retitled this revision from Change qSort to std::sort in simplifiedUrlList to Change qSort to std::sort.Mar 2 2018, 11:30 AM
jtamate edited the summary of this revision. (Show Details)
jtamate edited the test plan for this revision. (Show Details)
This revision was not accepted when it landed; it landed in state Changes Planned.Mar 2 2018, 4:51 PM
This revision was automatically updated to reflect the committed changes.