KDevelop: project filter support in the abstractfilemanagerplugin import benchmark
ClosedPublic

Authored by rjvbb on Oct 31 2017, 4:30 PM.

Details

Summary

This change makes it possible to benchmark project import with the Project Filter plugin activated. The patch exploits the fact that the plugin won't be available when the core engine is initialised without GUI (Core::NoUI); a simple place-dependent commandline argument is used to activate filter support by controlling the TestCore initialisation.

I'm not entirely certain of the interest of this change. The benchmark was written in the context of improving the dirwatching set-up. In the current implementation of that feature, the project filter has no control whatsoever over which elements get to be watched or not and the benchmark will only show the filter overhead on project import *after* the dirwatching set-up.
In the implementation I hope to propose, only directories actually loaded into the project get to be watched, which means the filter is applied before KDirWatch::addDir() in a regular KDevelop session. Separating filtering overhead from the gain of not dirwatching "invalid" directories will be tricky in that implementation.
In addition, this uses a default Project Filter configuration, as far as I have been able to figure out.

This change could thus just as well be part of my improved dirwatching patch?

Test Plan

If possible, set up the Project Filter in a controlled way, for instance such that it will reject an in-tree build directory (and not just the *.o files in there).

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped
rjvbb created this revision.Oct 31 2017, 4:30 PM
Restricted Application added a subscriber: kdevelop-devel. · View Herald TranscriptOct 31 2017, 4:30 PM
rjvbb updated this revision to Diff 21680.Nov 1 2017, 9:23 AM
rjvbb retitled this revision from KDevelop: project filter support in the abstractfilemanagerplugin import benchmark to KDevelop: project filter support in the abstractfilemanagerplugin import benchmark.

Filtering has an incidence on the number of items imported so here it does make sense to print that number.

On my Linux rig, using only INotify dirwatching:

Without filtering:

> abstractfilemanagerpluginimportbenchmark gcc-7.2.0
The project filter plugin was not loaded
KDirWatch backend: Inotify
Starting import of project gcc-7.2.0
	creating dirwatcher took 2.843 seconds
importing 75038 items into project #0 took 5.306 seconds
Done in 8.149 seconds total
The project filter plugin was not loaded
KDirWatch backend: Inotify
Starting import of project gcc-7.2.0
	creating dirwatcher took 2.939 seconds
importing 75038 items into project #0 took 5.022 seconds
Done in 7.962 seconds total
The project filter plugin was not loaded
KDirWatch backend: Inotify
Starting import of project gcc-7.2.0
	creating dirwatcher took 2.923 seconds
importing 75038 items into project #0 took 5.111 seconds
Done in 8.035 seconds total

 Performance counter stats for 'abstractfilemanagerpluginimportbenchmark gcc-7.2.0' (3 runs):

       6751.834141      task-clock:u (msec)       #    0.706 CPUs utilized            ( +-  0.56% )
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
            24,700      page-faults:u             #    0.004 M/sec                    ( +-  0.54% )
     5,480,083,797      cycles:u                  #    0.812 GHz                      ( +-  0.78% )
     3,270,959,423      instructions:u            #    0.60  insn per cycle           ( +-  0.01% )
       781,348,961      branches:u                #  115.724 M/sec                    ( +-  0.01% )
        27,707,410      branch-misses:u           #    3.55% of all branches          ( +-  0.40% )

       9.565704674 seconds time elapsed                                          ( +-  1.20% )

With the Project Filter just under 1000 fewer items are imported, but the import duration is much longer:
(this number should be much more representative of the import duration in a real-world session)

> abstractfilemanagerpluginimportbenchmark --withFilters gcc-7.2.0
cannot find .rc file "abstractfilemanagerpluginimportbenchmarkui.rc" for component "abstractfilemanagerpluginimportbenchmark"
KDirWatch backend: Inotify
Starting import of project gcc-7.2.0
	creating dirwatcher took 3.078 seconds
importing 74062 items into project #0 took 14.763 seconds
Done in 17.844 seconds total
cannot find .rc file "abstractfilemanagerpluginimportbenchmarkui.rc" for component "abstractfilemanagerpluginimportbenchmark"
KDirWatch backend: Inotify
Starting import of project gcc-7.2.0
	creating dirwatcher took 2.967 seconds
importing 74062 items into project #0 took 14.771 seconds
Done in 17.742 seconds total
cannot find .rc file "abstractfilemanagerpluginimportbenchmarkui.rc" for component "abstractfilemanagerpluginimportbenchmark"
KDirWatch backend: Inotify
Starting import of project gcc-7.2.0
	creating dirwatcher took 2.913 seconds
importing 74062 items into project #0 took 14.763 seconds
Done in 17.679 seconds total

 Performance counter stats for 'abstractfilemanagerpluginimportbenchmark --withFilters gcc-7.2.0' (3 runs):

      16657.694614      task-clock:u (msec)       #    0.811 CPUs utilized            ( +-  0.57% )
                 0      context-switches:u        #    0.000 K/sec                  
                 0      cpu-migrations:u          #    0.000 K/sec                  
            25,436      page-faults:u             #    0.002 M/sec                    ( +-  0.07% )
    21,311,526,918      cycles:u                  #    1.279 GHz                      ( +-  0.42% )
    21,322,013,689      instructions:u            #    1.00  insn per cycle           ( +-  0.00% )
     3,792,613,704      branches:u                #  227.679 M/sec                    ( +-  0.00% )
        52,435,462      branch-misses:u           #    1.38% of all branches          ( +-  5.63% )

      20.531457857 seconds time elapsed                                          ( +-  0.66% )
rjvbb set the repository for this revision to R32 KDevelop.Nov 1 2017, 9:23 AM
mwolff added a subscriber: mwolff.Nov 1 2017, 1:24 PM

I'd say always make this load in GUI mode, and remove the switch.

rjvbb added a comment.Nov 1 2017, 1:34 PM

And always keep the project filter plugin activated?

mwolff added a comment.Nov 1 2017, 3:14 PM
In D8579#162838, @rjvbb wrote:

And always keep the project filter plugin activated?

yes

mwolff added a comment.Nov 1 2017, 3:22 PM

Oh and do check where the time is being spent - is that really just the filter checks? Do you run your timings on a kdevelop codebase compiled with compiler optimizations? Most notably, is your Qt built with compiler optimizations? My gut feeling is that the filters shouldn't be that bad, performance wise. Again, a profiler will tell you.

rjvbb added a comment.EditedNov 1 2017, 4:43 PM

is that really just the filter checks?

Yes, I'd sure hope so, but you know the code (much) better than I do... The only difference between the 2 runs is that project plugins requiring GUI mode become available but from what I've seen they can only be used after calling allPluginsForExtension(). I only ask for project filter plugins in that call, and I haven't seen anything that suggests other plugins might be used as well by AbstractFileManagerPlugin.

(On a side-note, if the author of plugin controller code is still active, s/he might want to consider adding some comments!)

Do you run your timings on a kdevelop codebase compiled with compiler optimizations? Most notably, is your Qt built with compiler optimizations?

Yes and yes. KDevelop is built with clang 4.0.1 using -Ofast -g and Qt using GCC 6.2 with -O3 -ftracer -g.

My gut feeling is that the filters shouldn't be that bad, performance wise.

That was my feeling too, until I realised that the filter is run on each and every file, meaning nearly 75k evaluations of I don't know how many regular expressions there are in the default filter set-up. Even if each filter costs about as much to apply to a project item as adding the item to the project you'd still end up multiplying the import duration by roughly the number of filters (in a serial implementation). Here we're talking about a factor 3.

My Linux rig isn't exactly fast, either. Tests on my Mac show much less of an impact on the (6yo!) i7 powering it. Here I've shunted the call to KDirWatch::addDir() (which still takes almost 90 seconds and would pollute profiling measures).

> abstractfilemanagerpluginimportbenchmark gcc-7.2.0
The project filter plugin was not loaded
KDirWatch backend: QFSWatch
Starting import of project gcc-7.2.0
	creating dirwatcher took 0 seconds
importing 75039 items into project #0 took 2.401 seconds
Done in 2.401 seconds total
    1.776 user_cpu 1.255 kernel_cpu 0:03.58 total_time 84.3%CPU
The project filter plugin was not loaded
KDirWatch backend: QFSWatch
Starting import of project gcc-7.2.0
	creating dirwatcher took 0 seconds
importing 75039 items into project #0 took 2.401 seconds
Done in 2.401 seconds total
    1.766 user_cpu 1.246 kernel_cpu 0:03.54 total_time 84.7%CPU
The project filter plugin was not loaded
KDirWatch backend: QFSWatch
Starting import of project gcc-7.2.0
	creating dirwatcher took 0 seconds
importing 75039 items into project #0 took 2.406 seconds
Done in 2.407 seconds total
    1.772 user_cpu 1.245 kernel_cpu 0:03.54 total_time 85.0%CPU
> abstractfilemanagerpluginimportbenchmark --withFilters gcc-7.2.0
cannot find .rc file "abstractfilemanagerpluginimportbenchmarkui.rc" for component "abstractfilemanagerpluginimportbenchmark"
KDirWatch backend: QFSWatch
Starting import of project gcc-7.2.0
	creating dirwatcher took 0 seconds
importing 74064 items into project #0 took 5.184 seconds
Done in 5.187 seconds total
    4.584 user_cpu 1.476 kernel_cpu 0:07.96 total_time 76.0%CPU
cannot find .rc file "abstractfilemanagerpluginimportbenchmarkui.rc" for component "abstractfilemanagerpluginimportbenchmark"
KDirWatch backend: QFSWatch
Starting import of project gcc-7.2.0
	creating dirwatcher took 0 seconds
importing 74064 items into project #0 took 5.129 seconds
Done in 5.13 seconds total
    4.537 user_cpu 1.396 kernel_cpu 0:06.51 total_time 90.9%CPU
cannot find .rc file "abstractfilemanagerpluginimportbenchmarkui.rc" for component "abstractfilemanagerpluginimportbenchmark"
KDirWatch backend: QFSWatch
Starting import of project gcc-7.2.0
	creating dirwatcher took 0 seconds
importing 74064 items into project #0 took 5.18 seconds
Done in 5.18 seconds total
    4.560 user_cpu 1.400 kernel_cpu 0:06.51 total_time 91.5%CPU

Again, a profiler will tell you.

Sure, but this is really an aspect I don't want to spend time on at this point (esp. since it turns out I don't even have to invoke the filter from my ProjectWatcher class). A quick run through Apple's Instruments suggests the overhead indeed comes from regular expression parsing (see the screenshot).

rjvbb updated this revision to Diff 21735.Nov 1 2017, 9:30 PM

Always use the project filter.

rjvbb set the repository for this revision to R32 KDevelop.Nov 1 2017, 9:31 PM

thanks for the numbers and profile on the project filter, I guess I'll need to port it away from QRegExp to QRegularExpression, which will mean some custom parser to transfer unix style matching to regular expressions patterns. But it does seem to help (left is status quo, right is QRegularExpression with manual conversion of the default patters to regular expressions):

********* Start testing of TestProjectFilter *********                                    ********* Start testing of TestProjectFilter *********
Config: Using QtTest library 5.9.2, Qt 5.9.2 (x86_64-little_endian-lp64 shared (dynamic)  Config: Using QtTest library 5.9.2, Qt 5.9.2 (x86_64-little_endian-lp64 shared (dynamic)
PASS   : TestProjectFilter::initTestCase()                                                PASS   : TestProjectFilter::initTestCase()
PASS   : TestProjectFilter::bench(baseline-1716)                                          PASS   : TestProjectFilter::bench(baseline-1716)
RESULT : TestProjectFilter::bench():"baseline-1716":                                      RESULT : TestProjectFilter::bench():"baseline-1716":
     1.8 msecs per iteration (total: 60, iterations: 32)                                       1.6 msecs per iteration (total: 52, iterations: 32)
PASS   : TestProjectFilter::bench(baseline-3276)                                          PASS   : TestProjectFilter::bench(baseline-3276)
RESULT : TestProjectFilter::bench():"baseline-3276":                                      RESULT : TestProjectFilter::bench():"baseline-3276":
     4.1 msecs per iteration (total: 67, iterations: 16)                                       2.8 msecs per iteration (total: 90, iterations: 32)
PASS   : TestProjectFilter::bench(baseline-8591)                                          PASS   : TestProjectFilter::bench(baseline-8591)
RESULT : TestProjectFilter::bench():"baseline-8591":                                      RESULT : TestProjectFilter::bench():"baseline-8591":
     10 msecs per iteration (total: 81, iterations: 8)                                         8.3 msecs per iteration (total: 67, iterations: 8)
PASS   : TestProjectFilter::bench(baseline-12221)                                         PASS   : TestProjectFilter::bench(baseline-12221)
RESULT : TestProjectFilter::bench():"baseline-12221":                                     RESULT : TestProjectFilter::bench():"baseline-12221":
     15 msecs per iteration (total: 62, iterations: 4)                                         10 msecs per iteration (total: 85, iterations: 8)
PASS   : TestProjectFilter::bench(defaults-1716)                                          PASS   : TestProjectFilter::bench(defaults-1716)
RESULT : TestProjectFilter::bench():"defaults-1716":                                      RESULT : TestProjectFilter::bench():"defaults-1716":
     27 msecs per iteration (total: 54, iterations: 2)                                         9.8 msecs per iteration (total: 79, iterations: 8)
PASS   : TestProjectFilter::bench(defaults-3276)                                          PASS   : TestProjectFilter::bench(defaults-3276)
RESULT : TestProjectFilter::bench():"defaults-3276":                                      RESULT : TestProjectFilter::bench():"defaults-3276":
     51 msecs per iteration (total: 51, iterations: 1)                                         18 msecs per iteration (total: 75, iterations: 4)
PASS   : TestProjectFilter::bench(defaults-8591)                                          PASS   : TestProjectFilter::bench(defaults-8591)
RESULT : TestProjectFilter::bench():"defaults-8591":                                      RESULT : TestProjectFilter::bench():"defaults-8591":
     165 msecs per iteration (total: 165, iterations: 1)                                       55 msecs per iteration (total: 55, iterations: 1)
PASS   : TestProjectFilter::bench(defaults-12221)                                         PASS   : TestProjectFilter::bench(defaults-12221)
RESULT : TestProjectFilter::bench():"defaults-12221":                                     RESULT : TestProjectFilter::bench():"defaults-12221":
     208 msecs per iteration (total: 208, iterations: 1)                                       74 msecs per iteration (total: 74, iterations: 1)
PASS   : TestProjectFilter::cleanupTestCase()                                             PASS   : TestProjectFilter::cleanupTestCase()
Totals: 10 passed, 0 failed, 0 skipped, 0 blacklisted, 2859ms                             Totals: 10 passed, 0 failed, 0 skipped, 0 blacklisted, 2899ms
********* Finished testing of TestProjectFilter *********                                 ********* Finished testing of TestProjectFilter *********

Anyhow, not really related to this patch here, but good to keep in mind for the future

kdevplatform/project/tests/abstractfilemanagerpluginimportbenchmark.cpp
67

why is this required?

131

this isn't needed anymore, is it? i.e. it should never happen?

rjvbb marked 2 inline comments as done.Nov 2 2017, 12:13 PM

thanks for the numbers and profile on the project filter, I guess I'll need to port it away from QRegExp to QRegularExpression, which will mean some custom parser to transfer unix style matching to regular expressions patterns. But it does seem to help (left is status quo, right is QRegularExpression with manual conversion of the default patters to regular expressions):

...

Anyhow, not really related to this patch here, but good to keep in mind for the future

Indeed. Are those gains still visible in real world usage - it doesn't look like filtering is a bottleneck operation like the dirwatcher population currently is? In a real world project import, a properly set up filter would exclude an in-tree build directory first, thereby skipping a whole bunch of files that would otherwise be excluded one by one. (IOW, it might already be very effective to optimise the order of the entries in the default filter.)

kdevplatform/project/tests/abstractfilemanagerpluginimportbenchmark.cpp
67

It replaces a full-blown ProjectController initialisation, and tells the ProjectFilterProvider to set up the project filter. Without that signal, the filter doesn't do its full job (it only filters out the .kdev4 directories).

I tried setting up the ProjectController, but got crashes while plugins were being loaded that aren't of interest anyway, and in addition we'd have to import the test projects through the ProjectController in order for it to send out that signal. I don't think we want that here.

131

I suppose it should indeed never happen. The call itself is required though (I checked again), and if it fails it will do so quietly and results will be for project import without filtering. I wouldn't mind putting an assert here, but a version that doesn't disappear in release builds.

mwolff added inline comments.Nov 2 2017, 1:59 PM
kdevplatform/project/tests/abstractfilemanagerpluginimportbenchmark.cpp
67

hm, I'd prefer using the actual controller setup as that is less brittle. I.e. if anything changes in the future, this will silently break, most probably.

131

I'd say remove it

mwolff added a comment.Nov 2 2017, 2:34 PM
In D8579#163409, @rjvbb wrote:

thanks for the numbers and profile on the project filter, I guess I'll need to port it away from QRegExp to QRegularExpression, which will mean some custom parser to transfer unix style matching to regular expressions patterns. But it does seem to help (left is status quo, right is QRegularExpression with manual conversion of the default patters to regular expressions):

...

Anyhow, not really related to this patch here, but good to keep in mind for the future

Indeed. Are those gains still visible in real world usage - it doesn't look like filtering is a bottleneck operation like the dirwatcher population currently is? In a real world project import, a properly set up filter would exclude an in-tree build directory first, thereby skipping a whole bunch of files that would otherwise be excluded one by one. (IOW, it might already be very effective to optimise the order of the entries in the default filter.)

I once again fail to understand you. What is a "properly set up filter" vs. what you are measuring here? Both should do the same thing now, no? I.e. open the project and load the configuration that contains custom filters to exclude e.g. the build dir.

Also, how would reordering default filters change anything? If you don't exclude the folder, every file will be checked - there is no way around this. Reordering filters may help in your specific situation, but not in others where other filters would result in the early-out.

rjvbb marked 2 inline comments as done.Nov 2 2017, 3:53 PM
I once again fail to understand you. What is a "properly set up filter" vs. what you are measuring here? Both should do the same thing now, no? I.e. open the project and load the configuration that contains custom filters to exclude e.g. the build dir.
>   >   Also, how would reordering default filters change anything?

Hmm, you mean it doesn't really matter if you first check a directory called "build" against all "*.a", "*.o" and whatever uninteresting files that directory might contain before coming across the rule that excludes the entire directory, because either way you won't be scanning it?

Still, there should be some effect of rule order if the filter stops processing once a rule is triggered; for instance, why waste time checking all those *.o files against patterns that should occur much less often?

mwolff added a comment.Nov 2 2017, 4:06 PM
In D8579#163574, @rjvbb wrote:
I once again fail to understand you. What is a "properly set up filter" vs. what you are measuring here? Both should do the same thing now, no? I.e. open the project and load the configuration that contains custom filters to exclude e.g. the build dir.
>   >   Also, how would reordering default filters change anything?

Hmm, you mean it doesn't really matter if you first check a directory called "build" against all "*.a", "*.o" and whatever uninteresting files that directory might contain before coming across the rule that excludes the entire directory, because either way you won't be scanning it?

Huh?! The files in the dir have nothing to do with the folder. Importing a project works like this: We list the contents recursively. If a folder is filtered out, we don't recurse into it. Also, we won't run file rules like *.a against a directory anyways. So I think you simply don't have a clue of how filtering is operating.

Still, there should be some effect of rule order if the filter stops processing once a rule is triggered; for instance, why waste time checking all those *.o files against patterns that should occur much less often?

Maybe they occur less often for you. I never open folders that contain any *.o, since I always build out-of-source. So clearly haggling over order is a waste of time.

rjvbb added a comment.Nov 2 2017, 4:55 PM

So I think you simply don't have a clue of how filtering is operating.

Indeed, and I'd just as well remain blissfully unaware of its internal cuisine :)

Maybe they occur less often for you. I never open folders that contain any `*.o`, since I always build out-of-source. So clearly haggling over order is a waste of time.

Ah, because you're the standard and shining example and we should all be your clones? Let me repeat here that I've never been able to set up a usable KDevelop project for an autoconf-based project using an out-of-source build directory.

mwolff added a comment.Nov 2 2017, 5:06 PM
In D8579#163619, @rjvbb wrote:

So I think you simply don't have a clue of how filtering is operating.

Indeed, and I'd just as well remain blissfully unaware of its internal cuisine :)

Maybe they occur less often for you. I never open folders that contain any `*.o`, since I always build out-of-source. So clearly haggling over order is a waste of time.

Ah, because you're the standard and shining example and we should all be your clones? Let me repeat here that I've never been able to set up a usable KDevelop project for an autoconf-based project using an out-of-source build directory.

What the hell? Where did that come from?! I simply pointed out that the preferred "fast" order of rules for *you* won't make things faster for *me*. The only point being that haggling over the order is a waste of time! Take a deep breath Rene...

rjvbb added a comment.Nov 2 2017, 5:32 PM

hm, I'd prefer using the actual controller setup as that is less brittle. I.e. if anything changes in the future, this will silently break, most probably.

I understand. I'll see if calling ProjectController::addProject() has any side-effects (crashes as I've had before, activation of other plugins that shouldn't be activated etc) but that's a protected method. We'd have to make AbstractFileManagerPluginImportBenchmark a friend class of ProjectController too, or make the method public.
I'd say that ProjectController::openProject() is really inappropriate here.

rjvbb updated this revision to Diff 21799.Nov 2 2017, 5:59 PM

Uses ProjectController::addProject(). That requires adding the AFMP benchmark class as a friend to the ProjectController class (yuck?) as well as qobject_casting m_core->projectController() as the method is not part of the iProjectController API.

I haven't yet seen evidence that this would add overhead from unwanted plugins (like the clang parser) but an unexpected positive side-effect is that dirwatching *deletion* timing can now be obtained.

I've added a check for the qobject_cast result with a printed warning and fallback to the direct signal emit I used before. Rationale: the cited concern that something could change in the future and that that would cause the benchmark to do something else silently.

rjvbb set the repository for this revision to R32 KDevelop.Nov 2 2017, 5:59 PM
mwolff requested changes to this revision.Nov 13 2017, 1:20 PM

instead of using a friend, do what the other tests do - inherit from the ProjectController, make addProject public, then set your custom controller on the core.

kdevplatform/project/tests/abstractfilemanagerpluginimportbenchmark.cpp
68 ↗(On Diff #21680)

assert the project controller, always add the project, remove the else branch

131 ↗(On Diff #21680)

why is this required to be done explictly? add a comment

This revision now requires changes to proceed.Nov 13 2017, 1:20 PM
rjvbb updated this revision to Diff 22292.Nov 13 2017, 8:15 PM

Updated as requested.

mwolff accepted this revision.Nov 14 2017, 9:47 AM

thanks, lgtm now

This revision is now accepted and ready to land.Nov 14 2017, 9:47 AM

Master branch, right?

yes, only bug fixes go to stable branches

This revision was automatically updated to reflect the committed changes.