This reduces the on-CPU time of benchNotifyWatcher by roughly
28.5% (see the numbers below). Previously, every inotify event
triggered a walk over all entries, which performs abysmally when
the map of entries is large. Introducing a direct mapping avoids
that walk and speeds up event handling considerably.
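
To illustrate the idea, here is a minimal sketch in plain standard
C++. It assumes the lookup key is the inotify watch descriptor;
Entry, Watcher, findByScan and findDirect are hypothetical names
made up for this sketch and are not the actual KDirWatch code.

```cpp
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a watched entry (not the real KDirWatch type).
struct Entry {
    std::string path;
    int wd = -1; // watch descriptor reported with each event (assumed key)
};

class Watcher {
public:
    // Register or update an entry and keep the wd index in sync.
    Entry *add(const std::string &path, int wd) {
        Entry &e = entries[path]; // std::map keeps element addresses stable
        if (e.wd != -1) {
            wdToEntry.erase(e.wd); // drop the stale index slot on re-add
        }
        e.path = path;
        e.wd = wd;
        wdToEntry[wd] = &e;
        return &e;
    }

    // Remove an entry together with its index slot.
    void remove(const std::string &path) {
        auto it = entries.find(path);
        if (it == entries.end()) {
            return;
        }
        wdToEntry.erase(it->second.wd);
        entries.erase(it);
    }

    // Slow path: walk every entry for each event, O(n) per lookup.
    Entry *findByScan(int wd) {
        for (auto &kv : entries) {
            if (kv.second.wd == wd) {
                return &kv.second;
            }
        }
        return nullptr;
    }

    // Fast path: direct mapping, O(1) on average per lookup.
    Entry *findDirect(int wd) {
        auto it = wdToEntry.find(wd);
        return it == wdToEntry.end() ? nullptr : it->second;
    }

private:
    std::map<std::string, Entry> entries;        // path -> entry
    std::unordered_map<int, Entry *> wdToEntry;  // wd -> entry
};
```

The cost of the extra index is that it has to be updated wherever
entries are added or removed, as add() and remove() do above.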
I spotted this issue while profiling KDevelop, which sometimes
exhibits similar performance problems.

Running the benchmarks with -perf lets us measure CPU cycles,
which translate to on-CPU time and ignore the off-CPU time
induced by sleeping and waiting in the tests.
Before:
RESULT : KDirWatch_UnitTest::benchNotifyWatcher():
     306,496,490.1 CPU cycles per iteration (total: 3,064,964,902, iterations: 10)

After:
RESULT : KDirWatch_UnitTest::benchNotifyWatcher():
     219,120,818.3 CPU cycles per iteration (total: 2,191,208,183, iterations: 10)
Note that the other backends could leverage a similar trick to
speed them up.