Optimize UrlParseLock to remove excessive sleep times.

Authored by mwolff on Aug 17 2016, 6:26 PM.

Description

Running duchainify on the heaptrack sources on a machine with
nproc=8 now gives me the following results:

Performance counter stats for 'duchainify -t 8 .' (5 runs):

  23230.558005      task-clock (msec)         #    3.369 CPUs utilized            ( +-  1.88% )
        60,838      context-switches          #    0.003 M/sec                    ( +-  7.23% )
         1,380      cpu-migrations            #    0.059 K/sec                    ( +- 19.25% )
       206,106      page-faults               #    0.009 M/sec                    ( +-  3.04% )
86,087,979,829      cycles                    #    3.706 GHz                      ( +-  1.82% )
70,508,334,605      instructions              #    0.82  insn per cycle           ( +-  1.13% )
15,187,539,592      branches                  #  653.774 M/sec                    ( +-  1.07% )
   283,447,232      branch-misses             #    1.87% of all branches          ( +-  1.31% )

   6.896230441 seconds time elapsed                                          ( +-  6.05% )

Before, the result was:

  23720.891477      task-clock (msec)         #    2.979 CPUs utilized            ( +-  0.46% )
        32,629      context-switches          #    0.001 M/sec                    ( +-  7.98% )
           997      cpu-migrations            #    0.042 K/sec                    ( +- 11.20% )
       198,436      page-faults               #    0.008 M/sec                    ( +-  2.10% )
87,645,125,683      cycles                    #    3.695 GHz                      ( +-  0.45% )
67,272,691,473      instructions              #    0.77  insn per cycle           ( +-  0.98% )
14,515,423,390      branches                  #  611.926 M/sec                    ( +-  0.97% )
   256,262,860      branch-misses             #    1.77% of all branches          ( +-  0.46% )

   7.962761391 seconds time elapsed                                          ( +-  2.39% )

Note that the previous implementation performed poorly mainly
because of excessive 1s sleeps. The new implementation relies on
per-URL mutexes, so a thread sleeps only as long as needed.
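
To illustrate the per-URL mutex idea, here is a minimal sketch in standard C++. It is not the UrlParseLock code from this commit: the names PerUrlLock, trackedUrls and countWithLock, as well as the example URL, are hypothetical and chosen only for illustration. A waiting thread blocks on the mutex of the URL it wants and is woken as soon as the holder releases it, instead of sleeping for a fixed interval.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical RAII lock keyed by URL; not the KDevelop implementation.
class PerUrlLock
{
public:
    explicit PerUrlLock(const std::string& url)
        : m_url(url)
    {
        {
            // Short critical section: find or create the entry for this URL.
            std::lock_guard<std::mutex> guard(registryMutex());
            auto& entry = registry()[m_url];
            if (!entry) {
                entry = std::make_shared<Entry>();
            }
            ++entry->users;
            m_entry = entry;
        }
        // Blocks only until the current holder of this URL releases it.
        m_entry->mutex.lock();
    }

    ~PerUrlLock()
    {
        m_entry->mutex.unlock();
        // Drop the registry entry once nobody holds or waits for this URL.
        std::lock_guard<std::mutex> guard(registryMutex());
        if (--m_entry->users == 0) {
            registry().erase(m_url);
        }
    }

    PerUrlLock(const PerUrlLock&) = delete;
    PerUrlLock& operator=(const PerUrlLock&) = delete;

    // Number of URLs that currently have a holder or a waiter.
    static std::size_t trackedUrls()
    {
        std::lock_guard<std::mutex> guard(registryMutex());
        return registry().size();
    }

private:
    struct Entry {
        std::mutex mutex;
        int users = 0; // guarded by registryMutex()
    };

    static std::mutex& registryMutex()
    {
        static std::mutex mutex;
        return mutex;
    }

    static std::unordered_map<std::string, std::shared_ptr<Entry>>& registry()
    {
        static std::unordered_map<std::string, std::shared_ptr<Entry>> map;
        return map;
    }

    std::string m_url;
    std::shared_ptr<Entry> m_entry;
};

// Demo helper: several threads contend for the same URL, as parse jobs
// would for a central header, and increment an unsynchronized counter.
long countWithLock(int threads, int iterations)
{
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                PerUrlLock lock("file:///usr/include/stdio.h");
                ++counter; // protected by the per-URL lock
            }
        });
    }
    for (auto& thread : pool) {
        thread.join();
    }
    return counter;
}
```

Locks on different URLs never block each other in this sketch; only the brief registry lookup is shared. The use count lets an entry be removed once its last holder or waiter is gone, so the map does not grow without bound.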

We could really benefit from work stealing in our parse job queue...
In C/C++ projects, however, the contention comes from central
headers that are included in nearly all files and thus easily
trigger sleeps here.

The code was reviewed by David Faure, thanks!

Details

Committed
mwolff, Aug 17 2016, 6:30 PM
Parents
R32:c799623cb2a4: Don't check on the IndexedString(QUrl) input if running in release mode