Reimplementation of KisRelaxedTimer
Needs RevisionPublic

Authored by poke1024 on Nov 17 2017, 2:58 PM.

Details

Reviewers
dkazakov
Group Reviewers
Krita
Summary

Follow up of https://phabricator.kde.org/D8804

This reimplements KisRelaxedTimer with a new goal: not eliminating timer restarts, but rather minimizing them within given boundaries.

This should allow us a much better control of what happens and what we want to happen.

We can later fine-tune this to different use cases/signal compressors. For now, the patch hard codes that a callback may always be 1 ms too early and up to 10% too late, which I hope should even fix the frame drop problems with the 10 ms timer at 100 fps (as 10% of 10ms is just 1ms).

Diff Detail

Repository
R37 Krita
Lint
Lint Skipped
Unit
Unit Tests Skipped
poke1024 created this revision.Nov 17 2017, 2:58 PM

@neviril Could you try out this patch if it also has the "frames lost" problem (i.e. the one from https://phabricator.kde.org/D8804)?

neviril added a comment.EditedNov 20 2017, 3:33 PM

@poke1024: here are my observations.

At the default maximum fps limit setting of 100 fps with an external OpenGL/Direct3D monitoring application (MSI Afterburner) I measure 131 fps, which is similar to Krita 3.3.2 but, compared to this version, Krita 4.0-git with the patch still doesn't feel as smooth. Again, upon analyzing a 60fps screen capture made with OBS Studio, with the patch applied brush strokes appear to update roughly once every 2 frames / 50% of the time, resulting in an actual drawing frame rate that is close to 30 fps.

To be sure of my methodology, I checked again with the video capture test under Krita 3.3.2, where I could confirm that brush strokes update almost always every frame of the 60 fps video.

Without the patch, MSI Afterburner externally measures with Krita 4.0 git-master a framerate of 95-97 fps, thus closer to the user-set limit. However the perceived/displayed drawing framerate appears to be significantly lower than with the patch.

@poke1024: here are my observations.

At the default maximum fps limit setting of 100 fps with an external OpenGL/Direct3D monitoring application (MSI Afterburner) I measure 131 fps, which is similar to Krita 3.3.2 but, compared to this version, Krita 4.0-git with the patch still doesn't feel as smooth. Again, upon analyzing a 60fps screen capture made with OBS Studio, with the patch applied brush strokes appear to update roughly once every 2 frames / 50% of the time, resulting in an actual drawing frame rate that is close to 30 fps.

To be sure of my methodology, I checked again with the video capture test under Krita 3.3.2, where I could confirm that brush strokes update almost always every frame of the 60 fps video.

Without the patch, MSI Afterburner externally measures with Krita 4.0 git-master a framerate of 95-97 fps, thus closer to the user-set limit. However the perceived/displayed drawing framerate appears to be significantly lower than with the patch.

I think we have to make this clear first... is this really a regression with KisSignalCompressor? How does it compare with Krita 4.0 pre-alpha 2, which does not even include KisRelaxedTimer (6f3dc831b)?

Because if the update rate on 4.0 pre-alpha 2 is already worse than 3.3.2, then there really is no point to try to "fix" KisRelaxedTimer based on observing the update rate. One thing to try is to go back to before the multi-threaded brush is merged into master (9488b5b0e3) because it changes how the brush strokes are painted to the layer and it does intentionally limit the update rate of the strokes. This is the last nightly build before that (but do note that the nightly build uses an older version of Qt which might exhibit different behaviour): https://ci.appveyor.com/project/alvinhochun/krita-packaging/build/0.1.1.288/artifacts

Or if you are looking at the brush outline painted on the canvas (not the mouse cursor), then it mostly does reflect the actual redraw rate of the OpenGL canvas...

(Except that both can be affected by how the input events are processed, which so happens to also make use of KisSignalCompressor... making the matter even more complicated.)

@alvinhochun: I tried the indicated nightly build of Krita (4.0.0 pre-alpha git-465432e) and here brush strokes are painted as smoothly (i.e. with the same update rate) as with 3.3.2 official.

I'm referring to the actually painted brush strokes and not just the brush outline. From what I've experimented in the past they can have a different update rate, although below the standard maximum user limit of 100 fps there doesn't seem to be perceivable differences between both.

Hi, @neviril!

The real brush update speed is now limited in master to make painting faster. That is a consequence of implementing a multithreaded brush...

Hi, @poke1024!

Did you measure real efficiency of this extra check? Does it remove any significant amount of calls? From a wild guess it seems like that is relaxed version will remove only 10% of the calls, which really not much.

It seems like we finally need some kind of unittest for the signals compressor...

neviril added a comment.EditedNov 21 2017, 5:21 AM

@dkazakov: I'm aware that the framerate is not "unlimited" anymore as in past versions of Krita, but I think the normal user expectation would be that when the update rate limit is set to 60 fps (for example) brush strokes should also consistently update at that rate.

This isn't happening, however. The main issue here is that the actual update rate appears to be roughly half of the set value, which is low enough to be noticeable. Furthermore, I suspect it's noticeable especially on Windows (the OS I'm using) where vertical synchronization (VSync) is forced enabled and cannot be turned off for desktop applications. This condition is similar to that of 3D games when VSync is used but the framerate is not sufficiently high (at least that of the display's refresh rate).

Since the brush isn't updating every frame anymore and since the desktop manager (on Windows) is in sync with the display's refresh rate, this gives the impression of lower performance when using small brushes, even if performance with very big brushes might now be higher.

If you're using Linux you might want to try to enable vertical synchronization ("full screen repaints") and/or desktop composition to check out the difference, if you don't have them enabled in your desktop manager (e.g. KDE Plasma 5).

Hi, @poke1024!

I have implemented a simple unittest that checks the statistics of the compression period. It shows clearly that your current patch really fixes the problem, but it doesn't measure the number of timer restarts. I will probably not have time to implement this measurement until the end of the week (departing to a sprint). Feel free to use this patch for your own needs :)

Here are the results of the three versions of the compressor implement:

## Compressor 60ms, ticks 30+-5ms

# raw QTimer

Period: mean: 36.6604 st.dev.: 24.4163 min/max: 0.002691 64.438
    +offset: mean: 0.366584 max: 4.43799 ### <--- most important value!
    -offset: mean: -23.7062 max: -59.9973

# KisRelaxedTimer with the latest patch

Period: mean: 36.7508 st.dev.: 24.3699 min/max: 0.005454 64.5693
   +offset: mean: 0.491943 max: 4.56929 ### <--- most important value!
   -offset: mean: -23.7412 max: -59.9945

# KisRelaxedTimer in master (without the patch)

Period: mean: 60.4571 st.dev.: 47.3537 min/max: 0.005307 124.703
   +offset: mean: 23.319 max: 64.7032 ### <--- most important value!
   -offset: mean: -22.8619 max: -59.9947

Link to the patch itself: https://paste.kde.org/pknhbnlqu

@dkazakov: I'm aware that the framerate is not "unlimited" anymore as in past versions of Krita

The brush FPS is now automatically adjusted in range 12.5...100 FPS depending on the current speed of the brush. And, yes, it is not in sync with the display update rate, because display update should also update brush outline.

Since the brush isn't updating every frame anymore and since the desktop manager (on Windows) is in sync with the display's refresh rate, this gives the impression of lower performance when using small brushes, even if performance with very big brushes might now be higher.

I'll try to think about it. Though I'm not sure I know what solution I can make for it... The brush updates were never in sync with display updates. They were just quick enough to "look" like being in sync :)

The brush FPS is now automatically adjusted in range 12.5...100 FPS depending on the current speed of the brush. And, yes, it is not in sync with the display update rate, because display update should also update brush outline.

Since the brush isn't updating every frame anymore and since the desktop manager (on Windows) is in sync with the display's refresh rate, this gives the impression of lower performance when using small brushes, even if performance with very big brushes might now be higher.

I'll try to think about it. Though I'm not sure I know what solution I can make for it... The brush updates were never in sync with display updates. They were just quick enough to "look" like being in sync :)

I added a wishlist item about this a short time ago, which I assume you may have already seen.

I also noticed that the values assigned to KisBrushOp's m_currentUpdatePeriod are in the range 20 ... 100.
The value is eventually compared to what a call to QElapsedTimer::elapsed() returns in FreehandStrokeStrategy::tryDoUpdate(), so the unit would appear to be milliseconds.

A period of 100 ms would be 10 Hz, and 20 ms would be 50 Hz. If you wanted 12.5 ... 100 FPS, shouldn't the range be 10 ... 80? I suspect I could be misunderstanding something...

Additionally, with this patch applied, I tried setting m_currentUpdatePeriod to fixed values, then used a 100px Basic_circle brush on a 1024x1024 canvas to draw circles at 100% zoom with the mouse.
It's certainly far from a precise test, but I don't know if the results still might be interesting to someone.

  • 20 gave me a "Last brush framerate" (displayed when "Debug logging for brush rendering speed" is checked) of about 42 FPS.
  • 15 gave about 57 FPS.
  • 10 gave about 61 FPS.
  • 5 gave about 112 FPS.

(20 felt too slow to me, so if my wish bug is valid, I suppose I'd want the minimum to also be user adjustable, or simply set lower than it is now.)

I added a wishlist item about this a short time ago, which I assume you may have already seen.

I will check that

A period of 100 ms would be 10 Hz, and 20 ms would be 50 Hz. If you wanted 12.5 ... 100 FPS, shouldn't the range be 10 ... 80? I suspect I could be misunderstanding something...

Erm... yep. It seems like I've changed the values at some point. The initial range was 10...80, but it seems like I've changed that.

Additionally, with this patch applied, I tried setting m_currentUpdatePeriod to fixed values, then used a 100px Basic_circle brush on a 1024x1024 canvas to draw circles at 100% zoom with the mouse.
It's certainly far from a precise test, but I don't know if the results still might be interesting to someone.

  • 20 gave me a "Last brush framerate" (displayed when "Debug logging for brush rendering speed" is checked) of about 42 FPS.
  • 15 gave about 57 FPS.
  • 10 gave about 61 FPS.
  • 5 gave about 112 FPS. (20 felt too slow to me, so if my wish bug is valid, I suppose I'd want the minimum to also be user adjustable, or simply set lower than it is now.)

So you mean that 10-15 period value looks much better for you?

dkazakov added a comment.EditedNov 23 2017, 11:14 AM

Hi, @poke1024!

I have added a simple test for the number of restarts. It seems like the patch does no optimization, at least in the case, when the tick period is a multiple of the compressor period:

Latest poke1024's patch:
QDEBUG : KisSignalCompressorTest::test() Ticks: 247 Restarts: 94 0.380567

Baseline QTimer implementation:
QDEBUG : KisSignalCompressorTest::test() Ticks: 249 Restarts: 100 0.401606

Here is the patch I used for testing:
https://phabricator.kde.org/P139

dkazakov requested changes to this revision.Nov 23 2017, 11:22 AM

Same for 10+-2ms ticks on 60ms compressor. The patch does almost no benefit in comparison to the baseline implementation:

### 10+-2ms ticks on 60ms compressor

Latest poke1024's patch:
QDEBUG : KisSignalCompressorTest::test() Ticks: 660 Restarts: 86 0.130303

Raw QTimer implementation:
QDEBUG : KisSignalCompressorTest::test() Ticks: 664 Restarts: 101 0.152108
This revision now requires changes to proceed.Nov 23 2017, 11:22 AM

So you mean that 10-15 period value looks much better for you?

Yes. 15 and 10 felt more or less the same to me, but I'd go for at least 10 -- it matches your original choice and reached 60 FPS in my case (although still not the 100 FPS rate that it should have, in theory).

So you mean that 10-15 period value looks much better for you?

Yes. 15 and 10 felt more or less the same to me, but I'd go for at least 10 -- it matches your original choice and reached 60 FPS in my case (although still not the 100 FPS rate that it should have, in theory).

My opinion would be to have a 15ms default for the target minimum interval and allow users to configure the value.

(Sorry that this is getting a bit off-topic...)

Regarding timers, I think it might be worth looking at the implementation details of Qt timers.

Regarding Windows (win32): https://github.com/qt/qtbase/blob/eade2255ea7cd8200569080e9b295479b1f51bed/src/corelib/kernel/qeventdispatcher_win.cpp#L398
Qt on Windows can use two different types of timers:

  • When the interval is <20ms or when the timerType is set to Qt::PreciseTimer, it will use a Multimedia Timer, which triggers on a worker thread (which Qt then posts an event to the proper thread) and is pretty accurate if not too busy (oh and it also decreases the OS scheduler tick interval if it isn't already low enough and waste power waking up the CPU frequently but this isn't really a big deal to Krita, right).
  • In other cases, it will use SetTimer in which the OS posts a WM_TIMER message to the message queue, in a very inaccurate manner (even at 100ms interval, the error can be as high as 10%).

Here is an article comparing these timers: http://omeg.pl/blog/2011/11/on-winapi-timers-and-their-resolution/
With either cases, it's pretty straightforward, that each Qt timer is backed by a native timer.

Qt on all *nix systems seems to use the QTimerInfoList helper to arrange timer triggers. It looks like a timer queue, but I couldn't really figure out what mechanisms actually trigger the timers. I didn't dig too deep into how Qt timers on *nix are handled, but I can see that Qt::CourseTimer is handled by rounding the timing to a multiple of set intervals within an "acceptable" error.


Now, for events that require more accurate timing (like canvas update), it would be worth specifying Qt::PreciseTimer, especially on Windows where`SetTimer` can have such a huge error even with a larger interval (100ms). (The error may be much smaller on *nix, but it would take some benchmarking to verify.)

I would wish to do a bit of timer benchmarking on Windows this weekend, but I don't think I will get any surprising results.