Incremental screen updates via FBO
Needs RevisionPublic

Authored by poke1024 on Oct 30 2017, 10:29 AM.

Details

Reviewers
dkazakov
alvinhochun
Group Reviewers
Krita
Summary

This apples to the OpenGL canvas and is something I've been playing around with ever since it became that increasing the texture size was not an option and I had seen some time spent inside paintGL on an iMac. It's still experimental, but it looks quite promising to me.

The idea is: do not draw all tiles every frame, but only draw those tiles that actually changed. Since we're usually in a double buffer GL context, this needs an FBO to accumulate the incremental changes. The benefit of this is that the paintGL does get faster as it has much less work to do and the final FBO blit to the actual GL buffer is blazingly fast.

From what I measure, here on my MacBook Pro 2017 (Intel Iris Plus Graphics 650 1536 MB), when turning this on and drawing using a pen on a 4000 x 4000 canvas, the time spent inside renderCanvasGL goes to 10% as to what it was before.

Here are some number; the first bool indicates whether fbo incremental updates are in place, the second time is in microseconds (i.e. 1/1000 ms). So it goes from rougly 1 ms to rougly 1/10 ms, which, for 100 frames per second, would mean reducing the total time spent here from 100 ms to 10 ms.

no fbo

paintGL false 670
paintGL false 757
paintGL false 1074
paintGL false 1293
paintGL false 701
paintGL false 1052
paintGL false 645
paintGL false 1154
paintGL false 1465
paintGL false 1919
paintGL false 780
paintGL false 736
paintGL false 627
paintGL false 908
paintGL false 617
paintGL false 824
paintGL false 617
paintGL false 935
paintGL false 663
paintGL false 637
paintGL false 653
paintGL false 1174
paintGL false 633
paintGL false 915
paintGL false 661
paintGL false 797
paintGL false 618
paintGL false 802
paintGL false 785
paintGL false 1456
paintGL false 1198
paintGL false 869
paintGL false 669
paintGL false 988
paintGL false 736
paintGL false 661
paintGL false 668
paintGL false 1054
paintGL false 961
paintGL false 882
paintGL false 869
paintGL false 632
paintGL false 1220
paintGL false 685
paintGL false 653
paintGL false 711
paintGL false 1328
paintGL false 954
paintGL false 1025
paintGL false 17349
paintGL false 674
paintGL false 793
paintGL false 683

with fbo

paintGL true 137
paintGL true 78
paintGL true 236
paintGL true 149
paintGL true 84
paintGL true 67
paintGL true 149
paintGL true 91
paintGL true 151
paintGL true 70
paintGL true 77
paintGL true 35
paintGL true 68
paintGL true 74
paintGL true 58
paintGL true 60
paintGL true 43
paintGL true 71
paintGL true 39
paintGL true 66
paintGL true 126
paintGL true 53
paintGL true 67
paintGL true 69
paintGL true 79
paintGL true 65
paintGL true 67
paintGL true 94
paintGL true 106
paintGL true 75
paintGL true 133
paintGL true 76

Diff Detail

Repository
R37 Krita
Lint
Lint Skipped
Unit
Unit Tests Skipped
poke1024 created this revision.Oct 30 2017, 10:29 AM
dkazakov added a subscriber: dkazakov.EditedOct 30 2017, 1:13 PM

Hi, @poke1024!

Qt's FBO vs our own

The funny thing is that Qt already maintains an FBO for a widget. So, theoretically, you can reuse it. The only trouble is that you will have to refactor quite a lot of code to call updateCanvas(rect) instead of updateCanvas(), which is used in quite a few places now.

So, there are two solutions:

  1. Reuse Qt's FBO:
    • con: needs refactoring for all the updateWidget() calls
    • pro: we do not need to maintain our own copy of the scaled down canvas (up to 8MiB extra, not much)
  1. Maintain our own FBO, like you do here
    • con: on every update we need to write data twice, to our FBO and Qt's FBO (should be quick enough)
    • con: extra 8MiB of RAM (not much)
    • pro: no refactoring

Personally, I like your (second) approach a bit more :)

Reset state transitions

As far as I can see, you update the FBO with scaled down and transformed tiles on every upload, right? But where do you track current scale/offset/rotation? As far as I can guess, the FBO should be reset on every transformation change that happens to the canvas.

Regression!

If you enter Wraparound mode (press W), some portion of the canvas disappears.

Summary

  1. The patch is really good in general. We wanted to implement partial updates for a long time already.
  2. Please fix the regression, split all the new classes from kis_opengl_canvas2.cpp and try to make them a bit more understandable, so we could promote the patch from "experimental" into "stable" and push to master :)
dkazakov requested changes to this revision.Oct 30 2017, 1:14 PM

Mark the patch as needs changes

This revision now requires changes to proceed.Oct 30 2017, 1:14 PM
alvinhochun added a subscriber: alvinhochun.

Self reminder: need to check if it works well with ANGLE

poke1024 updated this revision to Diff 21568.Oct 30 2017, 1:53 PM

rebased to current top of tree

alvinhochun requested changes to this revision.Oct 30 2017, 2:55 PM

QOpenGLFramebufferObject::hasOpenGLFramebufferObjects called by KisOpenGL::supportsFBO requires a valid current OpenGL context, which means opening the config dialog will crash when no documents are opened or OpenGL is not available. Probably QOpenGLFramebufferObject::hasOpenGLFramebufferBlit too.
You should make the check only once in the ctor of OpenGLCheckResult and store the result, then change KisOpenGL::supportsFBO to call initialize() and get the result from openGLCheckResult, just like the other checks.
Please also change KisOpenGL::initialize to add supportsFBO to the OpenGL debug text.

Otherwise it seems to work ok with both native OpenGL and ANGLE, though with the regression that dmitry noticed. I haven't been able to do more testing and performance comparison yet.

This revision now requires changes to proceed.Oct 30 2017, 2:55 PM
alvinhochun added inline comments.Oct 30 2017, 3:06 PM
libs/ui/opengl/kis_opengl_canvas2.cpp
225

You probably shouldn't take the devicePixelRatio from the screen, but from the QWidget of the canvas. Otherwise this will likely break when there are multiple monitors with different dpi. (Haven't tested it yet though.)

Also, I would like KisConfig::setUseFBOForOpenGLUpdates to not write the value to the config if it hasn't been changed by the user (i.e. same as current/default value), or when OpenGL is not available on the system. Not saving the unchanged default value to the config means we can change the default later and it can be applied to most users automatically (unless the user has changed it manually).

rempt added a subscriber: rempt.Nov 2 2017, 8:20 AM

Am I right that I cannot apply this patch at the same time as D8603: Asynchronous texture uploading ?

In D8550#163275, @rempt wrote:

Am I right that I cannot apply this patch at the same time as D8603: Asynchronous texture uploading ?

Yes, thw two patches conflict adding similar things to KisTextureTile.

poke1024 updated this revision to Diff 21962.Nov 6 2017, 1:14 PM

An improved version, but not yet ready for prime time.

Main remaining problem:

If the bottom draw layer is not opaque, and thus the tile actually needs blending to draw against the checkered background, the current approach does not work. The code has some preparations to fix this (namely a background FBO that can then be used to fetch a texture and draw the background for one tile), but it is going to be very difficult due to the involved projection transformations and I think this should be saved for a later stage (it might take long to get this one border case right). So I believe the best fix for now is to just disable incremental updates with a non-opaque base layer; however, how can I detect this reliably? My current approach is just taking the first layer, but that is obviously not correct. Would it be like the first deepest node that is a paint layer or something like that? @dkazakov, what do you think?

Things fixed in this update:

  • a lot of code went from KisOpenGLCanvas into a new class KisOpenGLAccumulator; quite some cleanup along the way
  • wrapped mode should basically work now, even though I still see problem when resizing on macOS (I'm not sure this is a problem because wrapped mode seems to be kind of generally broken under macOS)
  • checks for FBO support is now properly encapsulated in KisOpenGL and done in valid context (@alvinhochun)
  • devicePixelRatio is now taken from widget instead of screen (@alvinhochun)
  • KisConfig::setUseFBOForOpenGLUpdates will not write a default value to the config file (@alvinhochun)

After talking to @boud in the meeting, I now understand that the opaqueness thing is not detectable via layer opaqueness at all, as each layer is full RGBA. So the fix for this has to be the "draw one tile background" thing. Ok.

poke1024 updated this revision to Diff 21987.Nov 6 2017, 6:11 PM

Solves the alpha background problem by using masking via blend. Works now as far as I can tell. Probably needs more cleaning up inside KisOpenGLCanvas2::drawImage.

rempt added a comment.Nov 7 2017, 9:29 AM

I don't see any regressions in a quick manual test.

dkazakov requested changes to this revision.Nov 14 2017, 2:35 PM

Hi, @poke1024!

At first, I still have a regression in Wraparound mode :( It looks like "row 0" is lost. See the attached screenshots:

Codewise, I like the idea of having a separate FB accumulator class, it looks quite readable. Though I have a few questions:

  1. Why do we need a two-pass blending for the checkers? Can you move this code to a higher level and just call drawCheckers(); drawImage() sequentially? Right now it really looks weird that drawCheckers is called from two different places, which are on different hierarchy levels. Probably, some "applicator" or "painter" pattern could help?
  2. I don't see the accumulator's state transition from clean to dirty on image 'pan'. Where does it happen? How does it know that it should redraw the FB when the user changes zoom/pan?

From the speed point of view I don't see any regressions nor improvements. So, basically, WA-mode bug and the drawImage() are the only two blockers of the patch.

libs/ui/opengl/kis_opengl_canvas2.cpp
709

Could you give/add some comments why we need a two-pass thing here? Speaking truly, I don't understand what it does :(

If we redraw the cached frame buffer every time the image is panned (and the checkers are offset), why do we need any special handling for them?

This revision now requires changes to proceed.Nov 14 2017, 2:35 PM
poke1024 updated this revision to Diff 22558.Nov 18 2017, 10:42 AM

Fixes wrapping mode regression (broken due to my columns loop refactoring)
Adds several comments to core parts of the incremental update
Moves drawCheckers() call into drawImage() to have one defined place of call (this is more logical, as drawImage() always draw checkers if it operates in incremental mode, so it was not logical to not have do it in non-incremental mode); the semantics of drawImage() is thus now: draw checkers and image

poke1024 marked an inline comment as done.Nov 18 2017, 10:51 AM

Hi @dkazakov,

(1) see comments and my inline remark.
(2) this happens inside KisOpenGLAccumulator::bind, namely with qFuzzyCompare(m_modelMatrix, modelMatrix). This works good enough in practice, although for very small changes this might pose a problem, but I think in practice matrix either stays the same or changes radically (i.e. like > 0.01). I didn't find a convenient observer or callback pattern to detect these changes, which might be even better of course.

libs/ui/opengl/kis_opengl_canvas2.cpp
709

It has to do with transparency in tiles. Imagine drawing a tile with some transparent pixels over and over again over the same checkered background that is not updated; it will accumulate the alphas, i.e. the incremental update would be wrong - so there needs to be a way to update the checker background for those tiles that get redrawn in a controlled manner (without touching the tiles that stay the same). That is what the first pass is for, which builds a mask of pixels that will get drawn to later on in the second pass. I tried to put this in the comments.

(2) this happens inside KisOpenGLAccumulator::bind, namely with qFuzzyCompare(m_modelMatrix, modelMatrix). This works good enough in practice, although for very small changes this might pose a problem, but I think in practice matrix either stays the same or changes radically (i.e. like > 0.01). I didn't find a convenient observer or callback pattern to detect these changes, which might be even better of course.

I would've thought you can just compare the equality even without fuzziness.

alvinhochun requested changes to this revision.Nov 18 2017, 3:45 PM

Found some bugs with incremental screen update enabled:

  • With the default new file, a transparent layer and a solid background layer are created. If I draw on the first layer and then hide the background layer, I get visual artifacts. The rendered colour also becomes saturated when painting over them to cause updates.
  • When there are two solid layers, toggling the visibility of one of them will result in the checkerboard showing in some area for a short time.
  • Issues in wrap around mode: If I draw on the top-left area outside of the main canvas, the rendering will eventually glitch out showing the checkerboard in some places. Toggling the visibility of layers in wrap around mode can also produce artifacts.

(Drawing outside the main canvas area in wrap around mode can result in broken lines, but it seems to be a regression with the multi-threaded brushes, not caused by this patch.)

Another issue is that toggling the incremental screen update config seems to apply only after "canvas graphics acceleration" is disabled, applied and then re-enabled, or after restarting Krita. It looks like you can add a check in KisCanvas2::resetCanvas to allow the config to be applied without a restart, can you do it?

Other than the above issues, it seems to work ok with ANGLE. Though I can't tell if there are any performance differences.

This revision now requires changes to proceed.Nov 18 2017, 3:45 PM
poke1024 marked an inline comment as done.Nov 18 2017, 5:14 PM

I would've thought you can just compare the equality even without fuzziness.

You can, and that's what I should do. I originally used fuzzy compare as there seemed to be a constant minimal numeric flicker between frames, but I can no longer see this.

By the way, I think I broke things with the latest patch; I tried your test cases, and turning off the layer visibility also no longer works properly on OS X, but it used to work there with the last version. So I broke something. I need to investigate.

poke1024 updated this revision to Diff 22579.Nov 18 2017, 6:57 PM
  • Fixes a regression I introduced with the last update that broke the checker board mask setup (this might also fix one or two @alvinhochun's flickering checkboard bugs)
  • Simplified the checkerboard mask code (eliminated glBlendColor)
  • Modified KisCanvas2::resetCanvas so that the change of setting takes place immediately, not only after opening a new canvas

BTW, if you're not sure which mode you're in, a simple way is to change KisOpenGLAccumulator:87 to something like f->glClearColor(1, 0, 0, 1.0); to produce a red border.

alvinhochun accepted this revision as: alvinhochun.Nov 19 2017, 12:50 PM

Ok, now there seems to be no more visual artifacts I can find (at least not yet).

With ANGLE, enabling incremental screen updates seems to be slower than without it, but I don't know enough OpenGL and the internals of ANGLE to find out why.

Also, I've added an inline comment.

libs/ui/opengl/kis_opengl_canvas2.cpp
693

I don't know how I feel about this... seems to me that making the inside a separate function and calling it twice might be a cleaner approach? (Perhaps with pass as a template parameter to also reduce run-time branching, but it's probably redundant if the compiler can simply inline the function..)

poke1024 added inline comments.Nov 20 2017, 1:53 PM
libs/ui/opengl/kis_opengl_canvas2.cpp
693

I agree, a separate function would be much cleaner. There are quite a number of local variables this depends on though. I'll think about how this can be refactored.

With ANGLE, enabling incremental screen updates seems to be slower than without it, but I don't know enough OpenGL and the internals of ANGLE to find out why.

Hm, ok. That would mean that ANGLE has very low overhead when drawing a large number of tiles. On macOS drawing a large number of tiles is relatively slow as the glDrawArrays calls are so expensive. Probably I should add some performance measurements and outputs, then we could see why it's slow on ANGLE; maybe I'm doing something with the new code that ANGLE does not like and I can eliminate that.

dkazakov requested changes to this revision.Nov 20 2017, 7:41 PM

Hi, @poke1024!

I have tested your patch and now it seems to work correctly. In my tests on Linux it also shows a bit of FPS improvements, especially in wraparound mode. When testing painting on a 8000x6000 in wraparound(!) mode, the FPS increases from 25 fps to 75 fps. Which is very nice :)

The only blocking problem I see now is that two-pass painting. Is it really necessary? As far as I can tell, you can paint the checkers in the same way as the tiles are being painted. Just offset the texture coordinates you pass to the primitive and it will paint a small piece of background checkers. This masking algorithm is really difficult to understand and, I'm afraid, it may cause troubles for maintenance :(

libs/ui/opengl/kis_opengl_canvas2.cpp
709

Why don't you just paint this only tile and pass the offset to the checkers shader?

Iirc, the checkers texture is just a simple texture of 64x64 pixles that is painter in "repeat" mode. You can easily add an offset to the texture (or offset the texture coordinates) and paint the tiles separately without any masking.

Speaking truly, I think this two-pass solution with masking is really an overcomplication... :(

This revision now requires changes to proceed.Nov 20 2017, 7:41 PM