Change the default texture size
AbandonedPublic

Authored by poke1024 on Oct 13 2017, 9:25 PM.

Details

Summary

The default texture tile size seems to have a significant impact on performance.

On my iMac, a 256 texture size Krita (the current default) ran with 46 fps, whereas a 1024 texture size Krita runs with 94 fps (100 fps seems to be the maximum the OpenGL driver seems to be able to output to the screen).

This is also seen in profiling. With 256, KisOpenGLCanvas2::paintGL takes 6.9%/31.4% = 22% inside the main thread:

With 1024, it's 4.9%/36.7% = 13.4%:

With 4096, it's 2.7%/28.5% = 9.5%:

The actual time is spent inside glDrawArrays in KisOpenGLCanvas2::drawImage.

This patch thus proposes upping the default texture size to 2048. This should be safe for all desktop GPUs. For mobile platforms, there might be some old platforms that need 1024 (but see Krita's check below). I found some discussions on these topics here:

https://www.reddit.com/r/gamedev/comments/491uww/is_a_max_texture_size_of_2048x2048_a_safe_bet_for/
https://stackoverflow.com/questions/16931295/android-devices-gl-max-texture-size-limitation-safe-texture-size

As KisOpenGLImageTextures::getTextureSize additionally limits the configured or default texture size by the system limit (GL_MAX_TEXTURE_SIZE) anyway, this change should not break things, even if the platform does not support the configured size. The question is though if smaller GPUs get hurt in performance by going to the maximum possible texture size in this way. Inside KisTextureTile::update Krita does already use glTexSubImage2D to minimize only updating parts of the texture if possible and not the whole tile, so I imagine that having large texture sizes will usually not have a negative effect. As a last resort, the texture size can always be configured as a user setting.

Diff Detail

Repository
R37 Krita
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
poke1024 created this revision.Oct 13 2017, 9:25 PM
rempt accepted this revision.Oct 14 2017, 8:46 AM
rempt added a subscriber: rempt.

On Linux, I don't see any regressions, but I've also not done profiling myself.

This revision is now accepted and ready to land.Oct 14 2017, 8:46 AM

I'd be interested to see a more detailed profiling inside paintGL and see what's actually changed...

libs/ui/kis_config.cc
655

Indentation is 4 spaces?

Also, when writing the commit message, please try to keep the first line short and add a bit of detail to the body. Here is a nice guide to do it: https://chris.beams.io/posts/git-commit/#seven-rules
(It's not a requirement to follow it exactly, but it does make the commit messages clearer.)

At the end of the commit message, please add a line Differential Revision: https://phabricator.kde.org/D8280 to link the commit back to this Phabricator revision. Please also do the same for your future commits. It helps us to trace the changes. :)

poke1024 updated this revision to Diff 20957.Oct 18 2017, 5:30 PM

fixed indentation

@alvinhochun I did more tests on two Macs and the time difference seems to come from glDrawArrays.

I also suspect that having much less tiles (i.e. 1 -5 to > 100) destresses the whole tile management pipeline and mutexing.

Here are the basic results with a smaller canvas (1500 x 1500).

First on a iMac 5K:

And here on a MacBook Pro:

In both cases, the right side is the 2048 texture tile size. As you can see, the hit is inside glDrawArrays and the driver implementation. On the other hand, there's not more time spent in glTexSubImage2D for the larger tile size. Both main threads took about 60% of total.

Now, what confused me in the measures above is that paintGL as a whole did not take less time. Don't ask me to explain, but this only seems to happen for larger images. I did one more test with a 4000 by 4000 canvas. Here's the result:

For smaller canvases the new texture size means that there will be only one tile and one texture that is drawn which in terms of the drawing process this is fast (as long as we don't do any kind of incremental updates on a single buffered context).

It's hard to track down all the implications of this change performance-wise, but I think it should usually be a small performance improvement.

This revision was automatically updated to reflect the committed changes.

@dkazakov Just recognized that this change has implications for the pool sizes defined in kis_texture_tile_info_pool.h:

const int minPoolChunk = 32; // 8 MiB (default, with tilesize 256)
const int maxPoolChunk = 128; // 32 MiB (default, with tilesize 256)
const int freeThreshold = 64; // 16 MiB (default, with tilesize 256)

What unsettles me a bit is that I just a that m_maxAllocations went up to 35 while drawing in my tests here. With the old tile size (256), this meant allocating 256 * 256 * 4 * 35 = 8,75 MB, while with 2048 as tile size, this is 2048 * 2048 * 4 * 35 = 560 MB.

We might change the defs to not use numbers but rely on the given texture size, something like:

const int minPoolSize = 8 * 1024 * 1024; // 8 MiB
const int freeThresholdSize = 16 * 1024 * 1024; // 16 MiB

class KisTextureTileInfoPoolSingleSize
{
public:
    KisTextureTileInfoPoolSingleSize(int tileWidth, int tileHeight, int pixelSize)
        : m_chunkSize(tileWidth * tileHeight * pixelSize),
          m_freeThreshold(freeThresholdSize / m_chunkSize),
          m_pool(m_chunkSize, std::max(1, minPoolSize / m_chunkSize)),
          m_numAllocations(0),
          m_maxAllocations(0),
          m_numFrees(0)
    {
    }
dkazakov reopened this revision.Oct 23 2017, 1:00 PM

Nooooooo!!! This patch should not have been pushed into master :(

Making the texture size higher causes severe regressions when drawing with small (~10-50 px) brushes on big (>5000x) canvases. GPU bloats with the mipmap regeneration requests. The problem is that every time we do glSubImage2D and update only a subset of pixels of the texture, we should still regenerate the full mipmap for the entire texture. OpenGL has (at least, had) not got any mean of regenerating only a portion of the mipmap.

Please revert this patch :(

This revision is now accepted and ready to land.Oct 23 2017, 1:00 PM
dkazakov requested changes to this revision.Oct 23 2017, 1:00 PM
This revision now requires changes to proceed.Oct 23 2017, 1:00 PM
poke1024 abandoned this revision.Oct 28 2017, 3:45 PM

Making the texture size larger is not an option. To address the profiling measurements, the right approach is to use an FBO and only redraw those tiles that need changing.