Get (almost) rid of glFinish() on OS X
AbandonedPublic

Authored by poke1024 on Oct 22 2017, 11:07 AM.

Details

Reviewers
rempt
dkazakov
Group Reviewers
Krita
Summary

This is the result of three weeks of experimenting how to get rid of glFinish() on OS X and this is the only/best solution I can come up with.

The glFinish() is still in place in KisOpenGLCanvas2::paintGL, but it's usually no longer called (only very sporadically like once per minute and during startup).

The logic extends KisOpenGLCanvas2::isBusy to usually work around having to call glFinish(). It does this by tracking the texture upload status using an Apple specific call for tracking the texture status.

A major problem is that I cannot verify this solution as I can no longer reproduce the original problems with removing glFinish() altogether. I did see the problems sporadically happening two weeks ago but can no longer see them on any of the three Macs I test with. I also tried bisecting git up to August without success. I still believe there is a problem so I would suggest against removing glFinish() altogether at this point. I wonder if we could seed this to other Mac users.

As for the benefit, glFinish() currently takes about 10% in the main thread (that takes about 50% of the total load), which means about 20% of the time in the main thread is blocked inside glFinish() on OS X, which is a problem for event consumption.

Other things I tried and abandoned:

(1) Blocking texture upload altogether if an update event has been scheduled via QT so that paintGL will never be in a wait state. This added complex, brittle logic and sporadically failed to upload textures for reasons I do not understand.
(2) Using a standard OpenGL fence for detecting when the texture upload is finished; this did not work as the fence did not get signaled, neither with glFlush() before or after the fence.
(3) Using double buffering (i.e. two textures for each tiles). This sounds like a charming idea at first until you recognize that you need to accumulate changes, i.e. if you have two double buffered textures T1 and T2, the changes that arrive for T1 need to be later reapplied to T2 in addition to the changes for T2. The logic gets complex very soon.
(4) Moving the OpenGL rendering to a separate thread. Basically not possible as long as QT hogs the OpenGL contexts for its compositing of widgets in an unpredictable manner. I also investigated moving the whole OpenGL context into its own QOpenGLWindow but that had lots of other problems.
(5) Moving the fence done detection logic to a separate thread to work around polling. Only possible on platforms that support multithreading with OpenGL and shared contexts. You would need a shared OpenGL context that shares fences with the main context. You also need to duplicate the whole polling logic for platforms that do not support it. After two days of investigating this, this seemed like a whole lot of effort for very little benefit.

Diff Detail

Repository
R37 Krita
Lint
Lint Skipped
Unit
Unit Tests Skipped

Is it really necessary to link to the OpenGL library? Can you use QOpenGLContext::getProcAddress to get the function pointer instead? Also, if a function is specific to an extension you may want to check it with QOpenGLContext::hasExtension.

I like QOpenGLContext::getProcAddress and it works here but it adds some more wrapper code: there needs to be a place to keep the ptr and having it inside KisTextureTile at least needs to pass in the OpenGL context when constructing the tile and then it's kind of ugly, looking up the function for each single tile. And having a whole wrapper like for fences is overkill here. In short, linking against the OpenGL framework IMHO produces more concise code changes here even though generally I'm also a getProcAddress advocator. But if we have Q_OS_OSX defined, we know we're in an ecosystem where that call is always available, and it's also not useful for other platforms. So unless adding an additional linked library is a bad hit which could cause problems (I'm not really knowledgeable about that I admit), I'd prefer the current approach.

The patch looks correct. I tested it on Linux and it doesn't cause any regressions. I guess we need to ask @rempt to test the patch. If it works fine, then just push.

libs/ui/opengl/kis_opengl_canvas2.cpp
365

Should we also wrap it into #ifdef Q_OS_OSX? Or this check is also useful for other platforms?

poke1024 updated this revision to Diff 21275.Oct 25 2017, 4:21 AM

Adds Q_OS_OSX guard.

poke1024 marked an inline comment as done.Oct 25 2017, 4:22 AM
poke1024 added inline comments.
libs/ui/opengl/kis_opengl_canvas2.cpp
365

Should be wrapped, just missed this one.

poke1024 marked an inline comment as done.Oct 25 2017, 4:23 AM

Actually I detected a regression with this on my iMac 5K right now. When painting with a 100px Ink_brush_25 on a 4000 by 4000 canvas, this happens (notice the grey zebra-like areas in the stroke that actually disappear as soon as an additional update happens in the area):

Well, at least I have a regression I can test against now!

poke1024 updated this revision to Diff 21278.Oct 25 2017, 5:24 AM

Seems to fix regression.

Texture uploads now also wait until previous texture uploads have finished. This is what the previous glFinish() implementation also enforced.

dkazakov accepted this revision.Oct 25 2017, 7:24 AM

The patch looks fine, though I would prefer to use KisSignalCompressor(POSTPONE) instead of a raw timer here. Please push whatever version you like :)

This revision is now accepted and ready to land.Oct 25 2017, 7:24 AM

I like QOpenGLContext::getProcAddress and it works here but it adds some more wrapper code: there needs to be a place to keep the ptr and having it inside KisTextureTile at least needs to pass in the OpenGL context when constructing the tile and then it's kind of ugly, looking up the function for each single tile. And having a whole wrapper like for fences is overkill here. In short, linking against the OpenGL framework IMHO produces more concise code changes here even though generally I'm also a getProcAddress advocator. But if we have Q_OS_OSX defined, we know we're in an ecosystem where that call is always available, and it's also not useful for other platforms. So unless adding an additional linked library is a bad hit which could cause problems (I'm not really knowledgeable about that I admit), I'd prefer the current approach.

It doesn't seem difficult: You only need to get the function pointer once to pass it to KisTextureTile and KisTextureTile is constructed in only one place anyway. It already passes QOpenglFunctions to it so passing an extra function pointer doesn't seem any uglier.

The thing is, currently all the OpenGL calls go through Qt in some way. The OpenGL libraries are loaded by Qt and Qt sets up the function tables for us. Technically, the OpenGL library that Qt uses is an implementation detail and they can change the actual OpenGL implementation used (for example a bundled MESA software OpenGL renderer, which is actually possible on Windows and might be on Linux in the future), so there is no guarantee that the linked OpenGL library is the same as the one loaded by Qt.

Realistically though, that is very unlikely to happen for Qt on OS X since OpenGL on OS X is pretty decent when compared to the mess on Windows (except for the lack of compatibility profiles). The only real problem with linking directly to OpenGL is that it feels a bit dirty and is inconsistent with the existing code. (It may also cause someone in the future, developing on OS X, to call OpenGL functions directly thinking it would work on other platforms, but this is pretty hypothetical.)

I'm not strongly opposed to linking OpenGL directly only on OS X, and I'll be fine with this change committed. Just saying.

(Off-topic: Though, I do wonder if the issues with OpenGL on OS X are really OS-X-specific or actually due to the use of OpenGL 3.2 core profile instead of a compatibility profile. If Krita is to use OpenGL 3.2 core profile on other platforms would it show the same issues seen on OS X?)

It doesn't seem difficult: You only need to get the function pointer once to pass it to KisTextureTile and KisTextureTile is constructed in only one place anyway. It already passes QOpenglFunctions to it so passing an extra function pointer doesn't seem any uglier.

Ah, but getProcAddress lives inside QOpenGLContext, not inside QOpenGLFunctions and QOpenGLFunctions doesn't let you know its QOpenGLContext; and glTestObjectAPPLE, the function I need, isn't known to QOpenGLFunctions as it's an Apple extension.

The thing is, currently all the OpenGL calls go through Qt in some way. The OpenGL libraries are loaded by Qt and Qt sets up the function tables for us. Technically, the OpenGL library that Qt uses is an implementation detail and they can change the actual OpenGL implementation used (for example a bundled MESA software OpenGL renderer, which is actually possible on Windows and might be on Linux in the future), so there is no guarantee that the linked OpenGL library is the same as the one loaded by Qt.

But that is a point, we could see that on OS X at some point.

Realistically though, that is very unlikely to happen for Qt on OS X since OpenGL on OS X is pretty decent when compared to the mess on Windows (except for the lack of compatibility profiles).

I beg to disagree :-) OpenGL on OS X in 2017 is as horrible as it gets if you ask me. Apple is completely focused on their own Metal API and ships their 2010 implementation of OpenGL that they have basically abandoned for 8 years now (see https://www.quora.com/Why-is-OpenGL-poorly-supported-on-Mac-OS-X). They are still at OpenGL 4.1, when the rest of the world has 4.5. The whole QOpenGLDebugLogger that Krita also employs is not working on OS X as it needs GL_KHR_debug from 2012, which Apple still does not support. What's worse, Apple's own debug tools for OpenGL haven't been updated for years and have completely broken with the recent OS release (https://forums.developer.apple.com/thread/70141). There is no sensible way to debug OpenGL on OS X currently. I actually could imagine that QT will be forced to deliver their own OpenGL on OS X one day if things continue as they do. So in regard to this patch: I think this is actually really a point in favor of using getProcAddress so we could theoretically use a different OpenGL implementation (as unlikely as this may sound now).

(Off-topic: Though, I do wonder if the issues with OpenGL on OS X are really OS-X-specific or actually due to the use of OpenGL 3.2 core profile instead of a compatibility profile. If Krita is to use OpenGL 3.2 core profile on other platforms would it show the same issues seen on OS X?)

I don't expect to see this on other platforms with core profile.

It doesn't seem difficult: You only need to get the function pointer once to pass it to KisTextureTile and KisTextureTile is constructed in only one place anyway. It already passes QOpenglFunctions to it so passing an extra function pointer doesn't seem any uglier.

Ah, but getProcAddress lives inside QOpenGLContext, not inside QOpenGLFunctions and QOpenGLFunctions doesn't let you know its QOpenGLContext; and glTestObjectAPPLE, the function I need, isn't known to QOpenGLFunctions as it's an Apple extension.

I mean you only need to get the function pointer once before constructing KisTextureTile, at the same place where QOpenGLFunctions is obtained also from QOpenGLContext. (KisOpenGLImageTextures::createImageTextureTiles)

The thing is, currently all the OpenGL calls go through Qt in some way. The OpenGL libraries are loaded by Qt and Qt sets up the function tables for us. Technically, the OpenGL library that Qt uses is an implementation detail and they can change the actual OpenGL implementation used (for example a bundled MESA software OpenGL renderer, which is actually possible on Windows and might be on Linux in the future), so there is no guarantee that the linked OpenGL library is the same as the one loaded by Qt.

But that is a point, we could see that on OS X at some point.

Realistically though, that is very unlikely to happen for Qt on OS X since OpenGL on OS X is pretty decent when compared to the mess on Windows (except for the lack of compatibility profiles).

I beg to disagree :-) OpenGL on OS X in 2017 is as horrible as it gets if you ask me. Apple is completely focused on their own Metal API and ships their 2010 implementation of OpenGL that they have basically abandoned for 8 years now (see https://www.quora.com/Why-is-OpenGL-poorly-supported-on-Mac-OS-X). They are still at OpenGL 4.1, when the rest of the world has 4.5. The whole QOpenGLDebugLogger that Krita also employs is not working on OS X as it needs GL_KHR_debug from 2012, which Apple still does not support. What's worse, Apple's own debug tools for OpenGL haven't been updated for years and have completely broken with the recent OS release (https://forums.developer.apple.com/thread/70141). There is no sensible way to debug OpenGL on OS X currently. I actually could imagine that QT will be forced to deliver their own OpenGL on OS X one day if things continue as they do. So in regard to this patch: I think this is actually really a point in favor of using getProcAddress so we could theoretically use a different OpenGL implementation (as unlikely as this may sound now).

At least it's better than the crap built-in implementation on Windows which is still stuck on OpenGL 1.1... Sure, vendors do ship their own OpenGL drivers, but they can be pretty broken (especially Intel's).

poke1024 updated this revision to Diff 21665.Oct 31 2017, 6:27 PM

switched to KisSignalCompressor.

added one glFlush to fix a new regression that turned up with macOS High Sierra.

Just for the record, I just found out that this whole approach does not work properly on my MacBook Pro 2017 under High Sierra. Basically, glTestObjectAPPLE returns true (finished) all the time, even though the thing is not finished.

From the experiments I've done now, there just seems to have to be some kind of glFinish after texture uploading or things break; this works (even though glTestObjectAPPLE returns true before and after the finish, what the heck):

diff --git a/libs/ui/CMakeLists.txt b/libs/ui/CMakeLists.txt
index 7bd24639aa..f718796b80 100644
--- a/libs/ui/CMakeLists.txt
+++ b/libs/ui/CMakeLists.txt
@@ -14,6 +14,7 @@ add_subdirectory( tests )
 if (APPLE)
     find_library(FOUNDATION_LIBRARY Foundation)
     find_library(APPKIT_LIBRARY AppKit)
+    find_library(OPENGL_LIBRARY OpenGL)
 endif ()
 
 set(kritaui_LIB_SRCS
@@ -538,6 +539,7 @@ endif()
 if(APPLE)
     target_link_libraries(kritaui ${FOUNDATION_LIBRARY})
     target_link_libraries(kritaui ${APPKIT_LIBRARY})
+    target_link_libraries(kritaui ${OPENGL_LIBRARY})
 endif ()
 
 
diff --git a/libs/ui/opengl/kis_opengl_canvas2.cpp b/libs/ui/opengl/kis_opengl_canvas2.cpp
index 0d72cbeb85..9130c2b290 100644
--- a/libs/ui/opengl/kis_opengl_canvas2.cpp
+++ b/libs/ui/opengl/kis_opengl_canvas2.cpp
@@ -884,17 +884,6 @@ QRect KisOpenGLCanvas2::updateCanvasProjection(KisUpdateInfoSP info)
         d->openGLImageTextures->recalculateCache(info);
     }
 
-#ifdef Q_OS_OSX
-    /**
-     * There is a bug on OSX: if we issue frame redraw before the tiles finished
-     * uploading, the tiles will become corrupted. Depending on the GPU/driver
-     * version either the tile itself, or its mipmaps will become totally
-     * transparent.
-     */
-
-    glFinish();
-#endif
-
     return QRect(); // FIXME: Implement dirty rect for OpenGL
 }
 
diff --git a/libs/ui/opengl/kis_texture_tile.cpp b/libs/ui/opengl/kis_texture_tile.cpp
index d8fb03d92d..a40ab95fbd 100644
--- a/libs/ui/opengl/kis_texture_tile.cpp
+++ b/libs/ui/opengl/kis_texture_tile.cpp
@@ -369,6 +369,17 @@ void KisTextureTile::update(const KisTextureTileUpdateInfo &updateInfo)
     //     qDebug() << "WARNING: LodN -> Lod0 switch is requested for the partial tile udpate! Flickering is possible..." << ppVar(patchSize);
     // }
 
+#ifdef Q_OS_OSX
+    /**
+     * There is a bug on OSX: if we issue frame redraw before the tiles finished
+     * uploading, the tiles will become corrupted. Depending on the GPU/driver
+     * version either the tile itself, or its mipmaps will become totally
+     * transparent.
+     */
+
+    glFinishObjectAPPLE(GL_TEXTURE, m_textureId);
+#endif
+
     if (!patchLevelOfDetail) {
         setNeedsMipmapRegeneration();
     } else {

@rempt @dkazakov

Ok, I'll abandon this patch, it's not functioning safely. I have another one coming with a threaded approach that's the way to go.

Here's the good news: I think I finally found a reliable way to reproduce the problem even with the current top of tree. You just use Giphy from the MacApp Store (https://itunes.apple.com/us/app/giphy-capture-the-gif-maker/id668208984?mt=12) and record Krita drawing. You even see the effect with the current top of tree, which confirms my suspicion that the current glFinish() works because it adds a delay but not because it really solved the problem (by the way, this is not only what Giphy records, this is what you see on the screen while drawing):

https://www.dropbox.com/s/un8vdaabyffgrcr/macOS-10.13-OpenGL-1cfb64ab4bc2e1af2eeedda2c625a19e15147cb8.mp4?dl=0

poke1024 abandoned this revision.Nov 1 2017, 8:18 PM

Abandoned, as this needed another inexplicable glFlush to fix regressions that probably only worked because it added another delay. The main problem with this is that glTestObjectAPPLE always returns 1 (done) even directly after uploading a texture. Whereas using glFinish() seems to work, so we want glFinish(), which basically negates the whole approach of this patch.

rempt added a comment.Nov 2 2017, 8:14 AM

Right. The other patch looks promising, though, and I'm building that now on my mac.