[X11] Force glXSwapBuffers to block with NVIDIA driver
ClosedPublic

Authored by ekurzinger on Mar 18 2019, 9:41 PM.

Details

Summary

The NVIDIA implementation of glXSwapBuffers will, by default, queue up
to two frames for presentation before blocking. KWin's compositor,
however, assumes that calls to glXSwapBuffers will always block until
the next vblank when rendering double buffered. This assumption isn't
valid, as glXSwapBuffers is specified as being an implicit glFlush,
not an implicit glFinish, and so it isn't required to block. When this
assumption is violated, KWin's frame timing logic will
break. Specifically, there will be extraneous calls to
setCompositeTimer with a waitTime of 0 after the non-blocking buffer
swaps, dramatically reducing desktop responsiveness. To remedy this,
a call to glXWaitGL was added by Thomas Luebking after glXSwapBuffers
in 2015 (see bug 346275, commit
8bea96d7018d02dff9462326ca9456f48e9fe9fb). That glXWaitGL call is
equivalent to a glFinish call in direct rendering, so it was a good
way to make glXSwapBuffers behave as though it implied a glFinish
call.

However, the NVIDIA driver will by default do a busy wait in glFinish,
for reduced latency. Therefore that change dramatically increased CPU
usage. __GL_YIELD can be set to USLEEP (case insensitive) to change
the behavior and use usleep instead. When using the NVIDIA driver,
KWin will disable vsync entirely if __GL_YIELD isn't set to USLEEP
(case sensitive, a bug in KWin).
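
To illustrate the case-sensitivity mismatch described above, here is a minimal sketch of the two kinds of check. The helper names are hypothetical and this is not KWin's actual code; it only assumes that the driver accepts any letter case while a plain string comparison does not.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical helper: true if value spells "USLEEP" in any letter case,
// which is how the summary says the driver interprets __GL_YIELD.
bool yieldIsUsleep(const char *value)
{
    if (value == nullptr)
        return false;
    static const char expected[] = "USLEEP";
    if (std::strlen(value) != std::strlen(expected))
        return false;
    for (std::size_t i = 0; expected[i] != '\0'; ++i) {
        if (std::toupper(static_cast<unsigned char>(value[i])) != expected[i])
            return false;
    }
    return true;
}

// Hypothetical helper: a case-sensitive check of the kind the summary
// describes as a bug, since it rejects "usleep" although the driver accepts it.
bool yieldIsUsleepCaseSensitive(const char *value)
{
    return value != nullptr && std::strcmp(value, "USLEEP") == 0;
}
```

With these helpers, a value of "usleep" passes the first check but fails the second, which is the situation in which vsync would be disabled even though the driver is already sleeping instead of busy waiting.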

However, the NVIDIA driver supports another environment variable,
__GL_MaxFramesAllowed, which can be used to control how many frames
may be queued by glXSwapBuffers. If this is set to 1 the function
will always block until retrace, in line with KWin's expectations.
This allows the now-unnecessary call to glXWaitGL to be removed along
with the logic to conditionally disable vsync, providing a better
experience on NVIDIA hardware.
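
As a rough sketch of the approach (not the committed diff), the variable can be set from within the process using POSIX setenv. The function name below is hypothetical, and it is an assumption that the call has to happen before the driver reads its environment, i.e. before any GLX context is created.

```cpp
#include <cassert>
#include <stdlib.h>

// Hypothetical helper: ask the NVIDIA driver to queue at most one frame,
// so that glXSwapBuffers blocks until retrace. Assumed to be called early
// in startup, before OpenGL is initialized.
bool forceBlockingSwapBuffers()
{
    // The third argument (1) overwrites any value the user already set.
    return setenv("__GL_MaxFramesAllowed", "1", 1) == 0;
}
```

The same effect can be tested by hand by exporting __GL_MaxFramesAllowed=1 in the environment KWin is started from.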

Test Plan

Run KWin using the X11 backend with the proprietary NVIDIA driver.
Ensure the TripleBuffer option is not set to true in Xorg's configuration file.
Ensure vsync has not been manually disabled (either in NVIDIA's or KWin's settings).

  • Moving windows should not exhibit significant lag
  • CPU usage should never be excessively high
  • No screen tearing should be observed

Diff Detail

Repository
R108 KWin
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
ekurzinger created this revision. Mar 18 2019, 9:41 PM
Restricted Application added a subscriber: kwin. Mar 18 2019, 9:41 PM
ekurzinger requested review of this revision. Mar 18 2019, 9:41 PM
zzag accepted this revision. Mar 19 2019, 9:20 AM
zzag added a subscriber: zzag.

I don't have any NVIDIA GPU to test, but code-wise this change makes sense.

Minor nitpick: please change "[X11]" to "[platforms/x11]" :-)

This revision is now accepted and ready to land. Mar 19 2019, 9:20 AM
davidedmundson accepted this revision. Mar 19 2019, 9:22 AM
davidedmundson added a subscriber: davidedmundson.

The rationale all makes sense.
Thanks for investigating this.

This revision was automatically updated to reflect the committed changes.

I guess you guys would consider this to be too risky for 5.15? (Although a lot of Nvidia owners - me included - are dying to get this fix).

Thanks a lot! I suggest to backport it to all supported versions.

zzag added a comment. Mar 27 2019, 8:35 AM

I backported this patch to 5.12 and 5.15.

Since the introduction of this patch we have seen a deadlock when QtQuick calls glXSwapBuffers:
https://bugs.kde.org/show_bug.cgi?id=406180

QtQuick uses a separate GL context with a different render path, often in a different thread from kwin's main rendering. It's used in various tab switchers, and the frame rate is not directly tied to kwin's render updates (though this is something I did want to fix... maybe it would be a solution?).
The introduction of the environment variable seems to have had an adverse effect on this separate codepath.

This freeze is not universal, unfortunately.

Can you clarify what multiple glXSwapBuffers calls block until?

Curious, sorry for overlooking PRIME systems in my testing. Since the QtQuick window would be redirected, __GL_MaxFramesAllowed=1 should have the effect of forcing glXSwapBuffers to wait until rendering for the previous frame has completed and the back buffer contents have been copied to the front buffer. I'm taking a look into this internally as well; having access to a debug NVIDIA driver might provide some more insight into what's going wrong. Will update with any progress, hopefully we can avoid having to revert the fix.

Alright, I think I see what the issue is. For context, on PRIME systems, since there isn't actually a display connected to the NVIDIA GPU, we rely on xf86-video-modeset, which drives the Intel GPU, to call into our driver after each vblank. However, it will actually only do so if there has been damage to the screen. So what I think we have here is QtQuick's thread waiting on this vblank signal before swapping buffers (since we set __GL_MaxFramesAllowed to 1), and KWin's main compositing thread waiting on the QtQuick thread. Since KWin isn't rendering to the screen while it waits, no damage occurs and hence this vblank signal is never sent. I think some timeout is what eventually gets things going again (after about 30 seconds). Does that make sense?

So to fix this, I'm wondering if there would be some way to get KWin to do a compositing cycle before it waits on the QtQuick thread?
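
The stall described above can be pictured with a toy model. This is only a sketch of the reasoning, not driver or KWin code: the struct, the function and the notion of a "tick" (one vblank period) are all hypothetical, and it assumes the vblank notification is forwarded exactly when the compositor has damaged the screen.

```cpp
#include <cassert>

// Toy model of the PRIME stall; all names are hypothetical.
struct StallModel {
    bool compositorBlockedOnQtQuick; // KWin's main thread waits on the QtQuick thread
    int timeoutTicks;                // fallback that eventually ends the wait
};

// Returns how many vblank periods ("ticks") the QtQuick buffer swap waits.
int ticksUntilSwapCompletes(const StallModel &m)
{
    assert(m.timeoutTicks > 0);
    for (int tick = 1; tick <= m.timeoutTicks; ++tick) {
        // The compositor can only damage the screen if it is free to render.
        const bool damage = !m.compositorBlockedOnQtQuick;
        // The vblank notification is forwarded only when there was damage.
        const bool vblankNotified = damage;
        if (vblankNotified)
            return tick;
    }
    // No notification ever arrived, so only the timeout ended the wait.
    return m.timeoutTicks;
}
```

Assuming a 60 Hz display, a timeout of about 30 seconds is roughly 1800 ticks: with the compositor blocked the model waits out the whole timeout, whereas if the compositor completes a cycle first the swap finishes on the next tick.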

It makes sense. Thanks for looking.

So to fix this, I'm wondering if there would be some way to get KWin to do a compositing cycle before it waits on the QtQuick thread?

Not trivially.

The main part of the design (on X11) is that these popups are "normal windows" that draw using the platform rendering mechanism, and the compositor part of kwin just treats them as normal windows. We don't really have hooks into their low level internals.

Super worst case, we can enable a software only renderer on our QtQuick views, but that's not much better than reverting.
I'll have a think to try and find something nice.

fredrik added a subscriber: fredrik. May 8 2019, 6:38 PM

Do we actually want the QtQuick rendering to sync to vblank in this case?
If not, the solution could be to set the swap interval for those drawables to zero.

That'd be a nice easy fix at least.

Here's a patch: https://phabricator.kde.org/P385

I think it might tear because we have the threaded render loop so it's got no reason to be in sync with kwin/anything - but it looks to me to work fine.

It needs someone with the bug to test if it fixes the issue.

That'd be a nice easy fix at least.

Here's a patch: https://phabricator.kde.org/P385

I think it might tear because we have the threaded render loop so it's got no reason to be in sync with kwin/anything - but it looks to me to work fine.

It won't tear because the result of the buffer swap won't be visible until kwin renders the window on the screen - and the rendering kwin does is still synced to vblank.
It can potentially make any animations in those windows less smooth however.

davidedmundson added a comment. Edited May 8 2019, 10:20 PM

Feedback from bugzilla and that patch.

One person says it fixes it. \o/

One person says it improves the situation but doesn't fix everything (https://bugs.kde.org/show_bug.cgi?id=406180#c30). It implies we're no longer deadlocked, but are somehow missing some damage.

I posted a comment on the bug (shall we just move this discussion there going forward?) with a bit more explanation, but in summary David's patch seems to only prevent the hang the first time a QtQuick window is displayed. However, I think this might be more of a Qt issue that might be worth fixing upstream, although maybe it can be worked around in KWin as well. I'm not sure to what extent Qt and KDE development are coordinated?

(shall we just move this discussion there going forward?)

Sure

gchandran added a subscriber: gchandran. Edited Thu, May 30, 1:18 PM

Good day @ekurzinger, @davidedmundson. I am on the latest Tumbleweed snapshot, obviously waiting for 5.16.

I followed this thread, but you asked to continue the discussion elsewhere? Where can I go to see the updates?

I have an NVIDIA GTX 1070 Ti with the proprietary drivers installed on openSUSE Tumbleweed.

And I have screen tearing when I move windows around. Is this issue resolved? I know the patch caused a regression that affected other users.

Will KDE finally be smooth in 5.16?

Thanks in advance

SUSE has this patch reverted downstream.

Once the regression for the few PRIME users is fixed, hopefully that will change.

Thanks for the quick response.

Is there anywhere I can follow progress on SUSE? I'm guessing PRIME is for laptops. Alternatively, how would I go about applying this on my own?

Kind regards

Greg

Hi @davidedmundson, any chance this has been fixed now that we are on 5.16.1? I'm not a PRIME user, I just have one NVIDIA GTX card. I've tried to re-read this thread but don't understand much of it. I'm on openSUSE Tumbleweed. Has the patch been tested, or has any feedback been given on whether tearing in KWin is gone? Are we in smooth territory?