Tentative fix for bug 372305
AcceptedPublic

Authored by jpalecek on Mar 8 2020, 1:33 AM.

Details

Reviewers
davidedmundson
ekurzinger
Group Reviewers
KWin
Summary

After much testing, I found that although KWin resets the graphic system and reallocates the buffer and all, for some strange reason NVidia unmaps the memory on first GLVertexBuffer::draw(). As a wild guess, I tried working it around by syncing the GL state during the reset and it seems it works (although I will still test).

I personally think this is a bug in NVidia's drivers. I've got traces of syscalls and KWin functions (not OpenGL calls though). But I don't have much experience with OpenGL. Maybe someone could help to get this reported?

BUG: 382305

Test Plan
  1. Stress the window manager by constantly moving windows (you'll have to do it programmatically). Eg.:
( for((;;)); do  qdbus org.kde.kopete /kopete/MainWindow_1 org.qtproject.Qt.QWidget.minimumHeight 100; qdbus org.kde.kopete /kopete/MainWindow_1 org.qtproject.Qt.QWidget.maximumHeight 100; qdbus org.kde.kopete /kopete/MainWindow_1 org.qtproject.Qt.QWidget.maximumHeight 16000000; for i in $(seq 1 10); do qdbus org.kde.kopete /kopete/MainWindow_1 org.qtproject.Qt.QWidget.minimumHeight $((i*100)); done; done ) &
  1. Invoke graphics reset by periodic switching into text mode, or by suspending/resuming the computer:
for i in $(seq 1 60); do [ -n "$(pidof kwin_x11)" ] && rtcwake -m mem -s 30; sleep 20; done

Diff Detail

Repository
R108 KWin
Branch
wip
Lint
Lint ErrorsExcuse: ?
SeverityLocationCodeMessage
Errorlibkwineffects/kwineffects.h:392CppcheckunknownMacro
Unit
No Unit Test Coverage
Build Status
Buildable 23414
Build 23432: arc lint + arc unit
jpalecek created this revision.Mar 8 2020, 1:33 AM
Restricted Application added a project: KWin. · View Herald TranscriptMar 8 2020, 1:33 AM
Restricted Application added a subscriber: kwin. · View Herald Transcript
jpalecek requested review of this revision.Mar 8 2020, 1:33 AM
jpalecek edited the summary of this revision. (Show Details)Mar 8 2020, 2:45 AM
jpalecek edited the test plan for this revision. (Show Details)
jpalecek added reviewers: davidedmundson, KWin.
jpalecek edited the summary of this revision. (Show Details)Mar 8 2020, 3:05 AM

Not performing any action till the next call makes sense, kinda like the delay in the first call after a swapBuffers(), but even so surely the order of events remains the same and we realloc before usage? It doesn't seem to make sense to me.

But this fix is clean and putting in an explicit fence on cleanup doesn't sound like a bad idea.

In lieu of any other comments, lets go for it.

This revision is now accepted and ready to land.Mar 16 2020, 11:39 AM

Not performing any action till the next call makes sense, kinda like the delay in the first call after a swapBuffers(), but even so surely the order of events remains the same and we realloc before usage? It doesn't seem to make sense to me.

That's true. I still see this more as a kludge than a real fix, since IMO that is up to Nvidia. Moreover, I've found that while it does reduce the amount of crashes about 10 times, it doesn't unfortunately fix them all (some happen when the computer is suspended in the middle of drawing, without a chance to do cleanup/initGL - however the unmapping of the memory which causes the crash happens some time later and pretty synchronously, although I'm not yet sure on which call it happens).

But this fix is clean and putting in an explicit fence on cleanup doesn't sound like a bad idea.

I would rather wait on this, there's no need to be hasty. But if there are other comments on this, I'd like to read them (right now, I can't see anything).