address race condition around setoperation
Closed · Public

Authored by sebas on Jun 1 2016, 12:04 AM.

Details

Summary

Use a timer to avoid catching configChanged signals after we set
changes.

The long version:

TL;DR: We have a race condition when the kscreen daemon starts. It looks
up a known config, applies it, and subsequently re-saves the config.
The same happens on config changes: it writes, then re-reads, and then
re-writes the config change.
I've managed to prevent this by adding a timer that avoids saving the
config as a direct reaction to our own config changes.

So what happens on kded5 startup after loading the kscreen2 module:

  • the kscreen config is requested and received
  • the kscreen daemon (KD) looks into its config directory for a suitable config file (a config file is identified by a combined hash of all screens attached, so it is unique per connected set of outputs)
  • KD usually finds a config
  • KD ignores configChanged events before it starts a KScreen::SetConfigOperation to apply the "known config"
  • SetConfigOperation returns after a while (say 100ms later)
  • we re-enable the change monitor
  • we receive a configChanged signal
  • we save the new config (usually to the existing config file)
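The sequence above can be replayed with a small model. This is only a sketch: DaemonModel and replayStartup are hypothetical names, not the real KScreen classes, and the model assumes that the backend's configChanged signal arrives after the set operation has reported that it finished.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the daemon; not the real KScreen code.
struct DaemonModel {
    bool monitoring = true;        // whether configChanged signals are acted on
    int saves = 0;                 // how often the config file was written
    std::vector<std::string> log;  // order of events, for inspection

    void applyKnownConfig() {
        monitoring = false;        // ignore signals while the set operation runs
        log.push_back("apply");
    }
    void setOperationFinished() {
        monitoring = true;         // re-enable the change monitor
        log.push_back("finished");
    }
    void configChanged() {
        if (!monitoring) {
            log.push_back("ignored");
            return;
        }
        ++saves;                   // the daemon reacts by saving the config
        log.push_back("save");
    }
};

// Replays the startup order from the list: the signal arrives after "finished".
inline DaemonModel replayStartup() {
    DaemonModel d;
    d.applyKnownConfig();
    d.setOperationFinished();
    d.configChanged();             // late echo of our own change
    assert(d.monitoring);          // the monitor is active again at this point
    return d;
}
```

In this model replayStartup() ends with one save, even though nothing but the daemon itself changed the config.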

I don't think this behavior is desirable. I don't see a reason why the
daemon should save its config right after applying it. I think this
causes more problems than we want, since the startup may overwrite the
user's config. The code in KD seems to want to avoid this as well: it's
already blocking configChanged signals during the SetOperation (which,
to be honest, may result in nightmarish behavior either way, so it might
be a kludge that falls short).

From libkscreen's perspective, SetConfigOperation::finished cannot
guarantee that all configChanged signals are already fired and that it's
safe to watch for new, independent changes now. At least on X11, we
simply don't know, and what we can do is wait a bit and cross our fingers
that we're not catching our own noise. The changed signal *may* come
from a re-request of the EDID information, but this is a bit hard to
track down, and not too useful anyway, since a changed EDID may affect a
large number of a screen's properties.
In the Wayland backend, that's a different story: we can prevent this
behavior at an earlier stage, so this timer is "probably not needed" (I
haven't tested that).

This effectively prevents KD from catching reactions to its own changes
and does not trigger saving the config file on every login. It still
reacts to changes from libkscreen, but will avoid re-saving the config a
few times. The timer may not be the neatest of solutions for this, but
it does help narrow down the problem and may be a last-resort measure.
Most importantly, it effectively avoids re-writing the config on startup
and when plugging/unplugging a monitor.

The timer value of 100ms is also used in kwin, which should make the
behavior (which is no problem in kwin) more solid.

CCBUG:346961
CCBUG:358011

Diff Detail

Repository
R104 KScreen
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
sebas updated this revision to Diff 4103. Jun 1 2016, 12:04 AM
sebas retitled this revision to address race condition around setoperation.
sebas updated this object.
sebas edited the test plan for this revision.
sebas added a reviewer: graesslin.
sebas added a subscriber: Plasma.
Restricted Application added a project: Plasma. Jun 1 2016, 12:04 AM
Restricted Application added a subscriber: plasma-devel.
graesslin requested changes to this revision. Jun 1 2016, 7:26 AM
graesslin edited edge metadata.

Given that you don't connect to the timer at all, I think the usage of QTimer is wrong here. QElapsedTimer seems like the better choice here.

The main change would be in configChanged where it would become:

if (m_changedBlockTimer->isValid() && !m_changedBlockTimer->hasExpired(100)) {
    // still active
    m_changedBlockTimer->start();
} else {
    // stop the timer
    m_changedBlockTimer->invalidate();
}
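To make the suggested branch easier to reason about, here is a minimal sketch that runs without Qt. BlockTimer is a hypothetical stand-in that mirrors only the start / isValid / hasExpired / invalidate surface of QElapsedTimer, with the current time passed in explicitly so the logic can be tested. handleConfigChanged and its "true means save" return value are assumptions for illustration; the snippet above does not show what the else branch does besides invalidating.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using Clock = std::chrono::steady_clock;
using Ms = std::chrono::milliseconds;

// Hypothetical stand-in with a QElapsedTimer-like surface; time is injected.
class BlockTimer {
public:
    void start(Clock::time_point now) { m_start = now; }
    void invalidate() { m_start.reset(); }
    bool isValid() const { return m_start.has_value(); }

    // Like QElapsedTimer::hasExpired: true once more than `timeout` has elapsed.
    bool hasExpired(Clock::time_point now, Ms timeout) const {
        assert(isValid());  // querying an invalid timer is a caller bug
        return now - *m_start > timeout;
    }

private:
    std::optional<Clock::time_point> m_start;
};

// Returns true when the change should be saved (assumed behavior), false
// when it is treated as an echo of the daemon's own set operation.
inline bool handleConfigChanged(BlockTimer &timer, Clock::time_point now,
                                Ms window = Ms(100)) {
    if (timer.isValid() && !timer.hasExpired(now, window)) {
        timer.start(now);   // still active: restart the window
        return false;
    }
    timer.invalidate();     // window over (or never armed): stop the timer
    return true;
}
```

The caller is expected to start the timer when it applies a config. Checking isValid() before hasExpired() keeps an unarmed timer from ever being queried.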
kded/daemon.cpp
87

Why delete manually? Either pass this as the parent on construction or use a QScopedPointer.

I see that it's already done like that for the other cases, but I don't think it's a good idea to copy bad practice in new code.
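As a rough illustration of the second option, the sketch below uses std::unique_ptr as a standard-library analog of QScopedPointer, so it compiles without Qt. Helper and DaemonSketch are hypothetical names, not the types in kded/daemon.cpp; the point is only that the owning class needs no manual delete in its destructor.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper type; counts live instances so cleanup is observable.
struct Helper {
    static inline int alive = 0;
    Helper() { ++alive; }
    ~Helper() { --alive; }
    Helper(const Helper &) = delete;
    Helper &operator=(const Helper &) = delete;
};

// Hypothetical owner; std::unique_ptr plays the role QScopedPointer would.
class DaemonSketch {
public:
    DaemonSketch() : m_helper(std::make_unique<Helper>()) {}
    // No user-written destructor: m_helper releases the Helper automatically.

private:
    std::unique_ptr<Helper> m_helper;
};

// Creates and destroys an owner, then reports how many helpers are left.
inline int aliveAfterScope() {
    {
        DaemonSketch d;
        assert(Helper::alive == 1);  // helper exists while the owner lives
    }
    return Helper::alive;
}
```

For QObject-derived members, passing the owner as the parent gives the same effect through Qt's parent-child ownership.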

This revision now requires changes to proceed. Jun 1 2016, 7:26 AM
sebas updated this revision to Diff 4135. Jun 1 2016, 2:47 PM
sebas edited edge metadata.
  • Use QElapsedTimer instead of QTimer
graesslin requested changes to this revision. Jun 1 2016, 2:51 PM
graesslin edited edge metadata.
graesslin added inline comments.
kded/daemon.cpp
190

Never call elapsed() on an invalid timer! The behavior is undefined.

195

I suggest invalidating the timer as I proposed earlier, as it frees up resources.

This revision now requires changes to proceed. Jun 1 2016, 2:51 PM
sebas updated this revision to Diff 4136. Jun 1 2016, 2:54 PM
sebas edited edge metadata.
  • --debug and invalidate
graesslin accepted this revision. Jun 1 2016, 2:55 PM
graesslin edited edge metadata.
This revision is now accepted and ready to land. Jun 1 2016, 2:55 PM
This revision was automatically updated to reflect the committed changes.