let's continue in debug code instead of returning from XRandRConfig::applyKScreenConfig
ClosedPublic

Authored by clutz on May 29 2017, 2:33 PM.

Details

Summary

In XRandRConfig::applyKScreenConfig there is some debug code for a situation that should not really happen (but that was seen in the past) in which currentMode is NULL. After printing the debug info, the method directly returned from applyKScreenConfig, omitting all the other potentially useful initializations code that comes afterwards.

From my POV this is wrong and we had situations in which desktop initialization works better if we don't return here. That's why I changed "return" into "continue".

Analysis of the underlying inconsistency:

The debugging led me from libkscreen to xrandr to libxrandr (XRRGetOutputInfo) to the kernel. Here I noticed the following things:

    
1) outputInfo = XRRGetOutputInfo(outputId) returns some outputInfo that dosn't contain
   the mode it should contain (concrete the ID 77). It was simply missing.
    
2) crtcInfo = XRRGetCrtcInfo(outputInfo->crtc) But this crtcInfo-Objekt does reference
   the mode 77: crtcInfo->mode is 77
    
3) XRRGetScreenResourcesCurrent() also thinks that mode 77 should be present
    
4) Because mode 77 was not in outputInfo, XRandROutput::updateModes could not re-add
   this mode to the map m_modes and XRandROutput::currentMode() had to return NULL.
   Nothing wrong in updateModes!

Because libxrandr in 1) also delegates the above mentioned requests directly to the CRTC-driver of the kernel, I came to the conclusion that the inconsistency behind this situation comes from the kernel. It was also hard to reproduce on side of the kernel since multiple requests returned different results (kind of sporadic).

As mentioned above, I could sporadically reproduce the inconsistency with kernel 3.13.0 and as I tried a newer kernel the issue seemed to be vanished. This is the point where I decided to not dig deeper into such an old kernel version.

So in conclusion for me the underlying bug seems to be a kernel bug and not a libkscreen-bug. Maybe in the meanwhile nobody will ever cross the debug code again. BUT in my concrete situation libkscreen seems to be able to recover things from this inconsistency and "continuing" instead of "returning" simply improves the behaviour of the complete system.

Test Plan

The concrete situation in which I regularily got this currentMode == NULL situation leads to the following bug:

    
- kernel 3.13.0
- physical pc with two displays, one 4:3 and the other 16:9 format
- forced the system to always start in clone-mode if there is not
  yet a user profile --> we patched kscreen therefore
- doing the following workflow:
    - log in --> sessions comes up in clone mode (as wished)
    - autorandr-gui --> "extended desktop - left"
    - press OK and DON't store the profile (so no change is stored)
    - log off
    - log in again
- The following things happen:
    - Now we run into this "currentMode == NULL" situation (message in log file)
    - the screens come up in clone mode (as the previous settings were not saved)
    - BUG: the resolutions are wrong (parts from one of the two screens are cut).
           applyKScreenConfig returned before it was finished.

With this chage, this bug no longer occures and the resolution is as expected set to the best commonly available resolution.

Diff Detail

Repository
R110 KScreen Library
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
clutz created this revision.May 29 2017, 2:33 PM
Restricted Application added a project: Plasma. · View Herald TranscriptMay 29 2017, 2:33 PM
Restricted Application added a subscriber: plasma-devel. · View Herald Transcript

but as you say, we should never be in here, that's the entire sentiment of the comment above here.

// Since we haven't been able to figure out the reason why
// this happens, we are adding this debug code to try to
// figure out how this happened.

As you can concretely reproduce it, can you look at fixing the actual issue?

clutz added a comment.Jun 1 2017, 6:59 AM

Thanks David for reviewing that. Oh, I already tried to find the underlying bug that causes the inconsistency. The debugging led me from libkscreen to xrandr to libxrandr (XRRGetOutputInfo) to the kernel. Here I noticed the following things:

  1. outputInfo = XRRGetOutputInfo(outputId) returns some outputInfo that dosn't contain the mode it should contain (concrete the ID 77). It was simply missing.
  2. crtcInfo = XRRGetCrtcInfo(outputInfo->crtc) But this crtcInfo-Objekt does reference the mode 77: crtcInfo->mode is 77
  3. XRRGetScreenResourcesCurrent() also thinks that mode 77 should be present
  4. Because mode 77 was not in outputInfo, XRandROutput::updateModes could not re-add this mode to the map m_modes and XRandROutput::currentMode() had to return NULL. Nothing wrong in updateModes!

Because libxrandr in 1) also delegates the above mentioned requests directly to the CRTC-driver from the kernel, I came to the conclusion that the inconsistency behind this situation comes from the kernel. It was also hard to reproduce on side of the kernel since multiple requests returned different results (kind of sporadic).

As mentioned above, I could sporadically reproduce the inconsistency with kernel 3.13.0 and as I tried a newer kernel the issue seemed to be vanished. This is the point where I decided to not dig deeper into such an old kernel version.

So in conclusion for me the underlying bug seems to be a kernel bug and not a libkscreen-bug. Maybe in the meanwhile nobody will ever notice the above situation again. BUT in my concrete situation libkscreen seems to be able to recover things from this inconsistency and "continuing" instead of "returning" simply improves the behaviour of the complete system. So for me this is a clear improvement and I see no reason to not apply it. Do you?

davidedmundson accepted this revision.Jun 1 2017, 7:24 AM

That's exactly the sort of thing I was hoping for. (the analysis, not there being a kernel bug).

Can you update the comment and/or the bug report with that information so it's not lost.

This revision is now accepted and ready to land.Jun 1 2017, 7:24 AM
clutz edited the summary of this revision. (Show Details)Jun 1 2017, 8:29 AM
clutz edited the summary of this revision. (Show Details)
clutz edited the summary of this revision. (Show Details)
clutz added a comment.Jun 1 2017, 8:36 AM

Hi David, there's no bug report for that in the KDE issue tracker (but in our custom one). I adjusted the patch description above. Is that ok for the kind of documentation that you want to have? (I would prefer to send you a new git-patch that contains a complete description, but phabricator seems to drop any patch descriptions...)

This revision was automatically updated to reflect the committed changes.