[applet] Let specify a version for applets private plugins
AbandonedPublic

Authored by anthonyfieroni on Feb 19 2017, 7:43 AM.

Details

Reviewers
davidedmundson
sitter
dfaure
Group Reviewers
Plasma
Summary

This is an attempt to start versioning plrivate applet plugins, as a beginning i provide only one example if it's accepted i'll provide for others too. We use a leaf classes in private plugins since we not expose this abi to others, but consider long-running application, like plasmashell, is attempt to be update to newer version e.g. 5.9.2 to 5.9.3 and we not provide any abi compability from 5.9.2 to 5.9.3 for those leaf classes. Package manager updates these shared libs and reload ld caches, so depending on abi breakage this will cause a crash in running plasmashell cause nmapped shared libraries will be updated (this is from my Linux knowledge and researching ldconfig -v -n as use for updating ld.so.cache). We don't have such a problems with applications cause they depends on frameworks who have strict abi compliance.
My mind thoughts can be completely in wrong direction, you can provide better explanation :)

Diff Detail

Repository
R119 Plasma Desktop
Lint
Lint Skipped
Unit
Unit Tests Skipped
anthonyfieroni created this revision.Feb 19 2017, 7:43 AM
Restricted Application added a project: Plasma. · View Herald TranscriptFeb 19 2017, 7:43 AM
Restricted Application added a subscriber: plasma-devel. · View Herald Transcript

That's not how the kernel works. If it opens a file and then the file gets deleted/replaced the original file will only actually be deleted when any open references are dropped.

Let try to better clarification, we breaking abi like this (https://cgit.kde.org/plasma-desktop.git/commit/?h=Plasma/5.9&id=dc17f78ebcc76600040a8d2ef8c57b9a41a8d06e) running update from 5.9.1 to 5.9.2 cause a crash in plasmashell, i don't save backtrace but crash was there) in my case at application stop (logout)

dfaure edited edge metadata.Feb 19 2017, 10:04 AM

I was always told "plugins should not get versioned at the .so level". Not sure why exactly. The more usual solution was version numbers in .desktop files, to query for things with the right version. I guess this can be transposed to json since Qt reads the json data without really loading the .so file.

David Ed: you're right, but think of the case where you open a new plugin (in a long running process) which itself links to a freshly upgraded shared lib - I'm not sure what happens but it's one of these two cases:

  1. it loads the new shared lib next to the old shared lib -> duplicate symbols, crash.
  2. the old shared lib keeps being unused because "already loaded", and the new plugin fails to find its symbols.

Overall there is no way to actually load a new plugin in an old plasma, if the underlying APIs are incompatible. So the only thing you can do is forbid loading incompatible plugins by using a version check (see above), or not support the use case at all (i.e. tell users to restart plasma after upgrading).

We've had the same with kdeinit5 for ever. You upgrade KF5 (or kdelibs before) and apps, then try to run an app that needs a new symbol in kio, the long-running kdeinit5 doesn't have it, boom, undefined symbol. The only solution has always been: restarting kdeinit5 after upgrading kdelibs/KF5.

sitter edited edge metadata.Feb 19 2017, 11:43 AM
In D4667#87551, @dfaure wrote:

I was always told "plugins should not get versioned at the .so level". Not sure why exactly.

Are you sure this wasn't simply about plugins not being in public library directories? I know some people don't like that. I find it hard to find a reason why adding a bunch of digits and dots would be advised against though. That being said, I don't think the proposed solution solves anything as these plugins supposedly would be loaded through QLibrary and so that call would need to get a version specification when loaded... I have no knowledge over Plasma's way of using the qtquickengine though, so maybe I am talking nonesense and that is already done.

The more usual solution was version numbers in .desktop files, to query for things with the right version. I guess this can be transposed to json since Qt reads the json data without really loading the .so file.

Options as I see it are:

  • An approach I saw in (I think) grantlee is installing the plugins to tightly versioned directories (e.g. $prefix/lib/grantlee/1.2.3/foo.so) and then making sure that is a qlibrary search path.
  • Plugins loaded by a library or application could use JSON metadata to establish version compatibility.
  • Lastly one could probably use soversions as suggested in the diff here combined with requesting specific versions from QLibrary.

The latter two options actually may have negative performance impact on software with many plugins since one needs to iterate all potential directories (in JSON's case even fopen all files). So, the subdirectory approach is, to me, the most elegant and efficient way to version plugins.

Now, all that said, I think this needs some discussion by the Plasma team on the principle of it. Ultimately QML module or plugin version-locking gives you nothing. Sure, say you lock taskmanagerplugin in step with plasmashell it probably prevents crashing by loading incompatible C++ bits. BUT the user can still add (potentially new) applets to the shell which in turn is then broken because of the module Plasma refuses to load because of version-locking. This is similar to a problem we had with the logout screen where loading the (new) QML assets upon logout would run into new code paths that the old plasmashell binary cannot handle and so nothing is rendered.

To put it bluntly: any application that lazy-loads anything is simply broken after a "live update" with API changes. To varying degrees, sure, broken none the less. Whatever band-aid is put on top doesn't matter, the application remains partially non-functional from a user POV.
The one and only way I can think of to prevent this is to have old and new versions installed side-by-side until a reboot occurs otherwise you'll have cases where you need to load new stuff (which you shouldn't) or can't load old stuff (which was replaced by new stuff). That would however require you to make distros do that (I find that unlikely to happen) and write lots of code to fully isolate the file trees. All plugins, all QML modules, all runtime assets (pictures what have you), there must not be any overlap really. You'll essentially have to make a Plasma bundle ;)
Whether or not that is worth it to Plasma and to which degree, is the question we need to answer first.

My 2 cents are: if the breakage does not impair in a meaningful way then the problem is unfortunate but simply the cost of doing live updates. So if the logout dialog is broken that is a big deal, if plasmashell crashes once I'll argue it doesn't matter because it would restart anyway and at that point be compatible with its own plugins again (NB: it may now be incompatible with surrounding services such as kactivitymanagerd ;))

Sorry for the rambling. This is a vast and tricky problem :)

(Also see offline updates https://phabricator.kde.org/T3137 for the distribution approach of avoiding this by only conducting updates on start-up. As mentioned, bundling all of plasma would also solve this, albeit not really for distributions.)

@sitter: BUT the user can still add (potentially new) applets to the shell which in turn is then broken because of the module Plasma refuses to load because of version-locking.

I think that's not a problem, old plasma can load only old plugins, user decide when and why to restart your shell or session.

@sitter: All plugins, all QML modules, all runtime assets (pictures what have you), there must not be any overlap really. You'll essentially have to make a Plasma bundle ;)

Hope i will never see such a OSX way :)

@sittter: My 2 cents are: if the breakage does not impair in a meaningful way then the problem is unfortunate but simply the cost of doing live updates. So if the logout dialog is broken that is a big deal, if plasmashell crashes once I'll argue it doesn't matter because it would restart anyway and at that point be compatible with its own plugins again (NB: it may now be incompatible with surrounding services such as kactivitymanagerd ;))

Crashing on logout isn't a big deal, but we can prove an acceptable solution to KDE software QA :)

David Ed: you're right, but think of the case where you open a new plugin (in a long running process) which itself links to a freshly upgraded shared lib - I'm not sure what happens but it's one of these two cases:

That's a good question; though it seems somewhat different to what Anthony was describing. You're suggesting things will crash if the ABI of things the plugin links to change. The example given in the cgit link above is an ABI change of the plugin itself. Also it's described as crashing "when upgrading" which I don't get.

But for your case I don't think taskmanagerplugin is a problem (or could ever be) an ABI break of libtaskmanager which is used by both the taskamangerplugin and the pagerplugin could be.

anthonyfieroni added a comment.EditedFeb 20 2017, 5:11 AM

OK, upgrading is just an override existing library, no? So lets make 2 tests:

  1. Change in taskmanagerplugin that isn't abi -> force replace the lib when plasmashell works -> nothing happened
  2. Make an abi change -> force replace it when plasmashell works -> crash

You could just build an autotest maybe? 👅

So, let's try with the previously mentioned https://cgit.kde.org/plasma-desktop.git/commit/?h=Plasma/5.9&id=dc17f78ebcc76600040a8d2ef8c57b9a41a8d06e

I am doing:

git clone --branch Plasma/5.9 kde:plasma-desktop
git revert dc17f78ebcc76600040a8d2ef8c57b9a41a8d06e
cmake && make && make install
kquitapp plasmashell
git revert HEAD # revert the revert to apply the change again
make && make install

I do not observe a crash with this. Nor would I expect one because, as davidedmundson pointed out, the new plugin library would not get loaded until plasmashell is restarted. Maybe I am testing incorrectly though.

But to verify the file loading let's check the process map. This is loaded at first:

$ pmap -X -p `pidof plasmashell` | grep private/taskmanager/libtaskmanagerplugin
    7f1723dd8000 r-xp  00000000  00:15 10808567     152    152    152        152         0              0              0               0    0       0      0 /usr/lib/x86_64-linux-gnu/qt5/qml/org/kde/plasma/private/taskmanager/libtaskmanagerplugin.so
    7f1723dfe000 ---p  00026000  00:15 10808567    2048      0      0          0         0              0              0               0    0       0      0 /usr/lib/x86_64-linux-gnu/qt5/qml/org/kde/plasma/private/taskmanager/libtaskmanagerplugin.so
    7f1723ffe000 r--p  00026000  00:15 10808567       4      4      4          4         4              0              0               0    0       0      0 /usr/lib/x86_64-linux-gnu/qt5/qml/org/kde/plasma/private/taskmanager/libtaskmanagerplugin.so
    7f1723fff000 rw-p  00027000  00:15 10808567       4      4      4          4         4              0              0               0    0       0      0 /usr/lib/x86_64-linux-gnu/qt5/qml/org/kde/plasma/private/taskmanager/libtaskmanagerplugin.so

Notably, the file is inode 10808567. So, I bring back the change and install again:

$ pmap -X -p `pidof plasmashell` | grep private/taskmanager/libtaskmanagerplugin
    7f1723dd8000 r-xp  00000000  00:15 10808567     152    152    152        152         0              0              0               0    0       0      0 /usr/lib/x86_64-linux-gnu/qt5/qml/org/kde/plasma/private/taskmanager/libtaskmanagerplugin.so (deleted)
    7f1723dfe000 ---p  00026000  00:15 10808567    2048      0      0          0         0              0              0               0    0       0      0 /usr/lib/x86_64-linux-gnu/qt5/qml/org/kde/plasma/private/taskmanager/libtaskmanagerplugin.so (deleted)
    7f1723ffe000 r--p  00026000  00:15 10808567       4      4      4          4         4              0              0               0    0       0      0 /usr/lib/x86_64-linux-gnu/qt5/qml/org/kde/plasma/private/taskmanager/libtaskmanagerplugin.so (deleted)
    7f1723fff000 rw-p  00027000  00:15 10808567       4      4      4          4         4              0              0               0    0       0      0 /usr/lib/x86_64-linux-gnu/qt5/qml/org/kde/plasma/private/taskmanager/libtaskmanagerplugin.so (deleted)

As indicated by pmap we still have the same library loaded and as indicated the backing file is now deleted from disk.

Checking the newly installed file we'll also see it having a different inode.

$ ls -li /usr/lib/x86_64-linux-gnu/qt5/qml/org/kde/plasma/private/taskmanager/libtaskmanagerplugin.so
10808655 -rw-r--r-- 1 root root 2259408 Feb 20 09:30 /usr/lib/x86_64-linux-gnu/qt5/qml/org/kde/plasma/private/taskmanager/libtaskmanagerplugin.so

So, clearly the new library was not loaded, and I honestly fail to imagine a scenario where it would get loaded even though we have the file loaded already.

I'm a bit pretty harsh, add other private variable and handcopy newly plugin e.g. sudo cp libtaskmanagerplugin.so $(locate private/taskmanager/libtaskmanagerplugin.so)

Provide a diff so we can reproduce?

hein added a subscriber: hein.Feb 20 2017, 9:45 AM

It's worth keeping in mind KDE stuff has never supported runtime updates. Consider e.g. kconf_update, which runs while apps run, and apps can undo its changes on quit.

diff --git a/applets/taskmanager/plugin/backend.h b/applets/taskmanager/plugin/backend.h
index 87037c6..ef47344 100644
--- a/applets/taskmanager/plugin/backend.h
+++ b/applets/taskmanager/plugin/backend.h
@@ -104,6 +104,7 @@ class Backend : public QObject
     private:
         void updateWindowHighlight();
 
+        int *abi_break = new int{1};
         QQuickItem *m_taskManagerItem;
         QQuickItem *m_toolTipItem;
         QQuickWindow *m_groupDialog;
make -j4
sudo cp libtaskmanagerplugin.so $(locate private/taskmanager/libtaskmanagerplugin.so)

I don't want to make a crashes just for fun, i try to make hard to happen :)

Still does not crash.

Move mouse over taskmanager :)

anthonyfieroni added a comment.EditedApr 27 2017, 3:24 AM

It's interesting to understaind why it's happened, after couple of suspend / resume (in RAM) then pacman -S plasma-workspace (even with same version) then kquitapp5 plasmashell -> crash

Application: Plasma (plasmashell (deleted)), signal: Segmentation fault
[Current thread is 1 (LWP 25926)]

Thread 7 (LWP 10156):
#0  0x00007f097ecbfffd in ?? ()
#1  0x0000000000000000 in ?? ()

Thread 6 (LWP 26190):
#1  0x0000000000000000 in ?? ()

Thread 5 (LWP 26183):
#1  0x000000000290f8e0 in ?? ()
#2  0x000000000290f9e0 in ?? ()
#3  0x0000000000000000 in ?? ()

Thread 4 (LWP 26063):
#1  0x00000000021ad560 in ?? ()
#2  0x00000000021bc1c0 in ?? ()
#3  0x0000000000000000 in ?? ()

Thread 3 (LWP 26016):
#1  0x00007f0978638908 in ?? ()
#2  0x0000000000000000 in ?? ()

Thread 2 (LWP 25953):
#1  0x0000000000000000 in ?? ()

Thread 1 (LWP 25926):
#1  0x0000000000000000 in ?? ()

Crash report every time is on deleted process, scenarios to me:

  1. Kernel bug
  2. Some newly (plugin / lib and so on) tries to notify or something plasmashell

Testing done (no other updates between tests):

  1. Suspend / resume -> pacman -Su plasma-workspace -> kquitapp5 plasmashell -> not crash (5/5)
  2. But continuous suspend / resumes (again i don't perform any updates to be sure when applying same plasma-workspace version) -> pacman -Su plasma-workspace -> kquitapp5 plasmashell -> crash with above delete'd backtrace.

Any ideas ?

I perform restart once per month (sometimes twice)

[toni@toni-pc ~]$ inxi -I
Info:      Processes: 192 Uptime: 11 days Memory: 1265.2/3851.8MB
           Client: Shell (bash) inxi: 2.3.8
sitter requested changes to this revision.Jun 19 2018, 10:02 AM

-1 from me until someone finds out what exactly is wrong.

This revision now requires changes to proceed.Jun 19 2018, 10:02 AM
anthonyfieroni abandoned this revision.Jun 19 2018, 10:35 AM

Let's abandon it, i'll return to it, if still needed.