Killing KInit With Fire
Open, Needs Triage, Public

Description

See the "KInit - Current state and benchmarks" thread started by @davidedmundson on kde-frameworks-devel.

The current situation is:

  • the performance gain is way less impressive than it used to be
  • the tests on slow hardware still give a little gain, but far from impressive either
  • Albert's really old machine seems to gain a lot. Why? This needs to be investigated. Is it maybe measuring a debug build that does a lot more disk I/O due to larger library sizes?
  • most applications don't do the necessary work to benefit from kdeinit

    almost no code path nowadays really uses it (as far as startup is concerned, it goes through kdeinit only from the file explorer, and you have to try to open a file for which the corresponding application did the work; the chances of that happening are very low)
  • no one complained that things got slow

Based on the above, the conclusion is: let's phase it out and move it to unmaintained. If that were to become a real problem in the future, we could team up with an existing solution or resurrect it. Nothing indicates that an effort to generalize it again is currently needed.

Alternatives or related tools

https://github.com/facebookincubator/BOLT

-Wl,--as-needed but disable that for KCrash and ICU and potentially others.

Actually profile startup time to investigate which libs are triggering large overhead at _dl_start time
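As a possible starting point for that profiling, glibc's dynamic loader can report its own startup cost. This is a minimal sketch, assuming a glibc system (LD_DEBUG is a glibc ld.so feature); the helper name loader_stats is made up for illustration. It reports totals for the whole process (time in the loader, relocations, object loading), not a per-library breakdown:

```shell
# Hypothetical helper: run a command once with LD_DEBUG=statistics set.
# glibc's loader writes its statistics to stderr, so stderr is kept and the
# command's own stdout is discarded. Anything else the program prints to
# stderr will show up as well.
loader_stats() {
    LD_DEBUG=statistics "$@" 2>&1 >/dev/null
}

# Example (assumes konsole is installed and --version exits early):
#   loader_stats konsole --version
```

Comparing the output for an application against a trivial binary gives a rough idea of how much of the startup time is spent in the loader at all.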

mwolff created this task. Nov 23 2019, 11:45 AM
nicolasfella moved this task from Backlog to Metatasks on the KF6 board. Nov 25 2019, 9:27 PM
nicolasfella moved this task from Metatasks to In Progress on the KF6 board. Nov 26 2019, 5:50 PM

almost no code path nowadays really uses it (as far as startup is concerned, it goes through kdeinit only from the file explorer, and you have to try to open a file for which the corresponding application did the work; the chances of that happening are very low)

Just to super pedantically clarify that.

As far as plasma startup is concerned it goes through kdeinit only for kded and kcminit_startup, but not the rest of autostart with plasmashell, ksmserver etc.

As far as later application launching is concerned, it goes through kdeinit only from the file explorer and you have to try to open a file for which the corresponding application did the work, chances it happens are very low.

Albert's really old machine seems to gain a lot. Why? This needs to be investigated. Is it maybe measuring a debug build that does a lot more disk I/O due to larger library sizes?

I did have a look via ssh. They weren't debug builds. Supposedly they have the linker flags --as-needed already. I could reproduce the potential gains with dolphin. A lot of regular app startup time was spent dealing with fonts, which is one area kdeinit supposedly helps with. Though I was unable to confirm that this was where kdeinit saved time.

This raises a question: what do we want to do with KToolInvocation?

KToolInvocation effectively does two tasks:

  • Act as a way of starting processes. Rather like KRun, but very explicitly tied to klauncher and at a lower tier.
  • Provide helper methods for launching email clients, browsers and terminals that follow the user preferences.

It's not especially well used.
Generally KRun is better at the first task. It's more portable, it handles StartupIDs properly, and can do arguments and files better.

The helper methods I think aren't too bad. IMHO we should move them to the pending new KRun so there's one app-launching implementation.

Found something interesting:

I removed kdeinit5_startup from startplasma.

In theory, going into a kinit path should still call ensureKdeinitRunning and work.

In practice when it is auto-activated something is wrong and klauncher doesn't work properly. I haven't fully investigated what, I ended up having to put the kdeinit5 startup back into startplasma again.

That implies there has always been a problem with kinit for code that uses KToolInvocation on GNOME, unnoticed for ages :/

Subtasks: we want to remove kinit from the few places that do use it correctly.

Excluding ioslaves for now that's:
https://lxr.kde.org/ident?_i=kdemain&_remember=1

volkov added a subscriber: volkov. Apr 27 2020, 5:10 PM

Albert's really old machine seems to gain a lot. Why?

Does it use SysV-style hash table instead of GNU-style hash table for ELFs?

sandsmark added a subscriber: sandsmark. Edited Jul 16 2020, 2:32 PM

FWIW, kdeinit still provides a very noticeable improvement here as well.

Even with konsole running in single-process mode, it makes the difference between instant-appear and a noticeable wait (on an X1 Carbon 2019 model). I have profiled konsole to death and back; the only thing that helps is kdeinit or ripping out the large dependencies that lddtree finds (re: "Actually profile startup time to investigate which libs are triggering large overhead at _dl_start time": it's death by a thousand papercuts).

And existing things like prelink won't help with e.g. fonts. There isn't really a replacement for doing what zygote or kdeinit does (otherwise I guess Google would have replaced the zygote, since it makes a lot of things harder, like proper ASLR).

But I think it would be a good idea to update the libraries kdeinit loads: it has mapped ~50 here, and konsole is around 250-300 (quickly looking at /proc/foo/maps).
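A rough way to reproduce that kind of count, as a sketch assuming Linux /proc. The function name count_mapped_libs is made up here, and it counts distinct backing paths that look like shared objects, which may not match exactly how the numbers above were obtained:

```shell
# Hypothetical helper: count the distinct shared objects listed in a
# /proc/<pid>/maps-style file. Field 6 of each line is the backing path;
# only paths with ".so" at the end or followed by a version suffix are kept.
count_mapped_libs() {
    awk '$6 ~ /\.so($|\.)/ { print $6 }' "$1" | sort -u | wc -l
}

# Example (assumes a running konsole; pgrep -n picks the newest match):
#   count_mapped_libs "/proc/$(pgrep -n konsole)/maps"
```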

Does it use SysV-style hash table instead of GNU-style hash table for ELFs?

GNU style here:

[16:21:45] thulcandra: ~/ readelf -S /usr/bin/konsole   | grep hash
[ 4] .gnu.hash         GNU_HASH         0000000000000308  00000308

And I also build with --as-needed.

And semi-related, is BOLT different from plain old PGO?

edit: ... now I see that kdeinit5_wrapper now links against KF5DBusAddons, which makes the entire point of it useless. There's a very good reason it didn't use dbus.

And existing things like prelink won't help with e.g. fonts

I think an interesting piece of profiling would be whether any potential boost is actually from linking or whether it is from the extra kdeinit things like fonts. If it's mostly fonts maybe we can look into alternative solutions at a binary level.

Can you also confirm how you're starting konsole?

sandsmark added a comment. Edited Jul 16 2020, 3:13 PM

And existing things like prelink won't help with e.g. fonts

I think an interesting piece of profiling would be whether any potential boost is actually from linking or whether it is from the extra kdeinit things like fonts. If it's mostly fonts maybe we can look into alternative solutions at a binary level.

The only thing showing up in perf when forcing a quick exit early in main() is ld and related symbol-resolution functions.

I also confirmed that this was the issue by doing the aforementioned ripping out of dependencies, which improved the launch time. But I like having all the konsole features that those dependencies provide. :-)

Can you also confirm how you're starting konsole?

kdeinit5_wrapper konsole

But as I mentioned in my comment above, the wrapper now suddenly links against dbusaddons, which means it loads in more than five times the number of libraries and it's much slower than earlier. The entire point of a custom protocol there was to avoid having a large amount of dependencies (AFAIK, it came before my time).

edit: did some quick checking: wrote a tiny C wrapper that just calls execvp() on whatever it's given after printing CLOCK_MONOTONIC_RAW, and put the same clock_gettime() + printf code in kdeinit5_wrapper and in konsole's kdemain().

the diff in timestamps is ~20ms when using kdeinit5, and ~170ms when just using the C wrapper.

and since I have a pretty minimal setup (so no breeze or plasma-integration, which pull in a bunch of extra libraries), and a fairly modern machine (with an ssd), I'd say that is a significant difference.

edit2:
the (very ugly) code/patches to reproduce, btw:
http://ix.io/2rN8
http://ix.io/2rN9
http://ix.io/2rNa

davidedmundson added a comment. Edited Jul 17 2020, 2:54 PM

FWIW, updated link to my timer is at https://invent.kde.org/davidedmundson/inittimer from the original email thread linked at the top.

Running mine with konsole I get:

RESULT : DaveTest::testQProcess():

367 msecs per iteration (total: 367, iterations: 1)

RESULT : DaveTest::testKInit():

326 msecs per iteration (total: 326, iterations: 1)

which shows ~40ms faster with kinit

It's a very different test: it creates a Wayland server, then measures the time until a window appears with the two launch approaches.
My rationale being partly to get a sense of the overall picture of how it affects the grand scheme of things and also to try and take the font stuff into account.

I'll try and run yours, in theory I should see the same 40ms difference.

Edit: And with your tests
exec: 33ms
kdeinit: 7ms

which doesn't quite add up, but sounds plausible.

I wish we could do a benchmark with Prelink. Then we would know which area the gain comes from.

We could kill it regardless of regressions in performance. That could simplify plasma significantly

BTW, using -Bsymbolic-functions in KF5 could probably reduce startup times: https://wiki.qt.io/Performance_Tip_Startup_Time

ervin moved this task from In Progress to Needs Input on the KF6 board. Mar 27 2021, 2:06 PM
ervin moved this task from Needs Input to In Discussion on the KF6 board. Mar 27 2021, 2:33 PM
dfaure added a subscriber: dfaure. Edited Mar 27 2021, 2:59 PM

Cleanups needed: removing the last uses of kf5_add_kdeinit_executable ==> T14298

dfaure added a comment. Edited Mar 27 2021, 3:45 PM

Meeting notes from the KF6 sprint:

It's confirmed that we want to kill it. Completely, not even keeping the support for launching slaves.
Using kdeinit/klauncher for kioslaves allows passing them between processes. It was useful for HTTP (e.g. click on an HTTP link in kmail, it opens gwenview, which reuses the slave on hold). But we don't need this anymore, now that HTTP is always sent to a web browser (see componentchooser).
Performance impact on slave launching seems negligible or even better with KDE_FORK_SLAVES, as measured by perf stat --repeat 100 kioclient5 ls . >/dev/null with and without KDE_FORK_SLAVES.
Without it, we have to talk DBus to klauncher and wait for a reply, which maybe compensates for the time saved by saving relocations.
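The comparison described above can be repeated with a small helper. This is a sketch only: with_and_without is a made-up name, and it assumes an env that supports -u (GNU coreutils does); perf and kioclient5 are the tools named in the notes above.

```shell
# Hypothetical helper: run the same command twice, first with the named
# environment variable unset, then with it set to 1. Labels go to stderr
# so they appear next to perf stat's report, which is also on stderr.
with_and_without() {
    var=$1
    shift
    echo "== $var unset ==" >&2
    env -u "$var" "$@"
    echo "== $var=1 ==" >&2
    env "$var=1" "$@"
}

# Example, using the measurement from the notes:
#   with_and_without KDE_FORK_SLAVES perf stat --repeat 100 kioclient5 ls . >/dev/null
```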

The still useful parts of KToolInvocation should be redesigned as jobs on top of ApplicationLauncherJob, like MailClientLauncherJob, TerminalLauncherJob etc. Maybe in KIO for now, then it can all move down together for KF6 => moving that to T12185

sitter added a subscriber: sitter. Mar 29 2021, 6:37 AM

I've been thinking that it'd make sense to globally set KDE_FORK_SLAVES for neon's unstable edition already to get broader test coverage. Any objections?

Given that this is the default mode in which we use them on other platforms, I would love to see this tested more, though I am not able to grasp the full impact of this.

Yes that would make sense, as a first step before I toggle it for good in KIO.
In theory it should all be fine :)

Email sent to kde-frameworks-devel ML, to have more testers (in case someone who follows that ML didn't attend the sprint) https://mail.kde.org/pipermail/kde-frameworks-devel/2021-March/116765.html

OK, I've had KDE_FORK_SLAVES=1 exported for a couple of months now, I didn't see any issues.

(Apart from having to change how I attached gdb to a kio slave https://community.kde.org/Guidelines_and_HOWTOs/Debugging/Debugging_IOSlaves#If_KDE_FORK_SLAVES_is_set).

Assuming kinit will be gone for kf6, what would be the replacement for org.kde.klauncher dbus interface?

For example, how to replicate the dbus call below?

qdbus org.kde.klauncher5 /KLauncher org.kde.KLauncher.exec_blind /opt/kde5/bin/kate ""

GNOME has org.gnome.Shell.AppLauncher.Launch() or org.gtk.Application.Activate. Is there any such dbus interface left in KDE after kinit removal?

Where does this situation come up, such that clients need to do this?

davidedmundson added a comment. Edited Jun 17 2022, 10:31 PM

Sounds more like users working around the limited API of kwin's scripting. No need to use DBus there. If that's the only reason, I'll export something more suited there. I need a QML wrapper for Plasma 6 anyway.

kwin scripting was an example use case (perhaps the most popular), but the generic one can be summed up as "run command through dbus".

Personally I use it for an app which lives in a container and can't interact with anything on the host directly, but since it has access to dbus it can execute everything through the org.kde.klauncher5 interface.

There is the org.kde.krunner interface, which could be used instead, but it requires passing a QVariantMap as one of the args, and this can't be done from the command line.

Sandbox exploitation seems a reason to not have this available :/

One option on dbus is systemd: a call to the manager to start a transient service. There's a code path for that in kio's processrunner if you need something to copy.

A proper sandbox would just block dbus or filter it, so I called it a container, without security connotations :) It's not like klauncher is the only dbus interface that has exploitation potential. For my needs it was a perfect fit that improves the workflow.

Do you mean using something like systemd-run through dbus? Could you give an example command for that?
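For reference, a sketch of what such a call could look like from the command line, assuming a systemd user session and the busctl tool. The helper name launch_transient and the generated unit name are made up for illustration, and this is not necessarily how kio's processrunner does it. It calls StartTransientUnit on the systemd manager with a single ExecStart property:

```shell
# Hypothetical helper: start a command as a transient systemd user service
# over dbus. Pass an absolute path as the first argument; the remaining
# arguments are handed to the command. The caller's environment is not
# passed along.
launch_transient() {
    unit="launch-$$-$(date +%s).service"
    # Signature: name, mode, properties a(sv), auxiliary units a(sa(sv)).
    # ExecStart is an array of (path, argv, ignore-failure) entries.
    busctl --user call \
        org.freedesktop.systemd1 /org/freedesktop/systemd1 \
        org.freedesktop.systemd1.Manager StartTransientUnit \
        'ssa(sv)a(sa(sv))' "$unit" fail \
        1 ExecStart 'a(sasb)' 1 "$1" "$#" "$@" false \
        0
}

# Example:
#   launch_transient /opt/kde5/bin/kate
```

Where the systemd-run tool itself is available, `systemd-run --user /opt/kde5/bin/kate` should do roughly the same thing.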

alex added a subscriber: alex. Jun 20 2022, 3:50 PM

There is org.kde.krunner interface which could be used instead but it requires passing

I don't see what KRunner has to do with this. The DBus interface is to show KRunner, show it with the clipboard content or show it with a specified text.

@alex It was proposed to do something like the below in a C or Python wrapper:

callDBus('org.kde.krunner', '/org/kde/krunner', 'org.kde.KDBusService.CommandLine', ['/usr/bin/konsole'], '/home/', 'what should I put here?');

I don't know if it was successful though but some users can be desperate 😄

https://old.reddit.com/r/kde/comments/uvbt3i/kwin_how_to_run_a_shell_command_with_qdbus/i9l90lu/

vkrause moved this task from Waiting on KF6 Branching to Done on the KF6 board. Feb 21 2023, 4:35 PM