AI/LLM assistant integration
Open, Wishlist, Public

Description

Large language model (LLM) AI assistants are all the rage, and the trend is evolving quickly. Given how easy it has become to integrate an LLM into, well, anything, really, it would be very cool to have an assistant built into KDE that does things for the end-user, rather than requiring them to click or touch the screen. Want to launch Steam? Speak to KDE and tell it (literally) to launch Steam. Want to change the wallpaper? The KDE assistant should, at the very least, be able to launch Settings, and at best adjust those settings for the end-user, without ever touching the mouse, keyboard, or screen.

Why does it need to be done?

This would be extremely helpful from an accessibility perspective for people who are at a disadvantage when using a mouse, a keyboard, or a screen. Additionally, given that KDE's main "claim to fame" (can't believe I just said that) is its configurability, and that this keeps expanding with new features (and their configurations), it's only going to get more complex. Having an LLM do the work for the end-user would make it much easier to sift through the various configuration options.

How does it connect to KDE's vision of "A world in which everyone has control over their digital life and enjoys freedom and privacy"?

It puts the vast power of KDE's configurability into the (literal) voice of the end-user, letting them tell KDE what they want and have KDE help them do it.

How would it affect different parts of KDE?

KDE would need a built-in system that accepts speech input from a microphone device. Phonon would be the logical place for that additional functionality, although a separate AI/LLM subsystem to interpret the speech and act on it would be entirely new. That subsystem would need to integrate deeply into existing functionality, up to and including prompting for sudo privileges when an action requires them.
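
To make the "integrate deeply into existing functionality" point a bit more concrete, here is a minimal C++/Qt sketch (not an existing KDE component) of what the final step could look like: once some hypothetical speech/intent layer has decided the user wants a new wallpaper, the assistant drives Plasma's existing org.kde.PlasmaShell scripting interface over D-Bus. The wallpaper path and everything around the call are assumptions for illustration only.

```cpp
// Hypothetical sketch: an assistant backend driving an existing Plasma
// action over D-Bus after the intent "change my wallpaper" was recognised.
#include <QCoreApplication>
#include <QDBusInterface>
#include <QDBusMessage>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

    // Plasma desktop scripting snippet that sets a wallpaper on every
    // desktop. The image path is a placeholder.
    const QString script = QStringLiteral(
        "var all = desktops();"
        "for (var i = 0; i < all.length; i++) {"
        "    all[i].wallpaperPlugin = 'org.kde.image';"
        "    all[i].currentConfigGroup = ['Wallpaper', 'org.kde.image', 'General'];"
        "    all[i].writeConfig('Image', 'file:///path/to/wallpaper.png');"
        "}");

    // plasmashell exposes evaluateScript() on the session bus.
    QDBusInterface plasma(QStringLiteral("org.kde.plasmashell"),
                          QStringLiteral("/PlasmaShell"),
                          QStringLiteral("org.kde.PlasmaShell"));
    const QDBusMessage reply = plasma.call(QStringLiteral("evaluateScript"), script);
    if (reply.type() == QDBusMessage::ErrorMessage) {
        qWarning() << "evaluateScript failed:" << reply.errorMessage();
        return 1;
    }
    return 0;
}
```

Everything before that call -- capturing audio, turning it into a structured intent, and choosing which existing interface to invoke -- is the genuinely new subsystem described above.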

What it will take

A LOT of time, probably a LOT of money, and a LOT of knowledge that I don't have.

How we know we succeeded

When you can tell KDE what you want to do, with your own voice, and it does it for you. And then watching as end-users bow before you, sacrificing little gnomes, little apples, and little colored "windows" to your greatness. ;)

Champions

The team is:

  • XXX
  • XXX
  • XXX

I am willing to put work into this

  • Timothy Gravier, Jr. -- although I haven't actually coded in C++ in years, have barely even seen the Qt libraries, and don't know how to integrate an LLM. But it can be done, right? People do it! Oh, and I can end-user-test the crap out of it.

I am interested

  • Timothy Gravier, Jr.
timgravier triaged this task as Wishlist priority.
timgravier updated the task description. Jun 7 2024, 5:55 PM
ngraham added a subscriber: ngraham. Jun 8 2024, 2:55 AM

Is this about AI, or voice control? Because I had voice control on my Macintosh LC III in 1993. So clearly you don't need AI for voice control, and AI is not a magic bullet that gives you whatever feature you happen to want. :)

Voice control still exists in some form in niche software packages, but those generally accept pre-fabricated voice commands for pre-fabricated actions. There may even be voice-command macros that allow the creation of new, scripted actions, but again, that's not really at the level of automation an AI/LLM offers. Voice control is just one (rather small) idea for leveraging an AI/LLM; it could be used for a wide variety of things. Microsoft has Copilot to help with coding and document composition, for example. And even that is a bit limited compared to what it COULD do.

I'm not saying that KDE should leverage AI/LLMs only for voice control, or only for document authoring. I'm saying it could be leveraged to do a LOT more... much more than my old Gen-X brain can think of this late at night. Brainstorming by smarter people than me would be worthwhile. Microsoft managed to figure out coding and document-authoring assistance with it. I'm sure KDE could take it further.

You know, it might be a good idea to create an interface that allows end-users to converse with their AI/LLM of choice... rather than having to go to that vendor's website and log in, there could be a KDE application that can hook into the popular ones. ¯\_(ツ)_/¯

A solution like this could be a good idea, I think: https://invent.kde.org/utilities/alpaka ^^
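
For the "hook into the popular ones" idea, the plumbing on the KDE side could be fairly small. Below is a minimal C++/Qt sketch that sends one prompt to a locally running Ollama server (the backend Alpaka talks to) and prints the answer. The model name, the prompt, and the assumption that Ollama is listening on its default port 11434 are placeholders, not anything that ships in KDE today.

```cpp
// Minimal "ask a local LLM one question" client, assuming an Ollama
// server on http://localhost:11434 with a model already pulled.
#include <QCoreApplication>
#include <QJsonDocument>
#include <QJsonObject>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

    QNetworkRequest request(QUrl(QStringLiteral("http://localhost:11434/api/generate")));
    request.setHeader(QNetworkRequest::ContentTypeHeader, QStringLiteral("application/json"));

    const QJsonObject body{
        {QStringLiteral("model"), QStringLiteral("llama3")},   // placeholder model name
        {QStringLiteral("prompt"), QStringLiteral("How do I change my wallpaper in Plasma?")},
        {QStringLiteral("stream"), false},                     // ask for one complete JSON reply
    };

    QNetworkAccessManager manager;
    QNetworkReply *reply = manager.post(request, QJsonDocument(body).toJson());
    QObject::connect(reply, &QNetworkReply::finished, [&]() {
        // Non-streaming replies carry the generated text in the "response" field.
        const QJsonObject answer = QJsonDocument::fromJson(reply->readAll()).object();
        qInfo().noquote() << answer.value(QStringLiteral("response")).toString();
        reply->deleteLater();
        app.quit();
    });

    return app.exec();
}
```

A real KDE frontend would of course stream the reply and let the user pick the backend, but the basic request/response shape is no more complicated than this.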

Somehow I can't create a task; however, I discovered this goal and I believe it aligns with my idea.

As English is not my native language, I use DeepL to improve my English; however, I have been unable to find any DeepL integration in KDE Plasma.

My goal is to create a default implementation in KDE Plasma's input system that integrates emojis, translation, DeepL, speech-to-text, and so on.

DeepL has a Chrome plugin that is highly effective. However, I would prefer a system-wide solution. During my internet research, I came across an implementation by David Edmundson; you can find the links to his blog and GitLab here:

https://blog.davidedmundson.co.uk/blog/new-ideas-using-wayland-input-methods/
https://invent.kde.org/davidedmundson/inputmethod-playground

He refers to it as "InputMethod Playground", and it is a truly innovative concept. I would love to see it implemented system-wide in Plasma with basics like emoji input and spell checking, as well as the option of using third-party plugins such as DeepL or ChatGPT.
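
To make the DeepL part concrete: ignoring the Wayland input-method plumbing entirely, the network side of such a plugin is one HTTP call per piece of selected text. The C++/Qt sketch below uses DeepL's free-tier REST endpoint and reads an API key from a DEEPL_AUTH_KEY environment variable; both of those choices are assumptions for illustration and are not part of the inputmethod-playground code itself.

```cpp
// Hypothetical translation step for a DeepL-backed input-method plugin:
// send one string to DeepL's v2 REST API and print the translated text.
#include <QCoreApplication>
#include <QJsonArray>
#include <QJsonDocument>
#include <QJsonObject>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QUrlQuery>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

    // Assumption: the API key is provided via the environment.
    const QByteArray key = qgetenv("DEEPL_AUTH_KEY");

    QNetworkRequest request(QUrl(QStringLiteral("https://api-free.deepl.com/v2/translate")));
    request.setRawHeader("Authorization", QByteArray("DeepL-Auth-Key ") + key);
    request.setHeader(QNetworkRequest::ContentTypeHeader,
                      QStringLiteral("application/x-www-form-urlencoded"));

    QUrlQuery form;
    form.addQueryItem(QStringLiteral("text"), QStringLiteral("Guten Tag"));
    form.addQueryItem(QStringLiteral("target_lang"), QStringLiteral("EN"));

    QNetworkAccessManager manager;
    QNetworkReply *reply = manager.post(request, form.toString(QUrl::FullyEncoded).toUtf8());
    QObject::connect(reply, &QNetworkReply::finished, [&]() {
        // DeepL answers with {"translations":[{"detected_source_language":"DE","text":"..."}]}.
        const QJsonObject doc = QJsonDocument::fromJson(reply->readAll()).object();
        qInfo().noquote() << doc.value(QStringLiteral("translations")).toArray()
                                 .first().toObject().value(QStringLiteral("text")).toString();
        reply->deleteLater();
        app.quit();
    });

    return app.exec();
}
```

The interesting work is on the input-method side (grabbing the text being composed and replacing it); the translation backend is just a detail behind an interface, which is what would make swappable third-party plugins plausible.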

lydia updated the task description. Jun 8 2024, 5:15 PM
lydia updated the task description. Jun 8 2024, 5:23 PM
nerumo added a subscriber: nerumo. Jun 12 2024, 5:29 AM

I made a speech-specific goal, so we could make this one about an AI assistant.

https://phabricator.kde.org/T17404

lydia added a subscriber: lydia. Jun 14 2024, 6:08 PM

Each goal needs Champions. If no one is found, it will unfortunately not be eligible for voting.

mrjulius added a subscriber: mrjulius. Edited Jul 5 2024, 10:19 AM

The bigger picture would be to make KDE ready for AI integrations. That would mean exposing more actions and information via D-Bus or other APIs, updating their documentation, and perhaps even providing examples.

For instance, some time ago I tried to simply get a list of the directories open in Dolphin, but couldn't: I found no answers in the D-Bus documentation, and qdbus didn't show any relevant methods (see the introspection sketch below).

In my opinion, KDE would not need to ship with the actual virtual assistant, but rather provide a suitable environment for such implementations, so that
A) It's easy to experiment and develop virtual assistants on KDE
B) Most of the community-created virtual assistants will support KDE
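
To illustrate the Dolphin example with code: the discovery step is plain D-Bus introspection, and a minimal C++/Qt sketch of it follows. It assumes Dolphin registers a per-instance service named org.kde.dolphin-<pid> on the session bus (an observation about current behaviour, not a documented guarantee), and, as noted above, the result today contains nothing like "list the open directories" -- which is exactly the gap that exposing more actions and information would close.

```cpp
// Introspect whatever a running Dolphin instance currently exposes on the
// session bus. This is the "what can an assistant actually call?" question.
#include <QCoreApplication>
#include <QDBusConnection>
#include <QDBusConnectionInterface>
#include <QDBusInterface>
#include <QDBusReply>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

    // All service names currently registered on the session bus.
    const QStringList services =
        QDBusConnection::sessionBus().interface()->registeredServiceNames().value();

    for (const QString &service : services) {
        if (!service.startsWith(QStringLiteral("org.kde.dolphin")))
            continue;

        // Ask the root object which interfaces and child objects it exposes.
        QDBusInterface introspectable(service, QStringLiteral("/"),
                                      QStringLiteral("org.freedesktop.DBus.Introspectable"));
        const QDBusReply<QString> xml = introspectable.call(QStringLiteral("Introspect"));
        qInfo().noquote() << service << "\n" << xml.value();
    }
    return 0;
}
```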

frdbr added a subscriber: frdbr. Jul 29 2024, 3:54 PM
frdbr added a comment. Aug 12 2024, 7:41 PM

Hello,

Please note that the deadline is just around the corner on Wednesday, so now is the time to finalize your proposal. Remember that proposals without a Goal Champion will be disqualified, so this step is crucial to ensure your idea moves forward. If you need help or have any questions, please let me know.

If you’re unable to finish your proposal but still want to participate, consider contributing to other ongoing tasks.

Thank you for submitting your ideas for the KDE Goals!