They block the startup notably but the API doesn't require it to be blocking.
Details
Diff Detail
- Repository
- R120 Plasma Workspace
- Branch
- master
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 13720 Build 13738: arc lint + arc unit
Why not just a singleshot timer from the constructor? Avoids any initial blocking ...
Initial, but doesn't fix the problem. We could potentially delay this few seconds instead, but we'd still be getting an odd ~500ms freeze randomly
Or we can just decide to move blocking or heavy algorithms to separate threads and enjoy our multicore computers and a fluid experience.
Where does the blocking happen? How do you guarantee none of the later call block? Large parts of the code are executed in the main thread anyway, only the initial list creation happens in a worker thread.
Also, if this is a generic problem, why only fix it in the dataengines, not in Solid itself?
listFromQuery is the big blocking call we are moving into a separate thread.
Also, if this is a generic problem, why only fix it in the dataengines, not in Solid itself?
This would mean refactoring how Solid works and even coming up with entirely new concepts. I don't discard doing it at some point but I don't think plasmashell startup blockage is reason enough to redo a framework that has been working for about 5 years as is.
Again, where is it blocking? Which backend?
Also, if this is a generic problem, why only fix it in the dataengines, not in Solid itself?
This would mean refactoring how Solid works and even coming up with entirely new concepts. I don't discard doing it at some point but I don't think plasmashell startup blockage is reason enough to redo a framework that has been working for about 5 years as is.
listFromQuery can be replaced by an asynchronous "enumerate(predicate)" call which uses the existing DeviceAdded signal. This would also remove the inherent race between the listFromQuery and DeviceAdded/DeviceRemoved.
udisks2 mainly, but every backend can block by its virtue.
listFromQuery can be replaced by an asynchronous "enumerate(predicate)" call which uses the existing DeviceAdded signal. This would also remove the inherent race between the listFromQuery and DeviceAdded/DeviceRemoved.
I've looked into changes that could be done to solid to improve this, nothing felt like a good step forward. This approach works, makes sense and shows results.
The code can still block, as every invocation of 'Solid::Device(udi)` does a blocking DBus call to udisks2.
Also, the code is calling non-threadsafe code from multiple threads now (e.g. once from each the two dataengines helper threads). Each one will call the udisks2 Manager::deviceCache() method.
There's a Backend per-thread and deviceCache is not static so each thread will call its own.
Sorry, but you are wrong, look for
Q_GLOBAL_STATIC(Solid::DeviceManagerStorage, globalDeviceStorage)
Yes, you are correct, so there are no threading issues, sorry.
On the other hand, this IMHO makes this approach even less profitable:
- every backend is initialized once from the two helpers (assuming both dataengines are used).
- after the threads finish, all but the relevant UDIs are thrown away (and care must be taken not to use anything else - the devices would be dangling pointers in the now gone thread storage)
- the first completing thread triggers the initialization of the backends again - in the main thread.
As the initialization happens implicitly by the Device instantiation, for e.g. UDisks there are two synchronous DBus calls now, one for introspection from the manager backend and one for the device itself from the device backend. Both are done in the main thread, and we will block even longer.
So any gain here is AFAICS from the fact the constructor no longer blocks on a DBus call.
Kinda, you're summary missing a key part.
(the commit description is a bit poor)
This patch does the following:
- It does the search in the other thread. That creates and iterates every possible device. This is expensive as there are lots of potential devices. We fetch pretty much everything udisks knows.
- From the given list we recreate the select few devices that are relevant. This is relatively cheap. It does still block as you point out, but now for a much much smaller set of objects compared to before.
(on my laptop that's 56 devices created when searching, resulting in 8 useful devices).
This is where the main thread saving is meant to come in.
As for dangling pointers. The private manager object dies as its backed by TLS. The device object itself does not. The device object has guards against a deleted manager internally. It won't crash, but will return "dead" values.
You're absolutely right it's a hack around the current state, rather than a good long term direction. I'm not exactly loving it.
I would love to rework solid to have a nice async api, and also to not use data engines. I'm not sure its doable within kf5.
Things we could do right now, without a super big refactoring:
On a solid level:
- Maybe a useful intermediate change would be to add API in Solid to move devices between threads.
- Alternatively we could consider making sure backends stick to one specific non-GUI thread, keeping only 1 backend per application and have internals speak to it (admittedly a bigger refactoring).
On a plasma level:
- Keep Solid::Device::listFromQuery calls into a separate QThread and only extract tuples with the information we need rather than a Solid::Device (which was obviously faster but crashed because we can't move devices between threads).
Either way, it's not a matter of KF5, adding new API is perfectly fine, problem is that we may end up redesigning both the frontend and the backend and this is far too much work right now. And so it will be when we're porting things to KF6.
Also, I would suggest not really expecting to be able to do a nice, in-thread async API. Note that here we have exactly the same problem we have on plasma-nm and in Qt bearing networkmanager backend. It's hard to use these types asynchronously and we shouldn't go through hoops to make it thread-local.
Just for clarification:
On my system, solid-hardware5 list returns 77 devices, but with the Predicate Filter I am down to 10, and this is filtered by the hotplug engine down to 7.
Half of the devices (32) are serial ports. These are returned by the udev backend, but when combined with the "Camera | PortableMediaPlayer" predicate, the udev backend returns an empty list and nothing is instantiated.
The enumeration in the udev backend, especially 'PortableMediaPlayer' has become much cheaper since https://phabricator.kde.org/D21379.
The Udisks backend indeed returns to much, and unfortunately it creates a backend device for each block device, which implies a number of DBus roundtrips. But this would come down to a single DBus call when D19677 lands.