OLD OUTDATED INFORMATION: Qt5::Xml and therefore QDom* is supposed to go away in Qt6. This is a meta task to collect all places in KF5 we need to adjust for this.
Description
Status | Assigned | Task
---|---|---
Open | dfaure | T12104 Port syndication away from QXmlInputSource API
Invalid | leinir | T12106 Port KNewStuff away from QDom API
Open | dvratil | T12145 Port kbookmarks away from QDom API
Turns out that QDom actually won't go away, but we still need to check for any usage of deprecated APIs from QDomDocument.
What's deprecated is only QXmlInputSource, QXmlReader/QXmlSimpleReader and associated classes (QXmlEntityResolver, QXmlAttributes...).
https://lxr.kde.org/ident?_i=QXmlInputSource&_remember=1 says only syndication needs to be ported away.
This is not as easy as I thought it would be. When used without QXmlInputSource, QDomDocument simplifies whitespace-only CDATA sections.
This patch: http://www.davidfaure.fr/2020/port_syndication_away_from_qxmlinputsource.diff
leads to a failure in autotests/atom/atom10_entry_content.xml which can be narrowed to
-id: #hash:aff2c4358030579d2c3dcea6e92b40fe#
+id: #hash:a359558b397d24593c3b55afb85d173a#
Somehow the hash is calculated over the contents, and this breaks due to whitespace simplification?
But the HTML itself doesn't really care about that whitespace (see the unittest fixes in the patch).
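To make the failure mode concrete, here is a minimal sketch of why a content-derived ID changes when whitespace is normalized before hashing. Everything in it is an assumption for illustration: `fnv1a` is just a stand-in hash (not the one syndication uses), and `collapseWhitespace` is a hypothetical helper that only approximates what the parser does to the content.

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Stand-in hash (64-bit FNV-1a). Assumption: NOT the hash syndication uses.
std::uint64_t fnv1a(const std::string &data)
{
    std::uint64_t hash = 14695981039346656037ULL; // FNV offset basis
    for (unsigned char c : data) {
        hash ^= c;
        hash *= 1099511628211ULL; // FNV prime
    }
    return hash;
}

// Hypothetical helper: collapses every whitespace run to a single space
// and drops leading/trailing whitespace. Only an approximation of the
// simplification discussed above.
std::string collapseWhitespace(const std::string &in)
{
    std::string out;
    bool pendingSpace = false;
    for (unsigned char c : in) {
        if (std::isspace(c)) {
            // Remember the gap, but never emit a leading space.
            pendingSpace = !out.empty();
        } else {
            if (pendingSpace) {
                out += ' ';
            }
            pendingSpace = false;
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

Hashing `"<p>a</p>\n<p>b</p>"` directly and hashing its collapsed form give different values, even though both render the same HTML. Two inputs that differ only in whitespace hash identically once both are collapsed.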
What can we do about this? Is the hash thing useful, and we *have* to get a no-whitespace-collapsing feature in Qt again? Or can we somehow adapt?
I think this would be fine in general (there is no guarantee on how exactly the hash is computed from what I can see), the only problem could be that we at some point rely on it being long-term stable, i.e. store it on disk for something actually relevant (e.g. losing selection state is probably acceptable, breaking Akregator's database entirely probably not). Hard to tell from a quick look at Akregator though.
Disclaimer: I haven't looked at Akregator's codebase in almost 10 years...
Originally, the hash was meant to serve as item ID where the source feed didn't provide IDs an item could be identified with. In Akregator, this ID was used to tell items that were already seen from new items when fetching the RSS/Atom. So I would assume that if the hash changes, that you would end up with duplicate items for every item that already existed locally and is still in the feed, because the old item and the new item now have different IDs/hashes.
I would say that's annoying but not critical, provided that it only happens once.
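The duplicate scenario can be sketched as follows. This is a simplified model under stated assumptions, not Akregator's actual storage: `SeenItems` is a hypothetical class that only remembers item IDs and reports how many fetched IDs were previously unknown.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical archive keyed by item ID. Assumption: NOT Akregator's real
// storage layer, only a model of "known ID means already-seen item".
class SeenItems
{
public:
    // Records the fetched IDs and returns how many were not known before.
    int merge(const std::vector<std::string> &fetchedIds)
    {
        int added = 0;
        for (const std::string &id : fetchedIds) {
            // insert().second is true only when the ID was not stored yet.
            if (m_ids.insert(id).second) {
                ++added;
            }
        }
        return added;
    }

    std::size_t size() const { return m_ids.size(); }

private:
    std::unordered_set<std::string> m_ids;
};
```

If an item was archived under `#hash:aff2c4358030579d2c3dcea6e92b40fe#` and the same feed entry is later fetched as `#hash:a359558b397d24593c3b55afb85d173a#`, `merge` counts it as new and the archive holds two entries for one article. Fetching again with the new ID adds nothing further, which matches the "only happens once" expectation.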
So, to be sure, I continue with this patch, adjusting the unittest to the new hash -- and no change required in Akregator?