#379003: Fix National Geographic POTD provider
ClosedPublic

Authored by vitali on May 6 2017, 1:14 PM.

Details

Summary

The current parsing mechanism to obtain the current NatGeo POTD based on XML is broken, this patch proposes an alternative mechanism using QRegularExpression library.
For this method to continue working in the future it is supposed that the web page will maintain a sane format, namely that the relevant tag will not be broken among many lines, the fields inside the tag will maintain the same relative order, and obviously that it will not be obfuscated.

Diff Detail

Repository
R114 Plasma Addons
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
vitali created this revision.May 6 2017, 1:14 PM
Restricted Application added a project: Plasma. · View Herald TranscriptMay 6 2017, 1:14 PM
Restricted Application added a subscriber: plasma-devel. · View Herald Transcript
sebas added inline comments.May 6 2017, 4:38 PM
dataengines/potd/natgeoprovider.cpp
96

Why this change? Moving away from https to http seems backward...

vitali planned changes to this revision.May 6 2017, 6:59 PM
vitali added inline comments.
dataengines/potd/natgeoprovider.cpp
61

An alternative regex may be "^<meta\\s.*property=\"og:image\"\\s.*content=\"(.*)\".*/>$", so to match any possible future fields before property, and between property and content.

96

The versions with https or without the trailing slash were not working for me, so I changed it to that one, but it could as well be just be my mistake.

Zren added a subscriber: Zren.May 6 2017, 11:24 PM
Zren added inline comments.
dataengines/potd/natgeoprovider.cpp
96

The https redirects to http (HTTP 302 temp move), then redirects to add the trailing slash (HTTP 301 Perm Move).

So, what do we want to do with this patch?
I've been using it for a week now, and didn't notice any problems. Has anyone else been testing it?

As for the link, I am for keeping the http version, as it is the one used by NatGeo, so it would be less likely to be broken in the near future. If and when they will move to TLS it should be their responsibility to issue the right redirect.
Anyway, any decision will be fine for me, as long as it works.

Lastly, any opinion on the regex?

mart accepted this revision.Jun 8 2017, 12:36 PM
mart added a subscriber: mart.

go for it

This revision was automatically updated to reflect the committed changes.