Fix importing cookies from KCookieServer at startup
ClosedPublic

Authored by stefanocrocco on Aug 23 2019, 1:32 PM.

Details

Summary

The cookies imported from KCookieServer were normalized before being
passed to QWebEngineCookieStore::setCookie. Since this function in turn
normalizes cookies, this caused a new cookie with the wrong domain to be
added: the first call to normalize filled the domain field with the
cookie origin host, the second call, seeing a cookie with a non-empty
domain, added a dot at the beginning of the domain.

Diff Detail

Repository
R226 Konqueror
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
stefanocrocco requested review of this revision.Aug 23 2019, 1:32 PM
stefanocrocco created this revision.
dfaure accepted this revision.Aug 24 2019, 8:46 AM

I admit that I'm quite lost between the Qt/KDE differences, what normalization does, and what a dot in front means. Something sounds fishy if normalizing twice isn't a no-op, indeed.

webenginepart/src/webenginepartcookiejar.cpp
288–290

const ...

289

while at it, foreach being deprecated,

for (const CookieWithUrl& cookieWithUrl : cookies)
webenginepart/src/webenginepartcookiejar.h
132

QVector is a much better container when sizeof(T) > sizeof(void*)

This revision is now accepted and ready to land.Aug 24 2019, 8:46 AM
stefanocrocco added a comment.EditedAug 24 2019, 10:21 AM

I admit that I'm quite lost between the Qt/KDE differences, what normalization does, and what a dot in front means. Something sounds fishy if normalizing twice isn't a no-op, indeed.

According to he specification, a Set-Cookie header can omit the Domain field. A missing Domain means that the browser must return the cookie only to the origin server. This means that a cookie received from abc.xyz.com with Domain=abc.xyz.com will be treated differently from a cookie received from the same host but with no Domain field: in the first case, the cookie will be returned also to, for example, www.abc.xyz.com; in the second case it will be returned only to abc.xyz.com. The specification also says that if the Domain field starts with a dot, the dot should be ignored: Domain=.abc.xyz.com is treated as Domain=abc.xyx.com.

A browser has to choose how to internally represent distinguish a cookie with a Domain field from a cookie with no Domain field. Unfortunately, QWebEngineStore and KCookieJar made two different choices:

  • KCookieJar leaves an empty domain as it is and stores the origin host separately. For non-empty domains, a leading dot is added if it's not already there
  • QWebEngineCookieStore uses a normalized QNetworkCookie to store cookies. In a normalized QNetworkCookie, the domain() function never returns an empty string: if the cookie had a Domain field, domain() returns a string starting with a dot; if the cookie didn't have the domain field, domain() returns the origin host, without a leading dot. (QNetworkCookie::normalize also replaces an empty Path with a /, but this doesn't causes issue since KCookieJar does the same).

I agree that one could expect that calling normalize twice would be a no-op (I thought so, too, before discovering, by chance, this bug). On second thought, however, there's a reason it can't be so. What normalize does is essentially:

  • if the domain is empty, make it equal to the host given as argument with no leading dots
  • if the domain is not empty, add a leading dot to it if it's not there.

After calling normalize() once, the domain won't be empty anymore, so the second call will simply add a leading dot to it. To avoid this behavior, normalize() should assume that a non-empty domain always starts with a dot, but can't do so (I guess) because the specification allows it. I think that the documentation for normalize() should emphasize this fact.

Another difference between KCookieServer and QWebEngineCookieStore which creates issues for Konqueror is that KCookieServer::addCookies requires the origin URL, but QWebEngineCookieStore doesn't provide it. What Konqueror currently does is to use the domain() field as origin host: this is correct if the cookie had no domain field (as explained above) and seems to be working in other cases, too. Actually, according to the specification, the origin domain shouldn't be needed at all for cookies with a domain field: however, if I understand correctly its source code, KCookieJar uses it to derive a key for the hash it stores cookies in.

I wonder whether it could be a good idea to file a bug report with Qt to ask to pass the origin URL in the QWebEngineCookieStore::cookieAdded signal.

"if the domain is not empty, add a leading dot to it if it's not there" << that's a weird thing to do, if the spec says leading dots are just superfluous.

This means you could have your own normalize function (which doesn't prepend a dot), but I can see how that might just be more confusing.

"if the domain is not empty, add a leading dot to it if it's not there" << that's a weird thing to do, if the spec says leading dots are just superfluous.

Sorry, I made a mistake: according to the documentation for QNetworkCookie, it follows a specification that is different (older?) than the one I was looking at. According to the correct specification, the domain, if not empty, must start with a dot. This is what distinguish a domain explicitly set from a missing domain (which means the origin host).

OK, then it's not just normalizing, but translating from the HTTP header to the internal representation. That can indeed only happens once.

Then you need to reason with "which code works with the http header" and "which code works with the internal representation".

OK, then it's not just normalizing, but translating from the HTTP header to the internal representation. That can indeed only happens once.

Not exactly. You can create a QNetworkCookie in two ways: parsing the Set-Cookie header, using the static QNetworkCookie::parseCookies() function or creating an empty cookie and setting its fields manually. I think normalize in this context means something like: fill the empty fields (domain and path, taking the values from the url passed as argument) and make sure the domain starts with a dot.

Then you need to reason with "which code works with the http header" and "which code works with the internal representation".

I think the workflow is supposed to be: receive the Set-Cookie header -> parse it using parseCookies -> call normalize -> forget the HTTPl representation and work only with the internal one. If you need to use the cookie again in an HTTP header, call QNetworkCookie::toRawForm and you have the header ready. Unfortunately, for Konqueror we need to work with two internal representations: one for QNetworkCookie and one for KCookieJar.

Made the requested changes. I also replaced QList with QVector for CookieUrlList.

dfaure accepted this revision.Aug 24 2019, 3:31 PM
This revision was automatically updated to reflect the committed changes.