Fall back to language name for translations lookup if locale name fails
ClosedPublic

Authored by wbauer on Jan 10 2018, 3:47 PM.

Details

Summary

For locales like de_AT, the current code only looks in share/locale/de_AT/ and share/locale/de-AT/ for translation catalogs, but not in share/locale/de/ where they most likely are.
That's because bcp47Name() returns "de-AT" for de_AT (though in the case of de_DE e.g. it does return "de").

This patch additionally tries to fall back to the general language by taking the part of the locale name before the first '_'.

Test Plan

Rebuilt kwidgetsaddons with this change.
Unset $LANGUAGE (if set) and set $LANG=de_AT.UTF-8 (or set $LANGUAGE=de_AT in the first place)
Ran systemsettings5 and enter some module.

Without this patch the Help/Default/Reset/Apply buttons are untranslated because the kwidgetsaddons translations are not loaded (according to strace or added debug output they are only looked up in share/locale/de_AT/ and share/locale/de-AT/, they are in share/locale/de/ on my system).
With this patch the buttons are translated, the translations are now loaded from share/locale/de/ (after trying share/locale/de_AT/ and share/locale/de-AT/, which both fail).

Also ran "kdesu xxx". "Password:", "Remember password", "Cancel" were untranslated in the password dialog before, they are translated to german now.

Tests that passed before still pass.
The new/modified test passes now too, it fails with the original code.

Diff Detail

Repository
R240 Extra CMake Modules
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
wbauer created this revision.Jan 10 2018, 3:47 PM
Restricted Application added a project: Build System. · View Herald TranscriptJan 10 2018, 3:47 PM
Restricted Application added a subscriber: Build System. · View Herald Transcript
wbauer requested review of this revision.EditedJan 10 2018, 3:47 PM

I don't think bcp47Name makes any sense here.
According to the Qt documentation, this "Returns the dash-separated language, script and country (and possibly other BCP47 fields) of this locale as a string" (and "Unlike the uiLanguages() the returned value of the bcp47Name() represents the locale name of the QLocale data but not the language the user-interface should be in").

I will gladly leave it in if preferred though, and add my change afterwards if that fails as well.

I went through Qt code, as Qt applications are opened with my language correctly where KF ones (those with QM) don't.
The whole issue is that we have each locale's translations in a separate folder (ar/LC_MESSAGES, en/LC_MESSAGES, de/LC_MESSAGES, etc.)
Qt uses internally find_translation which goes through a list of guesses to find the correct language, according to the LANGUAGE preferences and the locale itself.
Maybe we can borrow that code and utilize it, that would be much better.

Thanks for opening this, @wbauer!

wbauer added a comment.EditedJan 11 2018, 6:04 PM

I went through Qt code, as Qt applications are opened with my language correctly where KF ones (those with QM) don't.

I tested with your ar_BH locale meanwhile, and it is fixed too, i.e. LANGUAGE="ar_BH" didn't work either, and both LANG="ar_BH.UTF-8"/LANGUAGE="" and LANGUAGE="ar_BH" do work now.

The whole issue is that we have each locale's translations in a separate folder (ar/LC_MESSAGES, en/LC_MESSAGES, de/LC_MESSAGES, etc.)

Why is this an issue?
There's no difference really in loading ar/LC_MESSAGES/xxx.qm and LC_MESSAGES/xxx_ar.qm (or something like that), i.e. you would have the same problem if all translations would be in the same folder.

Qt uses internally find_translation which goes through a list of guesses to find the correct language, according to the LANGUAGE preferences and the locale itself.
Maybe we can borrow that code and utilize it, that would be much better.

The only relevant things I can see is that it replaces all occurences of '-' with '_' (which is necessary only because it gets the languages from QLocale::uiLanguages()), and that it doesn't cut off at the first '_', but creates entries cut off at every one. (i.e. "de_XX_YY" would yield "de_XX_YY", "de_XX" and "de" to try, IIUIC, while the current patch would only try "de_XX_YY" and "de")

I could do something similar, i.e. lookup translations in a while loop and cut off at the right-most '_' if a lookup fails until it succeeds.

The current code does get the locale/language from Qt anyway. (it calls QLocale::system(), which does respect LANG and LANGUAGE )

It probably would also be a good idea to get a list of languages via QLocale::uiLanguages() instead like that find_translation() function does (that would also support things like LANGUAGE="de:ar"). But that's unrelated to this fix IMHO.

Why is this an issue?
There's no difference really in loading ar/LC_MESSAGES/xxx.qm and LC_MESSAGES/xxx_ar.qm (or something like that), i.e. you would have the same problem if all translations would be in the same folder.

Well, we were to simplify this to one call of loadTranslation, but yes, not a big deal.

The only relevant things I can see is that it replaces all occurences of '-' with '_' (which is necessary only because it gets the languages from QLocale::uiLanguages()), and that it doesn't cut off at the first '_', but creates entries cut off at every one. (i.e. "de_XX_YY" would yield "de_XX_YY", "de_XX" and "de" to try, IIUIC, while the current patch would only try "de_XX_YY" and "de")

I could do something similar, i.e. lookup translations in a while loop and cut off at the right-most '_' if a lookup fails until it succeeds.

Well, that would be great for an ideal world. We can live for now and check the locales that KDE is currently translated to, and just adapt to them.

The current code does get the locale/language from Qt anyway. (it calls QLocale::system(), which does respect LANG and LANGUAGE )

It probably would also be a good idea to get a list of languages via QLocale::uiLanguages() instead like that find_translation() function does (that would also support things like LANGUAGE="de:ar"). But that's unrelated to this fix IMHO.

Maybe later, I like full concept implementations, but yes, not needed.

This will be enough for the time being. Thanks! :-)

Why is this an issue?
There's no difference really in loading ar/LC_MESSAGES/xxx.qm and LC_MESSAGES/xxx_ar.qm (or something like that), i.e. you would have the same problem if all translations would be in the same folder.

Well, we were to simplify this to one call of loadTranslation, but yes, not a big deal.

No.

The issue is that not every locale has a separate translation, so you need fallbacks. (e.g. de_AT -> de)
But it could have one, so just using the general language in any case would be wrong too as you couldn't have different translations for the same general language. (think of British English vs. American English e.g., there is a separate en_GB translation for KDE Frameworks)
That's unrelated to whether the translations are in separate folders or all in one though, the latter wouldn't change anything IMHO.

The only relevant things I can see is that it replaces all occurences of '-' with '_' (which is necessary only because it gets the languages from QLocale::uiLanguages()), and that it doesn't cut off at the first '_', but creates entries cut off at every one. (i.e. "de_XX_YY" would yield "de_XX_YY", "de_XX" and "de" to try, IIUIC, while the current patch would only try "de_XX_YY" and "de")

I could do something similar, i.e. lookup translations in a while loop and cut off at the right-most '_' if a lookup fails until it succeeds.

Well, that would be great for an ideal world. We can live for now and check the locales that KDE is currently translated to, and just adapt to them.

It only makes a difference if the locale/language is actually set to something like "ll_XX_YY" (i.e. more than one '_') though, and only if there actually is a "ll_XX" translation.

But I will probably have a look at implementing that tomorrow...

Maybe later, I like full concept implementations, but yes, not needed.

This is a bugfix for existing code though, not a new implementation. ;-)

But it shouldn't be hard to add (even later), just call QLocale::uiLanguages() instead of QLocale::system() and loop over all strings in the list you get (after replacing '-' with '_').
The rest of the code is still needed (unchanged) and would be inside that loop.

aacid added a subscriber: aacid.Jan 14 2018, 12:12 PM
aacid added inline comments.
modules/ECMQmLoader.cpp.in
68–73

I agree that locale.bcp47Name() is most probably not want we want, but since it's there, some third party user of extra-cmake-modules may be depending on it, so us removing will break them·

So please leave that line in and turn it into a

if (!loadTranslation(locale.bcp47Name())) {

... your new code.

}

Also very small nitpick, make i const

wbauer updated this revision to Diff 25536.Jan 17 2018, 12:27 PM
wbauer edited the summary of this revision. (Show Details)
wbauer edited the test plan for this revision. (Show Details)

Leave bcp47Name() in for compatibility, add const.

wbauer marked an inline comment as done.Jan 17 2018, 12:28 PM
aacid accepted this revision.Jan 17 2018, 8:01 PM
This revision is now accepted and ready to land.Jan 17 2018, 8:01 PM
This revision was automatically updated to reflect the committed changes.