Splitting of i18n strings
Open, Needs TriagePublic

Description

Strings as argument of i18n calls can't be easily split onto multiple lines in the source code. Because of that with longer lines the 100 chars line limit of KDE Frameworks Coding Style gets violated (see for example D24073).

Would something like this be possible?

i18n(Lorem ipsum dolor sit amet, consectetur adipisici elit,\
     sed eiusmod tempor incidunt ut labore et dolore magna aliqua.)
romangg created this task.Sep 19 2019, 11:30 AM

Wasn't this solution mentioned here https://phabricator.kde.org/D23783#527600 ?

You mean like the following?

i18n(Lorem ipsum dolor sit amet, consectetur adipisici elit,\
     sed eiusmod tempor incidunt ut labore et dolore magna aliqua.)

True, the only difference is that the solution mentioned in D23783#527600 can not have an arbitrary indent. I'm not an expert on JavaScript string concatenation capabilities, but it would be nice to have multi-line strings not break the indent of code before and after.

pino added a subscriber: pino.Sep 20 2019, 6:02 AM

BTW, since the "100 characters per line" was brought as single argument, let's actually check what the KDE Frameworks Coding Style actually say.
Quoting bits from https://techbase.kde.org/Policies/Frameworks_Coding_Style (emphases mine):

This style guide describes the preferred coding style in KDE Frameworks 5. It is no set of absolute rules. You can break them in particular situations if it improves the readability of the code or you have some other good reason for doing so.

So already at the beginning says that

  • they are not absolute rules
  • they can be broken in case there are good reasons to do so

Try to keep lines shorter than 100 characters, inserting line breaks as necessary.

So here says that the 100 characters is not an absolute limit either, but rather a recommendation.

What's more: these coding style guidelines are sure good when programming, however their impact is basically limited to the few people actually coding (plus people reading/debugging/etc it). Once the code is compiled, things like tabs vs spaces for indentation, 80/100/etc characters per line, snake_case vs CamlCase, and so on become absolutely moot. (Even in case of QML, as it is just "run").
OTOH, a string puzzle in the UI makes the text harder to understand, a string to translate split in smaller pieces with no context can be translated in non-sense way -- even colors, fonts, icons/images, sounds: everything done bad in the UI is a problem for all the users, not just developers (and the former category is way bigger than the latter).

So @romangg: please do not use non-absolute rules as absolute hammer to enforce poor localization, and think more about the users, thanks.

In T11721#201544, @pino wrote:

[...]
So @romangg: please do not use non-absolute rules as absolute hammer to enforce poor localization, and think more about the users, thanks.

I can tell you that I will enforce the 100 chars per line limit as an absolute rule in all code I maintain (which is KWin, KWayland KScreen, libkscreen) as long as none of my co-maintainers is against it. And no, you are not one of them. If I care or don't care about our users is irrelevant to this. But your rhetoric here is pointless since I already showed with accepting revision D24073 that functionality is more important to me. On the other side this can't mean the code style can just go to shit. You don't have to work with it on a daily basis, do you? This in particular means that next time you won't commit changes to one of my maintained project behind my back when where are still open questions to it. Open a review, then we can try to find short- and hopefully long-term solution together or at least kick-start the process.

But this discussion is off-topic to this task where we only try to find a solution for problematic i18n line splitting in QML code that conflicts with the Frameworks Coding Style recommendation. So if you have a problem with my decision reach out to me personally or take it directly to the community working group.

This is not off-topic, because the limit you are trying to enforce is not an absolute limit, as stated in the same code that you are quoting. It's not about "code style can just go to shit". If the QML does not allow to have this, well, either fix QML (at this point probably for Qt6), there in no much to do: an important feature like translations cannot be broken, as I pointed out on IRC, with my translator co-coordinator hat on.

If a limit cannot be enforced, then let's write down an explicit additional rule in the coding style about QML explicitly accepting longer lines. This is something that is up to the Frameworks maintainers. Should that rule become part of the coding style, there wouldn't be any reason for discussion.

And yes, this should go to the Community Working Group.

This is not off-topic, because the limit you are trying to enforce is not an absolute limit, as stated in the same code that you are quoting. It's not about "code style can just go to shit". If the QML does not allow to have this, well, either fix QML (at this point probably for Qt6), there in no much to do: an important feature like translations cannot be broken, as I pointed out on IRC, with my translator co-coordinator hat on.

As already said I interpret it as an absolute limit and will enforce it this way. Besides you're reiterating what I already said: An important feature like translations cannot be broken? As I said earlier: "functionality is more important to me." Either fix QML? Well, that's what this task is supposed to be about.

If a limit cannot be enforced, then let's write down an explicit additional rule in the coding style about QML explicitly accepting longer lines. This is something that is up to the Frameworks maintainers. Should that rule become part of the coding style, there wouldn't be any reason for discussion.

I'm sure you don't mean we should add a rule just because the translation tools are malfunctioning. You make the tools respect the rules not the other way around.

This is not off-topic, because the limit you are trying to enforce is not an absolute limit, as stated in the same code that you are quoting. It's not about "code style can just go to shit". If the QML does not allow to have this, well, either fix QML (at this point probably for Qt6), there in no much to do: an important feature like translations cannot be broken, as I pointed out on IRC, with my translator co-coordinator hat on.

As already said I interpret it as an absolute limit and will enforce it this way. Besides you're reiterating what I already said: An important feature like translations cannot be broken? As I said earlier: "functionality is more important to me." Either fix QML? Well, that's what this task is supposed to be about.

I find the fact that you want to give a stricter interpretation of a rule like this, which also affects other community members, a bit over the line with the powers of a maintainer. It should be part of a community discussion inside your project at least (Plasma), if not a more general one. I also personally find the fact that this is a "will" without adding a "if possible" a bit too strong; see below.

If a limit cannot be enforced, then let's write down an explicit additional rule in the coding style about QML explicitly accepting longer lines. This is something that is up to the Frameworks maintainers. Should that rule become part of the coding style, there wouldn't be any reason for discussion.

I'm sure you don't mean we should add a rule just because the translation tools are malfunctioning. You make the tools respect the rules not the other way around.

I'm saying that sometimes there are things which we don't have a direct influence on. The issue here is either in QML, which have a limited syntax which can't be easily fixed before Qt 6 without breaking the compatibility, or in the xgettext tool . Both tools are not part of what we (KDE) directly control. We may not have the resource to help fixing them, not in the medium or the long run. And even if they are fixed, it may take years to be able to use them. In that case, community rules, like physics, follow the reality, not the other way.

mart added a subscriber: mart.Sep 20 2019, 8:35 AM

as my personal 2 cents user facing strings are so important that they should exampted by that rule (or even the frameworks coding styles amended"
for QML, it's not easily changeable, and in general having javascriptis always kindof dangerous having newlines that aren't strictly necessary, as the interpreter can always think "let's try to put a ; in there and let's see what happens"
for qml, we should really have separate coding styles, that don't diverge from kdelibs where make sense, but the kdelibs ones really are c++ specific ones

romangg added a comment.EditedSep 20 2019, 8:39 AM

[...]
I find the fact that you want to give a stricter interpretation of a rule like this, which also affects other community members, a bit over the line with the powers of a maintainer. It should be part of a community discussion inside your project at least (Plasma), if not a more general one. I also personally find the fact that this is a "will" without adding a "if possible" a bit too strong; see below.

Just to reiterate when I say that I will enforce the rule it means that I will do this while keeping other priorities in check, in particular that "functionality is more important to me". What I don't want to see is commits breaking the rules without giving me notice, prior discussion and a path forward. That's why I reverted the original commit. With this task here there is now a room for discussion (although I would prefer if we only talk about a technical solution but maybe this discussion is necessary after all) and hopefully it becomes possible to find a path forward. Overall I would like to see the translations team and devs work better together on such questions, so I hope you don't see my engagement here as hostile to you guys. I respect your work and hope we can strengthen the ties. That's why I was also present at the translation BoF at Akademy (albeit not actively participating in the discussion).

I'm saying that sometimes there are things which we don't have a direct influence on. The issue here is either in QML, which have a limited syntax which can't be easily fixed before Qt 6 without breaking the compatibility, or in the xgettext tool . Both tools are not part of what we (KDE) directly control. We may not have the resource to help fixing them, not in the medium or the long run. And even if they are fixed, it may take years to be able to use them. In that case, community rules, like physics, follow the reality, not the other way.

If a solution would need allocation of considerable resources then of course we can decide to not do it at this point in time. But this doesn't mean the rule should not hold anymore in this case, it just means that currently it can't. Rules are about what should, not what is and we might find a solution later on to make the is the should. But before deciding this we should actually research if it could be possible already now. That's what this task is about.

In T11721#201554, @mart wrote:

as my personal 2 cents user facing strings are so important that they should exampted by that rule (or even the frameworks coding styles amended"
for QML, it's not easily changeable, and in general having javascriptis always kindof dangerous having newlines that aren't strictly necessary, as the interpreter can always think "let's try to put a ; in there and let's see what happens"
for qml, we should really have separate coding styles, that don't diverge from kdelibs where make sense, but the kdelibs ones really are c++ specific ones

If not via newlines could it be possible to make it work by combining strings with +? Like:

i18n("Lorem ipsum dolor sit amet, consectetur adipisici elit, " +
     "sed eiusmod tempor incidunt ut labore et dolore magna aliqua.")

Not sure why this doesn't work at the moment. Is it because of this xgettext tool?

If not via newlines could it be possible to make it work by combining strings with +? Like:

i18n("Lorem ipsum dolor sit amet, consectetur adipisici elit, " +
     "sed eiusmod tempor incidunt ut labore et dolore magna aliqua.")

Not sure why this doesn't work at the moment. Is it because of this xgettext tool?

There are two issues, I think:

  • xgettext not extracting the second string (which if I understand correctly is what @aspotashev reported)
  • even if that is fixed, the two strings would be still considered separate and extracted separately; there is no way to say that they are really the same string (otherwise weird things can happen if a variable is part of a concatenation). The string should really be a unique string.

That makes me think that the problem lies more on the QML side not accepting multi-line strings like:

i18n("foo"
     "bar")

as C++ does.

Just to reiterate when I say that I will enforce the rule it means that I will do this while keeping other priorities in check, in particular that "functionality is more important to me". What I don't want to see is commits breaking the rules without giving me notice, prior discussion and a path forward. That's why I reverted the original commit.

Thanks for the clarification but let me reiterate that from my point of view, and probably not only my point of view, the commit which you reverted did not break any rule. It was a needed bug-fix after a breakage, which did not violate any rule (only a personal interpretation which is debatable), and translators have always been given an implicit "yes, feel free to fix those strings" for this kind of fixes which are under our domain. So everything happened under the existing community rules.

As those rules have not lead so far to breakages, but improved the general quality of the software, I would really be surprised to see this possibility (directly fixing broken strings) going away.

i18n("Lorem ipsum dolor sit amet, consectetur adipisici elit, " +
     "sed eiusmod tempor incidunt ut labore et dolore magna aliqua.")
  • even if that is fixed, the two strings would be still considered separate and extracted separately; there is no way to say that they are really the same string (otherwise weird things can happen if a variable is part of a concatenation). The string should really be a unique string.

I think it's not that hard to fix xgettext so that it considers everything inside a i18n() call like a single string and accounts for concatenations. After xgettext is patched, we can use this patched version on our Scripty server without waiting for xgettext's official release.

However there's probably nobody who will do this shortly, the i18n team manpower is close to zero.

aacid added a subscriber: aacid.Sep 23 2019, 9:27 PM
the i18n team manpower is close to zero.

I don't know who are you speaking for, but honestly that's not really helpful.

aacid added a comment.Sep 23 2019, 9:36 PM

FWIW you can do

    text: i18n("bla \
hola")

and it will work, but note the hola needs to start at column 0, probably your strict code style doesn't allow for that either.

And that's the solution, your codestyle needs to stop being so strict you can't write correct code with it.

aacid added a comment.Sep 23 2019, 9:37 PM

That makes me think that the problem lies more on the QML side not accepting multi-line strings like:

It's not really QML, it's more a JavaScript being not really good at multiline strings (and QML using javascript)

pino added a comment.Sep 24 2019, 4:15 AM

! In T11721#201558, @romangg wrote:
If not via newlines could it be possible to make it work by combining strings with +? Like:

i18n("Lorem ipsum dolor sit amet, consectetur adipisici elit, " +
     "sed eiusmod tempor incidunt ut labore et dolore magna aliqua.")

Not sure why this doesn't work at the moment. Is it because of this xgettext tool?

Already asked by you, and answered: https://phabricator.kde.org/D23783#527730

! In T11721#202172, @aacid wrote:
FWIW you can do

    text: i18n("bla \
hola")

and it will work, but note the hola needs to start at column 0, probably your strict code style doesn't allow for that either.

This was suggested: https://phabricator.kde.org/D23783#527600
... and not accepted by the maintainer either because "ugly".

In T11721#202182, @pino wrote:

! In T11721#202172, @aacid wrote:
FWIW you can do

    text: i18n("bla \
hola")

and it will work, but note the hola needs to start at column 0, probably your strict code style doesn't allow for that either.

This was suggested: https://phabricator.kde.org/D23783#527600
... and not accepted by the maintainer either because "ugly".

The reason I gave in what you cited was what @aacid already noted as well: that the indent can't be chosen arbitrary. To expand on that: since the strings are normally embedded into other controls, often in declarative QML having several levels of indent, I would like to avoid this.

I find it disingenuous to leave me mentioning the indent out from what you quoted making my opinion look overly subjective. If we want to work on this issue, we should be fair to each other.

There is also another reason for not using these multi-line strings with \: they are not recommended in JavaScript. Either + should be used or back-tick multiline strings (these again don't respect the indent though).

In overall Plasma, we typically simply use longer lines. It's not been an issue.
A good editor (such as kate) will wrap in a way that shows the indent properly anyway.