Consolidate {branches/stable,trunk}/l10n-{kde4,kf5}/scripts into a git repository
Open, Normal, Public

Description

Import with history, each SVN branch should be a separate branch, with trunk/l10n-kf5 as master.

ltoscano created this task. Dec 2 2016, 11:51 AM

... and also a few other branches, so all scripts/ directories under:

  • trunk/l10n-kf5
  • branches/stable/l10n-kf5
  • branches/stable/l10n-kf5-plasma-lts
  • trunk/l10n-kde4
  • branches/stable/l10n-kde4
  • trunk/l10n-kde3
  • branches/stable/l10n

nalvarez claimed this task. Aug 16 2018, 9:33 AM

@nalvarez hi! Do you have any news about this task? Can we help?

My top-level plan is to

  1. Make minor changes in SVN to make the move to Git seamless. D25333 is part of it.
  2. Lock */scripts directories in SVN to avoid divergence of scripts in SVN from scripts in Git.
  3. Import the 5 branches (*/l10n-kf5*/scripts and */l10n-kde4/scripts) to Git as sibling directories:
    • branches_stable_l10n-kde4
    • branches_stable_l10n-kf5
    • branches_stable_l10n-kf5-plasma-lts
    • trunk_l10n-kde4
    • trunk_l10n-kf5
  4. Update trunk/kde-common/makemessages to work with the new Git repo scripty.git.
  5. Remove files related to scripty (aka update_translations) from SVN.
  6. Then we might want to hide some files from Git that are unrelated to scripty (aka update_translations), such as lokalize/*, conversion/*, autogen.sh, etc. We could either:
    • remove them from Git and keep them in SVN, or
    • move them into a separate Git repo, or
    • move them into a subdirectory of scripty.git.
  7. Unlock */scripts directories in SVN.

There is no reason for using multiple branches in scripty.git, it will only make usage and development harder:

  1. You won't be able to submit for review a patch involving multiple branches or commit the patch atomically.
  2. The main scripty l10n daemon will need to either switch between branches or make multiple Git clones (working copies).
  3. During refactoring and unifying of common code, it would be hard to see which files are common and which still need to be unified. On the other hand with one single branch you can gradually move common scripts into common/ thus making the refactoring process very easy to observe.

  • trunk/l10n-kde3
  • branches/stable/l10n

Why do you think we need these over 10 year old branches in Git?

Even if we move them to Git, the next reasonable step would be to remove them from Git (with git rm) for being unused.

pino added a subscriber: pino. Nov 16 2019, 5:22 PM

  1. Import the 5 branches (*/l10n-kf5*/scripts and */l10n-kde4/scripts) to Git as sibling directories:
    • branches_stable_l10n-kde4
    • branches_stable_l10n-kf5
    • branches_stable_l10n-kf5-plasma-lts
    • trunk_l10n-kde4
    • trunk_l10n-kf5

Ugh no, please import them as branches, as they actually are.

There is no reason for using multiple branches in scripty.git, it will only make usage and development harder:

  1. You won't be able to submit for review a patch involving multiple branches or commit the patch atomically.

Which multiple branches are we talking about? It is very rare to do changes that must be done on all the branches, and in that case you can send a new review for the specific branch (in case a simple cherry-pick does not work, or a different patch is needed).
Usually all the improvements are done for trunk/l10n-kf5 these days, and stable branches are kept in "maintenance mode" mostly.

In any case, this would be only a short- to medium-term solution; see below for the reason why.

  1. The main scripty l10n daemon will need to either switch between branches or make multiple Git clones (working copies).

scripty does that already for all the repositories it handles, and it will do that also for scripty.git. The reason is that each l10n branch (trunk/l10n-kf5, stable/l10n-kf5, etc) must have its own copies of the repositories to avoid clashing with other branches of scripty.

Also, I don't think scripty.git will be a big repository anyway.

  1. During refactoring and unifying of common code, it would be hard to see which files are common and which still need to be unified. On the other hand with one single branch you can gradually move common scripts into common/ thus making the refactoring process very easy to observe.

As mentioned above, having each current l10n branch in its own branch of scripty.git will make it easier to switch scripty to it while continuing to use the scripts of each branch.
The long term goal is to make sure that the content of the trunk/l10n-kf5 branch is usable also for other branches.

  • trunk/l10n-kde3
  • branches/stable/l10n

Why do you think we need these over 10 year old branches in Git?

For the same reason we have 20-year-old branches in applications, i.e. history preservation and searching?

Even if we move them to Git, the next reasonable step would be to remove them from Git (with git rm) for being unused.

Not if we keep them as old branches, as they ought to be.

As I wrote initially:

  • separate branches, as it's easier to adapt the current code;
  • later we will transition all kf5 branches to use master, in a non disruptive way.

That's it.

And of course the kde4 branches won't need any update: one is already gone, the other will be soon.

You did not convince me, sorry.

In T4803#208465, @pino wrote:
  1. Import the 5 branches (*/l10n-kf5*/scripts and */l10n-kde4/scripts) to Git as sibling directories:
    • branches_stable_l10n-kde4

Ugh no, please import them as branches, as they actually are.

As a counter-argument to this statement, I'm going to explain why */l10n-kf5*/ and */l10n-kde4/ are not real branches. Even though the various scripts/ directories are stored under branches/stable/ and trunk/ in SVN, they are not used as branches anymore.

I believe branches/stable/l10n-* are currently stored under branches/stable/ for historical reasons only. Once upon a time you could branch off a stable version of KDE or KDE SC by copying the whole l10n directory in SVN along with the scripts/ directory, see for example https://websvn.kde.org/?view=revision&revision=1239283
Today we abuse SVN branches while the scripts would normally be stored in one directory under SVN trunk/

SCM branches and tags are commonly used to store a self-sufficient version of software. This way it's easy to set up CI/CD for automatic testing and deployment of master/development branch, stable branches and feature branches.

At the moment individual "branches", for example trunk/l10n-kf5, are not self-sufficient because you cannot run trunk/kde-common/makemessages on one single branch. No reasonable CI/CD process would expect that you switch branches.

When the source code is scattered across several Git branches, you also cannot use feature branches since Git does not support nested branches.

You may say that all the above problems are not important because we are talking about a temporary solution and there will be only one branch in the end. However I had to point out that what you called a branch is not a branch, thus your reasoning "as they actually are [branches]" is void.

There is no reason for using multiple branches in scripty.git, it will only make usage and development harder:

  1. You won't be able to submit for review a patch involving multiple branches or commit the patch atomically.

Which multiple branches are we talking about? It is very rare to do changes that must be done on all the branches, and in that case you can send a new review for the specific branch (in case a simple cherry-pick does not work, or a different patch is needed).
Usually all the improvements are done for trunk/l10n-kf5 these days, and stable branches are kept in "maintenance mode" mostly.

It's almost the same problem as with feature branches. Of course I can create 5 branches per feature if my feature requires changes in all 5 branches, but that would be a mess and lots of extra work for no reason.

git-cherry-pick would help most of the time, however that also adds to the amount of extra work.

Now imagine a real-world development process:

  • you start working on a feature
  • test it locally (by running makemessages against all branches)
  • fix bugs
  • test again
  • ... and so on

In this case

  • you cannot have uncommitted changes (which is common and desirable during development) because these changes would block branch switches
  • you have to use cherry-pick all the time.

About frequency of improvements: I guess changes in all branches are rare because the barrier for making them is too high. In other words, we are slowing down development by using SVN and multiple branches.

  1. During refactoring and unifying of common code, it would be hard to see which files are common and which still need to be unified. On the other hand with one single branch you can gradually move common scripts into common/ thus making the refactoring process very easy to observe.

As mentioned above, having each current l10n branch in its own branch of scripty.git will make it easier to switch scripty to it while continuing to use the scripts of each branch.
The long term goal is to make sure that the content of the trunk/l10n-kf5 branch is usable also for other branches.

You didn't address my concern about refactoring (aka unifying the code) after moving to Git.

I have no good idea how to tackle the task of refactoring code across multiple Git branches. One bad idea is to copy all files from all Git branches into one branch and work there - however it would be basically the same as if we imported from SVN into one Git branch like I suggested.

How would you approach refactoring?

As I wrote initially:

  • separate branches, as it's easier to adapt the current code;

It's almost adapted already.

  • Support work in a different path: D25333 + needs a few more changes regarding subdirs file
  • Change of paths in trunk/kde-common/makemessages - easy
  • Add git clone/pull for scripty.git in trunk/kde-common/makemessages - easy

Maybe I am missing some other pitfalls?

  • later we will transition all kf5 branches to use master, in a non disruptive way.

Someone will need to do this. And that "someone" will be more happy to work in one branch, I guess. I'm not saying that would be me, but common sense suggests that working in one branch in the first place is easier than working in 5 branches.

pino added a comment. Nov 16 2019, 10:34 PM
In T4803#208465, @pino wrote:
  1. Import the 5 branches (*/l10n-kf5*/scripts and */l10n-kde4/scripts) to Git as sibling directories:
    • branches_stable_l10n-kde4

Ugh no, please import them as branches, as they actually are.

As a counter-argument to this statement, I'm going to explain why */l10n-kf5*/ and */l10n-kde4/ are not real branches. Even though the various scripts/ directories are stored under branches/stable/ and trunk/ in SVN, they are not used as branches anymore.

They are branches: each l10n/l10n-kde4/l10n-kf5 is a branch of translations.

I believe branches/stable/l10n-* are currently stored under branches/stable/ for historical reasons only. Once upon a time you could branch off a stable version of KDE or KDE SC by copying the whole l10n directory in SVN along with the scripts/ directory, see for example https://websvn.kde.org/?view=revision&revision=1239283
Today we abuse SVN branches while the scripts would normally be stored in one directory under SVN trunk/

True, each does not represent a full branch of l10n, although they are logically still branches of the same thing: translations.

Also, today this is not done anymore because the stable l10n branches do not represent a full (or even partial) branch of "trunk l10n", but a mix of branches (SC/Applications, extragear apps, etc), and it is easier to copy translation directories rather than copying l10n as a whole and reconstructing it. This is a different matter though.

SCM branches and tags are commonly used to store a self-sufficient version of software.

This is just a convention, just like the way our l10n branches are set up.

At the moment individual "branches", for example trunk/l10n-kf5, are not self-sufficient because you cannot run trunk/kde-common/makemessages on one single branch. No reasonable CI/CD process would expect that you switch branches.

I am not sure what you are talking about here. makemessages acts on the various l10n directories of the l10n branches, checking them out if necessary.
And no, you are wrong here: you can perfectly well check out trunk/kde-common, configure makemessages to run on the branches you want, and run it, and it will work. I did it in the past (only once, mostly because I did not have the disk space to do it often).

You may say that all the above problems are not important because we are talking about a temporary solution and there will be only one branch in the end. However I had to point out that what you called a branch is not a branch, thus your reasoning "as they actually are [branches]" is void.

It is still a branch, even if you don't like to call it that way.

There is no reason for using multiple branches in scripty.git, it will only make usage and development harder:

  1. You won't be able to submit for review a patch involving multiple branches or commit the patch atomically.

Which multiple branches are we talking about? It is very rare to do changes that must be done on all the branches, and in that case you can send a new review for the specific branch (in case a simple cherry-pick does not work, or a different patch is needed).
Usually all the improvements are done for trunk/l10n-kf5 these days, and stable branches are kept in "maintenance mode" mostly.

It's almost the same problem as with feature branches. Of course I can create 5 branches per feature if my feature requires changes in all 5 branches, but that would be a mess and lots of extra work for no reason.

We are talking about having 4 branches actually still in use today:

  • trunk/l10n-kf5 (the most active one)
  • branches/stable/l10n-kf5 (stable, however still with changes)
  • branches/stable/l10n-kf5-lts (stable, basically frozen once a Plasma LTS is created/forked)
  • branches/stable/l10n-kde4 (stable, definitely frozen)

New features would definitely go only into the first, and be backported to the second and third if needed. Definitely not to the fourth.

About frequency of improvements: I guess changes in all branches are rare because the barrier for making them is too high. In other words, we are slowing down development by using SVN and multiple branches.

This is all theoretical talk, really. The number of branches has never been a top reason why people do not contribute to scripty. I can name better reasons:

  • lack of comments
  • interwoven mix of scripts, with various ways of interacting with each other (with C++ sources even)
  • extremely difficult to set up and test
  • very small intersection between the set of programmers and the set of translators
  • very very few people with knowledge of scripty (mostly also because of the reasons above)

And actually, talking from my own experience: most of the changes that I did in the past to the scripts directory and that required backporting in more branches were basically the same, and would have been solvable by a cherry-pick.

  1. During refactoring and unifying of common code, it would be hard to see which files are common and which still need to be unified. On the other hand with one single branch you can gradually move common scripts into common/ thus making the refactoring process very easy to observe.

As mentioned above, having each current l10n branch in its own branch of scripty.git will make it easier to switch scripty to it while continuing to use the scripts of each branch.
The long term goal is to make sure that the content of the trunk/l10n-kf5 branch is usable also for other branches.

You didn't address my concern about refactoring (aka unifying the code) after moving to Git.

Just like you want to diff directories, just diff branches instead. Also, the content of the scripts directories is mostly the same, and in general the one in trunk/l10n-kf5 is the reference to follow, and thus the one to make general enough to be used with almost no changes for the other branches.
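
To make the "diff branches" idea concrete, here is a minimal sketch, not part of scripty. It assumes you have captured the output of `git ls-tree -r <branch>` for two branches of the converted repo and want to see which files are already identical and which still need unifying. The file names in the demo are hypothetical.

```python
def parse_ls_tree(output):
    """Parse `git ls-tree -r <branch>` output into {path: blob_id}.

    Each line has the form "<mode> <type> <object>\t<path>".
    """
    entries = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        meta, path = line.split("\t", 1)
        _mode, obj_type, obj_id = meta.split()
        if obj_type == "blob":
            entries[path] = obj_id
    return entries


def classify(reference, other):
    """Compare two {path: blob_id} maps, treating `reference` as the branch to follow."""
    ref_paths, other_paths = set(reference), set(other)
    shared = ref_paths & other_paths
    return {
        "identical": sorted(p for p in shared if reference[p] == other[p]),
        "differing": sorted(p for p in shared if reference[p] != other[p]),
        "only_in_reference": sorted(ref_paths - other_paths),
        "only_in_other": sorted(other_paths - ref_paths),
    }


if __name__ == "__main__":
    # Hypothetical listings; real ones would come from `git ls-tree -r`.
    master = parse_ls_tree(
        "100755 blob " + "a" * 40 + "\tscript_a.sh\n"
        "100755 blob " + "b" * 40 + "\tscript_b.sh\n"
    )
    stable = parse_ls_tree(
        "100755 blob " + "a" * 40 + "\tscript_a.sh\n"
        "100755 blob " + "c" * 40 + "\tscript_b.sh\n"
    )
    print(classify(master, stable))
```

Files listed under "differing" are the ones where the trunk/l10n-kf5 version would have to be made general enough for the other branch.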

I have no good idea how to tackle the task of refactoring code across multiple Git branches. One bad idea is to copy all files from all Git branches into one branch and work there - however it would be basically the same as if we imported from SVN into one Git branch like I suggested.

See above.

  • Support work in a different path: D25333 + needs a few more changes regarding subdirs file
  • Change of paths in trunk/kde-common/makemessages - easy

These changes are too invasive, and not needed when $PWD is the root of each l10n branch (as makemessages already does).

  • Add git clone/pull for scripty.git in trunk/kde-common/makemessages - easy

Once the scripts directories are in scripty.git, the changes in makemessages for each l10n branch will be:

  • clone scripty.git under a different directory (each its own copy to avoid conflicts) for the current l10n branch, and switch to the branch of scripty.git matching the l10n branch
  • symlink that clone as scripts subdirectory in that l10n branch
  • run as done now
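
The steps above could be sketched roughly as follows. This is only an illustration, not the actual makemessages code: the repository URL is a placeholder, the branch naming (master for trunk/l10n-kf5, stable_l10n-kf5 and so on for the others) is an assumption based on the conversion attempts, and the directory layout is hypothetical.

```python
import os
import subprocess

# Placeholder URL, not the real location of scripty.git.
SCRIPTY_URL = "https://example.invalid/scripty.git"


def scripty_branch_for(l10n_branch):
    """Map an l10n SVN path to a scripty.git branch name (assumed naming)."""
    if l10n_branch == "trunk/l10n-kf5":
        return "master"
    prefix = "branches/"
    if l10n_branch.startswith(prefix):
        l10n_branch = l10n_branch[len(prefix):]
    return l10n_branch.replace("/", "_")


def plan_scripts_setup(l10n_branch, work_root, clones_root):
    """Describe what to do for one l10n branch, without touching the disk."""
    branch = scripty_branch_for(l10n_branch)
    # Each l10n branch gets its own clone to avoid conflicts.
    clone_dir = os.path.join(clones_root, "scripty-" + branch)
    link_path = os.path.join(work_root, l10n_branch, "scripts")
    commands = [["git", "clone", "--branch", branch, SCRIPTY_URL, clone_dir]]
    return {
        "branch": branch,
        "clone_dir": clone_dir,
        "link_path": link_path,
        "commands": commands,
    }


def apply_plan(plan):
    """Clone the matching scripty.git branch and symlink it as scripts/."""
    for argv in plan["commands"]:
        subprocess.run(argv, check=True)
    os.symlink(plan["clone_dir"], plan["link_path"])
```

Keeping the plan separate from its execution makes it possible to check the branch mapping and paths before anything is cloned; after apply_plan, makemessages would run as it does now.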

  • later we will transition all kf5 branches to use master, in a non disruptive way.

Someone will need to do this. And that "someone" will be more happy to work in one branch, I guess. I'm not saying that would be me, but common sense suggests that working in one branch in the first place is easier than working in 5 branches.

Again, you do not need to work in 5 branches, but just in 1 (the equivalent of trunk/l10n-kf5) to make it generic and usable for all the others.

As I wrote initially:

  • separate branches, as it's easier to adapt the current code;

It's almost adapted already.

  • Support work in a different path: D25333 + needs a few more changes regarding subdirs file
  • Change of paths in trunk/kde-common/makemessages - easy
  • Add git clone/pull for scripty.git in trunk/kde-common/makemessages - easy

    Maybe I am missing some other pitfalls?

Each branch will stay as it is.

  • later we will transition all kf5 branches to use master, in a non disruptive way.

Someone will need to do this. And that "someone" will be more happy to work in one branch, I guess. I'm not saying that would be me, but common sense suggests that working in one branch in the first place is easier than working in 5 branches.

I guess we don't share the same point of view. As the one working on it, I want to keep the history and the branches separate, and converge them later.
There are not 5 branches. There will be at most 3 branches, and the source for the translations needs to be adapted anyway.

Anyway, the real-world process is not even the one you are describing. The point here is not to fix this set of scripts: it's to throw it away in the long run and replace it with something more flexible. Hence a minimal amount of work should go into keeping these scripts alive. Each branch can live on its own until it's unified.

How can they be unified without copying the scripts? The big branch-specific dependency is the definition of which branch each repository should be checked out from, and whether it should be checked out at all. That information should be taken from somewhere else, and this is another change that should be introduced one branch at a time.

Now, please, let's keep this ticket as it is: we need to import the full history *with branches* anyway.

Now, please, let's keep this ticket as it is: we need to import the full history *with branches* anyway.

OK, let's kickstart it this way with branches.

@nalvarez, I guess we need your help to convert SVN dirs to Git, like you did many times already for other projects. The list of your scratch repos at https://cgit.kde.org/scratch/nalvarez indicates you have quite a lot of experience :)

Please see the results of my first attempt to use svn2git: https://cgit.kde.org/scratch/aspotashev/converted-scripty-v1.git/

svn2git rules file:

I'm still not satisfied with the resulting Git repo, "known bugs" are:

  • The history of master branch stops around 2007-03-05. However we could probably track the history of the most important branch at the time in master: trunk/kde-i18n, then trunk/KDE/kde-i18n, then trunk/l10n-kde4 and finally trunk/l10n-kf5.
  • I forgot to include some old branches:
  • Some branches are not connected to their parents, e.g. stable_l10n-kf5-plasma-lts should be branched from stable_l10n-kf5
  • I tried to incorporate the scripts from kde-common/, at least makemessages, under prefix kde-common/ in branch master. However this led to also pulling unnecessary kde-common/admin/, kde-common/accounts and others.
  • pology/ directory should not be in this repo, we already have pology.git

Please comment if these bugs should be fixed and how (no technical details necessary at this point, I'm asking for requirements).

pino added a comment. Dec 8 2019, 4:42 PM

Nice start.

Please comment if these bugs should be fixed and how (no technical details necessary at this point, I'm asking for requirements).

Yes, they ought to.

  • I tried to incorporate the scripts from kde-common/, at least makemessages, under prefix kde-common/ in branch master. However this led to also pulling unnecessary kde-common/admin/, kde-common/accounts and others.

Leave kde-common out, as it does not belong to this. If needed, that will be converted separately (it has a different history).

In T4803#213133, @pino wrote:
  • I tried to incorporate the scripts from kde-common/, at least makemessages, under prefix kde-common/ in branch master. However this led to also pulling unnecessary kde-common/admin/, kde-common/accounts and others.

Leave kde-common out, as it does not belong to this. If needed, that will be converted separately (it has a different history).

The scripts in kde-common (https://websvn.kde.org/trunk/kde-common/makemessages?revision=1547145&view=markup and its predecessors) are related to l10n*/scripts because makemessages directly calls Bash scripts from l10n*/scripts and contains other business logic directly tied to nightly updates of translations in SVN. However I would be happy if everyone agrees we can drop makemessages: it would reduce the amount of work. I just don't know what the right balance is between a precise SVN->Git migration and simplicity.

Please comment if these bugs should be fixed and how (no technical details necessary at this point, I'm asking for requirements).

Yes, they ought to.

I need more details on what to do with the master branch. There may be at least two approaches:

  1. Only map trunk/l10n-kf5/scripts to the master branch. This is the literal interpretation of what Luigi suggested in this task's description: "each SVN branch should be a separate branch, with trunk/l10n-kf5 as master."
  2. Fill the master branch with commits into trunk/kde-i18n, until it was moved to trunk/KDE/kde-i18n. Also add to the master branch the commits to trunk/KDE/kde-i18n, until the path trunk/l10n-kde4 was created. Also add to the master branch the commits to trunk/l10n-kde4, until trunk/l10n-kf5 was added. And for the most recent commits from when trunk/l10n-kf5 already existed, add these to the master branch.

Which of these approaches sounds like the best one?

The first approach is already implemented. I believe it won't be hard to search through old history even if we stick with the first approach.

Another question to everyone: do you think we should add an SVN commit reference to all commit messages in Git?

  • svn2git's option --add-metadata results in a commit message footer like this:
commit fe204a82463bc79fe3580f485543c7f84d5090fa
Author: Tobias Burnus <burnus@gmx.de>
Date:   Fri Jul 16 18:59:46 1999 +0000

    make use of experimental comments in the German statistics
    
    svn path=/trunk/kde-i18n/; revision=25678
  • svn2git's option --add-metadata-notes adds something called Git object notes, visualized like this:
commit 774718ed340b384e2ae67eb5c3857c1085f5fe8b
Author: Tobias Burnus <burnus@gmx.de>
Date:   Fri Jul 16 18:34:53 1999 +0000

    add support for comments

Notes:
    svn path=/trunk/kde-i18n/; revision=25677

(where the note seems to be stored separately from the Git commit object)
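
Either way, the SVN reference stays easy to read back by machine. As a small sketch (not something svn2git provides), assuming the footer format shown above, the SVN path and revision can be recovered from a commit message like this:

```python
import re

# Matches the footer line written by svn2git's --add-metadata, as quoted above.
# Leading spaces are allowed so that indented `git log` output also matches.
SVN_FOOTER = re.compile(
    r"^[ \t]*svn path=(?P<path>.*?); revision=(?P<rev>\d+)[ \t]*$",
    re.MULTILINE,
)


def svn_origin(commit_message):
    """Return (svn_path, revision) from the last metadata line, or None."""
    matches = list(SVN_FOOTER.finditer(commit_message))
    if not matches:
        return None
    last = matches[-1]
    return last.group("path"), int(last.group("rev"))


if __name__ == "__main__":
    message = (
        "make use of experimental comments in the German statistics\n"
        "\n"
        "svn path=/trunk/kde-i18n/; revision=25678\n"
    )
    print(svn_origin(message))
```

The same function works on the text of a Git note, since --add-metadata-notes stores the same line, only outside the commit object.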

yes, that's what our other repos have, for example Okular says

commit fb78a4c1b89c93aefcb72e56d2fe291ed768d908
Author: Albert Astals Cid <tsdgeos@terra.es>
Date: Sat May 28 17:13:43 2011 +0000

correctly update pos (shouldn't matter but let's do it correctly)

svn path=/trunk/KDE/kdegraphics/okular/; revision=1233921

So I'd go for the first, to be consistent with the rest of our repos

Please see the results of my second attempt to use svn2git: https://github.com/aspotashev/converted-scripty-v2 (can't push to KDE Git hosting because hooks decline it)
I used these scripts: https://phabricator.kde.org/D25929

I'm now happy with the resulting Git repo. It does not contain files from kde-common, as suggested by @pino.

The migrated repo does not include scripts from */x-test/internal/, is that OK?

aacid added a comment. Dec 29 2019, 4:36 PM

We need those two scripts

aacid added a comment. Dec 29 2019, 4:37 PM

ah you mean that they are not part of "scripty" itself but of x-test, sure i guess that's fine (at least for now)

Last version: https://invent.kde.org/nalvarez/l10n-scripts-conversion

Before actually doing the move, I guess you need to adapt scripty to keep a git clone of the scripts, so it will need some coordination. And you should also decide if you want to do all this before or after the migration to GitLab.

I just noticed l10n-support: https://websvn.kde.org/trunk/l10n-support/scripts/

It's not in the git repo currently. Is it needed too? What branch should it go to?

aacid added a comment. Edited May 3 2020, 1:26 PM

I just noticed l10n-support: https://websvn.kde.org/trunk/l10n-support/scripts/

Ignore that, scripty doesn't use it

aacid added a comment. Jun 13 2020, 9:43 AM

Given the current state of instability we have due to the GitLab migration and the fallout from the different structure, let's give ourselves a few weeks to try to stabilize things and then come back to this?

aacid added a comment. Fri, Jul 3, 9:11 PM

I feel like it would be a good moment to pick this up again, @ltoscano what do you think?

@nalvarez I see that https://invent.kde.org/nalvarez/l10n-scripts-conversion has "recent" commits, have you been keeping it up to date?

Right.

So what's missing apart from rechecking the history and adapting makemessages to check out the various branches in order?

I have been updating the conversion every few days when I remember, but it's not automated. I just updated it again.