Arcanist doesn't apply patches correctly in certain circumstances.
Open, Needs TriagePublic

Description

Hi!
I have the vague feeling something is wrong with the git server(s) lately.
For example I'm experiencing unexpected merge necessities. see D12028.

It looks like there's a huge delay between anongit and cgit.
E.g. D12165 landed Thu, Apr 12, 23:19 it immediately appeared on cgit, but wasn't pulled until ~ 9:00 CETS today. Apparently it was not available at 5:00 see D12165.

Maybe related, but I'm experiencing this for much longer:

$ git co master
Ihr Branch ist auf demselben Stand wie 'origin/master'. (German for branch is up-to-date)

Nevertheless sometimes git pull does pull something.

Restricted Application added a subscriber: sysadmin. · View Herald TranscriptApr 13 2018, 10:09 AM
michaelh updated the task description. (Show Details)Apr 13 2018, 10:13 AM
bruns added a subscriber: bruns.Apr 13 2018, 8:08 PM

This apparently is a genuine phabricator issue.

arc uses the base tree hash which has been transferred when creating the revision. Now, if there was a change to the base tree (e.g. you reordered some patches locally, or ammended the commit message), the tree hash does no longer exist.

Next it assumes this is because some commits of the stack were not applied to your tree. It fetches the parent revisions from the stack, and then fails miserably when trying to apply them:

  • it does not check if the revision has been closed (strong hint it has already been applied).
  • it does not fetch the commit of the revision to check it has been applied

If I hack my /usr/share/phabricator/arcanist/src/workflow/ArcanistPatchWorkflow.php to not apply any revisions in state closed, it only tries to apply the remaining ones (e.g. arc patch D12044 only fetches D11745, D11753 and D12044).

You can workaround this by using arc patch --skip-dependencies ..., but then you have to apply each revision of the stack yourself.

bcooksley added a subscriber: bcooksley.

Thanks for raising that issue with upstream. Let's see what they have to say in response to that.
Given what you've described it is something they should fix.

rkflx added a subscriber: rkflx.Apr 14 2018, 7:23 AM
In T8506#138206, @bruns wrote:

A closed revision is a good hint the patch has been applied to some branch, but not necessarily to the branch you are currently targeting. I don't think Arcanist should make too many assumptions about your local workflow, so I doubt upstream will implement that heuristic (no upstream reply to your question as of now, too).


Nevertheless, in D12458 we experienced something similar today, which AFAIK has the same root cause. Let me share what I found in my analysis:

When you create a "dependent revision", the baseRevision field is set to the commit sha of the local parent commit, as the parent does not yet have a sha in the upstream repo. This means for arc patch the new arcpatch- branch is not attached to the "correct" parent, but to the tip of the branch.

With that in mind, here is how applying a revision (let's call it "R") goes step-by-step and ultimately fails in case its parent (let's call it "P") has already been landed:

  1. Get called with arc patch R.
  2. Look at baseRevision, see that it's not present on local branch, therefore create empty arcpatch-R at tip of branch.
  3. Look at dependent revisions, see P, call arc patch P.
  4. New arcpatch-P is created, which works just fine because it isn't created at the tip of the branch, but at the parent revision of P itself, which actually has a commit sha in the upstream repo, as P itself does not depend on another revision and was branched directly from an upstream commit.
  5. arcpatch-P is being cherry-picked into arcpatch-R, because the ultimate goal is to have both P and R on the final arcpatch-R branch.
  6. The last step fails, because P already landed, i.e. P is already present on the local tip arcpatch-R was branched from, and Git does not allow to perform "empty" cherry-picks, unless git commit --allow-empty is specified (which Arcanist does not at the moment).
  7. Workflow stops due to error, R itself has not even been fetched at this point, let alone did Arcanist try to apply the patch (which would have worked just fine, as can be demonstrated with arc patch R --skip-dependencies).

There are two possible ways I could imagine upstream solving this:

  • Add --allow-empty flag to Arcanist's workflow.
  • Change the workflow in such a way that for revisions which have dependencies the non-existing baseRevision is ignored, and instead the baseRevision of the parent (or rather the merge-base of all leaves in the dependency tree?) is used instead.

The latter way is harder, but more correct, as the baseRevision is there for a reason: To be able to specify a commit in the repo the patch is meant for, isolating the patch author and reviewer from future changes. IOW this prevents the "Cannot apply your patch, please rebase it" we see so often on Phabricator when people use the web uploader (which does not set the baseRevision) compared to when people use Arcanist to submit the patch. (Of course rebasing needs to be done on landing time, but that's not the point here.)

Obviously I'm no Arcanist expert and probably this reasoning is all bogus, but I'd be glad if people could confirm or refute my thinking, so maybe @bcooksley could then bring it to the attention of upstream Phabricator.


@michaelh Would you mind renaming this task, e.g. to "Improve arc patch for when dependent revisions have already been landed"?

So what Arcanist really needs to do in this case is fallback to using the remote branch name (which should be available to it) when baseRevision isn't available in the local repository?

rkflx added a comment.EditedApr 25 2018, 10:26 AM

So what Arcanist really needs to do in this case is fallback to using the remote branch name (which should be available to it) when baseRevision isn't available in the local repository?

AFAIK the remote branch name is only availabe if you used arc feature or manually set the upstream tracking branch with git (e.g. with --set-upstream-to= or git branch --track). But even then that's not ideal, because as I wrote typically you don't want to attach to the tip, but to a specific commit available on the remote branch.

(Edit: It's even more complex: For dependent patches the "upstream" branch will be the local branch of the depending parent branch, i.e. nothing which has some equivalent "remote" branch.)

For P this works just fine. What I'd like to have is that when I arc patch R and P is a parent dependency of R, Arcanist will branch arcpatch-R from the baseRevision of P, instead of second-guessing some commit on the tip of some branch.


BTW, this would also get rid of this annoying message when reviewing dependent patches:

This diff is against commit 0500f8c7a2699ea7f8fdb46419a580c24626e4f2, but
the commit is nowhere in the working copy. Try to apply it against the
current working copy state? (8e1b392d0df66fbb93238e67909d3e66610f0502)
[Y/n]

In this case, the Diff was for stable, but it ended up being branched from my local branch. If it wasn't a dependent revision, automatically switching branches would have worked just fine, but here it failed because of the issue.

rkflx added a comment.EditedApr 25 2018, 10:49 AM

To make this a bit more tangible, here is an example: In both cases I'm starting out in resize-dialog-overhaul, which is a local dev branch. I'm trying to review D12509, which depends on D12465.

Case 1: arc patch D12509, i.e. fetch parent plus child. This will end up on my local branch (after answering with Y to the warning message from above), i.e. not what I want:

Case 2: arc patch D12465, i.e. fetch only the parent. This will automatically (no warning message) switch to the stable branch, just like the author intended (note that in this case baseRevision is equivalent to the tip of the stable branch, but that's not always the case for older patches!):

The reason case 1 fails is because of the missing baseRevision. For other workflows, i.e. reviewing R when P already landed, the issue is exactly the same. Instead of trying to attach (and failing) to the tip of the local branch, Arcanist should attach to the exact (!) commit specified in P.

Okay. So essentially what it needs to do is walk the dependency tree, then determine the branch to switch to, before it starts trying to apply patches.
Is this correct?

rkflx added a comment.May 4 2018, 9:22 PM

Okay. So essentially what it needs to do is walk the dependency tree, then determine the branch to switch to, before it starts trying to apply patches.
Is this correct?

It's not about branches, but about commits. A patch might apply against a specific state of a branch (i.e. a commit), but might fail to apply against another state of the same branch (i.e. another commit). That's why there is baseRevision and not baseBranch. Nevertheless, a specific commit will at the same time determine a specific branch.

As for walking the dependency tree, that's correct (at least that's how I understood it – upstream Phabricator will give you a better answer for sure). One caveat is that there can be multiple dependencies, but if the tree of Diffs is in a consistent state, there should be a commit which will work as a root of the tree.

Note that baseRevision already works for the top-most dependency, it's only when cherry-picking from this top-most dependency (P) to the actual target branch (R) that the target branch is branched off of the wrong commit.

bcooksley renamed this task from Is cgit.kde.org / anongit.kde.org OK? to Arcanist doesn't apply patches correctly in certain circumstances..Jun 10 2018, 6:09 AM
bcooksley edited projects, added Phabricator; removed Sysadmin.