Display user information so that others can land/amend a patch
Open, HighPublic

Description

See e.g.: https://phabricator.kde.org/D4334

When people without commit rights provide a patch, I cannot directly land it as I need to know their full name and email address.
I now always have to ask manually, i.e. my horrible workflow is:

  • ask for name + email
  • arc patch --nobranch
  • git commit --amend --author ...
  • git push

This sucks a lot compared to what we could do with ReviewBoard

mwolff created this task.Feb 1 2017, 6:05 PM

While i'm not aware if Reviewboard published email addresses... you can get someone's name out of Phabricator - by accessing their profile, or hovering over any comment they've made (a tooltip should show up). Phabricator indeed doesn't show email addresses, something which I think is intentional on the Phabricator developers part.

If I read this correctly: https://secure.phabricator.com/T4333 the authorship should be preserved. So may be worth to investigate why it's not working, as it would solve the issue (arc patch + arc land would "just work" as expected)

mwolff added a comment.EditedFeb 2 2017, 11:32 AM

@bcooksley:

Please see the left part of https://git.reviewboard.kde.org/users/mwolff/ . Also, the reviewboard-am script did that automatically for one.

I don't care if it was done intentionally by Phabricator, we need this information in KDE. Right now, I have to ask people manually, so the information will be public one way or the other.

@ltoscano: Yes, this does not work currently. Indeed, fixing that would be the best. Is this phabricator instance up2date with upstream? Maybe that's the issue.

Those October 2016 changes should be present on this installation. You may need to check your version of Arcanist as well?

I'm using arch and have the latest arc: 6.r1127.gade25fac-1

@mwolff I don't see email addresses there on the left hand panel. Changes to Phabricator require maintenance by Sysadmin, which IS very much relevant to me.

I'm not sure why Arcanist wouldn't be working properly for you. Could this possibly be due to the use of --nobranch?

<-- I've used this information for years.

Regarding the maintenance cost: Sure, I agree. But adding a cost of 5 minutes to every contribution easily adds up to something very high as well. Are there workarounds? Is everything going through identity.kde.org which I could query instead? I could then write something like reviewboard-am that scripts all the custom commands for me.

Regarding arc patch not working: It also doesn't work without the --nobranch:

kdevplatform|master=$ arc patch D4334    
You have untracked files in this working copy.

  Working copy: /home/milian/projects/kf5/src/extragear/kdevelop/kdevplatform/

  Untracked changes in working copy:
  (To ignore these changes, add them to ".git/info/exclude".)
    p
    perf.data

    Ignore these untracked files and continue? [y/N] y

Created and checked out branch arcpatch-D4334.
Checking patch shell/debugger/kdevdebuggershellui.rc...
Checking patch shell/debugcontroller.h...
Checking patch shell/debugcontroller.cpp...
Applied patch shell/debugger/kdevdebuggershellui.rc cleanly.
Applied patch shell/debugcontroller.h cleanly.
Applied patch shell/debugcontroller.cpp cleanly.
 OKAY  Successfully committed patch.
kdevplatform|arcpatch-D4334$ git show
commit de30d0d581ff17e672c8eb0ccfcd87ea30558e03
Author: Milian Wolff <mail@milianw.de>
Date:   Fri Feb 3 11:52:45 2017 +0100

    Pressing "Continue" starts "Debug" if program is not running
    
    Summary:
    As requested in bug 342245 (https://bugs.kde.org/show_bug.cgi?id=342245).
    
    Changes the kdevdebuggershellui to make the "continue" action visible when the debugger is not running, otherwise the "continue" action would be disabled.
    Triggering the "continue" action when while no debug is running will have the same effect as triggering "debug" action. The "continue" action for running debugs remains the same as before.
    
    Test Plan: Manual Testing
    
    Reviewers: apol, kfunk, mwolff
    
    Reviewed By: mwolff
    
    Subscribers: mwolff, kdevelop-devel
    
    Tags: #kdevelop
    
    Differential Revision: https://phabricator.kde.org/D4334

Can someone else please try it out - is this working for anyone else? If it is, then I will have to figure out what's broken on my side. So far, I have to suspect that it is actually broken for everyone.

aacid added a subscriber: aacid.Feb 7 2017, 11:49 PM
mwolff added a comment.Feb 9 2017, 7:51 PM

anyone tried it out yet? is this working for anyone?

Albert reported that it did not work for him. It did not work for me too. I have not had time to dig into the code.

I'm going to be inquiring with upstream as to why this might be happening.

Thanks a lot Ben

If there's anything I should try out, let me know.

I've discussed this with upstream, and they've checked that revision in question.

A principle of Phabricator's design is user privacy regarding email addresses and not disclosing them, unless it's been made clear that the user intends for it to be made public. I think this is a rather sensible approach, especially considering the complaints we've received with Bugzilla in the past about it making email addresses public (and let's not get into the silly requirements of European law).

Currently retaining commit metadata is only supported when the poster of it uses Arcanist at the moment as a result. This is because it can grab the actual commit authorship information from the users system when it runs - and that commit information is treated as an intention to make the information public (which is in keeping with the above approach).

We're currently discussing options on how to facilitate this with patch uploads. Currently 'git show' and 'git format-patch' do include the necessary information, it's just not captured by Phabricator. 'git diff' uploads would still lack the needed information however, so we may end up blocking acceptance of those kinds of patch uploads.

I think I disagree here. Think about the most common usercases (github, launchapd, etc), the expection is that the email in a git-based system is visible, otherwise you can't have your patch committed. Even if you don't consider that you can dig in identity. So "should the email of the author be visible when you submit a patch" should be simply configurable (yes/no) per instance, and we should default to yes. Did you discuss on a ticket (so that I can subscribe)?

Wait Luigi said needs to be stressed more: If you want to submit a patch, you have to disclose your email address. Period. Without that, we cannot accept it.

And again: Now I have to ask users for their email address. So the address will be visible one way or another.

The discussion with upstream is taking place in a private Comprehence session on their Phabricator instance, so you can't be subscribed. From their perspective an option to make email addresses public for reviews would only solve a small portion of the issue.

There are other parts which parsing commit metadata from the diff would solve, hence why we're proceeding along that path.

Additionally, neither of you have dealt with the account closure requests for Bugzilla or the mailing list email takedown requests which come from users who complain about email addresses being published - and Phabricator is used for far more than just code review.

From my perspective, implementation of parsing of git show / git format-patch metadata resolves this issue, as it makes the information available - when the user has made an explicit declaration that it should be made public.

I disagree again, and also those discussions should be tracked in a public task.

We did not deal with account closure, but this is a different case. We are not talking about reading the information anytime, but about reading them when applying a patch. From the point of view of the user, there is a patch submitted through phabricator, pasted through the web interface or through git format-patch does not matter, which express the intent of publishing some private information required to proceed with the commit. This could be made clear in the web UI, if you want. But in both case, regardless on how the patch is generated, the committer should be able to get the information of the user.

please add a big fat warning to differential then. if anyone submits a patch he _must_ be aware that his email has to be disclosed before his patch gets accepted. Otherwise we just waste each others time. Alternatively, have this coupled to identity.kde.org and only allow people to submit patches who have an identy (and thus give me a way to find their email address)

@mwolff Their details will be available if they use Arcanist, and in the future will quite possibly be available if the diff is uploaded using one of the git show / git format-patch formats, or if it's submitted using git push (per upstream T5000). Only raw diffs as generated by "git diff" won't contain the metadata in question.

Given the issues had in a different thread regarding context availability, i'm inclined to declare anything other than uploads via Arcanist unsupported.

@ltoscano I've inquired as to upstream's position on email address disclosure. We won't be making any custom patches around this, and people are very sensitive around email address disclosure for various reasons so I totally understand the rationale behind not disclosing it (as a Sysadmin I see all the mail surrounding this - you don't and thus cannot comment). This is very much linked to account closure/deletion, as it's one of the more common reasons for it (especially in relation to Bugzilla).

In regards to this discussion being public - as upstream is only a couple of people, it's necessary to have only 1-2 representatives from each community (ourselves, Wikimedia, etc) to ensure they get a clear view on what the issue is and to minimize time wastage on each side. The output of those discussions then goes into tasks which are public. Having the discussions themselves public will just muddy the waters here.

@mwolff Their details will be available if they use Arcanist, and in the future will quite possibly be available if the diff is uploaded using one of the git show / git format-patch formats, or if it's submitted using git push (per upstream T5000). Only raw diffs as generated by "git diff" won't contain the metadata in question.

Given the issues had in a different thread regarding context availability, i'm inclined to declare anything other than uploads via Arcanist unsupported.

This would be a big regression compared to what we have now and certainly not helping the broader development community. See below for a possible solution.

@ltoscano I've inquired as to upstream's position on email address disclosure. We won't be making any custom patches around this, and people are very sensitive around email address disclosure for various reasons so I totally understand the rationale behind not disclosing it (as a Sysadmin I see all the mail surrounding this - you don't and thus cannot comment). This is very much linked to account closure/deletion, as it's one of the more common reasons for it (especially in relation to Bugzilla).

The fact that I did not deal with facts surrounding emails does not mean that I can't comment here, sorry.
You wrote a very harsh thing here that could have been at least phrased differently.

What I don't understand (I already asked and there was no answer) the difference between an user sending a review through the web interface, produced with and without git format-patch, and sending it through arcanist. From an inexperienced user they are the same. If the difference is that two of the methods contain the email is just a technical details and not a though agreement. No more than allowing the system to retrieve the email from the internal database. If we want to be sure about this, the solution could be a agreement form that needs to be marked the first time a patch is subumitted through the web interface, as @mwolff suggested. And I'm not suggesting a downstream patch (no one did), this is the suggestion to what ask upstream to implement.

In regards to this discussion being public - as upstream is only a couple of people, it's necessary to have only 1-2 representatives from each community (ourselves, Wikimedia, etc) to ensure they get a clear view on what the issue is and to minimize time wastage on each side. The output of those discussions then goes into tasks which are public. Having the discussions themselves public will just muddy the waters here.

I partially understand the reasoning, even if I disagree (I've seen this in other communities too), but if there are no other ways, I volunteer to be the second representative for the KDE community. I understand

Given the issues had in a different thread regarding context availability, i'm inclined to declare anything other than uploads via Arcanist unsupported.

If you do this, I'm out of KDE.

apol added a subscriber: apol.Feb 13 2017, 12:30 AM

Albert, don't overreact. I just talked to Ben about this again. Don't forget he's on our side ;-)

Going forward, there will probably soon be an update to our phab instance which should allow us to put a warning sign on the diff upload page, saying that people must also give us their email address. In the long term, the feature that allows parsing of git show or git format-patch would fix this properly. At that point, Ben said a custom Herald rule could even prohibit uploading of patch files without an email address.

That sounds like a good way forward to me.

Albert, don't overreact. I just talked to Ben about this again. Don't forget he's on our side ;-)

I'm not overreacting, I'm just stating a fact, if one person can single-handedly decide how our development workflow has to work (i.e. "declare only Arcanist is supported"), I'm out.

bgupta added a subscriber: bgupta.Feb 13 2017, 10:33 AM

There's no way in hell we can make email addresses public without the user's explicit consent. It has to be opt-in. Literally everyone on this thread except Ben lives in Europe. The law of the land is very protective of personal data, which most certainly includes email addresses. Refer to Directive 95/46/EC, and to the Bundesdatenschutzgesetz, Sections 4, 4a and 5. I'm sure Spain and Italy and other EU countries have implementations of the EC directive. I could well sue KDE e.V. for making my email address public, if I haven't consented to it.

You need to inform the patch uploader that the email address will be made available to other people by law. IANAL, but my understanding of the law, erring on the side of caution, is that we're allowed to collect their personal data without consent it because it is essential to our business. On the other hand, we have to explicitly ask for publication authorisation from the user, and the user must have the choice of not contributing his patch at all if he would rather preserve his privacy. This is the law and we're not above it.

I've discussed this with upstream, and they've checked that revision in question.

A principle of Phabricator's design is user privacy regarding email addresses and not disclosing them, unless it's been made clear that the user intends for it to be made public. I think this is a rather sensible approach, especially considering the complaints we've received with Bugzilla in the past about it making email addresses public (and let's not get into the silly requirements of European law).

Currently retaining commit metadata is only supported when the poster of it uses Arcanist at the moment as a result. This is because it can grab the actual commit authorship information from the users system when it runs - and that commit information is treated as an intention to make the information public (which is in keeping with the above approach).

We're currently discussing options on how to facilitate this with patch uploads. Currently 'git show' and 'git format-patch' do include the necessary information, it's just not captured by Phabricator. 'git diff' uploads would still lack the needed information however, so we may end up blocking acceptance of those kinds of patch uploads.

This seems to be the sanest approach that we can adopt. Other more complicated solutions would be to have UI that forces the user to make a choice with regards to disclosure at sign-up time, but this would need to be implemented upstream. We could have some granularity - email addresses are only disclosed to people in the Developers project, for example, but again, this has to be an upstream implementation. KDE does not have the resources to maintain this patch.

We should also have some warning somewhere that says uploading a commit via Git will make the email address associated with the commit public - and so please use a restricted email address, a honeypot if you so wish - but that's another discussion. But we *are* in violation of EU law because of Git.

I'd like to make one point clear:
no one suggested keeping patches downstream, ever.
The point of this task, since the beginning, was to find what to discusswith upstream.
I'm happy to see that it will be possible to use the UI to send patches.

I still think that explicitly ask that the user to give the permission to publish the email when an patch is posted, like it happens on every possible git provider/review system (github, gitlab, bitbucket, lkml) is a good solution (again *with upstream implementation*, and what @bgupta wrote confirms it. This could be another different discussion and task.

Given the issues had in a different thread regarding context availability, i'm inclined to declare anything other than uploads via Arcanist unsupported.

Related upstream ticket: https://secure.phabricator.com/T5000 ("Using Differential with plain Git, without requiring Arc")

abika added a subscriber: abika.Feb 13 2017, 9:08 PM

The two parts of the solution I discussed with Milian is being tracked by upstream at https://secure.phabricator.com/T12256 and https://secure.phabricator.com/T12257

I'll be looking at doing the upgrade in the next few days, all going well.

@bcooksley: Thanks!

@bgupta: You are missing the point, again. We make mail addresses of contributors public. It's in the git log. That's a fact and we have to deal with it. And claiming people don't know that their mail will be associated with their change... They are competent to git clone, write a patch in C++, submit it, but don't know their mail will be shown? Silly!

lydia added a subscriber: lydia.Feb 18 2017, 7:00 PM
dhaumann added a subscriber: dhaumann.EditedJan 2 2018, 9:23 PM

Hi all, I regularly stumble over this issue again and again. In fact, I would like to give especially first-time contributors the correct authorship (git commit --author="..."). But I cannot, since the email address is hidden. This is *really* cumbersome, and in fact kind of breaks my workflow. In the end, I mostly claim the work by others for myself, just to get work done. This is not how it should be ideally :-)

I was unaware of this thread previsouly, that's why I also just posted here: https://marc.info/?l=kde-core-devel&m=151492773414285&w=2

With respect to privacy: One central point in our community is reaching other developers. In this case, I cannot reach other developers, which is a big problem. Also note, that e.g. once logged into bugs.kde.org, I can also see email addresses (and this is indeed important once in a while).

One way to possibly ease this would be to also have an identity account. Since once logged into the KDE identity account, I can search for other users and see the email addresses - still not nice, but it would do the job.

rkflx added a subscriber: rkflx.EditedJan 2 2018, 10:20 PM

I agree in light of our newly selected focus goals, i.e. T7116: Streamlined onboarding of new contributors, that this is kind of important to solve.

once logged into bugs.kde.org, I can also see email addresses

Bugzilla is old and IMO not representative of the privacy options KDE should pioneer going forward (another focus goal, BTW, and applicable to our infrastructure too).

One way to possibly ease this would be to also have an identity account. Since once logged into the KDE identity account, I can search for other users and see the email addresses - still not nice, but it would do the job.

AFAIK everyone using Phabricator is doing so via an identity account (via LDAP? – Strangely changing full names does not seem to sync between systems). The effect you are seeing (new contributor apparently not available on identity) is that new contributors are not part of the developers group, meaning users in the general user group are hidden from each other. This makes some sense, because in general you would not expect exposing your email address to every user of a random service you are registering for.

The best solution is to require the patch upload feature to have a valid email address. Until that is done, either we accept the current state or add a checkbox to the identity registering page (as well as the account properties page, and perhaps some integration/checking in the commit access granting process), e.g. something like this:

  • I would like my email address to be visible to other users on identity.kde.org.
    • In addition, I would like to contribute code to the KDE project and agree that my email address and full name will be made public as part of every commit I do.

The state of the checkboxes then would determine how identity shows or hides the account. Needing to check identity when committing on behalf of new contributors still hampers the workflow a bit, though. Thoughts?

In T5242#122455, @rkflx wrote:

The state of the checkboxes then would determine how identity shows or hides the account. Needing to check identity when committing on behalf of new contributors still hampers the workflow a bit, though. Thoughts?

I suggest to re-read the entire ticket...

rkflx added a comment.EditedJan 2 2018, 10:41 PM

Of course I did read the entire ticket before posting. Care to elaborate what point I am missing? Note I am talking about identity (i.e. something we can fix now), while the proper fix would be in Phabricator (i.e. something which might be very far away).

I am not aware of any way for someone registering with identity (which is a prerequisite for Phabricator) to make his email visible to other developers. Doing so entails becoming a developer or even administrator on identity, which is a catch-22. If there is another way, please tell me and I'll happily admit I was wrong. I am only offering my 2 cents here based on my own experiences failing to find those email adresses on identity and my understanding of the situation.

@rkflx your proposal was already discussed and dismissed (see the comments from https://phabricator.kde.org/T5242#79416 to https://phabricator.kde.org/T5242#79559 ). Please note that I agree with your proposal and I still disagree with the rebuttal.

rkflx added a comment.EditedJan 2 2018, 10:56 PM

In those links, I don't see any mention of changes to identity (which is different from Phab, right!?) as I am proposing, but I'll stop here. We'll need to wait for @bcooksley to comment, maybe he knows more about the current upstream status of the issue.


Edit: The discussion is now split between here and the kcd-thread, where Ben answered. Still, having an optional checkbox on identity is different to a mandatory checkbox like on Bugzilla.

habacker added a subscriber: habacker.EditedJan 3 2018, 2:18 AM

The two parts of the solution I discussed with Milian is being tracked by upstream at https://secure.phabricator.com/T12256

To get an idea how to fix topic (2) in the mentioned task I digged a bit into the code on a local phabricator installation and got the following results:

  1. phabricator fetches the user from the request at https://github.com/phacility/phabricator/blob/master/src/applications/differential/conduit/DifferentialCreateRawDiffConduitAPIMethod.php#L27
  2. The raw diff is available in $raw_diff at https://github.com/phacility/phabricator/blob/master/src/applications/differential/conduit/DifferentialCreateRawDiffConduitAPIMethod.php#L28.

With the following patch I can fetch the email address from the raw diff and if present set the viewer object:

diff --git a/src/applications/differential/conduit/DifferentialCreateRawDiffConduitAPIMethod.php b/src/applications/differential/conduit/DifferentialCreateRawDiffConduitAPIMethod.php
index 6f8c2e864..7cb91b814 100644
--- a/src/applications/differential/conduit/DifferentialCreateRawDiffConduitAPIMethod.php
+++ b/src/applications/differential/conduit/DifferentialCreateRawDiffConduitAPIMethod.php
@@ -26,6 +26,12 @@ final class DifferentialCreateRawDiffConduitAPIMethod
   protected function execute(ConduitAPIRequest $request) {
     $viewer = $request->getUser();
     $raw_diff = $request->getValue('diff');
+    if(preg_match('/From:.*<(.*)>/', $raw_diff, $matches)) {
+      $from = $matches[1];
+      $user = PhabricatorUser::loadOneWithEmailAddress($from);
+      if ($user)
+        $viewer = $user;
+    }
 
     $repository_phid = $request->getValue('repositoryPHID');
     if ($repository_phid) {

Unfortunally this does not solve the problem on arc patch. It still uses the currently logged in user, which indicates that there are more fixes required to get a solution. May be that helps.

@rkflx Yes, Identity and Phabricator are completely separate pieces of software. A checkbox such as that would be semi-difficult as Phabricator can't easily access Identity to check your preferences in that regard. It only really has access to that information when you login.

In terms of upstream, there has been no real change as yet, as I noted in my replies to the mailing list thread. It would have been preferrable to avoid the split discussion.

@habacker From my understanding, setting $viewer will alter who is the owner / creator of the review in Phabricator. What we would instead need to update would be the same commit metadata that Arcanist transmits to Phabricator when it uploads reviews. You're probably not far from that code though, given that Arcanist does everything through the Conduit API.

@bcooksley: To be more able to know where to patch - It will help to take a look at the internal data first - I found the following small code review examples (web:D9545, arc:D9429)

  1. login to phabricator
  2. enter ids: {9545,9429] at https://phabricator.kde.org/conduit/method/differential.query/
  3. press "Call Method"

This returns 24449,24156 as last diff id's uploaded to the reviews. The related internal data could be fetched with

  1. enter ids: [24449,24156] at https://phabricator.kde.org/conduit/method/differential.querydiffs/
  2. press "Call Method"

The difference is that on uploading by web the "properties" key is empty. Compared to id 24156 important data seems to be located below "local:commits", but not all data are available by uploading a git formatted patch e.g. 'tree' and 'parent' and below 'arc:onto'. Don't know yet if arc would accept that missing data.

It's hard to confirm (without checking the Arcanist or Phabricator codebases) but it's quite likely that Phabricator is using the top level "authorName" and "authorEmail" keys here.

I say this as Arcanist doesn't attempt to reconstruct the commits, meaning the "local:commits" data is likely just used for informative / display purposes (among other things it will assist the repository hooks in detecting if you've pushed the change, even if you don't add any references to the review - i've seen this happen on our Phabricator)

This comment was removed by habacker.

It's hard to confirm (without checking the Arcanist or Phabricator codebases) but it's quite likely that Phabricator is using the top level "authorName" and "authorEmail" keys here.

This answers that the initial patch I provided is obsolate, because a patch uploaded by web will be assigned to the currently logged on user and the patch author does not need to be registered at phabricator.

From the arc trace informations
xx@yyy:~/src/kmymoney-4.8> arc --trace patch D9545

[0] <http> https://phabricator.kde.org/api/differential.querydiffs
[1] <http> https://phabricator.kde.org/api/user.whoami
[2] <http> https://phabricator.kde.org/api/differential.querydiffs
[3] <exec> $ git symbolic-ref --quiet HEAD
[4] <exec> $ git rev-parse --symbolic-full-name '4.8'upstream
[5] <exec> $ git --version
[6] <exec> $ git ls-remote --get-url 'origin'
[7] <http> https://phabricator.kde.org/api/repository.query
[8] <exec> $ git status --porcelain=2 -z
[9] <exec> $ git show -s --format='%H' '4cdb07aa56126988d575feef3d4a414ee67ff0df' --
[10] <exec> $ git symbolic-ref --quiet HEAD
[11] <exec> $ git rev-parse --verify 'arcpatch-D9545'
[12] <exec> $ git show -s --format='%H' '4cdb07aa56126988d575feef3d4a414ee67ff0df' --
[13] <exec> $ git checkout -b 'arcpatch-D9545' '4cdb07aa56126988d575feef3d4a414ee67ff0df'
[14] <exec> $ git show -s --format='%H' '4cdb07aa56126988d575feef3d4a414ee67ff0df' --
[15] <exec> $ git apply --whitespace nowarn --index --reject -- '/tmp/e90ah4jm7xckgws4/24373-Hm6Ww8'
[16] <exec> $ git submodule update --init --recursive
[17] <http> https://phabricator.kde.org/api/differential.getcommitmessage
[18] <exec> $ git commit -a -F - --no-verify

I checked [17], which returns the raw commit message, so the related calls required to be inspected are [0] and [2]

It's hard to confirm (without checking the Arcanist or Phabricator codebases) but it's quite likely that Phabricator is using the top level "authorName" and "authorEmail" keys here.

Those dict values are set by the following function on processing the 'differential.creatediff' api function

class DifferentialDiff ... {

public function getDiffAuthorshipDict() {
  $dict = array('properties' => array());

  $properties = id(new DifferentialDiffProperty())->loadAllWhere(
    'diffID = %d',
    $this->getID());
  foreach ($properties as $property) {
    $dict['properties'][$property->getName()] = $property->getData();

    if ($property->getName() == 'local:commits') {
      foreach ($property->getData() as $commit) {
        $dict['authorName'] = $commit['author'];
        $dict['authorEmail'] = idx($commit, 'authorEmail');
        break;
      }
    }
  }

  return $dict;
}

I say this as Arcanist doesn't attempt to reconstruct the commits, meaning the "local:commits" data is likely just used for informative / display purposes (among other things it will assist the repository hooks in detecting if you've pushed the change, even if you don't add any references to the review - i've seen this happen on our Phabricator)

I think this is not true for 'author' and 'authorEMail', from which the top level dict author related variables are created (see above)

Upload diffs by web uses a different code path (api function 'differential.createrawdiff'), in which a similar support needs to be added.

The codepath used by Arcanist won't need modifying, as that already does the right thing (author metadata gets set and is used). This setup probably originates with other VCSes which aren't distributed, with "local:commits" being added later to accomodate Git.

I've checked the Arcanist codebase and from my understanding if the top level authorName and authorEmail properties are set then the right thing will happen (see arcanist: src/workflow/ArcanistWorkflow.php, line 1,215), so yes, all that needs modifying is the handling of raw diffs to add that metadata if it's available. (I'd expect Phabricator to already have code to detect the initial lines of git show/format-patch output, so that would be the best place to start)

all that needs modifying is the handling of raw diffs to add that metadata if it's available.

agreed. I added author metadata to a local:commits property to $diff at the end of DifferentialCreateRawDiffConduitAPIMethod::execute() and saw that author meta data in querydiffs api method.

(I'd expect Phabricator to already have code to detect the initial lines of git show/format-patch output, so that would be the best place to start)

This is located in arcanist/src/parser/ArcanistDiffParser.php, but currently the header is simply skipped see ArcanistDiffParser::skipGitFormatPatch(). It may be changed to splitGitFormatPatch() and to parse the header and setup a class member $header, which could be fetched with a getHeader() method after calling ArcanistDiffParser::parseDiff() in DifferentialCreateRawDiffConduitAPIMethod::execute().

Ralf, have you got a patch which can be discussed with the Phabricator developers, or is this something I need to be looking into?

habacker added a comment.EditedJan 7 2018, 8:56 PM

I added author metadata to a local:commits property to $diff at the end of DifferentialCreateRawDiffConduitAPIMethod::execute() and saw that author meta data in querydiffs api method.

@bcooksley: I have a quick and dirty patch for this - where should I post it to ?

(I'd expect Phabricator to already have code to detect the initial lines of git show/format-patch output, so that would be the best place to start)

This is located in arcanist/src/parser/ArcanistDiffParser.php, but currently the header is simply skipped see
ArcanistDiffParser::skipGitFormatPatch(). It may be changed to splitGitFormatPatch() and to parse
the header and setup a class member $header, which could be fetched with a getHeader() method
after calling ArcanistDiffParser::parseDiff() in DifferentialCreateRawDiffConduitAPIMethod::execute().

Not yet - I guess it needs to be discussed how the api should be extended:

  1. where to extract the header (I guess in ArcanistDiffParser::parseDiff()
  2. additional api method for checking if this diff is a git formatted patch e.g. ArcanistDiffParser::isGitFormattedPatch()
  3. additional api method for fetching git formatted header (should return an array ?)

If you could upload it here that would be great.

habacker added a comment.EditedJan 8 2018, 4:14 PM
From 34e34d430f4dce69f50859babaf5a5e8e9bf97c5 Mon Sep 17 00:00:00 2001
From: Ralf Habacker <ralf.habacker@freenet.de>
Date: Fri, 5 Jan 2018 02:26:30 +0100
Subject: [PATCH] differential.createrawdiff: add commit author to diff if
 present

---
 .../DifferentialCreateRawDiffConduitAPIMethod.php       | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/src/applications/differential/conduit/DifferentialCreateRawDiffConduitAPIMethod.php b/src/applications/differential/conduit/DifferentialCreateRawDiffConduitAPIMethod.php
index 6f8c2e864..257a70935 100644
--- a/src/applications/differential/conduit/DifferentialCreateRawDiffConduitAPIMethod.php
+++ b/src/applications/differential/conduit/DifferentialCreateRawDiffConduitAPIMethod.php
@@ -90,6 +90,23 @@ final class DifferentialCreateRawDiffConduitAPIMethod
       ->setLookupRepository(false) // respect user choice
       ->applyTransactions($diff, $xactions);
 
+    // add commit author information if present
+    if (preg_match('/From:(.*)<(.*)>/', $raw_diff, $matches)) {
+      $authorName = trim($matches[1]);
+      $authorEMail = $matches[2];
+      $local_commit_data = array(
+        '0' => array(
+          'author' => $authorName,
+          'authorEmail' => $authorEMail,
+        ),
+      );
+
+      $property = new DifferentialDiffProperty();
+      $property->setDiffID($diff->getID());
+      $property->setName('local:commits');
+      $property->setData($local_commit_data);
+      $property->save();
+    }
     return $this->buildDiffInfoDictionary($diff);
   }
 
-- 
2.12.3

@Ben: you added this patch ?

Sorry, I haven't had time to look into this i'm afraid.