Bug 21457

Summary: resolve-ChangeLogs should be able to operate on a git revision range
Product: WebKit Reporter: David Kilzer (:ddkilzer) <ddkilzer>
Component: Tools / TestsAssignee: David Kilzer (:ddkilzer) <ddkilzer>
Status: RESOLVED FIXED    
Severity: Normal CC: aroben, eric, mrowe
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Mac   
OS: OS X 10.5   
Attachments:
Description Flags
Patch v1
none
Patch v2 aroben: review+

Description David Kilzer (:ddkilzer) 2008-10-07 19:12:48 PDT
* SUMMARY
The resolve-ChangeLogs script should be able to operate on a git revision range so it can re-merge multiple commits.

* STEPS TO REPRODUCE
This scenario commonly happens when the developer commits a ChangeLog entry to an external repository (e.g., to svn), then commits more than one additional patch locally (changing the same ChangeLog file, e.g., to git), while another developer commits a ChangeLog entry to the external repository (svn).

When the developer attempts to rebase their local commits, git tends to merge the ChangeLogs such that the new (local) ChangeLog entries appear just above their entry committed to svn, rather than at the top of the file.  (See also Bug 21185.)

* NOTES
The patch that will be attached adds another way to invoke resolve-ChangeLogs to fix ChangeLog entries throughout a revision range:

resolve-ChangeLogs -f revision..range
resolve-ChangeLogs -f     # implies: HEAD^..HEAD
resolve-ChangeLogs -f commitish     # implies commitish..HEAD
Comment 1 David Kilzer (:ddkilzer) 2008-10-07 19:20:55 PDT
Created attachment 24181 [details]
Patch v1

Proposed fix.

Note that this doesn't clear the .git/refs/original directory structure when it's done, nor does it use the --force flag to ignore that directory existing when run a second time.  (Looking for feedback on what to do there.)
Comment 2 Adam Roben (:aroben) 2008-10-08 07:28:18 PDT
Comment on attachment 24181 [details]
Patch v1

 87 Usage: @{[ basename($0) ]} -f|--fix-merge [revision-range]
 88        @{[ basename($0) ]} [options] path/to/ChangeLog [path/to/another/ChangeLog ...]

I think it makes sense to reverse the order of these two examples, since the second one is the more common usage for non-git users.

 264     exec("$GIT filter-branch --tree-filter 'PREVIOUS_COMMIT=\`$GIT rev-parse \$GIT_COMMIT^\` && MAPPED_PREVIOUS_COMMIT=\`map \$PREVIOUS_COMMIT\` $0 -f \"" . join('" "', @changeLogs) . "\"' $revisionRange");

Does exec have a multi-argument form? Would using that make this easier to read?

 384     if (!$fileMine || !$fileOlder || !$fileNewer) {
 385         next;
 386     }

I know this code just got moved, but I think this would be a little easier to read as:

next unless $fileMine && $fileOlder && $fileNewer;

r=me on this version of the patch, but let's keep talking about the issues you raised.
Comment 3 Adam Roben (:aroben) 2008-10-08 07:28:43 PDT
(In reply to comment #1)
> Note that this doesn't clear the .git/refs/original directory structure when
> it's done, nor does it use the --force flag to ignore that directory existing
> when run a second time.  (Looking for feedback on what to do there.)

What are the effects of this?
Comment 4 David Kilzer (:ddkilzer) 2008-10-08 21:37:07 PDT
(In reply to comment #3)
> (In reply to comment #1)
> > Note that this doesn't clear the .git/refs/original directory structure when
> > it's done, nor does it use the --force flag to ignore that directory existing
> > when run a second time.  (Looking for feedback on what to do there.)
> 
> What are the effects of this?

If you run the command twice in a row, the resolve-ChangeLogs script will exit with an error.

If you passed "--force" as a switch to git filter-branch OR removed the .git/refs/original directory, you could run the command twice in a row without triggering that specific error.

Note that "twice in a row" means you could do a lot of work in your git repository between runs of this script, and it would still error out without the above changes.
Comment 5 Adam Roben (:aroben) 2008-10-09 06:50:08 PDT
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #1)
> > > Note that this doesn't clear the .git/refs/original directory structure when
> > > it's done, nor does it use the --force flag to ignore that directory existing
> > > when run a second time.  (Looking for feedback on what to do there.)
> > 
> > What are the effects of this?
> 
> If you run the command twice in a row, the resolve-ChangeLogs script will exit
> with an error.
> 
> If you passed "--force" as a switch to git filter-branch OR removed the
> .git/refs/original directory, you could run the command twice in a row without
> triggering that specific error.

It sounds like removing .git/refs/original when the command completes successfully is the right way to go. If the command fails, ideally we would reset the repository back to the way it was before the command was run (which of course includes deleting .git/refs/original). If we can't do that we should at least print out an informative error message describing how the user can recover (and of course mention that .git/refs/original needs to be deleted).
Comment 6 David Kilzer (:ddkilzer) 2008-10-11 21:18:58 PDT
Created attachment 24300 [details]
Patch v2

Patch v2

(In reply to comment #2)
> (From update of attachment 24181 [details] [edit])
>  87 Usage: @{[ basename($0) ]} -f|--fix-merge [revision-range]
>  88        @{[ basename($0) ]} [options] path/to/ChangeLog
> [path/to/another/ChangeLog ...]
> 
> I think it makes sense to reverse the order of these two examples, since the
> second one is the more common usage for non-git users.

Updated.  The -f|--fix-merged switch actually takes an optional argument now (the range) as well as a list of ChangeLog files, so having separate lines doesn't make sense anymore.

>  264     exec("$GIT filter-branch --tree-filter 'PREVIOUS_COMMIT=\`$GIT
> rev-parse \$GIT_COMMIT^\` && MAPPED_PREVIOUS_COMMIT=\`map \$PREVIOUS_COMMIT\`
> $0 -f \"" . join('" "', @changeLogs) . "\"' $revisionRange");
> 
> Does exec have a multi-argument form? Would using that make this easier to
> read?

I tried using the multi-argument form, but I think it breaks due to the way git re-invokes the command (since git-filter-branch doesn't exist anymore).  The exec() changed to system() anyway to support removal of .git/refs/original.

>  384     if (!$fileMine || !$fileOlder || !$fileNewer) {
>  385         next;
>  386     }
> 
> I know this code just got moved, but I think this would be a little easier to
> read as:
> 
> next unless $fileMine && $fileOlder && $fileNewer;

Using "next" here was actually a bug--I extracted this code from a for() loop into a method, but forgot to change the "next" into an early return statement.  Thanks!

(In reply to comment #5)
> It sounds like removing .git/refs/original when the command completes
> successfully is the right way to go. If the command fails, ideally we would
> reset the repository back to the way it was before the command was run (which
> of course includes deleting .git/refs/original). If we can't do that we should
> at least print out an informative error message describing how the user can
> recover (and of course mention that .git/refs/original needs to be deleted).

The .git/refs/original directory is removed on a successful run of git filter-branch.  The refs aren't changed if the filter-branch operation errors out, so there is nothing for the script to do in that case.

Also note that this patch fixes all known issues with the example you provided offline.
Comment 7 David Kilzer (:ddkilzer) 2008-10-11 21:19:27 PDT
Comment on attachment 24181 [details]
Patch v1

Clearing r+ flag since this wasn't landed.
Comment 8 Adam Roben (:aroben) 2008-10-13 11:04:47 PDT
Comment on attachment 24300 [details]
Patch v2

 265     # On success, remove the backup refs directory
 266     if (($? >> 8) == 0) {
 267         rmtree(qw(.git/refs/original));
 268     }

Should we use WEXITSTATUS (from the POSIX module) instead?

r=me!
Comment 9 David Kilzer (:ddkilzer) 2008-10-13 11:55:57 PDT
Committed r37559
Comment 10 David Kilzer (:ddkilzer) 2008-10-13 11:56:59 PDT
(In reply to comment #8)
> (From update of attachment 24300 [details] [edit])
>  265     # On success, remove the backup refs directory
>  266     if (($? >> 8) == 0) {
>  267         rmtree(qw(.git/refs/original));
>  268     }
> 
> Should we use WEXITSTATUS (from the POSIX module) instead?

Changed this and fixed up some error messages in the final patch.