RESOLVED FIXED 32834
svn-apply should handle git patches with similarity index, rename and copy directives
https://bugs.webkit.org/show_bug.cgi?id=32834
Summary svn-apply should handle git patches with similarity index, rename and copy di...
David Kilzer (:ddkilzer)
Reported 2009-12-21 12:09:42 PST
* SUMMARY If you use the "[diff] renames = copies" git config setting (or the --find-copies-harder switch) with git diff, git will include information in the patch that hints at the history of the code and may significantly reduce the size of the patch. While git uses only heuristics to accomplish this, it would be nice if svn-apply understood these directives so that it could map them to equivalent svn commands. * EXAMPLES $ git mv Makefile NMakefile$ git diff --find-copies-harder HEAD diff --git a/Makefile b/NMakefile similarity index 100% rename from Makefile rename to NMakefile $ git diff --find-copies-harder HEAD diff --git a/Makefile b/NMakefile similarity index 100% copy from Makefile copy to NMakefile $ git mv WebCore/ChangeLog WebCore/ChangeLog-2009-12-21 $ vi WebCore/ChangeLog-2009-12-21 $ git add WebCore/ChangeLog-2009-12-21 $ git diff --find-copies-harder --cached diff --git a/WebCore/ChangeLog b/WebCore/ChangeLog-2009-12-21 similarity index 99% rename from WebCore/ChangeLog rename to WebCore/ChangeLog-2009-12-21 index ab18e51..8cb2b03 100644 --- a/WebCore/ChangeLog +++ b/WebCore/ChangeLog-2009-12-21 @@ -72516,7 +72516,7 @@ form controls that have different validity states. * bindings/v8/custom/V8CustomBinding.h: * bindings/v8/custom/V8HTMLDataGridElementCustom.cpp: Added. -2009-06-24 David Kilzer <ddkilzer@apple.com> +2009-06-24 David Kilzer <ddkilzer@webkit.org> Build fixes for ENABLE(PLUGIN_PROXY_FOR_VIDEO) NOTE: For small files (under a certain number of lines), the rename detection doesn't work.
Attachments
Proposed patch (21.62 KB, patch)
2010-05-07 16:57 PDT, Chris Jerdonek
cjerdonek: commit-queue-
Proposed patch 2 (21.68 KB, patch)
2010-05-07 17:01 PDT, Chris Jerdonek
dbates: review+
cjerdonek: commit-queue-
Chris Jerdonek
Comment 1 2010-01-12 13:52:16 PST
I have a general question about our approach to improving support for Git. For the purpose of submitting Git patches, it seems there can be (at least) two approaches: (1) Make a "create-patch" script that supports creating a patch in standard format from both SVN and Git. (2) Make svn-apply and svn-unapply support both Git and SVN format. (Combining both approaches may also make sense -- e.g. to address the issue in this bug report.) From what I can tell, we currently do (2) to some extent, but not (1). Has anyone given much thought to these approaches? Are there any reasons to prefer (2) over (1) or vice versa, or to not do (1)?
David Kilzer (:ddkilzer)
Comment 2 2010-01-13 08:36:20 PST
(In reply to comment #1) > For the purpose of submitting Git patches, it seems there can be (at least) two > approaches: > > (1) Make a "create-patch" script that supports creating a patch in standard > format from both SVN and Git. > > (2) Make svn-apply and svn-unapply support both Git and SVN format. > > (Combining both approaches may also make sense -- e.g. to address the > issue in this bug report.) > > From what I can tell, we currently do (2) to some extent, but not (1). Has > anyone given much thought to these approaches? Are there any reasons to > prefer (2) over (1) or vice versa, or to not do (1)? I think (2) is a higher priority than (1) because git's own diff subcommand (with the --binary switch) is sufficient to capture all of the changes in a patch. Because the project's source repository is subversion (with a moderate number of developers using git), I think it's slightly more important that git patches work with the tools to apply patches to an svn working directory than vice-versa.
Chris Jerdonek
Comment 3 2010-01-28 04:17:52 PST
(In reply to comment #2) > (In reply to comment #1) > > For the purpose of submitting Git patches, it seems there can be (at least) two > > approaches: > > > > (1) Make a "create-patch" script that supports creating a patch in standard > > format from both SVN and Git. > > > > (2) Make svn-apply and svn-unapply support both Git and SVN format. > > > > (Combining both approaches may also make sense -- e.g. to address the > > issue in this bug report.) > > > > From what I can tell, we currently do (2) to some extent, but not (1). Has > > anyone given much thought to these approaches? Are there any reasons to > > prefer (2) over (1) or vice versa, or to not do (1)? > > I think (2) is a higher priority than (1) because git's own diff subcommand > (with the --binary switch) is sufficient to capture all of the changes in a > patch. I gave an example here of a case where the "git diff" command doesn't generate a patch in the standard format (at least with the way that I was running it): https://lists.webkit.org/pipermail/webkit-dev/2010-January/011495.html The example occurred because "git diff" doesn't run fixChangeLogPatch. > Because the project's source repository is subversion (with a moderate number > of developers using git), I think it's slightly more important that git patches > work with the tools to apply patches to an svn working directory than > vice-versa.
David Kilzer (:ddkilzer)
Comment 4 2010-01-28 08:08:58 PST
(In reply to comment #3) > (In reply to comment #2) > > (In reply to comment #1) > > > For the purpose of submitting Git patches, it seems there can be (at least) two > > > approaches: > > > > > > (1) Make a "create-patch" script that supports creating a patch in standard > > > format from both SVN and Git. > > > > > > (2) Make svn-apply and svn-unapply support both Git and SVN format. > > > > > > (Combining both approaches may also make sense -- e.g. to address the > > > issue in this bug report.) > > > > > > From what I can tell, we currently do (2) to some extent, but not (1). Has > > > anyone given much thought to these approaches? Are there any reasons to > > > prefer (2) over (1) or vice versa, or to not do (1)? > > > > I think (2) is a higher priority than (1) because git's own diff subcommand > > (with the --binary switch) is sufficient to capture all of the changes in a > > patch. > > I gave an example here of a case where the "git diff" command doesn't generate > a patch in the standard format (at least with the way that I was running it): > > https://lists.webkit.org/pipermail/webkit-dev/2010-January/011495.html > > The example occurred because "git diff" doesn't run fixChangeLogPatch. It'd be great if you could post the example here. I think I know what you're referring to, but it's always best to have a concrete example. I don't think this has been an issue in practice because: - There is no "loss of information" if the unaltered ChangeLog patch is not fixed up, and it may optionally be fixed up at any point in the future. All the pieces are there, just not in the optimal order for applying it to the top of the ChangeLog file. - The commit queue uses svn-apply to apply patches which fixes up the ChangeLog patch before applying it. Individual contributors may also be using svn-apply to apply patches, or they may be using resolve-ChangeLogs afterwards. - Even in the case where the patch is completely rejected, there's still enough information in the *.rej file to reconstruct the original patch, fix it up, then apply it. (I don't think fixChangeLogPatch() works with non-unified diff format, though, so this may not work properly today.) Having said that, if you feel that (1) is more important than (2), then I'd focus on that first. For whatever reason, this issue hasn't come up much. Maybe that's because the ChangeLog patches end up getting applied successfully to the middle of the ChangeLog file instead of at the beginning, but look "okay" otherwise, and no one has noticed/complained about it yet. (I think I've seen this behavior in the commit queue, but I though it was due to something else at the time.)
Chris Jerdonek
Comment 5 2010-01-28 08:24:12 PST
(In reply to comment #4) > It'd be great if you could post the example here. I think I know what you're > referring to, but it's always best to have a concrete example. Here is a concrete example (same submitter on same day): https://bugs.webkit.org/show_bug.cgi?id=34249#c1 https://bug-34249-attachments.webkit.org/attachment.cgi?id=47606 I needed to manually commit it since I had to make changes anyways. > I don't think this has been an issue in practice because: I recognize that this might be totally innocuous, but just wanted to raise it as something else that can come up. I want to avoid having to use wrapper scripts if possible, like you. > - There is no "loss of information" if the unaltered ChangeLog patch is not > fixed up, and it may optionally be fixed up at any point in the future. All > the pieces are there, just not in the optimal order for applying it to the top > of the ChangeLog file. > > - The commit queue uses svn-apply to apply patches which fixes up the ChangeLog > patch before applying it. Individual contributors may also be using svn-apply > to apply patches, or they may be using resolve-ChangeLogs afterwards. If this is the case, it seems like it might not be necessary to call fixChangeLogPatch in svn-create-patch.
David Kilzer (:ddkilzer)
Comment 6 2010-01-28 08:51:56 PST
(In reply to comment #5) > (In reply to comment #4) > > It'd be great if you could post the example here. I think I know what you're > > referring to, but it's always best to have a concrete example. > > Here is a concrete example (same submitter on same day): > > https://bugs.webkit.org/show_bug.cgi?id=34249#c1 > > https://bug-34249-attachments.webkit.org/attachment.cgi?id=47606 > > I needed to manually commit it since I had to make changes anyways. This one is relatively minor. If you have a patch series with the same date/author/email line followed by the same "Reviewed by NOBODY(OOPS!)." line, then you'll have even more of an offset. (FWIW, this is another reason I like bug titles before the "Reviewed by" line, but that's really neither here nor there.) > > - There is no "loss of information" if the unaltered ChangeLog patch is not > > fixed up, and it may optionally be fixed up at any point in the future. All > > the pieces are there, just not in the optimal order for applying it to the top > > of the ChangeLog file. > > > > - The commit queue uses svn-apply to apply patches which fixes up the ChangeLog > > patch before applying it. Individual contributors may also be using svn-apply > > to apply patches, or they may be using resolve-ChangeLogs afterwards. > > If this is the case, it seems like it might not be necessary to call > fixChangeLogPatch in svn-create-patch. It's nice to clean up the ChangeLog patches, though, as long as you're using svn-create-patch. The corrected patch shows the intent of the ChangeLog patch much better than the uncorrected version.
Chris Jerdonek
Comment 7 2010-01-28 09:07:21 PST
(In reply to comment #6) > This one is relatively minor. If you have a patch series with the same > date/author/email line followed by the same "Reviewed by NOBODY(OOPS!)." line, > then you'll have even more of an offset. (FWIW, this is another reason I like > bug titles before the "Reviewed by" line, but that's really neither here nor > there.) Have you considered any schemes to break the symmetry even more cleanly -- e.g. including the time in the date stamp?
David Kilzer (:ddkilzer)
Comment 8 2010-01-28 09:44:14 PST
(In reply to comment #7) > (In reply to comment #6) > > This one is relatively minor. If you have a patch series with the same > > date/author/email line followed by the same "Reviewed by NOBODY(OOPS!)." line, > > then you'll have even more of an offset. (FWIW, this is another reason I like > > bug titles before the "Reviewed by" line, but that's really neither here nor > > there.) > > Have you considered any schemes to break the symmetry even more cleanly -- e.g. > including the time in the date stamp? No, I haven't. I don't think that a timestamp would add any useful information to the date/name/email line, and it would complicate the parsing of that line.
Chris Jerdonek
Comment 9 2010-05-04 16:00:03 PDT
I was a bit surprised after comparing the behavior of the -C, -M, and --find-copies-harder options of git-diff. It seems that in some cases the --find-copies-harder option will do a worse job of detecting renames than the -C and -M options (which are less expensive at least for a smaller number of changed files). See below, for example (it's the last of the three scenarios where the difference can be observed). # Make a copy. cp Makefile NMakefile git add NMakefile git commit -a git diff --find-copies-harder HEAD^ # detects the copy git diff -C HEAD^ # does not detect the copy # Turn the copy into a move. rm Makefile git commit -a --amend git diff --find-copies-harder HEAD^ # detects the rename git diff -C HEAD^ # detects the rename # Turn the move into a move-with-changes. echo "more content" >> NMakefile git commit -a --amend git diff --find-copies-harder HEAD^ # does not detect the rename git diff -C HEAD^ # does detect the rename (output is below) git diff -M HEAD^ # same result as -C diff --git a/Makefile b/NMakefile similarity index 99% rename from Makefile rename to NMakefile index 1e50d1d..1459d21 100644 --- a/Makefile +++ b/NMakefile @@ -15,3 +15,4 @@ release r deployment dep deploy: clean: @for dir in $(MODULES); do ${MAKE} $@ -C $$dir; exit_status=$$?; \ if [ $$exit_status -ne 0 ]; then exit $$exit_status; fi; done +test From the git-diff manual, this seems to be because --find-copies-hard searches for potential sources by examining *only* those files that haven't changed, and not both files that have and haven't changed. Consequently, it may make sense in more cases to prefer using -C (or -M) over --find-copies-harder. This is in part because I imagine moves and moves-with-changes are more common for contributors than straight copying of files (without also deleting the source). Also, based on the above, it looks like it might not be possible to detect all copies and moves-with-changes for some diffs using a single "git-diff" command.
Chris Jerdonek
Comment 10 2010-05-07 16:57:34 PDT
Created attachment 55434 [details] Proposed patch
WebKit Review Bot
Comment 11 2010-05-07 17:00:56 PDT
Attachment 55434 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1 WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnDiffHeader.pl:122: Line contains tab character. [whitespace/tab] [5] WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnDiffHeader.pl:132: Line contains tab character. [whitespace/tab] [5] Total errors found: 2 in 6 files If any of these errors are false positives, please file a bug against check-webkit-style.
Chris Jerdonek
Comment 12 2010-05-07 17:01:38 PDT
Created attachment 55435 [details] Proposed patch 2 Cleaned up ChangeLog.
WebKit Review Bot
Comment 13 2010-05-07 17:07:05 PDT
Attachment 55435 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1 WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnDiffHeader.pl:122: Line contains tab character. [whitespace/tab] [5] WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnDiffHeader.pl:132: Line contains tab character. [whitespace/tab] [5] Total errors found: 2 in 6 files If any of these errors are false positives, please file a bug against check-webkit-style.
Daniel Bates
Comment 14 2010-05-08 18:40:42 PDT
Comment on attachment 55435 [details] Proposed patch 2 We should update the documentation of sub parseDiff to reflect the fact that we now return an array. Looks sane. r=me.
Chris Jerdonek
Comment 15 2010-05-08 19:54:11 PDT
Note You need to log in before you can comment on or make changes to this bug.