* SUMMARY If you use the "[diff] renames = copies" git config setting (or the --find-copies-harder switch) with git diff, git will include information in the patch that hints at the history of the code and may significantly reduce the size of the patch. While git uses only heuristics to accomplish this, it would be nice if svn-apply understood these directives so that it could map them to equivalent svn commands. * EXAMPLES $ git mv Makefile NMakefile$ git diff --find-copies-harder HEAD diff --git a/Makefile b/NMakefile similarity index 100% rename from Makefile rename to NMakefile $ git diff --find-copies-harder HEAD diff --git a/Makefile b/NMakefile similarity index 100% copy from Makefile copy to NMakefile $ git mv WebCore/ChangeLog WebCore/ChangeLog-2009-12-21 $ vi WebCore/ChangeLog-2009-12-21 $ git add WebCore/ChangeLog-2009-12-21 $ git diff --find-copies-harder --cached diff --git a/WebCore/ChangeLog b/WebCore/ChangeLog-2009-12-21 similarity index 99% rename from WebCore/ChangeLog rename to WebCore/ChangeLog-2009-12-21 index ab18e51..8cb2b03 100644 --- a/WebCore/ChangeLog +++ b/WebCore/ChangeLog-2009-12-21 @@ -72516,7 +72516,7 @@ form controls that have different validity states. * bindings/v8/custom/V8CustomBinding.h: * bindings/v8/custom/V8HTMLDataGridElementCustom.cpp: Added. -2009-06-24 David Kilzer <ddkilzer@apple.com> +2009-06-24 David Kilzer <ddkilzer@webkit.org> Build fixes for ENABLE(PLUGIN_PROXY_FOR_VIDEO) NOTE: For small files (under a certain number of lines), the rename detection doesn't work.
I have a general question about our approach to improving support for Git. For the purpose of submitting Git patches, it seems there can be (at least) two approaches: (1) Make a "create-patch" script that supports creating a patch in standard format from both SVN and Git. (2) Make svn-apply and svn-unapply support both Git and SVN format. (Combining both approaches may also make sense -- e.g. to address the issue in this bug report.) From what I can tell, we currently do (2) to some extent, but not (1). Has anyone given much thought to these approaches? Are there any reasons to prefer (2) over (1) or vice versa, or to not do (1)?
(In reply to comment #1) > For the purpose of submitting Git patches, it seems there can be (at least) two > approaches: > > (1) Make a "create-patch" script that supports creating a patch in standard > format from both SVN and Git. > > (2) Make svn-apply and svn-unapply support both Git and SVN format. > > (Combining both approaches may also make sense -- e.g. to address the > issue in this bug report.) > > From what I can tell, we currently do (2) to some extent, but not (1). Has > anyone given much thought to these approaches? Are there any reasons to > prefer (2) over (1) or vice versa, or to not do (1)? I think (2) is a higher priority than (1) because git's own diff subcommand (with the --binary switch) is sufficient to capture all of the changes in a patch. Because the project's source repository is subversion (with a moderate number of developers using git), I think it's slightly more important that git patches work with the tools to apply patches to an svn working directory than vice-versa.
(In reply to comment #2) > (In reply to comment #1) > > For the purpose of submitting Git patches, it seems there can be (at least) two > > approaches: > > > > (1) Make a "create-patch" script that supports creating a patch in standard > > format from both SVN and Git. > > > > (2) Make svn-apply and svn-unapply support both Git and SVN format. > > > > (Combining both approaches may also make sense -- e.g. to address the > > issue in this bug report.) > > > > From what I can tell, we currently do (2) to some extent, but not (1). Has > > anyone given much thought to these approaches? Are there any reasons to > > prefer (2) over (1) or vice versa, or to not do (1)? > > I think (2) is a higher priority than (1) because git's own diff subcommand > (with the --binary switch) is sufficient to capture all of the changes in a > patch. I gave an example here of a case where the "git diff" command doesn't generate a patch in the standard format (at least with the way that I was running it): https://lists.webkit.org/pipermail/webkit-dev/2010-January/011495.html The example occurred because "git diff" doesn't run fixChangeLogPatch. > Because the project's source repository is subversion (with a moderate number > of developers using git), I think it's slightly more important that git patches > work with the tools to apply patches to an svn working directory than > vice-versa.
(In reply to comment #3) > (In reply to comment #2) > > (In reply to comment #1) > > > For the purpose of submitting Git patches, it seems there can be (at least) two > > > approaches: > > > > > > (1) Make a "create-patch" script that supports creating a patch in standard > > > format from both SVN and Git. > > > > > > (2) Make svn-apply and svn-unapply support both Git and SVN format. > > > > > > (Combining both approaches may also make sense -- e.g. to address the > > > issue in this bug report.) > > > > > > From what I can tell, we currently do (2) to some extent, but not (1). Has > > > anyone given much thought to these approaches? Are there any reasons to > > > prefer (2) over (1) or vice versa, or to not do (1)? > > > > I think (2) is a higher priority than (1) because git's own diff subcommand > > (with the --binary switch) is sufficient to capture all of the changes in a > > patch. > > I gave an example here of a case where the "git diff" command doesn't generate > a patch in the standard format (at least with the way that I was running it): > > https://lists.webkit.org/pipermail/webkit-dev/2010-January/011495.html > > The example occurred because "git diff" doesn't run fixChangeLogPatch. It'd be great if you could post the example here. I think I know what you're referring to, but it's always best to have a concrete example. I don't think this has been an issue in practice because: - There is no "loss of information" if the unaltered ChangeLog patch is not fixed up, and it may optionally be fixed up at any point in the future. All the pieces are there, just not in the optimal order for applying it to the top of the ChangeLog file. - The commit queue uses svn-apply to apply patches which fixes up the ChangeLog patch before applying it. Individual contributors may also be using svn-apply to apply patches, or they may be using resolve-ChangeLogs afterwards. - Even in the case where the patch is completely rejected, there's still enough information in the *.rej file to reconstruct the original patch, fix it up, then apply it. (I don't think fixChangeLogPatch() works with non-unified diff format, though, so this may not work properly today.) Having said that, if you feel that (1) is more important than (2), then I'd focus on that first. For whatever reason, this issue hasn't come up much. Maybe that's because the ChangeLog patches end up getting applied successfully to the middle of the ChangeLog file instead of at the beginning, but look "okay" otherwise, and no one has noticed/complained about it yet. (I think I've seen this behavior in the commit queue, but I though it was due to something else at the time.)
(In reply to comment #4) > It'd be great if you could post the example here. I think I know what you're > referring to, but it's always best to have a concrete example. Here is a concrete example (same submitter on same day): https://bugs.webkit.org/show_bug.cgi?id=34249#c1 https://bug-34249-attachments.webkit.org/attachment.cgi?id=47606 I needed to manually commit it since I had to make changes anyways. > I don't think this has been an issue in practice because: I recognize that this might be totally innocuous, but just wanted to raise it as something else that can come up. I want to avoid having to use wrapper scripts if possible, like you. > - There is no "loss of information" if the unaltered ChangeLog patch is not > fixed up, and it may optionally be fixed up at any point in the future. All > the pieces are there, just not in the optimal order for applying it to the top > of the ChangeLog file. > > - The commit queue uses svn-apply to apply patches which fixes up the ChangeLog > patch before applying it. Individual contributors may also be using svn-apply > to apply patches, or they may be using resolve-ChangeLogs afterwards. If this is the case, it seems like it might not be necessary to call fixChangeLogPatch in svn-create-patch.
(In reply to comment #5) > (In reply to comment #4) > > It'd be great if you could post the example here. I think I know what you're > > referring to, but it's always best to have a concrete example. > > Here is a concrete example (same submitter on same day): > > https://bugs.webkit.org/show_bug.cgi?id=34249#c1 > > https://bug-34249-attachments.webkit.org/attachment.cgi?id=47606 > > I needed to manually commit it since I had to make changes anyways. This one is relatively minor. If you have a patch series with the same date/author/email line followed by the same "Reviewed by NOBODY(OOPS!)." line, then you'll have even more of an offset. (FWIW, this is another reason I like bug titles before the "Reviewed by" line, but that's really neither here nor there.) > > - There is no "loss of information" if the unaltered ChangeLog patch is not > > fixed up, and it may optionally be fixed up at any point in the future. All > > the pieces are there, just not in the optimal order for applying it to the top > > of the ChangeLog file. > > > > - The commit queue uses svn-apply to apply patches which fixes up the ChangeLog > > patch before applying it. Individual contributors may also be using svn-apply > > to apply patches, or they may be using resolve-ChangeLogs afterwards. > > If this is the case, it seems like it might not be necessary to call > fixChangeLogPatch in svn-create-patch. It's nice to clean up the ChangeLog patches, though, as long as you're using svn-create-patch. The corrected patch shows the intent of the ChangeLog patch much better than the uncorrected version.
(In reply to comment #6) > This one is relatively minor. If you have a patch series with the same > date/author/email line followed by the same "Reviewed by NOBODY(OOPS!)." line, > then you'll have even more of an offset. (FWIW, this is another reason I like > bug titles before the "Reviewed by" line, but that's really neither here nor > there.) Have you considered any schemes to break the symmetry even more cleanly -- e.g. including the time in the date stamp?
(In reply to comment #7) > (In reply to comment #6) > > This one is relatively minor. If you have a patch series with the same > > date/author/email line followed by the same "Reviewed by NOBODY(OOPS!)." line, > > then you'll have even more of an offset. (FWIW, this is another reason I like > > bug titles before the "Reviewed by" line, but that's really neither here nor > > there.) > > Have you considered any schemes to break the symmetry even more cleanly -- e.g. > including the time in the date stamp? No, I haven't. I don't think that a timestamp would add any useful information to the date/name/email line, and it would complicate the parsing of that line.
I was a bit surprised after comparing the behavior of the -C, -M, and --find-copies-harder options of git-diff. It seems that in some cases the --find-copies-harder option will do a worse job of detecting renames than the -C and -M options (which are less expensive at least for a smaller number of changed files). See below, for example (it's the last of the three scenarios where the difference can be observed). # Make a copy. cp Makefile NMakefile git add NMakefile git commit -a git diff --find-copies-harder HEAD^ # detects the copy git diff -C HEAD^ # does not detect the copy # Turn the copy into a move. rm Makefile git commit -a --amend git diff --find-copies-harder HEAD^ # detects the rename git diff -C HEAD^ # detects the rename # Turn the move into a move-with-changes. echo "more content" >> NMakefile git commit -a --amend git diff --find-copies-harder HEAD^ # does not detect the rename git diff -C HEAD^ # does detect the rename (output is below) git diff -M HEAD^ # same result as -C diff --git a/Makefile b/NMakefile similarity index 99% rename from Makefile rename to NMakefile index 1e50d1d..1459d21 100644 --- a/Makefile +++ b/NMakefile @@ -15,3 +15,4 @@ release r deployment dep deploy: clean: @for dir in $(MODULES); do ${MAKE} $@ -C $$dir; exit_status=$$?; \ if [ $$exit_status -ne 0 ]; then exit $$exit_status; fi; done +test From the git-diff manual, this seems to be because --find-copies-hard searches for potential sources by examining *only* those files that haven't changed, and not both files that have and haven't changed. Consequently, it may make sense in more cases to prefer using -C (or -M) over --find-copies-harder. This is in part because I imagine moves and moves-with-changes are more common for contributors than straight copying of files (without also deleting the source). Also, based on the above, it looks like it might not be possible to detect all copies and moves-with-changes for some diffs using a single "git-diff" command.
Created attachment 55434 [details] Proposed patch
Attachment 55434 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1 WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnDiffHeader.pl:122: Line contains tab character. [whitespace/tab] [5] WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnDiffHeader.pl:132: Line contains tab character. [whitespace/tab] [5] Total errors found: 2 in 6 files If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 55435 [details] Proposed patch 2 Cleaned up ChangeLog.
Attachment 55435 [details] did not pass style-queue: Failed to run "['WebKitTools/Scripts/check-webkit-style', '--no-squash']" exit_code: 1 WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnDiffHeader.pl:122: Line contains tab character. [whitespace/tab] [5] WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseSvnDiffHeader.pl:132: Line contains tab character. [whitespace/tab] [5] Total errors found: 2 in 6 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 55435 [details] Proposed patch 2 We should update the documentation of sub parseDiff to reflect the fact that we now return an array. Looks sane. r=me.
Committed: http://trac.webkit.org/changeset/59044