38093 – svn-apply: Remove unnecessary dividing line (i.e. "====...") logic

RESOLVED FIXED38093

svn-apply: Remove unnecessary dividing line (i.e. "====...") logic

https://bugs.webkit.org/show_bug.cgi?id=38093

Summary svn-apply: Remove unnecessary dividing line (i.e. "====...") logic

Chris Jerdonek

Reported 2010-04-25 05:23:23 PDT

This report is to simplify the code called by svn-apply and svn-unapply in the following way: The code for parsing git-formatted patches contains logic for adding a dividing line (i.e. "====...") to make the patch look more SVN-like: http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/VCSUtils.pm?rev=57453#L374 This code is cosmetic and isn't really necessary. In fact, it doesn't do what it purports to do because it doesn't handle lines lacking the 6-character file mode string or lines containing more than 7 characters of the git hash. Removing this code will simplify the code somewhat and make some subsequent changes somewhat easier.

Attachments
Proposed patch (5.89 KB, patch) 2010-04-25 05:47 PDT, Chris Jerdonek	cjerdonek: commit-queue-	Details Formatted Diff Diff
Proposed patch 2 (8.20 KB, patch) 2010-04-25 06:37 PDT, Chris Jerdonek	cjerdonek: commit-queue-	Details Formatted Diff Diff
Proposed patch 3 (8.20 KB, patch) 2010-04-25 06:42 PDT, Chris Jerdonek	cjerdonek: commit-queue-	Details Formatted Diff Diff
Proposed patch 4 (8.21 KB, patch) 2010-04-28 06:18 PDT, Chris Jerdonek	cjerdonek: commit-queue-	Details Formatted Diff Diff
Proposed patch 5 (8.26 KB, patch) 2010-04-28 06:25 PDT, Chris Jerdonek	dbates: review+ cjerdonek: commit-queue-	Details Formatted Diff Diff
Show Obsolete (4) View All Add attachment proposed patch, testcase, etc.

Chris Jerdonek

Comment 1 2010-04-25 05:47:36 PDT

Created attachment 54236 [details] Proposed patch

Chris Jerdonek

Comment 2 2010-04-25 05:49:19 PDT

(In reply to comment #1) > Created an attachment (id=54236) [details] > Proposed patch By the way, I ran "test-webkitpy --all" and the svn-apply unit tests there checked out okay.

Chris Jerdonek

Comment 3 2010-04-25 06:37:21 PDT

Created attachment 54238 [details] Proposed patch 2 Added the same correction to parseDiff.pl that I made to parseDiffHeader.pl.

Chris Jerdonek

Comment 4 2010-04-25 06:42:17 PDT

Created attachment 54239 [details] Proposed patch 3 Fixed typo in ChangeLog (somwhat).

Daniel Bates

Comment 5 2010-04-26 12:36:00 PDT

Comment on attachment 54239 [details] Proposed patch 3 > diff --git a/WebKitTools/Scripts/VCSUtils.pm b/WebKitTools/Scripts/VCSUtils.pm > index 25a319b..7e09839 100644 > --- a/WebKitTools/Scripts/VCSUtils.pm > +++ b/WebKitTools/Scripts/VCSUtils.pm > @@ -371,10 +371,6 @@ sub gitdiff2svndiff($) > if (m#^diff --git \w/(.+) \w/([^\r\n]+)#) { > return "Index: $1$POSTMATCH"; > } > - if (m#^index [0-9a-f]{7}\.\.[0-9a-f]{7} [0-9]{6}#) { > - # FIXME: No need to return dividing line once parseDiffHeader() is used. > - return "===================================================================$POSTMATCH"; > - } I agree, this doesn't seem needed. Moreover, as we discussed in person, by keeping this Git diff extension we can error check file mode changes. > - $_ = "=" x 67 . "$eol$_"; # Prepend dividing line ===.... (*) > } elsif (s/^\+\+\+ \S+/+++ $indexPath/) { > # +++ > $foundHeaderEnding = 1; > - } else { > - # Skip unrecognized lines. > - next; > } Now that we do not remove the git diff extension of the form "index [0-9a-f]{7}\.\.[0-9a-f]{7} [0-9]{6}", these lines (as well as other unrecognized lines such Git file mode changes) can appear in the SVN converted text. So, the SVN converted text may not correspond to a properly formatted SVN patch. If we want to keep a SVN converted form of the patch around then I would suggest that the converted text reflect a properly formatted SVN patch. Or at least, we should change the comments about svnConvertedText to reflect the fact that it represents a reasonable conversion. Some additional notes... From testing, the patch command does not seem to mind if extra lines appear in the diff header, so long as the ---/+++ lines exist. Notice, if we keep the "else" statement and (*) then the SVN converted text will reflect a properly formatted SVN patch, as produced by svn diff.

Chris Jerdonek

Comment 6 2010-04-26 13:11:12 PDT

(In reply to comment #5) > (From update of attachment 54239 [details]) Thanks for the comments on the patch, Daniel. As you probably noticed, I need to make this change and the change in bug 38094 before landing bug 34033. > > } elsif (s/^\+\+\+ \S+/+++ $indexPath/) { > > # +++ > > $foundHeaderEnding = 1; > > - } else { > > - # Skip unrecognized lines. > > - next; > > } > Now that we do not remove the git diff extension of the form "index > [0-9a-f]{7}\.\.[0-9a-f]{7} [0-9]{6}", these lines (as well as other > unrecognized lines such Git file mode changes) can appear in the SVN converted > text. So, the SVN converted text may not correspond to a properly formatted SVN > patch. It turns out that the current code already allows some "non-SVN" lines to pass through in certain cases. That's what I meant in comment 0 when I said that the line in gitdiff2svndiff() "doesn't do what it purports to do...". For example, some git diffs leave out the 6-character file mode string (I think for new files, for example), and other git diffs include full git hashes and not just 7 characters. The current code -- which doesn't use parseDiffHeader() -- allows those lines to pass through. Maybe I'll dig up a couple examples from bugs.webkit.org and post links to them here. > If we want to keep a SVN converted form of the patch around then I would > suggest that the converted text reflect a properly formatted SVN patch. Or at > least, we should change the comments about svnConvertedText to reflect the fact > that it represents a reasonable conversion. The latter is probably better since it might not make sense to speak of converting a git header into an SVN header in all cases (since they store different information). Note that your point can also be made of gitdiff2svndiff() since it allows any lines to pass through that don't match any of the if-block regular expressions: http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/VCSUtils.pm?rev=57453#L365 Really, what it's doing is converting a *subset* of the git diff lines to lines recognizable by the SVN-parsing code. It just happens to look like it's converting the header into a properly formatted SVN header because that's what it does for the most common case. > Some additional notes... > From testing, the patch command does not seem to mind if extra lines appear in > the diff header, so long as the ---/+++ lines exist. Yes, that's what I've found, too. In the long-term, it may make sense just to get rid of the svnConvertedText property altogether. Whatever information is needed from the header we can store in a structured way as properties of the object, so we don't lose any information. That way the calling code doesn't have to reparse the lines, and we won't be storing the same information in multiple spots (the data in the object will be more "normalized").

Chris Jerdonek

Comment 7 2010-04-26 13:16:26 PDT

(In reply to comment #6) > Maybe I'll dig up a couple examples from bugs.webkit.org and post links to them > here. A couple examples can be seen in this attachment: https://bug-38099-attachments.webkit.org/attachment.cgi?id=54257 Here the 6-character file mode does not appear in the index line: diff --git a/LayoutTests/platform/chromium-linux/fast/css-generated-content/after-duplicated-after-split-expected.checksum b/LayoutTests/platform/chromium-linux/fast/css-generated-content/after-duplicated-after-split-expected.checksum new file mode 100644 index 0000000..273b433 --- /dev/null +++ b/LayoutTests/platform/chromium-linux/fast/css-generated-content/after-duplicated-after-split-expected.checksum @@ -0,0 +1 @@ And here, full git hashes are being used: diff --git a/LayoutTests/platform/chromium-linux/fast/css-generated-content/after-duplicated-after-split-expected.png b/LayoutTests/platform/chromium-linux/fast/css-generated-content/after-duplicated-after-split-expected.png new file mode 100644 index 0000000000000000000000000000000000000000..809d7df4235478cc35d5378fb2b181997a06e356 GIT binary patch

Daniel Bates

Comment 8 2010-04-26 14:03:52 PDT

(In reply to comment #6) > (In reply to comment #5) > > (From update of attachment 54239 [details] [details]) > > Thanks for the comments on the patch, Daniel. As you probably noticed, I need > to make this change and the change in bug 38094 before landing bug 34033. > > > > } elsif (s/^\+\+\+ \S+/+++ $indexPath/) { > > > # +++ > > > $foundHeaderEnding = 1; > > > - } else { > > > - # Skip unrecognized lines. > > > - next; > > > } > > > Now that we do not remove the git diff extension of the form "index > > [0-9a-f]{7}\.\.[0-9a-f]{7} [0-9]{6}", these lines (as well as other > > unrecognized lines such Git file mode changes) can appear in the SVN converted > > text. So, the SVN converted text may not correspond to a properly formatted SVN > > patch. > > It turns out that the current code already allows some "non-SVN" lines to pass > through in certain cases. That's what I meant in comment 0 when I said that > the line in gitdiff2svndiff() "doesn't do what it purports to do...". Correct, and as you pointed out, gitdiff2svndiff only recognizes and translates some subset of git diff extensions to the equivalent SVN diff extensions. Shouldn't we skip unrecognized lines then (i.e. keep the "else {... next; }" clause (*) in parseDiffHeader())? Then svnConvertedText represents a properly formatted SVN diff header. > For example, some git diffs leave out the 6-character file mode string (I think > for new files, for example), and other git diffs include full git hashes and > not just 7 characters. The current code -- which doesn't use parseDiffHeader() > -- allows those lines to pass through. Right. When we switch over to using the parsePatch()/parseDiffHeader() code path and if we keep (*) then such lines would not be part of svnConvertedText. > In the long-term, it may make sense just to get rid of the svnConvertedText > property altogether. Whatever information is needed from the header we can > store in a structured way as properties of the object, so we don't lose any > information. That way the calling code doesn't have to reparse the lines, and > we won't be storing the same information in multiple spots (the data in the > object will be more "normalized"). Right. We seem to only keep it around as a by-product of using gitdiff2svndiff and for unit testing.

Chris Jerdonek

Comment 9 2010-04-26 14:58:41 PDT

(In reply to comment #8) > Correct, and as you pointed out, gitdiff2svndiff only recognizes and translates > some subset of git diff extensions to the equivalent SVN diff extensions. > Shouldn't we skip unrecognized lines then (i.e. keep the "else {... next; }" > clause (*) in parseDiffHeader())? Then svnConvertedText represents a properly > formatted SVN diff header. For the benefit of others reading the thread, Daniel and I discussed this on IRC. One of the reasons that parseDiffHeader() needs to preserve unrecognized lines (at least in some cases) is that svn-apply has code which relies on certain Git-specific lines being present in the lines "converted" by gitdiff2svndiff(), e.g. http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/svn-apply?rev=57455#L347 Eventually, we should probably refactor that parsing code into parseDiffHeader() so that svn-apply doesn't need to do any parsing itself. I included this as a FIXME in the patches for bug 34033. See line 321 of the svn-apply portion of this patch, for instance: https://bugs.webkit.org/attachment.cgi?id=54016&action=prettypatch That is the reason why, for simplicity, I thought it would be best to preserve the existing behavior for now and allow all lines to pass through (even the unrecognized Git-specific ones).

Daniel Bates

Comment 10 2010-04-26 23:27:45 PDT

> --- a/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseDiff.pl > +++ b/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseDiff.pl > [...] > @@ -288,19 +288,19 @@ sub testParseDiffAssertionArgs($) > my $testNameStart = "parseDiff(): [$testCaseHashRef->{diffName}] "; > > my @assertionArgsArrayRefs; # Return value > - my @assertionArgs; > + > foreach my $diffHashRefKey (@diffHashRefKeys) { > my $testName = "${testNameStart}key=\"$diffHashRefKey\""; > - @assertionArgs = ($diffHashRef->{$diffHashRefKey}, $testCaseHashRef->{$diffHashRefKey}, $testName); > - push(@assertionArgsArrayRefs, \@assertionArgs); > + my @assertionArgs1 = ($diffHashRef->{$diffHashRefKey}, $testCaseHashRef->{$diffHashRefKey}, $testName); > + push(@assertionArgsArrayRefs, \@assertionArgs1); > } > > - @assertionArgs = ($lastReadLine, $testCaseHashRef->{lastReadLine}, "${testNameStart}lastReadLine"); > - push(@assertionArgsArrayRefs, \@assertionArgs); > + my @assertionArgs2 = ($lastReadLine, $testCaseHashRef->{lastReadLine}, "${testNameStart}lastReadLine"); > + push(@assertionArgsArrayRefs, \@assertionArgs2); > > my $nextLine = <$fileHandle>; > - @assertionArgs = ($nextLine, $testCaseHashRef->{nextLine}, "${testNameStart}nextLine"); > - push(@assertionArgsArrayRefs, \@assertionArgs); > + my @assertionArgs3 = ($nextLine, $testCaseHashRef->{nextLine}, "${testNameStart}nextLine"); > + push(@assertionArgsArrayRefs, \@assertionArgs3); > From my understanding of (2) of section Making References of PerlRef <http://perldoc.perl.org/perlref.html>, we do not actually have to define new variables @assertionArgs1, @assertionArgs2, @assertionArgs3. Instead, we can write this portion of the patch using square-bracket notation (which returns a reference to an anonymous array) as follows: ... my $assertion; foreach my $diffHashRefKey (@diffHashRefKeys) { my $testName = "${testNameStart}key=\"$diffHashRefKey\""; $assertion = [$diffHashRef->{$diffHashRefKey}, $testCaseHashRef->{$diffHashRefKey}, $testName]; push(@assertionArgsArrayRefs, $assertion); } $assertion = [$lastReadLine, $testCaseHashRef->{lastReadLine}, "${testNameStart}lastReadLine"]; push(@assertionArgsArrayRefs, $assertion); my $nextLine = <$fileHandle>; $assertion = [$nextLine, $testCaseHashRef->{nextLine}, "${testNameStart}nextLine"]; push(@assertionArgsArrayRefs, $assertion); > return @assertionArgsArrayRefs; > } > diff --git a/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseDiffHeader.pl b/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseDiffHeader.pl > index a7a3c26..a828800 100644 > --- a/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseDiffHeader.pl > +++ b/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseDiffHeader.pl > [...] > @@ -247,19 +248,19 @@ sub testParseDiffHeaderAssertionArgs($) > my $testNameStart = "parseDiffHeader(): [$testCaseHashRef->{diffName}] "; > > my @assertionArgsArrayRefs; # Return value > - my @assertionArgs; > + > foreach my $diffHeaderHashRefKey (@diffHeaderHashRefKeys) { > my $testName = "${testNameStart}key=\"$diffHeaderHashRefKey\""; > - @assertionArgs = ($headerHashRef->{$diffHeaderHashRefKey}, $testCaseHashRef->{$diffHeaderHashRefKey}, $testName); > - push(@assertionArgsArrayRefs, \@assertionArgs); > + my @assertionArgs1 = ($headerHashRef->{$diffHeaderHashRefKey}, $testCaseHashRef->{$diffHeaderHashRefKey}, $testName); > + push(@assertionArgsArrayRefs, \@assertionArgs1); > } > > - @assertionArgs = ($lastReadLine, $testCaseHashRef->{lastReadLine}, "${testNameStart}lastReadLine"); > - push(@assertionArgsArrayRefs, \@assertionArgs); > + my @assertionArgs2 = ($lastReadLine, $testCaseHashRef->{lastReadLine}, "${testNameStart}lastReadLine"); > + push(@assertionArgsArrayRefs, \@assertionArgs2); > > my $nextLine = <$fileHandle>; > - @assertionArgs = ($nextLine, $testCaseHashRef->{nextLine}, "${testNameStart}nextLine"); > - push(@assertionArgsArrayRefs, \@assertionArgs); > + my @assertionArgs3 = ($nextLine, $testCaseHashRef->{nextLine}, "${testNameStart}nextLine"); > + push(@assertionArgsArrayRefs, \@assertionArgs3); Similarly, we can write this portion of the patch using the square-bracket notation.

Chris Jerdonek

Comment 11 2010-04-28 06:18:37 PDT

Created attachment 54558 [details] Proposed patch 4 Incorporated Daniel's suggestion to create references to anonymous arrays instead of creating arrays. (Great suggestion, Daniel!)

Chris Jerdonek

Comment 12 2010-04-28 06:25:06 PDT

Created attachment 54559 [details] Proposed patch 5 Renamed $assertionArgs variables to $assertionArgsRef since they now represent references to arrays rather than arrays.

Daniel Bates

Comment 13 2010-04-28 09:34:52 PDT

Comment on attachment 54559 [details] Proposed patch 5 Thanks Chris! I didn't see the changes that we discussed about updating the svnConvertedText comments to reflect the fact that this variable does not correspond to a properly formatted SVN diff. r=me.

Chris Jerdonek

Comment 14 2010-04-28 09:50:50 PDT

(In reply to comment #13) Thanks for the review, Daniel! > (From update of attachment 54559 [details]) > Thanks Chris! I didn't see the changes that we discussed about updating the > svnConvertedText comments to reflect the fact that this variable does not > correspond to a properly formatted SVN diff. You're right. I overlooked that. I'll be sure to update the code comment prior to landing.

Chris Jerdonek

Comment 15 2010-04-28 19:33:21 PDT

Committed: http://trac.webkit.org/changeset/58464

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component Tools / Tests

Assignee

Chris Jerdonek

Reported

2010-04-25 05:23 PDT

Modified

2010-04-28 19:33 PDT History

CC List

5 users Show

URL

Keywords

Depends on

Blocks

38094

Dependencies

tree graph