RESOLVED FIXED Bug 34033
svn-apply: Change svn-apply and svn-unapply to use new parsePatch().
https://bugs.webkit.org/show_bug.cgi?id=34033
Summary svn-apply: Change svn-apply and svn-unapply to use new parsePatch().
Chris Jerdonek
Reported 2010-01-22 20:35:31 PST
This will simplify svn-apply and svn-unapply, increase their test coverage, and make them more similar. It also paves the way for supporting SVN properties, e.g.: https://bugs.webkit.org/show_bug.cgi?id=27204 parsePatch() is the unit-tested method introduced here: https://bugs.webkit.org/show_bug.cgi?id=33475
Attachments
Proposed patch (19.37 KB, patch)
2010-01-24 12:25 PST, Chris Jerdonek
eric: review-
cjerdonek: commit-queue-
Proposed patch 2 (19.33 KB, patch)
2010-04-21 19:40 PDT, Chris Jerdonek
eric: review+
cjerdonek: commit-queue-
Chris Jerdonek
Comment 1 2010-01-24 12:25:22 PST
Created attachment 47299 [details] Proposed patch I discovered Test::More's is_deeply function in this patch, which simplifies the unit tests a lot. (I can use it to simplify some of the other unit test files, the next time I make a change to one of those.) cq- so I can be around when this goes out.
Chris Jerdonek
Comment 2 2010-01-28 02:00:23 PST
Oops, like I messed up last weekend and added my check-webkit-style reviewers list instead of my svn-apply reviewer(s). Adding you now, David K. Thanks!
Eric Seidel (no email)
Comment 3 2010-02-04 15:14:08 PST
Comment on attachment 47299 [details] Proposed patch I tend to make long loop blocks into their own functions (well, long blocks in general): 616 for my $diffHashRef (@diffHashRefs) { But perhaps this one makes sense inline like this. Why is shouldForce supported here? 627 die "Index path $indexPath appears twice as a copy target." unless $shouldForce; Would that ever be a valid diff that we wanted to support processing of? It seems more like such might be caused by programmer error. If would have probably written it as a one-line perl if. prepareParsedPatch seems mostly a sanity-checking method. One which splits out into the various arrays of hashs. Maybe it should be called something with "sort" in the name, since it sorts into buckets. Although that might get confusing as it's not sorting the order of the diffs in any way. I guess if you did split the for innards out into a separate function, it could return the name of the hash and the outer loop could actually do the push. copiedFiles seems uneeded. You could do a linear search in all 3 cases, instead of having the special copiedFiles? Or you could have a separate pass over the list of hashes (either before or after splitting them out into buckets) which makes sure that none of the index paths are doubled. A separate pass strikes me as cleaner than wrapping it into this loop here. Maybe python has just gotten me used to that pattern with all of it's map/reduce/list-comprehension style processing where you tend to make lots of separate passes over your data to do different things. I assume we don't support applying multipel files, or do we? 29 my @diffHashRefs = parsePatch(*ARGV); 130 131 print "Parsed " . @diffHashRefs . " diffs from patch file.\n"; If we do, we need to update the print message. I'm sad that we do so much of svn-apply at the root level instead of using nicely named subroutines. the move-copy for loop is one example of a case where I'm surprised we don't just make it a sub routine and call it: 151 for my $copyDiffHashRef (@copyDiffHashRefs) { Maybe that's a separate patch. :) Can't perl return a list and decompose it on the fly? 133 my $preparedPatchHash = prepareParsedPatch($force, @diffHashRefs); So you could write that: my (@copyDiffHashRefs, @ nonCopyDiffHashRefs, % sourceRevisions) = prepareParsedPatch(); Or am I dreaming of python? Why don't we just call the hash-key "svnFormattedDiff": 305 my $patch = $diffHashRef->{svnConvertedText}; # SVN-formatted diff instead of adding a comment? Does this work on windows to? I guess we use CYGWIN? 54 # Relative to root 55 my $pattern = "WebKitTools/Scripts/webkitperl/*_unittest/*.pl"; 4656 Is this semantically differnet from having never defined the key? 34 copiedFromPath => undef, I'm not sure my little brain can parse those unit tests. I guess I'll have to trust you that they're sane. Overall this looks like a great change. I'd love to see your response to my naming thoughts/questions before giving this a final r+ though. Marking this r- for now, you can re-mark it r? if you believe it should land unchanged, or i'm happy to review any revised version now that I've read this one through and understand it! :)
Chris Jerdonek
Comment 4 2010-02-06 11:54:55 PST
(In reply to comment #3) Thanks a lot for taking a crack at this, Eric! Just a quick note on this patch since I agree with a lot of your sentiments below. My main goals with this patch were to connect up the unit-tested code with more unit-tested code, while minimizing the non-unit-tested changes to svn-apply and svn-unapply. In other words, I was trying to resist doing any other refactoring of svn-apply/unapply, since as you have also noticed, there is a lot of stuff that can be done. :) A consequence of this decision was keeping those hashes and arrays they were already looping through: the versions, copied files, and non-copied files. Unfortunately, this made for a unit-tested function with a somewhat awkward purpose and return value. > (From update of attachment 47299 [details]) > I tend to make long loop blocks into their own functions (well, long blocks in > general): > 616 for my $diffHashRef (@diffHashRefs) { > > But perhaps this one makes sense inline like this. Yes, I agree with your instinct here. But since I'm not sure we'll want to keep prepareParsedPatch() in its current form as the refactoring of svn-apply/unapply continues, it's probably not worth the extra effort to break the inside out. It's also possible that the inside function would have an unavoidably messy signature and return value given what prepareParsedPatch() does, but I'm not sure. > Why is shouldForce supported here? > 627 die "Index path $indexPath appears twice as a copy > target." unless $shouldForce; > > Would that ever be a valid diff that we wanted to support processing of? It > seems more like such might be caused by programmer error. Good question. The original code actually proceeds without any interruption in this case. But it didn't seem quite right to me. The compromise I made was to die, but to give the user the ability to continue if it was wrong to die (that sounds weird :) ). That way we will also be able to find out whether this case occurs in the wild. It's possible this situation can come up if patches are chained together -- I'm not sure. If we are going to make any change to this line, I would probably just take it out altogether. It's always a bit risky to constrain/add more validation to a script after the fact, in the absence of detailed documentation. > If would have > probably written it as a one-line perl if. The prevailing style I've noticed in these scripts is to use unless, e.g. the following line which is visible in the patch file: > if ($merge) { > die "--merge is currently only supported for SVN" unless isSVN(); > prepareParsedPatch seems mostly a sanity-checking method. One which splits out > into the various arrays of hashs. It was more to prepare the variables so it can be "dropped in" to svn-apply/unapply with little changes. The sanity-checking was added mostly after the fact. > Maybe it should be called something with "sort" in the name, since it sorts > into buckets. Although that might get confusing as it's not sorting the order > of the diffs in any way. Or perhaps "divide" or "split". The dictionary of revision numbers is orthogonal to the splitting up, so it's really doing a couple things. Basically, it's not the most natural function. I expect it will go away over time. > I guess if you did split the for innards out into a separate function, it could > return the name of the hash and the outer loop could actually do the push. > > copiedFiles seems uneeded. You could do a linear search in all 3 cases, > instead of having the special copiedFiles? Yes, I thought about that. I ended up choosing the route that allows the code path to be unit tested. Once we have more test coverage of svn-apply/unapply, we can refactor in a more natural way. > I assume we don't support applying multipel files, or do we? > 29 my @diffHashRefs = parsePatch(*ARGV); > 130 > 131 print "Parsed " . @diffHashRefs . " diffs from patch file.\n"; > > If we do, we need to update the print message. Yes, I believe that can be done (although I'm not sure if anyone does it), so you're right on changing the message. Applying multiple patches is one case that might have implications for the sanity-checking (as I said above), so this is another case where we're not sure on the scope of use. > I'm sad that we do so much of svn-apply at the root level instead of using > nicely named subroutines. Yes, I agree. I can envision svn-apply/unapply both being reduced to a single call to the same function -- with a boolean $shouldReverse parameter. > the move-copy for loop is one example of a case where I'm surprised we don't > just make it a sub routine and call it: > 151 for my $copyDiffHashRef (@copyDiffHashRefs) { > Maybe that's a separate patch. :) Yes, we can do more later. > Can't perl return a list and decompose it on the fly? > 133 my $preparedPatchHash = prepareParsedPatch($force, @diffHashRefs); > > So you could write that: > > my (@copyDiffHashRefs, @ nonCopyDiffHashRefs, % sourceRevisions) = > prepareParsedPatch(); Perl flattens arrays, so you can only rely on an array at the end. To do nested arrays, you need to use references. I chose to return (a reference to) a hash, though, to avoid having to rely on the ordering of the return value. I probably should be calling that "$preparedPatchHashRef". It's possible there's a way to dereference multiple hash values at once, but I'm not sure. It might make for a messy line. > Why don't we just call the hash-key "svnFormattedDiff": > 305 my $patch = $diffHashRef->{svnConvertedText}; # SVN-formatted diff > > instead of adding a comment? The containing "object" is itself called a diff, so it didn't make as much sense for me to call one of its "properties" a diff. So I chose to call it the text associated to the diff. As for converted versus formatted, I chose converted since it is doing more than just formatting. For example, there can be a loss of information in the conversion, etc. (e.g. the file modes that Git diffs have and other Git metadata). > Does this work on windows to? I guess we use CYGWIN? > 54 # Relative to root > 55 my $pattern = "WebKitTools/Scripts/webkitperl/*_unittest/*.pl"; > 4656 I tried this on Windows, and it worked (no CYGWIN). David K. has also reviewed and commented on this specific line, and I'm sure he would have mentioned something if it wasn't cross-platform. > Is this semantically differnet from having never defined the key? > 34 copiedFromPath => undef, Yes, for example-- > my $hashRef = {key1 => undef}; means that-- exists($hashRef->{key1}) is true while exists($hashRef->{key2}) is not true. I will make some adjustments and re-submit after thinking more about your comments. Thanks again!
Chris Jerdonek
Comment 5 2010-02-06 12:00:03 PST
(In reply to comment #4) > (In reply to comment #3) > > Why don't we just call the hash-key "svnFormattedDiff": > > 305 my $patch = $diffHashRef->{svnConvertedText}; # SVN-formatted diff > > > > instead of adding a comment? > > The containing "object" is itself called a diff, so it didn't make > as much sense for me to call one of its "properties" a diff. So I > chose to call it the text associated to the diff. As for converted > versus formatted, I chose converted since it is doing more than just > formatting. For example, there can be a loss of information in the > conversion, etc. (e.g. the file modes that Git diffs have and > other Git metadata). I meant to add here that I should probably just take that comment out since it is redundant. Thanks for noticing that.
Chris Jerdonek
Comment 6 2010-02-06 12:18:53 PST
(In reply to comment #4) > (In reply to comment #3) > > Is this semantically differnet from having never defined the key? > > 34 copiedFromPath => undef, > > Yes, for example-- > > > my $hashRef = {key1 => undef}; > > means that-- > > exists($hashRef->{key1}) is true while > exists($hashRef->{key2}) is not true. One more thing. For completeness, you can use-- defined($hashRef->{key1}) rather than "exists" to distinguish "undef" from other values. Both key1 and key2 have defined false in the above example.
Chris Jerdonek
Comment 7 2010-04-19 01:38:49 PDT
This is a heads up that I should be reposting a rebased patch within 1 or 2 days, incorporating whatever comments make sense at this time from Eric's comments. This was spurred by conversation with Dan Bates from this report: https://bugs.webkit.org/show_bug.cgi?id=27204#c28
Chris Jerdonek
Comment 8 2010-04-21 19:40:43 PDT
Created attachment 54016 [details] Proposed patch 2 Resubmitting in response to Eric's comments in comment 3. It's been quite a while since patch #1 was submitted. Since it's been such a long time, below are the differences this patch has with patch #1: diff --git a/WebKitTools/Scripts/VCSUtils.pm b/WebKitTools/Scripts/VCSUtils.pm index 9c0b5bd..6c6f938 100644 --- a/WebKitTools/Scripts/VCSUtils.pm +++ b/WebKitTools/Scripts/VCSUtils.pm @@ -611,7 +611,7 @@ sub prepareParsedPatch($@) my %copiedFiles; # Return values - my @copyDiffHashRefs = (); # FIXME: Remove these initializers? + my @copyDiffHashRefs = (); my @nonCopyDiffHashRefs = (); my %sourceRevisionHash = (); for my $diffHashRef (@diffHashRefs) { @@ -624,9 +624,9 @@ sub prepareParsedPatch($@) # Then the diff is a copy operation. $sourcePath = $copiedFromPath; - if (exists($copiedFiles{$indexPath})) { - die "Index path $indexPath appears twice as a copy target." unless $shouldForce; - } + # FIXME: Consider printing a warning or exiting if + # exists($copiedFiles{$indexPath}) is true -- i.e. if + # $indexPath appears twice as a copy target. $copiedFiles{$indexPath} = $sourcePath; push @copyDiffHashRefs, $diffHashRef; diff --git a/WebKitTools/Scripts/svn-apply b/WebKitTools/Scripts/svn-apply index 20a3f06..13b215b 100755 --- a/WebKitTools/Scripts/svn-apply +++ b/WebKitTools/Scripts/svn-apply @@ -128,7 +128,7 @@ my %copiedFiles; # otherwise get a bareword error. my @diffHashRefs = parsePatch(*ARGV); -print "Parsed " . @diffHashRefs . " diffs from patch file.\n"; +print "Parsed " . @diffHashRefs . " diffs from patch file(s).\n"; my $preparedPatchHash = prepareParsedPatch($force, @diffHashRefs); @@ -303,7 +303,7 @@ sub patch($) { my ($diffHashRef) = @_; - my $patch = $diffHashRef->{svnConvertedText}; # SVN-formatted diff + my $patch = $diffHashRef->{svnConvertedText}; unless ($patch =~ m|^Index: ([^\r\n]+)|) { my $separator = '-' x 67; diff --git a/WebKitTools/Scripts/svn-unapply b/WebKitTools/Scripts/svn-unapply index ec66b33..d5b95e8 100755 --- a/WebKitTools/Scripts/svn-unapply +++ b/WebKitTools/Scripts/svn-unapply @@ -101,7 +101,7 @@ my %directoriesToCheck; # otherwise get a bareword error. my @diffHashRefs = parsePatch(*ARGV); -print "Parsed " . @diffHashRefs . " diffs from patch file.\n"; +print "Parsed " . @diffHashRefs . " diffs from patch file(s).\n"; my $preparedPatchHash = prepareParsedPatch($force, @diffHashRefs); @@ -139,7 +139,7 @@ sub patch($) { my ($diffHashRef) = @_; - my $patch = $diffHashRef->{svnConvertedText}; # SVN-formatted diff + my $patch = $diffHashRef->{svnConvertedText}; unless ($patch =~ m|^Index: ([^\r\n]+)|) { my $separator = '-' x 67;
Chris Jerdonek
Comment 9 2010-04-23 11:22:42 PDT
(In reply to comment #8) > Created an attachment (id=54016) [details] > Proposed patch 2 Hi Eric, can you r+ this change? A few months back (in comment 3), you more or less said you were willing to r+ it: > I'd love to see your response to my > naming thoughts/questions before giving this a final r+ though. Marking this > r- for now, you can re-mark it r? if you believe it should land unchanged The update I just submitted is largely the same as the first patch with only minor differences in response to your comments. Thanks!
Eric Seidel (no email)
Comment 10 2010-04-23 12:50:34 PDT
Comment on attachment 54016 [details] Proposed patch 2 Looks right as far as I can tell. Please be sure you ran test-webkitpy --all as those have more tests of svn-apply.
Chris Jerdonek
Comment 11 2010-04-23 13:11:44 PDT
(In reply to comment #10) > (From update of attachment 54016 [details]) > Looks right as far as I can tell. Please be sure you ran test-webkitpy --all > as those have more tests of svn-apply. Will do, thanks. Will also be monitoring after landing.
Daniel Bates
Comment 12 2010-04-23 13:14:12 PDT
Awesome. Then we can look at bug #27204. (In reply to comment #11) > (In reply to comment #10) > > (From update of attachment 54016 [details] [details]) > > Looks right as far as I can tell. Please be sure you ran test-webkitpy --all > > as those have more tests of svn-apply. > > Will do, thanks. Will also be monitoring after landing.
Chris Jerdonek
Comment 13 2010-04-23 13:16:14 PDT
(In reply to comment #12) > Awesome. Then we can look at bug #27204. Definitely! I'll be touch with you about it off-line. Thanks again for your patience. :)
Chris Jerdonek
Comment 14 2010-04-25 07:14:28 PDT
(In reply to comment #10) > (From update of attachment 54016 [details]) > Looks right as far as I can tell. Please be sure you ran test-webkitpy --all > as those have more tests of svn-apply. Good suggestion to run "test-webkitpy --all". It turns out the new parsing code didn't support the format used by binary diffs. I've added that support in a patch on another bug (bug 38094), so that "test-webkitpy --all" should now pass with this change. Once bug 38094 lands, I should be able to land this as is.
Chris Jerdonek
Comment 15 2010-04-29 03:35:14 PDT
Note You need to log in before you can comment on or make changes to this bug.