Bug 39207 - webkit-patch upload needs to pass diff to check-webkit-style
Summary: webkit-patch upload needs to pass diff to check-webkit-style
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 48041
  Show dependency treegraph
 
Reported: 2010-05-17 01:36 PDT by Eric Seidel (no email)
Modified: 2010-10-21 08:18 PDT (History)
5 users (show)

See Also:


Attachments
patch (19.34 KB, patch)
2010-05-24 00:52 PDT, Eric Seidel (no email)
abarth: review-
abarth: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2010-05-17 01:36:47 PDT
webkit-patch upload needs to pass diff to check-webkit-style

Right now both webkit-patch and check-webkit-style compute their own copy of the current diff.  This is very expensive, especially under SVN.

We should pass the diff to check-webkit-style via stdin, we have it sitting in a string right there during the CheckStyle step.  All we need to do is call self.cached_lookup(state, "diff") and pass it as input= as part of run_command.  We obviously would have to teach check-webkit-style how to expect the diff over stdin, but that should be easy to do.

I expect this could be a very large speedup for webkit-patch users.
Comment 1 Eric Seidel (no email) 2010-05-17 01:38:26 PDT
We should also pass the diff to prepare-ChangeLog, although that can be done in a separate bug.
Comment 2 Eric Seidel (no email) 2010-05-24 00:52:15 PDT
Created attachment 56854 [details]
patch
Comment 3 Eric Seidel (no email) 2010-05-24 00:54:20 PDT
Comment on attachment 56854 [details]
patch

Actually, this is ready for review.  This makes webkit-patch upload 15s faster on my laptop, and should be an even bigger speed improvement for SVN users.
Comment 4 Adam Barth 2010-05-24 02:20:35 PDT
Comment on attachment 56854 [details]
patch

The invalidation question below is important or else we could upload the wrong diff.

WebKitTools/Scripts/webkitpy/tool/steps/abstractstep.py: 
 +          command = [port.script_path(script_name)]
This seems unrelated, but ok.

WebKitTools/Scripts/webkitpy/tool/steps/checkstyle.py:52
 +          diff = self.cached_lookup(state, "diff")
Please make sure that prepare change log and edit change log correctly invalidate the diff.  I'm not sure if this really saves much since we need to recompute the diff after editing the change log anyway.  We need to track the modified files.
Comment 5 Chris Jerdonek 2010-05-24 09:34:06 PDT
It's a layering violation for TextFileReader to have to know about patches.  TextFileReader shouldn't have a _patch_reader data attribute.  It also shouldn't have a process_files_listed_in_patch() method.

Side question: is there a reason we want code in webkitpy to be calling other code in webkitpy via the command line instead of simply calling it as a Python function?
Comment 6 Adam Barth 2010-05-24 10:23:36 PDT
Comment on attachment 56854 [details]
patch

Please address Chris's comments.
Comment 7 Eric Seidel (no email) 2010-05-24 10:38:13 PDT
(In reply to comment #5)
> It's a layering violation for TextFileReader to have to know about patches.  TextFileReader shouldn't have a _patch_reader data attribute.  It also shouldn't have a process_files_listed_in_patch() method.

Why is this a layering violation?  The previous code was horribly contorted.  I'm OK with moving process_files_listed_in_patch down into the caller (check-webkit-style), but the previous design with PatchReader was wrong, PatchReader had no business knowing about TextFileReader as demonstrated by how hard it was to mock the code.

The disadvantage with moving the process_files_listed_in_patch call down into check-webkit-style is that it makes it less testable.

Please explain why you see this as a layering violation.  TextFileReader is about reading files and processing them.  PatchReader teaching it how to read patch files. just like it already implicitly knows how to read directory files.

> Side question: is there a reason we want code in webkitpy to be calling other code in webkitpy via the command line instead of simply calling it as a Python function?

I like that it spits out the same output as it would for someone calling it via the command line.  It would be OK to call it directly, but I simply kept it this way for now.  In general process boundaries are really nice for forcing us to do simple (user-repeatable) actions and make exceptions/error cases easier to handle in a generic way.
Comment 8 Chris Jerdonek 2010-05-24 13:36:51 PDT
(In reply to comment #7)
> (In reply to comment #5)
> > It's a layering violation for TextFileReader to have to know about patches.  TextFileReader shouldn't have a _patch_reader data attribute.  It also shouldn't have a process_files_listed_in_patch() method.
> 
> Why is this a layering violation?

It's a layering violation because TextFileReader was designed to be a generic
class for performing generic operations on files.  Notice that it has no
dependencies on webkitpy other than webkitpy logging.  In particular, it
doesn't have any style-specific code in it.

Your code adds to filereader.py a reference to PatchReader, which in turn
knows about DiffParser from webkitpy.common.checkout.diff_parser.  So now
TextFileReader knows about all this other stuff.

Your code also adds style-specific logic to TextFileReader because it
passes a "line_numbers" parameter to process_file() which only makes sense
in the context of check-webkit-style.

+    def process_files_listed_in_patch(self, patch_string):
+        files_to_lines = self._patch_reader.parse_files_and_line_numbers(patch_string)
+        for path, line_numbers in files_to_lines.items():
+            self.process_file(file_path=path, line_numbers=line_numbers)
+
     def process_file(self, file_path, **kwargs):
         """Process the given file by calling the processor's process() method.

The line_numbers parameter is part of the StyleProcessor class's API:

http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/style/checker.py?rev=60053#L655

So that would add an implicit circular reference (checker.py <-> filereader.py).

TextFileReader's process_file() method accepts **kwargs so it wouldn't have to
know about any style-specific parameters like "line_numbers".

>  The previous code was horribly contorted.  I'm OK with moving process_files_listed_in_patch down into the caller (check-webkit-style), but the previous design with PatchReader was wrong, PatchReader had no business knowing about TextFileReader as demonstrated by how hard it was to mock the code.

It wasn't hard to mock.  The TextFileReader mock only consisted of two one-line methods.

-    class MockTextFileReader(object):
-
-        def __init__(self):
-            self.passed_to_process_file = []
-            """A list of (file_path, line_numbers) pairs."""
-
-        def process_file(self, file_path, line_numbers):
-            self.passed_to_process_file.append((file_path, line_numbers))

Incidentally, I notice that your patch didn't add any unit tests for process_files_listed_in_patch(self, patch_string).

It sounds like maybe you just don't like the fact that PatchReader
is responsible for calling both DiffParser and TextFileReader.  If that's
the case, it seems like you might be able to just break PatchReader up into smaller pieces
or methods, which would be fine.  Originally, PatchReader was really just a higher-level
helper class to make things easier to test and call from check-webkit-style.

Perhaps the confusion was all from the name.  As it is now, PatchReader is more a souped-up
version of TextFileReader for check-webkit-style purposes, rather than something
that simply "reads patches".  If the name had been something more accurate but longer like "TextFileReaderForPatches"
instead of the more innocent-sounding PatchReader, the dependency might have made more sense.