Bug 37122 - check-webkit-style: exits when encountering a deleted file
Summary: check-webkit-style: exits when encountering a deleted file
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Shinichiro Hamaji
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-04-05 16:00 PDT by Chris Jerdonek
Modified: 2010-04-19 22:42 PDT (History)
3 users (show)

See Also:


Attachments
Proposed patch (2.49 KB, patch)
2010-04-05 20:04 PDT, Chris Jerdonek
no flags Details | Formatted Diff | Diff
Ignore file deletions in a patch (4.18 KB, patch)
2010-04-05 21:34 PDT, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
Added a comment and updated the ChangeLog (4.49 KB, patch)
2010-04-05 23:33 PDT, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
Ignore file deletions in a patch 2 (5.16 KB, patch)
2010-04-18 19:29 PDT, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
Ignore file deletions in a patch 3 (5.16 KB, patch)
2010-04-19 01:53 PDT, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Jerdonek 2010-04-05 16:00:25 PDT
See here for an example:

https://bugs.webkit.org/show_bug.cgi?id=35811#c50

I'm not sure of all the circumstances in which this issue can arise.  Ideally, the error would be preserved at least when invoking with the path syntax.  Otherwise, the quickest fix would be to change this sys.exit() to a return statement:

http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitpy/style/checker.py#L680
Comment 1 Chris Jerdonek 2010-04-05 16:03:03 PDT
(In reply to comment #0)
> Otherwise, the quickest fix would be to change this sys.exit() to a return
> statement:

And change the log.error() to log.warn() probably.
Comment 2 Chris Jerdonek 2010-04-05 20:04:17 PDT
Created attachment 52600 [details]
Proposed patch
Comment 3 Shinichiro Hamaji 2010-04-05 20:41:46 PDT
Comment on attachment 52600 [details]
Proposed patch

Though this doesn't mark as r? (I guess Chris just forgot to set the flag), I'd r+ and cq+ this patch as this fixes the regression. Thanks Chris for the quick fix and sorry for my bad review.
Comment 4 Chris Jerdonek 2010-04-05 20:55:02 PDT
(In reply to comment #3)
> (From update of attachment 52600 [details])
> Though this doesn't mark as r? (I guess Chris just forgot to set the flag), I'd
> r+ and cq+ this patch as this fixes the regression. Thanks Chris for the quick
> fix and sorry for my bad review.

Thanks, Shinichiro.  I didn't question the change, so I bear some responsibility, too. :)

By the way, it looks like the _check_directory() method can probably use similar logic.  It's less critical though, so it can wait.
Comment 5 Shinichiro Hamaji 2010-04-05 20:58:38 PDT
Comment on attachment 52600 [details]
Proposed patch

I'd land this manually
Comment 6 Shinichiro Hamaji 2010-04-05 20:59:31 PDT
Committed r57119: <http://trac.webkit.org/changeset/57119>
Comment 7 Shinichiro Hamaji 2010-04-05 21:12:24 PDT
Reopening this bug because I agree it's better to terminate the program if a user specifies paths explicitly. I think we should not call check_file when a file deletion is detected in check_patch.
Comment 8 Chris Jerdonek 2010-04-05 21:21:46 PDT
(In reply to comment #7)
> I think we should not call check_file when a file deletion is detected in check_patch.

Yes, I agree.  I was thinking this, too.

We should also double-check what happens if check_patch() encounters a directory path in the patch (e.g. a removed or added directory).  I'm not sure if this can happen, but depending on the behavior, we may want to have check_file() also check that the path is a file rather than a directory (or at least have check_patch() check that the path is a file before calling check_file()).
Comment 9 Shinichiro Hamaji 2010-04-05 21:34:56 PDT
Created attachment 52603 [details]
Ignore file deletions in a patch
Comment 10 Chris Jerdonek 2010-04-05 22:13:15 PDT
(In reply to comment #9)
> Created an attachment (id=52603) [details]
> Ignore file deletions in a patch

-            check_file(file_path=file_path, line_numbers=line_numbers)
+            if line_numbers:
+                check_file(file_path=file_path, line_numbers=line_numbers)

Before landing, do you think you could add a comment explaining the reasons
for the if clause and the cases in which it can come into play?

Thanks for fixing this the right way!
Comment 11 Shinichiro Hamaji 2010-04-05 23:33:00 PDT
Created attachment 52611 [details]
Added a comment and updated the ChangeLog
Comment 12 Shinichiro Hamaji 2010-04-05 23:36:27 PDT
CCing David Levin to ask a review.

> Before landing, do you think you could add a comment explaining the reasons
> for the if clause and the cases in which it can come into play?

Yeah, thanks for your comment. I've updated my patch.
Comment 13 Chris Jerdonek 2010-04-06 06:25:15 PDT
(In reply to comment #11)
> Created an attachment (id=52611) [details]
> Added a comment and updated the ChangeLog

Thanks for adding a comment.

-            check_file(file_path=file_path, line_numbers=line_numbers)
+            # If line_numbers is empty, file_path was deleted.
+            # In this case, we should not call check_file because it
+            # terminates the program for a nonexistent file.
+            if line_numbers:
+                check_file(file_path=file_path, line_numbers=line_numbers)

Not a big deal, but it's probably better to say there that the patch has only line deletions for the file, which includes the case of the entire file being deleted.
Comment 14 Chris Jerdonek 2010-04-06 06:54:54 PDT
(In reply to comment #13)
> (In reply to comment #11)
> > Created an attachment (id=52611) [details] [details]
> > Added a comment and updated the ChangeLog
> 
> Thanks for adding a comment.
> 
> -            check_file(file_path=file_path, line_numbers=line_numbers)
> +            # If line_numbers is empty, file_path was deleted.
> +            # In this case, we should not call check_file because it
> +            # terminates the program for a nonexistent file.
> +            if line_numbers:
> +                check_file(file_path=file_path, line_numbers=line_numbers)
> 
> Not a big deal, but it's probably better to say there that the patch has only
> line deletions for the file, which includes the case of the entire file being
> deleted.

Actually, thinking about this caused me to notice some subtle issues I hadn't noticed before.  Because check-webkit-style only reports errors occurring exactly on added lines, it can miss certain style issues caused by the contributor.  Here is one example I constructed:

> --- a/WebKitTools/Scripts/webkitpy/style/error_handlers.py
> +++ b/WebKitTools/Scripts/webkitpy/style/error_handlers.py
> @@ -51,7 +51,6 @@ Methods:
>  
>  import sys
>  
> -
>  class DefaultStyleErrorHandler(object):
>  
>      """The default style error handler."""

In this case we would want the following error to be reported:

WebKitTools/Scripts/webkitpy/style/error_handlers.py:54:  expected 2 blank lines, found 1  [pep8/E302] [5]

Similarly, this error won't get reported:

> --- a/WebKitTools/Scripts/webkitpy/style/error_handlers.py
> +++ b/WebKitTools/Scripts/webkitpy/style/error_handlers.py
> @@ -98,6 +98,7 @@ class DefaultStyleErrorHandler(object):
>  
>          return self._category_totals[category]
>  
> +
>      def _max_reports(self, category):
>          """Return the maximum number of errors to report."""
>          if not category in self._configuration.max_reports_per_category:

Here we would want this to be reported:

WebKitTools/Scripts/webkitpy/style/error_handlers.py:102:  too many blank lines (2)  [pep8/E303] [5]

So it might be better if the algorithm also reports errors in the line (or two?) immediately after an addition or deletion.
Comment 15 Chris Jerdonek 2010-04-06 07:11:14 PDT
(In reply to comment #14)
> So it might be better if the algorithm also reports errors in the line (or
> two?) immediately after an addition or deletion.

Here is a constructed example of an error caused by a contributor happening two lines after the added line:

> --- a/WebKitTools/Scripts/webkitpy/style/error_handlers.py
> +++ b/WebKitTools/Scripts/webkitpy/style/error_handlers.py
> @@ -50,7 +50,7 @@ Methods:
>  
>  
>  import sys
> -
> +import webkitpy
>  
>  class DefaultStyleErrorHandler(object):

(Sorry for cluttering this report with comments about something not directly related to the original subject.)
Comment 16 Chris Jerdonek 2010-04-06 19:48:09 PDT
(In reply to comment #15)
> (In reply to comment #14)
> > So it might be better if the algorithm also reports errors in the line (or
> > two?) immediately after an addition or deletion.
> ... 
> (Sorry for cluttering this report with comments about something not directly
> related to the original subject.)

Filed a bug for this here:

https://bugs.webkit.org/show_bug.cgi?id=37185
Comment 17 Chris Jerdonek 2010-04-14 19:16:10 PDT
(In reply to comment #11)
> Created an attachment (id=52611) [details]

Thanks for making this better!  Since this patch has languished, I'll take a crack at "reviewing" it in the hopes that a real reviewer can act on it more quickly.  It's good on the whole though I see some potential improvements described below.

diff --git a/WebKitTools/Scripts/webkitpy/style/checker.py b/WebKitTools/Scripts/webkitpy/style/checker.py
index 258a0b3..df21a0f 100644
--- a/WebKitTools/Scripts/webkitpy/style/checker.py
+++ b/WebKitTools/Scripts/webkitpy/style/checker.py
@@ -678,8 +678,8 @@ class StyleChecker(object):
                         mock_process_file)
 
         if not os_path_exists(file_path):
-            _log.warn("Skipping non-existent file: %s" % file_path)
-            return
+            _log.error("File does not exist: %s" % file_path)
+            sys.exit(1)
 
I would consider adding a "Raises:\n  SystemExit" section to check_file()'s docstring since with this change the caller becomes responsible for verifying the file's existence.

I would also consider moving the logic above to an appropriate "except" block inside the read_lines() method.  This way the "if" block will only execute when an exception occurs and not during normal code flow.  Otherwise, it seems that in some cases the above or a similar check will need to execute twice during normal code flow -- both in the calling code when the caller is checking and above at the beginning of the method itself.

@@ -734,4 +734,8 @@ class StyleChecker(object):
             _log.debug('Found %s modified lines in patch for: %s'
                        % (len(line_numbers), file_path))
 
-            check_file(file_path=file_path, line_numbers=line_numbers)
+            # If line_numbers is empty, file_path was deleted.
+            # In this case, we should not call check_file because it
+            # terminates the program for a nonexistent file.
+            if line_numbers:
+                check_file(file_path=file_path, line_numbers=line_numbers)

In the comment, I would clarify that line_numbers False means that the file has no new (or modified) lines.  In particular, a deleted file is a special case rather than the complete case.  For this reason, the "if" block is also an optimization and not there just to cover the case of a deleted file.  Maybe the comment can mention that.

Do we know whether parse_patch() can yield paths corresponding to directories and not just files (e.g. in cases where the patch creates or deletes a directory)?  Perhaps that possibility should also be mentioned in the comments as a case covered by the if block.

diff --git a/WebKitTools/Scripts/webkitpy/style/checker_unittest.py b/WebKitTools/Scripts/webkitpy/style/checker_unittest.py
 
+    _deletion_patch_string = """Index: __init__.py
+===================================================================
+--- __init__.py  (revision 3593)
++++ __init__.py  (working copy)
+@@ -1 +0,0 @@
+-foobar
+"""
+
     def setUp(self):
         StyleCheckerCheckFileBase.setUp(self)
         self._got_file_path = None
@@ -738,19 +745,24 @@ index ef65bee..e3db70e 100644
         self._got_file_path = file_path
         self._got_line_numbers = line_numbers
 
-    def test_check_patch(self):
-        patch_files = parse_patch(self._patch_string)
-        diff = patch_files[self._file_path]
-
+    def assert_check_patch(self, patch_string,
+                           expected_got_file_path, expected_got_line_numbers):
         configuration = self._style_checker_configuration()
 
         style_checker = StyleChecker(configuration)
 
-        style_checker.check_patch(patch_string=self._patch_string,
+        style_checker.check_patch(patch_string=patch_string,
                                   mock_check_file=self._mock_check_file)
 
-        self.assertEquals(self._got_file_path, "__init__.py")
-        self.assertEquals(self._got_line_numbers, set([2]))
+        self.assertEquals(self._got_file_path, expected_got_file_path)
+        self.assertEquals(self._got_line_numbers, expected_got_line_numbers)
+
+    def test_check_patch(self):
+        self.assert_check_patch(self._patch_string, "__init__.py", set([2]))
+
+    def test_check_patch_with_deletion(self):
+        # _mock_check_file should not be called for the deletion patch.
+        self.assert_check_patch(self._deletion_patch_string, None, None)
 
I believe the code above will simplify somewhat after rebasing because of the changes in this patch:

https://bugs.webkit.org/show_bug.cgi?id=37065

I would recommend that a reviewer r- this patch to consider the suggestions above and also so that the patch can be seen after rebasing.  Thanks!
Comment 18 David Levin 2010-04-16 15:14:49 PDT
Comment on attachment 52611 [details]
Added a comment and updated the ChangeLog

Please consider Chris's comments (which by far more thorough/informed than I would have been able to do on this).
Comment 19 Shinichiro Hamaji 2010-04-18 19:29:56 PDT
Created attachment 53649 [details]
Ignore file deletions in a patch 2
Comment 20 Shinichiro Hamaji 2010-04-18 19:37:43 PDT
Thanks Chris for your thorough review! I think I addressed all comments except the following two comments:

> I would also consider moving the logic above to an appropriate "except" block
> inside the read_lines() method.  This way the "if" block will only execute when
> an exception occurs and not during normal code flow.  Otherwise, it seems that
> in some cases the above or a similar check will need to execute twice during
> normal code flow -- both in the calling code when the caller is checking and
> above at the beginning of the method itself.

I think we want to check the existence of the file here. Otherwise, we cannot detect an error for files which are ignored because of dispatcher.should_skip_(with|without)_warning. For example, we may want to terminate the program when a user specifies

% check-webkit-style does_not_exist.png

What do you think?

> Do we know whether parse_patch() can yield paths corresponding to directories
> and not just files (e.g. in cases where the patch creates or deletes a
> directory)?  Perhaps that possibility should also be mentioned in the comments
> as a case covered by the if block.

I'm not sure, but I think (svn|git) diff output nothing for directory creation/deletion?
Comment 21 Chris Jerdonek 2010-04-18 20:31:52 PDT
(In reply to comment #20)
> I think we want to check the existence of the file here. Otherwise, we cannot
> detect an error for files which are ignored because of
> dispatcher.should_skip_(with|without)_warning. For example, we may want to
> terminate the program when a user specifies
> 
> % check-webkit-style does_not_exist.png
> 
> What do you think?

Yes, I also debated this issue when putting together the patch for--

https://bugs.webkit.org/show_bug.cgi?id=37754

which you r+'ed, by the way. :)

My thought process in the end is that check-webkit-style isn't meant to be a file-verification system, so it's not that bad if it doesn't exit when checking "does_not_exist.png".  If the file really did exist, it would be skipped anyways and the result would be the same, so it doesn't seem like there is any harm in the behavior.

A bigger issue is that calling os.path.exists() isn't sufficient to say that the file is okay, so it's a false assurance in any case.  For example, why aren't we also calling "not os.path.isdir()" to verify that the file is a file rather than a directory?  It seems that the only surefire way to check may be to attempt to open the file for reading, but that seems excessive since you'd be doing it twice.  So in the end, I thought it would be better to take the laziest approach and let check-webkit-style attempt to check style as it normally does.
 
(By the way, in my original comment I suggested putting the logic in a try-block in read_lines(), but there is already a try-block around the code calling read_lines which is probably the more appropriate spot.)

In any case, whatever you did there should be fine for now.  We can revisit the issue in the patch to activate the TextFileReader code of bug 37754.
Comment 22 Shinichiro Hamaji 2010-04-18 20:55:18 PDT
> My thought process in the end is that check-webkit-style isn't meant to be a
> file-verification system, so it's not that bad if it doesn't exit when checking
> "does_not_exist.png".  If the file really did exist, it would be skipped
> anyways and the result would be the same, so it doesn't seem like there is any
> harm in the behavior.

Maybe "WebCore/rendering/RenderTheme.cp" (this lacks a trailing "p") would be a better example than "does_not_exist.png". I think this case can be a bit more harmful?
Comment 23 Chris Jerdonek 2010-04-18 21:15:30 PDT
(In reply to comment #22)
> Maybe "WebCore/rendering/RenderTheme.cp" (this lacks a trailing "p") would be a
> better example than "does_not_exist.png". I think this case can be a bit more
> harmful?

Good example.  I think you convinced me for now. :)
Comment 24 Chris Jerdonek 2010-04-18 21:45:10 PDT
(In reply to comment #19)
> Created an attachment (id=53649) [details]

Thanks for addressing my comments and also for covering the "-" case.

This looks good to me.  I would r+ it if I could.

I would just fix this PEP8 nit prior to landing (two spaces after a period).

> +            # modified lines. In this case, we don't check the file
Comment 25 Shinichiro Hamaji 2010-04-19 01:53:30 PDT
Created attachment 53660 [details]
Ignore file deletions in a patch 3
Comment 26 Shinichiro Hamaji 2010-04-19 01:54:51 PDT
> I would just fix this PEP8 nit prior to landing (two spaces after a period).

Thanks. Done in the patch 3.
Comment 27 Shinichiro Hamaji 2010-04-19 22:42:06 PDT
Comment on attachment 53660 [details]
Ignore file deletions in a patch 3

Clearing flags on attachment: 53660

Committed r57870: <http://trac.webkit.org/changeset/57870>
Comment 28 Shinichiro Hamaji 2010-04-19 22:42:16 PDT
All reviewed patches have been landed.  Closing bug.