Bug 38197

Summary: check-webkit-style: Files should be ignored unless they have an associated checker
Product: WebKit Reporter: Chris Jerdonek <cjerdonek>
Component: Tools / TestsAssignee: Chris Jerdonek <cjerdonek>
Status: RESOLVED FIXED    
Severity: Normal CC: cjerdonek, commit-queue, hamaji
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
Proposed patch
none
Proposed patch 2
hamaji: review+, hamaji: commit-queue-
Proposed patch 3 none

Description Chris Jerdonek 2010-04-27 06:47:07 PDT
Currently, the logic of which files to skip is spread across two locations: (1) the _SKIPPED_FILES... global variables and associated ProcessorDispatcher.should_skip...() methods, and (2) whether the FileType associated to the file is non-NONE in the ProcessorDispatcher's dispatch_processor() method.

It would be simpler if we combined this logic in the following way:

Instead of explicitly listing out which files to skip in _SKIPPED_FILES_WITHOUT_WARNING, we should skip files by default if they have file type NONE (i.e. if they are not associated with the C++, Python, or text checker, etc).  In this way, we will only need to maintain the white lists of which files to check.  We will not have to maintain a separate black list: the files to skip will follow logically.  In either case, we will still need to store separately which files to skip "with a warning".

This fix will indirectly take care of issues like the following one:

https://bugs.webkit.org/show_bug.cgi?id=38149
Comment 2 Chris Jerdonek 2010-04-27 08:42:22 PDT
Created attachment 54419 [details]
Proposed patch
Comment 3 Chris Jerdonek 2010-04-27 08:43:53 PDT
Created attachment 54420 [details]
Proposed patch 2
Comment 4 Shinichiro Hamaji 2010-04-27 09:18:55 PDT
Comment on attachment 54420 [details]
Proposed patch 2

WebKitTools/Scripts/webkitpy/style/checker.py:176
 +      'txt',
Can we still check ChangeLogs? First I thought the whitelist approach sounds OK when I've seen the bug description. However, I've found there are many extensions ideally we might want to check. Maybe we should add some more extensions: 'y', 'wm', 'xhtml', 'cgi', 'in', 'rb', 'pl', 'cc', 'gyp', 'gypi', 'flex', 'sh', 'pri', 'pro', and... there can be more. Hmm... Now I'm not sure if the whitelist approach is the right way...
Comment 5 Chris Jerdonek 2010-04-27 10:04:03 PDT
(In reply to comment #4)
> (From update of attachment 54420 [details])
> WebKitTools/Scripts/webkitpy/style/checker.py:176
>  +      'txt',
> Can we still check ChangeLogs?

ChangeLogs should still be getting checked because they have FileType.TEXT.  I can add a unit test for this case to be sure.  I did notice, however, that LayoutTests/ChangeLog is not getting checked.  But I believe that problem was already present.  Perhaps we can deal with that special case later.  I can add it as a FIXME.

> First I thought the whitelist approach sounds OK
> when I've seen the bug description. However, I've found there are many
> extensions ideally we might want to check. Maybe we should add some more
> extensions: 'y', 'wm', 'xhtml', 'cgi', 'in', 'rb', 'pl', 'cc', 'gyp', 'gypi',
> 'flex', 'sh', 'pri', 'pro', and... there can be more. Hmm... Now I'm not sure
> if the whitelist approach is the right way...

Yes, a black list is another approach that would be okay as long as we centralize the logic as we are doing in this report.  The black list approach may have the same problem though because we'd still have to list out every possible binary file extension: gif, jpeg, pyc, png, etc.  Given that, I think the white list approach may be better because then we don't risk accidentally processing binary files (e.g. when passing explicit directory paths).  The only down-side is that, yes, we need to explicitly opt text files in.  That could be a good feature though because at least then we can account for all files we are checking.  I would be okay with adding more text extensions in this patch if you provide more.
Comment 6 Shinichiro Hamaji 2010-04-27 10:22:44 PDT
> Yes, a black list is another approach that would be okay as long as we
> centralize the logic as we are doing in this report.  The black list approach
> may have the same problem though because we'd still have to list out every
> possible binary file extension: gif, jpeg, pyc, png, etc.  Given that, I think
> the white list approach may be better because then we don't risk accidentally
> processing binary files (e.g. when passing explicit directory paths).  The only
> down-side is that, yes, we need to explicitly opt text files in.  That could be
> a good feature though because at least then we can account for all files we are
> checking.  I would be okay with adding more text extensions in this patch if
> you provide more.

I see. Thanks for the nice description. I'm totally convinced. Let's go ahead with the whitelist approach. I'd add 'exp' and 'ac' in addition to my previous comment.
Comment 7 Shinichiro Hamaji 2010-04-27 10:45:30 PDT
Comment on attachment 54420 [details]
Proposed patch 2

The code change looks good. I'm putting r+ assuming you'll add extensions I listed.

WebKitTools/Scripts/webkitpy/style/checker.py:674
 +              raise Exception("File should not be checked: '%s'" % file_path)
This shouldn't happen if this function isn't called from unittest, right? If so, how about using assert statement?

WebKitTools/Scripts/webkitpy/style/checker_unittest.py:764
 +              self._processor.process(lines=['line1', 'line2'],
If you use assert statement above. We can just use assertRaises(AssertionError, ...) ?
Comment 8 Chris Jerdonek 2010-04-27 21:59:16 PDT
Created attachment 54521 [details]
Proposed patch 3

Incorporated Shinichiro's suggestions.  Also added more thorough comments and unit tests, and addressed the issue of LayoutTests/ChangeLog getting skipped.
Comment 9 Shinichiro Hamaji 2010-04-28 01:36:25 PDT
Comment on attachment 54521 [details]
Proposed patch 3

Looks great. Thanks for your update!
Comment 10 WebKit Commit Bot 2010-04-28 04:19:50 PDT
Comment on attachment 54521 [details]
Proposed patch 3

Clearing flags on attachment: 54521

Committed r58401: <http://trac.webkit.org/changeset/58401>
Comment 11 WebKit Commit Bot 2010-04-28 04:19:56 PDT
All reviewed patches have been landed.  Closing bug.