Bug 32538 - [check-webkit-style] Add support for TAB check against text files
Summary: [check-webkit-style] Add support for TAB check against text files
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
: 29509 (view as bug list)
Depends on:
Blocks:
 
Reported: 2009-12-14 20:43 PST by Kent Tamura
Modified: 2009-12-16 21:31 PST (History)
6 users (show)

See Also:


Attachments
Proposed patch (5.92 KB, patch)
2009-12-14 20:46 PST, Kent Tamura
levin: review-
Details | Formatted Diff | Diff
Proposed patch (rev.2) (20.02 KB, patch)
2009-12-15 22:53 PST, Kent Tamura
no flags Details | Formatted Diff | Diff
Proposed patch (rev.3) (20.38 KB, patch)
2009-12-16 01:36 PST, Kent Tamura
levin: review+
levin: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Kent Tamura 2009-12-14 20:43:48 PST
I sometimes put TAB characters into ChangeLog or *.js files mistakenly.  It would be good if check-webkit-style detected them.
Comment 1 Kent Tamura 2009-12-14 20:46:31 PST
Created attachment 44837 [details]
Proposed patch

Wow, this is my first python code.
Comment 2 Eric Seidel (no email) 2009-12-14 21:00:27 PST
Tabs are disallowed in all files, unless the files are marked with the svn "allow-tabs" property.  I think this is definitely better than what we have now, but is an incomplete solution. :)
Comment 3 WebKit Review Bot 2009-12-14 21:12:40 PST
style-queue ran check-webkit-style on attachment 44837 [details] without any errors.
Comment 4 Shinichiro Hamaji 2009-12-14 21:23:25 PST
Comment on attachment 44837 [details]
Proposed patch

Though I'm not a reviewer, I'd like to put some comments. First of all, thanks for doing this. Actually, I wanted this feature long time...

As Eric mentioned, we may want to consider allow-tabs property. Maybe we can just use svn proplist for svn repositories, but are there any good ways to see the properties from git repository? If we cannot find a good solution, we may be able start from checking just ChangeLog, where I think most tab mistakes are done.

PEP8 says that we should put 2 blank lines between top level definitions. Though I don't know WebKit is using this style (I bet there're no official rule for python), I think it would be good to keep text_style.py consistent with cpp_style.py.

http://www.python.org/dev/peps/pep-0008/

> +        elif "ChangeLog" in filename or file_extension in ['.html', '.js', '.py', '.txt']:

We may want to add '.pm', '.php', and 'WebKitTools/Scripts/*' ?

> +            line_numbers = set();
> +            for line in diff.lines:
> +                # When deleted line is not set, it means that
> +                # the line is newly added.
> +                if not line[0]:
> +                    line_numbers.add(line[1])
> +
> +            def error_for_patch(filename, line_number, message):
> +                if line_number in line_numbers:
> +                    text_style.error(filename, line_number, message)
> +
> +            text_style.process_file(filename, error=error_for_patch)

We can factor out this code and the similar code for cpp_style? As for the place of initialization of line_numbers, David suggested me that we should delay the initialization so that we don't need to initialize line_numbers if there are no errors. By the way, I think it would be better to set None instead of set() as the initial value of line_numbers.

> +def error_count():
> +    """Returns the global caount of erported errors."""

typo: "global count of reported errors"?

> -- 
> 1.6.3.3

I'm not sure what's this :)
Comment 5 David Levin 2009-12-14 23:00:38 PST
From what I understand TABs are allowed in js files in the test directory. (Although there tends to be a style, the lack of a style (such as allowing TABs) is touted as further testing the js engine.)
Comment 6 Chris Jerdonek 2009-12-15 00:42:00 PST
A couple high-level comments:

(1) The proposed text_style.process_file function has a lot of overlap with cpp_style.process_file.  It might be worth creating a check_webkit_style.py module and moving the cpp_style.process_file to that location.  We could do the branching between file types there.  This will also allow us to keep additional logic out of the check-webkit-style command-line wrapper.  Since the text-file specific logic is minimal, for now you could keep it in check_webkit_style.py. 

(2) In general, I think we should aim to reduce the amount of logic in the check-webkit-style file rather than add more.  We will have better test coverage if we keep check-webkit-style as close to a command-line wrapper as possible.  Note, for example, that the process_patch function which is getting expanded is not currently covered by the tests.
Comment 7 Chris Jerdonek 2009-12-15 10:06:36 PST
(In reply to comment #6)

> It might be worth creating a check_webkit_style.py
> module and moving the cpp_style.process_file to that location.
> ...
> Since the text-file specific logic is minimal, for now you 
> could keep it in check_webkit_style.py.

On second thought, style.py would be a better, simpler name.
Comment 8 David Levin 2009-12-15 11:19:39 PST
Comment on attachment 44837 [details]
Proposed patch

text_style.py should have unit test(s).

r- for the missing unit tests. (Plus there are a number of other comments in this bug that should be considered when doing a second patch.)
Comment 9 Kent Tamura 2009-12-15 22:53:21 PST
Created attachment 44944 [details]
Proposed patch (rev.2)

Thank you for the comments. I followed them, and rewrote the code almost entirely.

I understand we should move more code from cpp_style.py to the common module. But I didn't it for now to avoid behavior change and refactoring at the same time.


> > -- 
> > 1.6.3.3
>
> I'm not sure what's this :)

Please ignore it. It was a footer produced by "git format-patch"
Comment 10 WebKit Review Bot 2009-12-15 22:57:51 PST
style-queue ran check-webkit-style on attachment 44944 [details] without any errors.
Comment 11 Chris Jerdonek 2009-12-15 23:18:44 PST
(In reply to comment #9)
> Created an attachment (id=44944) [details]
> Proposed patch (rev.2)
> 
> Thank you for the comments. I followed them, and rewrote the code almost
> entirely.

Kent, I just added you to the CC for these two reports:

https://bugs.webkit.org/show_bug.cgi?id=32487
https://bugs.webkit.org/show_bug.cgi?id=32592

I should have thought to do that sooner since the first one will affect this patch.
Comment 12 Kent Tamura 2009-12-16 01:36:18 PST
Created attachment 44956 [details]
Proposed patch (rev.3)

* Resolve conflicts with r52189
Comment 13 WebKit Review Bot 2009-12-16 01:41:23 PST
style-queue ran check-webkit-style on attachment 44956 [details] without any errors.
Comment 14 Chris Jerdonek 2009-12-16 09:29:00 PST
(In reply to comment #12)
> Created an attachment (id=44956) [details]
> Proposed patch (rev.3)

Looking a lot better, thanks!

The check-webkit-style doc string should probably be updated:

> """Does WebKit-lint on C/C++ files.

I think the code will be simpler if this just takes a filename:

> +def can_handle(filename, extension):

The calling code won't have to split out the file extension (I count this being done four times in various calling code):

> file_extension = os.path.splitext(filename)[1]

The unit tests will be simpler:

> +        self.assert_(text_style.can_handle('foo.css', '.css'))
> +        self.assert_(text_style.can_handle('foo.html', '.html'))
> +        self.assert_(text_style.can_handle('foo.idl', '.idl'))

Also, the second parameter is redundant data and introduces the possible error case of the given extension not matching the true extension.

> +++ b/WebKitTools/Scripts/modules/style.py
> @@ -0,0 +1,101 @@
> +# Copyright (C) 2009 Google Inc. All rights reserved.
> +#

Does Apple copyright need to be added above and in other new files?  I'm not sure.

> +    It reports erros to the given error function.

errors

> +"""Frontend of some style-checker modules."""

Front end

> +    if cpp_style.can_handle(filename, file_extension) or filename == '-':
> +        cpp_style.process_file(filename)
> +    elif text_style.can_handle(filename, file_extension):
> +        text_style.process_file(filename)

It might be worth adding in the doc string of the *.process_file methods that they still process the file even if passed a filename they can't "handle".  That way future developers don't add parameter checking code at the top of the *.process_file implementations that call can_handle and prematurely exit if they don't.

Some might argue that process_file should prematurely exit.  But for testing purposes, etc, I do think process_file should attempt to process all files as you are currently doing.

> +def process_file_data(filename, lines, error):

> +        if '\t' in line:
> +            error(filename, line_number, 'whitespace/tab', 5, 'Line contains tab character.')

This is nearly identical to the whitespace/tab checking code at the beginning of cpp_style.check_style:

>    if line.find('\t') != -1:
>        error(filename, line_number, 'whitespace/tab', 1,
>              'Tab found; better to use spaces')

It might be good to define a check_whitespace_tab() method in cpp_style and have text_style call that one.

I anticipate that there will be a lot of overlap between the style-checking rules of various file types.  So it would be good if we can begin to establish a pattern of sharing the code for those checks where possible.

> +        # Do not suuprot for filename='-. cpp_style handles it.

Not sure what this is trying to say -- also typo in "support".
Comment 15 Chris Jerdonek 2009-12-16 09:48:31 PST
(In reply to comment #12)
> Created an attachment (id=44956) [details]
> Proposed patch (rev.3)

Last comment:

> +    try:
> +        # Do not suuprot for filename='-. cpp_style handles it.
> +        lines = codecs.open(filename, 'r', 'utf8', 'replace').read().split('\n')
> +    except IOError:
> +        sys.stderr.write("Skipping input '%s': Can't open for reading\n" % filename)
> +        return

This chunk of code is nearly identical to the corresponding chunk of code in cpp_style.process_file (except cpp_style's also has the filename == '-' logic).  For maintainability, it looks like it would be better to refactor so both files use a common "get_lines" function.  Perhaps that new function could accept filename "-", but in text_style.py you could check for "-" before calling get_lines.
Comment 16 David Levin 2009-12-16 11:57:00 PST
Comment on attachment 44956 [details]
Proposed patch (rev.3)

r+ *if* text_style.can_handle is changed to return false for files in the LayoutTests directory.

Also, please consider adding your new unit test files to WebKitTools/Scripts/run-webkit-unittests and fixing the nits below.

Chris made a number of good points that would be nice to address in a future patch.

> diff --git a/WebKitTools/ChangeLog b/WebKitTools/ChangeLog
> +        * Scripts/modules/text_style.py:
> +          Added. This is a liner module for generic text files. It supports

typo: liner


> diff --git a/WebKitTools/Scripts/modules/style.py b/WebKitTools/Scripts/modules/style.py
> +# FIXME: Move more code from cpp_style to here.
> +# check-webkit-style sholud not refer cpp_style directly.

typo: sholud


> +def parse_arguments(args, additional_flags=[], display_help=False):
> +    """Parses the command line arguments.
> +
> +    See cpp_style.parse_arguments() for detail.


s/detail/details/



> diff --git a/WebKitTools/Scripts/modules/text_style.py b/WebKitTools/Scripts/modules/text_style.py
> +"""Does WebKit-lint on text files.
> +
> +This modules shares error count, filter setting, output setting, etc. with cpp_style.

s/modules/module/


> +def process_file_data(filename, lines, error):
> +    """Performs lint check for text on the specified lines.
> +
> +    It reports erros to the given error function.

typo: erros


> +def process_file(filename, error=cpp_style.error):
> +    """Performs lint check for text on a single file."""
> +    try:
> +        # Do not suuprot for filename='-. cpp_style handles it.

typo: suuprot


> +def can_handle(filename, extension):
> +    """Checks if this module supports for the specified file type.

remove "for"

> +    return ("ChangeLog" in filename
> +            or "WebKitTools/Scripts/" in filename
> +            or extension in ('.css', '.html', '.idl', '.js', '.mm', '.php', '.pm', '.py', '.txt'))

Please make this *not handle* files in the LayoutTest directory. TABs, while infrequent, are allowed in js and html files there. (Tabs in the files there ensure that the js and html parsing handle them correctly.)
Comment 17 David Levin 2009-12-16 12:17:23 PST
*** Bug 29509 has been marked as a duplicate of this bug. ***
Comment 18 Kent Tamura 2009-12-16 21:19:39 PST
(In reply to comment #14)
(In reply to comment #15)
(In reply to comment #16)

Thank you for the comments.

* I fixed all of typos.

* As for code sharing with cpp_style.py, I have added FIXME comments for now.

> > +    return ("ChangeLog" in filename
> > +            or "WebKitTools/Scripts/" in filename
> > +            or extension in ('.css', '.html', '.idl', '.js', '.mm', '.php', '.pm', '.py', '.txt'))
> 
> Please make this *not handle* files in the LayoutTest directory. TABs, while
> infrequent, are allowed in js and html files there. (Tabs in the files there
> ensure that the js and html parsing handle them correctly.)

I see. I changed it so that "LayoutTests/*" is excluded.
But I want files in LayoutTests are also checked. I'll investigate whether CSS/HTML/JS parsing tests are put into specific directories or not.

> Also, please consider adding your new unit test files to
> WebKitTools/Scripts/run-webkit-unittests and fixing the nits below.

Added.

I'll commit the change manually.
Comment 19 Kent Tamura 2009-12-16 21:31:47 PST
Landed as r52232
http://trac.webkit.org/changeset/52232