Bug 32538 - [check-webkit-style] Add support for TAB check against text files
: [check-webkit-style] Add support for TAB check against text files
Status: RESOLVED FIXED
: WebKit
Tools / Tests
: 528+ (Nightly build)
: All All
: P2 Normal
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2009-12-14 20:43 PST by
Modified: 2009-12-16 21:31 PST (History)


Attachments
Proposed patch (5.92 KB, patch)
2009-12-14 20:46 PST, Kent Tamura
levin: review-
Review Patch | Details | Formatted Diff | Diff
Proposed patch (rev.2) (20.02 KB, patch)
2009-12-15 22:53 PST, Kent Tamura
no flags Review Patch | Details | Formatted Diff | Diff
Proposed patch (rev.3) (20.38 KB, patch)
2009-12-16 01:36 PST, Kent Tamura
levin: review+
levin: commit‑queue-
Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2009-12-14 20:43:48 PST
I sometimes put TAB characters into ChangeLog or *.js files mistakenly.  It would be good if check-webkit-style detected them.
------- Comment #1 From 2009-12-14 20:46:31 PST -------
Created an attachment (id=44837) [details]
Proposed patch

Wow, this is my first python code.
------- Comment #2 From 2009-12-14 21:00:27 PST -------
Tabs are disallowed in all files, unless the files are marked with the svn "allow-tabs" property.  I think this is definitely better than what we have now, but is an incomplete solution. :)
------- Comment #3 From 2009-12-14 21:12:40 PST -------
style-queue ran check-webkit-style on attachment 44837 [details] without any errors.
------- Comment #4 From 2009-12-14 21:23:25 PST -------
(From update of attachment 44837 [details])
Though I'm not a reviewer, I'd like to put some comments. First of all, thanks for doing this. Actually, I wanted this feature long time...

As Eric mentioned, we may want to consider allow-tabs property. Maybe we can just use svn proplist for svn repositories, but are there any good ways to see the properties from git repository? If we cannot find a good solution, we may be able start from checking just ChangeLog, where I think most tab mistakes are done.

PEP8 says that we should put 2 blank lines between top level definitions. Though I don't know WebKit is using this style (I bet there're no official rule for python), I think it would be good to keep text_style.py consistent with cpp_style.py.

http://www.python.org/dev/peps/pep-0008/

> +        elif "ChangeLog" in filename or file_extension in ['.html', '.js', '.py', '.txt']:

We may want to add '.pm', '.php', and 'WebKitTools/Scripts/*' ?

> +            line_numbers = set();
> +            for line in diff.lines:
> +                # When deleted line is not set, it means that
> +                # the line is newly added.
> +                if not line[0]:
> +                    line_numbers.add(line[1])
> +
> +            def error_for_patch(filename, line_number, message):
> +                if line_number in line_numbers:
> +                    text_style.error(filename, line_number, message)
> +
> +            text_style.process_file(filename, error=error_for_patch)

We can factor out this code and the similar code for cpp_style? As for the place of initialization of line_numbers, David suggested me that we should delay the initialization so that we don't need to initialize line_numbers if there are no errors. By the way, I think it would be better to set None instead of set() as the initial value of line_numbers.

> +def error_count():
> +    """Returns the global caount of erported errors."""

typo: "global count of reported errors"?

> -- 
> 1.6.3.3

I'm not sure what's this :)
------- Comment #5 From 2009-12-14 23:00:38 PST -------
From what I understand TABs are allowed in js files in the test directory. (Although there tends to be a style, the lack of a style (such as allowing TABs) is touted as further testing the js engine.)
------- Comment #6 From 2009-12-15 00:42:00 PST -------
A couple high-level comments:

(1) The proposed text_style.process_file function has a lot of overlap with cpp_style.process_file.  It might be worth creating a check_webkit_style.py module and moving the cpp_style.process_file to that location.  We could do the branching between file types there.  This will also allow us to keep additional logic out of the check-webkit-style command-line wrapper.  Since the text-file specific logic is minimal, for now you could keep it in check_webkit_style.py. 

(2) In general, I think we should aim to reduce the amount of logic in the check-webkit-style file rather than add more.  We will have better test coverage if we keep check-webkit-style as close to a command-line wrapper as possible.  Note, for example, that the process_patch function which is getting expanded is not currently covered by the tests.
------- Comment #7 From 2009-12-15 10:06:36 PST -------
(In reply to comment #6)

> It might be worth creating a check_webkit_style.py
> module and moving the cpp_style.process_file to that location.
> ...
> Since the text-file specific logic is minimal, for now you 
> could keep it in check_webkit_style.py.

On second thought, style.py would be a better, simpler name.
------- Comment #8 From 2009-12-15 11:19:39 PST -------
(From update of attachment 44837 [details])
text_style.py should have unit test(s).

r- for the missing unit tests. (Plus there are a number of other comments in this bug that should be considered when doing a second patch.)
------- Comment #9 From 2009-12-15 22:53:21 PST -------
Created an attachment (id=44944) [details]
Proposed patch (rev.2)

Thank you for the comments. I followed them, and rewrote the code almost entirely.

I understand we should move more code from cpp_style.py to the common module. But I didn't it for now to avoid behavior change and refactoring at the same time.


> > -- 
> > 1.6.3.3
>
> I'm not sure what's this :)

Please ignore it. It was a footer produced by "git format-patch"
------- Comment #10 From 2009-12-15 22:57:51 PST -------
style-queue ran check-webkit-style on attachment 44944 [details] without any errors.
------- Comment #11 From 2009-12-15 23:18:44 PST -------
(In reply to comment #9)
> Created an attachment (id=44944) [details] [details]
> Proposed patch (rev.2)
> 
> Thank you for the comments. I followed them, and rewrote the code almost
> entirely.

Kent, I just added you to the CC for these two reports:

https://bugs.webkit.org/show_bug.cgi?id=32487
https://bugs.webkit.org/show_bug.cgi?id=32592

I should have thought to do that sooner since the first one will affect this patch.
------- Comment #12 From 2009-12-16 01:36:18 PST -------
Created an attachment (id=44956) [details]
Proposed patch (rev.3)

* Resolve conflicts with r52189
------- Comment #13 From 2009-12-16 01:41:23 PST -------
style-queue ran check-webkit-style on attachment 44956 [details] without any errors.
------- Comment #14 From 2009-12-16 09:29:00 PST -------
(In reply to comment #12)
> Created an attachment (id=44956) [details] [details]
> Proposed patch (rev.3)

Looking a lot better, thanks!

The check-webkit-style doc string should probably be updated:

> """Does WebKit-lint on C/C++ files.

I think the code will be simpler if this just takes a filename:

> +def can_handle(filename, extension):

The calling code won't have to split out the file extension (I count this being done four times in various calling code):

> file_extension = os.path.splitext(filename)[1]

The unit tests will be simpler:

> +        self.assert_(text_style.can_handle('foo.css', '.css'))
> +        self.assert_(text_style.can_handle('foo.html', '.html'))
> +        self.assert_(text_style.can_handle('foo.idl', '.idl'))

Also, the second parameter is redundant data and introduces the possible error case of the given extension not matching the true extension.

> +++ b/WebKitTools/Scripts/modules/style.py
> @@ -0,0 +1,101 @@
> +# Copyright (C) 2009 Google Inc. All rights reserved.
> +#

Does Apple copyright need to be added above and in other new files?  I'm not sure.

> +    It reports erros to the given error function.

errors

> +"""Frontend of some style-checker modules."""

Front end

> +    if cpp_style.can_handle(filename, file_extension) or filename == '-':
> +        cpp_style.process_file(filename)
> +    elif text_style.can_handle(filename, file_extension):
> +        text_style.process_file(filename)

It might be worth adding in the doc string of the *.process_file methods that they still process the file even if passed a filename they can't "handle".  That way future developers don't add parameter checking code at the top of the *.process_file implementations that call can_handle and prematurely exit if they don't.

Some might argue that process_file should prematurely exit.  But for testing purposes, etc, I do think process_file should attempt to process all files as you are currently doing.

> +def process_file_data(filename, lines, error):

> +        if '\t' in line:
> +            error(filename, line_number, 'whitespace/tab', 5, 'Line contains tab character.')

This is nearly identical to the whitespace/tab checking code at the beginning of cpp_style.check_style:

>    if line.find('\t') != -1:
>        error(filename, line_number, 'whitespace/tab', 1,
>              'Tab found; better to use spaces')

It might be good to define a check_whitespace_tab() method in cpp_style and have text_style call that one.

I anticipate that there will be a lot of overlap between the style-checking rules of various file types.  So it would be good if we can begin to establish a pattern of sharing the code for those checks where possible.

> +        # Do not suuprot for filename='-. cpp_style handles it.

Not sure what this is trying to say -- also typo in "support".
------- Comment #15 From 2009-12-16 09:48:31 PST -------
(In reply to comment #12)
> Created an attachment (id=44956) [details] [details]
> Proposed patch (rev.3)

Last comment:

> +    try:
> +        # Do not suuprot for filename='-. cpp_style handles it.
> +        lines = codecs.open(filename, 'r', 'utf8', 'replace').read().split('\n')
> +    except IOError:
> +        sys.stderr.write("Skipping input '%s': Can't open for reading\n" % filename)
> +        return

This chunk of code is nearly identical to the corresponding chunk of code in cpp_style.process_file (except cpp_style's also has the filename == '-' logic).  For maintainability, it looks like it would be better to refactor so both files use a common "get_lines" function.  Perhaps that new function could accept filename "-", but in text_style.py you could check for "-" before calling get_lines.
------- Comment #16 From 2009-12-16 11:57:00 PST -------
(From update of attachment 44956 [details])
r+ *if* text_style.can_handle is changed to return false for files in the LayoutTests directory.

Also, please consider adding your new unit test files to WebKitTools/Scripts/run-webkit-unittests and fixing the nits below.

Chris made a number of good points that would be nice to address in a future patch.

> diff --git a/WebKitTools/ChangeLog b/WebKitTools/ChangeLog
> +        * Scripts/modules/text_style.py:
> +          Added. This is a liner module for generic text files. It supports

typo: liner


> diff --git a/WebKitTools/Scripts/modules/style.py b/WebKitTools/Scripts/modules/style.py
> +# FIXME: Move more code from cpp_style to here.
> +# check-webkit-style sholud not refer cpp_style directly.

typo: sholud


> +def parse_arguments(args, additional_flags=[], display_help=False):
> +    """Parses the command line arguments.
> +
> +    See cpp_style.parse_arguments() for detail.


s/detail/details/



> diff --git a/WebKitTools/Scripts/modules/text_style.py b/WebKitTools/Scripts/modules/text_style.py
> +"""Does WebKit-lint on text files.
> +
> +This modules shares error count, filter setting, output setting, etc. with cpp_style.

s/modules/module/


> +def process_file_data(filename, lines, error):
> +    """Performs lint check for text on the specified lines.
> +
> +    It reports erros to the given error function.

typo: erros


> +def process_file(filename, error=cpp_style.error):
> +    """Performs lint check for text on a single file."""
> +    try:
> +        # Do not suuprot for filename='-. cpp_style handles it.

typo: suuprot


> +def can_handle(filename, extension):
> +    """Checks if this module supports for the specified file type.

remove "for"

> +    return ("ChangeLog" in filename
> +            or "WebKitTools/Scripts/" in filename
> +            or extension in ('.css', '.html', '.idl', '.js', '.mm', '.php', '.pm', '.py', '.txt'))

Please make this *not handle* files in the LayoutTest directory. TABs, while infrequent, are allowed in js and html files there. (Tabs in the files there ensure that the js and html parsing handle them correctly.)
------- Comment #17 From 2009-12-16 12:17:23 PST -------
*** Bug 29509 has been marked as a duplicate of this bug. ***
------- Comment #18 From 2009-12-16 21:19:39 PST -------
(In reply to comment #14)
(In reply to comment #15)
(In reply to comment #16)

Thank you for the comments.

* I fixed all of typos.

* As for code sharing with cpp_style.py, I have added FIXME comments for now.

> > +    return ("ChangeLog" in filename
> > +            or "WebKitTools/Scripts/" in filename
> > +            or extension in ('.css', '.html', '.idl', '.js', '.mm', '.php', '.pm', '.py', '.txt'))
> 
> Please make this *not handle* files in the LayoutTest directory. TABs, while
> infrequent, are allowed in js and html files there. (Tabs in the files there
> ensure that the js and html parsing handle them correctly.)

I see. I changed it so that "LayoutTests/*" is excluded.
But I want files in LayoutTests are also checked. I'll investigate whether CSS/HTML/JS parsing tests are put into specific directories or not.

> Also, please consider adding your new unit test files to
> WebKitTools/Scripts/run-webkit-unittests and fixing the nits below.

Added.

I'll commit the change manually.
------- Comment #19 From 2009-12-16 21:31:47 PST -------
Landed as r52232
http://trac.webkit.org/changeset/52232