Bug 27291 - Support lint for patches
Summary: Support lint for patches
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-07-15 01:49 PDT by Shinichiro Hamaji
Modified: 2009-07-21 22:47 PDT (History)
1 user (show)

See Also:


Attachments
Patch v1 (19.82 KB, patch)
2009-07-15 01:53 PDT, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
Patch v1 (8.40 KB, patch)
2009-07-20 23:38 PDT, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
Patch v3 (7.53 KB, patch)
2009-07-21 01:53 PDT, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
Patch v4 (7.23 KB, patch)
2009-07-21 03:19 PDT, Shinichiro Hamaji
levin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Shinichiro Hamaji 2009-07-15 01:49:36 PDT
The lint tool added in Bug 25884 doesn't understand patch files.
Comment 1 Shinichiro Hamaji 2009-07-15 01:53:08 PDT
Created attachment 32771 [details]
Patch v1


---
 3 files changed, 414 insertions(+), 3 deletions(-)
Comment 2 David Levin 2009-07-15 07:56:07 PDT
Comment on attachment 32771 [details]
Patch v1



> diff --git a/WebKitTools/Scripts/cpplint.py b/WebKitTools/Scripts/cpplint.py

> +# From http://code.google.com/p/rietveld/source/browse/branches/chromium/codereview/engine.py
This code may not be used in WebKit because it has an incompatible license: "Licensed under the Apache License"

By the way, this is the typical license for Google code. (cpplint.py was relicense due to Albert Wong asking the author so that it could be used in WebKit.)

So there are two options:
1. See if the code can be relicensed (or dual licensed) (which may not be easy to do if people outside of Google contributed) OR
2. Write something equivalent from scratch.

> +# From http://code.google.com/p/rietveld/source/browse/branches/chromium/codereview/patching.py

Same license issue.



*** General comment ***
For the rest of this, it feels like the use case should be someone who has the patch in their scm. The user would be to run the lint tool on the patch that would be created.

With that in mind could this change would do the following: 
1. Getting the current diff from the scm (use "create_patch" in scm.py.  A subsequent change can add support to call create_patch_from_local_commit for git users.)
2. Run cpplint on the files in that diff.
3. Filter the output from cpplint to only include the lines that changed in that diff.

Also, this logic could reside outside of cpplint.py, there could be a run-webkit-lint that did most of this and then called cpplint to do the linting of cpp/h files (with what lines to filter the results to).

I like this approach because you'll get less false lint warnings from running it on the whole file and it feels like it fits the usage model.  (If I want to lint someone else's patch, I could always patch it into my local system and then run the lint tool on it.) 

As you do this work, don't feel like you need a fully completely functional thing to submit a patch.  Just take logical chunks that are complete by themselves but may not provide all the functionality and submit patches as you go along so that the patches stay small which will make the review process faster.

For example one patch, may be just to move cpplint.py (and its unit test) into Scripts/modules since we'll probably call it from "run-webkit-lint" (which would be the harness you'll write in these patches).
Comment 3 Shinichiro Hamaji 2009-07-15 09:00:46 PDT
Thanks for the comments, I'll change the order of your comments.

> *** General comment ***
> For the rest of this, it feels like the use case should be someone who has the
> patch in their scm. The user would be to run the lint tool on the patch that
> would be created.
> 
> With that in mind could this change would do the following: 
> 1. Getting the current diff from the scm (use "create_patch" in scm.py.  A
> subsequent change can add support to call create_patch_from_local_commit for
> git users.)
> 2. Run cpplint on the files in that diff.
> 3. Filter the output from cpplint to only include the lines that changed in
> that diff.
> 
> Also, this logic could reside outside of cpplint.py, there could be a
> run-webkit-lint that did most of this and then called cpplint to do the linting
> of cpp/h files (with what lines to filter the results to).
> 
> I like this approach because you'll get less false lint warnings from running
> it on the whole file and it feels like it fits the usage model.  (If I want to
> lint someone else's patch, I could always patch it into my local system and
> then run the lint tool on it.) 

I see. I considered we may eventually want to integrate the linter into bugzilla so that submitted patch will be automatically linted and reviewees may see style issues even if they forget to apply the linter. Anyway, it's reasonable to implement scm integration first.

> This code may not be used in WebKit because it has an incompatible license:
> "Licensed under the Apache License"
> 
> By the way, this is the typical license for Google code. (cpplint.py was
> relicense due to Albert Wong asking the author so that it could be used in
> WebKit.)
> 
> So there are two options:
> 1. See if the code can be relicensed (or dual licensed) (which may not be easy
> to do if people outside of Google contributed) OR
> 2. Write something equivalent from scratch.

The approach you suggested would be easier than the original way (we don't need to have equivalent of parse_patch_to_chunks, which is the biggest function in my patch). So, it may be not so difficult to re-invent other parts. Do you think it's reasonable to re-implement them? Or, should we ask the authors of rietveld about the license first?

> As you do this work, don't feel like you need a fully completely functional
> thing to submit a patch.  Just take logical chunks that are complete by
> themselves but may not provide all the functionality and submit patches as you
> go along so that the patches stay small which will make the review process
> faster.

OK, I'll split my patch into small pieces. I know big patches are not good, but I thought it would be OK in this case as this patch was mostly copy&paste and unittest.

> For example one patch, may be just to move cpplint.py (and its unit test) into
> Scripts/modules since we'll probably call it from "run-webkit-lint" (which
> would be the harness you'll write in these patches).

Will do.
Comment 4 Shinichiro Hamaji 2009-07-20 23:38:14 PDT
Created attachment 33148 [details]
Patch v1


---
 3 files changed, 159 insertions(+), 11 deletions(-)
Comment 5 Shinichiro Hamaji 2009-07-20 23:57:15 PDT
Comment on attachment 33148 [details]
Patch v1

> Patch v1

Actually, this is v2 :(

Hmm... the usage of current patch is

% run-webkit-lint [options] [commitish]...

with this syntax, we cannot add support of linting files. Maybe the following syntax would be better?

% run-webkit-lint [--git-commit=<commitish>] [options] [file]...

For now, to keep the patch small, I'll re-upload a patch with removing support of linting specified commit.
Comment 6 David Levin 2009-07-21 00:56:54 PDT
Comment on attachment 33148 [details]
Patch v1

I saw that you unmarked this for review but I suspect these comments apply to what you are about to submit.  btw, I agree completely with making the git commit argument --git-commit=<committish> and was about to suggest it simply because that is consistent with prepare-ChangeLog.


> diff --git a/WebKitTools/Scripts/modules/cpplint.py b/WebKitTools/Scripts/modules/cpplint.py

The changes done here look good but I think you missed the call in def main() to process_file(filename, _cpplint_state.verbose_level)
It should be easy enough to set the verbose level and omit the second parameter.


> diff --git a/WebKitTools/Scripts/run-webkit-lint b/WebKitTools/Scripts/run-webkit-lint

> +# Override the usage of the lint tool.
> +cpplint._USAGE = """
> +Syntax: webkit-run-lint [--verbose=#] [--output=vs7] [--filter=-x,+y,...]
> +        [COMMITISH] ...
> +
> +  The style guidelines this tries to follow are those in
> +    http://webkit.org/coding/coding-style.html
> +
> +  Every problem is given a confidence score from 1-5, with 5 meaning we are
> +  certain of the problem, and 1 meaning it could be a legitimate construct.
> +  This will miss some errors, and is not a substitute for a code review.
> +
> +  To prevent specific lines from being linted, add a '// NOLINT' comment to the
> +  end of the line.
> +
> +  If you don't specify COMMITISH, the current change on your repository will
> +  be linted.  Otherwise, the specified commits will be linted based on your
> +  local files.  Note that if you specify old commits which are modified in
> +  newer patches, the line number of the lint results may be wrong.
> +  Linted extensions are .cc, .cpp, and .h.  Other file types will be ignored.
> +
> +  Flags:
> +
> +    output=vs7
> +      By default, the output is formatted to ease emacs parsing.  Visual Studio
> +      compatible output (vs7) may also be used.  Other formats are unsupported.
> +
> +    verbose=#
> +      Specify a number 0-5 to restrict errors to certain verbosity levels.
> +
> +    filter=-x,+y,...
> +      Specify a comma-separated list of category-filters to apply: only
> +      error messages whose category names pass the filters will be printed.
> +      (Category names are printed with the message and look like
> +      "[whitespace/indent]".)  Filters are evaluated left to right.
> +      "-FOO" and "FOO" means "do not print categories that start with FOO".
> +      "+FOO" means "do print categories that start with FOO".
> +
> +      Examples: --filter=-whitespace,+whitespace/braces
> +                --filter=whitespace,runtime/printf,+runtime/printf_format
> +                --filter=-,+build/include_what_you_use
> +
> +      To see a list of all the categories used in cpplint, pass no arg:
> +         --filter=
> +"""

Consider doing this

""" ....""" % {'program_name' sys.argv[0]}
In the description text replace "webkit-run-lint" and "cpplint" with "%{program_name}s"


> +def process_patch(patch_string):
> +    """Does lint on a single patch.
> +
> +    Args:
> +      patch_string: A string of a patch.
> +    """
> +    patch = DiffParser(patch_string.splitlines())
> +    for filename, diff in patch.files.iteritems():
> +        file_extension = os.path.splitext(filename)[1]
> +
> +        if file_extension in ['.cc', '.cpp', '.h']:
> +            line_numbers = set()

Consider creating this on demand, so if there are no lint errors, it just doesn't get created.  Like this:

               line_numbers = None            
> +
> +            def error_for_patch(filename, line_number, category, confidence, message):
> +                """Wrapper function of cpplint.error for patches.
> +
> +                This function outputs errors only if the line number
> +                corresponds to lines which are modified or added.
> +                """
                   if line_numbers is None:
                       line_numbers = set()
                       for line in diff.lines:
                           line_numbers.add(line[1])

> +                if line_number in line_numbers:
> +                    cpplint.error(filename, line_number, category, confidence, message)
> +
> +            cpplint.process_file(filename, error=error_for_patch)
Comment 7 Shinichiro Hamaji 2009-07-21 01:53:11 PDT
Created attachment 33158 [details]
Patch v3


---
 3 files changed, 149 insertions(+), 7 deletions(-)
Comment 8 Shinichiro Hamaji 2009-07-21 02:00:24 PDT
> The changes done here look good but I think you missed the call in def main()
> to process_file(filename, _cpplint_state.verbose_level)
> It should be easy enough to set the verbose level and omit the second
> parameter.

Ah, thanks, fixed.

> Consider doing this
> 
> """ ....""" % {'program_name' sys.argv[0]}
> In the description text replace "webkit-run-lint" and "cpplint" with
> "%{program_name}s"

There is one more change other than the program name. Though cpplint.py requires at least one file in the arguments and this was mentioned in _USAGE of cpplint.py, webkit-run-lint is not. I think we will eventually remove main() and _USAGE from cpplint module when webkit-run-lint is ready. So, can i leave the copy of _USAGE in run-webkit-lint for now?

> Consider creating this on demand, so if there are no lint errors, it just
> doesn't get created.  Like this:
> 
>                line_numbers = None            
> > +
> > +            def error_for_patch(filename, line_number, category, confidence, message):
> > +                """Wrapper function of cpplint.error for patches.
> > +
> > +                This function outputs errors only if the line number
> > +                corresponds to lines which are modified or added.
> > +                """
>                    if line_numbers is None:
>                        line_numbers = set()
>                        for line in diff.lines:
>                            line_numbers.add(line[1])
> 
> > +                if line_number in line_numbers:
> > +                    cpplint.error(filename, line_number, category, confidence, message)
> > +
> > +            cpplint.process_file(filename, error=error_for_patch)

Ah, nice suggestion. However, Python seems not to like this as is. It says:

UnboundLocalError: local variable 'line_numbers' referenced before assignment

I'm not sure what is the best way to fix, but I initialized line_numbers by an empty set. Maybe it's OK as valid diff chunks should have at least one diff line.
Comment 9 David Levin 2009-07-21 02:39:10 PDT
Comment on attachment 33158 [details]
Patch v3

A few things to consider....

> diff --git a/WebKitTools/Scripts/modules/cpplint.py b/WebKitTools/Scripts/modules/cpplint.py

> +# FIXME: Once run-webkit-lint script becomes ready, we don't need to
> +#        support standalone execution for this module.
> +#        So, we should remove _USAGE and main() eventually.

I'm not sure this is necessary (maybe).


>      _cpplint_state.reset_error_count()

Add this line here:

   _set_verbose_level(_cpplint_state.verbose_level)

>      for filename in filenames:
> -        process_file(filename, _cpplint_state.verbose_level)
> +        process_file(filename)


> diff --git a/WebKitTools/Scripts/run-webkit-lint b/WebKitTools/Scripts/run-webkit-lint

> +cpplint._USAGE = """
> +Syntax: webkit-run-lint [--verbose=#] [--output=vs7] [--filter=-x,+y,...]
> +
> +  The style guidelines this tries to follow are those in
> +    http://webkit.org/coding/coding-style.html
> +
> +  Every problem is given a confidence score from 1-5, with 5 meaning we are
> +  certain of the problem, and 1 meaning it could be a legitimate construct.
> +  This will miss some errors, and is not a substitute for a code review.
> +
> +  To prevent specific lines from being linted, add a '// NOLINT' comment to the
> +  end of the line.
> +
> +  Linted extensions are .cc, .cpp, and .h.  Other file types will be ignored.
> +
> +  Flags:
> +
> +    output=vs7
> +      By default, the output is formatted to ease emacs parsing.  Visual Studio
> +      compatible output (vs7) may also be used.  Other formats are unsupported.
> +
> +    verbose=#
> +      Specify a number 0-5 to restrict errors to certain verbosity levels.
> +
> +    filter=-x,+y,...
> +      Specify a comma-separated list of category-filters to apply: only
> +      error messages whose category names pass the filters will be printed.
> +      (Category names are printed with the message and look like
> +      "[whitespace/indent]".)  Filters are evaluated left to right.
> +      "-FOO" and "FOO" means "do not print categories that start with FOO".
> +      "+FOO" means "do print categories that start with FOO".
> +
> +      Examples: --filter=-whitespace,+whitespace/braces
> +                --filter=whitespace,runtime/printf,+runtime/printf_format
> +                --filter=-,+build/include_what_you_use
> +
> +      To see a list of all the categories used in cpplint, pass no arg:
> +         --filter=
> +"""

When I said this:  Consider doing this

""" ....""" % {'program_name' sys.argv[0]}
In the description text replace "webkit-run-lint" and "cpplint" with
"%{program_name}s"

I was referring to this comment (not the one in cpplint.py).  It would fix two things: "webkit-run-lint" is not the program name. "cpplint" is also not the name of the program being run.
sys.argv[0] is the program name.

>
> +        if file_extension in ['.cc', '.cpp', '.h']:
> +            line_numbers = set()
This looks fine.  The problem with my suggestion was the assigment inside of error_for_patch which made the variable have scope local to that block.


> +    if args:
> +        sys.stderr.write('ERROR: We don\'t support files as arguments for now.\n' + cpplint._USAGE)

Consider using "" for the outside quotes and change \' to '.  Like this

        sys.stderr.write("ERROR: We don't support files as arguments for now.\n" + cpplint._USAGE)
Comment 10 Shinichiro Hamaji 2009-07-21 03:19:02 PDT
Created attachment 33167 [details]
Patch v4


---
 3 files changed, 146 insertions(+), 7 deletions(-)
Comment 11 David Levin 2009-07-21 03:28:47 PDT
Comment on attachment 33167 [details]
Patch v4

Ok I see there is no need to call _set_verbose_level(_cpplint_state.verbose_level) because that is just redundant.
Comment 12 David Levin 2009-07-21 03:33:04 PDT
I forgot to note that run-webkit-list should have the executable bit set, but
I'll do this on landing.
Comment 13 David Levin 2009-07-21 03:35:40 PDT
Committed as http://trac.webkit.org/changeset/46166
Comment 14 Shinichiro Hamaji 2009-07-21 22:47:38 PDT
(In reply to comment #11)
> (From update of attachment 33167 [details])
> Ok I see there is no need to call
> _set_verbose_level(_cpplint_state.verbose_level) because that is just
> redundant.

Ah, thanks for finding this. My comment on this was conflicted and I didn't notice the conflict. As for "program_name", I completely misunderstood your comments, sorry.