Bug 33454

Summary: check-webkit-style: Create a class to represent a list of filter rules.
Product: WebKit Reporter: Chris Jerdonek <cjerdonek>
Component: Tools / TestsAssignee: Chris Jerdonek <cjerdonek>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, cjerdonek, eric, hamaji, levin
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
Proposed patch
none
Proposed patch 2
none
Proposed patch 3 none

Description Chris Jerdonek 2010-01-10 16:40:52 PST
This class will encapsulate the logic of checking filter rules against the category names.
Comment 1 Chris Jerdonek 2010-01-10 20:44:38 PST
Created attachment 46252 [details]
Proposed patch
Comment 2 Chris Jerdonek 2010-01-10 20:50:24 PST
Comment on attachment 46252 [details]
Proposed patch

Obsoleting "Proposed Patch 1." I should add unit tests first.
Comment 3 Chris Jerdonek 2010-01-10 21:51:15 PST
Created attachment 46258 [details]
Proposed patch 2
Comment 4 Chris Jerdonek 2010-01-11 00:14:33 PST
Created attachment 46261 [details]
Proposed patch 3

Corrected index paths to be properly rooted.  This is the reason why all of the status bubbles were red.
Comment 5 Shinichiro Hamaji 2010-01-12 23:13:14 PST
Comment on attachment 46261 [details]
Proposed patch 3

Looks good. Thanks for this refactoring. r- for a few nitpicks (yay I'm a reviewer now :).

> +        This method performs argument validation but does not strip
> +        leading or trailing white space.

Why don't we strip the arguments? I guess you want to be stricter than before? If so, we may want to raise a ValueError for the case where rule.find(' ') >= 0 in the integrity check of this function.

> +        Raises:
> +          ValueError: Invalid filter rule if a rule does not start with
> +                      plus ("+") or minus ("-").
> +
> +        """

I think usually we don't put an empty line in the end of docstrings.

> +    def __str__(self):
> +        return ",".join(self._filter_rules)

I think we basically use single-quotes for string literals except for multi-line string literals.

> +    def test_init(self):
> +        """Test __init__ constructor."""
> +        self.assertRaises(ValueError, CategoryFilter, ["no_prefix"])
> +        CategoryFilter([]) # No ValueError: works
> +        CategoryFilter(["+"]) # No ValueError: works
> +        CategoryFilter(["-"]) # No ValueError: works

If we decide to raise a ValueError for whitespaces, we may want to have an assertions for them as well. Also, please consider using single-quotes in this file.
Comment 6 Chris Jerdonek 2010-01-13 00:40:18 PST
(In reply to comment #5)
> (From update of attachment 46261 [details])
> Looks good. Thanks for this refactoring. r- for a few nitpicks (yay I'm a
> reviewer now :).

Congrats, Shinichiro!  That's great news!  And thanks for choosing my patch to be one of your first!  :)

> > +        This method performs argument validation but does not strip
> > +        leading or trailing white space.
> 
> Why don't we strip the arguments?

Good question.  I guess I tend to prefer if "inner-most" functions simply operate on what is given to them rather than do any clean-up.  I'd rather the caller be responsible for choosing what clean-up to do depending on the case at hand.  This could even be by calling a wrapper function that cleans up and then calls the "bare" function.

Some of my reasons are--

(1) If the function always does clean-up, the caller doesn't have the option of trying to call the function without cleaning up.  The function is doing two things instead of one thing.

(2) It can often be an arbitrary choice as to what clean-up to include and what not to include.  Should white space be stripped?  Should strings be converted to lower case?  Etc.  Because of this arbitrariness, the caller is never sure and winds up having to look more to verify what clean-up is done.

(3) There is also the performance argument, which practically speaking doesn't matter here, but illustrates a principle.  If the function cleans up every time, it can be cleaning up even when it doesn't need to.  And if the caller is cleaning up in certain circumstances, things can wind up getting cleaned up twice, which is unnecessary.

Does that seem reasonable to you?

> I guess you want to be stricter than before?
> If so, we may want to raise a ValueError for the case where rule.find(' ') >= 0
> in the integrity check of this function.

I believe the end result is not any stricter than before because the parse function (which is the caller in this case) strips white space before instantiating the CategoryFilter.  This is done in the  _parse_filter_flag function:

> for uncleaned_filter in flag_value.split(','):
>     filter = uncleaned_filter.strip()
>     if not filter:
>         continue
>    filters.append(filter)

And the unit tests now check that the default filter rules do not have white space at the beginning or end.

I chose not to check for spaces to allow for category names with spaces.  If category names should not have white space, we should probably include that in the documentation about what filter rules are valid.  Should we rule out category names with more than one word?

> 
> > +        Raises:
> > +          ValueError: Invalid filter rule if a rule does not start with
> > +                      plus ("+") or minus ("-").
> > +
> > +        """
> 
> I think usually we don't put an empty line in the end of docstrings.

I noticed recently that this is recommended in the Python style guide on docstring conventions:

> The BDFL [3] recommends inserting a blank line between the last paragraph in a multi-line docstring and its closing quotes, placing the closing quotes on a line by themselves. This way, Emacs' fill-paragraph command can be used on it. ( from http://www.python.org/dev/peps/pep-0257/ )

Should we depart from this?  I don't care too much.  After having tried both though, I do think it looks a bit nicer with the space.  It looks less scrunched that way, especially if the subsequent code starts on the line immediately after the """.

> 
> > +    def __str__(self):
> > +        return ",".join(self._filter_rules)
> 
> I think we basically use single-quotes for string literals except for
> multi-line string literals.

Yeah, I was doing that before, and then Eric recently told me that he and Adam were leaning towards double quotes.  I don't really care as long as we're consistent.  If I had to make a choice, I'd probably prefer double because it's more consistent with docstring quotes being double.  The Python style guide only seems to have an opinion about docstring quotes (which should be double).  Should we try to settle on something with Eric, Adam, and David going forward?

> > +    def test_init(self):
> > +        """Test __init__ constructor."""
> > +        self.assertRaises(ValueError, CategoryFilter, ["no_prefix"])
> > +        CategoryFilter([]) # No ValueError: works
> > +        CategoryFilter(["+"]) # No ValueError: works
> > +        CategoryFilter(["-"]) # No ValueError: works
> 
> If we decide to raise a ValueError for whitespaces, we may want to have an
> assertions for them as well. Also, please consider using single-quotes in this
> file.

Thanks again!
Comment 7 Shinichiro Hamaji 2010-01-13 03:56:22 PST
Comment on attachment 46261 [details]
Proposed patch 3

All concerns are resolved. Thanks!

> > Why don't we strip the arguments?
> ... snipped ...
> Does that seem reasonable to you?

I see. Thanks for the detailed description!

> I believe the end result is not any stricter than before because the parse
> function (which is the caller in this case) strips white space before
> instantiating the CategoryFilter.  This is done in the  _parse_filter_flag
> function:

Ah, I see. I missed _parse_filter_flag is stripping the filters. So, I guess I wanted a testcase for _parse_filter_flag about stripping. It may be like

        (files, options) = parse(['--filter= +foo , -bar '])
        self.assertEquals(options.filter_rules,
                          ['-', '+whitespace', '+foo', '-bar'])

But I don't think we should add this in this bug.

> > The BDFL [3] recommends inserting a blank line between the last paragraph in a multi-line docstring and its closing quotes, placing the closing quotes on a line by themselves. This way, Emacs' fill-paragraph command can be used on it. ( from http://www.python.org/dev/peps/pep-0257/ )
> 
> Should we depart from this?  I don't care too much.  After having tried both
> though, I do think it looks a bit nicer with the space.  It looks less
> scrunched that way, especially if the subsequent code starts on the line
> immediately after the """.

Ah, I see. I didn't know this, thanks for the info! I don't have strong preference on this and I guess other people don't care so much either.

> Yeah, I was doing that before, and then Eric recently told me that he and Adam
> were leaning towards double quotes.  I don't really care as long as we're
> consistent.  If I had to make a choice, I'd probably prefer double because it's
> more consistent with docstring quotes being double.  The Python style guide
> only seems to have an opinion about docstring quotes (which should be double). 
> Should we try to settle on something with Eric, Adam, and David going forward?

I have no strong preference here, too. Though having two styles in one file isn't good, we may be able to fix all style inconsistencies issues in another patch later. Let's go ahead.
Comment 8 Shinichiro Hamaji 2010-01-13 07:53:05 PST
Comment on attachment 46261 [details]
Proposed patch 3

Clearing flags on attachment: 46261

Committed r53185: <http://trac.webkit.org/changeset/53185>
Comment 9 Shinichiro Hamaji 2010-01-13 07:53:17 PST
All reviewed patches have been landed.  Closing bug.
Comment 10 Chris Jerdonek 2010-01-13 08:02:13 PST
(In reply to comment #7)
> > I believe the end result is not any stricter than before because the parse
> > function (which is the caller in this case) strips white space before
> > instantiating the CategoryFilter.  This is done in the  _parse_filter_flag
> > function:
> 
> Ah, I see. I missed _parse_filter_flag is stripping the filters. So, I guess I
> wanted a testcase for _parse_filter_flag about stripping. It may be like
> 
>         (files, options) = parse(['--filter= +foo , -bar '])
>         self.assertEquals(options.filter_rules,
>                           ['-', '+whitespace', '+foo', '-bar'])
>
> But I don't think we should add this in this bug.

Good point.  I'll add such a test case in my next patch to check-webkit-style.