Bug 32051 - check-webkit-style should check for camelCase variable names
Summary: check-webkit-style should check for camelCase variable names
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-12-01 20:31 PST by Eric Seidel (no email)
Modified: 2009-12-04 00:03 PST (History)
4 users (show)

See Also:


Attachments
Incorrect code in WebGLRenderingContext.cpp (2.08 KB, patch)
2009-12-02 08:08 PST, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
Patch v1 (9.53 KB, patch)
2009-12-02 08:10 PST, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
Patch v2 (14.18 KB, patch)
2009-12-03 04:39 PST, Shinichiro Hamaji
no flags Details | Formatted Diff | Diff
Patch v3 (16.54 KB, patch)
2009-12-03 09:53 PST, Shinichiro Hamaji
levin: review+
levin: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2009-12-01 20:31:10 PST
check-webkit-style should check for camelCase variable names

WebKit requires camelCase instead of underbar_style variable names.  check-webkit-style should check for this.

This is a common mistake for those coming from Google's C++ style:
https://bugs.webkit.org/show_bug.cgi?id=32048#c9
Comment 1 Shinichiro Hamaji 2009-12-02 08:08:48 PST
Created attachment 44148 [details]
Incorrect code in WebGLRenderingContext.cpp
Comment 2 Shinichiro Hamaji 2009-12-02 08:10:02 PST
Created attachment 44149 [details]
Patch v1
Comment 3 WebKit Review Bot 2009-12-02 08:15:19 PST
style-queue ran check-webkit-style on attachment 44149 [details] without any errors.
Comment 4 Shinichiro Hamaji 2009-12-02 08:18:48 PST
(In reply to comment #1)
> Created an attachment (id=44148) [details]
> Incorrect code in WebGLRenderingContext.cpp

Oops, I mistook the usage of bugzilla-tool. Hmm... I did similar mistakes several times and it's a bit embarrassing. Can I add confirmation to post-* and land-* ? This would show the message and filenames about to post/land, and ask [Y/n] . If this plan sounds to someone, I'll open another bug and try adding the feature.
Comment 5 David Levin 2009-12-02 09:19:13 PST
(In reply to comment #4)
> (In reply to comment #1)
> > Created an attachment (id=44148) [details] [details]
> > Incorrect code in WebGLRenderingContext.cpp
> 
> Oops, I mistook the usage of bugzilla-tool. Hmm... I did similar mistakes
> several times and it's a bit embarrassing. Can I add confirmation to post-* and
> land-* ? This would show the message and filenames about to post/land, and ask
> [Y/n] . If this plan sounds to someone, I'll open another bug and try adding
> the feature.

I did the exact same thing yesterday.
Comment 6 Eric Seidel (no email) 2009-12-02 09:20:39 PST
I encourage you to file a bug about adding confirmation.  I think confirmation on these "dangerous" commands would be a good thing!
Comment 7 Eric Seidel (no email) 2009-12-02 09:27:17 PST
I filed bug 32071 with my thoughts on the subject of confirmation.
Comment 8 David Levin 2009-12-02 11:16:57 PST
Comment on attachment 44149 [details]
Patch v1

> diff --git a/WebKitTools/Scripts/modules/cpp_style.py b/WebKitTools/Scripts/modules/cpp_style.py

> +def check_identifier_name(filename, line_number, line, error):
> +    """Checks if identifier names don't contain an underscore.
How about"
  Checks if identifier names contain any underscores.

> +    # Remove storage sepcifiers, ...

typo: sepcifiers

Instead of saying "what is the code doing?" in the comment. It would be good to answer "why?" (When I read the code, I don't understand why this is necessary.)

> +    line = re.sub(r'(?:unsigned|signed|inline|using|static|const|volatile|auto|register|extern|typedef|else|restrict|struct|class|virtual|return) ', '', line)
> +
> +    # Remove all template parameters by removing matching < and >.

Why?

> +    while True:
> +        line, removed_numbers = re.subn(r'<[\w\s*&:]+>', '', line)

s/removed_numbers/number_of_replacements/

> +        if not removed_numbers:
> +            break
> +
> +    # Remove for statement.
Why?

> +    line = re.sub(r'^\s*for\s*\(', '', line)
> +    # Remove other control statements.
> +    line, control_statement = re.subn(r'^\s*(?:while|if|switch)\s*\(', '', line)

Why?

> +    # Detect variable and functions.
> +    matched = match(r'\s*\w(?:[\w:]|\s*[*&])+\s+([\w:]+)\s*([;(=[])', line)

This regex is hard to read as is. I don't know what it is attempting to do with all of its clauses. One way to fix this is to make parts of the part into separate strings that are put in variables which have names explaining what they do.

> +    if not matched:
> +        return
> +    # If we removed a non-for-control statement, the character after
> +    # the identifier should be '='. With this rule, we can avoid the
> +    # case like "if (val & INT_MAX) {".

What are you trying to avoid?  Flagging INT_MAX?

Why not just avoid flagging TEXT_THAT_LOOKS_LIKE_THIS since this is a defined value


> +    if control_statement and matched.group(2) != '=':
> +        return
> +
> +    identifier = matched.group(1)

It would be nice to use named groups.

> +    # Remove "m_" and "s_" to alow them.

typo: alow

> +    modified_identifier = re.sub(r'(?:^|(?<=::))[ms]_', '', identifier)
> +    if modified_identifier.find('_') >= 0:
> +        error(filename, line_number, 'readability/naming', 4, identifier + " is unallowed naming. Don't use underscores in your identifier names.")

s/unallowed naming/incorrectly named/

I expected to see a loop here that checked each identifier but I don't see it. How does this code check the following?
  myFunction(int variable1, int another_variable)
  int variable1, another_variable;
  int first_variable, secondVariable;
etc.

> diff --git a/WebKitTools/Scripts/modules/cpp_style_unittest.py b/WebKitTools/Scripts/modules/cpp_style_unittest.py

> +        self.assert_lint('short _length;',
> +                         '_length' + name_error_message)

What about a test for length_ as well?

> +        self.assert_lint('const int UNDER_SCORE;',
> +                         'UNDER_SCORE' + name_error_message)

Given how simple the regex matching is. I think that allowing NAMES_LIKE_THIS is fine. People rarely make that mistake and it is a valid name for a #define'd item.


> +        self.assert_lint('static inline const char const & const under_score;',

& is in the wrong place.

> +        self.assert_lint('WTF::Vector<WTF::RefPtr<const RenderObject * const> > under_score;',

The * is in the wrong place.

> +        self.assert_lint('while (foo & under_score) {', '')

Ideally this would produce an error.
Comment 9 David Levin 2009-12-02 14:02:07 PST
Two more comments:
1. It would be great if this only flagged the first instance of a particular identifier (in a given file) to avoid lots of warnings for the same identifier and to avoid giving a warning when someone is fixing some code and just using a variable that was already in the file.

2. Please run this on existing code (if you haven't already) and verify that it doesn't give false alarms.
Comment 10 Shinichiro Hamaji 2009-12-03 04:39:52 PST
Created attachment 44229 [details]
Patch v2
Comment 11 Shinichiro Hamaji 2009-12-03 04:40:29 PST
Thanks for the detailed comments! It helped me a lot.

I think I addressed all small issues such as typo, variable names, and
wrong English. Also, I wrote more comments, sorry for poor comments in
the previous patch.

It was very bad that I didn't write docstring which kind of
identifiers the function I added will check, sorry. The function I
added only checks identifiers in their declaration. If we check the
use of identifiers, it would be very difficult to avoid false alarms
because there would be several identifiers which violates our style
guide in libraries we are using. I also wrote this in the docstring.

I think this decision can be objected. If the checker complains about
the use of wrongly named symbols, we will have more chance to fix
them when we are just using them. However, I think the benefit to
reduce the risk of false alarming for third-party's variables wins the
benefit of more warnings. Also, we may need to add some exception
lists for static_cast or something like this. What do you think?

> Why not just avoid flagging TEXT_THAT_LOOKS_LIKE_THIS since this is
> a defined value

At first, I considered this solution, but I imagined there may be
global variables which has name_like_this in third-party libraries.
So, I'm guessing we need check like this anyway.

> I expected to see a loop here that checked each identifier but I don't see it.
> How does this code check the following?
>   myFunction(int variable1, int another_variable)
>   int variable1, another_variable;
>   int first_variable, secondVariable;
> etc.

Oops, I completely forgot about the latter two cases, and I didn't
implement the first one because this seemed to be a bit more
difficult and this case would be not so common like errors in local
variables. Anyway, I've implemented both cases and added test cases.

> > +        self.assert_lint('while (foo & under_score) {', '')
>
> Ideally this would produce an error.

This test is intended to check a global variables in third-party
library isn't warned. I renamed "under_score" to
"value_in_thirdparty_library" to describe the intention clearly.

> 1. It would be great if this only flagged the first instance of a
> particular identifier (in a given file) to avoid lots of warnings for
> the same identifier and to avoid giving a warning when someone is
> fixing some code and just using a variable that was already in the
> file.

As the function only checks declaration, I think the warnings won't be
so noisy without this.

> 2. Please run this on existing code (if you haven't already) and
> verify that it doesn't give false alarms.

Yes, I'm running the script for WebCore/*/*.{cpp,h} to check this.
This produces 494 warnings now and I don't see any false alarms in the
list.
Comment 12 WebKit Review Bot 2009-12-03 04:44:54 PST
style-queue ran check-webkit-style on attachment 44229 [details] without any errors.
Comment 13 Shinichiro Hamaji 2009-12-03 09:53:38 PST
Created attachment 44252 [details]
Patch v3
Comment 14 WebKit Review Bot 2009-12-03 09:56:00 PST
style-queue ran check-webkit-style on attachment 44252 [details] without any errors.
Comment 15 Shinichiro Hamaji 2009-12-03 09:57:26 PST
David suggested I should use cache mechanism for re.sub and re.subn. I updated
my patch so now it uses the cache.

I re-ran the unittest and there were no failures. I also confirmed that this doesn't change the result for WebCore/*/*.{cpp,h}
Comment 16 David Levin 2009-12-03 12:45:54 PST
Comment on attachment 44252 [details]
Patch v3

Looks good. I've noted a few things that you may want to consider changing when landing but none are critical.

> diff --git a/WebKitTools/Scripts/modules/cpp_style.py b/WebKitTools/Scripts/modules/cpp_style.py
> +def check_identifier_name_in_declaration(filename, line_number, line, error):
> +    """Checks if identifier names contain any underscores.
> +
> +    As identifiers in libraries we are using have a bunch of
> +    underscores, we only warn the declarations of identifiers and

s/warn/warn about/


> +      error: The function to call with any errors found.
> +    """
> +    if match('\s*return', line):

Should there be a \b after return?

> +    # Convert "long long", "long double", and "long long int" to
> +    # simple types, but don't remove simple "long".
> +    line = sub(r'long (?:long )?(?=long|double|int)', '', line)

I'm not sure why this uses the non-group form of parenthesis (and the same comment goes for other places especially now that the code uses named groups). (The code doesn't need the group, but putting in the ?: just introduces more visual noise for me at least.)

> +    line = sub(r'^\s*for\s*\(', '', line)
> +    line, control_statement = subn(r'^\s*(?:while|else if|if|switch)\s*\(', '', line)

I think qt uses foreach as well but it is probably fine to leave it as unsupported for now.
Comment 17 Shinichiro Hamaji 2009-12-03 23:42:32 PST
Committed r51682: <http://trac.webkit.org/changeset/51682>
Comment 18 Shinichiro Hamaji 2009-12-04 00:03:59 PST
Thanks for your review. I landed my patch with the following changes.

> s/warn/warn about/

Fixed.

> Should there be a \b after return?

Ah, nice catch. Fixed, thanks! I also added a test case for this:

        self.assert_lint('return_t under_score;',
                         'under_score' + name_error_message)

> I'm not sure why this uses the non-group form of parenthesis (and the same
> comment goes for other places especially now that the code uses named groups).
> (The code doesn't need the group, but putting in the ?: just introduces more
> visual noise for me at least.)

This is my habit because using (?:) is slightly faster than (). As I agree this is less readable and this is inconsistent with other code in this file, I remove them.

> > +    line = sub(r'^\s*for\s*\(', '', line)
> > +    line, control_statement = subn(r'^\s*(?:while|else if|if|switch)\s*\(', '', line)
> 
> I think qt uses foreach as well but it is probably fine to leave it as
> unsupported for now.

It seems foreach isn't so common. Please let me leave it as is.