Bug 32051

Summary: check-webkit-style should check for camelCase variable names
Product: WebKit Reporter: Eric Seidel (no email) <eric>
Component: Tools / TestsAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, hamaji, levin, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Attachments:
Description Flags
Incorrect code in WebGLRenderingContext.cpp
none
Patch v1
none
Patch v2
none
Patch v3 levin: review+, levin: commit-queue-

Eric Seidel (no email)
Reported 2009-12-01 20:31:10 PST
check-webkit-style should check for camelCase variable names WebKit requires camelCase instead of underbar_style variable names. check-webkit-style should check for this. This is a common mistake for those coming from Google's C++ style: https://bugs.webkit.org/show_bug.cgi?id=32048#c9
Attachments
Incorrect code in WebGLRenderingContext.cpp (2.08 KB, patch)
2009-12-02 08:08 PST, Shinichiro Hamaji
no flags
Patch v1 (9.53 KB, patch)
2009-12-02 08:10 PST, Shinichiro Hamaji
no flags
Patch v2 (14.18 KB, patch)
2009-12-03 04:39 PST, Shinichiro Hamaji
no flags
Patch v3 (16.54 KB, patch)
2009-12-03 09:53 PST, Shinichiro Hamaji
levin: review+
levin: commit-queue-
Shinichiro Hamaji
Comment 1 2009-12-02 08:08:48 PST
Created attachment 44148 [details] Incorrect code in WebGLRenderingContext.cpp
Shinichiro Hamaji
Comment 2 2009-12-02 08:10:02 PST
Created attachment 44149 [details] Patch v1
WebKit Review Bot
Comment 3 2009-12-02 08:15:19 PST
style-queue ran check-webkit-style on attachment 44149 [details] without any errors.
Shinichiro Hamaji
Comment 4 2009-12-02 08:18:48 PST
(In reply to comment #1) > Created an attachment (id=44148) [details] > Incorrect code in WebGLRenderingContext.cpp Oops, I mistook the usage of bugzilla-tool. Hmm... I did similar mistakes several times and it's a bit embarrassing. Can I add confirmation to post-* and land-* ? This would show the message and filenames about to post/land, and ask [Y/n] . If this plan sounds to someone, I'll open another bug and try adding the feature.
David Levin
Comment 5 2009-12-02 09:19:13 PST
(In reply to comment #4) > (In reply to comment #1) > > Created an attachment (id=44148) [details] [details] > > Incorrect code in WebGLRenderingContext.cpp > > Oops, I mistook the usage of bugzilla-tool. Hmm... I did similar mistakes > several times and it's a bit embarrassing. Can I add confirmation to post-* and > land-* ? This would show the message and filenames about to post/land, and ask > [Y/n] . If this plan sounds to someone, I'll open another bug and try adding > the feature. I did the exact same thing yesterday.
Eric Seidel (no email)
Comment 6 2009-12-02 09:20:39 PST
I encourage you to file a bug about adding confirmation. I think confirmation on these "dangerous" commands would be a good thing!
Eric Seidel (no email)
Comment 7 2009-12-02 09:27:17 PST
I filed bug 32071 with my thoughts on the subject of confirmation.
David Levin
Comment 8 2009-12-02 11:16:57 PST
Comment on attachment 44149 [details] Patch v1 > diff --git a/WebKitTools/Scripts/modules/cpp_style.py b/WebKitTools/Scripts/modules/cpp_style.py > +def check_identifier_name(filename, line_number, line, error): > + """Checks if identifier names don't contain an underscore. How about" Checks if identifier names contain any underscores. > + # Remove storage sepcifiers, ... typo: sepcifiers Instead of saying "what is the code doing?" in the comment. It would be good to answer "why?" (When I read the code, I don't understand why this is necessary.) > + line = re.sub(r'(?:unsigned|signed|inline|using|static|const|volatile|auto|register|extern|typedef|else|restrict|struct|class|virtual|return) ', '', line) > + > + # Remove all template parameters by removing matching < and >. Why? > + while True: > + line, removed_numbers = re.subn(r'<[\w\s*&:]+>', '', line) s/removed_numbers/number_of_replacements/ > + if not removed_numbers: > + break > + > + # Remove for statement. Why? > + line = re.sub(r'^\s*for\s*\(', '', line) > + # Remove other control statements. > + line, control_statement = re.subn(r'^\s*(?:while|if|switch)\s*\(', '', line) Why? > + # Detect variable and functions. > + matched = match(r'\s*\w(?:[\w:]|\s*[*&])+\s+([\w:]+)\s*([;(=[])', line) This regex is hard to read as is. I don't know what it is attempting to do with all of its clauses. One way to fix this is to make parts of the part into separate strings that are put in variables which have names explaining what they do. > + if not matched: > + return > + # If we removed a non-for-control statement, the character after > + # the identifier should be '='. With this rule, we can avoid the > + # case like "if (val & INT_MAX) {". What are you trying to avoid? Flagging INT_MAX? Why not just avoid flagging TEXT_THAT_LOOKS_LIKE_THIS since this is a defined value > + if control_statement and matched.group(2) != '=': > + return > + > + identifier = matched.group(1) It would be nice to use named groups. > + # Remove "m_" and "s_" to alow them. typo: alow > + modified_identifier = re.sub(r'(?:^|(?<=::))[ms]_', '', identifier) > + if modified_identifier.find('_') >= 0: > + error(filename, line_number, 'readability/naming', 4, identifier + " is unallowed naming. Don't use underscores in your identifier names.") s/unallowed naming/incorrectly named/ I expected to see a loop here that checked each identifier but I don't see it. How does this code check the following? myFunction(int variable1, int another_variable) int variable1, another_variable; int first_variable, secondVariable; etc. > diff --git a/WebKitTools/Scripts/modules/cpp_style_unittest.py b/WebKitTools/Scripts/modules/cpp_style_unittest.py > + self.assert_lint('short _length;', > + '_length' + name_error_message) What about a test for length_ as well? > + self.assert_lint('const int UNDER_SCORE;', > + 'UNDER_SCORE' + name_error_message) Given how simple the regex matching is. I think that allowing NAMES_LIKE_THIS is fine. People rarely make that mistake and it is a valid name for a #define'd item. > + self.assert_lint('static inline const char const & const under_score;', & is in the wrong place. > + self.assert_lint('WTF::Vector<WTF::RefPtr<const RenderObject * const> > under_score;', The * is in the wrong place. > + self.assert_lint('while (foo & under_score) {', '') Ideally this would produce an error.
David Levin
Comment 9 2009-12-02 14:02:07 PST
Two more comments: 1. It would be great if this only flagged the first instance of a particular identifier (in a given file) to avoid lots of warnings for the same identifier and to avoid giving a warning when someone is fixing some code and just using a variable that was already in the file. 2. Please run this on existing code (if you haven't already) and verify that it doesn't give false alarms.
Shinichiro Hamaji
Comment 10 2009-12-03 04:39:52 PST
Created attachment 44229 [details] Patch v2
Shinichiro Hamaji
Comment 11 2009-12-03 04:40:29 PST
Thanks for the detailed comments! It helped me a lot. I think I addressed all small issues such as typo, variable names, and wrong English. Also, I wrote more comments, sorry for poor comments in the previous patch. It was very bad that I didn't write docstring which kind of identifiers the function I added will check, sorry. The function I added only checks identifiers in their declaration. If we check the use of identifiers, it would be very difficult to avoid false alarms because there would be several identifiers which violates our style guide in libraries we are using. I also wrote this in the docstring. I think this decision can be objected. If the checker complains about the use of wrongly named symbols, we will have more chance to fix them when we are just using them. However, I think the benefit to reduce the risk of false alarming for third-party's variables wins the benefit of more warnings. Also, we may need to add some exception lists for static_cast or something like this. What do you think? > Why not just avoid flagging TEXT_THAT_LOOKS_LIKE_THIS since this is > a defined value At first, I considered this solution, but I imagined there may be global variables which has name_like_this in third-party libraries. So, I'm guessing we need check like this anyway. > I expected to see a loop here that checked each identifier but I don't see it. > How does this code check the following? > myFunction(int variable1, int another_variable) > int variable1, another_variable; > int first_variable, secondVariable; > etc. Oops, I completely forgot about the latter two cases, and I didn't implement the first one because this seemed to be a bit more difficult and this case would be not so common like errors in local variables. Anyway, I've implemented both cases and added test cases. > > + self.assert_lint('while (foo & under_score) {', '') > > Ideally this would produce an error. This test is intended to check a global variables in third-party library isn't warned. I renamed "under_score" to "value_in_thirdparty_library" to describe the intention clearly. > 1. It would be great if this only flagged the first instance of a > particular identifier (in a given file) to avoid lots of warnings for > the same identifier and to avoid giving a warning when someone is > fixing some code and just using a variable that was already in the > file. As the function only checks declaration, I think the warnings won't be so noisy without this. > 2. Please run this on existing code (if you haven't already) and > verify that it doesn't give false alarms. Yes, I'm running the script for WebCore/*/*.{cpp,h} to check this. This produces 494 warnings now and I don't see any false alarms in the list.
WebKit Review Bot
Comment 12 2009-12-03 04:44:54 PST
style-queue ran check-webkit-style on attachment 44229 [details] without any errors.
Shinichiro Hamaji
Comment 13 2009-12-03 09:53:38 PST
Created attachment 44252 [details] Patch v3
WebKit Review Bot
Comment 14 2009-12-03 09:56:00 PST
style-queue ran check-webkit-style on attachment 44252 [details] without any errors.
Shinichiro Hamaji
Comment 15 2009-12-03 09:57:26 PST
David suggested I should use cache mechanism for re.sub and re.subn. I updated my patch so now it uses the cache. I re-ran the unittest and there were no failures. I also confirmed that this doesn't change the result for WebCore/*/*.{cpp,h}
David Levin
Comment 16 2009-12-03 12:45:54 PST
Comment on attachment 44252 [details] Patch v3 Looks good. I've noted a few things that you may want to consider changing when landing but none are critical. > diff --git a/WebKitTools/Scripts/modules/cpp_style.py b/WebKitTools/Scripts/modules/cpp_style.py > +def check_identifier_name_in_declaration(filename, line_number, line, error): > + """Checks if identifier names contain any underscores. > + > + As identifiers in libraries we are using have a bunch of > + underscores, we only warn the declarations of identifiers and s/warn/warn about/ > + error: The function to call with any errors found. > + """ > + if match('\s*return', line): Should there be a \b after return? > + # Convert "long long", "long double", and "long long int" to > + # simple types, but don't remove simple "long". > + line = sub(r'long (?:long )?(?=long|double|int)', '', line) I'm not sure why this uses the non-group form of parenthesis (and the same comment goes for other places especially now that the code uses named groups). (The code doesn't need the group, but putting in the ?: just introduces more visual noise for me at least.) > + line = sub(r'^\s*for\s*\(', '', line) > + line, control_statement = subn(r'^\s*(?:while|else if|if|switch)\s*\(', '', line) I think qt uses foreach as well but it is probably fine to leave it as unsupported for now.
Shinichiro Hamaji
Comment 17 2009-12-03 23:42:32 PST
Shinichiro Hamaji
Comment 18 2009-12-04 00:03:59 PST
Thanks for your review. I landed my patch with the following changes. > s/warn/warn about/ Fixed. > Should there be a \b after return? Ah, nice catch. Fixed, thanks! I also added a test case for this: self.assert_lint('return_t under_score;', 'under_score' + name_error_message) > I'm not sure why this uses the non-group form of parenthesis (and the same > comment goes for other places especially now that the code uses named groups). > (The code doesn't need the group, but putting in the ?: just introduces more > visual noise for me at least.) This is my habit because using (?:) is slightly faster than (). As I agree this is less readable and this is inconsistent with other code in this file, I remove them. > > + line = sub(r'^\s*for\s*\(', '', line) > > + line, control_statement = subn(r'^\s*(?:while|else if|if|switch)\s*\(', '', line) > > I think qt uses foreach as well but it is probably fine to leave it as > unsupported for now. It seems foreach isn't so common. Please let me leave it as is.
Note You need to log in before you can comment on or make changes to this bug.