214954 – check-webkit-style should enforce acronym capitalization at start/end of an identifier

RESOLVED FIXED 214954

check-webkit-style should enforce acronym capitalization at start/end of an identifier

https://bugs.webkit.org/show_bug.cgi?id=214954

Summary check-webkit-style should enforce acronym capitalization at start/end of an i...

David Kilzer (:ddkilzer)

Reported 2020-07-29 18:27:58 PDT

check-webkit-style should enforce acronym capitalization at start/end of an identifier. Specifically, for "URL". See Bug 214756 Comment #19.

Attachments
Patch v1 (14.28 KB, patch) 2020-07-29 21:09 PDT, David Kilzer (:ddkilzer)	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

David Kilzer (:ddkilzer)

Comment 1 2020-07-29 18:30:13 PDT

We can now find the incorrectly capitalized identifiers in DumpRenderTree.mm: $ ./Tools/Scripts/check-webkit-style --filter-rules=-,+readability/naming/acronym Tools/DumpRenderTree/mac/DumpRenderTree.mm ERROR: Tools/DumpRenderTree/mac/DumpRenderTree.mm:297: The identifier name "URLString" starts with a acronym that is not all lowercase. [readability/naming/acronym] [5] ERROR: Tools/DumpRenderTree/mac/DumpRenderTree.mm:1648: The identifier name "testPathOrUrl" ends with a acronym that is not all uppercase. [readability/naming/acronym] [5] ERROR: Tools/DumpRenderTree/mac/DumpRenderTree.mm:1650: The identifier name "localPathOrUrl" ends with a acronym that is not all uppercase. [readability/naming/acronym] [5] Total errors found: 3 in 1 files And verify that Source/WTF has no incorrectly named variables (might be false negatives, but no false positives): $ ./Tools/Scripts/check-webkit-style --filter-rules=-,+readability/naming/acronym Source/WTF Total errors found: 0 in 828 files

David Kilzer (:ddkilzer)

Comment 2 2020-07-29 21:09:07 PDT

Created attachment 405543 [details] Patch v1

David Kilzer (:ddkilzer)

Comment 3 2020-07-29 21:52:30 PDT

BTW, I don't plan on running the script on existing code and fixing issues right away since I have other priorities to work on.

Jonathan Bedard

Comment 4 2020-07-30 07:28:47 PDT

Comment on attachment 405543 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=405543&action=review > Tools/Scripts/webkitpy/style/checkers/cpp.py:1820 > + # FIXME: Hard to check middle words without knowing that the word to the left doesn't end with an acronym. We already have a list of acronyms, can't we just do 'starts with' iterating through our list?

David Kilzer (:ddkilzer)

Comment 5 2020-07-30 10:20:22 PDT

Comment on attachment 405543 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=405543&action=review > Tools/Scripts/webkitpy/style/checkers/cpp.py:1805 > + acronyms = '|'.join(['MIME', 'URL']) @Jonathan: This list of acronyms could be replaced by the other list you mentioned. Can you tell me where the list is defined? >> Tools/Scripts/webkitpy/style/checkers/cpp.py:1820 >> + # FIXME: Hard to check middle words without knowing that the word to the left doesn't end with an acronym. > > We already have a list of acronyms, can't we just do 'starts with' iterating through our list? Where is the list of acronyms? I wasn't aware of it until you mentioned it. (See previous comment.) Also, I'm not sure how a list of acronyms would help with this FIXME. The case I'm concerned about is something (made-up) like: canUnfurlFlag canCurlHair Searching the middle of a variable for "url" (in this case) might return a false positive, which is what the comment was about. I was thinking more about this after posting the patch, and I think a better algorithm may be to: - Split camelCase identifier into individual words using the start of a capital letter followed by a lowercase letter (after the initial word which would be all-uppercase or all-lowercase). - Compare each individual word in the list to a list of acronyms, and determine whether its case is wrong whether it's first or not-first in the list. The algorithm wouldn't be perfect, but would probably do a better job (and be easier to understand) than this code: - HTMLDOMDocument => (HTMLDOM, Document) - canUnfurlFlag => (can, Unfurl, Flag) - canCurlHair => (can Curl Hair) - URLString => (URL, String) - localPathOrUrl => (local, Path, Or, Url) - MIMEStringForURL => (MIME, String, For, URL) Now this algorithm assumes each word is spelled correctly, but we're not trying to solve correct spelling of camelCase words, so that's okay.

Jonathan Bedard

Comment 6 2020-07-30 10:27:29 PDT

Comment on attachment 405543 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=405543&action=review >> Tools/Scripts/webkitpy/style/checkers/cpp.py:1805 >> + acronyms = '|'.join(['MIME', 'URL']) > > @Jonathan: This list of acronyms could be replaced by the other list you mentioned. Can you tell me where the list is defined? I was actually referring to this list here, although you explanation of the FIXME and false-positive matches makes it more clear the cases you had in mind. >>> Tools/Scripts/webkitpy/style/checkers/cpp.py:1820 >>> + # FIXME: Hard to check middle words without knowing that the word to the left doesn't end with an acronym. >> >> We already have a list of acronyms, can't we just do 'starts with' iterating through our list? > > Where is the list of acronyms? I wasn't aware of it until you mentioned it. (See previous comment.) > > Also, I'm not sure how a list of acronyms would help with this FIXME. The case I'm concerned about is something (made-up) like: > > canUnfurlFlag > canCurlHair > > Searching the middle of a variable for "url" (in this case) might return a false positive, which is what the comment was about. > > I was thinking more about this after posting the patch, and I think a better algorithm may be to: > > - Split camelCase identifier into individual words using the start of a capital letter followed by a lowercase letter (after the initial word which would be all-uppercase or all-lowercase). > - Compare each individual word in the list to a list of acronyms, and determine whether its case is wrong whether it's first or not-first in the list. > > The algorithm wouldn't be perfect, but would probably do a better job (and be easier to understand) than this code: > > - HTMLDOMDocument => (HTMLDOM, Document) > - canUnfurlFlag => (can, Unfurl, Flag) > - canCurlHair => (can Curl Hair) > - URLString => (URL, String) > - localPathOrUrl => (local, Path, Or, Url) > - MIMEStringForURL => (MIME, String, For, URL) > > Now this algorithm assumes each word is spelled correctly, but we're not trying to solve correct spelling of camelCase words, so that's okay. I was referring to your above list. (seems like DOM and HTML should be in it, though) If we want to do the more thorough camelCase splitting, that really feels like a different patch, since it's quite a bit more complicated then the startswith case I suggested.

David Kilzer (:ddkilzer)

Comment 7 2020-07-30 14:32:59 PDT

Comment on attachment 405543 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=405543&action=review >>> Tools/Scripts/webkitpy/style/checkers/cpp.py:1805 >>> + acronyms = '|'.join(['MIME', 'URL']) >> >> @Jonathan: This list of acronyms could be replaced by the other list you mentioned. Can you tell me where the list is defined? > > I was actually referring to this list here, although you explanation of the FIXME and false-positive matches makes it more clear the cases you had in mind. I should note that I wanted to start with a small list that was easy for others to add to. I don't have time to check for false positives for a ton of acronyms up-front. >>>> Tools/Scripts/webkitpy/style/checkers/cpp.py:1820 >>>> + # FIXME: Hard to check middle words without knowing that the word to the left doesn't end with an acronym. >>> >>> We already have a list of acronyms, can't we just do 'starts with' iterating through our list? >> >> Where is the list of acronyms? I wasn't aware of it until you mentioned it. (See previous comment.) >> >> Also, I'm not sure how a list of acronyms would help with this FIXME. The case I'm concerned about is something (made-up) like: >> >> canUnfurlFlag >> canCurlHair >> >> Searching the middle of a variable for "url" (in this case) might return a false positive, which is what the comment was about. >> >> I was thinking more about this after posting the patch, and I think a better algorithm may be to: >> >> - Split camelCase identifier into individual words using the start of a capital letter followed by a lowercase letter (after the initial word which would be all-uppercase or all-lowercase). >> - Compare each individual word in the list to a list of acronyms, and determine whether its case is wrong whether it's first or not-first in the list. >> >> The algorithm wouldn't be perfect, but would probably do a better job (and be easier to understand) than this code: >> >> - HTMLDOMDocument => (HTMLDOM, Document) >> - canUnfurlFlag => (can, Unfurl, Flag) >> - canCurlHair => (can Curl Hair) >> - URLString => (URL, String) >> - localPathOrUrl => (local, Path, Or, Url) >> - MIMEStringForURL => (MIME, String, For, URL) >> >> Now this algorithm assumes each word is spelled correctly, but we're not trying to solve correct spelling of camelCase words, so that's okay. > > I was referring to your above list. (seems like DOM and HTML should be in it, though) > > If we want to do the more thorough camelCase splitting, that really feels like a different patch, since it's quite a bit more complicated then the startswith case I suggested. Okay, I can do a follow-up patch. This code seems like such a hack, though. :)

EWS

Comment 8 2020-07-30 14:44:38 PDT

Committed r265096: <https://trac.webkit.org/changeset/265096> All reviewed patches have been landed. Closing bug and clearing flags on attachment 405543 [details].

Radar WebKit Bug Importer

Comment 9 2020-07-30 14:46:26 PDT

<rdar://problem/66348710>

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version Other

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee

David Kilzer (:ddkilzer)

Reported

2020-07-29 18:27 PDT

Modified

2020-07-31 11:26 PDT History

CC List

5 users Show

URL

Keywords InRadar

Depends on

Blocks

215026

Dependencies

tree graph