RESOLVED FIXED Bug 214954
check-webkit-style should enforce acronym capitalization at start/end of an identifier
https://bugs.webkit.org/show_bug.cgi?id=214954
Summary check-webkit-style should enforce acronym capitalization at start/end of an i...
David Kilzer (:ddkilzer)
Reported 2020-07-29 18:27:58 PDT
check-webkit-style should enforce acronym capitalization at start/end of an identifier. Specifically, for "URL". See Bug 214756 Comment #19.
Attachments
Patch v1 (14.28 KB, patch)
2020-07-29 21:09 PDT, David Kilzer (:ddkilzer)
no flags
David Kilzer (:ddkilzer)
Comment 1 2020-07-29 18:30:13 PDT
We can now find the incorrectly capitalized identifiers in DumpRenderTree.mm: $ ./Tools/Scripts/check-webkit-style --filter-rules=-,+readability/naming/acronym Tools/DumpRenderTree/mac/DumpRenderTree.mm ERROR: Tools/DumpRenderTree/mac/DumpRenderTree.mm:297: The identifier name "URLString" starts with a acronym that is not all lowercase. [readability/naming/acronym] [5] ERROR: Tools/DumpRenderTree/mac/DumpRenderTree.mm:1648: The identifier name "testPathOrUrl" ends with a acronym that is not all uppercase. [readability/naming/acronym] [5] ERROR: Tools/DumpRenderTree/mac/DumpRenderTree.mm:1650: The identifier name "localPathOrUrl" ends with a acronym that is not all uppercase. [readability/naming/acronym] [5] Total errors found: 3 in 1 files And verify that Source/WTF has no incorrectly named variables (might be false negatives, but no false positives): $ ./Tools/Scripts/check-webkit-style --filter-rules=-,+readability/naming/acronym Source/WTF Total errors found: 0 in 828 files
David Kilzer (:ddkilzer)
Comment 2 2020-07-29 21:09:07 PDT
Created attachment 405543 [details] Patch v1
David Kilzer (:ddkilzer)
Comment 3 2020-07-29 21:52:30 PDT
BTW, I don't plan on running the script on existing code and fixing issues right away since I have other priorities to work on.
Jonathan Bedard
Comment 4 2020-07-30 07:28:47 PDT
Comment on attachment 405543 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=405543&action=review > Tools/Scripts/webkitpy/style/checkers/cpp.py:1820 > + # FIXME: Hard to check middle words without knowing that the word to the left doesn't end with an acronym. We already have a list of acronyms, can't we just do 'starts with' iterating through our list?
David Kilzer (:ddkilzer)
Comment 5 2020-07-30 10:20:22 PDT
Comment on attachment 405543 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=405543&action=review > Tools/Scripts/webkitpy/style/checkers/cpp.py:1805 > + acronyms = '|'.join(['MIME', 'URL']) @Jonathan: This list of acronyms could be replaced by the other list you mentioned. Can you tell me where the list is defined? >> Tools/Scripts/webkitpy/style/checkers/cpp.py:1820 >> + # FIXME: Hard to check middle words without knowing that the word to the left doesn't end with an acronym. > > We already have a list of acronyms, can't we just do 'starts with' iterating through our list? Where is the list of acronyms? I wasn't aware of it until you mentioned it. (See previous comment.) Also, I'm not sure how a list of acronyms would help with this FIXME. The case I'm concerned about is something (made-up) like: canUnfurlFlag canCurlHair Searching the middle of a variable for "url" (in this case) might return a false positive, which is what the comment was about. I was thinking more about this after posting the patch, and I think a better algorithm may be to: - Split camelCase identifier into individual words using the start of a capital letter followed by a lowercase letter (after the initial word which would be all-uppercase or all-lowercase). - Compare each individual word in the list to a list of acronyms, and determine whether its case is wrong whether it's first or not-first in the list. The algorithm wouldn't be perfect, but would probably do a better job (and be easier to understand) than this code: - HTMLDOMDocument => (HTMLDOM, Document) - canUnfurlFlag => (can, Unfurl, Flag) - canCurlHair => (can Curl Hair) - URLString => (URL, String) - localPathOrUrl => (local, Path, Or, Url) - MIMEStringForURL => (MIME, String, For, URL) Now this algorithm assumes each word is spelled correctly, but we're not trying to solve correct spelling of camelCase words, so that's okay.
Jonathan Bedard
Comment 6 2020-07-30 10:27:29 PDT
Comment on attachment 405543 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=405543&action=review >> Tools/Scripts/webkitpy/style/checkers/cpp.py:1805 >> + acronyms = '|'.join(['MIME', 'URL']) > > @Jonathan: This list of acronyms could be replaced by the other list you mentioned. Can you tell me where the list is defined? I was actually referring to this list here, although you explanation of the FIXME and false-positive matches makes it more clear the cases you had in mind. >>> Tools/Scripts/webkitpy/style/checkers/cpp.py:1820 >>> + # FIXME: Hard to check middle words without knowing that the word to the left doesn't end with an acronym. >> >> We already have a list of acronyms, can't we just do 'starts with' iterating through our list? > > Where is the list of acronyms? I wasn't aware of it until you mentioned it. (See previous comment.) > > Also, I'm not sure how a list of acronyms would help with this FIXME. The case I'm concerned about is something (made-up) like: > > canUnfurlFlag > canCurlHair > > Searching the middle of a variable for "url" (in this case) might return a false positive, which is what the comment was about. > > I was thinking more about this after posting the patch, and I think a better algorithm may be to: > > - Split camelCase identifier into individual words using the start of a capital letter followed by a lowercase letter (after the initial word which would be all-uppercase or all-lowercase). > - Compare each individual word in the list to a list of acronyms, and determine whether its case is wrong whether it's first or not-first in the list. > > The algorithm wouldn't be perfect, but would probably do a better job (and be easier to understand) than this code: > > - HTMLDOMDocument => (HTMLDOM, Document) > - canUnfurlFlag => (can, Unfurl, Flag) > - canCurlHair => (can Curl Hair) > - URLString => (URL, String) > - localPathOrUrl => (local, Path, Or, Url) > - MIMEStringForURL => (MIME, String, For, URL) > > Now this algorithm assumes each word is spelled correctly, but we're not trying to solve correct spelling of camelCase words, so that's okay. I was referring to your above list. (seems like DOM and HTML should be in it, though) If we want to do the more thorough camelCase splitting, that really feels like a different patch, since it's quite a bit more complicated then the startswith case I suggested.
David Kilzer (:ddkilzer)
Comment 7 2020-07-30 14:32:59 PDT
Comment on attachment 405543 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=405543&action=review >>> Tools/Scripts/webkitpy/style/checkers/cpp.py:1805 >>> + acronyms = '|'.join(['MIME', 'URL']) >> >> @Jonathan: This list of acronyms could be replaced by the other list you mentioned. Can you tell me where the list is defined? > > I was actually referring to this list here, although you explanation of the FIXME and false-positive matches makes it more clear the cases you had in mind. I should note that I wanted to start with a small list that was easy for others to add to. I don't have time to check for false positives for a ton of acronyms up-front. >>>> Tools/Scripts/webkitpy/style/checkers/cpp.py:1820 >>>> + # FIXME: Hard to check middle words without knowing that the word to the left doesn't end with an acronym. >>> >>> We already have a list of acronyms, can't we just do 'starts with' iterating through our list? >> >> Where is the list of acronyms? I wasn't aware of it until you mentioned it. (See previous comment.) >> >> Also, I'm not sure how a list of acronyms would help with this FIXME. The case I'm concerned about is something (made-up) like: >> >> canUnfurlFlag >> canCurlHair >> >> Searching the middle of a variable for "url" (in this case) might return a false positive, which is what the comment was about. >> >> I was thinking more about this after posting the patch, and I think a better algorithm may be to: >> >> - Split camelCase identifier into individual words using the start of a capital letter followed by a lowercase letter (after the initial word which would be all-uppercase or all-lowercase). >> - Compare each individual word in the list to a list of acronyms, and determine whether its case is wrong whether it's first or not-first in the list. >> >> The algorithm wouldn't be perfect, but would probably do a better job (and be easier to understand) than this code: >> >> - HTMLDOMDocument => (HTMLDOM, Document) >> - canUnfurlFlag => (can, Unfurl, Flag) >> - canCurlHair => (can Curl Hair) >> - URLString => (URL, String) >> - localPathOrUrl => (local, Path, Or, Url) >> - MIMEStringForURL => (MIME, String, For, URL) >> >> Now this algorithm assumes each word is spelled correctly, but we're not trying to solve correct spelling of camelCase words, so that's okay. > > I was referring to your above list. (seems like DOM and HTML should be in it, though) > > If we want to do the more thorough camelCase splitting, that really feels like a different patch, since it's quite a bit more complicated then the startswith case I suggested. Okay, I can do a follow-up patch. This code seems like such a hack, though. :)
EWS
Comment 8 2020-07-30 14:44:38 PDT
Committed r265096: <https://trac.webkit.org/changeset/265096> All reviewed patches have been landed. Closing bug and clearing flags on attachment 405543 [details].
Radar WebKit Bug Importer
Comment 9 2020-07-30 14:46:26 PDT
Note You need to log in before you can comment on or make changes to this bug.