Bug 217972 - Make report-non-inclusive-language ignore autoinstalled directories and certain file types
Summary: Make report-non-inclusive-language ignore autoinstalled directories and certa...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Aakash Jain
URL:
Keywords: InRadar
Depends on:
Blocks: 213092
  Show dependency treegraph
 
Reported: 2020-10-20 10:42 PDT by Aakash Jain
Modified: 2021-05-17 10:01 PDT (History)
6 users (show)

See Also:


Attachments
Patch (2.64 KB, patch)
2020-10-20 10:47 PDT, Aakash Jain
bdakin: review+
Details | Formatted Diff | Diff
Patch (2.65 KB, patch)
2020-10-20 11:31 PDT, Aakash Jain
no flags Details | Formatted Diff | Diff
Patch (1.65 KB, patch)
2020-10-20 13:25 PDT, Aakash Jain
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Aakash Jain 2020-10-20 10:42:09 PDT
Make report-non-inclusive-language ignore following directories and files:

1) 'autoinstalled': The source code of packages inside autoinstalled directories (e.g.: Scripts/libraries/autoinstalled) are not part of our repository and is not controlled by us. So report-non-inclusive-language shouldn't count them. Currently the count is really skewed because of counting such autoinstalled directories.

2) Files ending with .swp: usually these are files created for files opened in other tabs).

3) Files ending with .log: usually these are not part of source code, and are simply logs. For e.g.: twistd.log while running buildbot instance locally.
Comment 1 Aakash Jain 2020-10-20 10:47:41 PDT
Created attachment 411887 [details]
Patch
Comment 2 Aakash Jain 2020-10-20 11:30:02 PDT
It should also ignore files ending in .pyc
Comment 3 Aakash Jain 2020-10-20 11:31:13 PDT
Created attachment 411892 [details]
Patch
Comment 4 Beth Dakin 2020-10-20 11:32:00 PDT
Comment on attachment 411887 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=411887&action=review

> Tools/Scripts/report-non-inclusive-language:47
> +IGNORE_FILES_ENDING_WITH = ('.order', '.xcuserstate', '.swp', '.log')

I saw your comment about adding .pyc here, but R+ for now anyway. This is great!
Comment 5 EWS 2020-10-20 12:13:27 PDT
Committed r268752: <https://trac.webkit.org/changeset/268752>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 411892 [details].
Comment 6 Radar WebKit Bug Importer 2020-10-20 12:14:20 PDT
<rdar://problem/70496351>
Comment 7 Darin Adler 2020-10-20 13:04:38 PDT
Comment on attachment 411892 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=411892&action=review

> Tools/Scripts/report-non-inclusive-language:61
> +    if any(directory in prefix for directory in IGNORE_DIRECTORIES):

This is IGNORE_DIRECTORIES or IGNORE_DIRECTORIES_CONTAINING?

It looks like we are using substring matching for directory names, but we want to know if this is exactly a directory name. Ideally this should not match "notautoinstalled".

I’d like a more correct implementation to be future-looking although for now none of those three strings seems like a problem.

I believe that some others at Apple have been using this script, so it’s good if it works well even outside WebKit, which is why it’s good to be careful about such things.
Comment 8 Aakash Jain 2020-10-20 13:14:43 PDT
(In reply to Darin Adler from comment #7)
> It looks like we are using substring matching for directory names, but we want to know if this is exactly a directory name. Ideally this should not match "notautoinstalled".
Good point. Should we do the exact match for the directories. That seems like the right thing to do. Proposed patch would be something like:

-    if any(directory in prefix for directory in IGNORE_DIRECTORIES):
+    if any(directory==prefix for directory in IGNORE_DIRECTORIES):
Comment 9 Aakash Jain 2020-10-20 13:18:07 PDT
(In reply to Aakash Jain from comment #8)
> (In reply to Darin Adler from comment #7)
> > It looks like we are using substring matching for directory names, but we want to know if this is exactly a directory name. Ideally this should not match "notautoinstalled".
> Good point. Should we do the exact match for the directories. That seems
> like the right thing to do. Proposed patch would be something like:
> 
> -    if any(directory in prefix for directory in IGNORE_DIRECTORIES):
> +    if any(directory==prefix for directory in IGNORE_DIRECTORIES):
Ignore my above comment, just noticed that prefix contains the complete path.
Comment 10 Aakash Jain 2020-10-20 13:24:36 PDT
Reopening to attach minor follow-up patch.
Comment 11 Aakash Jain 2020-10-20 13:25:58 PDT
Created attachment 411908 [details]
Patch
Comment 12 Darin Adler 2020-10-20 14:07:37 PDT
Comment on attachment 411908 [details]
Patch

For now, this is OK.
Comment 13 Darin Adler 2020-10-20 14:08:42 PDT
(In reply to Aakash Jain from comment #8)
> (In reply to Darin Adler from comment #7)
> > It looks like we are using substring matching for directory names, but we want to know if this is exactly a directory name. Ideally this should not match "notautoinstalled".
> Good point. Should we do the exact match for the directories. That seems
> like the right thing to do. Proposed patch would be something like:
> 
> -    if any(directory in prefix for directory in IGNORE_DIRECTORIES):
> +    if any(directory==prefix for directory in IGNORE_DIRECTORIES):

Maybe:

    if any(directory.endsWith("/" + prefix) for directory in IGNORE_DIRECTORIES):

Would that work?
Comment 14 EWS 2020-10-20 15:01:10 PDT
Committed r268763: <https://trac.webkit.org/changeset/268763>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 411908 [details].
Comment 15 Aakash Jain 2020-10-20 15:39:07 PDT
(In reply to Darin Adler from comment #13)
> Maybe:
> 
>     if any(directory.endsWith("/" + prefix) for directory in
> IGNORE_DIRECTORIES):
> 
> Would that work?
That doesn't seem to work. Uploaded patch in https://bugs.webkit.org/show_bug.cgi?id=217993 which seems to work fine.