Bug 231190 - [ews-build.webkit.org] check-patch-relevance can get stuck
Summary: [ews-build.webkit.org] check-patch-relevance can get stuck
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Jonathan Bedard
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-10-04 13:04 PDT by Jonathan Bedard
Modified: 2021-10-05 08:23 PDT (History)
5 users (show)

See Also:


Attachments
Patch (6.37 KB, patch)
2021-10-04 13:07 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (6.37 KB, patch)
2021-10-04 14:45 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (6.63 KB, patch)
2021-10-04 16:44 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan Bedard 2021-10-04 13:04:46 PDT
With large patches, check-patch-relevance can get stuck. Part of this is because of the constant re-computation of regexes, but it's always possible for extremely large patches to jam up EWS. Along with making check-patch-relevance more efficient, we should provide a timeout.
Comment 1 Jonathan Bedard 2021-10-04 13:05:00 PDT
<rdar://problem/77327168>
Comment 2 Jonathan Bedard 2021-10-04 13:07:45 PDT
Created attachment 440094 [details]
Patch
Comment 3 Aakash Jain 2021-10-04 13:52:24 PDT
Comment on attachment 440094 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=440094&action=review

> Tools/CISupport/ews-build/steps.py:467
>          for change in patch.splitlines():

You can consider adding a check here if the line is a filename or not. We actually need to match only the filenames of file being modified in the patch, not the complete patch content. 
e.g.:
if not (change.startswith(b'---') or change.startswith(b'+++')):
    continue
Comment 4 Aakash Jain 2021-10-04 13:52:42 PDT
Also, can you please profile the relevant code portion before and after your change (maybe on your local computer), especially on attachment 440066 [details] and attachment 427346 [details].
Comment 5 Jonathan Bedard 2021-10-04 14:45:03 PDT
Created attachment 440105 [details]
Patch
Comment 6 Jonathan Bedard 2021-10-04 14:47:33 PDT
(In reply to Aakash Jain from comment #4)
> Also, can you please profile the relevant code portion before and after your
> change (maybe on your local computer), especially on attachment 440066 [details]
> [details] and attachment 427346 [details].

Did some profiling.

The kicker is really line size. If we limit line size to 250, the code in the attached patch takes less than a half-second.
Comment 7 Alexey Proskuryakov 2021-10-04 16:27:37 PDT
Comment on attachment 440105 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=440105&action=review

> Tools/CISupport/ews-build/steps.py:399
> +    bindings_regexes = [

I think that the new name doesn't describe the variable well. Either bindings_paths or bindings_path_regexes would be better in my opinion.

Same comment for all the others below.

> Tools/CISupport/ews-build/steps.py:471
> +                if regex.search(change[:250]):

[:250] doesn't appear to be explained, and I don't think that it's even accurate to only look at the first 250 characters (.*jsc.* could be anywhere).
Comment 8 Jonathan Bedard 2021-10-04 16:35:51 PDT
(In reply to Alexey Proskuryakov from comment #7)
> Comment on attachment 440105 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=440105&action=review
> 
> > Tools/CISupport/ews-build/steps.py:399
> > +    bindings_regexes = [
> 
> I think that the new name doesn't describe the variable well. Either
> bindings_paths or bindings_path_regexes would be better in my opinion.
> 
> Same comment for all the others below.

Will change the variable names.

> 
> > Tools/CISupport/ews-build/steps.py:471
> > +                if regex.search(change[:250]):
> 
> [:250] doesn't appear to be explained, and I don't think that it's even
> accurate to only look at the first 250 characters (.*jsc.* could be
> anywhere).

I'll add this to the changelog, but it's the really bit lines that end up being what kills our performance. It's true that we're potentially sacrificing some correctness. But it's also unlikely that a patch will have a line that exceeds 250 characters and contains a legitimate match in truncated portion of the line, a match which does not appear in another line in the patch.
Comment 9 Jonathan Bedard 2021-10-04 16:44:43 PDT
Created attachment 440124 [details]
Patch
Comment 10 EWS 2021-10-04 17:52:30 PDT
Committed r283528 (242495@main): <https://commits.webkit.org/242495@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 440124 [details].
Comment 11 Radar WebKit Bug Importer 2021-10-04 17:53:21 PDT
<rdar://problem/83864363>
Comment 12 Aakash Jain 2021-10-05 08:23:08 PDT
Confirmed that EWS can now process similar patch efficiently. e.g.: https://ews-build.webkit.org/#/builders/1/builds/50144 took only 1 second in check-patch-relevance. Earlier it was taking 30+ minutes in similar patch in https://ews-build.webkit.org/#/builders/1/builds/50001