Bug 226273 - Layout tests should exit early if stress mode detects a failure
Summary: Layout tests should exit early if stress mode detects a failure
Status: ASSIGNED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Aakash Jain
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-05-26 08:26 PDT by Aakash Jain
Modified: 2022-01-25 12:46 PST (History)
7 users (show)

See Also:


Attachments
Patch (11.03 KB, patch)
2021-05-26 08:33 PDT, Aakash Jain
jbedard: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Aakash Jain 2021-05-26 08:26:59 PDT
Layout test build should exit early if stress mode detects a failure, i.e.: we do not need to run the complete layout-test suite (few times) if stress mode indicates that the patch being tested introduced test failure. This would reduce the time to declare failure from few hours to few minutes for the patches which breaks layout-tests while adding/modifying those layout-tests.

Stress mode was added on various layout-test queues in https://bugs.webkit.org/show_bug.cgi?id=226097
Comment 1 Aakash Jain 2021-05-26 08:33:32 PDT
Created attachment 429762 [details]
Patch
Comment 2 Aakash Jain 2021-05-26 08:41:37 PDT
Examples:
Patch with failure in stress mode (with this early exit): https://ews-build.webkit-uat.org/#/builders/34/builds/31444 (build completed in less than a minute)

Patch with failure in non-stress mode: https://ews-build.webkit-uat.org/#/builders/34/builds/31455 (build took ~2 hours), https://ews-build.webkit-uat.org/#/builders/34/builds/31462 (build took ~4 hours)
Comment 3 Alexey Proskuryakov 2021-05-26 10:13:29 PDT
Comment on attachment 429762 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=429762&action=review

> Tools/ChangeLog:3
> +        Layout tests should exit early if stress mode detects a failure

The downside to this change is that for patches that fail stress test AND break existing tests, EWS won't tell people anything about the latter. This will increase the number of times they will have to upload the patch for EWS testing.
Comment 4 Aakash Jain 2021-05-26 10:24:40 PDT
That's true. Although they will get the results much faster. So it's a small trade-off.

Also, there would still be one queue where we run complete layout-test suite without the stress mode, which is Windows (since it builds and test in same build, and we haven't modified that queue yet). So, if the patch breaks existing tests on windows as well, then the EWS will tell about those failures.
Comment 5 Alexey Proskuryakov 2021-05-26 17:01:58 PDT
Comment on attachment 429762 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=429762&action=review

Seems OK to make this change, although ideally, this would produce a red bubble and mark the patch cq-, then run the regular tests anyway.

I don't feel entirely qualified to r+ buildbot steps code.

> Tools/CISupport/ews-build/factories.py:138
> +            self.addStep(RunWebKitTestsInStressMode(num_iterations=10, stopBuildOnFailure=True))

Will TriggerCrashLogSubmission and SetBuildSummary still run if this fails?
Comment 6 Aakash Jain 2021-05-27 11:29:32 PDT
(In reply to Alexey Proskuryakov from comment #5)
> Seems OK to make this change, although ideally, this would produce a red bubble and mark the patch cq-, then run the regular tests anyway.
The purpose of this patch is to save build time by exiting early, which wouldn't be achieved in that case.
 
> Will TriggerCrashLogSubmission and SetBuildSummary still run if this fails?
TriggerCrashLogSubmission is not run when layout-tests fails (irrespective of this patch), just filed Bug 226348 for it. 
SetBuildSummary wouldn't be run, but similar action would be done by the SetCommitQueueMinusFlagOnPatch step.
Comment 7 Aakash Jain 2021-05-27 11:33:06 PDT
Another example where this would have saved significant build time (build would have taken 1-2 minutes instead of 3-4 hours, with same output):
https://ews-build.webkit.org/#/builders/56/builds/8700, https://ews-build.webkit.org/#/builders/51/builds/14407
Comment 8 Jonathan Bedard 2021-05-27 11:38:59 PDT
Code looks good, although I'm a bit torn if this is actually the best behavior. I can see this being good in some cases, bad in others, although the efficiency wins probably out-weigh the badness
Comment 9 Ryan Haddad 2021-05-27 12:59:56 PDT
(In reply to Aakash Jain from comment #7)
> Another example where this would have saved significant build time (build
> would have taken 1-2 minutes instead of 3-4 hours, with same output):
> https://ews-build.webkit.org/#/builders/56/builds/8700,
> https://ews-build.webkit.org/#/builders/51/builds/14407
I share the same concerns about whether or not this will cause additional churn for EWS if the developer sees a failure with the new test, fixes it, then sees failures with existing tests on the next run. These are great examples of the potential for time savings though..

Maybe we can announce our intent to enable this on webkit-dev and see if there are enough concerns to warrant holding off?

(In reply to Alexey Proskuryakov from comment #5)
> Comment on attachment 429762 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=429762&action=review
> 
> Seems OK to make this change, although ideally, this would produce a red
> bubble and mark the patch cq-, then run the regular tests anyway.
How difficult would it be to implement this kind of functionality?
Comment 10 Alexey Proskuryakov 2021-05-27 14:23:57 PDT
(In reply to Aakash Jain from comment #6)
> (In reply to Alexey Proskuryakov from comment #5)
> > Seems OK to make this change, although ideally, this would produce a red bubble and mark the patch cq-, then run the regular tests anyway.
> The purpose of this patch is to save build time by exiting early, which
> wouldn't be achieved in that case.

This is a tradeoff between using bot time and providing all necessary data to the engineer. I guess I didn't fully understand the goal, but to me, the ideal behavior would prioritize engineering time, by providing some results quickly, and providing complete results a little slower.

It's similar to how we don't stop all EWSes when one fails - we COULD stop everything if one platform doesn't build, or even if style check fails, but we try to paint a complete picture.
Comment 11 Jonathan Bedard 2021-05-27 14:35:50 PDT
What if we opted out of retires if stress tests failed? Wouldn't that speed things up considerably?
Comment 12 Radar WebKit Bug Importer 2021-06-02 08:27:18 PDT
<rdar://problem/78764585>
Comment 13 Aakash Jain 2022-01-25 12:46:36 PST
(In reply to Ryan Haddad from comment #9)
> How difficult would it be to implement this kind of functionality?
It's not difficult, although that would defeat the purpose of this patch which was to 
save time by exiting early.

> What if we opted out of retries if stress tests failed? Wouldn't that speed things up considerably?
If we go that route, we would still need to do run-layout-tests-without-patch, we might save one run (re-run-layout-tests), but the results might be more flaky.