226273 – Layout tests should exit early if stress mode detects a failure

ASSIGNED 226273

Layout tests should exit early if stress mode detects a failure

https://bugs.webkit.org/show_bug.cgi?id=226273

Summary Layout tests should exit early if stress mode detects a failure

Aakash Jain

Reported 2021-05-26 08:26:59 PDT

Layout test build should exit early if stress mode detects a failure, i.e.: we do not need to run the complete layout-test suite (few times) if stress mode indicates that the patch being tested introduced test failure. This would reduce the time to declare failure from few hours to few minutes for the patches which breaks layout-tests while adding/modifying those layout-tests. Stress mode was added on various layout-test queues in https://bugs.webkit.org/show_bug.cgi?id=226097

Attachments
Patch (11.03 KB, patch) 2021-05-26 08:33 PDT, Aakash Jain	jbedard: review+	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Aakash Jain

Comment 1 2021-05-26 08:33:32 PDT

Created attachment 429762 [details] Patch

Aakash Jain

Comment 2 2021-05-26 08:41:37 PDT

Examples: Patch with failure in stress mode (with this early exit): https://ews-build.webkit-uat.org/#/builders/34/builds/31444 (build completed in less than a minute) Patch with failure in non-stress mode: https://ews-build.webkit-uat.org/#/builders/34/builds/31455 (build took ~2 hours), https://ews-build.webkit-uat.org/#/builders/34/builds/31462 (build took ~4 hours)

Alexey Proskuryakov

Comment 3 2021-05-26 10:13:29 PDT

Comment on attachment 429762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=429762&action=review > Tools/ChangeLog:3 > + Layout tests should exit early if stress mode detects a failure The downside to this change is that for patches that fail stress test AND break existing tests, EWS won't tell people anything about the latter. This will increase the number of times they will have to upload the patch for EWS testing.

Aakash Jain

Comment 4 2021-05-26 10:24:40 PDT

That's true. Although they will get the results much faster. So it's a small trade-off. Also, there would still be one queue where we run complete layout-test suite without the stress mode, which is Windows (since it builds and test in same build, and we haven't modified that queue yet). So, if the patch breaks existing tests on windows as well, then the EWS will tell about those failures.

Alexey Proskuryakov

Comment 5 2021-05-26 17:01:58 PDT

Comment on attachment 429762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=429762&action=review Seems OK to make this change, although ideally, this would produce a red bubble and mark the patch cq-, then run the regular tests anyway. I don't feel entirely qualified to r+ buildbot steps code. > Tools/CISupport/ews-build/factories.py:138 > + self.addStep(RunWebKitTestsInStressMode(num_iterations=10, stopBuildOnFailure=True)) Will TriggerCrashLogSubmission and SetBuildSummary still run if this fails?

Aakash Jain

Comment 6 2021-05-27 11:29:32 PDT

(In reply to Alexey Proskuryakov from comment #5) > Seems OK to make this change, although ideally, this would produce a red bubble and mark the patch cq-, then run the regular tests anyway. The purpose of this patch is to save build time by exiting early, which wouldn't be achieved in that case. > Will TriggerCrashLogSubmission and SetBuildSummary still run if this fails? TriggerCrashLogSubmission is not run when layout-tests fails (irrespective of this patch), just filed Bug 226348 for it. SetBuildSummary wouldn't be run, but similar action would be done by the SetCommitQueueMinusFlagOnPatch step.

Aakash Jain

Comment 7 2021-05-27 11:33:06 PDT

Another example where this would have saved significant build time (build would have taken 1-2 minutes instead of 3-4 hours, with same output): https://ews-build.webkit.org/#/builders/56/builds/8700, https://ews-build.webkit.org/#/builders/51/builds/14407

Jonathan Bedard

Comment 8 2021-05-27 11:38:59 PDT

Code looks good, although I'm a bit torn if this is actually the best behavior. I can see this being good in some cases, bad in others, although the efficiency wins probably out-weigh the badness

Ryan Haddad

Comment 9 2021-05-27 12:59:56 PDT

(In reply to Aakash Jain from comment #7) > Another example where this would have saved significant build time (build > would have taken 1-2 minutes instead of 3-4 hours, with same output): > https://ews-build.webkit.org/#/builders/56/builds/8700, > https://ews-build.webkit.org/#/builders/51/builds/14407 I share the same concerns about whether or not this will cause additional churn for EWS if the developer sees a failure with the new test, fixes it, then sees failures with existing tests on the next run. These are great examples of the potential for time savings though.. Maybe we can announce our intent to enable this on webkit-dev and see if there are enough concerns to warrant holding off? (In reply to Alexey Proskuryakov from comment #5) > Comment on attachment 429762 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=429762&action=review > > Seems OK to make this change, although ideally, this would produce a red > bubble and mark the patch cq-, then run the regular tests anyway. How difficult would it be to implement this kind of functionality?

Alexey Proskuryakov

Comment 10 2021-05-27 14:23:57 PDT

(In reply to Aakash Jain from comment #6) > (In reply to Alexey Proskuryakov from comment #5) > > Seems OK to make this change, although ideally, this would produce a red bubble and mark the patch cq-, then run the regular tests anyway. > The purpose of this patch is to save build time by exiting early, which > wouldn't be achieved in that case. This is a tradeoff between using bot time and providing all necessary data to the engineer. I guess I didn't fully understand the goal, but to me, the ideal behavior would prioritize engineering time, by providing some results quickly, and providing complete results a little slower. It's similar to how we don't stop all EWSes when one fails - we COULD stop everything if one platform doesn't build, or even if style check fails, but we try to paint a complete picture.

Jonathan Bedard

Comment 11 2021-05-27 14:35:50 PDT

What if we opted out of retires if stress tests failed? Wouldn't that speed things up considerably?

Radar WebKit Bug Importer

Comment 12 2021-06-02 08:27:18 PDT

<rdar://problem/78764585>

Aakash Jain

Comment 13 2022-01-25 12:46:36 PST

(In reply to Ryan Haddad from comment #9) > How difficult would it be to implement this kind of functionality? It's not difficult, although that would defeat the purpose of this patch which was to save time by exiting early. > What if we opted out of retries if stress tests failed? Wouldn't that speed things up considerably? If we go that route, we would still need to do run-layout-tests-without-patch, we might save one run (re-run-layout-tests), but the results might be more flaky.

Note You need to log in before you can comment on or make changes to this bug.

Status ASSIGNED

Resolution

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee

Aakash Jain

Reported

2021-05-26 08:26 PDT

Modified

2022-01-25 12:46 PST History

CC List

7 users Show

URL

Keywords InRadar

Depends on

Blocks