ASSIGNED 226273
Layout tests should exit early if stress mode detects a failure
https://bugs.webkit.org/show_bug.cgi?id=226273
Summary Layout tests should exit early if stress mode detects a failure
Aakash Jain
Reported 2021-05-26 08:26:59 PDT
Layout test build should exit early if stress mode detects a failure, i.e.: we do not need to run the complete layout-test suite (few times) if stress mode indicates that the patch being tested introduced test failure. This would reduce the time to declare failure from few hours to few minutes for the patches which breaks layout-tests while adding/modifying those layout-tests. Stress mode was added on various layout-test queues in https://bugs.webkit.org/show_bug.cgi?id=226097
Attachments
Patch (11.03 KB, patch)
2021-05-26 08:33 PDT, Aakash Jain
jbedard: review+
Aakash Jain
Comment 1 2021-05-26 08:33:32 PDT
Aakash Jain
Comment 2 2021-05-26 08:41:37 PDT
Examples: Patch with failure in stress mode (with this early exit): https://ews-build.webkit-uat.org/#/builders/34/builds/31444 (build completed in less than a minute) Patch with failure in non-stress mode: https://ews-build.webkit-uat.org/#/builders/34/builds/31455 (build took ~2 hours), https://ews-build.webkit-uat.org/#/builders/34/builds/31462 (build took ~4 hours)
Alexey Proskuryakov
Comment 3 2021-05-26 10:13:29 PDT
Comment on attachment 429762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=429762&action=review > Tools/ChangeLog:3 > + Layout tests should exit early if stress mode detects a failure The downside to this change is that for patches that fail stress test AND break existing tests, EWS won't tell people anything about the latter. This will increase the number of times they will have to upload the patch for EWS testing.
Aakash Jain
Comment 4 2021-05-26 10:24:40 PDT
That's true. Although they will get the results much faster. So it's a small trade-off. Also, there would still be one queue where we run complete layout-test suite without the stress mode, which is Windows (since it builds and test in same build, and we haven't modified that queue yet). So, if the patch breaks existing tests on windows as well, then the EWS will tell about those failures.
Alexey Proskuryakov
Comment 5 2021-05-26 17:01:58 PDT
Comment on attachment 429762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=429762&action=review Seems OK to make this change, although ideally, this would produce a red bubble and mark the patch cq-, then run the regular tests anyway. I don't feel entirely qualified to r+ buildbot steps code. > Tools/CISupport/ews-build/factories.py:138 > + self.addStep(RunWebKitTestsInStressMode(num_iterations=10, stopBuildOnFailure=True)) Will TriggerCrashLogSubmission and SetBuildSummary still run if this fails?
Aakash Jain
Comment 6 2021-05-27 11:29:32 PDT
(In reply to Alexey Proskuryakov from comment #5) > Seems OK to make this change, although ideally, this would produce a red bubble and mark the patch cq-, then run the regular tests anyway. The purpose of this patch is to save build time by exiting early, which wouldn't be achieved in that case. > Will TriggerCrashLogSubmission and SetBuildSummary still run if this fails? TriggerCrashLogSubmission is not run when layout-tests fails (irrespective of this patch), just filed Bug 226348 for it. SetBuildSummary wouldn't be run, but similar action would be done by the SetCommitQueueMinusFlagOnPatch step.
Aakash Jain
Comment 7 2021-05-27 11:33:06 PDT
Another example where this would have saved significant build time (build would have taken 1-2 minutes instead of 3-4 hours, with same output): https://ews-build.webkit.org/#/builders/56/builds/8700, https://ews-build.webkit.org/#/builders/51/builds/14407
Jonathan Bedard
Comment 8 2021-05-27 11:38:59 PDT
Code looks good, although I'm a bit torn if this is actually the best behavior. I can see this being good in some cases, bad in others, although the efficiency wins probably out-weigh the badness
Ryan Haddad
Comment 9 2021-05-27 12:59:56 PDT
(In reply to Aakash Jain from comment #7) > Another example where this would have saved significant build time (build > would have taken 1-2 minutes instead of 3-4 hours, with same output): > https://ews-build.webkit.org/#/builders/56/builds/8700, > https://ews-build.webkit.org/#/builders/51/builds/14407 I share the same concerns about whether or not this will cause additional churn for EWS if the developer sees a failure with the new test, fixes it, then sees failures with existing tests on the next run. These are great examples of the potential for time savings though.. Maybe we can announce our intent to enable this on webkit-dev and see if there are enough concerns to warrant holding off? (In reply to Alexey Proskuryakov from comment #5) > Comment on attachment 429762 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=429762&action=review > > Seems OK to make this change, although ideally, this would produce a red > bubble and mark the patch cq-, then run the regular tests anyway. How difficult would it be to implement this kind of functionality?
Alexey Proskuryakov
Comment 10 2021-05-27 14:23:57 PDT
(In reply to Aakash Jain from comment #6) > (In reply to Alexey Proskuryakov from comment #5) > > Seems OK to make this change, although ideally, this would produce a red bubble and mark the patch cq-, then run the regular tests anyway. > The purpose of this patch is to save build time by exiting early, which > wouldn't be achieved in that case. This is a tradeoff between using bot time and providing all necessary data to the engineer. I guess I didn't fully understand the goal, but to me, the ideal behavior would prioritize engineering time, by providing some results quickly, and providing complete results a little slower. It's similar to how we don't stop all EWSes when one fails - we COULD stop everything if one platform doesn't build, or even if style check fails, but we try to paint a complete picture.
Jonathan Bedard
Comment 11 2021-05-27 14:35:50 PDT
What if we opted out of retires if stress tests failed? Wouldn't that speed things up considerably?
Radar WebKit Bug Importer
Comment 12 2021-06-02 08:27:18 PDT
Aakash Jain
Comment 13 2022-01-25 12:46:36 PST
(In reply to Ryan Haddad from comment #9) > How difficult would it be to implement this kind of functionality? It's not difficult, although that would defeat the purpose of this patch which was to save time by exiting early. > What if we opted out of retries if stress tests failed? Wouldn't that speed things up considerably? If we go that route, we would still need to do run-layout-tests-without-patch, we might save one run (re-run-layout-tests), but the results might be more flaky.
Note You need to log in before you can comment on or make changes to this bug.