RESOLVED FIXED 204704
[EWS] Do not retry layout-tests build if the flaky test failures are also present in clean tree run
https://bugs.webkit.org/show_bug.cgi?id=204704
Summary [EWS] Do not retry layout-tests build if the flaky test failures are also pre...
Aakash Jain
Reported 2019-11-29 08:31:34 PST
In EWS, do not retry layout-tests build if the (flaky) test failures are also present in clean tree run. If same failures (whether flaky or consistent) are also found on clean tree, it is very unlikely that the failures were introduced by the patch. e.g.: In https://ews-build.webkit.org/#/builders/24/builds/5456, the first and second run failures were also present on clean tree. However the build was retried, instead it should be marked as passing.
Attachments
Patch (3.50 KB, patch)
2019-11-29 08:37 PST, Aakash Jain
no flags
Aakash Jain
Comment 1 2019-11-29 08:37:24 PST
Alexey Proskuryakov
Comment 2 2019-11-29 22:23:57 PST
Comment on attachment 384510 [details] Patch This is clearly an improvement, it's obvious that a patch with results like these should get a green bubble. I think that we are still too strict with declaring SUCCESS, and delay the verdict unnecessarily. One thing that doesn't make logical sense to me is that we completely ignore flaky tests if they pass on retry within a single test run - but tests that also fail on retry within the run yet pass on a new run are treated as ultra important.
Aakash Jain
Comment 3 2019-11-29 23:44:38 PST
(In reply to Alexey Proskuryakov from comment #2) > Comment on attachment 384510 [details] > > One thing that doesn't make logical sense to me is that we completely ignore > flaky tests if they pass on retry within a single test run - but tests that > also fail on retry within the run yet pass on a new run are treated as ultra > important. Completely agree. I don’t know why we retry so aggressively when a flakiness is encountered between two test runs. However, as you said, we do ignore flaky tests within each single test-run. Maybe we should take the intersection of test failures from first and second test runs, and compare it with clean tree run (unless there are 30+ failures in any run in which case we use special logic). This is what we do on API tests and JSC tests EWSes.
WebKit Commit Bot
Comment 4 2019-11-30 00:15:43 PST
Comment on attachment 384510 [details] Patch Clearing flags on attachment: 384510 Committed r252953: <https://trac.webkit.org/changeset/252953>
WebKit Commit Bot
Comment 5 2019-11-30 00:15:45 PST
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 6 2019-11-30 00:16:16 PST
Aakash Jain
Comment 7 2019-11-30 09:29:43 PST
Deployed on production server.
Alexey Proskuryakov
Comment 8 2019-11-30 15:59:59 PST
I think that taking an intersection would be an improvement in most cases. We can make the logic more subtle and detect patches that introduce substantial flakiness. E.g. a human would probably be concerned about a patch that makes 5 tests flaky when nothing is flaky on a clean run.
Aakash Jain
Comment 9 2019-12-01 08:32:51 PST
(In reply to Alexey Proskuryakov from comment #8) > I think that taking an intersection would be an improvement in most cases. Yeah, For e.g.: in https://ews-build.webkit.org/#/builders/17/builds/7537 there was one pre-existing failure on trunk. In second-run a flaky failure "webaudio/audioparam-setTargetAtTime.html" caused the entire build to be retried. When there is a pre-existing failure, any flaky failure would almost result in build being RETRIED for a good patch, while the build should be marked as SUCCESS (this is a huge waste of resources).
Jonathan Bedard
Comment 10 2019-12-02 08:43:10 PST
(In reply to Alexey Proskuryakov from comment #2) > Comment on attachment 384510 [details] > Patch ... > One thing that doesn't make logical sense to me is that we completely ignore > flaky tests if they pass on retry within a single test run - but tests that > also fail on retry within the run yet pass on a new run are treated as ultra > important. I agree we need a better way to handle this. Lately, every example I’ve seen of a test which passes during the retry in run-webkit-tests has had some sort of stateful dependency in the test which caused it to fail when run with other tests. I actually think that passing tests on the retry in run-webkit-tests should be fatal. With the current state of the tree, that’s usually a pretty uncontroverdial bug.
Aakash Jain
Comment 11 2019-12-02 14:52:33 PST
Going to fix the entire thing in Bug 204769. This bug covered only a specific case of flakiness, while Bug 204769 would be more generic and cover most of the flakiness scenarios. I would need to revert this change as it added explicit if condition for this specific scenario. It might be little unusual, but I am going to revert this commit (r252953) separately and after that create a clean patch for Bug 204769, so that it is easier to review.
Aakash Jain
Comment 12 2019-12-02 15:23:19 PST
Note You need to log in before you can comment on or make changes to this bug.