Summary: | [EWS] Do not retry layout-tests build if the flaky test failures are also present in clean tree run | ||||||
---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Aakash Jain <aakash_jain> | ||||
Component: | Tools / Tests | Assignee: | Aakash Jain <aakash_jain> | ||||
Status: | RESOLVED FIXED | ||||||
Severity: | Normal | CC: | aakash_jain, ap, commit-queue, jbedard, webkit-bug-importer | ||||
Priority: | P2 | Keywords: | InRadar | ||||
Version: | Other | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
See Also: | https://bugs.webkit.org/show_bug.cgi?id=204769 | ||||||
Attachments: |
|
Description
Aakash Jain
2019-11-29 08:31:34 PST
Created attachment 384510 [details]
Patch
Comment on attachment 384510 [details]
Patch
This is clearly an improvement, it's obvious that a patch with results like these should get a green bubble. I think that we are still too strict with declaring SUCCESS, and delay the verdict unnecessarily.
One thing that doesn't make logical sense to me is that we completely ignore flaky tests if they pass on retry within a single test run - but tests that also fail on retry within the run yet pass on a new run are treated as ultra important.
(In reply to Alexey Proskuryakov from comment #2) > Comment on attachment 384510 [details] > > One thing that doesn't make logical sense to me is that we completely ignore > flaky tests if they pass on retry within a single test run - but tests that > also fail on retry within the run yet pass on a new run are treated as ultra > important. Completely agree. I don’t know why we retry so aggressively when a flakiness is encountered between two test runs. However, as you said, we do ignore flaky tests within each single test-run. Maybe we should take the intersection of test failures from first and second test runs, and compare it with clean tree run (unless there are 30+ failures in any run in which case we use special logic). This is what we do on API tests and JSC tests EWSes. Comment on attachment 384510 [details] Patch Clearing flags on attachment: 384510 Committed r252953: <https://trac.webkit.org/changeset/252953> All reviewed patches have been landed. Closing bug. Deployed on production server. I think that taking an intersection would be an improvement in most cases. We can make the logic more subtle and detect patches that introduce substantial flakiness. E.g. a human would probably be concerned about a patch that makes 5 tests flaky when nothing is flaky on a clean run. (In reply to Alexey Proskuryakov from comment #8) > I think that taking an intersection would be an improvement in most cases. Yeah, For e.g.: in https://ews-build.webkit.org/#/builders/17/builds/7537 there was one pre-existing failure on trunk. In second-run a flaky failure "webaudio/audioparam-setTargetAtTime.html" caused the entire build to be retried. When there is a pre-existing failure, any flaky failure would almost result in build being RETRIED for a good patch, while the build should be marked as SUCCESS (this is a huge waste of resources). (In reply to Alexey Proskuryakov from comment #2) > Comment on attachment 384510 [details] > Patch ... > One thing that doesn't make logical sense to me is that we completely ignore > flaky tests if they pass on retry within a single test run - but tests that > also fail on retry within the run yet pass on a new run are treated as ultra > important. I agree we need a better way to handle this. Lately, every example I’ve seen of a test which passes during the retry in run-webkit-tests has had some sort of stateful dependency in the test which caused it to fail when run with other tests. I actually think that passing tests on the retry in run-webkit-tests should be fatal. With the current state of the tree, that’s usually a pretty uncontroverdial bug. Going to fix the entire thing in Bug 204769. This bug covered only a specific case of flakiness, while Bug 204769 would be more generic and cover most of the flakiness scenarios. I would need to revert this change as it added explicit if condition for this specific scenario. It might be little unusual, but I am going to revert this commit (r252953) separately and after that create a clean patch for Bug 204769, so that it is easier to review. (In reply to WebKit Commit Bot from comment #4) > Committed r252953: <https://trac.webkit.org/changeset/252953> Reverted in https://trac.webkit.org/changeset/253014/webkit |