204368 – EWS should use results database instead of doing clean-tree runs

RESOLVED FIXED204368

EWS should use results database instead of doing clean-tree runs

https://bugs.webkit.org/show_bug.cgi?id=204368

Summary EWS should use results database instead of doing clean-tree runs

Aakash Jain

Reported 2019-11-19 12:09:11 PST

EWS bots do clean-tree run when the test fails, in order to confirm if the test-failure is pre-existing or introduced by the patch. A lot of time, test-failure information is already available in results database (through build.webkit.org bots). We should try to get the clean-tree failure info from results database using its API, and avoid doing clean-tree runs. This would save a lot of time on EWS bots.

Attachments
Patch (4.86 KB, patch) 2019-11-20 11:08 PST, Jonathan Bedard	no flags	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.

Alexey Proskuryakov

Comment 1 2019-11-19 17:43:25 PST

How can we confirm that this will be sufficient? It feels like EWS-only failures are somewhat common. I think that doing this in addition to retries is better than "instead" as the bug suggests. We can still retry with a clean tree if the failures are unknown.

Jonathan Bedard

Comment 2 2019-11-20 09:45:44 PST

(In reply to Alexey Proskuryakov from comment #1) > How can we confirm that this will be sufficient? It feels like EWS-only > failures are somewhat common. > > I think that doing this in addition to retries is better than "instead" as > the bug suggests. We can still retry with a clean tree if the failures are > unknown. We could do the retry if the failures are unknown, yes. And we might start that way. I think the retry is generally a bad idea, though. It means that in the case where an engineer did introduce a regression, we are going to be taking significantly longer to actually report that regression than in the successful case. Given the longer turn around for layout tests, I think this pretty significantly reduces the usefulness of EWS. We're essentially delaying results in the exact circumstances those results would be most impactful. I suppose we make the bubble orange on retry, but I feel like we might be training folks to ignore orange bubbles given how flakey things are at the moment.

Alexey Proskuryakov

Comment 3 2019-11-20 10:31:45 PST

If we don't retry, then we will have a lot of false positives, training engineers to ignore EWS altogether, not just in orange state.

Jonathan Bedard

Comment 4 2019-11-20 11:08:15 PST Comment hidden (obsolete)

Created attachment 383973 [details] Patch

Jonathan Bedard

Comment 5 2019-11-20 11:14:39 PST

Comment on attachment 383973 [details] Patch This patch should have been uploaded to https://bugs.webkit.org/show_bug.cgi?id=204409.

Jonathan Bedard

Comment 6 2019-11-20 11:25:44 PST

(In reply to Alexey Proskuryakov from comment #3) > If we don't retry, then we will have a lot of false positives, training > engineers to ignore EWS altogether, not just in orange state. I'm not entirely convinced that we need retries to get rid of false positives. There isn't a good reason that clean tree runs in EWS should differ from trunk results at all, and if we're cross-referencing trunk results, we could even expand the set of commits we're looking at to find flakey tests. This approach hinges on us quickly fixing things like https://bugs.webkit.org/show_bug.cgi?id=204409, but I really do think that things like that are bugs, not expected behavior.

Alexey Proskuryakov

Comment 7 2019-11-20 15:23:45 PST

One big reason is that EWS is different hardware. We definitely have quite a few tests that are only flaky or failing on EWS.

Aakash Jain

Comment 8 2019-12-05 08:24:24 PST

Seems like one thing we agree on is that, if the test(s) which fails in EWS is also marked as failing in results database, then we can skip the retry/clean-tree-run. If the test failure is new (not visible in results database), then we can continue with existing retry/clean-tree-run logic. This would still be a significant efficiency improvement, given that most of the times test failures seen in EWS are also seen in build.webkit.org

Alexey Proskuryakov

Comment 9 2019-12-05 08:58:33 PST

Correct. I think that we should use a fairly strict definition of "failing in results database', something like the test having failed within 24 hours, and maybe even require the same OS/flavor.

Radar WebKit Bug Importer

Comment 10 2021-09-10 09:27:43 PDT

<rdar://problem/82975679>

Ryan Haddad

Comment 11 2022-11-09 14:45:47 PST

*** Bug 247700 has been marked as a duplicate of this bug. ***

Aakash Jain

Comment 12 2022-11-24 13:44:14 PST

This was earlier blocked by results database not supporting git-hash or commit-identifier, and then due to not having appropriate api endpoint which ews could talk to. Both these issues were resolved. Implementing this as discussed above.

Aakash Jain

Comment 13 2022-11-24 13:56:56 PST Comment hidden (obsolete)

Pull request: https://github.com/apple/WebKit/pull/223

Aakash Jain

Comment 14 2022-11-24 13:59:31 PST

Pull request: https://github.com/WebKit/WebKit/pull/6792

EWS

Comment 15 2022-11-28 14:44:34 PST

Committed 257090@main (b9824fc290ac): <https://commits.webkit.org/257090@main> Reviewed commits have been landed. Closing PR #6792 and removing active labels.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version Other

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee

Aakash Jain

Reported

2019-11-19 12:09 PST

Modified

2023-01-30 13:25 PST History

CC List

6 users Show

URL

Keywords InRadar

Duplicates (1)

247700 View as bug list

Depends on

Blocks

248037

Dependencies

tree graph