204769 – Do not retry the EWS build due to flaky failures in layout-test

RESOLVED FIXED 204769

Do not retry the EWS build due to flaky failures in layout-test

https://bugs.webkit.org/show_bug.cgi?id=204769

Summary Do not retry the EWS build due to flaky failures in layout-test

Aakash Jain

Reported 2019-12-02 14:39:36 PST

In EWS, when layout-tests run fails few tests, we re-run the tests. Then we run these tests on clean-tree, and compare the results of first run, second run, and clean tree run. In case there is flakiness (different tests fail in first and second runs), and no consistent failure is introduced by the patch being tested, we retry the entire build. We have this logic for years in old EWS and we copied it in new EWS. See: https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/tool/bot/patchanalysistask.py#L352 (Defer means retrying the build). ''' 350 # At this point we know that at least one test flaked, but no consistent failures 351 # were introduced. This is a bit of a grey-zone. 352 return False # Defer patch ''' Retrying the entire build, just because of few flaky tests seems excessive and inefficient. This means that if someone commits a very flaky test (which is not rare), then EWS will keep retrying the patch (almost in a infinite loop), and would be almost stuck until the flakiness is fixed. Also, we know that we already have various tests in trunk which are flaky. In other test-suites like API-tests and JSC-tests, we do not retry the build on flaky test failures, we ignore the flaky tests, and only focus on tests which failed consistently with the patch, and passed without the patch. We should do the similar thing for layout-tests as well.

Attachments
Patch (4.66 KB, patch) 2019-12-02 15:36 PST, Aakash Jain	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Aakash Jain

Comment 1 2019-12-02 14:46:50 PST

For e.g.: In https://ews-build.webkit.org/#/builders/17/builds/7537 first_run_failures: "loader/stateobjects/pushstate-size.html" second_run_failures: "loader/stateobjects/pushstate-size.html", "webaudio/audioparam-setTargetAtTime.html" clean_tree_run_failures: "loader/stateobjects/pushstate-size.html" There was a pre-existing test-failure on trunk, and just because of one flaky test (webaudio/audioparam-setTargetAtTime.html), entire build was retried. This build should have been marked as SUCCESS.

Aakash Jain

Comment 2 2019-12-02 15:36:30 PST

Created attachment 384663 [details] Patch

Jonathan Bedard

Comment 3 2019-12-03 08:05:13 PST

Comment on attachment 384663 [details] Patch Seems like this is what we want more often than not, but I still dislike our flakiness story....

WebKit Commit Bot

Comment 4 2019-12-03 09:56:22 PST

Comment on attachment 384663 [details] Patch Clearing flags on attachment: 384663 Committed r253049: <https://trac.webkit.org/changeset/253049>

WebKit Commit Bot

Comment 5 2019-12-03 09:56:23 PST

All reviewed patches have been landed. Closing bug.

Radar WebKit Bug Importer

Comment 6 2019-12-03 09:57:19 PST

<rdar://problem/57593402>

Aakash Jain

Comment 7 2019-12-03 13:08:33 PST

Deployed on production server.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version Other

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee

Aakash Jain

Reported

2019-12-02 14:39 PST

Modified

2019-12-03 13:08 PST History

CC List

10 users Show

URL

Keywords InRadar

Depends on

Blocks