Bug 231790

Summary:

[webkitpy] webkit-test-runner doesn't report all results when a test is run several times

Product:

WebKit

Reporter:

Carlos Alberto Lopez Perez <clopez>

Component:

Tools / Tests

Assignee:

Carlos Alberto Lopez Perez <clopez>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

aakash_jain, ap, clopez, dewei_zhu, ews-watchlist, glenn, jbedard, ryanhaddad, webkit-bug-importer

Priority:

Keywords:

InRadar

Version:

WebKit Nightly Build

Hardware:

Unspecified

OS:

Unspecified

See Also:

https://bugs.webkit.org/show_bug.cgi?id=229788
https://bugs.webkit.org/show_bug.cgi?id=231241
https://bugs.webkit.org/show_bug.cgi?id=231999

Attachments:

Description	Flags
Patch	none

Carlos Alberto Lopez Perez

Reported 2021-10-14 18:22:04 PDT

webkit-test-runner does not correctly report flaky tests when a test is run more than once (for example, when passing the flag --repeat-each=X or when specifying the test twice or more on the command line).

This is due to how webkit-test-runner accounts for the test result values. This is what happens when the flag --repeat-each=X is passed:

1. Run the test X times and, if it fails on any of those runs, annotate one of the failure types (for example: timeout).
2. Run the test one more time (the retry step) and, if it gives a result different from the failure previously annotated, mark the test as "Valuefrom_1 | Valuefrom_retry".

This has several issues:

- The first issue is that when, in step 1, a test gives different results like "Pass, Timeout, Fail", webkit-test-runner picks only one of the failures. For example, for a test that gives:
    - Pass
    - Pass
    - Timeout
    - Failure
    - Pass
    - Failure
    - Timeout
    - Pass
  it will pick "Timeout" as the value from the first run. Then, on the retry step, suppose the test gives "Pass". webkit-test-runner will then mark the test as flaky "[ Timeout Pass ]", which is wrong because it has completely ignored the "Failure" value.

- The second issue happens when you pass the flag "--no-retry-failures". Since there is no retry step, it just picks the last failure value. So in the previous example it will mark the test as "[ Timeout ]" and will not report the test as flaky.

You can verify the current bug with the following patch, which adds a test named fast/random/fails-timeout-pass.html that will "Pass|Fail|Timeout" with equal probability (1/3 of the time each).

To test this, apply the patch http://sprunge.us/88Ks7Y and run webkit-test-runner as follows:

1. $ Tools/Scripts/run-webkit-tests --debug-rwt-logging --release --repeat-each=10 --time-out-ms=1000 fast/random/fails-timeout-pass.html

   You should see the behaviour described above (it picks only one failure value from the first try, plus the value of the retry step if that one is different).

2. Now run it passing "--no-retry-failures":

   $ Tools/Scripts/run-webkit-tests --debug-rwt-logging --release --repeat-each=10 --time-out-ms=1000 --no-retry-failures fast/random/fails-timeout-pass.html

   You will see that it picks only one failure value, ignoring any pass and any other different failure value.

Note: if you get a Python KeyError exception, apply the fix from bug 229788 (if it has not landed yet).

See for example this test run that just happened here, http://sprunge.us/CRNNCn :

- On the first 10 runs it gave "pass, pass, fail, pass, fail, timeout, timeout, pass, fail, timeout", on the retry step it gave "timeout", and WTR marked this as [ Timeout ] (not flaky).
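
The accounting this report asks for, where every distinct result seen across the repeated runs is kept instead of a single failure value, could be sketched roughly as follows. This is only an illustration under stated assumptions: `summarize_runs` is a hypothetical helper that does not exist in webkitpy, and the result names are plain strings rather than webkitpy's own result types.

```python
def summarize_runs(results):
    """Collapse the results of repeated runs of one test.

    Hypothetical helper for illustration; not part of webkitpy.
    `results` is a list of result names such as "Pass" or "Timeout",
    one entry per run (including any retry run).
    Returns (label, is_flaky).
    """
    # dict.fromkeys keeps the first occurrence of each value and
    # preserves insertion order (Python 3.7+), so this de-duplicates
    # the results without losing any distinct value.
    distinct = list(dict.fromkeys(results))
    label = "[ " + " ".join(distinct) + " ]"
    # A test is flaky when the runs did not all agree.
    is_flaky = len(distinct) > 1
    return label, is_flaky


# The sequence from the description above.
runs = ["Pass", "Pass", "Timeout", "Failure",
        "Pass", "Failure", "Timeout", "Pass"]
print(summarize_runs(runs))  # ('[ Pass Timeout Failure ]', True)
```

With this kind of accounting the example sequence is reported as flaky with all three values, whether or not a retry step runs afterwards.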

Attachments
Patch (4.90 KB, patch) 2021-10-14 18:43 PDT, Carlos Alberto Lopez Perez	no flags

Carlos Alberto Lopez Perez

Comment 1 2021-10-14 18:43:16 PDT

Created attachment 441316: Patch

Radar WebKit Bug Importer

Comment 2 2021-10-21 18:23:12 PDT

<rdar://problem/84531445>

Carlos Alberto Lopez Perez

Comment 3 2021-10-24 05:32:56 PDT

ping reviewers?

EWS

Comment 4 2021-10-25 08:17:53 PDT

Committed r284784 (243493@main): <https://commits.webkit.org/243493@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 441316.
