231241 – [webkitpy] The actual results reported for a flaky test shouldn't include the expectation

RESOLVED FIXED 231241


https://bugs.webkit.org/show_bug.cgi?id=231241


Carlos Alberto Lopez Perez

Reported 2021-10-05 11:12:45 PDT

This is about the following corner case: a test is marked as flaky, fails on the first run (the result doesn't match the expectation), but passes on the retry. In this case, at the end of the run we currently report as the actual results of the test the union of the first-run result and the expectations.

For example, when a test is marked as "[ Crash Timeout Pass ]" and the run gives the results "Fail" and then "Pass", we currently report the actual results of the test as "[ Fail Crash Timeout Pass ]", even though "Crash" and "Timeout" didn't happen. This makes it difficult to know when a test stops giving a specific result: once it is marked as flaky with a set of results, there is no way to know whether some of the results in the set no longer happen.

Here is a test case for this:

1. Apply this test patch <http://sprunge.us/MqyGkc> to add a test that fails 50% of the time and passes the other 50%, and that is marked with this expectation:

   fast/random/fails-half.html [ Crash Timeout Pass ]

2. Run the test like this:

   $ Tools/Scripts/run-webkit-tests --no-build --no-show-results --release --results-directory layout-test-results --debug-rwt-logging fast/random/fails-half.html

Repeat the run above until the test fails on the first run and passes on the second. When that happens you will see this <http://sprunge.us/vwM9B7>:

   Unexpected flakiness: text-only failures (1)
     fast/random/fails-half.html [ Crash Timeout Pass Failure ]

And in the JSON file at layout-test-results/full_results.json you will see:

   ADD_RESULTS({"tests":{"fast":{"random":{"fails-half.html":{"report":"FLAKY","expected":"PASS TIMEOUT CRASH","actual":"TEXT PASS TIMEOUT CRASH"}}}}

So I think this is misleading, because the _actual_ result of this test was "FAIL/TEXT" and "PASS"; it never gave TIMEOUT or CRASH.
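To make the difference concrete, here is a minimal Python sketch of the two reporting behaviours. It is not the webkitpy code and not the attached patch: the function names (reported_actual_with_expectations, reported_actual_observed_only, stale_expectations) are hypothetical, and results are modelled as plain lists of the tokens that appear in full_results.json.

```python
# Hypothetical illustration only; these helpers do not exist in webkitpy.

def reported_actual_with_expectations(expected, observed):
    """Behaviour described in this report: observed results plus every expectation."""
    tokens = list(observed)
    tokens += [e for e in expected if e not in tokens]
    return " ".join(tokens)


def reported_actual_observed_only(observed):
    """Desired behaviour: only results that really occurred, without duplicates."""
    seen = []
    for result in observed:
        if result not in seen:
            seen.append(result)
    return " ".join(seen)


def stale_expectations(expected, observed):
    """Expectations that were never observed in this run."""
    return [e for e in expected if e not in observed]


expected = ["PASS", "TIMEOUT", "CRASH"]
observed = ["TEXT", "PASS"]  # first run failed with a text diff, retry passed

print(reported_actual_with_expectations(expected, observed))  # TEXT PASS TIMEOUT CRASH
print(reported_actual_observed_only(observed))                # TEXT PASS
print(stale_expectations(expected, observed))                 # ['TIMEOUT', 'CRASH']
```

With the observed-only form, comparing "actual" against "expected" across runs shows which expectations (here TIMEOUT and CRASH) have stopped happening, which the union form hides.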

Attachments
Patch (5.83 KB, patch) 2021-10-05 11:42 PDT, Carlos Alberto Lopez Perez, no flags

Carlos Alberto Lopez Perez

Comment 1 2021-10-05 11:42:31 PDT

Created attachment 440241: Patch

Patch fixing the issue. This is the output that the above example gives after this patch: http://sprunge.us/ta54tE

Radar WebKit Bug Importer

Comment 2 2021-10-12 11:13:17 PDT

<rdar://problem/84156894>

Jonathan Bedard

Comment 3 2021-10-12 11:49:57 PDT

Comment on attachment 440241: Patch

Sorry for the delay in looking at this. I was a bit worried about the --repeat case, where a test has multiple results. It looks like this performs as expected in that case, though.

EWS

Comment 4 2021-10-14 14:18:52 PDT

Committed r284198 (243012@main): <https://commits.webkit.org/243012@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 440241.


Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee Carlos Alberto Lopez Perez

Reported 2021-10-05 11:12 PDT

Modified 2021-10-14 18:42 PDT

CC List 8 users

Keywords InRadar