Bug 306336

Summary: [webkitpy][run-webkit-tests] Wrong exit code and report when a test is repeated (via --repeat-each=X) and there is a mix of unexpected and expected results
Product: WebKit Reporter: Carlos Alberto Lopez Perez <clopez>
Component: Tools / Tests Assignee: Carlos Alberto Lopez Perez <clopez>
Status: REOPENED    
Severity: Normal CC: bugs-noreply, commit-queue, csaavedra, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=306451
https://bugs.webkit.org/show_bug.cgi?id=306477
Bug Depends on: 306460    
Bug Blocks:    

Carlos Alberto Lopez Perez
Reported 2026-01-27 08:10:24 PST
This has been observed here: https://ews-build.webkit.org/#/builders/34/builds/107879

- On the step `layout-tests-repeat-failures` the bot runs:

    Tools/Scripts/run-webkit-tests --no-build --no-show-results --no-new-test-results --clobber-old-results --release --wpe --results-directory layout-test-results --debug-rwt-logging --skip-failing-tests --fully-parallel --repeat-each=10 compositing/repaint/composited-document-element.html http/tests/blink/sendbeacon/beacon-cookie.html http/tests/security/contentSecurityPolicy/connect-src-eventsource-blocked.html http/tests/xmlhttprequest/logout.html imported/w3c/web-platform-tests/webrtc/RTCRtpSender-setParameters-keyFrame.html

- The result is:

    05:40:06.162 521977 Testing completed, Exit status: 1
    => Results: 36/50 tests passed (72.0%)
    => Tests to be fixed (2): 1 crashes (50.0%)
    => Tests that will only be fixed if they crash (WONTFIX) (0):
    Unexpected flakiness: text-only failures (2)
      http/tests/blink/sendbeacon/beacon-cookie.html [ Pass Failure ]
      http/tests/xmlhttprequest/logout.html [ Pass Failure ]
    Unexpected flakiness: crashes (1)
      imported/w3c/web-platform-tests/webrtc/RTCRtpSender-setParameters-keyFrame.html [ Crash Pass ]

The exit code (1) is wrong. It should be zero, because every test with unexpected results was reported as flaky and none as a regression in this run.

This causes an infrastructure error in the EWS logic: run-webkit-tests should not return an error (non-zero exit code) unless it also produces a list of failed tests, and the EWS explicitly checks for this to guard against a patch that breaks the runner itself.
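The EWS guard described in this report can be illustrated with a minimal sketch. This is not the actual EWS code: the function name `classify_step`, its parameters, and the returned labels are hypothetical and only model the rule that a non-zero exit code must come with a list of failed tests.

```python
from typing import List


def classify_step(exit_code: int, failed_tests: List[str]) -> str:
    """Hypothetical model of the EWS check described in this bug.

    A non-zero exit code is only trusted as a test failure when the
    runner also reported which tests failed. Otherwise the runner
    itself is assumed to be broken.
    """
    if exit_code == 0:
        return "success"
    if failed_tests:
        return "failure"
    # Non-zero exit code but no failed tests listed (for example, only
    # flaky tests were reported): treated as an infrastructure error.
    return "infrastructure-error"


# The run in this report: exit status 1, three flaky tests, no regressions.
print(classify_step(1, []))
```

Under these assumptions the run above lands in the last branch, which matches the infrastructure error seen on the bot.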
Carlos Alberto Lopez Perez
Comment 1 2026-01-27 08:47:37 PST
EWS
Comment 2 2026-01-28 12:58:11 PST
Committed 306367@main (96d2789262f7): <https://commits.webkit.org/306367@main> Reviewed commits have been landed. Closing PR #57336 and removing active labels.
Radar WebKit Bug Importer
Comment 3 2026-01-28 12:59:15 PST
Carlos Alberto Lopez Perez
Comment 4 2026-01-28 13:23:33 PST
I have discovered that this patch will break the step "run-layout-tests-in-stress-mode" that the EWS uses to detect newly added flaky tests. That step is expected to exit with an error when there is a flaky test. See https://ews-build.webkit.org/#/builders/169/builds/2488
So I will revert this, land the anti-gardening at bug 306451 and go back to the drawing board.
WebKit Commit Bot
Comment 5 2026-01-28 13:34:02 PST
Re-opened since this is blocked by bug 306460
Carlos Alberto Lopez Perez
Comment 6 2026-01-28 14:26:41 PST
In the previous patch I assumed "a repeated test should only be considered a regression if _all_ of the results it generated were unexpected. Otherwise, if there is only one PASS or only one expected failure it should be considered flaky instead." But maybe that is wrong, and it should be considered a regression if any (instead of all) of the results it generated were unexpected. Anyway, this is a complex topic. I think I'm going to first fix the EWS logic instead, so it can deal with the case where run-webkit-tests exits with an error and there is only a list of flakies (but no non-flaky errors).
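To make the two candidate policies concrete, here is a minimal sketch. It is not the webkitpy implementation: `classify_repeated`, its parameters, and the result strings are hypothetical and only illustrate the "all" versus "any" rule discussed in this comment.

```python
from typing import List, Set


def classify_repeated(results: List[str], expected: Set[str],
                      regression_if: str = "all") -> str:
    """Classify the results of one test run several times.

    results:       outcome of each repetition, e.g. ["PASS", "FAIL"]
    expected:      outcomes allowed by the test expectations
    regression_if: "all" -> regression only if every result was unexpected
                   "any" -> regression if at least one result was unexpected
    """
    if regression_if not in ("all", "any"):
        raise ValueError("regression_if must be 'all' or 'any'")
    unexpected = [r for r in results if r not in expected]
    if not unexpected:
        return "expected"
    if regression_if == "all" and len(unexpected) < len(results):
        # Mixed expected and unexpected results: treat as flaky.
        return "flaky"
    return "regression"


mixed = ["PASS", "FAIL", "PASS"]
print(classify_repeated(mixed, {"PASS"}, regression_if="all"))
print(classify_repeated(mixed, {"PASS"}, regression_if="any"))
```

With a mix of expected and unexpected results, the "all" policy reports the test as flaky while the "any" policy reports a regression, which is the difference that affects the exit code and the stress-mode step.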
Carlos Alberto Lopez Perez
Comment 7 2026-01-28 14:33:03 PST
(In reply to Carlos Alberto Lopez Perez from comment #6)
> Anyway, this is a complex topic, I think I'm going to fix first the EWS
> logic instead to deal with the case run-webkit-tests exits with error and
> there is only a list of flakies (but not non-flaky errors)

See bug 306477