<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>306336</bug_id>
          
          <creation_ts>2026-01-27 08:10:24 -0800</creation_ts>
          <short_desc>[webkitpy][run-webkit-tests] Wrong exit code and report when a test is repeated (via --repeat-each=X) and there is a mix of unexpected and expected results</short_desc>
          <delta_ts>2026-01-28 14:33:03 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>Tools / Tests</component>
          <version>WebKit Nightly Build</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>REOPENED</bug_status>
          <resolution></resolution>
          
          <see_also>https://bugs.webkit.org/show_bug.cgi?id=306451</see_also>
    
    <see_also>https://bugs.webkit.org/show_bug.cgi?id=306477</see_also>
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>InRadar</keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          <dependson>306460</dependson>
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Carlos Alberto Lopez Perez">clopez</reporter>
          <assigned_to name="Carlos Alberto Lopez Perez">clopez</assigned_to>
          <cc>bugs-noreply</cc>
    
    <cc>commit-queue</cc>
    
    <cc>csaavedra</cc>
    
    <cc>webkit-bug-importer</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>2175430</commentid>
    <comment_count>0</comment_count>
    <who name="Carlos Alberto Lopez Perez">clopez</who>
    <bug_when>2026-01-27 08:10:24 -0800</bug_when>
    <thetext>This has been observed here https://ews-build.webkit.org/#/builders/34/builds/107879 :
 - On the step `layout-tests-repeat-failures` the bot runs: Tools/Scripts/run-webkit-tests --no-build --no-show-results --no-new-test-results --clobber-old-results --release --wpe --results-directory layout-test-results --debug-rwt-logging --skip-failing-tests --fully-parallel --repeat-each=10 compositing/repaint/composited-document-element.html http/tests/blink/sendbeacon/beacon-cookie.html http/tests/security/contentSecurityPolicy/connect-src-eventsource-blocked.html http/tests/xmlhttprequest/logout.html imported/w3c/web-platform-tests/webrtc/RTCRtpSender-setParameters-keyFrame.html 

 - The result is:

05:40:06.162 521977 Testing completed, Exit status: 1
=&gt; Results: 36/50 tests passed (72.0%)

=&gt; Tests to be fixed (2):
      1 crashes                  (50.0%)

=&gt; Tests that will only be fixed if they crash (WONTFIX) (0):


Unexpected flakiness: text-only failures (2)
  http/tests/blink/sendbeacon/beacon-cookie.html [ Pass Failure ]
  http/tests/xmlhttprequest/logout.html [ Pass Failure ]

Unexpected flakiness: crashes (1)
  imported/w3c/web-platform-tests/webrtc/RTCRtpSender-setParameters-keyFrame.html [ Crash Pass ]


The exit code (1) is wrong. It should be zero, because every test with an unexpected result was reported as flaky and none as a regression in this run.

This causes an infrastructure error in the EWS logic: run-webkit-tests should not return a non-zero exit code unless it also produces a list of failed tests, and the EWS explicitly checks for this to guard against a patch that breaks the runner itself.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2175444</commentid>
    <comment_count>1</comment_count>
    <who name="Carlos Alberto Lopez Perez">clopez</who>
    <bug_when>2026-01-27 08:47:37 -0800</bug_when>
    <thetext>Pull request: https://github.com/WebKit/WebKit/pull/57336</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2175916</commentid>
    <comment_count>2</comment_count>
    <who name="EWS">ews-feeder</who>
    <bug_when>2026-01-28 12:58:11 -0800</bug_when>
    <thetext>Committed 306367@main (96d2789262f7): &lt;https://commits.webkit.org/306367@main&gt;

Reviewed commits have been landed. Closing PR #57336 and removing active labels.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2175917</commentid>
    <comment_count>3</comment_count>
    <who name="Radar WebKit Bug Importer">webkit-bug-importer</who>
    <bug_when>2026-01-28 12:59:15 -0800</bug_when>
    <thetext>&lt;rdar://problem/169119844&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2175930</commentid>
    <comment_count>4</comment_count>
    <who name="Carlos Alberto Lopez Perez">clopez</who>
    <bug_when>2026-01-28 13:23:33 -0800</bug_when>
    <thetext>I have discovered that this patch will break the step &quot;run-layout-tests-in-stress-mode&quot; that the EWS uses to find newly added flaky tests.
That step is expected to exit with an error when there is a flaky test. See https://ews-build.webkit.org/#/builders/169/builds/2488 
So I will revert this, land the anti-gardening at bug 306451 and go back to the drawing board.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2175939</commentid>
    <comment_count>5</comment_count>
    <who name="WebKit Commit Bot">commit-queue</who>
    <bug_when>2026-01-28 13:34:02 -0800</bug_when>
    <thetext>Re-opened since this is blocked by bug 306460</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2175976</commentid>
    <comment_count>6</comment_count>
    <who name="Carlos Alberto Lopez Perez">clopez</who>
    <bug_when>2026-01-28 14:26:41 -0800</bug_when>
    <thetext>In the previous patch I assumed &quot;a repeated test should only be considered a regression if _all_ of the results it generated were unexpected. Otherwise, if there is at least one PASS or one expected failure, it should be considered flaky instead.&quot; But maybe that is wrong, and it should be considered a regression if any (instead of all) of the results it generated were unexpected.
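
To make the two readings concrete, here is a minimal sketch in Python. It is only an illustration: classify_repeated_test, its arguments and the returned labels are hypothetical names, not the actual webkitpy code.

```python
# Hypothetical sketch; none of these names come from webkitpy.
def classify_repeated_test(results, expected, policy='all'):
    """Classify one test from the results of its repeated runs.

    results:  list of result strings, one per iteration.
    expected: set of result strings the expectations allow for this test.
    policy:   'all' means regression only if every run was unexpected;
              'any' means regression if at least one run was unexpected.
    """
    unexpected = [r for r in results if r not in expected]
    if not unexpected:
        return 'expected'
    if policy == 'all':
        # A single expected result is enough to downgrade to flaky.
        return 'regression' if len(unexpected) == len(results) else 'flaky'
    if policy == 'any':
        return 'regression'
    raise ValueError('unknown policy: ' + policy)


# Ten iterations (as with --repeat-each=10): nine passes and one crash.
runs = ['PASS'] * 9 + ['CRASH']
print(classify_repeated_test(runs, {'PASS'}, policy='all'))  # flaky
print(classify_repeated_test(runs, {'PASS'}, policy='any'))  # regression
```

With the &quot;all&quot; reading the run above would exit with zero (only flakies), while with the &quot;any&quot; reading it would exit with an error.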

Anyway, this is a complex topic. I think I&apos;m going to fix the EWS logic first instead, so that it deals with the case where run-webkit-tests exits with an error and there is only a list of flakies (but no non-flaky errors).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2175984</commentid>
    <comment_count>7</comment_count>
    <who name="Carlos Alberto Lopez Perez">clopez</who>
    <bug_when>2026-01-28 14:33:03 -0800</bug_when>
    <thetext>(In reply to Carlos Alberto Lopez Perez from comment #6)
&gt; Anyway, this is a complex topic, I think I&apos;m going to fix first the EWS
&gt; logic instead to deal with the case run-webkit-tests exits with error and
&gt; there is only a list of flakies (but not non-flaky errors)

See bug 306477</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>