<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>204704</bug_id>
          
          <creation_ts>2019-11-29 08:31:34 -0800</creation_ts>
          <short_desc>[EWS] Do not retry layout-tests build if the flaky test failures are also present in clean tree run</short_desc>
          <delta_ts>2019-12-02 15:24:34 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>Tools / Tests</component>
          <version>Other</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          <see_also>https://bugs.webkit.org/show_bug.cgi?id=204769</see_also>
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>InRadar</keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Aakash Jain">aakash_jain</reporter>
          <assigned_to name="Aakash Jain">aakash_jain</assigned_to>
          <cc>aakash_jain</cc>
    
    <cc>ap</cc>
    
    <cc>commit-queue</cc>
    
    <cc>jbedard</cc>
    
    <cc>webkit-bug-importer</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>1594202</commentid>
    <comment_count>0</comment_count>
    <who name="Aakash Jain">aakash_jain</who>
    <bug_when>2019-11-29 08:31:34 -0800</bug_when>
    <thetext>In EWS, do not retry layout-tests build if the (flaky) test failures are also present in clean tree run. If same failures (whether flaky or consistent) are also found on clean tree, it is very unlikely that the failures were introduced by the patch.

e.g.: In https://ews-build.webkit.org/#/builders/24/builds/5456, the first and second run failures were also present on clean tree. However the build was retried, instead it should be marked as passing.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594204</commentid>
    <comment_count>1</comment_count>
      <attachid>384510</attachid>
    <who name="Aakash Jain">aakash_jain</who>
    <bug_when>2019-11-29 08:37:24 -0800</bug_when>
    <thetext>Created attachment 384510
Patch</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594288</commentid>
    <comment_count>2</comment_count>
      <attachid>384510</attachid>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2019-11-29 22:23:57 -0800</bug_when>
    <thetext>Comment on attachment 384510
Patch

This is clearly an improvement, it&apos;s obvious that a patch with results like these should get a green bubble. I think that we are still too strict with declaring SUCCESS, and delay the verdict unnecessarily.

One thing that doesn&apos;t make logical sense to me is that we completely ignore flaky tests if they pass on retry within a single test run - but tests that also fail on retry within the run yet pass on a new run are treated as ultra important.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594289</commentid>
    <comment_count>3</comment_count>
    <who name="Aakash Jain">aakash_jain</who>
    <bug_when>2019-11-29 23:44:38 -0800</bug_when>
    <thetext>(In reply to Alexey Proskuryakov from comment #2)
&gt; Comment on attachment 384510 [details]
&gt; 
&gt; One thing that doesn&apos;t make logical sense to me is that we completely ignore
&gt; flaky tests if they pass on retry within a single test run - but tests that
&gt; also fail on retry within the run yet pass on a new run are treated as ultra
&gt; important.
Completely agree. I don’t know why we retry so aggressively when a flakiness is encountered between two test runs. However, as you said, we do ignore flaky tests within each single test-run.

Maybe we should take the intersection of test failures from first and second test runs, and compare it with clean tree run (unless there are 30+ failures in any run in which case we use special logic). This is what we do on API tests and JSC tests EWSes.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594290</commentid>
    <comment_count>4</comment_count>
      <attachid>384510</attachid>
    <who name="WebKit Commit Bot">commit-queue</who>
    <bug_when>2019-11-30 00:15:43 -0800</bug_when>
    <thetext>Comment on attachment 384510
Patch

Clearing flags on attachment: 384510

Committed r252953: &lt;https://trac.webkit.org/changeset/252953&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594291</commentid>
    <comment_count>5</comment_count>
    <who name="WebKit Commit Bot">commit-queue</who>
    <bug_when>2019-11-30 00:15:45 -0800</bug_when>
    <thetext>All reviewed patches have been landed.  Closing bug.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594292</commentid>
    <comment_count>6</comment_count>
    <who name="Radar WebKit Bug Importer">webkit-bug-importer</who>
    <bug_when>2019-11-30 00:16:16 -0800</bug_when>
    <thetext>&lt;rdar://problem/57537845&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594329</commentid>
    <comment_count>7</comment_count>
    <who name="Aakash Jain">aakash_jain</who>
    <bug_when>2019-11-30 09:29:43 -0800</bug_when>
    <thetext>Deployed on production server.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594356</commentid>
    <comment_count>8</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2019-11-30 15:59:59 -0800</bug_when>
    <thetext>I think that taking an intersection would be an improvement in most cases. We can make the logic more subtle and detect patches that introduce substantial flakiness. E.g. a human would probably be concerned about a patch that makes 5 tests flaky when nothing is flaky on a clean run.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594421</commentid>
    <comment_count>9</comment_count>
    <who name="Aakash Jain">aakash_jain</who>
    <bug_when>2019-12-01 08:32:51 -0800</bug_when>
    <thetext>(In reply to Alexey Proskuryakov from comment #8)
&gt; I think that taking an intersection would be an improvement in most cases.
Yeah, For e.g.: in https://ews-build.webkit.org/#/builders/17/builds/7537 there was one pre-existing failure on trunk. In second-run a flaky failure &quot;webaudio/audioparam-setTargetAtTime.html&quot; caused the entire build to be retried.

When there is a pre-existing failure, any flaky failure would almost result in build being RETRIED for a good patch, while the build should be marked as SUCCESS (this is a huge waste of resources).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594603</commentid>
    <comment_count>10</comment_count>
    <who name="Jonathan Bedard">jbedard</who>
    <bug_when>2019-12-02 08:43:10 -0800</bug_when>
    <thetext>(In reply to Alexey Proskuryakov from comment #2)
&gt; Comment on attachment 384510 [details]
&gt; Patch
...
&gt; One thing that doesn&apos;t make logical sense to me is that we completely ignore
&gt; flaky tests if they pass on retry within a single test run - but tests that
&gt; also fail on retry within the run yet pass on a new run are treated as ultra
&gt; important.

I agree we need a better way to handle this. Lately, every example I’ve seen of a test which passes during the retry in run-webkit-tests has had some sort of stateful dependency in the test which caused it to fail when run with other tests. I actually think that passing tests on the retry in run-webkit-tests should be fatal. With the current state of the tree, that’s usually a pretty uncontroverdial bug.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594807</commentid>
    <comment_count>11</comment_count>
    <who name="Aakash Jain">aakash_jain</who>
    <bug_when>2019-12-02 14:52:33 -0800</bug_when>
    <thetext>Going to fix the entire thing in Bug 204769.

This bug covered only a specific case of flakiness, while Bug 204769 would be more generic and cover most of the flakiness scenarios.

I would need to revert this change as it added explicit if condition for this specific scenario.

It might be little unusual, but I am going to revert this commit (r252953) separately and after that create a clean patch for Bug 204769, so that it is easier to review.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1594823</commentid>
    <comment_count>12</comment_count>
    <who name="Aakash Jain">aakash_jain</who>
    <bug_when>2019-12-02 15:23:19 -0800</bug_when>
    <thetext>(In reply to WebKit Commit Bot from comment #4)
&gt; Committed r252953: &lt;https://trac.webkit.org/changeset/252953&gt;
Reverted in https://trac.webkit.org/changeset/253014/webkit</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>384510</attachid>
            <date>2019-11-29 08:37:24 -0800</date>
            <delta_ts>2019-11-30 00:15:43 -0800</delta_ts>
            <desc>Patch</desc>
            <filename>bug-204704-20191129113723.patch</filename>
            <type>text/plain</type>
            <size>3585</size>
            <attacher name="Aakash Jain">aakash_jain</attacher>
            
              <data encoding="base64">SW5kZXg6IFRvb2xzL0NoYW5nZUxvZwo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBUb29scy9DaGFuZ2VMb2cJKHJl
dmlzaW9uIDI1Mjk0MikKKysrIFRvb2xzL0NoYW5nZUxvZwkod29ya2luZyBjb3B5KQpAQCAtMSwz
ICsxLDE2IEBACisyMDE5LTExLTI5ICBBYWthc2ggSmFpbiAgPGFha2FzaF9qYWluQGFwcGxlLmNv
bT4KKworICAgICAgICBbRVdTXSBEbyBub3QgcmV0cnkgbGF5b3V0LXRlc3RzIGJ1aWxkIGlmIHRo
ZSBmbGFreSB0ZXN0IGZhaWx1cmVzIGFyZSBhbHNvIHByZXNlbnQgaW4gY2xlYW4gdHJlZSBydW4K
KyAgICAgICAgaHR0cHM6Ly9idWdzLndlYmtpdC5vcmcvc2hvd19idWcuY2dpP2lkPTIwNDcwNAor
CisgICAgICAgIFJldmlld2VkIGJ5IE5PQk9EWSAoT09QUyEpLgorCisgICAgICAgICogQnVpbGRT
bGF2ZVN1cHBvcnQvZXdzLWJ1aWxkL3N0ZXBzLnB5OgorICAgICAgICAoQW5hbHl6ZUxheW91dFRl
c3RzUmVzdWx0cy5zdGFydCk6CisgICAgICAgICogQnVpbGRTbGF2ZVN1cHBvcnQvZXdzLWJ1aWxk
L3N0ZXBzX3VuaXR0ZXN0LnB5OgorICAgICAgICAoVGVzdEFuYWx5emVMYXlvdXRUZXN0c1Jlc3Vs
dHMudGVzdF9mbGFreV9hbmRfaW5jb25zaXN0ZW50X2ZhaWx1cmVzX3dpdGhfY2xlYW5fdHJlZV9m
YWlsdXJlcyk6IENoYW5nZWQgZXhwZWN0ZWQgb3V0Y29tZSB0byBTVUNDRVNTLgorICAgICAgICAo
VGVzdEFuYWx5emVMYXlvdXRUZXN0c1Jlc3VsdHMudGVzdF9mbGFreV9hbmRfY29uc2lzdGVudF9m
YWlsdXJlc193aXRoX2NsZWFuX3RyZWVfZmFpbHVyZXMpOiBEaXR0by4KKwogMjAxOS0xMS0yOSAg
UGF1bG8gTWF0b3MgIDxwbWF0b3NAaWdhbGlhLmNvbT4KIAogICAgICAgICBNb3ZlIGpzYy1pMzg2
IGZyb20gb2xkIHRvIG5ldyBFV1MKSW5kZXg6IFRvb2xzL0J1aWxkU2xhdmVTdXBwb3J0L2V3cy1i
dWlsZC9zdGVwcy5weQo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBUb29scy9CdWlsZFNsYXZlU3VwcG9ydC9ld3Mt
YnVpbGQvc3RlcHMucHkJKHJldmlzaW9uIDI1Mjk0MSkKKysrIFRvb2xzL0J1aWxkU2xhdmVTdXBw
b3J0L2V3cy1idWlsZC9zdGVwcy5weQkod29ya2luZyBjb3B5KQpAQCAtMTQ1Miw4ICsxNDUyLDEy
IEBAIGNsYXNzIEFuYWx5emVMYXlvdXRUZXN0c1Jlc3VsdHMoYnVpbGRzdGUKICAgICAgICAgICAg
ICAgICBpZiBmYWlsdXJlc19pbnRyb2R1Y2VkX2J5X3BhdGNoOgogICAgICAgICAgICAgICAgICAg
ICByZXR1cm4gc2VsZi5yZXBvcnRfZmFpbHVyZShmYWlsdXJlc19pbnRyb2R1Y2VkX2J5X3BhdGNo
KQogCi0gICAgICAgICAgICAjIEF0IHRoaXMgcG9pbnQgd2Uga25vdyB0aGF0IGF0IGxlYXN0IG9u
ZSB0ZXN0IGZsYWtlZCwgYnV0IG5vIGNvbnNpc3RlbnQgZmFpbHVyZXMKLSAgICAgICAgICAgICMg
d2VyZSBpbnRyb2R1Y2VkLiBUaGlzIGlzIGEgYml0IG9mIGEgZ3JleS16b25lLgorICAgICAgICAg
ICAgbmV3X2ZhaWxpbmdfb3JfZmxha3lfdGVzdHMgPSBmaXJzdF9yZXN1bHRzX2ZhaWxpbmdfdGVz
dHMudW5pb24oc2Vjb25kX3Jlc3VsdHNfZmFpbGluZ190ZXN0cykgLSBjbGVhbl90cmVlX3Jlc3Vs
dHNfZmFpbGluZ190ZXN0cworICAgICAgICAgICAgaWYgbm90IG5ld19mYWlsaW5nX29yX2ZsYWt5
X3Rlc3RzOgorICAgICAgICAgICAgICAgIHJldHVybiBzZWxmLnJlcG9ydF9wcmVfZXhpc3Rpbmdf
ZmFpbHVyZXMoY2xlYW5fdHJlZV9yZXN1bHRzX2ZhaWxpbmdfdGVzdHMpCisKKyAgICAgICAgICAg
ICMgQXQgdGhpcyBwb2ludCB3ZSBrbm93IHRoYXQgYXQgbGVhc3Qgb25lIHRlc3QgZmxha2VkLCBh
bmQgdGhvc2UgZmxha3kgdGVzdHMgcGFzc2VkIG9uIGNsZWFuIHRyZWUsCisgICAgICAgICAgICAj
IGFuZCBubyBjb25zaXN0ZW50IGZhaWx1cmVzIHdlcmUgaW50cm9kdWNlZC4gVGhpcyBpcyBhIGJp
dCBvZiBhIGdyZXktem9uZS4gU28sIHdlIHJldHJ5IHRoZSBidWlsZC4KICAgICAgICAgICAgIHJl
dHVybiBzZWxmLnJldHJ5X2J1aWxkKCkKIAogICAgICAgICBpZiBjbGVhbl90cmVlX3Jlc3VsdHNf
ZGlkX2V4Y2VlZF90ZXN0X2ZhaWx1cmVfbGltaXQ6CkluZGV4OiBUb29scy9CdWlsZFNsYXZlU3Vw
cG9ydC9ld3MtYnVpbGQvc3RlcHNfdW5pdHRlc3QucHkKPT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gVG9vbHMvQnVp
bGRTbGF2ZVN1cHBvcnQvZXdzLWJ1aWxkL3N0ZXBzX3VuaXR0ZXN0LnB5CShyZXZpc2lvbiAyNTI5
NDEpCisrKyBUb29scy9CdWlsZFNsYXZlU3VwcG9ydC9ld3MtYnVpbGQvc3RlcHNfdW5pdHRlc3Qu
cHkJKHdvcmtpbmcgY29weSkKQEAgLTE2NDcsNyArMTY0Nyw3IEBAIGNsYXNzIFRlc3RBbmFseXpl
TGF5b3V0VGVzdHNSZXN1bHRzKEJ1aWwKICAgICAgICAgc2VsZi5zZXRQcm9wZXJ0eSgnZmlyc3Rf
cnVuX2ZhaWx1cmVzJywgWyd0ZXN0MScsICd0ZXN0MiddKQogICAgICAgICBzZWxmLnNldFByb3Bl
cnR5KCdzZWNvbmRfcnVuX2ZhaWx1cmVzJywgWyd0ZXN0MyddKQogICAgICAgICBzZWxmLnNldFBy
b3BlcnR5KCdjbGVhbl90cmVlX3J1bl9mYWlsdXJlcycsIFsndGVzdDEnLCAndGVzdDInLCAndGVz
dDMnXSkKLSAgICAgICAgc2VsZi5leHBlY3RPdXRjb21lKHJlc3VsdD1SRVRSWSwgc3RhdGVfc3Ry
aW5nPSdVbmFibGUgdG8gY29uZmlybSBpZiB0ZXN0IGZhaWx1cmVzIGFyZSBpbnRyb2R1Y2VkIGJ5
IHBhdGNoLCByZXRyeWluZyBidWlsZCAocmV0cnkpJykKKyAgICAgICAgc2VsZi5leHBlY3RPdXRj
b21lKHJlc3VsdD1TVUNDRVNTLCBzdGF0ZV9zdHJpbmc9J1Bhc3NlZCBsYXlvdXQgdGVzdHMnKQog
ICAgICAgICByZXR1cm4gc2VsZi5ydW5TdGVwKCkKIAogICAgIGRlZiB0ZXN0X2ZsYWt5X2FuZF9j
b25zaXN0ZW50X2ZhaWx1cmVzX3dpdGhfY2xlYW5fdHJlZV9mYWlsdXJlcyhzZWxmKToKQEAgLTE2
NTUsNyArMTY1NSw3IEBAIGNsYXNzIFRlc3RBbmFseXplTGF5b3V0VGVzdHNSZXN1bHRzKEJ1aWwK
ICAgICAgICAgc2VsZi5zZXRQcm9wZXJ0eSgnZmlyc3RfcnVuX2ZhaWx1cmVzJywgWyd0ZXN0MScs
ICd0ZXN0MiddKQogICAgICAgICBzZWxmLnNldFByb3BlcnR5KCdzZWNvbmRfcnVuX2ZhaWx1cmVz
JywgWyd0ZXN0MSddKQogICAgICAgICBzZWxmLnNldFByb3BlcnR5KCdjbGVhbl90cmVlX3J1bl9m
YWlsdXJlcycsIFsndGVzdDEnLCAndGVzdDInXSkKLSAgICAgICAgc2VsZi5leHBlY3RPdXRjb21l
KHJlc3VsdD1SRVRSWSwgc3RhdGVfc3RyaW5nPSdVbmFibGUgdG8gY29uZmlybSBpZiB0ZXN0IGZh
aWx1cmVzIGFyZSBpbnRyb2R1Y2VkIGJ5IHBhdGNoLCByZXRyeWluZyBidWlsZCAocmV0cnkpJykK
KyAgICAgICAgc2VsZi5leHBlY3RPdXRjb21lKHJlc3VsdD1TVUNDRVNTLCBzdGF0ZV9zdHJpbmc9
J1Bhc3NlZCBsYXlvdXQgdGVzdHMnKQogICAgICAgICByZXR1cm4gc2VsZi5ydW5TdGVwKCkKIAog
ICAgIGRlZiB0ZXN0X2ZpcnN0X3J1bl9leGNlZWRfZmFpbHVyZV9saW1pdChzZWxmKToK
</data>

          </attachment>
      

    </bug>

</bugzilla>