<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>96041</bug_id>
          
          <creation_ts>2012-09-06 16:56:25 -0700</creation_ts>
          <short_desc>Chromium Linux EWS bots and CQ bots are flaky</short_desc>
          <delta_ts>2012-09-17 13:50:24 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>Tools / Tests</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>WORKSFORME</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Tony Chang">tony</reporter>
          <assigned_to name="Tony Chang">tony</assigned_to>
          <cc>abarth</cc>
    
    <cc>dpranke</cc>
    
    <cc>jamesr</cc>
    
    <cc>japhet</cc>
    
    <cc>wjmaclean</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>713863</commentid>
    <comment_count>0</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-06 16:56:25 -0700</bug_when>
    <thetext>The bots keep failing the layout tests and retrying.  This is causing the queue to get really slow.  Filing this bug for tracking and discussion.

Looking at the logs, it looks like the platform/chromium-linux/compositing/gestures tests are often failing image diffs.  I ssh&apos;ed to the machine and looked at the results.  The actual results for some of those tests are a solid black 800x600 png.

We don&apos;t see this failure on the build.webkit.org or build.chromium.org waterfalls.

These tests should use the software path, right?

I see a few other failures, but it&apos;s not clear to me if the bots would process faster if we marked platform/chromium-linux/compositing/gestures as flaky.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>713864</commentid>
    <comment_count>1</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-06 16:57:16 -0700</bug_when>
    <thetext>The platform/chromium-linux/compositing/gestures tests were added on Aug 22.  It&apos;s not clear to me if the flakiness started around then or after that.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>713869</commentid>
    <comment_count>2</comment_count>
    <who name="James Robinson">jamesr</who>
    <bug_when>2012-09-06 17:06:00 -0700</bug_when>
    <thetext>Because there&apos;s &quot;compositing&quot; in the path these will use the h/w path (which is backed by osmesa).  These are new tests and I&apos;m not shocked that they are kind of messed up.  Let&apos;s skip them or mark them flaky and let wjmaclean@ work on fixing them.  They aren&apos;t worth holding everything else up.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>713878</commentid>
    <comment_count>3</comment_count>
      <attachid>162623</attachid>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-06 17:14:40 -0700</bug_when>
    <thetext>Created attachment 162623
Patch</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>713880</commentid>
    <comment_count>4</comment_count>
      <attachid>162623</attachid>
    <who name="Adam Barth">abarth</who>
    <bug_when>2012-09-06 17:18:24 -0700</bug_when>
    <thetext>Comment on attachment 162623
Patch

ok</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>713881</commentid>
    <comment_count>5</comment_count>
    <who name="Adam Barth">abarth</who>
    <bug_when>2012-09-06 17:18:31 -0700</bug_when>
    <thetext>Thanks for investigating.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>713887</commentid>
    <comment_count>6</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-06 17:20:51 -0700</bug_when>
    <thetext>Committed r127803: &lt;http://trac.webkit.org/changeset/127803&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>713889</commentid>
    <comment_count>7</comment_count>
      <attachid>162623</attachid>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-06 17:21:38 -0700</bug_when>
    <thetext>Comment on attachment 162623
Patch

This is just speculative, so I&apos;m keeping the bug open.  Hopefully the cr-linux queue will clear overnight.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>713959</commentid>
    <comment_count>8</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-06 18:25:02 -0700</bug_when>
    <thetext>Looking at the CQ now, there are 5 runs that failed.
Fails a bunch of compositing tests: http://webkit-commit-queue.appspot.com/results/13778383

2 http cache tests with missing results: http://webkit-commit-queue.appspot.com/results/13775546 http://webkit-commit-queue.appspot.com/results/13785213 http://webkit-commit-queue.appspot.com/results/13785209 http://webkit-commit-queue.appspot.com/results/13765808

I wonder if the http cache test failures are related to https://bugs.webkit.org/show_bug.cgi?id=93195 .  Not sure why they suddenly became flaky.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>714645</commentid>
    <comment_count>9</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-07 10:05:27 -0700</bug_when>
    <thetext>Looking at the ews bot, 2 http tests seem super flaky:

  http/tests/cache/stopped-revalidation.html = MISSING
  http/tests/cache/subresource-expiration-1.html = MISSING

Here are the diffs:
http://pastebin.com/hLfRDTmp
http://pastebin.com/vk2uz3dh

Looks like neither test is registering dumpAsText() and the second test is getting the output from the first test.

I think we have a bug for tests getting out of sync.  I&apos;m going to mark these 2 tests as flaky while we investigate.

It looks like notifyDone is getting out of sync with the tests.  Maybe we&apos;re not properly clearing the work queue between tests?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>714654</commentid>
    <comment_count>10</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-07 10:08:46 -0700</bug_when>
    <thetext>http://trac.webkit.org/changeset/127883</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>714712</commentid>
    <comment_count>11</comment_count>
    <who name="James Robinson">jamesr</who>
    <bug_when>2012-09-07 10:39:46 -0700</bug_when>
    <thetext>One of the platform/chromium-linux/compositing/gestures tests involves a navigation - perhaps it&apos;s mucking things up?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>714777</commentid>
    <comment_count>12</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-07 11:19:09 -0700</bug_when>
    <thetext>Now I&apos;m seeing
  http/tests/cache/subresource-expiration-2.html = MISSING
  http/tests/cache/subresource-failover-to-network.html = MISSING

But I am able to repro with:
  new-run-webkit-tests --no-new-test-results --skip-failing-tests --verbose http

I&apos;ll do some digging . . .</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>714792</commentid>
    <comment_count>13</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-07 11:28:51 -0700</bug_when>
    <thetext>http://trac.webkit.org/changeset/127897

Turns out that http/tests/cache/cancel-during-revalidation-succeeded.html is causing the 2 following tests to fail. Skipping cancel-during-revalidation-succeeded.html seems to fix the problem on my machine.

Nate, do you think you can take a look?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>714944</commentid>
    <comment_count>14</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-07 13:59:29 -0700</bug_when>
    <thetext>http://trac.webkit.org/changeset/127916 is a revert of http://trac.webkit.org/changeset/127803, which skipped the compositing/gestures tests.  Other compositing tests were failing the same way, so I put that back.

The cr-linux ews bot seems to be running smoother since skipping the http test, even with the compositing test failures.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>715210</commentid>
    <comment_count>15</comment_count>
    <who name="James Robinson">jamesr</who>
    <bug_when>2012-09-07 18:36:19 -0700</bug_when>
    <thetext>Skipped the directory in http://trac.webkit.org/changeset/127954.  Let&apos;s see if that helps.  James - can you please take a look at this when you get a chance?  If it does turn out to be these tests then I&apos;m pretty sure that indicates a real problem in the code they test that we need to address.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>715989</commentid>
    <comment_count>16</comment_count>
    <who name="W. James MacLean">wjmaclean</who>
    <bug_when>2012-09-10 06:00:23 -0700</bug_when>
    <thetext>(In reply to comment #15)
&gt; Skipped the directory in http://trac.webkit.org/changeset/127954.  Let&apos;s see if that helps.  James - can you please take a look at this when you get a chance?  If it does turn out to be these tests then I&apos;m pretty sure that indicates a real problem in the code they test that we need to address.

Sure, I&apos;ll look and see what&apos;s going on.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>716187</commentid>
    <comment_count>17</comment_count>
    <who name="Adam Barth">abarth</who>
    <bug_when>2012-09-10 09:56:11 -0700</bug_when>
    <thetext>We&apos;re still getting failures in platform/chromium/compositing</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>722142</commentid>
    <comment_count>18</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-17 12:05:31 -0700</bug_when>
    <thetext>The bots have been running OK for the past week.  Maybe we should file separate bugs for the flaky HTTP test and the compositing tests and close this bug out?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>722145</commentid>
    <comment_count>19</comment_count>
    <who name="Adam Barth">abarth</who>
    <bug_when>2012-09-17 12:08:14 -0700</bug_when>
    <thetext>SGTM</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>722238</commentid>
    <comment_count>20</comment_count>
    <who name="Tony Chang">tony</who>
    <bug_when>2012-09-17 13:50:24 -0700</bug_when>
    <thetext>https://bugs.webkit.org/show_bug.cgi?id=96950
https://bugs.webkit.org/show_bug.cgi?id=96951</thetext>
  </long_desc>
      
          <attachment
              isobsolete="1"
              ispatch="1"
              isprivate="0"
          >
            <attachid>162623</attachid>
            <date>2012-09-06 17:14:40 -0700</date>
            <delta_ts>2012-09-06 17:21:38 -0700</delta_ts>
            <desc>Patch</desc>
            <filename>bug-96041-20120906171420.patch</filename>
            <type>text/plain</type>
            <size>1514</size>
            <attacher name="Tony Chang">tony</attacher>
            
              <data encoding="base64">U3VidmVyc2lvbiBSZXZpc2lvbjogMTI3ODAyCmRpZmYgLS1naXQgYS9MYXlvdXRUZXN0cy9DaGFu
Z2VMb2cgYi9MYXlvdXRUZXN0cy9DaGFuZ2VMb2cKaW5kZXggM2Y4ODhiY2ZkMjY2MTY1MjkwMjRk
NjFiMWMwZGQ1MTVmZTM1MDk1MS4uOTYyNGViOGU2MTM3ZTJmNmQ3Yzg3NTBkOWRiODBmZGEwODc5
ZDE0OSAxMDA2NDQKLS0tIGEvTGF5b3V0VGVzdHMvQ2hhbmdlTG9nCisrKyBiL0xheW91dFRlc3Rz
L0NoYW5nZUxvZwpAQCAtMSwzICsxLDE3IEBACisyMDEyLTA5LTA2ICBUb255IENoYW5nICA8dG9u
eUBjaHJvbWl1bS5vcmc+CisKKyAgICAgICAgQ2hyb21pdW0gTGludXggRVdTIGJvdHMgYW5kIENR
IGJvdHMgYXJlIGZsYWt5CisgICAgICAgIGh0dHBzOi8vYnVncy53ZWJraXQub3JnL3Nob3dfYnVn
LmNnaT9pZD05NjA0MQorCisgICAgICAgIFJldmlld2VkIGJ5IE5PQk9EWSAoT09QUyEpLgorCisg
ICAgICAgIFNraXAgcGxhdGZvcm0vY2hyb21pdW0tbGludXgvY29tcG9zaXRpbmcvZ2VzdHVyZXMg
dGVzdHMgb24gdGhlCisgICAgICAgIENocm9taXVtIExpbnV4IEVXUyBhbmQgQ1EgYm90cy4gQnkg
bWFya2luZyB0aGVtIGFzIGZsYWt5LCB0aGV5IHdvbid0CisgICAgICAgIGJlIHJ1biAodGhlIGJv
dHMgdXNlIC0tc2tpcC1mYWlsaW5nLXRlc3RzIHdoaWNoIGNhdXNlcyB0aGVtIHRvIHNraXAKKyAg
ICAgICAgdGhlc2UgdGVzdHMpLiAgVGhlc2Ugd2lsbCBzdGlsbCBydW4gb24gdGhlIHdhdGVyZmFs
bCBib3RzLgorCisgICAgICAgICogcGxhdGZvcm0vY2hyb21pdW0vVGVzdEV4cGVjdGF0aW9uczoK
KwogMjAxMi0wOS0wNiAgQWRhbSBCYXJ0aCAgPGFiYXJ0aEB3ZWJraXQub3JnPgogCiAgICAgICAg
IE5ldyB0ZXN0cyBpbnRyb2R1Y2VkIGluIHIxMjc3MDQgZmFpbApkaWZmIC0tZ2l0IGEvTGF5b3V0
VGVzdHMvcGxhdGZvcm0vY2hyb21pdW0vVGVzdEV4cGVjdGF0aW9ucyBiL0xheW91dFRlc3RzL3Bs
YXRmb3JtL2Nocm9taXVtL1Rlc3RFeHBlY3RhdGlvbnMKaW5kZXggN2YzNjM1MjA0NjczN2U1ODg0
Mzk0ZWI5MjQ5ZTJlNWJhOWUxZTg5MC4uZDg0MTZiMzA0M2VjYzg0ZGFhOTY3YmM1NWNlMDNiMGJl
NDM1ZmY5NyAxMDA2NDQKLS0tIGEvTGF5b3V0VGVzdHMvcGxhdGZvcm0vY2hyb21pdW0vVGVzdEV4
cGVjdGF0aW9ucworKysgYi9MYXlvdXRUZXN0cy9wbGF0Zm9ybS9jaHJvbWl1bS9UZXN0RXhwZWN0
YXRpb25zCkBAIC0zNTk0LDMgKzM1OTQsNSBAQCBCVUdDUjE0NjQwMSBBTkRST0lEIDogZmFzdC9j
c3Mvc3RpY2t5L3N0aWNreS13cml0aW5nLW1vZGUtdmVydGljYWwtbHIuaHRtbCA9IElNQQogQlVH
V0s5NTc5OSBNQUMgOiB0b3VjaGFkanVzdG1lbnQvaWZyYW1lLWJvdW5kYXJ5Lmh0bWwgPSBURVhU
CiAKIEJVR1dLOTU4MTMgOiBmYXN0L2lubmVySFRNTC9pbm5lckhUTUwtaWZyYW1lLmh0bWwgPSBU
RVhUIFBBU1MKKworQlVHV0s5NjA0MSBMSU5VWCA6IHBsYXRmb3JtL2Nocm9taXVtLWxpbnV4L2Nv
bXBvc2l0aW5nL2dlc3R1cmVzID0gUEFTUyBJTUFHRQo=
</data>
<flag name="review"
          id="173788"
          type_id="1"
          status="+"
          setter="abarth"
    />
          </attachment>
      

    </bug>

</bugzilla>