<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>123385</bug_id>
          
          <creation_ts>2013-10-25 23:29:20 -0700</creation_ts>
          <short_desc>New flakiness dashboard shouldn&apos;t treat tests with right expectations as failing</short_desc>
          <delta_ts>2013-10-27 19:56:29 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>WebKit Website</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Ryosuke Niwa">rniwa</reporter>
          <assigned_to name="Ryosuke Niwa">rniwa</assigned_to>
          <cc>ap</cc>
    
    <cc>commit-queue</cc>
    
    <cc>lforschler</cc>
    
    <cc>simon.fraser</cc>
    
    <cc>thorton</cc>
    
    <cc>zan</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>943857</commentid>
    <comment_count>0</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2013-10-25 23:29:20 -0700</bug_when>
    <thetext>Right now, if you select &quot;failing&quot; tests on the builder pane, the new flakiness dashboard lists all failing tests, including ones that have the right test expectation.
It should instead list only the tests that are failing without the right expectation, since those are the ones making the bots red.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>943858</commentid>
    <comment_count>1</comment_count>
      <attachid>215240</attachid>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2013-10-25 23:33:19 -0700</bug_when>
    <thetext>Created attachment 215240
Changes the behavior</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>943931</commentid>
    <comment_count>2</comment_count>
      <attachid>215240</attachid>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2013-10-26 16:52:45 -0700</bug_when>
    <thetext>Comment on attachment 215240
Changes the behavior

I&apos;ve never used this feature on the old dashboard, so it&apos;s not clear to me if either behavior is useful. What are the use cases? If this is a replacement for regular dashboard, then we should consider just removing the duplicate functionality.

r=me</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>943934</commentid>
    <comment_count>3</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2013-10-26 17:21:01 -0700</bug_when>
    <thetext>(In reply to comment #2)
&gt; (From update of attachment 215240 [details])
&gt; I&apos;ve never used this feature on the old dashboard, so it&apos;s not clear to me if either behavior is useful. What are the use cases? If this is a replacement for regular dashboard, then we should consider just removing the duplicate functionality.

This shows the list of failing tests on the bots.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>943937</commentid>
    <comment_count>4</comment_count>
      <attachid>215240</attachid>
    <who name="WebKit Commit Bot">commit-queue</who>
    <bug_when>2013-10-26 17:45:38 -0700</bug_when>
    <thetext>Comment on attachment 215240
Changes the behavior

Clearing flags on attachment: 215240

Committed r158093: &lt;http://trac.webkit.org/changeset/158093&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>943938</commentid>
    <comment_count>5</comment_count>
    <who name="WebKit Commit Bot">commit-queue</who>
    <bug_when>2013-10-26 17:45:40 -0700</bug_when>
    <thetext>All reviewed patches have been landed.  Closing bug.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>943966</commentid>
    <comment_count>6</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2013-10-27 10:06:03 -0700</bug_when>
    <thetext>&gt; This shows the list of failing tests on the bots.

I don&apos;t think that this answers my question about use cases. Listing tests that are currently failing is not a job for the dashboard, which is for historic analysis of results.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>943972</commentid>
    <comment_count>7</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2013-10-27 11:04:55 -0700</bug_when>
    <thetext>(In reply to comment #6)
&gt; &gt; This shows the list of failing tests on the bots.
&gt; 
&gt; I don&apos;t think that this answers my question about use cases. Listing tests that are currently failing is not a job for the dashboard, which is for historic analysis of results.

If you&apos;re talking about http://build.webkit.org/dashboard/, I find it impossible to use because it doesn&apos;t have links to the builders&apos; pages and it has -webkit-user-select: none, along with dozens of other problems.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>943974</commentid>
    <comment_count>8</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2013-10-27 11:41:48 -0700</bug_when>
    <thetext>Can you please file bugs for those? That is the tool intended to be used for looking at the immediate state of the bots, and adding duplicate functionality to other tools is not the best path forward. We&apos;ll just end up with a set of tools that no one but their creators understands or uses.

build.webkit.org/dashboard is also meant to be the primary entry point into the regression test bot system for most people, because checking historic flakiness is an activity that is secondary to checking immediate state. Buildbot waterfall and console certainly have their use, but mostly for people who administer the system, not for WebKit developers in my opinion.

A number of bugs and enhancement requests have been filed already; you can find them by searching for &quot;build.webkit.org/dashboard&quot; in Bugzilla titles.

I encourage you to file bugs in terms of use cases that aren&apos;t addressed well (i.e. not simply &quot;please remove user-select:none&quot;, but &quot;I often need to do XXX when bot watching, and it&apos;s difficult to do now&quot;).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>943977</commentid>
    <comment_count>9</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2013-10-27 12:03:58 -0700</bug_when>
    <thetext>(In reply to comment #8)
&gt; build.webkit.org/dashboard is also meant to be the primary entry point into the regression test bot system for most people, because checking historic flakiness is an activity that is secondary to checking immediate state. Buildbot waterfall and console certainly have their use, but mostly for people who administer the system, not for WebKit developers in my opinion.

I don&apos;t see a point in doing that given I&apos;m satisfied with what build.webkit.org/waterfall and build.webkit.org/console provides.  Those two pages provide exactly the kind of information I need.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>944009</commentid>
    <comment_count>10</comment_count>
    <who name="Alexey Proskuryakov">ap</who>
    <bug_when>2013-10-27 19:19:43 -0700</bug_when>
    <thetext>&gt; I&apos;m satisfied with what build.webkit.org/waterfall and build.webkit.org/console provides

In this case, can we just get rid of the &quot;failing&quot; display in the new flakiness dashboard?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>944010</commentid>
    <comment_count>11</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2013-10-27 19:24:24 -0700</bug_when>
    <thetext>(In reply to comment #10)
&gt; &gt; I&apos;m satisfied with what build.webkit.org/waterfall and build.webkit.org/console provides
&gt; 
&gt; In this case, can we just get rid of the &quot;failing&quot; display in the new flakiness dashboard?

Why?  The historical results of currently failing tests are exactly what bot watchers need to see to determine which patch caused the failure and whether the tests have been flaky.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>944011</commentid>
    <comment_count>12</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2013-10-27 19:56:29 -0700</bug_when>
    <thetext>I think I&apos;m disagreeing with the statement that &quot;checking historic flakiness is an activity that is secondary to checking immediate state&quot;.

In my experience, viewing the historical results of a test has been essential in determining the culprit and the correct test expectation to add.

Knowing how many tests are failing on a builder doesn&apos;t get me anywhere as a bot watcher because my primary job as a bot watcher (contacting the patch author, etc…) cannot be carried out until the culprit is determined.

I don&apos;t know what revision number http://build.webkit.org/dashboard/ is showing, but automatically determining the culprit has already been tried by TestFailures and garden-o-magic.  Both have failed miserably to deliver on that promise.  A task of this sort is best done by humans.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>215240</attachid>
            <date>2013-10-25 23:33:19 -0700</date>
            <delta_ts>2013-10-26 17:45:38 -0700</delta_ts>
            <desc>Changes the behavior</desc>
            <filename>fix123385</filename>
            <type>text/plain</type>
            <size>2293</size>
            <attacher name="Ryosuke Niwa">rniwa</attacher>
            
              <data encoding="base64">SW5kZXg6IFdlYnNpdGVzL3Rlc3QtcmVzdWx0cy9DaGFuZ2VMb2cKPT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gV2Vi
c2l0ZXMvdGVzdC1yZXN1bHRzL0NoYW5nZUxvZwkocmV2aXNpb24gMTU4MDgwKQorKysgV2Vic2l0
ZXMvdGVzdC1yZXN1bHRzL0NoYW5nZUxvZwkod29ya2luZyBjb3B5KQpAQCAtMSw1ICsxLDE5IEBA
CiAyMDEzLTEwLTI1ICBSeW9zdWtlIE5pd2EgIDxybml3YUB3ZWJraXQub3JnPgogCisgICAgICAg
IE5ldyBmbGFraW5lc3MgZGFzaGJvYXJkIHNob3VsZG4ndCB0cmVhdCB0ZXN0cyB3aXRoIHJpZ2h0
IGV4cGVjdGF0aW9ucyBhcyBmYWlsaW5nCisgICAgICAgIGh0dHBzOi8vYnVncy53ZWJraXQub3Jn
L3Nob3dfYnVnLmNnaT9pZD0xMjMzODUKKworICAgICAgICBSZXZpZXdlZCBieSBOT0JPRFkgKE9P
UFMhKS4KKworICAgICAgICBXZSBkZWZpbmUgZmFpbGluZyB0ZXN0cyB0byBiZSB0ZXN0cyB3aXRo
IHdyb25nIGV4cGVjdGF0aW9ucyB3aG9zZSBhY3R1YWwgcmVzdWx0cyBhcmUgbm90IFBBU1MKKyAg
ICAgICAgc2luY2UgdGVzdHMgd2l0aCBURVhULCBJTUFHRSwgZXRjLi4uIGZhaWx1cmVzIGRvIG5v
dCB0dXJuIHRoZSBib3RzIHJlZCBhcyBsb25nIGFzIHRoZSBleHBlY3RhdGlvbgorICAgICAgICBv
ZiB0aGUgc2FtZSB0eXBlIGlzIHNwZWNpZmllZCBpbiBUZXN0RXhwZWN0YXRpb24gZmlsZXMuCisK
KyAgICAgICAgKiBwdWJsaWMvaW5jbHVkZS90ZXN0LXJlc3VsdHMucGhwOgorICAgICAgICAoRmFp
bGluZ1Jlc3VsdHNKU09OV3JpdGVyKTogSW5oZXJpdCBmcm9tIFdyb25nRXhwZWN0YXRpb25zUmVz
dWx0c0pTT05Xcml0ZXIuCisKKzIwMTMtMTAtMjUgIFJ5b3N1a2UgTml3YSAgPHJuaXdhQHdlYmtp
dC5vcmc+CisKICAgICAgICAgQnVpbGQgZml4LiBUaGUgcXVlcnkgcmVzdWx0cyB3ZXJlbid0IHNv
cnRlZCBieSB0aGUgbGF0ZXN0IGNvbW1pdCB0aW1lLAogICAgICAgICB5aWVsZGluZyB3cm9uZyBz
ZXQgb2YgdGVzdHMgdG8gYmUgbGlzdGVkIGluIHRoZSBidWlsZGVyIHBhbmUuCiAKSW5kZXg6IFdl
YnNpdGVzL3Rlc3QtcmVzdWx0cy9wdWJsaWMvaW5jbHVkZS90ZXN0LXJlc3VsdHMucGhwCj09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT0KLS0tIFdlYnNpdGVzL3Rlc3QtcmVzdWx0cy9wdWJsaWMvaW5jbHVkZS90ZXN0LXJlc3Vs
dHMucGhwCShyZXZpc2lvbiAxNTgwODApCisrKyBXZWJzaXRlcy90ZXN0LXJlc3VsdHMvcHVibGlj
L2luY2x1ZGUvdGVzdC1yZXN1bHRzLnBocAkod29ya2luZyBjb3B5KQpAQCAtMTM3LDEzICsxMzcs
NiBAQAogICAgIGFic3RyYWN0IHByb3RlY3RlZCBmdW5jdGlvbiBwYXNzX2Zvcl9mYWlsdXJlX3R5
cGUoJiRyZXN1bHRzKTsKIH0KIAotY2xhc3MgRmFpbGluZ1Jlc3VsdHNKU09OV3JpdGVyIGV4dGVu
ZHMgUmVzdWx0c0pTT05Xcml0ZXIgewotICAgIHB1YmxpYyBmdW5jdGlvbiBfX2NvbnN0cnVjdCgk
ZnApIHsgcGFyZW50OjpfX2NvbnN0cnVjdCgkZnApOyB9Ci0gICAgcHJvdGVjdGVkIGZ1bmN0aW9u
IHBhc3NfZm9yX2ZhaWx1cmVfdHlwZSgmJHJlc3VsdHMpIHsKLSAgICAgICAgcmV0dXJuICRyZXN1
bHRzWzBdWydhY3R1YWwnXSA9PSAnUEFTUyc7Ci0gICAgfQotfQotCiBjbGFzcyBGbGFreVJlc3Vs
dHNKU09OV3JpdGVyIGV4dGVuZHMgUmVzdWx0c0pTT05Xcml0ZXIgewogICAgIHB1YmxpYyBmdW5j
dGlvbiBfX2NvbnN0cnVjdCgkZnApIHsgcGFyZW50OjpfX2NvbnN0cnVjdCgkZnApOyB9CiAgICAg
cHJvdGVjdGVkIGZ1bmN0aW9uIHBhc3NfZm9yX2ZhaWx1cmVfdHlwZSgmJHJlc3VsdHMpIHsKQEAg
LTE3NSw2ICsxNjgsMTMgQEAKICAgICB9CiB9CiAKK2NsYXNzIEZhaWxpbmdSZXN1bHRzSlNPTldy
aXRlciBleHRlbmRzIFdyb25nRXhwZWN0YXRpb25zUmVzdWx0c0pTT05Xcml0ZXIgeworICAgIHB1
YmxpYyBmdW5jdGlvbiBfX2NvbnN0cnVjdCgkZnApIHsgcGFyZW50OjpfX2NvbnN0cnVjdCgkZnAp
OyB9CisgICAgcHJvdGVjdGVkIGZ1bmN0aW9uIHBhc3NfZm9yX2ZhaWx1cmVfdHlwZSgmJHJlc3Vs
dHMpIHsKKyAgICAgICAgcmV0dXJuICRyZXN1bHRzWzBdWydhY3R1YWwnXSA9PSAnUEFTUycgfHwg
cGFyZW50OjpwYXNzX2Zvcl9mYWlsdXJlX3R5cGUoJHJlc3VsdHMpOworICAgIH0KK30KKwogY2xh
c3MgUmVzdWx0c0pTT05HZW5lcmF0b3IgewogICAgIHByaXZhdGUgJGRiOwogICAgIHByaXZhdGUg
JGJ1aWxkZXJfaWQ7Cg==
</data>

          </attachment>
      

    </bug>

</bugzilla>