<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>105003</bug_id>
          
          <creation_ts>2012-12-14 02:28:03 -0800</creation_ts>
          <short_desc>Whitelist a subset of tests to be run by run-perf-tests by default</short_desc>
          <delta_ts>2013-01-24 20:08:06 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>Tools / Tests</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          <dependson>97510</dependson>
          <blocked>77037</blocked>
          <everconfirmed>1</everconfirmed>
          <reporter name="Ryosuke Niwa">rniwa</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>barraclough</cc>
          <cc>eric</cc>
          <cc>haraken</cc>
          <cc>mjs</cc>
          <cc>slewis</cc>
          <cc>syoichi</cc>
          <cc>zoltan</cc>
          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>791487</commentid>
    <comment_count>0</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2012-12-14 02:28:03 -0800</bug_when>
    <thetext>Right now, we run all tests in PerformanceTests by default, but this isn&apos;t really helpful because some tests generate results with very high variance, while others test such a specific feature of the browser that it isn&apos;t worth running them as regression tests.

By whitelisting tests that are known to be good indicators of WebKit performance, we can start using these tests to see whether a patch causes a real regression, paving the way to eventually adding perf EWS bots.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>791488</commentid>
    <comment_count>1</comment_count>
    <who name="Eric Seidel (no email)">eric</who>
    <bug_when>2012-12-14 02:34:15 -0800</bug_when>
    <thetext>Huzzah!</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>791685</commentid>
    <comment_count>2</comment_count>
    <who name="Zoltan Horvath">zoltan</who>
    <bug_when>2012-12-14 10:12:37 -0800</bug_when>
    <thetext>Sounds reasonable. It would be good to check out the used code coverage after we have the whitelist.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>793450</commentid>
    <comment_count>3</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2012-12-18 01:21:48 -0800</bug_when>
    <thetext>What should be the criteria for a test to be whitelisted?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>793457</commentid>
    <comment_count>4</comment_count>
    <who name="Stephanie Lewis">slewis</who>
    <bug_when>2012-12-18 01:31:47 -0800</bug_when>
    <thetext>When calculating a test&apos;s value, I usually look at reproducibility, coverage/sensitivity, external interest, and length of time to run/difficulty to set up.  If a test&apos;s results are not consistent, then tracking its progress creates a burden as opposed to being helpful.  If a test covers some obscure technology, or doesn&apos;t pick up major regressions in what it does test, it may not be valuable.  Tests that are run by the media externally are good to keep an eye on.  Length of time to run may not matter here, but if a test breaks a lot it may also impose a burden.

I think a good first step would be figuring out which tests show less than a 2% difference over a significant number of runs of the same source, and go from there.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>794087</commentid>
    <comment_count>5</comment_count>
    <who name="Ryosuke Niwa">rniwa</who>
    <bug_when>2012-12-18 14:56:30 -0800</bug_when>
    <thetext>(In reply to comment #4)
&gt; When calculating a test&apos;s value, I usually look at reproducibility, coverage/sensitivity, external interest, and length of time to run/difficulty to set up.  If a test&apos;s results are not consistent, then tracking its progress creates a burden as opposed to being helpful.  If a test covers some obscure technology, or doesn&apos;t pick up major regressions in what it does test, it may not be valuable.  Tests that are run by the media externally are good to keep an eye on.  Length of time to run may not matter here, but if a test breaks a lot it may also impose a burden.

That sounds sensible... except that

&gt; I think a good first step would be figuring out which tests show less than a 2% difference over a significant number of runs of the same source, and go from there.

Almost all tests have more than 2% variance :( We should probably fix https://bugs.webkit.org/show_bug.cgi?id=97510 first, then.</thetext>
  </long_desc>
    </bug>

</bugzilla>