Bug 149691 - Add shared code for a new a graphics benchmark
Summary: Add shared code for a new a graphics benchmark
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Animations (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Said Abou-Hallawa
URL:
Keywords:
Depends on: 149683
Blocks: 149053
  Show dependency treegraph
 
Reported: 2015-09-30 18:11 PDT by Said Abou-Hallawa
Modified: 2015-10-02 20:10 PDT (History)
3 users (show)

See Also:


Attachments
Patch (27.59 KB, patch)
2015-09-30 18:13 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (33.86 KB, patch)
2015-10-01 17:15 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (22.67 KB, patch)
2015-10-02 18:09 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (22.92 KB, patch)
2015-10-02 19:08 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff
Patch (22.92 KB, patch)
2015-10-02 19:15 PDT, Said Abou-Hallawa
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Said Abou-Hallawa 2015-09-30 18:11:08 PDT
We need a set of shared objects to be used by the tests and the test runner. This patch is separated from the patch attached to https://bugs.webkit.org/show_bug.cgi?id=149053.
Comment 1 Said Abou-Hallawa 2015-09-30 18:13:06 PDT
Created attachment 262219 [details]
Patch
Comment 2 Ryosuke Niwa 2015-09-30 18:19:00 PDT
Comment on attachment 262219 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=262219&action=review

> PerformanceTests/Animometer/resources/statistics.js:1
> +var Statistics =

This is a copy of Speedometer's statistics.js, right?
Please use "svn cp" instead of copy-paste.

> PerformanceTests/Animometer/resources/statistics.js:95
> +function Experiment()

Please put this in a separate file so that syncing statistics.js won't require copying this code into it.
Comment 3 Said Abou-Hallawa 2015-10-01 17:15:27 PDT
Created attachment 262302 [details]
Patch
Comment 4 Said Abou-Hallawa 2015-10-01 17:28:00 PDT
Comment on attachment 262219 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=262219&action=review

>> PerformanceTests/Animometer/resources/statistics.js:1
>> +var Statistics =
> 
> This is a copy of Speedometer's statistics.js, right?
> Please use "svn cp" instead of copy-paste.

Done.

>> PerformanceTests/Animometer/resources/statistics.js:95
>> +function Experiment()
> 
> Please put this in a separate file so that syncing statistics.js won't require copying this code into it.

Done.
Comment 5 Ryosuke Niwa 2015-10-01 21:55:38 PDT
Comment on attachment 262302 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=262302&action=review

> PerformanceTests/Animometer/resources/statistics.js:84
> -        var degreesOfFreedom = numberOfSamples - 1;
> -        if (degreesOfFreedom > cdfForProbability.length)
> -            throw 'We only support up to ' + deltas.length + ' degrees of freedom';
> +        var degreesOfFreedom = Math.min(numberOfSamples - 1, cdfForProbability.length);

We can't just assume an arbitrary degrees of freedom. 
When the number of samples (and therefore the degrees of freedom) is sufficiently large,
we should be using normal distributions instead of student's t-distribution.

The table you're copying here contains values up until 100 degrees of freedom.
I'd say that's sufficiently large that we can start using normal approximation.
We should use a numerical approximation of normal distributions as outlined in:
https://en.wikipedia.org/wiki/Normal_distribution#Numerical_approximations_for_the_normal_CDF
for example.
Comment 6 Said Abou-Hallawa 2015-10-02 09:09:10 PDT
Comment on attachment 262302 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=262302&action=review

>> PerformanceTests/Animometer/resources/statistics.js:84
>> +        var degreesOfFreedom = Math.min(numberOfSamples - 1, cdfForProbability.length);
> 
> We can't just assume an arbitrary degrees of freedom. 
> When the number of samples (and therefore the degrees of freedom) is sufficiently large,
> we should be using normal distributions instead of student's t-distribution.
> 
> The table you're copying here contains values up until 100 degrees of freedom.
> I'd say that's sufficiently large that we can start using normal approximation.
> We should use a numerical approximation of normal distributions as outlined in:
> https://en.wikipedia.org/wiki/Normal_distribution#Numerical_approximations_for_the_normal_CDF
> for example.

Why not?

The t distribution tables usually begins by listing the individual values for small degrees of freedom. Once the degree of freedom becomes large (>30), the tables list rows; each row covers a range of 10 degree of freedoms. And then they list rows for a range of 20 degrees of freedom and then they stop. The last row is used for all the degree of freedoms which is larger than the last value before it: https://en.wikipedia.org/wiki/Student%27s_t-distribution#Table_of_selected_values.  So my choice to to use the cdfForProbability[cdfForProbability.length - 1] is not arbitrary it is a valid approximation. And this is obvious from the table I copied. The last values for every degree of freedom are equal for 2 or 3 significant figures after the decimal point. That means if we had tables for infinite degree of freedoms, the difference between this approximation and the actual value will be most likely in the fourth significant digit after the decimal point.

You are suggesting using the normal approximation but using the last row of the t distribution table is not bad approximation. Besides I am using toFixed(2) for the calculated value anyway because there is no much space and it is also distracting to show more than two digits after the decimal point.
Comment 7 Ryosuke Niwa 2015-10-02 14:18:59 PDT
(In reply to comment #6)
> Comment on attachment 262302 [details]
> Patch
>
> The t distribution tables usually begins by listing the individual values
> for small degrees of freedom. Once the degree of freedom becomes large
> (>30), the tables list rows; each row covers a range of 10 degree of
> freedoms. And then they list rows for a range of 20 degrees of freedom and
> then they stop. The last row is used for all the degree of freedoms which is
> larger than the last value before it:
> https://en.wikipedia.org/wiki/Student%27s_t-
> distribution#Table_of_selected_values.  So my choice to to use the
> cdfForProbability[cdfForProbability.length - 1] is not arbitrary it is a
> valid approximation.

It isn't because I added that damn table (by computing them on R), and the last value is not appropriate as the approximation for an arbitrary degrees of freedom.  It's for 100 degrees of freedom, and there is 2-3% margins of error compared to when n -> infinity.

It would mean that your benchmark can never detect any regression of a magnitude less than say, 5%, reliably because your confidence interval calculation itself has an inherent error of 2-3%.

> You are suggesting using the normal approximation but using the last row of
> the t distribution table is not bad approximation. Besides I am using
> toFixed(2) for the calculated value anyway because there is no much space
> and it is also distracting to show more than two digits after the decimal
> point.

Is your test result guaranteed to be less than 9?  If not, toFixed(2) doesn't guarantee the number of significant figures at all since the function only forces two decimal points, not two significant figures. There's a big difference between the two.
Comment 8 Said Abou-Hallawa 2015-10-02 18:09:07 PDT
Created attachment 262372 [details]
Patch
Comment 9 Said Abou-Hallawa 2015-10-02 18:10:33 PDT
I talked to Jon Lee and Simon Fraser and we agreed on removing the 95% confidence interval delta measurement altogether. In speedometer it is used to measure the confidence interval of the test score mean. That makes since because the sampled data are totally independent. But in our case, we are sampling the test complexities in the same iteration. Since the sampled test complexities are dependent we can't say this measurement is useful in this case.

We may use the confidence interval measurement in the future if we decide to run the same test multiple times as speedometer. But for now we do not need it for the iteration sampled data.
Comment 10 Said Abou-Hallawa 2015-10-02 18:11:47 PDT
statistics.js was removed because the only function I needed from it was sampleStandardDeviation().
Comment 11 Ryosuke Niwa 2015-10-02 18:14:31 PDT
Makes sense.
Comment 12 Dean Jackson 2015-10-02 18:21:23 PDT
Comment on attachment 262372 [details]
Patch

r= based on previous reviews and rniwa's comments being addressed.
Comment 13 Ryosuke Niwa 2015-10-02 18:23:13 PDT
Comment on attachment 262372 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=262372&action=review

r=me provided the following comments are addressed before the patch is landed.

> PerformanceTests/Animometer/resources/extensions.js:163
> +                th.innerHTML += "<br>[" + message + "]";

For the sake of XSS prevention, I'd suggest manually constructing text nodes instead as in:
th.appendChild(document.createElement('br'));
th.appendChild(document.createTextNode('[' + message +]'));

> PerformanceTests/Animometer/resources/sampler.js:10
> +    sampleStandardDeviation: function(numberOfSamples, sum, squareSum)

It's a bit odd to refer to the version with Bessel's correction as sampleStandardDeviation
since the unqualified "sample standard deviation" usually refers to biased/uncorrected standard deviation.

But I suppose anyone reading the code will see the Bessel's correction so it's not a big deal
but I'd prefer calling this correctedSampleStandardDeviation or unbiasedSampleStandardDeviation instead.

By the way, you removed the comment above this function:
http://trac.webkit.org/browser/trunk/PerformanceTests/resources/statistics.js#L44

Please add it back as it's an important documentation.
Comment 14 Ryosuke Niwa 2015-10-02 18:26:15 PDT
Comment on attachment 262372 [details]
Patch

cq- because my comments have not been addressed.
Comment 15 Said Abou-Hallawa 2015-10-02 19:08:58 PDT
Created attachment 262376 [details]
Patch
Comment 16 Said Abou-Hallawa 2015-10-02 19:09:55 PDT
Comment on attachment 262372 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=262372&action=review

>> PerformanceTests/Animometer/resources/extensions.js:163
>> +                th.innerHTML += "<br>[" + message + "]";
> 
> For the sake of XSS prevention, I'd suggest manually constructing text nodes instead as in:
> th.appendChild(document.createElement('br'));
> th.appendChild(document.createTextNode('[' + message +]'));

Done.

>> PerformanceTests/Animometer/resources/sampler.js:10
>> +    sampleStandardDeviation: function(numberOfSamples, sum, squareSum)
> 
> It's a bit odd to refer to the version with Bessel's correction as sampleStandardDeviation
> since the unqualified "sample standard deviation" usually refers to biased/uncorrected standard deviation.
> 
> But I suppose anyone reading the code will see the Bessel's correction so it's not a big deal
> but I'd prefer calling this correctedSampleStandardDeviation or unbiasedSampleStandardDeviation instead.
> 
> By the way, you removed the comment above this function:
> http://trac.webkit.org/browser/trunk/PerformanceTests/resources/statistics.js#L44
> 
> Please add it back as it's an important documentation.

Function was renamed correctedSampleStandardDeviation() and the comment was added.
Comment 17 Said Abou-Hallawa 2015-10-02 19:10:48 PDT
Comment on attachment 262372 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=262372&action=review

>>> PerformanceTests/Animometer/resources/sampler.js:10
>>> +    sampleStandardDeviation: function(numberOfSamples, sum, squareSum)
>> 
>> It's a bit odd to refer to the version with Bessel's correction as sampleStandardDeviation
>> since the unqualified "sample standard deviation" usually refers to biased/uncorrected standard deviation.
>> 
>> But I suppose anyone reading the code will see the Bessel's correction so it's not a big deal
>> but I'd prefer calling this correctedSampleStandardDeviation or unbiasedSampleStandardDeviation instead.
>> 
>> By the way, you removed the comment above this function:
>> http://trac.webkit.org/browser/trunk/PerformanceTests/resources/statistics.js#L44
>> 
>> Please add it back as it's an important documentation.
> 
> Function was renamed correctedSampleStandardDeviation() and the comment was added.

Correction: Function was renamed unbiasedSampleStandardDeviation()
Comment 18 WebKit Commit Bot 2015-10-02 19:13:15 PDT
Comment on attachment 262376 [details]
Patch

Rejecting attachment 262376 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-03', 'validate-changelog', '--check-oops', '--non-interactive', 262376, '--port=mac']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

ChangeLog entry in PerformanceTests/ChangeLog contains OOPS!.

Full output: http://webkit-queues.webkit.org/results/238447
Comment 19 Said Abou-Hallawa 2015-10-02 19:15:51 PDT
Created attachment 262377 [details]
Patch
Comment 20 WebKit Commit Bot 2015-10-02 20:10:41 PDT
Comment on attachment 262377 [details]
Patch

Clearing flags on attachment: 262377

Committed r190541: <http://trac.webkit.org/changeset/190541>
Comment 21 WebKit Commit Bot 2015-10-02 20:10:46 PDT
All reviewed patches have been landed.  Closing bug.