RESOLVED FIXED 98115
PerfTests have incorrect standard deviation calculation
https://bugs.webkit.org/show_bug.cgi?id=98115
Summary PerfTests have incorrect standard deviation calculation
Philip Rogers
Reported 2012-10-01 22:36:27 PDT
Our performance test numbers calculate standard deviation incorrectly: (wrong) RESULT PageLoad: svg: files: worldcup.svg= 2197.27248895 ms median= 2157.99593925 ms, stdev= 693.92660072 ms, min= 2127.62403488 ms, max= 2865.42391777 ms (right) RESULT PageLoad: svg: files: worldcup.svg= 2177.03914642 ms median= 2146.1250782 ms, stdev= 162.849482453 ms, min= 2105.28707504 ms, max= 2840.33107758 ms
Attachments
Fix PerfTest standard deviation calculation. (4.00 KB, patch)
2012-10-01 22:48 PDT, Philip Rogers
no flags
Update per reviewer comments (13.69 KB, patch)
2012-10-01 23:48 PDT, Philip Rogers
no flags
Rebaselined (13.77 KB, patch)
2012-10-02 00:04 PDT, Philip Rogers
rniwa: review+
Added link to algorithm in the ChangeLog (14.13 KB, patch)
2012-10-02 00:17 PDT, Philip Rogers
no flags
Philip Rogers
Comment 1 2012-10-01 22:48:52 PDT
Created attachment 166608 [details] Fix PerfTest standard deviation calculation.
Ryosuke Niwa
Comment 2 2012-10-01 22:52:45 PDT
Comment on attachment 166608 [details] Fix PerfTest standard deviation calculation. View in context: https://bugs.webkit.org/attachment.cgi?id=166608&action=review Could you also change the algorithm in runner.js? I've always wanted to fix that algorithm. > Tools/Scripts/webkitpy/performance_tests/perftest.py:236 > + 'stdev': math.sqrt(squareSum / (len(sorted_test_times) - 1)), So we're now computing the sample standard deviation as opposed to the sample deviation of the sample: http://en.wikipedia.org/wiki/Standard_deviation#Estimation You should explain that in the change log.
Ryosuke Niwa
Comment 3 2012-10-01 22:54:28 PDT
Comment on attachment 166608 [details] Fix PerfTest standard deviation calculation. View in context: https://bugs.webkit.org/attachment.cgi?id=166608&action=review > Tools/Scripts/webkitpy/performance_tests/perftest.py:227 > delta = time - mean > sweep = i + 1.0 > mean += delta / sweep We can rewrite this as mean += (time - mean) / sweep and eliminate delta now.
Philip Rogers
Comment 4 2012-10-01 23:48:46 PDT
Created attachment 166621 [details] Update per reviewer comments
Philip Rogers
Comment 5 2012-10-01 23:50:01 PDT
(In reply to comment #2) > (From update of attachment 166608 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=166608&action=review > > Could you also change the algorithm in runner.js? I've always wanted to fix that algorithm. Done :) > > > Tools/Scripts/webkitpy/performance_tests/perftest.py:236 > > + 'stdev': math.sqrt(squareSum / (len(sorted_test_times) - 1)), > > So we're now computing the sample standard deviation as opposed to the sample deviation of the sample: > http://en.wikipedia.org/wiki/Standard_deviation#Estimation > > You should explain that in the change log. Done
Philip Rogers
Comment 6 2012-10-01 23:51:52 PDT
(In reply to comment #3) > (From update of attachment 166608 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=166608&action=review > > > Tools/Scripts/webkitpy/performance_tests/perftest.py:227 > > delta = time - mean > > sweep = i + 1.0 > > mean += delta / sweep > > We can rewrite this as > mean += (time - mean) / sweep > and eliminate delta now. I'm not sure it's quite that simple (we need to store the mean before and after the update). This isn't performance-sensitive code and I think the readability (copied directly from Knuth) is useful here for future statistics hackers.
Philip Rogers
Comment 7 2012-10-01 23:53:56 PDT
Comment on attachment 166621 [details] Update per reviewer comments Clearing flags since patch doesn't apply. Let me rebaseline after rniwa's change today and reupload.
Philip Rogers
Comment 8 2012-10-02 00:04:58 PDT
Created attachment 166623 [details] Rebaselined
Ryosuke Niwa
Comment 9 2012-10-02 00:11:09 PDT
Comment on attachment 166623 [details] Rebaselined Please include the URL to the algorithm you cited.
Ryosuke Niwa
Comment 10 2012-10-02 00:12:08 PDT
*** Bug 77401 has been marked as a duplicate of this bug. ***
Philip Rogers
Comment 11 2012-10-02 00:17:26 PDT
Created attachment 166626 [details] Added link to algorithm in the ChangeLog
WebKit Review Bot
Comment 12 2012-10-02 01:10:40 PDT
Comment on attachment 166626 [details] Added link to algorithm in the ChangeLog Clearing flags on attachment: 166626 Committed r130135: <http://trac.webkit.org/changeset/130135>
WebKit Review Bot
Comment 13 2012-10-02 01:10:44 PDT
All reviewed patches have been landed. Closing bug.
Csaba Osztrogonác
Comment 14 2012-10-02 04:00:21 PDT
(In reply to comment #12) > (From update of attachment 166626 [details]) > Clearing flags on attachment: 166626 > > Committed r130135: <http://trac.webkit.org/changeset/130135> FYI: It broke the perf tests again. See the bots for details.
Csaba Osztrogonác
Comment 15 2012-10-02 14:13:07 PDT
Philip Rogers
Comment 16 2012-10-02 14:15:43 PDT
(In reply to comment #15) > Fix landed in http://trac.webkit.org/changeset/130181 Should we also make this change in the python standard deviation method?
Ryosuke Niwa
Comment 17 2012-10-02 14:23:47 PDT
(In reply to comment #16) > (In reply to comment #15) > > Fix landed in http://trac.webkit.org/changeset/130181 > > Should we also make this change in the python standard deviation method? That's a good point. We should do that although PageLoadTest always gets 20 values.
Note You need to log in before you can comment on or make changes to this bug.