Bug 106078

Summary: Statistics used in perftest_unittest.py and perftest_integrationtest.py are bogus
Product: WebKit
Component: Tools / Tests
Status: RESOLVED FIXED
Severity: Normal
Priority: P2
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
Reporter: Ryosuke Niwa <rniwa>
Assignee: Ryosuke Niwa <rniwa>
CC: abarth, dpranke, eric, ggaren, morrita, ossy, pdr, webkit-bug-importer, webkit.review.bot, zoltan
Keywords: InRadar
Bug Depends on:
Bug Blocks: 97510

Ryosuke Niwa
Reported 2013-01-04 01:03:08 PST
Right now, run-perf-tests simply reads statistics off of the test output text for non-page-loading tests. While this is desirable in the long term, when all statistics functions can be implemented only in JavaScript, it blocks our work to use multiple instances of DRT/WTR to smooth out between-run variance in bug 97510. Now that Dromaeo and the other perf. tests all report results from each iteration, we can reliably compute statistics in Python code instead.
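As a rough illustration of computing the statistics in Python from per-iteration values, here is a minimal sketch. The helper name `compute_statistics` is hypothetical and is not taken from webkitpy; the choice of the sample standard deviation (dividing by n - 1) is an assumption; the input values are the ones shown in the memory-test failure reported later in this bug.

```python
import math


def compute_statistics(values):
    """Summarize per-iteration values (hypothetical helper, not webkitpy API)."""
    sorted_values = sorted(values)
    count = len(sorted_values)
    mean = sum(sorted_values) / float(count)

    # Median: middle element, or the mean of the two middle elements.
    middle = count // 2
    if count % 2:
        median = sorted_values[middle]
    else:
        median = (sorted_values[middle - 1] + sorted_values[middle]) / 2.0

    # Assumption: sample standard deviation (n - 1 in the denominator).
    squared_error = sum((value - mean) ** 2 for value in sorted_values)
    stdev = math.sqrt(squared_error / (count - 1)) if count > 1 else 0.0

    return {
        'avg': mean,
        'min': sorted_values[0],
        'max': sorted_values[-1],
        'median': median,
        'stdev': stdev,
    }


stats = compute_statistics([1080, 1120, 1095, 1101, 1104])
```

For these five values the sketch gives an average of 1100, a median of 1101, and a standard deviation of roughly 14.5086.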
Attachments
Patch (26.94 KB, patch)
2013-01-04 01:18 PST, Ryosuke Niwa
no flags
Patch (26.32 KB, patch)
2013-01-04 01:30 PST, Ryosuke Niwa
tony: review+
Radar WebKit Bug Importer
Comment 1 2013-01-04 01:04:05 PST
Ryosuke Niwa
Comment 2 2013-01-04 01:18:44 PST
Ryosuke Niwa
Comment 3 2013-01-04 01:25:30 PST
Let me change the scope of this bug. Instead of both refactoring the unit & integration tests and modifying perftest.py, we can concentrate on refactoring the unit & integration tests.
Ryosuke Niwa
Comment 4 2013-01-04 01:30:46 PST
Ryosuke Niwa
Comment 5 2013-01-04 10:21:10 PST
Csaba Osztrogonác
Comment 6 2013-01-07 07:32:41 PST
(In reply to comment #5)
> Committed r138810: <http://trac.webkit.org/changeset/138810>

It broke a webkitpy test:

File "/ramdisk/qt-linux-release/build/Tools/Scripts/webkitpy/performance_tests/perftestsrunner_integrationtest.py", line 306, in test_run_memory_test
    self.assertEqual(results['Parser/memory-test'], MemoryTestData.results)
AssertionError: {u'min': 1080.0, u'max': 1120.0, u'median': 1101.0, u'values': [1080.0, 1120.0, 1095.0, 1101.0, 1104.0], u'stdev': 14.508599999999999, u'avg': 1100.0, u'unit': u'ms'} != {'min': 1080, 'max': 1120, 'median': 1101, 'values': [1080, 1120, 1095, 1101, 1104], 'stdev': 14.508609999999999, 'avg': 1100, 'unit': 'ms'}

Could you fix it, please?
Ryosuke Niwa
Comment 7 2013-01-07 10:33:44 PST
(In reply to comment #6)
> (In reply to comment #5)
> > Committed r138810: <http://trac.webkit.org/changeset/138810>
>
> It broke a webkitpy test:
>
> File "/ramdisk/qt-linux-release/build/Tools/Scripts/webkitpy/performance_tests/perftestsrunner_integrationtest.py", line 306, in test_run_memory_test
>     self.assertEqual(results['Parser/memory-test'], MemoryTestData.results)
> AssertionError: {u'min': 1080.0, u'max': 1120.0, u'median': 1101.0, u'values': [1080.0, 1120.0, 1095.0, 1101.0, 1104.0], u'stdev': 14.508599999999999, u'avg': 1100.0, u'unit': u'ms'} != {'min': 1080, 'max': 1120, 'median': 1101, 'values': [1080, 1120, 1095, 1101, 1104], 'stdev': 14.508609999999999, 'avg': 1100, 'unit': 'ms'}
>
> Could you fix it, please?

Huh, do you have a very old version of python? It appears to me that there's some significant rounding error there.
Csaba Osztrogonác
Comment 8 2013-01-07 10:40:07 PST
I have Python 2.6.6 (Debian Squeeze), but it fails on Qt, GTK and Chromium bots too. But it passes for me with Python 2.7.3 (Ubuntu 12.04). I don't think the proper fix is updating Python on several bots ...
Ryosuke Niwa
Comment 9 2013-01-07 10:55:19 PST
(In reply to comment #8)
> I have Python 2.6.6 (Debian Squeeze), but it fails on Qt, GTK and Chromium
> bots too. But it passes for me with Python 2.7.3 (Ubuntu 12.04).

I mean... it's really bad that standard deviation computation has such a large computation error. It's correct for only 6 decimal points...
Ryosuke Niwa
Comment 10 2013-01-07 10:55:34 PST
(In reply to comment #9)
> (In reply to comment #8)
> > I have Python 2.6.6 (Debian Squeeze), but it fails on Qt, GTK and Chromium
> > bots too. But it passes for me with Python 2.7.3 (Ubuntu 12.04).
>
> I mean... it's really bad that standard deviation computation has such a large computation error. It's correct for only 6 decimal points...

Ugh... I mean 6 significant figures.
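For reference, the two values in the assertion differ only in the sixth significant figure (14.5086 vs. 14.50861); the long digit strings look like the 17-significant-digit float repr that Python 2.6 prints, which Python 2.7 replaced with the shortest round-tripping repr. One way a test could tolerate a difference of that size is to compare numeric fields to a fixed number of decimal places instead of requiring exact dictionary equality. The sketch below is only an illustration: the helper `results_almost_equal` is hypothetical, is not part of webkitpy, and is not necessarily how this failure was resolved.

```python
def results_almost_equal(actual, expected, places=4):
    """Compare two result dicts, rounding numeric differences to `places`.

    Hypothetical helper for illustration; not part of webkitpy.
    """
    if set(actual) != set(expected):
        return False
    for key, expected_value in expected.items():
        actual_value = actual[key]
        if isinstance(expected_value, (list, tuple)):
            # Lists of per-iteration values: compare element by element.
            if len(actual_value) != len(expected_value):
                return False
            for a, e in zip(actual_value, expected_value):
                if round(abs(a - e), places) != 0:
                    return False
        elif isinstance(expected_value, (int, float)):
            if round(abs(actual_value - expected_value), places) != 0:
                return False
        elif actual_value != expected_value:
            # Non-numeric fields such as the unit must match exactly.
            return False
    return True


actual = {'min': 1080.0, 'max': 1120.0, 'median': 1101.0,
          'values': [1080.0, 1120.0, 1095.0, 1101.0, 1104.0],
          'stdev': 14.5086, 'avg': 1100.0, 'unit': 'ms'}
expected = {'min': 1080, 'max': 1120, 'median': 1101,
            'values': [1080, 1120, 1095, 1101, 1104],
            'stdev': 14.50861, 'avg': 1100, 'unit': 'ms'}
matches = results_almost_equal(actual, expected)
```

With `places=4` the two dictionaries from the failure compare as equal, while a genuinely different standard deviation would still be rejected.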
Ryosuke Niwa
Comment 11 2013-01-07 11:06:30 PST
It's mind blowing that people think python is great for scientific computation when its numerical accuracy is much worse than that of JavaScript.
Ryosuke Niwa
Comment 12 2013-01-07 11:06:48 PST