Right now, run-perf-tests simply reads statistics off the test output text for non-page-loading tests. While this is desirable in the long term, once all statistics functions can be implemented only in JavaScript, it blocks our work on using multiple instances of DRT/WTR to smooth out between-run variance in bug 97510. Now that Dromaeo and the other perf tests all report results from each iteration, we can reliably compute statistics in Python code instead.
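For illustration, computing the statistics from the per-iteration values could look roughly like the sketch below. The function name `compute_statistics` is hypothetical and this is not the perftest.py code; it assumes the sample standard deviation (dividing by n - 1) and the result keys that appear in the test data for this bug (min, max, median, avg, stdev, values, unit).

```python
import math


def compute_statistics(values, unit='ms'):
    """Summarize per-iteration values (hypothetical helper for illustration)."""
    if not values:
        raise ValueError('need at least one value')
    sorted_values = sorted(values)
    count = len(sorted_values)
    middle = count // 2
    if count % 2:
        median = sorted_values[middle]
    else:
        # Even number of samples: average the two middle values.
        median = (sorted_values[middle - 1] + sorted_values[middle]) / 2.0
    mean = sum(sorted_values) / float(count)
    # Assumption: sample standard deviation, i.e. divide by (n - 1).
    square_sum = sum((value - mean) ** 2 for value in sorted_values)
    stdev = math.sqrt(square_sum / (count - 1)) if count > 1 else 0.0
    return {
        'min': sorted_values[0],
        'max': sorted_values[-1],
        'median': median,
        'avg': mean,
        'stdev': stdev,
        'values': list(values),  # keep the original iteration order
        'unit': unit,
    }
```

With something like this, results from several DRT/WTR instances could be concatenated into one list of values before the statistics are computed.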
<rdar://problem/12955987>
Created attachment 181282 [details] Patch
Let me change the scope of this bug. Instead of both refactoring the unit & integration tests and modifying perftest.py, we can concentrate on refactoring the unit & integration tests.
Created attachment 181283 [details] Patch
Committed r138810: <http://trac.webkit.org/changeset/138810>
(In reply to comment #5)
> Committed r138810: <http://trac.webkit.org/changeset/138810>
It broke a webkitpy test:
  File "/ramdisk/qt-linux-release/build/Tools/Scripts/webkitpy/performance_tests/perftestsrunner_integrationtest.py", line 306, in test_run_memory_test
    self.assertEqual(results['Parser/memory-test'], MemoryTestData.results)
AssertionError: {u'min': 1080.0, u'max': 1120.0, u'median': 1101.0, u'values': [1080.0, 1120.0, 1095.0, 1101.0, 1104.0], u'stdev': 14.508599999999999, u'avg': 1100.0, u'unit': u'ms'} != {'min': 1080, 'max': 1120, 'median': 1101, 'values': [1080, 1120, 1095, 1101, 1104], 'stdev': 14.508609999999999, 'avg': 1100, 'unit': 'ms'}
Could you fix it, please?
(In reply to comment #6)
> (In reply to comment #5)
> > Committed r138810: <http://trac.webkit.org/changeset/138810>
>
> It broke a webkitpy test:
>
>   File "/ramdisk/qt-linux-release/build/Tools/Scripts/webkitpy/performance_tests/perftestsrunner_integrationtest.py", line 306, in test_run_memory_test
>     self.assertEqual(results['Parser/memory-test'], MemoryTestData.results)
> AssertionError: {u'min': 1080.0, u'max': 1120.0, u'median': 1101.0, u'values': [1080.0, 1120.0, 1095.0, 1101.0, 1104.0], u'stdev': 14.508599999999999, u'avg': 1100.0, u'unit': u'ms'} != {'min': 1080, 'max': 1120, 'median': 1101, 'values': [1080, 1120, 1095, 1101, 1104], 'stdev': 14.508609999999999, 'avg': 1100, 'unit': 'ms'}
>
> Could you fix it, please?
Huh, do you have a very old version of Python? It appears to me that there's some significant rounding error there.
I have Python 2.6.6 (Debian Squeeze), and it fails on the Qt, GTK and Chromium bots too. But it passes for me with Python 2.7.3 (Ubuntu 12.04). I don't think the proper fix is updating Python on several bots...
(In reply to comment #8)
> I have Python 2.6.6 (Debian Squeeze), but it fails on Qt, GTK and Chromium
> bots too. But it passes for me with Python 2.7.3 (Ubuntu 12.04).
I mean... it's really bad that standard deviation computation has such a large computation error. It's correct for only 6 decimal points...
(In reply to comment #9)
> (In reply to comment #8)
> > I have Python 2.6.6 (Debian Squeeze), but it fails on Qt, GTK and Chromium
> > bots too. But it passes for me with Python 2.7.3 (Ubuntu 12.04).
>
> I mean... it's really bad that standard deviation computation has such a large computation error. It's correct for only 6 decimal points...
Ugh... I mean 6 significant figures.
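One way to keep a test like this from depending on the last few digits would be to compare the numeric fields with a relative tolerance instead of exact equality. The sketch below is only an illustration: `results_almost_equal` is a hypothetical helper, the tolerance of 1e-5 is an assumption, and this is not necessarily what any actual fix does. It avoids newer library helpers so that it would also run on Python 2.6.

```python
def results_almost_equal(actual, expected, rel_tol=1e-5):
    """Recursively compare result structures, allowing a small relative
    error on numbers (hypothetical helper; rel_tol is an assumed value)."""
    if isinstance(expected, dict):
        return (isinstance(actual, dict)
                and set(actual) == set(expected)
                and all(results_almost_equal(actual[key], expected[key], rel_tol)
                        for key in expected))
    if isinstance(expected, (list, tuple)):
        return (isinstance(actual, (list, tuple))
                and len(actual) == len(expected)
                and all(results_almost_equal(a, e, rel_tol)
                        for a, e in zip(actual, expected)))
    if isinstance(expected, (int, float)) and isinstance(actual, (int, float)):
        # Relative comparison, so 1080 and 1080.0 match, as do
        # 14.5086 and 14.50861.
        return abs(actual - expected) <= rel_tol * max(abs(actual), abs(expected))
    return actual == expected


# The two dictionaries from the failing assertion above.
actual = {'min': 1080.0, 'max': 1120.0, 'median': 1101.0,
          'values': [1080.0, 1120.0, 1095.0, 1101.0, 1104.0],
          'stdev': 14.5086, 'avg': 1100.0, 'unit': 'ms'}
expected = {'min': 1080, 'max': 1120, 'median': 1101,
            'values': [1080, 1120, 1095, 1101, 1104],
            'stdev': 14.50861, 'avg': 1100, 'unit': 'ms'}
print(results_almost_equal(actual, expected))  # prints True
```

In a unittest-based test this could be used as `self.assertTrue(results_almost_equal(results['Parser/memory-test'], MemoryTestData.results))`, or the stdev field alone could be checked with `assertAlmostEqual` and a suitable `places` argument.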
It's mind blowing that people think python is great for scientific computation when its numerical accuracy is much worse than that of JavaScript.
Attempted a fix in http://trac.webkit.org/changeset/138965.