run-perf-tests should record individual values instead of statistics
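For context, a hypothetical sketch of what the change means for the recorded results. The test name and field names below are illustrative only, not the actual run-perf-tests output format:

    # Hypothetical illustration only; field names are made up.
    # Before: run-perf-tests records only aggregate statistics per test.
    before = {"Parser/html5-full-render": {"avg": 1510.0, "stdev": 12.3,
                                           "min": 1494.0, "max": 1530.0}}
    # After: the individual sampled values are recorded as well, so downstream
    # tools (e.g. perf-o-matic) can run their own statistical analyses.
    after = {"Parser/html5-full-render": {"values": [1510.2, 1498.7, 1523.4,
                                                     1505.1, 1512.9]}}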
Created attachment 164811 Patch
I don't have access to the perf-o-matic code this week, so I'll update it next week.
Comment on attachment 164811 Patch

Attachment 164811 did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13954001

New failing tests:
http/tests/css/link-css-disabled-value-with-slow-loading-sheet.html
Comment on attachment 164811 Patch

The code looks good, assuming there are some use cases for this value, so... what is this for?
(In reply to comment #4)
> (From update of attachment 164811)
> The code looks good, assuming there are some use cases for this value, so... what is this for?

In the long term, perf-o-matic should be able to store these values and do statistical analysis on them, e.g. once we've ported Datazilla, which has a built-in Student's t-test among other things. For now, we can just report them on the results page so that we can use them to analyze test results locally.
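To make the "analyze locally" point concrete, here is a minimal sketch of the kind of comparison individual samples enable, assuming scipy is available. The per-iteration numbers are made up, and this is not perf-o-matic or Datazilla code:

    # A minimal sketch, assuming scipy is available. The per-iteration times
    # below are made-up examples of the individual values this patch records.
    from scipy import stats

    before = [1510.2, 1498.7, 1523.4, 1505.1, 1512.9]  # times (ms), old build
    after = [1551.8, 1540.3, 1563.0, 1549.5, 1556.2]   # times (ms), new build

    # Welch's t-test (does not assume equal variances): is the difference in
    # means statistically significant, or plausibly just run-to-run noise?
    t_stat, p_value = stats.ttest_ind(before, after, equal_var=False)
    print("t = %.3f, p = %.4f" % (t_stat, p_value))
    if p_value < 0.05:
        print("Samples differ significantly; likely a real change.")

A mean and standard deviation alone would still permit a t-test, but per-sample values also enable nonparametric tests, outlier detection, and distribution plots.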
(In reply to comment #5)
> In the long term, perf-o-matic should be able to store these values and do statistical analysis on them, e.g. once we've ported Datazilla, which has a built-in Student's t-test among other things. For now, we can just report them on the results page so that we can use them to analyze test results locally.

Got it. Thanks.
Comment on attachment 164811 Patch

Rejecting attachment 164811 from commit-queue.

Failed to run "['/mnt/git/webkit-commit-queue/Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '-..." exit_code: 1

Last 500 characters of output:
...ueue/Tools/Scripts/webkitpy/tool/commands/stepsequence.py", line 70, in run_and_handle_errors
    self._run(tool, options, state)
  File "/mnt/git/webkit-commit-queue/Tools/Scripts/webkitpy/tool/commands/stepsequence.py", line 64, in _run
    step(tool, options).run(state)
  File "/mnt/git/webkit-commit-queue/Tools/Scripts/webkitpy/tool/steps/validatereviewer.py", line 50, in run
    if changelog_entry.has_valid_reviewer():
AttributeError: 'NoneType' object has no attribute 'has_valid_reviewer'

Full output: http://queues.webkit.org/results/13916045
Committed r129091: <http://trac.webkit.org/changeset/129091>
Re-opened since this is blocked by bug 97205
(In reply to comment #9)
> Re-opened since this is blocked by bug 97205

Rolled out by https://trac.webkit.org/changeset/129123 because it broke all the perf bots.

Apple bot before the patch: http://build.webkit.org/builders/Apple%20Lion%20Release%20%28Perf%29/builds/5748
Apple bot after the patch: http://build.webkit.org/builders/Apple%20Lion%20Release%20%28Perf%29/builds/5749
Qt bot before the patch: http://build.webkit.org/builders/Qt%20Linux%2064-bit%20Release%20%28Perf%29/builds/5002
Qt bot after the patch: http://build.webkit.org/builders/Qt%20Linux%2064-bit%20Release%20%28Perf%29/builds/5003
Could you not roll out these perf-test patches? In practice, nobody is looking at the results of these bots (webkit-perf.appspot.com), and the bots don't report whether performance has regressed anyway. It would cause too much SVN commit churn if we rolled out every single patch like this.
The perf bots are still in development, and rolling out patches just because tests started to fail slows down the development process more than anything else. It's not helpful.
Sure. But I don't understand why we should run broken perf bots for days. It wastes CPU resources...
Committed r129158: <http://trac.webkit.org/changeset/129158>
(In reply to comment #12)
> The perf bots are still in development, and rolling out patches just because tests started to fail slows down the development process more than anything else. It's not helpful.

Having red on http://build.webkit.org/console hurts the perception of the project's health, even though the current setup embraces constant redness. If we expect long-running redness, we should probably hide these bots or make their status clear by giving them a separate group, IMO. It doesn't seem polite to blame people who revert a patch that made the bots red.
(In reply to comment #15)
> (In reply to comment #12)
> > The perf bots are still in development, and rolling out patches just because tests started to fail slows down the development process more than anything else. It's not helpful.
>
> If we expect long-running redness, we should probably hide these bots or make their status clear by giving them a separate group, IMO.

Yeah, it might make sense for us to add a new category for perf bots and put them there. We might have added them prematurely. They have helped us catch some regressions in the past, but the entire framework is still very immature compared to layout tests and the other test frameworks we have.