RESOLVED FIXED 78303
[PerformanceTests] run-perf-tests should be able to handle Hz-based score.
https://bugs.webkit.org/show_bug.cgi?id=78303
Summary [PerformanceTests] run-perf-tests should be able to handle Hz-based score.
Hajime Morrita
Reported 2012-02-09 17:54:34 PST
Dromaeo reports score (run/s) by default. I should report time taken.
Attachments
Patch (6.53 KB, patch)
2012-03-14 18:56 PDT, Kentaro Hara
no flags
patch for landing (6.47 KB, patch)
2012-03-15 00:16 PDT, Kentaro Hara
no flags
Hajime Morrita
Comment 1 2012-02-09 21:21:16 PST
I'd like to close this as WONTFIX beucause: - We already have data. Changing the dimension will kill these data. - What we area doing is how the official Dromaeo counts. We can understand our score as "Dromaeo score" even though it differs from how other tests is doing. - Since the tests run so fast, we need to give some scale. But determining such scale is a tricky job. Currnet "run/s" is further simpler. Ryosuke, what do you think?
Ryosuke Niwa
Comment 2 2012-02-09 22:59:02 PST
I think we need to update run-perf-tests so that it doesn't it doesn't erroneously add "ms" at the end. It's extremely confusing. The correct unit for these values is "Hz".
Ryosuke Niwa
Comment 3 2012-02-09 23:02:44 PST
By the way, having historical data is okay though. We've already had a bunch of bogus data due to the bot issues so switching the unit type at this point shouldn't be that big of a deal. Also, I'm thinking that we can add collect historical data for tests once tests have stabilized so I don't think losing historical data now should affect our decision here.
Hajime Morrita
Comment 4 2012-02-09 23:11:26 PST
(In reply to comment #2) > I think we need to update run-perf-tests so that it doesn't it doesn't erroneously add "ms" at the end. It's extremely confusing. The correct unit for these values is "Hz". Good point. I changed the bug title. BTW, I personally prefer hz over ms. I guess Ojan has some opinion here. (IIRC I read he talking about that somewhere.)
Adam Barth
Comment 5 2012-02-09 23:19:36 PST
Either way is fine as long as it's clearly labeled on the graphs and the text output whether bigger-is-better or smaller-is-better.
Ryosuke Niwa
Comment 6 2012-02-09 23:23:17 PST
(In reply to comment #5) > Either way is fine as long as it's clearly labeled on the graphs and the text output whether bigger-is-better or smaller-is-better. For labeling, we need upstream changes to perf-o-matic since we currently don't have any notion of units :(
Kentaro Hara
Comment 7 2012-03-14 16:59:40 PDT
Reporting "runs/s" as "ms" is extremely confusing (Actually I've so far given up a few patches because I thought they "regressed" performance). We should fix it somehow. As far as I see the discussion, the possible fixes are as follows: [1] Just change the label from "ms" to "Hz". (Nit: I would prefer "runs/s" to "Hz" since the official Dromaeo is using "runs/s") [2] Convert "runs/s" measured by Dromaeo to "ms", and report "ms". (patch: https://bugs.webkit.org/show_bug.cgi?id=80974) [3] Run Dromaeo with the "runStyle=ms" mode, by which Dromaeo measures "ms" instead of "runs/s". Then report "ms". IMHO, - I do not like [1]. Mixing "larger-is-better" graphs and "smaller-is-better" graphs is really confusing. If we do not care historical data, I guess we can avoid [1]. - Both [2] and [3] sound good to me, but [2] would be better. In case of [3], we need to slightly modify webrunner.js to run the "runStyle=ms" mode with the same scale as the "runStyle=runs/s" mode (patch: https://bugs.webkit.org/attachment.cgi?id=131576&action=review). On the other hand, the disadvantage of [2] is that we cannot output the median and stdev, since we cannot calculate the median and stdev of "ms" from those of "runs/s". WDYT?
Ryosuke Niwa
Comment 8 2012-03-14 17:06:49 PDT
(In reply to comment #7) > Reporting "runs/s" as "ms" is extremely confusing (Actually I've so far given up a few patches because I thought they "regressed" performance). We should fix it somehow. As far as I see the discussion, the possible fixes are as follows: > > [1] Just change the label from "ms" to "Hz". (Nit: I would prefer "runs/s" to "Hz" since the official Dromaeo is using "runs/s") > [2] Convert "runs/s" measured by Dromaeo to "ms", and report "ms". (patch: https://bugs.webkit.org/show_bug.cgi?id=80974) > [3] Run Dromaeo with the "runStyle=ms" mode, by which Dromaeo measures "ms" instead of "runs/s". Then report "ms". > > - I do not like [1]. Mixing "larger-is-better" graphs and "smaller-is-better" graphs is really confusing. If we do not care historical data, I guess we can avoid [1]. > - Both [2] and [3] sound good to me, but [2] would be better. In case of [3], we need to slightly modify webrunner.js to run the "runStyle=ms" mode with the same scale as the "runStyle=runs/s" mode (patch: https://bugs.webkit.org/attachment.cgi?id=131576&action=review). On the other hand, the disadvantage of [2] is that we cannot output the median and stdev, since we cannot calculate the median and stdev of "ms" from those of "runs/s". Let's not do [2] or [3]. It's extremely confusing to invent our own metrics for Dromaeo. Also see https://bugs.webkit.org/show_bug.cgi?id=80776.
Kentaro Hara
Comment 9 2012-03-14 17:10:42 PDT
(In reply to comment #8) > Let's not do [2] or [3]. It's extremely confusing to invent our own metrics for Dromaeo. Also see https://bugs.webkit.org/show_bug.cgi?id=80776. Is it because we want to compare the perf-test results with the results when we run Dromaeo manually on the browser?
Ryosuke Niwa
Comment 10 2012-03-14 17:13:01 PDT
(In reply to comment #9) > Is it because we want to compare the perf-test results with the results when we run Dromaeo manually on the browser? Right.
Kentaro Hara
Comment 11 2012-03-14 17:15:30 PDT
(In reply to comment #10) > (In reply to comment #9) > > Is it because we want to compare the perf-test results with the results when we run Dromaeo manually on the browser? > > Right. OK, then I'll take the approach [1]. (although I still think that [2] or [3] would be less confusing in total, IMO)
Ryosuke Niwa
Comment 12 2012-03-14 17:16:40 PDT
(In reply to comment #11) > OK, then I'll take the approach [1]. (although I still think that [2] or [3] would be less confusing in total, IMO) I think this is just an UI issue on perf-o-matic. We just need to put some information about whether larger is better or smaller is better for each graph (maybe just use different colors?).
Kentaro Hara
Comment 13 2012-03-14 18:56:44 PDT
Ryosuke Niwa
Comment 14 2012-03-14 23:56:02 PDT
Comment on attachment 131973 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=131973&action=review > Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:294 > results[score.group(1)] = float(score.group(2)) > + if score.group(3): > + unit = score.group(3) Can we used the named groups instead? e.g. "(?Pblah...)" then match.group('blah') > PerformanceTests/ChangeLog:16 > + Running Dromaeo/dom-attr.html (1 of 1) This > PerformanceTests/ChangeLog:19 > + Finished: 30.616529 s And this line are actually output into stderr so you should probably exclude them (you can confirm this by piping stdout).
Ryosuke Niwa
Comment 15 2012-03-14 23:56:47 PDT
I think we need to include the units in JSON file as well so that perf-o-matic can display them. Could you either file a new bug or address that in your patch as well?
Kentaro Hara
Comment 16 2012-03-15 00:16:27 PDT
Created attachment 131996 [details] patch for landing
WebKit Review Bot
Comment 17 2012-03-15 20:13:44 PDT
Comment on attachment 131996 [details] patch for landing Clearing flags on attachment: 131996 Committed r110933: <http://trac.webkit.org/changeset/110933>
Note You need to log in before you can comment on or make changes to this bug.