Dromaeo reports score (run/s) by default. I should report time taken.
I'd like to close this as WONTFIX beucause: - We already have data. Changing the dimension will kill these data. - What we area doing is how the official Dromaeo counts. We can understand our score as "Dromaeo score" even though it differs from how other tests is doing. - Since the tests run so fast, we need to give some scale. But determining such scale is a tricky job. Currnet "run/s" is further simpler. Ryosuke, what do you think?
I think we need to update run-perf-tests so that it doesn't it doesn't erroneously add "ms" at the end. It's extremely confusing. The correct unit for these values is "Hz".
By the way, having historical data is okay though. We've already had a bunch of bogus data due to the bot issues so switching the unit type at this point shouldn't be that big of a deal. Also, I'm thinking that we can add collect historical data for tests once tests have stabilized so I don't think losing historical data now should affect our decision here.
(In reply to comment #2) > I think we need to update run-perf-tests so that it doesn't it doesn't erroneously add "ms" at the end. It's extremely confusing. The correct unit for these values is "Hz". Good point. I changed the bug title. BTW, I personally prefer hz over ms. I guess Ojan has some opinion here. (IIRC I read he talking about that somewhere.)
Either way is fine as long as it's clearly labeled on the graphs and the text output whether bigger-is-better or smaller-is-better.
(In reply to comment #5) > Either way is fine as long as it's clearly labeled on the graphs and the text output whether bigger-is-better or smaller-is-better. For labeling, we need upstream changes to perf-o-matic since we currently don't have any notion of units :(
Reporting "runs/s" as "ms" is extremely confusing (Actually I've so far given up a few patches because I thought they "regressed" performance). We should fix it somehow. As far as I see the discussion, the possible fixes are as follows: [1] Just change the label from "ms" to "Hz". (Nit: I would prefer "runs/s" to "Hz" since the official Dromaeo is using "runs/s") [2] Convert "runs/s" measured by Dromaeo to "ms", and report "ms". (patch: https://bugs.webkit.org/show_bug.cgi?id=80974) [3] Run Dromaeo with the "runStyle=ms" mode, by which Dromaeo measures "ms" instead of "runs/s". Then report "ms". IMHO, - I do not like [1]. Mixing "larger-is-better" graphs and "smaller-is-better" graphs is really confusing. If we do not care historical data, I guess we can avoid [1]. - Both [2] and [3] sound good to me, but [2] would be better. In case of [3], we need to slightly modify webrunner.js to run the "runStyle=ms" mode with the same scale as the "runStyle=runs/s" mode (patch: https://bugs.webkit.org/attachment.cgi?id=131576&action=review). On the other hand, the disadvantage of [2] is that we cannot output the median and stdev, since we cannot calculate the median and stdev of "ms" from those of "runs/s". WDYT?
(In reply to comment #7) > Reporting "runs/s" as "ms" is extremely confusing (Actually I've so far given up a few patches because I thought they "regressed" performance). We should fix it somehow. As far as I see the discussion, the possible fixes are as follows: > > [1] Just change the label from "ms" to "Hz". (Nit: I would prefer "runs/s" to "Hz" since the official Dromaeo is using "runs/s") > [2] Convert "runs/s" measured by Dromaeo to "ms", and report "ms". (patch: https://bugs.webkit.org/show_bug.cgi?id=80974) > [3] Run Dromaeo with the "runStyle=ms" mode, by which Dromaeo measures "ms" instead of "runs/s". Then report "ms". > > - I do not like [1]. Mixing "larger-is-better" graphs and "smaller-is-better" graphs is really confusing. If we do not care historical data, I guess we can avoid [1]. > - Both [2] and [3] sound good to me, but [2] would be better. In case of [3], we need to slightly modify webrunner.js to run the "runStyle=ms" mode with the same scale as the "runStyle=runs/s" mode (patch: https://bugs.webkit.org/attachment.cgi?id=131576&action=review). On the other hand, the disadvantage of [2] is that we cannot output the median and stdev, since we cannot calculate the median and stdev of "ms" from those of "runs/s". Let's not do [2] or [3]. It's extremely confusing to invent our own metrics for Dromaeo. Also see https://bugs.webkit.org/show_bug.cgi?id=80776.
(In reply to comment #8) > Let's not do [2] or [3]. It's extremely confusing to invent our own metrics for Dromaeo. Also see https://bugs.webkit.org/show_bug.cgi?id=80776. Is it because we want to compare the perf-test results with the results when we run Dromaeo manually on the browser?
(In reply to comment #9) > Is it because we want to compare the perf-test results with the results when we run Dromaeo manually on the browser? Right.
(In reply to comment #10) > (In reply to comment #9) > > Is it because we want to compare the perf-test results with the results when we run Dromaeo manually on the browser? > > Right. OK, then I'll take the approach [1]. (although I still think that [2] or [3] would be less confusing in total, IMO)
(In reply to comment #11) > OK, then I'll take the approach [1]. (although I still think that [2] or [3] would be less confusing in total, IMO) I think this is just an UI issue on perf-o-matic. We just need to put some information about whether larger is better or smaller is better for each graph (maybe just use different colors?).
Created attachment 131973 [details] Patch
Comment on attachment 131973 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=131973&action=review > Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:294 > results[score.group(1)] = float(score.group(2)) > + if score.group(3): > + unit = score.group(3) Can we used the named groups instead? e.g. "(?Pblah...)" then match.group('blah') > PerformanceTests/ChangeLog:16 > + Running Dromaeo/dom-attr.html (1 of 1) This > PerformanceTests/ChangeLog:19 > + Finished: 30.616529 s And this line are actually output into stderr so you should probably exclude them (you can confirm this by piping stdout).
I think we need to include the units in JSON file as well so that perf-o-matic can display them. Could you either file a new bug or address that in your patch as well?
Created attachment 131996 [details] patch for landing
Comment on attachment 131996 [details] patch for landing Clearing flags on attachment: 131996 Committed r110933: <http://trac.webkit.org/changeset/110933>