78303 – [PerformanceTests] run-perf-tests should be able to handle Hz-based score.

RESOLVED FIXED 78303

[PerformanceTests] run-perf-tests should be able to handle Hz-based score.

https://bugs.webkit.org/show_bug.cgi?id=78303

Summary [PerformanceTests] run-perf-tests should be able to handle Hz-based score.

Hajime Morrita

Reported 2012-02-09 17:54:34 PST

Dromaeo reports score (run/s) by default. I should report time taken.

Attachments
Patch (6.53 KB, patch) 2012-03-14 18:56 PDT, Kentaro Hara	no flags	Details Formatted Diff Diff
patch for landing (6.47 KB, patch) 2012-03-15 00:16 PDT, Kentaro Hara	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Hajime Morrita

Comment 1 2012-02-09 21:21:16 PST

I'd like to close this as WONTFIX beucause: - We already have data. Changing the dimension will kill these data. - What we area doing is how the official Dromaeo counts. We can understand our score as "Dromaeo score" even though it differs from how other tests is doing. - Since the tests run so fast, we need to give some scale. But determining such scale is a tricky job. Currnet "run/s" is further simpler. Ryosuke, what do you think?

Ryosuke Niwa

Comment 2 2012-02-09 22:59:02 PST

I think we need to update run-perf-tests so that it doesn't it doesn't erroneously add "ms" at the end. It's extremely confusing. The correct unit for these values is "Hz".

Ryosuke Niwa

Comment 3 2012-02-09 23:02:44 PST

By the way, having historical data is okay though. We've already had a bunch of bogus data due to the bot issues so switching the unit type at this point shouldn't be that big of a deal. Also, I'm thinking that we can add collect historical data for tests once tests have stabilized so I don't think losing historical data now should affect our decision here.

Hajime Morrita

Comment 4 2012-02-09 23:11:26 PST

(In reply to comment #2) > I think we need to update run-perf-tests so that it doesn't it doesn't erroneously add "ms" at the end. It's extremely confusing. The correct unit for these values is "Hz". Good point. I changed the bug title. BTW, I personally prefer hz over ms. I guess Ojan has some opinion here. (IIRC I read he talking about that somewhere.)

Adam Barth

Comment 5 2012-02-09 23:19:36 PST

Either way is fine as long as it's clearly labeled on the graphs and the text output whether bigger-is-better or smaller-is-better.

Ryosuke Niwa

Comment 6 2012-02-09 23:23:17 PST

(In reply to comment #5) > Either way is fine as long as it's clearly labeled on the graphs and the text output whether bigger-is-better or smaller-is-better. For labeling, we need upstream changes to perf-o-matic since we currently don't have any notion of units :(

Kentaro Hara

Comment 7 2012-03-14 16:59:40 PDT

Reporting "runs/s" as "ms" is extremely confusing (Actually I've so far given up a few patches because I thought they "regressed" performance). We should fix it somehow. As far as I see the discussion, the possible fixes are as follows: [1] Just change the label from "ms" to "Hz". (Nit: I would prefer "runs/s" to "Hz" since the official Dromaeo is using "runs/s") [2] Convert "runs/s" measured by Dromaeo to "ms", and report "ms". (patch: https://bugs.webkit.org/show_bug.cgi?id=80974) [3] Run Dromaeo with the "runStyle=ms" mode, by which Dromaeo measures "ms" instead of "runs/s". Then report "ms". IMHO, - I do not like [1]. Mixing "larger-is-better" graphs and "smaller-is-better" graphs is really confusing. If we do not care historical data, I guess we can avoid [1]. - Both [2] and [3] sound good to me, but [2] would be better. In case of [3], we need to slightly modify webrunner.js to run the "runStyle=ms" mode with the same scale as the "runStyle=runs/s" mode (patch: https://bugs.webkit.org/attachment.cgi?id=131576&action=review). On the other hand, the disadvantage of [2] is that we cannot output the median and stdev, since we cannot calculate the median and stdev of "ms" from those of "runs/s". WDYT?

Ryosuke Niwa

Comment 8 2012-03-14 17:06:49 PDT

(In reply to comment #7) > Reporting "runs/s" as "ms" is extremely confusing (Actually I've so far given up a few patches because I thought they "regressed" performance). We should fix it somehow. As far as I see the discussion, the possible fixes are as follows: > > [1] Just change the label from "ms" to "Hz". (Nit: I would prefer "runs/s" to "Hz" since the official Dromaeo is using "runs/s") > [2] Convert "runs/s" measured by Dromaeo to "ms", and report "ms". (patch: https://bugs.webkit.org/show_bug.cgi?id=80974) > [3] Run Dromaeo with the "runStyle=ms" mode, by which Dromaeo measures "ms" instead of "runs/s". Then report "ms". > > - I do not like [1]. Mixing "larger-is-better" graphs and "smaller-is-better" graphs is really confusing. If we do not care historical data, I guess we can avoid [1]. > - Both [2] and [3] sound good to me, but [2] would be better. In case of [3], we need to slightly modify webrunner.js to run the "runStyle=ms" mode with the same scale as the "runStyle=runs/s" mode (patch: https://bugs.webkit.org/attachment.cgi?id=131576&action=review). On the other hand, the disadvantage of [2] is that we cannot output the median and stdev, since we cannot calculate the median and stdev of "ms" from those of "runs/s". Let's not do [2] or [3]. It's extremely confusing to invent our own metrics for Dromaeo. Also see https://bugs.webkit.org/show_bug.cgi?id=80776.

Kentaro Hara

Comment 9 2012-03-14 17:10:42 PDT

(In reply to comment #8) > Let's not do [2] or [3]. It's extremely confusing to invent our own metrics for Dromaeo. Also see https://bugs.webkit.org/show_bug.cgi?id=80776. Is it because we want to compare the perf-test results with the results when we run Dromaeo manually on the browser?

Ryosuke Niwa

Comment 10 2012-03-14 17:13:01 PDT

(In reply to comment #9) > Is it because we want to compare the perf-test results with the results when we run Dromaeo manually on the browser? Right.

Kentaro Hara

Comment 11 2012-03-14 17:15:30 PDT

(In reply to comment #10) > (In reply to comment #9) > > Is it because we want to compare the perf-test results with the results when we run Dromaeo manually on the browser? > > Right. OK, then I'll take the approach [1]. (although I still think that [2] or [3] would be less confusing in total, IMO)

Ryosuke Niwa

Comment 12 2012-03-14 17:16:40 PDT

(In reply to comment #11) > OK, then I'll take the approach [1]. (although I still think that [2] or [3] would be less confusing in total, IMO) I think this is just an UI issue on perf-o-matic. We just need to put some information about whether larger is better or smaller is better for each graph (maybe just use different colors?).

Kentaro Hara

Comment 13 2012-03-14 18:56:44 PDT

Created attachment 131973 [details] Patch

Ryosuke Niwa

Comment 14 2012-03-14 23:56:02 PDT

Comment on attachment 131973 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=131973&action=review > Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:294 > results[score.group(1)] = float(score.group(2)) > + if score.group(3): > + unit = score.group(3) Can we used the named groups instead? e.g. "(?Pblah...)" then match.group('blah') > PerformanceTests/ChangeLog:16 > + Running Dromaeo/dom-attr.html (1 of 1) This > PerformanceTests/ChangeLog:19 > + Finished: 30.616529 s And this line are actually output into stderr so you should probably exclude them (you can confirm this by piping stdout).

Ryosuke Niwa

Comment 15 2012-03-14 23:56:47 PDT

I think we need to include the units in JSON file as well so that perf-o-matic can display them. Could you either file a new bug or address that in your patch as well?

Kentaro Hara

Comment 16 2012-03-15 00:16:27 PDT

Created attachment 131996 [details] patch for landing

WebKit Review Bot

Comment 17 2012-03-15 20:13:44 PDT

Comment on attachment 131996 [details] patch for landing Clearing flags on attachment: 131996 Committed r110933: <http://trac.webkit.org/changeset/110933>

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee

Hajime Morrita

Reported

2012-02-09 17:54 PST

Modified

2012-03-15 20:25 PDT History

CC List

5 users Show

URL

Keywords

Depends on

Blocks

76156

Dependencies

tree graph