Bug 100030

Summary: run-perf-tests should have a --repeat option
Product: WebKit Reporter: Adam Barth <abarth>
Component: Tools / TestsAssignee: Glenn Adams <glenn>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, dpranke, eric, glenn, mkwst, ojan, rniwa, webkit.review.bot, zoltan
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 77037    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch for landing none

Description Adam Barth 2012-10-22 14:22:44 PDT
I often run perf tests 5 times with my patch and 5 times without.  It would be great to be able to pass --repeat 5 rather than running them manually.  Also, I'd like to label the runs with something meaningful to me rather than "r132115" since I use the same base revision both with and without my patch.

While I'm in wishlist land, I'd then like to be able to compare groups of runs in the same way that you can compare individual runs today.
Comment 1 Zoltan Horvath 2012-10-22 15:24:31 PDT
This seems 3 requests for me.

- repeat option
- grouping the repeated runs (How do you think summarize the results? )
- labeling option
Comment 2 Adam Barth 2012-10-22 15:30:42 PDT
Yes, sorry.  The other tings are just "nice to haves" on top of --repeat.
Comment 3 Glenn Adams 2012-10-23 19:13:14 PDT
+1
Comment 4 Mike West 2012-10-25 04:37:10 PDT
Created attachment 170614 [details]
Patch
Comment 5 Mike West 2012-10-25 04:41:25 PDT
Amusingly, this bug was the top hit when I searched for methods of repeating WebKit perf tests. :)

The attached patch does #1 and #3 from Zoltan's list: it adds a '--repeat' option which runs the specified set of tests X times, and it adds a '--runtitle' option which allows you to override each run's column title. If both are specified (e.g. '--runtitle=Clean --repeat=5'), then the column titles are iterated (e.g. 'Clean #1', 'Clean #2', and so on).

I'm not sure how grouping and summarization might work, but it should likely be done in a separate patch anyway.

WDYT?
Comment 6 Zoltan Horvath 2012-10-25 08:54:27 PDT
Comment on attachment 170614 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=170614&action=review

> Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:313
> +            _log.info('Running %s (%d of %d)%s' % (test.test_name(), expected + unexpected + 1, len(tests), (" [Run %d of %d]" % (self._currentRun, self._totalRuns) if self._totalRuns > 1 else "")))

It's like a bit perl stylish for me. I might write it to more lines using a variable. Not critical.

The change looks good to me.
Comment 7 Dirk Pranke 2012-10-25 10:30:55 PDT
Comment on attachment 170614 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=170614&action=review

Just a couple of minor things ...

> Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:71
> +        self._totalRuns = int(self._options.repeat)

You can set type="int" in make_option() below and avoid this cast.

> Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:176
> +                exit_code = self._generate_and_show_results()

I would split this into two routines so that you can merge in the results after each run but only show the results at the end.
Comment 8 Mike West 2012-10-25 10:55:54 PDT
Created attachment 170691 [details]
Patch
Comment 9 Mike West 2012-10-25 10:56:46 PDT
(In reply to comment #7)
> (From update of attachment 170614 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=170614&action=review
> 
> Just a couple of minor things ...
> 
> > Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:71
> > +        self._totalRuns = int(self._options.repeat)
> 
> You can set type="int" in make_option() below and avoid this cast.
> 
> > Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:176
> > +                exit_code = self._generate_and_show_results()
> 
> I would split this into two routines so that you can merge in the results after each run but only show the results at the end.

Thanks Dirk, Zoltan. I've addressed both of your comments.

WDYT?
Comment 10 Dirk Pranke 2012-10-25 11:11:41 PDT
Comment on attachment 170691 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=170691&action=review

> Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:179
> +                if self._currentRun == self._totalRuns:

Nit: I'd pull this out of the while loop and into a separate if block.

> Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:217
> +        if self._options.show_results:

Seems kinda silly to have if block inside this call, or even to have a separate function for this at all if it's just two lines ...
Comment 11 Mike West 2012-10-25 11:31:27 PDT
Created attachment 170694 [details]
Patch
Comment 12 Ryosuke Niwa 2012-10-25 11:58:35 PDT
Comment on attachment 170694 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=170694&action=review

> Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:123
> +            optparse.make_option("--runtitle",
> +                help="Specify a name for the run. If provided, this title will replace the WebKit revision in the result page's column header."),

There is already --description.
Comment 13 Mike West 2012-10-25 12:04:30 PDT
(In reply to comment #12)
> (From update of attachment 170694 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=170694&action=review
> 
> > Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:123
> > +            optparse.make_option("--runtitle",
> > +                help="Specify a name for the run. If provided, this title will replace the WebKit revision in the result page's column header."),
> 
> There is already --description.

'--description' doesn't overwrite the revision in the column header. It appends something additional to it. That's not a feature I personally find useful; replacement is what I'm looking for.

Perhaps we could compromise on a flag that toggled description's behavior between replace or append?
Comment 14 Ryosuke Niwa 2012-10-25 12:07:06 PDT
Comment on attachment 170694 [details]
Patch

Also, theere should be a test for this. Did you even run test-webkitpy?
Comment 15 Ryosuke Niwa 2012-10-25 12:08:50 PDT
(In reply to comment #13)
> (In reply to comment #12)
> > (From update of attachment 170694 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=170694&action=review
> > 
> > > Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:123
> > > +            optparse.make_option("--runtitle",
> > > +                help="Specify a name for the run. If provided, this title will replace the WebKit revision in the result page's column header."),
> > 
> > There is already --description.
> 
> '--description' doesn't overwrite the revision in the column header. It appends something additional to it. That's not a feature I personally find useful; replacement is what I'm looking for.

The problem is that you could be running multiple runs at multiple revisions with different descriptions. I'm fine with hiding revisions when all revisions are same but always overriding the revision number is not the right thing to do.
Comment 16 Mike West 2012-10-26 02:02:37 PDT
(In reply to comment #14)
> (From update of attachment 170694 [details])
> Also, theere should be a test for this. Did you even run test-webkitpy?

Oh, look. Tests. *facepalm*

Sorry, didn't see that at all. I'll clean this up today.
Comment 17 Mike West 2013-02-07 11:00:48 PST
Unassigning myself; let's be realistic about what I'm actually working on. :/
Comment 18 Glenn Adams 2013-03-06 22:38:50 PST
Created attachment 191921 [details]
Patch
Comment 19 Benjamin Poulain 2013-03-07 01:01:10 PST
After the recent changes of Ryosuke, I am not sure a complete "repeat" is what we want.

Wouldn't it be more interesting to have control over _process_run_count instead?

Adam, what is the reason you use to run tests 5 times? Was it due to the deviation between runs?
Comment 20 Glenn Adams 2013-03-07 06:53:15 PST
(In reply to comment #19)
> After the recent changes of Ryosuke, I am not sure a complete "repeat" is what we want.
> 
> Wouldn't it be more interesting to have control over _process_run_count instead?

these are orthogonal parameters. repeat controls how many times the outer test set is run, while process_run_count controls how many samples are taken when running a given test within the test set.

rniwa and I discussed this on IRC yesterday, and he is fine with the outer repeat. at his suggestion, i plan to expose the process_run_count as a separate option to permit external configuration.

> Adam, what is the reason you use to run tests 5 times? Was it due to the deviation between runs?

i guess you mean 'Glenn', not 'Adam', but if you are asking why i used 5 in the newly added tests, it is simply because i needed to pick a number greater than 1; i could have used 2 instead, or 10... so no significance
Comment 21 Glenn Adams 2013-03-07 06:55:04 PST
(In reply to comment #20)
> i plan to expose the process_run_count as a separate option to permit external configuration.

i should have added: in a different bug/patch
Comment 22 Benjamin Poulain 2013-03-07 10:12:11 PST
> > Adam, what is the reason you use to run tests 5 times? Was it due to the deviation between runs?
> 
> i guess you mean 'Glenn', not 'Adam', but if you are asking why i used 5 in the newly added tests, it is simply because i needed to pick a number greater than 1; i could have used 2 instead, or 10... so no significance

I meant Adam :)
In his comment, he said he runs the tests 5 times.
Comment 23 Glenn Adams 2013-03-07 10:17:59 PST
(In reply to comment #21)
> (In reply to comment #20)
> > i plan to expose the process_run_count as a separate option to permit external configuration.
> 
> i should have added: in a different bug/patch

see https://bugs.webkit.org/show_bug.cgi?id=111726
Comment 24 Glenn Adams 2013-03-07 10:20:22 PST
(In reply to comment #22)
> > > Adam, what is the reason you use to run tests 5 times? Was it due to the deviation between runs?
> > 
> > i guess you mean 'Glenn', not 'Adam', but if you are asking why i used 5 in the newly added tests, it is simply because i needed to pick a number greater than 1; i could have used 2 instead, or 10... so no significance
> 
> I meant Adam :)
> In his comment, he said he runs the tests 5 times.

tnx for clarifying... just wanted to make sure
Comment 25 Ryosuke Niwa 2013-03-07 10:33:00 PST
Comment on attachment 191921 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=191921&action=review

> Tools/ChangeLog:9
> +        Add --repeat option to run-perf-tests, with default value of 1, which runs
> +        test set repeat count times then uploads and shows results as appropriate.

I have a hard time following the phrase "which runs test set repeat count times". Revise?

> Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:182
> +        runCount = 0

Please don't use camelCase in PEP8 styled code. Use run_count. instead.

> Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:188
> +            runs = ' [Run %d of %d]' % (runCount, repeat) if repeat > 1 else ''

I would use regular parenthesis instead of square brackets unless there is a compelling reason not to.

> Tools/Scripts/webkitpy/performance_tests/perftestsrunner.py:251
> +            results_page_path = self._host.filesystem.splitext(output_json_path)[0] + '.html'

I don't want to se this code duplicated in two places. Maybe add _results_page_path() ?
Comment 26 Glenn Adams 2013-03-07 11:58:23 PST
Created attachment 192059 [details]
Patch for landing
Comment 27 Glenn Adams 2013-03-07 12:00:52 PST
(In reply to comment #26)
> Created an attachment (id=192059) [details]
> Patch for landing

address rniwa comment #25
Comment 28 Adam Barth 2013-03-07 13:29:57 PST
> Adam, what is the reason you use to run tests 5 times? Was it due to the deviation between runs?

It was so I could compute confidence intervals myself.  I understand that the tool can do that for me, but I'm a paranoid panda.
Comment 29 WebKit Review Bot 2013-03-07 16:40:50 PST
Comment on attachment 192059 [details]
Patch for landing

Clearing flags on attachment: 192059

Committed r145152: <http://trac.webkit.org/changeset/145152>
Comment 30 WebKit Review Bot 2013-03-07 16:40:56 PST
All reviewed patches have been landed.  Closing bug.