Bug 71801 - Benchmark using higher-precision timers
Summary: Benchmark using higher-precision timers
Status: UNCONFIRMED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-11-08 05:10 PST by Andy Wingo
Modified: 2022-02-27 23:23 PST (History)
10 users (show)

See Also:


Attachments
Patch (1.64 KB, patch)
2011-11-08 05:14 PST, Andy Wingo
no flags Details | Formatted Diff | Diff
Patch (2.20 KB, patch)
2011-11-21 01:28 PST, Andy Wingo
no flags Details | Formatted Diff | Diff
Patch (2.19 KB, patch)
2011-11-21 01:44 PST, Andy Wingo
mjs: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andy Wingo 2011-11-08 05:10:06 PST
jsc.cpp's `run' and `parseSyntax' implementations return their elapsed type as an integral number of milliseconds.  This throws away a lot of information, given that we actually measure the elapsed time using platform-specific high-precision timers.

The patch to be attached changes `run' and `parseSyntax' to return the number of milliseconds as a floating-point number.
Comment 1 Andy Wingo 2011-11-08 05:14:08 PST
Created attachment 114045 [details]
Patch
Comment 2 Andy Wingo 2011-11-08 05:14:22 PST
Compared to before, where these routines measured integral milliseconds, this looks to be a huge lose.  However it is more accurate.

TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           *1.087x as slow*  162.2ms +/- 1.0%   176.2ms +/- 1.0%     significant

=============================================================================

  3d:                  *1.091x as slow*   25.3ms +/- 5.2%    27.6ms +/- 5.7%     significant
    cube:              ??                 10.1ms +/- 12.7%    11.8ms +/- 13.6%     not conclusive: might be *1.164x as slow*
    morph:             *1.075x as slow*    7.2ms +/- 4.2%     7.7ms +/- 3.4%     significant
    raytrace:          *1.013x as slow*    8.0ms +/- 0.0%     8.1ms +/- 0.8%     significant

  access:              *1.076x as slow*   21.2ms +/- 2.1%    22.8ms +/- 1.2%     significant
    binary-trees:      *1.112x as slow*    2.0ms +/- 0.0%     2.2ms +/- 1.0%     significant
    fannkuch:          ??                 10.9ms +/- 2.1%    11.2ms +/- 2.5%     not conclusive: might be *1.026x as slow*
    nbody:             *1.115x as slow*    5.3ms +/- 6.5%     5.9ms +/- 1.1%     significant
    nsieve:            *1.162x as slow*    3.0ms +/- 0.0%     3.5ms +/- 2.6%     significant

  bitops:              *1.143x as slow*   14.3ms +/- 2.4%    16.3ms +/- 1.7%     significant
    3bit-bits-in-byte: *1.123x as slow*    2.0ms +/- 0.0%     2.2ms +/- 0.9%     significant
    bits-in-byte:      *1.101x as slow*    5.2ms +/- 5.8%     5.7ms +/- 3.6%     significant
    bitwise-and:       *1.102x as slow*    3.0ms +/- 0.0%     3.3ms +/- 2.4%     significant
    nsieve-bits:       *1.24x as slow*     4.1ms +/- 5.5%     5.1ms +/- 2.8%     significant

  controlflow:         *1.93x as slow*     1.0ms +/- 0.0%     1.9ms +/- 5.2%     significant
    recursive:         *1.93x as slow*     1.0ms +/- 0.0%     1.9ms +/- 5.2%     significant

  crypto:              *1.156x as slow*   10.1ms +/- 2.2%    11.7ms +/- 2.2%     significant
    aes:               *1.088x as slow*    6.0ms +/- 0.0%     6.5ms +/- 0.5%     significant
    md5:               *1.38x as slow*     2.1ms +/- 10.8%     2.9ms +/- 6.9%     significant
    sha1:              *1.131x as slow*    2.0ms +/- 0.0%     2.3ms +/- 1.6%     significant

  date:                *1.060x as slow*   18.1ms +/- 1.2%    19.2ms +/- 0.9%     significant
    format-tofte:      *1.039x as slow*   10.1ms +/- 2.2%    10.5ms +/- 1.2%     significant
    format-xparb:      *1.088x as slow*    8.0ms +/- 0.0%     8.7ms +/- 1.7%     significant

  math:                *1.095x as slow*   16.1ms +/- 1.4%    17.6ms +/- 1.0%     significant
    cordic:            *1.165x as slow*    5.1ms +/- 4.4%     5.9ms +/- 1.1%     significant
    partial-sums:      *1.041x as slow*    8.0ms +/- 0.0%     8.3ms +/- 1.7%     significant
    spectral-norm:     *1.120x as slow*    3.0ms +/- 0.0%     3.4ms +/- 3.1%     significant

  regexp:              *1.066x as slow*   12.3ms +/- 2.8%    13.1ms +/- 1.5%     significant
    dna:               *1.066x as slow*   12.3ms +/- 2.8%    13.1ms +/- 1.5%     significant

  string:              *1.049x as slow*   43.8ms +/- 1.0%    45.9ms +/- 0.7%     significant
    base64:            *1.080x as slow*    4.0ms +/- 0.0%     4.3ms +/- 4.2%     significant
    fasta:             *1.106x as slow*    6.2ms +/- 4.9%     6.9ms +/- 1.4%     significant
    tagcloud:          ??                 11.0ms +/- 0.0%    11.3ms +/- 3.2%     not conclusive: might be *1.028x as slow*
    unpack-code:       *1.024x as slow*   17.6ms +/- 2.1%    18.0ms +/- 0.6%     significant
    validate-input:    *1.086x as slow*    5.0ms +/- 0.0%     5.4ms +/- 2.0%     significant
Comment 3 Andy Wingo 2011-11-08 05:17:11 PST
As you would expect, the effect on v8-v4 is much less:


TEST              COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:      *1.011x as slow*  1074.8ms +/- 0.6%   1087.0ms +/- 0.4%     significant

=============================================================================

  v8:             *1.011x as slow*  1074.8ms +/- 0.6%   1087.0ms +/- 0.4%     significant
    crypto:       ??                 186.0ms +/- 1.3%    186.3ms +/- 0.7%     not conclusive: might be *1.002x as slow*
    deltablue:    *1.012x as slow*   255.7ms +/- 0.7%    258.7ms +/- 0.2%     significant
    earley-boyer: *1.011x as slow*   100.9ms +/- 0.7%    102.0ms +/- 0.6%     significant
    raytrace:     *1.019x as slow*    64.9ms +/- 1.0%     66.2ms +/- 0.3%     significant
    regexp:       ??                 111.4ms +/- 0.8%    113.6ms +/- 2.4%     not conclusive: might be *1.020x as slow*
    richards:     *1.013x as slow*   219.0ms +/- 0.6%    221.9ms +/- 0.6%     significant
    splay:        *1.010x as slow*   136.9ms +/- 0.4%    138.3ms +/- 0.3%     significant
Comment 4 Andy Wingo 2011-11-09 02:01:00 PST
Adding JSC and SunSpider folk to the cc.
Comment 5 Filip Pizlo 2011-11-09 12:32:41 PST
You could also use Tools/Scripts/bencher, which does the same thing.
Comment 6 Filip Pizlo 2011-11-09 12:40:12 PST
(In reply to comment #2)
> Compared to before, where these routines measured integral milliseconds, this looks to be a huge lose.  However it is more accurate.
> 
> TEST                   COMPARISON            FROM                 TO             DETAILS
> 
> =============================================================================
> 
> ** TOTAL **:           *1.087x as slow*  162.2ms +/- 1.0%   176.2ms +/- 1.0%     significant

I'd be fine with this patch except for this giant slow-down.  If you want this functionality, you should probably investigate what happened here.

Tools/Scripts/bencher uses our internal preciseTime() function, which returns an untruncated time.  So if you want more precise results, you could just use that.
Comment 7 Andy Wingo 2011-11-21 01:23:37 PST
The "slowdown" is not a slowdown in actual fact; it simply reflects increased precision when measuring times.  Because the original sunspider harness measures integral milliseconds only, produced by truncating (not rounding) the double-precision time values, it is natural to expect that the higher-precision times indicate longer test runs.
Comment 8 Andy Wingo 2011-11-21 01:28:23 PST
Created attachment 116051 [details]
Patch
Comment 9 Andy Wingo 2011-11-21 01:30:41 PST
Updated the driver to use preciseTime() instead of jsc's run().  Note that this does change the default environment of the test code; `run' uses a fresh global, whereas `load' exposes the current global to the loaded code.  We do want `load' behavior in the data case though, and having them unified makes a little more sense.

I promise I'll switch to bencher soon, just wanted to follow up with this ;-)
Comment 10 Andy Wingo 2011-11-21 01:44:52 PST
Created attachment 116053 [details]
Patch
Comment 11 Andy Wingo 2011-11-21 01:45:26 PST
Comment on attachment 116053 [details]
Patch

Fix log message, set cq?
Comment 12 Yuqiang Xian 2011-11-23 17:17:37 PST
Does it mean that the other JS engines (e.g., V8, Mozilla) also need to provide preciseTime() interface, if we want to run them using the scripts?
Comment 13 Filip Pizlo 2011-11-23 17:32:38 PST
(In reply to comment #12)
> Does it mean that the other JS engines (e.g., V8, Mozilla) also need to provide preciseTime() interface, if we want to run them using the scripts?

The approach taken by bencher is that it only uses preciseTime() if all VMs being tested provide it. 

I tend to think that the run() function and the Date API should be consistent across VMs, and especially with the Date API, it's unclear if we could make that more precise without breaking compliance.
Comment 14 Andy Wingo 2011-11-24 04:34:05 PST
(In reply to comment #12)
> Does it mean that the other JS engines (e.g., V8, Mozilla) also need to provide preciseTime() interface, if we want to run them using the scripts?

This is a moot point, as our sunspider harness currently uses `run', which is not available on V8 at least.  Dunno about SpiderMonkey.
Comment 15 Eric Seidel (no email) 2011-12-19 11:52:02 PST
CCing other JSC experts who might be willing to review this.
Comment 16 Andy Wingo 2012-01-10 07:03:03 PST
Filip, Geoff, Gavin: would someone care to review this please?  I think I adequately addressed the concerns (notably, that it's not an actual slowdown, and we're already using jsc-specific API).

Thanks,

Andy
Comment 17 Andy Wingo 2012-01-18 03:07:51 PST
(In reply to comment #16)
> Filip, Geoff, Gavin: would someone care to review this please?  I think I adequately addressed the concerns (notably, that it's not an actual slowdown, and we're already using jsc-specific API).

Kind ping :-)
Comment 18 Andy Wingo 2012-02-09 08:16:33 PST
(In reply to comment #17)
> (In reply to comment #16)
> > Filip, Geoff, Gavin: would someone care to review this please?  I think I adequately addressed the concerns (notably, that it's not an actual slowdown, and we're already using jsc-specific API).
> 
> Kind ping :-)

Another kind ping :)
Comment 19 Eric Seidel (no email) 2012-02-09 10:08:31 PST
Comment on attachment 116053 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=116053&action=review

> PerformanceTests/SunSpider/resources/sunspider-standalone-driver.js:-54
> -            times[j] = run(testName);

Where did this line go?
Comment 20 Eric Seidel (no email) 2012-02-09 10:09:33 PST
MJS has been the historical keeper of sunspider.  i woudl be happy to approve this if I was convinced there was no behavior change other than using a more precise timer.  Also, doesn't v8-builds use this file as well?
Comment 21 Andy Wingo 2012-02-10 00:56:49 PST
(In reply to comment #19)
> (From update of attachment 116053 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=116053&action=review
> 
> > PerformanceTests/SunSpider/resources/sunspider-standalone-driver.js:-54
> > -            times[j] = run(testName);
> 
> Where did this line go?

It went away.  Instead we use preciseTime() before and after.

(In reply to comment #20)
> MJS has been the historical keeper of sunspider.  i woudl be happy to approve this if I was convinced there was no behavior change other than using a more precise timer.  Also, doesn't v8-builds use this file as well?

V8 version 3.8.6 (candidate) [console: readline]
d8> run
(d8):1: ReferenceError: run is not defined
run
^

So we are already using JSC-specific API here; I don't see how v8 builds could be running this file.

Regards,

Andy
Comment 22 Tony Gentilcore 2012-02-10 02:04:42 PST
> MJS has been the historical keeper of sunspider.
I'd be really interested to hear his thoughts on this as well.

Please forgive the naive question, but I'm wondering, does this affect what runs here: http://www.webkit.org/perf/sunspider-0.9.1/sunspider-0.9.1/driver.html ?

If so, don't we need to keep it cross browser compatible?

If not, are we concerned that the results used within the project are not comparable to the public ones? Should we maintain support for timing the same way as the public one? Should the results on rniwa's perf dashboard match internal or public?

Also worth noting there is talk in the public-web-perf group about adding a high-res monotonically increasing timer to the web platform -- likely as window.performance.now() -- see bug 66684.
Comment 23 Andy Wingo 2012-02-10 03:58:37 PST
(In reply to comment #22)
>  does this affect what runs here: http://www.webkit.org/perf/sunspider-0.9.1/sunspider-0.9.1/driver.html ?

I don't think so, no -- those runs are "hosted", whereas this change is to the "standalone" driver.

> If not, are we concerned that the results used within the project are not comparable to the public ones?

Good question.  My thought is that while hosted and standalone timings should be similar, they have never been the same, for many good reasons.  I wouldn't worry about it -- especially if, as you note, a window.performance.now() lands.  But, Maciej's (and anyone else's) opinions are welcome here.

Cheers,

Andy
Comment 24 Andy Wingo 2012-02-27 03:07:05 PST
Is anyone willing to review this patch?
Comment 25 Ryosuke Niwa 2012-02-27 08:04:13 PST
(In reply to comment #24)
> Is anyone willing to review this patch?

As Eric mentioned, Maciej is probably the only person who can review this patch.
Comment 26 Maciej Stachowiak 2012-02-29 12:20:59 PST
Comment on attachment 116053 [details]
Patch

We sometimes use command-line SunSpider to compare to other JS engines. So making a JSC-specific change that materially alters the timings would be unfortunate. It's great that it is better precision, but it shows a ~9% hit that does not affect the other engines. Also, as written, it has no fallback path.

Therefore, I suggest making the new, more precise timing an optional mode explicitly invoked by a command-line argument to the script. This would allow the option of more precise measurements without losing the cross-engine comparison features of the old mode.
Comment 27 Andy Wingo 2012-02-29 12:47:10 PST
Maciej!  Thank you for commenting.

(In reply to comment #26)
> (From update of attachment 116053 [details])
> We sometimes use command-line SunSpider to compare to other JS engines.

Evidently not very often :)

$ ./sunspider --shell ~/src/v8/d8
Found 26 tests
Running SunSpider once for warmup, then 10 times
Discarded first run.
[resources/sunspider-standalone-driver.js:54: ReferenceError: run is not defined
            times[j] = run(testName);
                       ^
ReferenceError: run is not defined

I mentioned this in comment 14.

Does this change your opinion on the matter?  Should we re-add a portable path?  The code has been non-portable since November 2010, with r72842.

Andy
Comment 28 Filip Pizlo 2012-02-29 13:58:37 PST
(In reply to comment #27)
> Maciej!  Thank you for commenting.
> 
> (In reply to comment #26)
> > (From update of attachment 116053 [details] [details])
> > We sometimes use command-line SunSpider to compare to other JS engines.
> 
> Evidently not very often :)
> 
> $ ./sunspider --shell ~/src/v8/d8
> Found 26 tests
> Running SunSpider once for warmup, then 10 times
> Discarded first run.
> [resources/sunspider-standalone-driver.js:54: ReferenceError: run is not defined
>             times[j] = run(testName);
>                        ^
> ReferenceError: run is not defined
> 
> I mentioned this in comment 14.
> 
> Does this change your opinion on the matter?  Should we re-add a portable path?  The code has been non-portable since November 2010, with r72842.

Another way to look at Maciej's comments is that with this change, it would be very difficult to compare old versions of jsc to new ones.

I do this all the time, so this change would be a showstopper for me.
Comment 29 Andy Wingo 2012-02-29 14:07:11 PST
Hi Filip :)

(In reply to comment #28)
> Another way to look at Maciej's comments is that with this change, it would be very difficult to compare old versions of jsc to new ones.

I certainly do not mean to disrespect the voice of experience :)  However, I wonder if I am communicating clearly: this patch does /not/ change the behavior of `run' in jsc.cpp; it instead changes the sunspider harness to use preciseTime, which is present in very old jsc.
Comment 30 Eric Seidel (no email) 2012-02-29 14:13:04 PST
It looks like Filip added preciseTime:
http://trac.webkit.org/changeset/91817
Comment 31 Ryosuke Niwa 2012-02-29 14:40:59 PST
I support Maicej's point that we shouldn't make this change unless it's optional. And yes, we should be fixing those exceptions so that it runs on all major browsers.
Comment 32 Filip Pizlo 2012-02-29 14:51:31 PST
(In reply to comment #29)
> Hi Filip :)
> 
> (In reply to comment #28)
> > Another way to look at Maciej's comments is that with this change, it would be very difficult to compare old versions of jsc to new ones.
> 
> I certainly do not mean to disrespect the voice of experience :)  However, I wonder if I am communicating clearly: this patch does /not/ change the behavior of `run' in jsc.cpp; it instead changes the sunspider harness to use preciseTime, which is present in very old jsc.

And I like to compare to very old jsc.