Bug 96916 - Make the old JIT not rely on the timeout check register
Summary: Make the old JIT not rely on the timeout check register
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 79040
  Show dependency treegraph
 
Reported: 2012-09-17 06:32 PDT by Gabor Ballabas
Modified: 2012-12-05 14:33 PST (History)
12 users (show)

See Also:


Attachments
Patch (12.76 KB, patch)
2012-09-17 06:38 PDT, Gabor Ballabas
no flags Details | Formatted Diff | Diff
Updated patch (12.82 KB, patch)
2012-09-27 09:33 PDT, Gabor Ballabas
no flags Details | Formatted Diff | Diff
Rebased patch (12.80 KB, patch)
2012-11-12 04:32 PST, Gabor Ballabas
ggaren: review-
gaborb: commit-queue?
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Gabor Ballabas 2012-09-17 06:32:18 PDT
First step to solve this DFG-JIT related bug: https://bugs.webkit.org/show_bug.cgi?id=79040
Comment 1 Gabor Ballabas 2012-09-17 06:38:26 PDT
Created attachment 164384 [details]
Patch

MIPS and SH4 versions of the emitTimeoutCheck function has been left as they were (with added
ifdef guards) due to lack of any kind of device or buildbot for testing the changes.
Comment 2 Gergely Kis 2012-09-18 13:02:40 PDT
Add Balázs and myself to the CC list to track for MIPS support.
Comment 3 Gabor Ballabas 2012-09-27 09:33:45 PDT
Created attachment 166020 [details]
Updated patch
Comment 4 Csaba Osztrogonác 2012-10-30 09:36:56 PDT
Ping for review?
Comment 5 Csaba Osztrogonác 2012-11-05 10:08:02 PST
(In reply to comment #3)
> Created an attachment (id=166020) [details]
> Updated patch

Unfortunately it isn't applyable now. Could you rebase the patch to ToT, please?
Comment 6 Gabor Ballabas 2012-11-12 04:32:45 PST
Created attachment 173616 [details]
Rebased patch
Comment 7 Zoltan Herczeg 2012-11-12 04:41:38 PST
Comment on attachment 173616 [details]
Rebased patch

View in context: https://bugs.webkit.org/attachment.cgi?id=173616&action=review

> Source/JavaScriptCore/ChangeLog:9
> +        First step to solve this DFG-JIT related
> +        bug:https://bugs.webkit.org/show_bug.cgi?id=79040

Perhaps a little more explanation would be good.

Otherwise, the patch looks good. Oliver, Filip, anybody any thoughts?
Comment 8 Geoffrey Garen 2012-11-12 10:26:42 PST
How does this relate to bug 79040? What is the plan you're pursuing for restoring timeout checking?

Is this just a subset of the patch in bug 79040? We can't evaluate that patch until you post performance numbers for it.
Comment 9 Zoltan Herczeg 2012-11-12 10:44:04 PST
(In reply to comment #8)
> How does this relate to bug 79040? What is the plan you're pursuing for restoring timeout checking?

We discussed this topic with Filip Pizlo before. We should either support timeout checking everywhere or remove the feature completly (for consistency). We decided to keep the feature that time.

> Is this just a subset of the patch in bug 79040? We can't evaluate that patch until you post performance numbers for it.

Yes, this is the first part. We free the timeout register first so DFG-JIT could use it. Sorry for the missing performance numbers. Gabor, could you copy them here?
Comment 10 Geoffrey Garen 2012-11-12 13:10:01 PST
> Yes, this is the first part.

So your plan is to do timeout checking by polling an in-memory counter at loop boundaries in the interpreter, the baseline JIT, and the DFG?
Comment 11 Zoltan Herczeg 2012-11-12 20:49:24 PST
> So your plan is to do timeout checking by polling an in-memory counter at loop boundaries in the interpreter, the baseline JIT, and the DFG?

Yes. That is a consistent approach. Or perhaps do you have any better idea?
Comment 12 Geoffrey Garen 2012-11-12 20:55:50 PST
> > So your plan is to do timeout checking by polling an in-memory counter at loop boundaries in the interpreter, the baseline JIT, and the DFG?
> 
> Yes. That is a consistent approach. Or perhaps do you have any better idea?

That depends on the performance of polling an in-memory counter. For example, we could use a flag and an alarm to avoid having to read-modify-write the counter. Or we could poll in the interpreter and baseline JIT, where performance matters less, and use some other solution in the DFG.

If there's no performance cost to a read-modify-write counter in all engines, that seems like a fine approach. But I suspect there will be a performance cost. Have you tested this?
Comment 13 Gabor Ballabas 2012-11-13 04:35:31 PST
Sunspider results:

Sunspider:

TEST                   COMPARISON            r134387                 Patch           DETAILS

=============================================================================

** TOTAL **:           *1.003x as slow*  205.8ms +/- 0.1%   206.3ms +/- 0.1%     significant

=============================================================================

  3d:                  *1.005x as slow*   31.2ms +/- 0.3%    31.3ms +/- 0.4%     significant
    cube:              ??                 11.0ms +/- 0.2%    11.1ms +/- 0.7%     not conclusive: might be *1.005x as slow*
    morph:             *1.012x as slow*    9.0ms +/- 0.4%     9.2ms +/- 0.8%     significant
    raytrace:          -                  11.1ms +/- 0.6%    11.1ms +/- 0.6% 

  access:              -                  18.0ms +/- 0.1%    18.0ms +/- 0.0% 
    binary-trees:      -                   2.0ms +/- 0.0%     2.0ms +/- 0.0% 
    fannkuch:          -                   8.0ms +/- 0.2%     8.0ms +/- 0.0% 
    nbody:             -                   4.0ms +/- 0.0%     4.0ms +/- 0.0% 
    nsieve:            -                   4.0ms +/- 0.0%     4.0ms +/- 0.0% 

  bitops:              -                  12.5ms +/- 0.8%    12.4ms +/- 0.8% 
    3bit-bits-in-byte: -                   1.0ms +/- 0.0%     1.0ms +/- 0.0% 
    bits-in-byte:      -                   5.5ms +/- 1.8%     5.4ms +/- 1.8% 
    bitwise-and:       -                   2.0ms +/- 0.0%     2.0ms +/- 0.0% 
    nsieve-bits:       -                   4.0ms +/- 0.0%     4.0ms +/- 0.0% 

  controlflow:         -                   2.0ms +/- 0.0%     2.0ms +/- 0.0% 
    recursive:         -                   2.0ms +/- 0.0%     2.0ms +/- 0.0% 

  crypto:              1.002x as fast     17.0ms +/- 0.0%    17.0ms +/- 0.2%     significant
    aes:               1.004x as fast     10.0ms +/- 0.0%    10.0ms +/- 0.4%     significant
    md5:               -                   4.0ms +/- 0.0%     4.0ms +/- 0.0% 
    sha1:              -                   3.0ms +/- 0.0%     3.0ms +/- 0.0% 

  date:                -                  31.0ms +/- 0.1%    31.0ms +/- 0.1% 
    format-tofte:      -                  16.0ms +/- 0.0%    16.0ms +/- 0.0% 
    format-xparb:      ??                 15.0ms +/- 0.3%    15.0ms +/- 0.2%     not conclusive: might be *1.001x as slow*

  math:                ??                 18.2ms +/- 0.4%    18.3ms +/- 0.5%     not conclusive: might be *1.006x as slow*
    cordic:            -                   3.0ms +/- 0.0%     3.0ms +/- 0.0% 
    partial-sums:      -                  13.0ms +/- 0.2%    13.0ms +/- 0.2% 
    spectral-norm:     ??                  2.1ms +/- 3.3%     2.3ms +/- 3.8%     not conclusive: might be *1.047x as slow*

  regexp:              -                  11.0ms +/- 0.0%    11.0ms +/- 0.0% 
    dna:               -                  11.0ms +/- 0.0%    11.0ms +/- 0.0% 

  string:              *1.005x as slow*   65.0ms +/- 0.1%    65.4ms +/- 0.2%     significant
    base64:            -                   5.0ms +/- 0.0%     5.0ms +/- 0.0% 
    fasta:             -                  10.0ms +/- 0.0%    10.0ms +/- 0.0% 
    tagcloud:          *1.005x as slow*   16.0ms +/- 0.1%    16.1ms +/- 0.4%     significant
    unpack-code:       *1.010x as slow*   26.0ms +/- 0.1%    26.3ms +/- 0.3%     significant
    validate-input:    -                   8.0ms +/- 0.2%     8.0ms +/- 0.0%

V8:

TEST              COMPARISON            r134387           Patch             DETAILS

=============================================================================

** TOTAL **:      -                 762.2ms +/- 0.4%   760.7ms +/- 0.3% 

=============================================================================

  v8:             -                 762.2ms +/- 0.4%   760.7ms +/- 0.3% 
    crypto:       ??                 88.1ms +/- 0.9%    88.4ms +/- 0.7%     not conclusive: might be *1.003x as slow*
    deltablue:    *1.007x as slow*  137.3ms +/- 0.4%   138.3ms +/- 0.5%     significant
    earley-boyer: ??                 99.4ms +/- 0.5%    99.8ms +/- 0.7%     not conclusive: might be *1.004x as slow*
    raytrace:     -                  72.3ms +/- 0.7%    72.0ms +/- 0.7% 
    regexp:       1.015x as fast    109.4ms +/- 0.7%   107.8ms +/- 0.5%     significant
    richards:     -                 114.6ms +/- 0.7%   113.8ms +/- 0.5% 
    splay:        -                 141.1ms +/- 0.7%   140.6ms +/- 0.6%
Comment 14 Gabor Ballabas 2012-11-13 04:36:44 PST
(In reply to comment #13)
The results were produced on x86_64 Linux.
Comment 15 Zoltan Herczeg 2012-11-13 04:54:25 PST
It seems the difference is negligible. Anyway, Geoffrey, if you have a better idea just let us know. The important thing is it should be consistent across JIT and interpreter.
Comment 16 Geoffrey Garen 2012-11-13 10:12:19 PST
The results in bug 79040 show a 1%-2% regression across the board. Whose results are correct?
Comment 17 Zoltan Herczeg 2012-11-13 10:23:29 PST
(In reply to comment #16)
> The results in bug 79040 show a 1%-2% regression across the board. Whose results are correct?

The results here belong to old JIT patch. The results there belong to the DFG JIT patch. Since we are new to DFG JIT, that patch certanly could be improved. Do you have any suggestions how to do that?
Comment 18 Geoffrey Garen 2012-11-13 14:04:10 PST
> Since we are new to DFG JIT, that patch certanly could be improved. Do you have any suggestions how to do that?

I'd start by trying a patch that polls a memory location but doesn't modify the location. If that avoids the regression, we can use an alarm to update the memory location after a certain amount of time, without cost.

For now, I guess I'll mark this patch r-, since it's a step toward a design change that's a performance regression.
Comment 19 Csaba Osztrogonác 2012-11-21 06:31:15 PST
As far as I know Gábor won't work on it in the future. Could you confirm it?
Comment 20 Gabor Ballabas 2012-11-21 06:36:47 PST
(In reply to comment #19)
> As far as I know Gábor won't work on it in the future. Could you confirm it?

No, I won't.