Bug 63605 - DFG JIT does not perform get_by_id self list caching
Summary: DFG JIT does not perform get_by_id self list caching
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-06-28 22:06 PDT by Filip Pizlo
Modified: 2011-06-29 12:46 PDT (History)
2 users (show)

See Also:


Attachments
the patch (12.97 KB, patch)
2011-06-28 22:42 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (fix style) (12.97 KB, patch)
2011-06-28 22:50 PDT, Filip Pizlo
barraclough: review-
barraclough: commit-queue-
Details | Formatted Diff | Diff
the patch (fix review) (13.08 KB, patch)
2011-06-29 11:16 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2011-06-28 22:06:34 PDT
The main JSC JIT performs polymorphic get_by_id caching by creating a sequence of structure tests and accesses.  The DFG JIT does not do this, but should.
Comment 1 Filip Pizlo 2011-06-28 22:42:05 PDT
Created attachment 99041 [details]
the patch

This patch leads to a speed-up (versus unpatched DFG) on both sunspider and v8.  It does, however, increase the size of StructureStubInfo by two words, due to the need to save data through two state transitions (unset -> get_by_id -> get_by_id self list).  This could be optimized somewhat, but I thought I'd save it for a later patch after we take care of other forms of get/put_by_id optimizations and see what their StructureStubInfo requirements are.


TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           1.023x as fast    188.1ms +/- 0.4%   183.9ms +/- 0.4%     significant

=============================================================================

  3d:                  1.027x as fast     34.2ms +/- 0.8%    33.3ms +/- 0.7%     significant
    cube:              -                   9.0ms +/- 0.6%     9.0ms +/- 0.4% 
    morph:             1.036x as fast      7.5ms +/- 1.9%     7.3ms +/- 1.8%     significant
    raytrace:          1.036x as fast     17.6ms +/- 1.0%    17.0ms +/- 1.2%     significant

  access:              ??                 21.2ms +/- 0.8%    21.3ms +/- 0.8%     not conclusive: might be *1.004x as slow*
    binary-trees:      ??                  2.2ms +/- 5.4%     2.3ms +/- 5.7%     not conclusive: might be *1.027x as slow*
    fannkuch:          ??                 11.0ms +/- 0.9%    11.0ms +/- 0.8%     not conclusive: might be *1.002x as slow*
    nbody:             -                   6.0ms +/- 0.0%     6.0ms +/- 0.0% 
    nsieve:            -                   2.0ms +/- 0.0%     2.0ms +/- 0.0% 

  bitops:              -                  12.0ms +/- 0.0%    12.0ms +/- 0.0% 
    3bit-bits-in-byte: -                   1.0ms +/- 0.0%     1.0ms +/- 0.0% 
    bits-in-byte:      -                   3.0ms +/- 0.0%     3.0ms +/- 0.0% 
    bitwise-and:       -                   3.0ms +/- 0.0%     3.0ms +/- 0.0% 
    nsieve-bits:       -                   5.0ms +/- 0.0%     5.0ms +/- 0.0% 

  controlflow:         -                   1.7ms +/- 7.7%     1.7ms +/- 7.7% 
    recursive:         -                   1.7ms +/- 7.7%     1.7ms +/- 7.7% 

  crypto:              ??                 10.1ms +/- 1.9%    10.2ms +/- 2.1%     not conclusive: might be *1.012x as slow*
    aes:               -                   6.7ms +/- 1.9%     6.6ms +/- 2.1% 
    md5:               -                   2.0ms +/- 0.0%     2.0ms +/- 0.0% 
    sha1:              *1.159x as slow*    1.4ms +/- 10.1%     1.6ms +/- 8.8%     significant

  date:                1.125x as fast     26.1ms +/- 0.6%    23.2ms +/- 0.8%     significant
    format-tofte:      1.199x as fast     17.0ms +/- 0.7%    14.2ms +/- 0.8%     significant
    format-xparb:      -                   9.1ms +/- 1.0%     9.0ms +/- 1.3% 

  math:                -                  16.9ms +/- 0.5%    16.8ms +/- 0.9% 
    cordic:            1.028x as fast      5.9ms +/- 1.3%     5.8ms +/- 2.1%     significant
    partial-sums:      -                   7.0ms +/- 0.0%     7.0ms +/- 0.8% 
    spectral-norm:     ??                  4.0ms +/- 0.0%     4.0ms +/- 1.0%     not conclusive: might be *1.005x as slow*

  regexp:              -                  10.1ms +/- 0.8%    10.0ms +/- 0.4% 
    dna:               -                  10.1ms +/- 0.8%    10.0ms +/- 0.4% 

  string:              -                  55.7ms +/- 0.9%    55.3ms +/- 0.6% 
    base64:            -                   6.0ms +/- 0.9%     6.0ms +/- 0.0% 
    fasta:             -                   7.8ms +/- 1.9%     7.8ms +/- 1.7% 
    tagcloud:          -                  13.6ms +/- 1.5%    13.4ms +/- 1.1% 
    unpack-code:       -                  21.3ms +/- 1.2%    21.2ms +/- 0.9% 
    validate-input:    -                   7.0ms +/- 1.4%     6.9ms +/- 1.2% 



TEST              COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:      1.045x as fast    1236.5ms +/- 1.0%   1183.1ms +/- 0.6%     significant

=============================================================================

  v8:             1.045x as fast    1236.5ms +/- 1.0%   1183.1ms +/- 0.6%     significant
    crypto:       ??                  91.5ms +/- 1.6%     92.3ms +/- 1.0%     not conclusive: might be *1.009x as slow*
    deltablue:    1.20x as fast      325.0ms +/- 1.3%    270.3ms +/- 1.2%     significant
    earley-boyer: *1.020x as slow*   147.1ms +/- 1.5%    150.1ms +/- 1.5%     significant
    raytrace:     ??                  99.8ms +/- 0.8%    100.4ms +/- 1.1%     not conclusive: might be *1.006x as slow*
    regexp:       -                  117.7ms +/- 1.3%    117.4ms +/- 1.9% 
    richards:     -                  248.2ms +/- 1.0%    247.0ms +/- 0.9% 
    splay:        -                  207.2ms +/- 2.5%    205.6ms +/- 1.4%
Comment 2 WebKit Review Bot 2011-06-28 22:45:58 PDT
Attachment 99041 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1

Source/JavaScriptCore/dfg/DFGRepatch.cpp:204:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 1 in 7 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Filip Pizlo 2011-06-28 22:50:48 PDT
Created attachment 99042 [details]
the patch (fix style)
Comment 4 Gavin Barraclough 2011-06-28 23:21:00 PDT
Comment on attachment 99042 [details]
the patch (fix style)

View in context: https://bugs.webkit.org/attachment.cgi?id=99042&action=review

Looks great! r- to consider the one comment about memory usage.

> Source/JavaScriptCore/bytecode/StructureStubInfo.h:132
> +        

I think we should probably look at trying to wrap these fields in an #if ENABLE(DFG_JIT).  These are only used if the DFG_JIT is enabled, and it seems unfortunate to increase the size of this object on platforms like ARM.  (We may want to later look at whether we can claw this space back, but if we can make these fields ENABLE(DFG_JIT) I don't think we need to be worried about this for now).

Also, along with accessType & seen (above) the size of this set of fields is 9 bytes, which to align the struct below will probably pad out to 16 bytes.  Probably worth changing accessType & seen to be uint8_t's too, that way all 7 fields should probably pack out to 8 bytes.
Comment 5 Filip Pizlo 2011-06-29 11:16:00 PDT
Created attachment 99115 [details]
the patch (fix review)
Comment 6 WebKit Review Bot 2011-06-29 12:46:41 PDT
Comment on attachment 99115 [details]
the patch (fix review)

Clearing flags on attachment: 99115

Committed r90035: <http://trac.webkit.org/changeset/90035>
Comment 7 WebKit Review Bot 2011-06-29 12:46:46 PDT
All reviewed patches have been landed.  Closing bug.