Bug 68335

Summary: DFG speculation failures should act as additional value profiles
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal    
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on:    
Bug Blocks: 68329    
Attachments:
Description Flags
work in progress
none
work in progress
none
the patch
none
the patch - attempt to make Windows happy oliver: review+

Description Filip Pizlo 2011-09-18 23:15:49 PDT
We want to support reoptimizing code if speculation failures appear to be frequent.  But to do so, we need to track why code is failing speculation, so that we don't make the mistake of making the same incorrect speculation in the future.  Currently, the OSR exit code has no such tracking; it will promptly forget why code failed OSR.

This is particularly bad in the following code:

function foo(a,b) {
    for (var i = 0; i < 100000; ++i) {
        a += b;
    }
    return a;
}

print(foo(1,10000000));

foo() is always called with integer arguments.  But it will fail speculation on the addition "a += b" because of an overflow.  Even after OSR when running in the old JIT code, the value profiler won't catch this because the only value profiling sites in foo() are for the arguments.  In general, code that fails speculation on integer arithmetic will have no way of recovering: the old JIT will correctly reperform the arithmetic using doubles, but will never record that it had done so.

The solution is either to beef up the amount of value profiling that the old JIT performs - for example profiling the results of arithmetic ops - or to record where and why DFG code is failing speculation, and at what rate.
Comment 1 Filip Pizlo 2011-09-18 23:18:56 PDT
Created attachment 107807 [details]
work in progress

This is still a work in progress, and has known problems: https://bugs.webkit.org/show_bug.cgi?id=68335
Comment 2 Filip Pizlo 2011-09-18 23:19:29 PDT
Comment on attachment 107807 [details]
work in progress

Ooops, pasted a patch on the wrong bug!
Comment 3 Filip Pizlo 2011-09-18 23:36:16 PDT
The best way to do this might actually be to have slow case counters in the old JIT.

There are three cases of speculation failure:

1) We speculate on a value that was loaded from memory, was the result of a call, or was passed as an argument.  These are all things that the old JIT already profiles.  If speculation failures occur due to those cases, then all it takes is to let the old JIT warm itself up again, and the rerun the DFG.  The DFG::Propagator should take care to propagate the predictions from those profiles to all uses, even if they go through local or global variable accesses.

2) We speculate that a particular operation behaves a certain way: for example we speculate that an integer arithmetic does not overflow, or that an integer multiplication does not yield zero.  This only occurs in the case of integer speculation.  The old JIT will take the fast path for integer arithmetic except if it fails to behave in exactly the way that the DFG JIT would have speculated.  Thus, counting the number of slow path executions suffices to tell us that such a speculation would be unwise.

3) We speculate that the code is not crazy: for example we speculate that op_convert_this takes a cell as its input, and that this cell is not a string.  We will currently experience pathologies because the DFG has no facility to compile the code any other way, other than to perform the speculation.  This is a separate probem, since the DFG would have been able to realize that it should not speculate non-string-cell on ConvertThis if it just looked at the predictions.  Even if it did not do this, a slow path counter in the old JIT would catch this case!

I will think about this a bit more, but it seems like slow case counters are the way to go.
Comment 4 Filip Pizlo 2011-09-19 02:48:50 PDT
Created attachment 107823 [details]
work in progress
Comment 5 Filip Pizlo 2011-09-19 03:35:15 PDT
Created attachment 107828 [details]
the patch

This is now a win instead of a loss.



Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"CarefulArith" at /Volumes/Data/pizlo/senary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.

                                            TipOfTree              CarefulArith                                  
SunSpider:
   3d-cube                                7.6707+-0.1484    ?     7.6733+-0.1334       ?
   3d-morph                               7.3894+-0.1137          7.3828+-0.1431       
   3d-raytrace                            7.5342+-0.2006    ?     7.6385+-0.1796       ? might be 1.0139x slower
   access-binary-trees                    2.3883+-0.0700          2.3290+-0.0696         might be 1.0255x faster
   access-fannkuch                       11.5046+-0.1685    ?    11.5155+-0.2161       ?
   access-nbody                           3.9088+-0.1078    ?     3.9422+-0.0874       ?
   access-nsieve                          2.6796+-0.0546          2.6248+-0.0471         might be 1.0209x faster
   bitops-3bit-bits-in-byte               1.7100+-0.0330          1.6965+-0.0259       
   bitops-bits-in-byte                    2.8467+-0.0819    ?     2.8854+-0.0682       ? might be 1.0136x slower
   bitops-bitwise-and                     3.6884+-0.0598          3.6508+-0.0799         might be 1.0103x faster
   bitops-nsieve-bits                     5.3210+-0.1023    ?     5.3410+-0.1064       ?
   controlflow-recursive                  2.0924+-0.0730          2.0055+-0.0508         might be 1.0433x faster
   crypto-aes                             7.1066+-0.3143          6.8911+-0.3292         might be 1.0313x faster
   crypto-md5                             2.8268+-0.0868    ?     2.8406+-0.0730       ?
   crypto-sha1                            2.2441+-0.0588    ?     2.2576+-0.0545       ?
   date-format-tofte                     10.2087+-0.1888    ?    10.5143+-0.2890       ? might be 1.0299x slower
   date-format-xparb                      8.5579+-0.1966    ?     8.8230+-0.2043       ? might be 1.0310x slower
   math-cordic                            6.1892+-0.1063          6.1612+-0.0862       
   math-partial-sums                      7.6945+-0.3189          7.3158+-0.1287         might be 1.0518x faster
   math-spectral-norm                     2.6537+-0.0516          2.5877+-0.0562         might be 1.0255x faster
   regexp-dna                            10.8771+-0.1977         10.8093+-0.1862       
   string-base64                          5.7514+-0.1539          5.7447+-0.1340       
   string-fasta                           7.0485+-0.1578          6.9804+-0.1471       
   string-tagcloud                       11.6758+-0.1512    ?    11.7351+-0.2206       ?
   string-unpack-code                    18.5442+-0.2653         18.2784+-0.2714         might be 1.0145x faster
   string-validate-input                  6.4478+-0.1534    ?     6.4600+-0.1284       ?

   <arithmetic>                           6.4062+-0.0309          6.3879+-0.0236       
   <geometric>                            5.3178+-0.0243          5.2916+-0.0192       
   <harmonic>                             4.3772+-0.0325          4.3417+-0.0284       

                                            TipOfTree              CarefulArith                                  
V8:
   crypto                                83.1693+-0.6166    ^    74.3636+-0.6450       ^ definitely 1.1184x faster
   deltablue                            241.0632+-2.0985        240.5850+-2.2448       
   earley-boyer                          96.3423+-0.5503         96.2938+-0.4649       
   raytrace                              68.6738+-0.4872    !    69.7420+-0.4037       ! definitely 1.0156x slower
   regexp                               106.1911+-0.4768        106.1558+-0.4123       
   richards                             217.1715+-0.8134    ?   217.6257+-1.0068       ?
   splay                                 99.4481+-0.5696         99.3026+-0.3885       

   <arithmetic>                         130.2942+-0.3502    ^   129.1527+-0.3620       ^ definitely 1.0088x faster
   <geometric>                          117.2219+-0.2083    ^   115.5814+-0.2118       ^ definitely 1.0142x faster
   <harmonic>                           107.3689+-0.2022    ^   105.3933+-0.2000       ^ definitely 1.0187x faster

                                            TipOfTree              CarefulArith                                  
Kraken:
   ai-astar                             629.2795+-3.0913        627.9005+-2.5983       
   audio-beat-detection                 469.9773+-2.0664    ^   462.7418+-1.4188       ^ definitely 1.0156x faster
   audio-dft                            420.2076+-2.6332    ?   426.9679+-4.2427       ? might be 1.0161x slower
   audio-fft                            361.7542+-1.2142    !   364.1869+-1.0514       ! definitely 1.0067x slower
   audio-oscillator                     311.2696+-0.8523    !   318.0720+-1.6896       ! definitely 1.0219x slower
   imaging-darkroom                     411.9517+-0.7872    !   417.4208+-1.5802       ! definitely 1.0133x slower
   imaging-desaturate                   218.6913+-0.7069    ^   216.9134+-0.6400       ^ definitely 1.0082x faster
   imaging-gaussian-blur                589.3157+-1.7928    ^   586.1567+-0.7324       ^ definitely 1.0054x faster
   json-parse-financial                  48.1661+-0.2949    !    49.6449+-0.2307       ! definitely 1.0307x slower
   json-stringify-tinderbox              68.0875+-0.3727         67.9741+-0.3485       
   stanford-crypto-aes                  142.2783+-0.5305    ?   142.3321+-0.8114       ?
   stanford-crypto-ccm                  110.5747+-0.6112        109.8949+-0.4929       
   stanford-crypto-pbkdf2               390.5863+-1.8163    !   401.1321+-2.9775       ! definitely 1.0270x slower
   stanford-crypto-sha256-iterative     147.4190+-0.6451    !   148.9498+-0.4688       ! definitely 1.0104x slower

   <arithmetic>                         308.5399+-0.3431    !   310.0206+-0.2831       ! definitely 1.0048x slower
   <geometric>                          240.8205+-0.3138    !   242.3024+-0.2427       ! definitely 1.0062x slower
   <harmonic>                           171.4876+-0.4060    !   173.0798+-0.3425       ! definitely 1.0093x slower

                                            TipOfTree              CarefulArith                                  
All benchmarks:
   <arithmetic>                         114.8549+-0.1224    !   115.1158+-0.0940       ! definitely 1.0023x slower
   <geometric>                           26.2457+-0.0715         26.1669+-0.0582       
   <harmonic>                             7.7217+-0.0560          7.6600+-0.0491
Comment 6 Filip Pizlo 2011-09-19 13:26:57 PDT
Created attachment 107909 [details]
the patch - attempt to make Windows happy
Comment 7 Filip Pizlo 2011-09-19 15:27:01 PDT
Landed in r95484.