We want to support reoptimizing code if speculation failures appear to be frequent. But to do so, we need to track why code is failing speculation, so that we don't repeat the same incorrect speculation in the future. Currently, the OSR exit code has no such tracking; it promptly forgets why the code failed speculation.

This is particularly bad in the following code:

    function foo(a,b) {
        for (var i = 0; i < 100000; ++i) {
            a += b;
        }
        return a;
    }
    print(foo(1,10000000));

foo() is always called with integer arguments. But it will fail speculation on the addition "a += b" because of an overflow. Even after OSR, when running in the old JIT code, the value profiler won't catch this because the only value profiling sites in foo() are for the arguments.

In general, code that fails speculation on integer arithmetic will have no way of recovering: the old JIT will correctly reperform the arithmetic using doubles, but will never record that it has done so.

The solution is either to beef up the amount of value profiling that the old JIT performs - for example profiling the results of arithmetic ops - or to record where and why DFG code is failing speculation, and at what rate.
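To make the overflow concrete, here is a minimal sketch in plain JavaScript of where an int32 speculation on "a += b" would stop holding for the call above. The helper names addInt32Checked and firstOverflowIteration are hypothetical and exist only for illustration; they model the check, not the actual DFG code.

```javascript
// Hypothetical helper: add two int32 values the way a speculating compiler
// would, reporting overflow instead of silently producing a double.
function addInt32Checked(a, b) {
    var sum = a + b;            // exact in double arithmetic for int32 inputs
    if ((sum | 0) !== sum)      // the result no longer fits in an int32
        return { ok: false, value: sum };
    return { ok: true, value: sum };
}

// Hypothetical helper: run the loop body of foo() and return the zero-based
// iteration at which the int32 speculation first fails, or -1 if it never does.
function firstOverflowIteration(a, b, iterations) {
    for (var i = 0; i < iterations; ++i) {
        var result = addInt32Checked(a, b);
        if (!result.ok)
            return i;
        a = result.value;
    }
    return -1;
}

var overflowAt = firstOverflowIteration(1, 10000000, 100000); // 214
```

With these arguments the speculation holds for the first 214 additions and fails on the next one, so almost all of the 100000 iterations run after the failure.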
Created attachment 107807 [details] work in progress This is still a work in progress, and has known problems: https://bugs.webkit.org/show_bug.cgi?id=68335
Comment on attachment 107807 [details] work in progress Oops, pasted a patch on the wrong bug!
The best way to do this might actually be to have slow case counters in the old JIT. There are three cases of speculation failure:

1) We speculate on a value that was loaded from memory, was the result of a call, or was passed as an argument. These are all things that the old JIT already profiles. If speculation failures occur due to those cases, then all it takes is to let the old JIT warm itself up again, and then rerun the DFG. The DFG::Propagator should take care to propagate the predictions from those profiles to all uses, even if they go through local or global variable accesses.

2) We speculate that a particular operation behaves a certain way: for example we speculate that an integer arithmetic operation does not overflow, or that an integer multiplication does not yield zero. This only occurs in the case of integer speculation. The old JIT takes the fast path for integer arithmetic unless the operation fails to behave in exactly the way that the DFG JIT would have speculated. Thus, counting the number of slow path executions suffices to tell us that such a speculation would be unwise.

3) We speculate that the code is not crazy: for example we speculate that op_convert_this takes a cell as its input, and that this cell is not a string. We will currently experience pathologies because the DFG has no facility to compile the code any other way, other than to perform the speculation. This is a separate problem, since the DFG would have been able to realize that it should not speculate non-string-cell on ConvertThis if it just looked at the predictions. Even if it did not do this, a slow path counter in the old JIT would catch this case!

I will think about this a bit more, but it seems like slow case counters are the way to go.
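The slow case counter idea for case 2 can be sketched in plain JavaScript as a model, not as JIT code. Everything here is an assumption made for illustration: the site record, the names makeAddSite, baselineAdd, shouldSpeculateInt32 and runFoo, and the rate threshold passed by the caller are all hypothetical and do not reflect the structures used in the patch.

```javascript
// Hypothetical per-site profile: how often an add ran, and how often it
// had to leave the int32 fast path.
function makeAddSite() {
    return { executions: 0, slowCases: 0 };
}

// Model of the old JIT's add: take the fast path only when both inputs are
// int32 and the result does not overflow; otherwise count a slow case.
function baselineAdd(site, a, b) {
    site.executions++;
    var inputsAreInt32 = (a | 0) === a && (b | 0) === b;
    var sum = a + b;
    if (inputsAreInt32 && (sum | 0) === sum)
        return sum;             // fast path: nothing to record
    site.slowCases++;           // slow path: arithmetic redone on doubles
    return sum;
}

// Hypothetical policy: speculate int32 only if the slow case rate observed
// so far is at or below a caller-chosen threshold.
function shouldSpeculateInt32(site, maxSlowCaseRate) {
    if (site.executions === 0)
        return true;
    return site.slowCases / site.executions <= maxSlowCaseRate;
}

// The loop from foo(), with its addition routed through the profiled site.
function runFoo(site, a, b) {
    for (var i = 0; i < 100000; ++i)
        a = baselineAdd(site, a, b);
    return a;
}

var site = makeAddSite();
var result = runFoo(site, 1, 10000000);
var speculate = shouldSpeculateInt32(site, 0.1); // false
```

For foo(1,10000000) the site records 99786 slow cases out of 100000 executions, so any reasonable threshold tells the optimizing compiler not to speculate int32 at this addition. The value profiler alone would not have seen this, since the arguments are always integers.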
Created attachment 107823 [details] work in progress
Created attachment 107828 [details] the patch

This is now a win instead of a loss.

Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"CarefulArith" at /Volumes/Data/pizlo/senary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                     TipOfTree            CarefulArith

SunSpider:
   3d-cube                           7.6707+-0.1484    ?  7.6733+-0.1334    ?
   3d-morph                          7.3894+-0.1137       7.3828+-0.1431
   3d-raytrace                       7.5342+-0.2006    ?  7.6385+-0.1796    ?  might be 1.0139x slower
   access-binary-trees               2.3883+-0.0700       2.3290+-0.0696       might be 1.0255x faster
   access-fannkuch                   11.5046+-0.1685   ?  11.5155+-0.2161   ?
   access-nbody                      3.9088+-0.1078    ?  3.9422+-0.0874    ?
   access-nsieve                     2.6796+-0.0546       2.6248+-0.0471       might be 1.0209x faster
   bitops-3bit-bits-in-byte          1.7100+-0.0330       1.6965+-0.0259
   bitops-bits-in-byte               2.8467+-0.0819    ?  2.8854+-0.0682    ?  might be 1.0136x slower
   bitops-bitwise-and                3.6884+-0.0598       3.6508+-0.0799       might be 1.0103x faster
   bitops-nsieve-bits                5.3210+-0.1023    ?  5.3410+-0.1064    ?
   controlflow-recursive             2.0924+-0.0730       2.0055+-0.0508       might be 1.0433x faster
   crypto-aes                        7.1066+-0.3143       6.8911+-0.3292       might be 1.0313x faster
   crypto-md5                        2.8268+-0.0868    ?  2.8406+-0.0730    ?
   crypto-sha1                       2.2441+-0.0588    ?  2.2576+-0.0545    ?
   date-format-tofte                 10.2087+-0.1888   ?  10.5143+-0.2890   ?  might be 1.0299x slower
   date-format-xparb                 8.5579+-0.1966    ?  8.8230+-0.2043    ?  might be 1.0310x slower
   math-cordic                       6.1892+-0.1063       6.1612+-0.0862
   math-partial-sums                 7.6945+-0.3189       7.3158+-0.1287       might be 1.0518x faster
   math-spectral-norm                2.6537+-0.0516       2.5877+-0.0562       might be 1.0255x faster
   regexp-dna                        10.8771+-0.1977      10.8093+-0.1862
   string-base64                     5.7514+-0.1539       5.7447+-0.1340
   string-fasta                      7.0485+-0.1578       6.9804+-0.1471
   string-tagcloud                   11.6758+-0.1512   ?  11.7351+-0.2206   ?
   string-unpack-code                18.5442+-0.2653      18.2784+-0.2714      might be 1.0145x faster
   string-validate-input             6.4478+-0.1534    ?  6.4600+-0.1284    ?

   <arithmetic>                      6.4062+-0.0309       6.3879+-0.0236
   <geometric>                       5.3178+-0.0243       5.2916+-0.0192
   <harmonic>                        4.3772+-0.0325       4.3417+-0.0284

                                     TipOfTree            CarefulArith

V8:
   crypto                            83.1693+-0.6166   ^  74.3636+-0.6450   ^  definitely 1.1184x faster
   deltablue                         241.0632+-2.0985     240.5850+-2.2448
   earley-boyer                      96.3423+-0.5503      96.2938+-0.4649
   raytrace                          68.6738+-0.4872   !  69.7420+-0.4037   !  definitely 1.0156x slower
   regexp                            106.1911+-0.4768     106.1558+-0.4123
   richards                          217.1715+-0.8134  ?  217.6257+-1.0068  ?
   splay                             99.4481+-0.5696      99.3026+-0.3885

   <arithmetic>                      130.2942+-0.3502  ^  129.1527+-0.3620  ^  definitely 1.0088x faster
   <geometric>                       117.2219+-0.2083  ^  115.5814+-0.2118  ^  definitely 1.0142x faster
   <harmonic>                        107.3689+-0.2022  ^  105.3933+-0.2000  ^  definitely 1.0187x faster

                                     TipOfTree            CarefulArith

Kraken:
   ai-astar                          629.2795+-3.0913     627.9005+-2.5983
   audio-beat-detection              469.9773+-2.0664  ^  462.7418+-1.4188  ^  definitely 1.0156x faster
   audio-dft                         420.2076+-2.6332  ?  426.9679+-4.2427  ?  might be 1.0161x slower
   audio-fft                         361.7542+-1.2142  !  364.1869+-1.0514  !  definitely 1.0067x slower
   audio-oscillator                  311.2696+-0.8523  !  318.0720+-1.6896  !  definitely 1.0219x slower
   imaging-darkroom                  411.9517+-0.7872  !  417.4208+-1.5802  !  definitely 1.0133x slower
   imaging-desaturate                218.6913+-0.7069  ^  216.9134+-0.6400  ^  definitely 1.0082x faster
   imaging-gaussian-blur             589.3157+-1.7928  ^  586.1567+-0.7324  ^  definitely 1.0054x faster
   json-parse-financial              48.1661+-0.2949   !  49.6449+-0.2307   !  definitely 1.0307x slower
   json-stringify-tinderbox          68.0875+-0.3727      67.9741+-0.3485
   stanford-crypto-aes               142.2783+-0.5305  ?  142.3321+-0.8114  ?
   stanford-crypto-ccm               110.5747+-0.6112     109.8949+-0.4929
   stanford-crypto-pbkdf2            390.5863+-1.8163  !  401.1321+-2.9775  !  definitely 1.0270x slower
   stanford-crypto-sha256-iterative  147.4190+-0.6451  !  148.9498+-0.4688  !  definitely 1.0104x slower

   <arithmetic>                      308.5399+-0.3431  !  310.0206+-0.2831  !  definitely 1.0048x slower
   <geometric>                       240.8205+-0.3138  !  242.3024+-0.2427  !  definitely 1.0062x slower
   <harmonic>                        171.4876+-0.4060  !  173.0798+-0.3425  !  definitely 1.0093x slower

                                     TipOfTree            CarefulArith

All benchmarks:
   <arithmetic>                      114.8549+-0.1224  !  115.1158+-0.0940  !  definitely 1.0023x slower
   <geometric>                       26.2457+-0.0715      26.1669+-0.0582
   <harmonic>                        7.7217+-0.0560       7.6600+-0.0491
Created attachment 107909 [details] the patch - attempt to make Windows happy
Landed in r95484.