RESOLVED FIXED 67798
DFG JIT completely undoes speculative compilation even in the case of a partial static speculation failure
https://bugs.webkit.org/show_bug.cgi?id=67798
Filip Pizlo
Reported 2011-09-08 13:25:49 PDT
The DFG JIT may perform a speculation that contravenes static information. For example, it may assume that a value must be an integer when the code that produces it always produces a cell, and the fact that it produces a cell is proven statically. In that case, it terminates speculation. Currently this means undoing speculative compilation for the entire code block and recompiling the whole block with the non-speculative JIT. What it should probably do instead is just jump out of speculative code at the point where the static information contravenes speculation, so that if this scenario happens only partially (i.e. in conditional code, which may be a slow path anyway), the code block still benefits from speculation when that condition does not arise.
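To make the difference between the two behaviours concrete, here is a toy model, not WebKit code. `BasicBlock`, `Emitted`, `compileAllOrNothing` and `compileWithPartialFailure` are hypothetical names invented for this sketch, and the per-block flag stands in for "static information contravenes the speculation here".

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model; the real DFG data structures look nothing like this.
struct BasicBlock {
    bool speculationContradictsStaticInfo;
};

enum class Emitted { Speculative, BailToNonSpeculative };

// Behaviour described as current: one contradiction anywhere discards the
// speculative code for the entire code block. Returns false on discard.
bool compileAllOrNothing(const std::vector<BasicBlock>& blocks, std::vector<Emitted>& out)
{
    out.clear();
    for (const BasicBlock& block : blocks) {
        if (block.speculationContradictsStaticInfo) {
            out.clear(); // throw away everything generated so far
            return false;
        }
        out.push_back(Emitted::Speculative);
    }
    return true;
}

// Behaviour proposed in the report: emit a jump out of speculative code at
// the point of contradiction and keep speculating everywhere else.
std::vector<Emitted> compileWithPartialFailure(const std::vector<BasicBlock>& blocks)
{
    std::vector<Emitted> out;
    for (const BasicBlock& block : blocks) {
        out.push_back(block.speculationContradictsStaticInfo
            ? Emitted::BailToNonSpeculative
            : Emitted::Speculative);
    }
    return out;
}
```

With three blocks where only the middle one contradicts static information, the first function keeps nothing while the second keeps two speculative blocks and one bail-out.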
Attachments
the patch (5.75 KB, patch)
2011-09-08 15:12 PDT, Filip Pizlo
ggaren: review+
the patch - fix review (7.41 KB, patch)
2011-09-09 17:07 PDT, Filip Pizlo
no flags
Filip Pizlo
Comment 1 2011-09-08 13:29:42 PDT
This is a work in progress, and isn't totally stable yet. It's also a regression on v8-crypto under static speculation (which is still the default in ToT).

Benchmark report for SunSpider and V8.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"PartialSpecFail" at /Volumes/Data/pizlo/octonary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                     TipOfTree              PartialSpecFail
SunSpider:
   3d-cube                           7.8561+-0.1646         7.8155+-0.2022
   3d-morph                          7.8473+-0.1579         7.6001+-0.1902         might be 1.0325x faster
   3d-raytrace                       7.6863+-0.1967         7.6497+-0.2789
   access-binary-trees               2.2991+-0.0437      ?  2.3556+-0.0766      ?  might be 1.0245x slower
   access-fannkuch                   12.0374+-0.2866        11.8132+-0.1643        might be 1.0190x faster
   access-nbody                      4.4257+-0.1024         4.3487+-0.0582         might be 1.0177x faster
   access-nsieve                     2.5066+-0.0831      ?  2.5857+-0.0715      ?  might be 1.0316x slower
   bitops-3bit-bits-in-byte          1.7548+-0.0479      ?  1.7602+-0.0528      ?
   bitops-bits-in-byte               4.5568+-0.2521      ?  4.6508+-0.2218      ?  might be 1.0206x slower
   bitops-bitwise-and                3.7287+-0.0639      ?  3.7565+-0.0750      ?
   bitops-nsieve-bits                5.5253+-0.1456      ?  5.6534+-0.1690      ?  might be 1.0232x slower
   controlflow-recursive             2.0740+-0.0461         2.0333+-0.0497         might be 1.0200x faster
   crypto-aes                        6.9185+-0.3649         6.8543+-0.3203
   crypto-md5                        2.8268+-0.0863      ?  2.8695+-0.1142      ?  might be 1.0151x slower
   crypto-sha1                       2.2198+-0.0391      !  2.3437+-0.0718      !  definitely 1.0558x slower
   date-format-tofte                 10.4411+-0.3354        10.1745+-0.2410        might be 1.0262x faster
   date-format-xparb                 9.1471+-0.2291      ?  9.1878+-0.2066      ?
   math-cordic                       6.4154+-0.1277         6.2871+-0.1184         might be 1.0204x faster
   math-partial-sums                 7.9001+-0.1389         7.8879+-0.1401
   math-spectral-norm                2.5591+-0.0447      ^  2.4600+-0.0321      ^  definitely 1.0403x faster
   regexp-dna                        10.5650+-0.2633     ?  10.5712+-0.1567     ?
   string-base64                     6.0538+-0.1903      ?  6.1351+-0.2088      ?  might be 1.0134x slower
   string-fasta                      7.6601+-0.2406      ?  7.6878+-0.1736      ?
   string-tagcloud                   12.2444+-0.3531        12.1599+-0.2803
   string-unpack-code                18.9348+-0.3692     ?  19.0148+-0.3475     ?
   string-validate-input             7.2754+-0.2723         7.2458+-0.2474
   <arithmetic>                      6.6715+-0.0391         6.6501+-0.0287
   <geometric>                       5.5415+-0.0294         5.5415+-0.0278
   <harmonic>                        4.5240+-0.0279      ?  4.5413+-0.0278      ?

                                     TipOfTree              PartialSpecFail
V8:
   crypto                            91.7346+-0.8852     !  100.1585+-0.5619    !  definitely 1.0918x slower
   deltablue                         270.8549+-2.1356       267.9829+-1.3881       might be 1.0107x faster
   earley-boyer                      95.2392+-0.6167     !  97.1235+-0.8832     !  definitely 1.0198x slower
   raytrace                          80.0969+-0.8197        79.2897+-0.3222        might be 1.0102x faster
   regexp                            112.4222+-1.1974       111.2879+-0.7193       might be 1.0102x faster
   richards                          246.1516+-1.8007    ^  240.0536+-0.7336    ^  definitely 1.0254x faster
   splay                             103.9291+-0.3727    ?  104.3902+-1.1212    ?
   <arithmetic>                      142.9183+-0.5466       142.8981+-0.2484
   <geometric>                       127.4043+-0.3828    !  128.4276+-0.2773    !  definitely 1.0080x slower
   <harmonic>                        116.3446+-0.3055    !  117.9230+-0.2941    !  definitely 1.0136x slower

                                     TipOfTree              PartialSpecFail
All benchmarks:
   <arithmetic>                      35.5724+-0.1334        35.5512+-0.0619
   <geometric>                       10.7755+-0.0479     ?  10.7938+-0.0442     ?
   <harmonic>                        5.6825+-0.0347      ?  5.7048+-0.0345      ?
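For readers decoding the verdict column: in the figures above, "definitely" appears exactly where the two 95% confidence intervals do not overlap, and "might be" where they overlap but the ratio of means is at least roughly 1.01. The sketch below encodes that reading. It is inferred from the numbers in this report, not taken from the benchmark script, and `Measurement`, `Verdict`, `compare` and the `minRatio` default are hypothetical.

```cpp
#include <cassert>

// mean +- halfWidth, as printed in the report (95% confidence interval).
struct Measurement {
    double mean;
    double halfWidth;
};

enum class Verdict { NoCall, MightBeFaster, MightBeSlower, DefinitelyFaster, DefinitelySlower };

// Assumed rule: disjoint intervals give "definitely"; otherwise a ratio of
// means of at least minRatio gives "might be"; anything else gets no verdict.
Verdict compare(const Measurement& baseline, const Measurement& candidate, double minRatio = 1.01)
{
    bool slower = candidate.mean > baseline.mean;
    const Measurement& low = slower ? baseline : candidate;
    const Measurement& high = slower ? candidate : baseline;

    bool disjoint = low.mean + low.halfWidth < high.mean - high.halfWidth;
    if (disjoint)
        return slower ? Verdict::DefinitelySlower : Verdict::DefinitelyFaster;
    if (high.mean / low.mean >= minRatio)
        return slower ? Verdict::MightBeSlower : Verdict::MightBeFaster;
    return Verdict::NoCall;
}
```

For example, v8-crypto at 91.7346+-0.8852 versus 100.1585+-0.5619 has disjoint intervals, which matches the "definitely 1.0918x slower" in the table (100.1585 / 91.7346 is about 1.0918).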
Filip Pizlo
Comment 2 2011-09-08 14:15:07 PDT
This now appears stable. But, it's a V8 slow-down.

Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"PartialSpecFail" at /Volumes/Data/pizlo/octonary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                     TipOfTree              PartialSpecFail
SunSpider:
   3d-cube                           7.8790+-0.1847         7.7234+-0.2069         might be 1.0202x faster
   3d-morph                          7.8556+-0.1429      ^  7.4219+-0.1409      ^  definitely 1.0584x faster
   3d-raytrace                       7.5771+-0.2102         7.5468+-0.1384
   access-binary-trees               2.2669+-0.0389      ?  2.2928+-0.0555      ?  might be 1.0114x slower
   access-fannkuch                   11.9656+-0.2655     ?  11.9976+-0.2837     ?
   access-nbody                      4.3613+-0.1447         4.2937+-0.0866         might be 1.0157x faster
   access-nsieve                     2.4939+-0.0718      ?  2.5792+-0.0601      ?  might be 1.0342x slower
   bitops-3bit-bits-in-byte          1.7326+-0.0618      ?  1.8014+-0.0547      ?  might be 1.0397x slower
   bitops-bits-in-byte               4.6134+-0.2415      ^  3.3414+-0.1481      ^  definitely 1.3807x faster
   bitops-bitwise-and                3.7141+-0.0795         3.6939+-0.0653
   bitops-nsieve-bits                5.5147+-0.1153         5.4464+-0.1066         might be 1.0125x faster
   controlflow-recursive             2.0409+-0.0581      ?  2.0611+-0.0426      ?
   crypto-aes                        6.5953+-0.1324         6.5248+-0.1913         might be 1.0108x faster
   crypto-md5                        2.7708+-0.0604         2.7644+-0.0653
   crypto-sha1                       2.3058+-0.0905         2.2729+-0.0468         might be 1.0145x faster
   date-format-tofte                 10.5226+-0.3112        10.2755+-0.3134        might be 1.0240x faster
   date-format-xparb                 8.8497+-0.3011      ?  9.0041+-0.2844      ?  might be 1.0175x slower
   math-cordic                       6.3786+-0.1561         6.3529+-0.1171
   math-partial-sums                 7.8715+-0.1752         7.7494+-0.1679         might be 1.0157x faster
   math-spectral-norm                2.5403+-0.0564      ?  2.5837+-0.1298      ?  might be 1.0171x slower
   regexp-dna                        10.5271+-0.1745        10.3552+-0.2115        might be 1.0166x faster
   string-base64                     6.1475+-0.2276         6.0770+-0.2383         might be 1.0116x faster
   string-fasta                      7.7278+-0.2546         7.5394+-0.1755         might be 1.0250x faster
   string-tagcloud                   12.1584+-0.4519     ?  12.2708+-0.3747     ?
   string-unpack-code                18.6953+-0.4765     ?  19.0792+-0.4617     ?  might be 1.0205x slower
   string-validate-input             7.2421+-0.2037         7.0584+-0.1706         might be 1.0260x faster
   <arithmetic>                      6.6288+-0.0408      ^  6.5426+-0.0404      ^  definitely 1.0132x faster
   <geometric>                       5.5099+-0.0338      ^  5.4210+-0.0341      ^  definitely 1.0164x faster
   <harmonic>                        4.4992+-0.0305         4.4473+-0.0353         might be 1.0117x faster

                                     TipOfTree              PartialSpecFail
V8:
   crypto                            91.0338+-0.5191     !  104.1613+-0.8925    !  definitely 1.1442x slower
   deltablue                         269.7535+-2.7809    ^  265.7885+-0.7118    ^  definitely 1.0149x faster
   earley-boyer                      95.0161+-0.5041     ?  95.5356+-0.5577     ?
   raytrace                          79.1499+-0.5473     ?  79.7137+-0.6612     ?
   regexp                            110.8495+-0.4558    ^  109.2641+-0.3574    ^  definitely 1.0145x faster
   richards                          240.9133+-0.8254       239.4367+-1.6783
   splay                             103.0647+-0.6429    !  104.4789+-0.7212    !  definitely 1.0137x slower
   <arithmetic>                      141.3972+-0.2966    !  142.6256+-0.4231    !  definitely 1.0087x slower
   <geometric>                       126.1408+-0.1666    !  128.4242+-0.3937    !  definitely 1.0181x slower
   <harmonic>                        115.2637+-0.1753    !  118.0846+-0.4050    !  definitely 1.0245x slower

                                     TipOfTree              PartialSpecFail
Kraken:
   ai-astar                          1108.9297+-5.9845   ?  1123.4756+-12.5830  ?  might be 1.0131x slower
   audio-beat-detection              481.1936+-1.4541    ?  486.0185+-4.7695    ?  might be 1.0100x slower
   audio-dft                         426.4858+-4.3795       425.2002+-2.3276
   audio-fft                         373.8409+-2.1670    ?  374.6652+-0.9116    ?
   audio-oscillator                  384.1150+-2.2719    ?  387.0932+-3.3490    ?
   imaging-darkroom                  537.6787+-3.3081       534.4603+-2.0673
   imaging-desaturate                623.8627+-8.2803       615.4398+-4.8321       might be 1.0137x faster
   imaging-gaussian-blur             1738.3217+-5.4653      1729.6538+-4.2880
   json-parse-financial              49.1108+-0.5305     ?  49.8053+-0.3037     ?  might be 1.0141x slower
   json-stringify-tinderbox          72.4905+-0.6863     ^  69.0171+-0.4305     ^  definitely 1.0503x faster
   stanford-crypto-aes               145.4706+-1.1861       145.0223+-1.1645
   stanford-crypto-ccm               115.7358+-0.3964    ^  113.6545+-0.7748    ^  definitely 1.0183x faster
   stanford-crypto-pbkdf2            338.3754+-1.7782    !  341.7524+-1.4837    !  definitely 1.0100x slower
   stanford-crypto-sha256-iterative  131.3300+-0.4891    !  134.3038+-1.2092    !  definitely 1.0226x slower
   <arithmetic>                      466.2101+-0.8557    ?  466.3973+-1.1839    ?
   <geometric>                       301.2570+-0.3548       300.8576+-0.4784
   <harmonic>                        186.9324+-0.3719    ^  186.0068+-0.4899    ^  definitely 1.0050x faster

                                     TipOfTree              PartialSpecFail
All benchmarks:
   <arithmetic>                      163.5972+-0.2484    ?  163.7883+-0.3213    ?
   <geometric>                       28.9262+-0.1040        28.7323+-0.1022
   <harmonic>                        7.9467+-0.0525         7.8585+-0.0613         might be 1.0112x faster
Filip Pizlo
Comment 3 2011-09-08 14:39:33 PDT
Looks like this path will work best if it is turned off for static speculation, but turned on for dynamic speculation. Here's the performance with it turned off. Note the noise (38% speed-up on one SunSpider benchmark that gets totally lost in the average). I'm convinced that it is in fact noise and not real.

Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"PartialSpecFailOff" at /Volumes/Data/pizlo/octonary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                     TipOfTree              PartialSpecFailOff
SunSpider:
   3d-cube                           7.7127+-0.1331      ?  7.8798+-0.2091      ?  might be 1.0217x slower
   3d-morph                          7.5139+-0.1326      ?  7.5244+-0.2035      ?
   3d-raytrace                       7.3208+-0.1767      ?  7.5204+-0.2760      ?  might be 1.0273x slower
   access-binary-trees               2.3967+-0.0970         2.2549+-0.0498         might be 1.0629x faster
   access-fannkuch                   11.9275+-0.2377        11.8759+-0.1481
   access-nbody                      4.2453+-0.0602      ?  4.2470+-0.0751      ?
   access-nsieve                     2.5830+-0.0807         2.4702+-0.0412         might be 1.0457x faster
   bitops-3bit-bits-in-byte          1.7405+-0.0422      ?  1.7923+-0.0560      ?  might be 1.0298x slower
   bitops-bits-in-byte               4.5369+-0.1687      ^  3.2834+-0.0690      ^  definitely 1.3817x faster
   bitops-bitwise-and                3.6754+-0.0649         3.6354+-0.0616         might be 1.0110x faster
   bitops-nsieve-bits                5.3928+-0.1547      ?  5.5196+-0.1109      ?  might be 1.0235x slower
   controlflow-recursive             2.0130+-0.0451      ?  2.0507+-0.0368      ?  might be 1.0187x slower
   crypto-aes                        6.6134+-0.2576         6.6038+-0.1874
   crypto-md5                        2.8496+-0.1166      ?  2.9049+-0.1234      ?  might be 1.0194x slower
   crypto-sha1                       2.2437+-0.0694      ?  2.3244+-0.0596      ?  might be 1.0360x slower
   date-format-tofte                 10.1987+-0.2700     ?  10.2935+-0.2483     ?
   date-format-xparb                 9.0265+-0.2755         8.9221+-0.3400         might be 1.0117x faster
   math-cordic                       6.2957+-0.0972      ?  6.4081+-0.1749      ?  might be 1.0179x slower
   math-partial-sums                 7.7582+-0.1480      ?  7.8717+-0.1562      ?  might be 1.0146x slower
   math-spectral-norm                2.5476+-0.0818      ?  2.5812+-0.1030      ?  might be 1.0132x slower
   regexp-dna                        10.6437+-0.2123        10.3263+-0.1333        might be 1.0307x faster
   string-base64                     6.1379+-0.2106         6.0649+-0.1760         might be 1.0120x faster
   string-fasta                      7.4621+-0.1615      ?  7.5237+-0.1623      ?
   string-tagcloud                   12.0855+-0.2801     ?  12.3640+-0.3423     ?  might be 1.0230x slower
   string-unpack-code                19.0547+-0.2942     ?  19.1257+-0.4011     ?
   string-validate-input             7.2033+-0.2369         7.0648+-0.2110         might be 1.0196x faster
   <arithmetic>                      6.5838+-0.0390         6.5551+-0.0416
   <geometric>                       5.4789+-0.0256         5.4260+-0.0289
   <harmonic>                        4.4947+-0.0220         4.4438+-0.0304         might be 1.0114x faster

                                     TipOfTree              PartialSpecFailOff
V8:
   crypto                            90.9512+-0.7022     ^  86.9512+-0.4698     ^  definitely 1.0460x faster
   deltablue                         264.5347+-0.9710    ?  267.4549+-2.0893    ?  might be 1.0110x slower
   earley-boyer                      93.9388+-0.3832        93.3699+-0.2572
   raytrace                          78.7379+-0.7401        77.6151+-0.4294        might be 1.0145x faster
   regexp                            110.2534+-0.8725    ?  111.8147+-1.1356    ?  might be 1.0142x slower
   richards                          237.0448+-1.9937    ?  240.2445+-1.7043    ?  might be 1.0135x slower
   splay                             102.7832+-0.4053    ?  103.1920+-1.1310    ?
   <arithmetic>                      139.7491+-0.4324    ?  140.0917+-0.4421    ?
   <geometric>                       125.0396+-0.3620       124.6279+-0.3570
   <harmonic>                        114.4838+-0.3624    ^  113.5691+-0.3540    ^  definitely 1.0081x faster

                                     TipOfTree              PartialSpecFailOff
Kraken:
   ai-astar                          1111.5817+-10.0950     1100.4044+-7.0387      might be 1.0102x faster
   audio-beat-detection              484.9305+-3.8803       479.5919+-2.9518       might be 1.0111x faster
   audio-dft                         423.4744+-4.5484       420.9447+-2.8184
   audio-fft                         377.1075+-2.9333       374.8421+-2.5561
   audio-oscillator                  381.5580+-2.0830       380.4500+-2.6626
   imaging-darkroom                  540.4795+-3.7662    ^  531.7605+-2.5730    ^  definitely 1.0164x faster
   imaging-desaturate                616.7875+-7.1272    ?  617.7079+-7.1100    ?
   imaging-gaussian-blur             1732.4610+-3.6755   ?  1739.1962+-16.1256  ?
   json-parse-financial              49.6394+-0.4810        49.6081+-0.2538
   json-stringify-tinderbox          68.6684+-0.7766     ?  68.9953+-0.7159     ?
   stanford-crypto-aes               146.0314+-3.3339       144.2934+-1.3057       might be 1.0120x faster
   stanford-crypto-ccm               112.5396+-0.6479    ?  113.2355+-0.9298    ?
   stanford-crypto-pbkdf2            339.8969+-3.1305       339.6628+-2.0729
   stanford-crypto-sha256-iterative  132.6277+-1.3897       132.0876+-1.3444
   <arithmetic>                      465.5560+-0.7448       463.7700+-1.9575
   <geometric>                       299.9983+-0.5828       298.8703+-0.7334
   <harmonic>                        185.2166+-0.6885       184.9963+-0.6590

                                     TipOfTree              PartialSpecFailOff
All benchmarks:
   <arithmetic>                      163.1321+-0.2674       162.6352+-0.5690
   <geometric>                       28.7626+-0.0819     ^  28.5624+-0.0863     ^  definitely 1.0070x faster
   <harmonic>                        7.9373+-0.0381         7.8488+-0.0527         might be 1.0113x faster
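On the point about one benchmark getting lost in the average: bitops-bits-in-byte moves from 4.5369 ms to 3.2834 ms, a change of 1.2535 ms. If the <arithmetic> row is a plain mean over the 26 SunSpider tests (an assumption about the report, not something stated in it), that change shifts the mean by only about 1.2535 / 26, roughly 0.048 ms on a mean near 6.58 ms, which is under 1%. The three summary rows correspond to the usual Pythagorean means, sketched below with hypothetical function names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Arithmetic mean: sum of the times divided by their count.
double arithmeticMean(const std::vector<double>& times)
{
    assert(!times.empty());
    double sum = 0;
    for (double t : times)
        sum += t;
    return sum / times.size();
}

// Geometric mean, computed via logarithms: exp(mean(log t)). Requires t > 0.
double geometricMean(const std::vector<double>& times)
{
    assert(!times.empty());
    double logSum = 0;
    for (double t : times)
        logSum += std::log(t);
    return std::exp(logSum / times.size());
}

// Harmonic mean: count divided by the sum of reciprocals. Requires t > 0.
double harmonicMean(const std::vector<double>& times)
{
    assert(!times.empty());
    double reciprocalSum = 0;
    for (double t : times)
        reciprocalSum += 1.0 / t;
    return times.size() / reciprocalSum;
}
```

The arithmetic mean is dominated by the longest-running tests, while the harmonic mean is dominated by the shortest ones, which is one reason a report like this prints all three.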
Filip Pizlo
Comment 4 2011-09-08 15:00:16 PDT
Doing this with dynamic optimization enabled appears to reveal a case in v8-crypto where we're speculating incorrectly. My opinion is that we should commit this anyway, since (1) dynamic optimization is turned off by default and (2) we should make v8-crypto speculate correctly all the time instead of relying on the non-speculative path to save us.

Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTreeDyn" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"PartialSpecFail" at /Volumes/Data/pizlo/octonary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                     TipOfTreeDyn           PartialSpecFail
SunSpider:
   3d-cube                           12.4085+-0.3516        12.2519+-0.2947        might be 1.0128x faster
   3d-morph                          7.8265+-0.1363      ?  7.9644+-0.1721      ?  might be 1.0176x slower
   3d-raytrace                       8.1677+-0.2721      ?  8.5670+-0.1768      ?  might be 1.0489x slower
   access-binary-trees               2.4106+-0.0321      ?  2.4442+-0.0792      ?  might be 1.0139x slower
   access-fannkuch                   12.7431+-0.2813     ?  12.7705+-0.1946     ?
   access-nbody                      4.2390+-0.0476      !  4.4125+-0.1213      !  definitely 1.0409x slower
   access-nsieve                     2.7531+-0.0640         2.7184+-0.0627         might be 1.0127x faster
   bitops-3bit-bits-in-byte          1.8846+-0.0544      ?  2.0364+-0.1052      ?  might be 1.0805x slower
   bitops-bits-in-byte               5.2725+-0.2678         5.1915+-0.4005         might be 1.0156x faster
   bitops-bitwise-and                4.1097+-0.1246         3.9832+-0.1009         might be 1.0318x faster
   bitops-nsieve-bits                5.8795+-0.1013      ?  6.0344+-0.2063      ?  might be 1.0263x slower
   controlflow-recursive             2.0543+-0.0607         2.0061+-0.0400         might be 1.0240x faster
   crypto-aes                        7.9677+-0.2851         7.9440+-0.3411
   crypto-md5                        2.9693+-0.0756      ?  3.1021+-0.1297      ?  might be 1.0447x slower
   crypto-sha1                       2.4788+-0.0803      ?  2.4853+-0.0725      ?
   date-format-tofte                 10.6320+-0.2625     ?  10.6375+-0.2228     ?
   date-format-xparb                 9.2417+-0.2094      ?  9.4102+-0.2722      ?  might be 1.0182x slower
   math-cordic                       6.7555+-0.1049      ?  6.7855+-0.0993      ?
   math-partial-sums                 7.7144+-0.1615         7.5565+-0.1082         might be 1.0209x faster
   math-spectral-norm                2.6691+-0.0684         2.6070+-0.0618         might be 1.0238x faster
   regexp-dna                        10.4582+-0.2742     ?  10.4726+-0.2079     ?
   string-base64                     6.3856+-0.1344      ?  6.4854+-0.1409      ?  might be 1.0156x slower
   string-fasta                      7.2776+-0.1751      ?  7.4598+-0.2319      ?  might be 1.0250x slower
   string-tagcloud                   12.5675+-0.5429     ?  12.6614+-0.3102     ?
   string-unpack-code                19.5392+-0.6266        19.5243+-0.5633
   string-validate-input             6.8255+-0.1597      ?  7.2243+-0.2755      ?  might be 1.0584x slower
   <arithmetic>                      7.0474+-0.0485      ?  7.1052+-0.0365      ?
   <geometric>                       5.8534+-0.0323      ?  5.9095+-0.0296      ?
   <harmonic>                        4.7800+-0.0298      ?  4.8287+-0.0358      ?  might be 1.0102x slower

                                     TipOfTreeDyn           PartialSpecFail
V8:
   crypto                            82.4555+-0.3792     !  87.4537+-0.6151     !  definitely 1.0606x slower
   deltablue                         263.4640+-2.7763       261.5359+-2.0604
   earley-boyer                      101.3340+-0.3097       100.8797+-0.3992
   raytrace                          82.1846+-0.3536        81.5247+-0.8216
   regexp                            111.7920+-0.7306       110.9027+-0.5673
   richards                          218.6683+-0.6105    ?  219.1288+-1.2508    ?
   splay                             106.2067+-0.5304       105.4638+-0.5139
   <arithmetic>                      138.0150+-0.5447    ?  138.1270+-0.3684    ?
   <geometric>                       124.7283+-0.4012    ?  125.1911+-0.2985    ?
   <harmonic>                        114.9504+-0.3416    !  115.6934+-0.3325    !  definitely 1.0065x slower

                                     TipOfTreeDyn           PartialSpecFail
Kraken:
   ai-astar                          1138.4605+-9.4245   ?  1143.6393+-8.0190   ?
   audio-beat-detection              514.7744+-2.1217    ?  514.9893+-2.7075    ?
   audio-dft                         470.3802+-3.7706    ?  477.4101+-6.7100    ?  might be 1.0149x slower
   audio-fft                         395.4076+-4.8793    ?  401.4950+-2.7998    ?  might be 1.0154x slower
   audio-oscillator                  351.5889+-2.0485       348.8424+-1.1604
   imaging-darkroom                  539.0097+-7.2727       533.6361+-1.2733       might be 1.0101x faster
   imaging-desaturate                596.2434+-1.6842    ?  597.0473+-2.0741    ?
   imaging-gaussian-blur             2301.8736+-20.0258  ?  2303.2791+-14.8414  ?
   json-parse-financial              50.6473+-0.3132        50.1082+-0.3485        might be 1.0108x faster
   json-stringify-tinderbox          69.7147+-0.5663     ?  70.0110+-0.6125     ?
   stanford-crypto-aes               162.4962+-0.6509    !  166.2443+-2.4364    !  definitely 1.0231x slower
   stanford-crypto-ccm               123.4177+-0.5429    ?  124.6134+-1.4519    ?
   stanford-crypto-pbkdf2            364.9508+-2.1027    ^  358.8682+-2.5451    ^  definitely 1.0169x faster
   stanford-crypto-sha256-iterative  139.6700+-0.4417       138.4910+-1.0300
   <arithmetic>                      515.6168+-2.0708    ?  516.3339+-1.2586    ?
   <geometric>                       316.7307+-0.7793    ?  317.1747+-0.6087    ?
   <harmonic>                        193.0029+-0.5715       192.9705+-0.6043

                                     TipOfTreeDyn           PartialSpecFail
All benchmarks:
   <arithmetic>                      178.0419+-0.5977    ?  178.3043+-0.3970    ?
   <geometric>                       30.3089+-0.0897     !  30.4989+-0.0852     !  definitely 1.0063x slower
   <harmonic>                        8.4339+-0.0515      ?  8.5184+-0.0617      ?  might be 1.0100x slower
Filip Pizlo
Comment 5 2011-09-08 15:12:20 PDT
Created attachment 106799 [details] the patch
Geoffrey Garen
Comment 6 2011-09-09 15:36:23 PDT
Comment on attachment 106799 [details] the patch
View in context: https://bugs.webkit.org/attachment.cgi?id=106799&action=review

r=me

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:1387
> + m_compileIndex = block.begin;
> + m_compileOkay = true;
> + clearGenerationInfo();

It confused me that a block could sometimes assume that generation info was in an empty state, and sometimes not. Would be nice to clean this up in future, possibly by giving each block its own generation info, or maybe just by calling clearGenerationInfo() unconditionally at the head of SpeculativeJIT::compile, if that's not too expensive.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:229
> + // under static speculation, it's more profitable to give up entirely at this

Capital 'U', please.
Gavin Barraclough
Comment 7 2011-09-09 15:36:52 PDT
Comment on attachment 106799 [details] the patch
View in context: https://bugs.webkit.org/attachment.cgi?id=106799&action=review

I think the mechanism implemented in this patch (reintroducing a dynamic bail to non-spec on terminateSpeculation) should be completely orthogonal to DYNAMIC_OPTIMIZATION. We should be able to configure the two separately? If so, it may make sense to land this under a separate #ifdef. I'd suggest changing the ENABLE(DYNAMIC_OPTIMIZATION) tests in the code to something like ENABLE(DYNAMIC_TERMINATE_SPECULATIVE_JIT), and then adding "#define ENABLE_DYNAMIC_TERMINATE_SPECULATIVE_JIT ENABLE_DYNAMIC_OPTIMIZATION" in Platform.h.

r+ with at least a fix to DFG_DEBUG_VERBOSE.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:223
> +#if DFG_DEBUG_VERBOSE

This debug printf should be moved outside of the outer ifdef, such that it is printed for both ENABLE(DYNAMIC_OPTIMIZATION) & !ENABLE(DYNAMIC_OPTIMIZATION).
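A rough sketch of how the two review suggestions could fit together, written as a standalone file rather than as the actual patch. The simplified `ENABLE()` macro, the `#ifndef` defaults, the `OnTermination` enum, the function body and the log message are all assumptions made for illustration; only the flag names come from the comment above.

```cpp
#include <cassert>
#include <cstdio>

// Assumed defaults so the sketch compiles on its own; Platform.h would own these.
#ifndef ENABLE_DYNAMIC_OPTIMIZATION
#define ENABLE_DYNAMIC_OPTIMIZATION 0
#endif

// The suggested separate switch: follows DYNAMIC_OPTIMIZATION unless overridden.
#ifndef ENABLE_DYNAMIC_TERMINATE_SPECULATIVE_JIT
#define ENABLE_DYNAMIC_TERMINATE_SPECULATIVE_JIT ENABLE_DYNAMIC_OPTIMIZATION
#endif

// Simplified stand-in for ENABLE(); assumes each flag is defined to 0 or 1.
#define ENABLE(FEATURE) (ENABLE_##FEATURE)

#ifndef DFG_DEBUG_VERBOSE
#define DFG_DEBUG_VERBOSE 0
#endif

enum class OnTermination { GiveUpOnWholeCodeBlock, BailToNonSpeculativeCode };

inline OnTermination terminateSpeculativeExecution()
{
    // Debug output sits outside the feature #if, so it is printed in both configurations.
#if DFG_DEBUG_VERBOSE
    std::fprintf(stderr, "SpeculativeJIT was terminated.\n"); // hypothetical message
#endif

#if ENABLE(DYNAMIC_TERMINATE_SPECULATIVE_JIT)
    return OnTermination::BailToNonSpeculativeCode;
#else
    return OnTermination::GiveUpOnWholeCodeBlock;
#endif
}
```

Building with `-DENABLE_DYNAMIC_TERMINATE_SPECULATIVE_JIT=1` would then switch on the bail-out path without touching DYNAMIC_OPTIMIZATION, which is the separate configurability the review asks about.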
Gavin Barraclough
Comment 8 2011-09-09 15:41:36 PDT
(In reply to comment #6)
> (From update of attachment 106799 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=106799&action=review
>
> r=me
>
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:1387
> > + m_compileIndex = block.begin;
> > + m_compileOkay = true;
> > + clearGenerationInfo();
>
> It confused me that a block could sometimes assume that generation info was in an empty state, and sometimes not. Would be nice to clean this up in future, possibly by giving each block its own generation info, or maybe just by calling clearGenerationInfo() unconditionally at the head of SpeculativeJIT::compile, if that's not too expensive.

One way to ensure that the generation info is already clear at the head of compile(BasicBlock&) may be to call clearGenerationInfo() from terminateSpeculativeExecution(), then we may be able to assert in all cases that the generation info is already clear at the head of blocks.
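The invariant proposed here can be sketched with a miniature class. This is a toy, not SpeculativeJIT: `ToySpeculativeJIT`, its `GenerationInfo` placeholder and the boolean-per-node input are hypothetical, and clearing at the normal end of a block merely stands in for values being consumed before the block ends.

```cpp
#include <cassert>
#include <vector>

class ToySpeculativeJIT {
public:
    // Compiles one block; each element says whether that node's speculation
    // contradicts static information. Returns true if no termination occurred.
    bool compile(const std::vector<bool>& nodeContradictsStaticInfo)
    {
        // The proposed invariant: always holds at the head of a block.
        assert(m_generationInfo.empty());
        m_compileOkay = true;

        for (bool contradicts : nodeContradictsStaticInfo) {
            if (contradicts) {
                terminateSpeculativeExecution();
                break;
            }
            m_generationInfo.push_back(GenerationInfo());
        }

        if (m_compileOkay)
            clearGenerationInfo(); // stand-in for values being used up by block end
        return m_compileOkay;
    }

    bool generationInfoIsClear() const { return m_generationInfo.empty(); }

private:
    struct GenerationInfo { };

    // Clearing here means the next block never sees leftover state.
    void terminateSpeculativeExecution()
    {
        m_compileOkay = false;
        clearGenerationInfo();
    }

    void clearGenerationInfo() { m_generationInfo.clear(); }

    std::vector<GenerationInfo> m_generationInfo;
    bool m_compileOkay = true;
};
```

Because termination clears the state itself, the assertion at the head of `compile` holds whether the previous block finished normally or bailed out partway through.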
Filip Pizlo
Comment 9 2011-09-09 17:07:10 PDT
Created attachment 106944 [details] the patch - fix review

Will wait for the bots to be happy before I land.
Geoffrey Garen
Comment 10 2011-09-09 17:16:37 PDT
Comment on attachment 106944 [details] the patch - fix review

r=me
WebKit Review Bot
Comment 11 2011-09-10 14:22:52 PDT
Comment on attachment 106944 [details] the patch - fix review

Clearing flags on attachment: 106944

Committed r94914: <http://trac.webkit.org/changeset/94914>
WebKit Review Bot
Comment 12 2011-09-10 14:22:57 PDT
All reviewed patches have been landed. Closing bug.