Bug 67798 - DFG JIT completely undoes speculative compilation even in the case of a partial static speculation failure
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: 528+ (Nightly build)
Hardware: All All
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-08 13:25 PDT by Filip Pizlo
Modified: 2011-09-10 14:22 PDT

See Also:


Attachments
the patch (5.75 KB, patch)
2011-09-08 15:12 PDT, Filip Pizlo
ggaren: review+
the patch - fix review (7.41 KB, patch)
2011-09-09 17:07 PDT, Filip Pizlo
no flags

Description Filip Pizlo 2011-09-08 13:25:49 PDT
The DFG JIT may perform a speculation that contravenes static information. For example, it may assume that a value must be an integer when the code that produces it always produces a cell, and the fact that it produces a cell is proven statically. In that case, it terminates speculation. Currently this means undoing speculative compilation for the entire code block and recompiling the whole code block with the non-speculative JIT. What it should probably do instead is jump out of speculative code at the point where the static information contravenes the speculation. That way, if this scenario happens only partially (i.e. in conditional code, which may be a slow path anyway), the code block still benefits from speculation whenever that condition does not arise.
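The difference between the two behaviours can be sketched with a toy model. This is not JavaScriptCore code: Type, Node, BasicBlock, compileAllOrNothing and compilePartial are hypothetical names made up for illustration, and the real DFG tracks far more state than a single proven type per node.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy value categories; the real JIT distinguishes many more.
enum class Type { Int32, Cell };

struct Node {
    Type proven;      // what static information says the value is
    Type speculated;  // what the speculative compiler assumes it is
};

struct BasicBlock {
    std::vector<Node> nodes;
};

enum class BlockResult { Speculative, BailsOut };

// Current behaviour (as described in this bug): one contradiction anywhere
// means the whole code block loses speculative compilation.
bool compileAllOrNothing(const std::vector<BasicBlock>& blocks)
{
    for (const BasicBlock& block : blocks) {
        for (const Node& node : block.nodes) {
            if (node.proven != node.speculated)
                return false; // caller recompiles everything non-speculatively
        }
    }
    return true;
}

// Proposed behaviour: a contradiction only ends speculation at that point in
// that basic block; the remaining blocks are still compiled speculatively.
std::vector<BlockResult> compilePartial(const std::vector<BasicBlock>& blocks)
{
    std::vector<BlockResult> results;
    for (const BasicBlock& block : blocks) {
        BlockResult result = BlockResult::Speculative;
        for (const Node& node : block.nodes) {
            if (node.proven != node.speculated) {
                // A real JIT would emit a jump to non-speculative code here.
                result = BlockResult::BailsOut;
                break;
            }
        }
        results.push_back(result);
    }
    assert(results.size() == blocks.size());
    return results;
}
```

With three blocks where only the middle one contains a contradicted speculation, compileAllOrNothing gives up on all three, while compilePartial keeps the first and last speculative.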
Comment 1 Filip Pizlo 2011-09-08 13:29:42 PDT
This is a work in progress, and isn't totally stable yet.  It's also a regression on v8-crypto under static speculation (which is still the default in ToT).



Benchmark report for SunSpider and V8.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"PartialSpecFail" at /Volumes/Data/pizlo/octonary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration
per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level
timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.
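
A note on reading the verdict column in these reports. The rule below is inferred from the numbers in the tables, not taken from the benchmark harness source, so treat it as an assumption: a result appears to be called "definitely" faster or slower when the two 95% confidence intervals do not overlap, "might be" when they overlap but the means differ by at least 1%, and it gets no text otherwise. Sample and verdict are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// One table cell such as "2.2198+-0.0391": a mean and the half-width of its
// 95% confidence interval, both in milliseconds.
struct Sample {
    double mean;
    double halfWidth;
};

// Inferred rule, not the harness's actual code.
std::string verdict(const Sample& baseline, const Sample& candidate)
{
    assert(baseline.halfWidth >= 0 && candidate.halfWidth >= 0);

    const bool faster = candidate.mean < baseline.mean;
    const double ratio = faster ? baseline.mean / candidate.mean
                                : candidate.mean / baseline.mean;

    // Two intervals overlap when each one starts before the other ends.
    const bool overlap =
        baseline.mean - baseline.halfWidth <= candidate.mean + candidate.halfWidth
        && candidate.mean - candidate.halfWidth <= baseline.mean + baseline.halfWidth;

    // Overlapping intervals with less than a 1% difference get no text.
    if (overlap && ratio < 1.01)
        return "";

    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%s %.4fx %s",
                  overlap ? "might be" : "definitely",
                  ratio,
                  faster ? "faster" : "slower");
    return buffer;
}
```

For the crypto-sha1 row of the first SunSpider table, the intervals are [2.1807, 2.2589] and [2.2719, 2.4155]. They do not overlap, and 2.3437 / 2.2198 rounds to 1.0558, which matches the reported "definitely 1.0558x slower".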

                                    TipOfTree            PartialSpecFail                                 
SunSpider:
   3d-cube                        7.8561+-0.1646          7.8155+-0.2022       
   3d-morph                       7.8473+-0.1579          7.6001+-0.1902         might be 1.0325x faster
   3d-raytrace                    7.6863+-0.1967          7.6497+-0.2789       
   access-binary-trees            2.2991+-0.0437    ?     2.3556+-0.0766       ? might be 1.0245x slower
   access-fannkuch               12.0374+-0.2866         11.8132+-0.1643         might be 1.0190x faster
   access-nbody                   4.4257+-0.1024          4.3487+-0.0582         might be 1.0177x faster
   access-nsieve                  2.5066+-0.0831    ?     2.5857+-0.0715       ? might be 1.0316x slower
   bitops-3bit-bits-in-byte       1.7548+-0.0479    ?     1.7602+-0.0528       ?
   bitops-bits-in-byte            4.5568+-0.2521    ?     4.6508+-0.2218       ? might be 1.0206x slower
   bitops-bitwise-and             3.7287+-0.0639    ?     3.7565+-0.0750       ?
   bitops-nsieve-bits             5.5253+-0.1456    ?     5.6534+-0.1690       ? might be 1.0232x slower
   controlflow-recursive          2.0740+-0.0461          2.0333+-0.0497         might be 1.0200x faster
   crypto-aes                     6.9185+-0.3649          6.8543+-0.3203       
   crypto-md5                     2.8268+-0.0863    ?     2.8695+-0.1142       ? might be 1.0151x slower
   crypto-sha1                    2.2198+-0.0391    !     2.3437+-0.0718       ! definitely 1.0558x slower
   date-format-tofte             10.4411+-0.3354         10.1745+-0.2410         might be 1.0262x faster
   date-format-xparb              9.1471+-0.2291    ?     9.1878+-0.2066       ?
   math-cordic                    6.4154+-0.1277          6.2871+-0.1184         might be 1.0204x faster
   math-partial-sums              7.9001+-0.1389          7.8879+-0.1401       
   math-spectral-norm             2.5591+-0.0447    ^     2.4600+-0.0321       ^ definitely 1.0403x faster
   regexp-dna                    10.5650+-0.2633    ?    10.5712+-0.1567       ?
   string-base64                  6.0538+-0.1903    ?     6.1351+-0.2088       ? might be 1.0134x slower
   string-fasta                   7.6601+-0.2406    ?     7.6878+-0.1736       ?
   string-tagcloud               12.2444+-0.3531         12.1599+-0.2803       
   string-unpack-code            18.9348+-0.3692    ?    19.0148+-0.3475       ?
   string-validate-input          7.2754+-0.2723          7.2458+-0.2474       

   <arithmetic>                   6.6715+-0.0391          6.6501+-0.0287       
   <geometric>                    5.5415+-0.0294          5.5415+-0.0278       
   <harmonic>                     4.5240+-0.0279    ?     4.5413+-0.0278       ?

                                    TipOfTree            PartialSpecFail                                 
V8:
   crypto                        91.7346+-0.8852    !   100.1585+-0.5619       ! definitely 1.0918x slower
   deltablue                    270.8549+-2.1356        267.9829+-1.3881         might be 1.0107x faster
   earley-boyer                  95.2392+-0.6167    !    97.1235+-0.8832       ! definitely 1.0198x slower
   raytrace                      80.0969+-0.8197         79.2897+-0.3222         might be 1.0102x faster
   regexp                       112.4222+-1.1974        111.2879+-0.7193         might be 1.0102x faster
   richards                     246.1516+-1.8007    ^   240.0536+-0.7336       ^ definitely 1.0254x faster
   splay                        103.9291+-0.3727    ?   104.3902+-1.1212       ?

   <arithmetic>                 142.9183+-0.5466        142.8981+-0.2484       
   <geometric>                  127.4043+-0.3828    !   128.4276+-0.2773       ! definitely 1.0080x slower
   <harmonic>                   116.3446+-0.3055    !   117.9230+-0.2941       ! definitely 1.0136x slower

                                    TipOfTree            PartialSpecFail                                 
All benchmarks:
   <arithmetic>                  35.5724+-0.1334         35.5512+-0.0619       
   <geometric>                   10.7755+-0.0479    ?    10.7938+-0.0442       ?
   <harmonic>                     5.6825+-0.0347    ?     5.7048+-0.0345       ?
Comment 2 Filip Pizlo 2011-09-08 14:15:07 PDT
This now appears stable.  But, it's a V8 slow-down.



Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"PartialSpecFail" at /Volumes/Data/pizlo/octonary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.

                                            TipOfTree            PartialSpecFail                                 
SunSpider:
   3d-cube                                7.8790+-0.1847          7.7234+-0.2069         might be 1.0202x faster
   3d-morph                               7.8556+-0.1429    ^     7.4219+-0.1409       ^ definitely 1.0584x faster
   3d-raytrace                            7.5771+-0.2102          7.5468+-0.1384       
   access-binary-trees                    2.2669+-0.0389    ?     2.2928+-0.0555       ? might be 1.0114x slower
   access-fannkuch                       11.9656+-0.2655    ?    11.9976+-0.2837       ?
   access-nbody                           4.3613+-0.1447          4.2937+-0.0866         might be 1.0157x faster
   access-nsieve                          2.4939+-0.0718    ?     2.5792+-0.0601       ? might be 1.0342x slower
   bitops-3bit-bits-in-byte               1.7326+-0.0618    ?     1.8014+-0.0547       ? might be 1.0397x slower
   bitops-bits-in-byte                    4.6134+-0.2415    ^     3.3414+-0.1481       ^ definitely 1.3807x faster
   bitops-bitwise-and                     3.7141+-0.0795          3.6939+-0.0653       
   bitops-nsieve-bits                     5.5147+-0.1153          5.4464+-0.1066         might be 1.0125x faster
   controlflow-recursive                  2.0409+-0.0581    ?     2.0611+-0.0426       ?
   crypto-aes                             6.5953+-0.1324          6.5248+-0.1913         might be 1.0108x faster
   crypto-md5                             2.7708+-0.0604          2.7644+-0.0653       
   crypto-sha1                            2.3058+-0.0905          2.2729+-0.0468         might be 1.0145x faster
   date-format-tofte                     10.5226+-0.3112         10.2755+-0.3134         might be 1.0240x faster
   date-format-xparb                      8.8497+-0.3011    ?     9.0041+-0.2844       ? might be 1.0175x slower
   math-cordic                            6.3786+-0.1561          6.3529+-0.1171       
   math-partial-sums                      7.8715+-0.1752          7.7494+-0.1679         might be 1.0157x faster
   math-spectral-norm                     2.5403+-0.0564    ?     2.5837+-0.1298       ? might be 1.0171x slower
   regexp-dna                            10.5271+-0.1745         10.3552+-0.2115         might be 1.0166x faster
   string-base64                          6.1475+-0.2276          6.0770+-0.2383         might be 1.0116x faster
   string-fasta                           7.7278+-0.2546          7.5394+-0.1755         might be 1.0250x faster
   string-tagcloud                       12.1584+-0.4519    ?    12.2708+-0.3747       ?
   string-unpack-code                    18.6953+-0.4765    ?    19.0792+-0.4617       ? might be 1.0205x slower
   string-validate-input                  7.2421+-0.2037          7.0584+-0.1706         might be 1.0260x faster

   <arithmetic>                           6.6288+-0.0408    ^     6.5426+-0.0404       ^ definitely 1.0132x faster
   <geometric>                            5.5099+-0.0338    ^     5.4210+-0.0341       ^ definitely 1.0164x faster
   <harmonic>                             4.4992+-0.0305          4.4473+-0.0353         might be 1.0117x faster

                                            TipOfTree            PartialSpecFail                                 
V8:
   crypto                                91.0338+-0.5191    !   104.1613+-0.8925       ! definitely 1.1442x slower
   deltablue                            269.7535+-2.7809    ^   265.7885+-0.7118       ^ definitely 1.0149x faster
   earley-boyer                          95.0161+-0.5041    ?    95.5356+-0.5577       ?
   raytrace                              79.1499+-0.5473    ?    79.7137+-0.6612       ?
   regexp                               110.8495+-0.4558    ^   109.2641+-0.3574       ^ definitely 1.0145x faster
   richards                             240.9133+-0.8254        239.4367+-1.6783       
   splay                                103.0647+-0.6429    !   104.4789+-0.7212       ! definitely 1.0137x slower

   <arithmetic>                         141.3972+-0.2966    !   142.6256+-0.4231       ! definitely 1.0087x slower
   <geometric>                          126.1408+-0.1666    !   128.4242+-0.3937       ! definitely 1.0181x slower
   <harmonic>                           115.2637+-0.1753    !   118.0846+-0.4050       ! definitely 1.0245x slower

                                            TipOfTree            PartialSpecFail                                 
Kraken:
   ai-astar                            1108.9297+-5.9845    ?  1123.4756+-12.5830      ? might be 1.0131x slower
   audio-beat-detection                 481.1936+-1.4541    ?   486.0185+-4.7695       ? might be 1.0100x slower
   audio-dft                            426.4858+-4.3795        425.2002+-2.3276       
   audio-fft                            373.8409+-2.1670    ?   374.6652+-0.9116       ?
   audio-oscillator                     384.1150+-2.2719    ?   387.0932+-3.3490       ?
   imaging-darkroom                     537.6787+-3.3081        534.4603+-2.0673       
   imaging-desaturate                   623.8627+-8.2803        615.4398+-4.8321         might be 1.0137x faster
   imaging-gaussian-blur               1738.3217+-5.4653       1729.6538+-4.2880       
   json-parse-financial                  49.1108+-0.5305    ?    49.8053+-0.3037       ? might be 1.0141x slower
   json-stringify-tinderbox              72.4905+-0.6863    ^    69.0171+-0.4305       ^ definitely 1.0503x faster
   stanford-crypto-aes                  145.4706+-1.1861        145.0223+-1.1645       
   stanford-crypto-ccm                  115.7358+-0.3964    ^   113.6545+-0.7748       ^ definitely 1.0183x faster
   stanford-crypto-pbkdf2               338.3754+-1.7782    !   341.7524+-1.4837       ! definitely 1.0100x slower
   stanford-crypto-sha256-iterative     131.3300+-0.4891    !   134.3038+-1.2092       ! definitely 1.0226x slower

   <arithmetic>                         466.2101+-0.8557    ?   466.3973+-1.1839       ?
   <geometric>                          301.2570+-0.3548        300.8576+-0.4784       
   <harmonic>                           186.9324+-0.3719    ^   186.0068+-0.4899       ^ definitely 1.0050x faster

                                            TipOfTree            PartialSpecFail                                 
All benchmarks:
   <arithmetic>                         163.5972+-0.2484    ?   163.7883+-0.3213       ?
   <geometric>                           28.9262+-0.1040         28.7323+-0.1022       
   <harmonic>                             7.9467+-0.0525          7.8585+-0.0613         might be 1.0112x faster
Comment 3 Filip Pizlo 2011-09-08 14:39:33 PDT
Looks like this patch will work best if it is turned off for static speculation, but turned on for dynamic speculation.  Here's the performance with it turned off.  Note the noise (a 38% speed-up on one SunSpider benchmark that gets totally lost in the average).  I'm convinced that it is in fact noise and not real.



Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"PartialSpecFailOff" at /Volumes/Data/pizlo/octonary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.

                                            TipOfTree           PartialSpecFailOff                               
SunSpider:
   3d-cube                                7.7127+-0.1331    ?     7.8798+-0.2091       ? might be 1.0217x slower
   3d-morph                               7.5139+-0.1326    ?     7.5244+-0.2035       ?
   3d-raytrace                            7.3208+-0.1767    ?     7.5204+-0.2760       ? might be 1.0273x slower
   access-binary-trees                    2.3967+-0.0970          2.2549+-0.0498         might be 1.0629x faster
   access-fannkuch                       11.9275+-0.2377         11.8759+-0.1481       
   access-nbody                           4.2453+-0.0602    ?     4.2470+-0.0751       ?
   access-nsieve                          2.5830+-0.0807          2.4702+-0.0412         might be 1.0457x faster
   bitops-3bit-bits-in-byte               1.7405+-0.0422    ?     1.7923+-0.0560       ? might be 1.0298x slower
   bitops-bits-in-byte                    4.5369+-0.1687    ^     3.2834+-0.0690       ^ definitely 1.3817x faster
   bitops-bitwise-and                     3.6754+-0.0649          3.6354+-0.0616         might be 1.0110x faster
   bitops-nsieve-bits                     5.3928+-0.1547    ?     5.5196+-0.1109       ? might be 1.0235x slower
   controlflow-recursive                  2.0130+-0.0451    ?     2.0507+-0.0368       ? might be 1.0187x slower
   crypto-aes                             6.6134+-0.2576          6.6038+-0.1874       
   crypto-md5                             2.8496+-0.1166    ?     2.9049+-0.1234       ? might be 1.0194x slower
   crypto-sha1                            2.2437+-0.0694    ?     2.3244+-0.0596       ? might be 1.0360x slower
   date-format-tofte                     10.1987+-0.2700    ?    10.2935+-0.2483       ?
   date-format-xparb                      9.0265+-0.2755          8.9221+-0.3400         might be 1.0117x faster
   math-cordic                            6.2957+-0.0972    ?     6.4081+-0.1749       ? might be 1.0179x slower
   math-partial-sums                      7.7582+-0.1480    ?     7.8717+-0.1562       ? might be 1.0146x slower
   math-spectral-norm                     2.5476+-0.0818    ?     2.5812+-0.1030       ? might be 1.0132x slower
   regexp-dna                            10.6437+-0.2123         10.3263+-0.1333         might be 1.0307x faster
   string-base64                          6.1379+-0.2106          6.0649+-0.1760         might be 1.0120x faster
   string-fasta                           7.4621+-0.1615    ?     7.5237+-0.1623       ?
   string-tagcloud                       12.0855+-0.2801    ?    12.3640+-0.3423       ? might be 1.0230x slower
   string-unpack-code                    19.0547+-0.2942    ?    19.1257+-0.4011       ?
   string-validate-input                  7.2033+-0.2369          7.0648+-0.2110         might be 1.0196x faster

   <arithmetic>                           6.5838+-0.0390          6.5551+-0.0416       
   <geometric>                            5.4789+-0.0256          5.4260+-0.0289       
   <harmonic>                             4.4947+-0.0220          4.4438+-0.0304         might be 1.0114x faster

                                            TipOfTree           PartialSpecFailOff                               
V8:
   crypto                                90.9512+-0.7022    ^    86.9512+-0.4698       ^ definitely 1.0460x faster
   deltablue                            264.5347+-0.9710    ?   267.4549+-2.0893       ? might be 1.0110x slower
   earley-boyer                          93.9388+-0.3832         93.3699+-0.2572       
   raytrace                              78.7379+-0.7401         77.6151+-0.4294         might be 1.0145x faster
   regexp                               110.2534+-0.8725    ?   111.8147+-1.1356       ? might be 1.0142x slower
   richards                             237.0448+-1.9937    ?   240.2445+-1.7043       ? might be 1.0135x slower
   splay                                102.7832+-0.4053    ?   103.1920+-1.1310       ?

   <arithmetic>                         139.7491+-0.4324    ?   140.0917+-0.4421       ?
   <geometric>                          125.0396+-0.3620        124.6279+-0.3570       
   <harmonic>                           114.4838+-0.3624    ^   113.5691+-0.3540       ^ definitely 1.0081x faster

                                            TipOfTree           PartialSpecFailOff                               
Kraken:
   ai-astar                            1111.5817+-10.0950      1100.4044+-7.0387         might be 1.0102x faster
   audio-beat-detection                 484.9305+-3.8803        479.5919+-2.9518         might be 1.0111x faster
   audio-dft                            423.4744+-4.5484        420.9447+-2.8184       
   audio-fft                            377.1075+-2.9333        374.8421+-2.5561       
   audio-oscillator                     381.5580+-2.0830        380.4500+-2.6626       
   imaging-darkroom                     540.4795+-3.7662    ^   531.7605+-2.5730       ^ definitely 1.0164x faster
   imaging-desaturate                   616.7875+-7.1272    ?   617.7079+-7.1100       ?
   imaging-gaussian-blur               1732.4610+-3.6755    ?  1739.1962+-16.1256      ?
   json-parse-financial                  49.6394+-0.4810         49.6081+-0.2538       
   json-stringify-tinderbox              68.6684+-0.7766    ?    68.9953+-0.7159       ?
   stanford-crypto-aes                  146.0314+-3.3339        144.2934+-1.3057         might be 1.0120x faster
   stanford-crypto-ccm                  112.5396+-0.6479    ?   113.2355+-0.9298       ?
   stanford-crypto-pbkdf2               339.8969+-3.1305        339.6628+-2.0729       
   stanford-crypto-sha256-iterative     132.6277+-1.3897        132.0876+-1.3444       

   <arithmetic>                         465.5560+-0.7448        463.7700+-1.9575       
   <geometric>                          299.9983+-0.5828        298.8703+-0.7334       
   <harmonic>                           185.2166+-0.6885        184.9963+-0.6590       

                                            TipOfTree           PartialSpecFailOff                               
All benchmarks:
   <arithmetic>                         163.1321+-0.2674        162.6352+-0.5690       
   <geometric>                           28.7626+-0.0819    ^    28.5624+-0.0863       ^ definitely 1.0070x faster
   <harmonic>                             7.9373+-0.0381          7.8488+-0.0527         might be 1.0113x faster
Comment 4 Filip Pizlo 2011-09-08 15:00:16 PDT
Doing this with dynamic optimization enabled appears to reveal a case in v8-crypto where we're speculating incorrectly.  My opinion is that we should commit this anyway, since (1) dynamic optimization is turned off by default and (2) we should make v8-crypto speculate correctly all the time instead of relying on the non-speculative path to save us.



Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTreeDyn" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"PartialSpecFail" at /Volumes/Data/pizlo/octonary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.

                                           TipOfTreeDyn          PartialSpecFail                                 
SunSpider:
   3d-cube                               12.4085+-0.3516         12.2519+-0.2947         might be 1.0128x faster
   3d-morph                               7.8265+-0.1363    ?     7.9644+-0.1721       ? might be 1.0176x slower
   3d-raytrace                            8.1677+-0.2721    ?     8.5670+-0.1768       ? might be 1.0489x slower
   access-binary-trees                    2.4106+-0.0321    ?     2.4442+-0.0792       ? might be 1.0139x slower
   access-fannkuch                       12.7431+-0.2813    ?    12.7705+-0.1946       ?
   access-nbody                           4.2390+-0.0476    !     4.4125+-0.1213       ! definitely 1.0409x slower
   access-nsieve                          2.7531+-0.0640          2.7184+-0.0627         might be 1.0127x faster
   bitops-3bit-bits-in-byte               1.8846+-0.0544    ?     2.0364+-0.1052       ? might be 1.0805x slower
   bitops-bits-in-byte                    5.2725+-0.2678          5.1915+-0.4005         might be 1.0156x faster
   bitops-bitwise-and                     4.1097+-0.1246          3.9832+-0.1009         might be 1.0318x faster
   bitops-nsieve-bits                     5.8795+-0.1013    ?     6.0344+-0.2063       ? might be 1.0263x slower
   controlflow-recursive                  2.0543+-0.0607          2.0061+-0.0400         might be 1.0240x faster
   crypto-aes                             7.9677+-0.2851          7.9440+-0.3411       
   crypto-md5                             2.9693+-0.0756    ?     3.1021+-0.1297       ? might be 1.0447x slower
   crypto-sha1                            2.4788+-0.0803    ?     2.4853+-0.0725       ?
   date-format-tofte                     10.6320+-0.2625    ?    10.6375+-0.2228       ?
   date-format-xparb                      9.2417+-0.2094    ?     9.4102+-0.2722       ? might be 1.0182x slower
   math-cordic                            6.7555+-0.1049    ?     6.7855+-0.0993       ?
   math-partial-sums                      7.7144+-0.1615          7.5565+-0.1082         might be 1.0209x faster
   math-spectral-norm                     2.6691+-0.0684          2.6070+-0.0618         might be 1.0238x faster
   regexp-dna                            10.4582+-0.2742    ?    10.4726+-0.2079       ?
   string-base64                          6.3856+-0.1344    ?     6.4854+-0.1409       ? might be 1.0156x slower
   string-fasta                           7.2776+-0.1751    ?     7.4598+-0.2319       ? might be 1.0250x slower
   string-tagcloud                       12.5675+-0.5429    ?    12.6614+-0.3102       ?
   string-unpack-code                    19.5392+-0.6266         19.5243+-0.5633       
   string-validate-input                  6.8255+-0.1597    ?     7.2243+-0.2755       ? might be 1.0584x slower

   <arithmetic>                           7.0474+-0.0485    ?     7.1052+-0.0365       ?
   <geometric>                            5.8534+-0.0323    ?     5.9095+-0.0296       ?
   <harmonic>                             4.7800+-0.0298    ?     4.8287+-0.0358       ? might be 1.0102x slower

                                           TipOfTreeDyn          PartialSpecFail                                 
V8:
   crypto                                82.4555+-0.3792    !    87.4537+-0.6151       ! definitely 1.0606x slower
   deltablue                            263.4640+-2.7763        261.5359+-2.0604       
   earley-boyer                         101.3340+-0.3097        100.8797+-0.3992       
   raytrace                              82.1846+-0.3536         81.5247+-0.8216       
   regexp                               111.7920+-0.7306        110.9027+-0.5673       
   richards                             218.6683+-0.6105    ?   219.1288+-1.2508       ?
   splay                                106.2067+-0.5304        105.4638+-0.5139       

   <arithmetic>                         138.0150+-0.5447    ?   138.1270+-0.3684       ?
   <geometric>                          124.7283+-0.4012    ?   125.1911+-0.2985       ?
   <harmonic>                           114.9504+-0.3416    !   115.6934+-0.3325       ! definitely 1.0065x slower

                                           TipOfTreeDyn          PartialSpecFail                                 
Kraken:
   ai-astar                            1138.4605+-9.4245    ?  1143.6393+-8.0190       ?
   audio-beat-detection                 514.7744+-2.1217    ?   514.9893+-2.7075       ?
   audio-dft                            470.3802+-3.7706    ?   477.4101+-6.7100       ? might be 1.0149x slower
   audio-fft                            395.4076+-4.8793    ?   401.4950+-2.7998       ? might be 1.0154x slower
   audio-oscillator                     351.5889+-2.0485        348.8424+-1.1604       
   imaging-darkroom                     539.0097+-7.2727        533.6361+-1.2733         might be 1.0101x faster
   imaging-desaturate                   596.2434+-1.6842    ?   597.0473+-2.0741       ?
   imaging-gaussian-blur               2301.8736+-20.0258   ?  2303.2791+-14.8414      ?
   json-parse-financial                  50.6473+-0.3132         50.1082+-0.3485         might be 1.0108x faster
   json-stringify-tinderbox              69.7147+-0.5663    ?    70.0110+-0.6125       ?
   stanford-crypto-aes                  162.4962+-0.6509    !   166.2443+-2.4364       ! definitely 1.0231x slower
   stanford-crypto-ccm                  123.4177+-0.5429    ?   124.6134+-1.4519       ?
   stanford-crypto-pbkdf2               364.9508+-2.1027    ^   358.8682+-2.5451       ^ definitely 1.0169x faster
   stanford-crypto-sha256-iterative     139.6700+-0.4417        138.4910+-1.0300       

   <arithmetic>                         515.6168+-2.0708    ?   516.3339+-1.2586       ?
   <geometric>                          316.7307+-0.7793    ?   317.1747+-0.6087       ?
   <harmonic>                           193.0029+-0.5715        192.9705+-0.6043       

                                           TipOfTreeDyn          PartialSpecFail                                 
All benchmarks:
   <arithmetic>                         178.0419+-0.5977    ?   178.3043+-0.3970       ?
   <geometric>                           30.3089+-0.0897    !    30.4989+-0.0852       ! definitely 1.0063x slower
   <harmonic>                             8.4339+-0.0515    ?     8.5184+-0.0617       ? might be 1.0100x slower
Comment 5 Filip Pizlo 2011-09-08 15:12:20 PDT
Created attachment 106799 [details]
the patch
Comment 6 Geoffrey Garen 2011-09-09 15:36:23 PDT
Comment on attachment 106799 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=106799&action=review

r=me

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:1387
> +        m_compileIndex = block.begin;
> +        m_compileOkay = true;
> +        clearGenerationInfo();

It confused me that a block could sometimes assume that generation info was in an empty state, and sometimes not. Would be nice to clean this up in future, possibly by giving each block its own generation info, or maybe just by calling clearGenerationInfo() unconditionally at the head of SpeculativeJIT::compile, if that's not too expensive.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:229
> +        // under static speculation, it's more profitable to give up entirely at this

Capital 'U', please.
Comment 7 Gavin Barraclough 2011-09-09 15:36:52 PDT
Comment on attachment 106799 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=106799&action=review

I think the mechanism implemented in this patch (reintroducing a dynamic bail to non-spec on terminateSpeculation) should be completely orthogonal to DYNAMIC_OPTIMIZATION; we should be able to configure the two separately. If so, it may make sense to land this under a separate #ifdef.  I'd suggest changing the ENABLE(DYNAMIC_OPTIMIZATION) tests in the code to something like ENABLE(DYNAMIC_TERMINATE_SPECULATIVE_JIT), & then "#define ENABLE_DYNAMIC_TERMINATE_SPECULATIVE_JIT ENABLE_DYNAMIC_OPTIMIZATION" in Platform.h.

r+ with at least a fix to DFG_DEBUG_VERBOSE.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:223
> +#if DFG_DEBUG_VERBOSE

This debug printf should be moved outside of the outer ifdef, such that it is printed for both ENABLE(DYNAMIC_OPTIMIZATION) & !ENABLE(DYNAMIC_OPTIMIZATION).
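
A minimal sketch of the two suggestions in this comment, assuming a simplified stand-in for the ENABLE() macro and made-up default values for the switches. TerminationPolicy and the body of terminateSpeculativeExecution() are hypothetical and only illustrate the shape of the #if structure; this is not the code from the patch.

```cpp
#include <cassert>
#include <cstdio>

// Assumed defaults for illustration; the real values live in Platform.h.
#ifndef ENABLE_DYNAMIC_OPTIMIZATION
#define ENABLE_DYNAMIC_OPTIMIZATION 0
#endif

// The new switch follows DYNAMIC_OPTIMIZATION unless it is set separately.
#ifndef ENABLE_DYNAMIC_TERMINATE_SPECULATIVE_JIT
#define ENABLE_DYNAMIC_TERMINATE_SPECULATIVE_JIT ENABLE_DYNAMIC_OPTIMIZATION
#endif

#ifndef DFG_DEBUG_VERBOSE
#define DFG_DEBUG_VERBOSE 0
#endif

// Simplified stand-in for the ENABLE() macro: every switch here is always
// defined as 0 or 1, so a plain token paste is enough.
#define ENABLE(FEATURE) (ENABLE_##FEATURE)

enum class TerminationPolicy {
    JumpToNonSpeculativePath,   // bail out only at the point of failure
    AbandonSpeculativeCompile   // give up on the whole code block
};

inline TerminationPolicy terminateSpeculativeExecution()
{
#if DFG_DEBUG_VERBOSE
    // Outside the feature #if, so it is printed under either policy.
    std::fprintf(stderr, "SpeculativeJIT was terminated.\n");
#endif

#if ENABLE(DYNAMIC_TERMINATE_SPECULATIVE_JIT)
    return TerminationPolicy::JumpToNonSpeculativePath;
#else
    return TerminationPolicy::AbandonSpeculativeCompile;
#endif
}
```

With the defaults assumed above, the function returns AbandonSpeculativeCompile. Building with -DENABLE_DYNAMIC_TERMINATE_SPECULATIVE_JIT=1 flips the policy without touching DYNAMIC_OPTIMIZATION, which is the separation being asked for.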
Comment 8 Gavin Barraclough 2011-09-09 15:41:36 PDT
(In reply to comment #6)
> (From update of attachment 106799 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=106799&action=review
> 
> r=me
> 
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:1387
> > +        m_compileIndex = block.begin;
> > +        m_compileOkay = true;
> > +        clearGenerationInfo();
> 
> It confused me that a block could sometimes assume that generation info was in an empty state, and sometimes not. Would be nice to clean this up in future, possibly by giving each block its own generation info, or maybe just by calling clearGenerationInfo() unconditionally at the head of SpeculativeJIT::compile, if that's not too expensive.

One way to ensure that the generation info is already clear at the head of compile(BasicBlock&) may be to call clearGenerationInfo() from terminateSpeculativeExecution(). Then we may be able to assert in all cases that the generation info is already clear at the head of blocks.
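
The invariant suggested here can be sketched with a toy class. ToySpeculativeJIT, its integer "nodes" and the contradictedNode parameter are hypothetical and stand in for the real SpeculativeJIT state; the one modelling assumption is that a block which compiles to the end leaves no generation info behind.

```cpp
#include <cassert>
#include <vector>

class ToySpeculativeJIT {
public:
    // Compiles one block; returns false if speculation was terminated.
    // contradictedNode is the node whose speculation fails, or -1 for none.
    bool compileBlock(const std::vector<int>& nodes, int contradictedNode)
    {
        // The invariant: every block starts with empty generation info.
        assert(m_generationInfo.empty());

        m_compileOkay = true;
        for (int node : nodes) {
            if (node == contradictedNode) {
                terminateSpeculativeExecution();
                break;
            }
            m_generationInfo.push_back(node); // value is now live
        }

        // Toy assumption: all values die by the end of a completed block.
        if (m_compileOkay)
            clearGenerationInfo();
        return m_compileOkay;
    }

    bool generationInfoIsClear() const { return m_generationInfo.empty(); }

private:
    void terminateSpeculativeExecution()
    {
        m_compileOkay = false;
        // Clearing here, rather than at the head of the next block, is what
        // makes the assertion in compileBlock() hold in all cases.
        clearGenerationInfo();
    }

    void clearGenerationInfo() { m_generationInfo.clear(); }

    std::vector<int> m_generationInfo;
    bool m_compileOkay = true;
};
```

Compiling a block that terminates speculation part-way through and then compiling a second block does not trip the assertion, because the bookkeeping left over from the first block is dropped at the point of termination.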
Comment 9 Filip Pizlo 2011-09-09 17:07:10 PDT
Created attachment 106944 [details]
the patch - fix review

Will wait for the bots to be happy before I land.
Comment 10 Geoffrey Garen 2011-09-09 17:16:37 PDT
Comment on attachment 106944 [details]
the patch - fix review

r=me
Comment 11 WebKit Review Bot 2011-09-10 14:22:52 PDT
Comment on attachment 106944 [details]
the patch - fix review

Clearing flags on attachment: 106944

Committed r94914: <http://trac.webkit.org/changeset/94914>
Comment 12 WebKit Review Bot 2011-09-10 14:22:57 PDT
All reviewed patches have been landed.  Closing bug.