Bug 69560

Summary: DFG::shouldSpeculate methods are too complicated
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal    
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
the patch
none
the patch ggaren: review+

Description Filip Pizlo 2011-10-06 13:49:29 PDT
The DFG::shouldSpeculateXYZ methods have some hold-overs from the static speculation days, as well as some hold-overs from when nodes did not remember their predictions after propagation.  They should now be simplified to just use the node's prediction, instead of also querying if it's a constant, what its generation info is, etc.
Comment 1 Filip Pizlo 2011-10-06 14:49:49 PDT
I don't know how or why but this is a performance progression:


Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"SimpleSpec" at /Volumes/Data/pizlo/quartary/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM
invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.

                                            TipOfTree               SimpleSpec                                   
SunSpider:
   3d-cube                                7.4846+-0.1938          7.3904+-0.1589         might be 1.0127x faster
   3d-morph                               7.3235+-0.1097    ?     7.3891+-0.1346       ?
   3d-raytrace                            7.6127+-0.1929    ?     7.7139+-0.3390       ? might be 1.0133x slower
   access-binary-trees                    1.7695+-0.0618          1.7312+-0.0529         might be 1.0221x faster
   access-fannkuch                        6.3561+-0.1230    ?     6.4046+-0.1169       ?
   access-nbody                           3.4766+-0.0878          3.4593+-0.0674       
   access-nsieve                          2.6102+-0.0673          2.6090+-0.0636       
   bitops-3bit-bits-in-byte               1.6826+-0.0267    ?     1.6976+-0.0402       ?
   bitops-bits-in-byte                    2.6937+-0.0781    ?     2.7330+-0.0736       ? might be 1.0146x slower
   bitops-bitwise-and                     3.3954+-0.0767          3.3739+-0.1450       
   bitops-nsieve-bits                     5.5211+-0.1235          5.4688+-0.1827       
   controlflow-recursive                  2.0641+-0.0427          2.0219+-0.0362         might be 1.0209x faster
   crypto-aes                             6.6308+-0.1933          6.4991+-0.1958         might be 1.0203x faster
   crypto-md5                             2.8396+-0.0868    ?     2.8502+-0.0976       ?
   crypto-sha1                            2.5092+-0.0877          2.4994+-0.0765       
   date-format-tofte                      9.8208+-0.2289          9.6545+-0.1639         might be 1.0172x faster
   date-format-xparb                      9.0871+-0.2624    ?     9.1752+-0.1927       ?
   math-cordic                            6.6069+-0.1143    ?     6.7638+-0.1043       ? might be 1.0238x slower
   math-partial-sums                      7.5417+-0.1169    ?     7.5588+-0.1163       ?
   math-spectral-norm                     2.8797+-0.0740          2.7982+-0.0634         might be 1.0291x faster
   regexp-dna                            10.7476+-0.1960         10.7336+-0.1182       
   string-base64                          5.5433+-0.1854          5.4857+-0.1473         might be 1.0105x faster
   string-fasta                           6.6210+-0.1331          6.5756+-0.1248       
   string-tagcloud                       11.6307+-0.1731         11.4985+-0.2257         might be 1.0115x faster
   string-unpack-code                    21.0739+-0.2537         21.0662+-0.2105       
   string-validate-input                  6.2776+-0.1459    ?     6.3075+-0.1592       ?

   <arithmetic> *                         6.2231+-0.0252          6.2100+-0.0290       
   <geometric>                            5.1154+-0.0224          5.0997+-0.0362       
   <harmonic>                             4.1997+-0.0283          4.1810+-0.0474       

                                            TipOfTree               SimpleSpec                                   
V8:
   crypto                                72.0093+-0.4413    ?    72.1674+-0.3549       ?
   deltablue                            218.1208+-1.6580        216.5963+-1.1503       
   earley-boyer                          87.0210+-0.3445    ?    87.3982+-0.2968       ?
   raytrace                              59.0426+-0.9198    ^    57.4898+-0.2626       ^ definitely 1.0270x faster
   regexp                               103.3065+-0.4788    ?   103.4281+-0.3658       ?
   richards                             208.0894+-1.3868        206.7795+-0.6596       
   splay                                 90.8671+-0.5588    ^    89.7117+-0.4158       ^ definitely 1.0129x faster

   <arithmetic>                         119.7795+-0.2368    ^   119.0816+-0.2255       ^ definitely 1.0059x faster
   <geometric> *                        106.7105+-0.2330    ^   106.0308+-0.1675       ^ definitely 1.0064x faster
   <harmonic>                            96.6827+-0.3646    ^    95.9378+-0.1593       ^ definitely 1.0078x faster

                                            TipOfTree               SimpleSpec                                   
Kraken:
   ai-astar                             499.2093+-2.2369    ^   492.1926+-2.0969       ^ definitely 1.0143x faster
   audio-beat-detection                 191.4963+-1.1932        190.6489+-1.3148       
   audio-dft                            271.2691+-2.8687        266.2812+-2.3493         might be 1.0187x faster
   audio-fft                            125.6523+-0.2270    ?   126.5136+-0.7651       ?
   audio-oscillator                     244.6052+-1.6766    ?   247.0725+-2.0585       ? might be 1.0101x slower
   imaging-darkroom                     419.3021+-1.0464    ?   425.8465+-5.9273       ? might be 1.0156x slower
   imaging-desaturate                   230.7208+-0.4309    ?   231.6531+-1.7522       ?
   imaging-gaussian-blur                583.7580+-1.5630        581.8522+-1.2108       
   json-parse-financial                  48.6361+-0.2707         48.1809+-0.2972       
   json-stringify-tinderbox              69.4407+-0.4426    ^    68.4374+-0.4581       ^ definitely 1.0147x faster
   stanford-crypto-aes                  129.0581+-1.4862        128.3939+-1.4514       
   stanford-crypto-ccm                   99.9955+-0.5684    ?   100.7834+-1.5850       ?
   stanford-crypto-pbkdf2               190.7082+-1.1493    ^   187.6297+-0.5089       ^ definitely 1.0164x faster
   stanford-crypto-sha256-iterative      75.5406+-0.3227         75.1208+-0.4026       

   <arithmetic> *                       227.0995+-0.4739        226.4719+-0.6494       
   <geometric>                          176.4004+-0.4323        175.8015+-0.5336       
   <harmonic>                           136.4659+-0.3668        135.7894+-0.4376       

                                            TipOfTree               SimpleSpec                                   
All benchmarks:
   <arithmetic>                          88.9287+-0.1346         88.6306+-0.2067       
   <geometric>                           23.0881+-0.0641         23.0034+-0.1058       
   <harmonic>                             7.3831+-0.0486          7.3499+-0.0813       

                                            TipOfTree               SimpleSpec                                   
Geomean of preferred means:
   <scaled-result>                       53.2280+-0.0818    ^    53.0284+-0.1145       ^ definitely 1.0038x faster
Comment 2 Filip Pizlo 2011-10-06 14:54:24 PDT
Created attachment 110031 [details]
the patch
Comment 3 Filip Pizlo 2011-10-06 15:11:53 PDT
Created attachment 110040 [details]
the patch

I realized that I had to make the patch bigger by adding more code.  (I forgot about JITCodeGenerator64.)
Comment 4 Geoffrey Garen 2011-10-06 15:16:54 PDT
Comment on attachment 110040 [details]
the patch

"at" might be a little too concise. I think "node" might be a better name.

shouldSpeculate vs canSpeculate is still a bit mind blowing in the bad way.

r=me
Comment 5 Filip Pizlo 2011-10-06 16:39:04 PDT
Landed in r96871.