Bug 113462

Summary: JIT and DFG should NaN-check loads from Float32 arrays
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCore Assignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED    
Severity: Normal CC: barraclough, ggaren, mark.lam, mhahnenberg, msaboff, oliver, sam
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description    Flags
the patch      mhahnenberg: review+

Description Filip Pizlo 2013-03-27 18:00:01 PDT
Patch forthcoming.

<rdar://problem/13490804>
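
The report gives no detail on the check itself. As a rough sketch only, assuming "NaN-check" means replacing any NaN read out of a Float32 array with a single canonical quiet NaN before the value is handled as a double. The helper names and the canonical bit pattern below are illustrative and are not taken from the patch:

```python
import struct

# Assumed canonical quiet NaN as a 64-bit pattern (illustrative, not from the patch).
CANONICAL_NAN_BITS = 0x7FF8000000000000


def is_float32_nan(bits32: int) -> bool:
    """True if the 32-bit pattern encodes a NaN (exponent all ones, fraction nonzero)."""
    return (bits32 & 0x7F800000) == 0x7F800000 and (bits32 & 0x007FFFFF) != 0


def widen_nan_bits(bits32: int) -> int:
    """Payload-preserving widening of a float32 NaN to float64 bits, for illustration."""
    sign = (bits32 >> 31) << 63
    payload = (bits32 & 0x007FFFFF) << 29
    return sign | (0x7FF << 52) | payload


def load_float32_as_double_bits(bits32: int) -> int:
    """Hypothetical checked load: every NaN collapses to the canonical pattern."""
    if is_float32_nan(bits32):
        return CANONICAL_NAN_BITS
    # Non-NaN float32 values convert to double exactly.
    value = struct.unpack("<f", struct.pack("<I", bits32))[0]
    return struct.unpack("<Q", struct.pack("<d", value))[0]


raw = 0xFFFFFFFF  # a NaN with the sign bit set and a full payload
print(hex(widen_nan_bits(raw)))               # unchecked widening keeps sign and payload
print(hex(load_float32_as_double_bits(raw)))  # checked load yields the canonical NaN
```

Without the check, the widened value keeps whatever sign and payload bits the array held; with it, every NaN has the same representation.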
Comment 1 Filip Pizlo 2013-03-27 18:05:26 PDT
Created attachment 195449 [details]
the patch
Comment 2 Mark Hahnenberg 2013-03-27 18:08:41 PDT
Comment on attachment 195449 [details]
the patch

r=me
Comment 3 Filip Pizlo 2013-03-27 19:44:17 PDT
This isn't enough of a slow-down for us to care.

Benchmark report for SunSpider, V8Spider, Octane, Kraken, JSBench, JSRegress, and DSP on bigmac (MacPro5,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quartary/OpenSource/WebKitBuild/Release/DumpRenderTree (r147012)
"FixFloat32" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/DumpRenderTree (r147012)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get
microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.
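
For readers unfamiliar with the "mean+-interval" notation and the "might be" / "definitely" labels in the tables, here is a minimal sketch. It assumes the interval is a Student's t interval over the 12 samples, and that "definitely" is printed when the two intervals do not overlap. Both are inferences from the numbers in this report, not a description of the harness's actual code, and the function names are hypothetical:

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 11 degrees of freedom (12 samples).
T_CRITICAL_95_DF11 = 2.201


def mean_with_ci(samples, t_critical=T_CRITICAL_95_DF11):
    """Return (mean, half-width of the confidence interval) for a list of timings."""
    mean = statistics.fmean(samples)
    half_width = t_critical * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width


def compare(baseline, candidate):
    """Label the candidate against the baseline; each argument is (mean, half_width)."""
    base_mean, base_hw = baseline
    cand_mean, cand_hw = candidate
    # The intervals overlap if each lower bound lies below the other's upper bound.
    overlap = (base_mean - base_hw <= cand_mean + cand_hw
               and cand_mean - cand_hw <= base_mean + base_hw)
    certainty = "might be" if overlap else "definitely"
    if cand_mean > base_mean:
        return f"{certainty} {cand_mean / base_mean:.4f}x slower"
    return f"{certainty} {base_mean / cand_mean:.4f}x faster"


# Rows taken from the report below (mandreel and 3d-morph).
print(compare((149.69987, 0.53947), (152.64847, 0.93646)))
print(compare((7.4437, 0.0709), (7.5217, 0.0612)))
```

Under these assumptions the two calls reproduce the labels shown for mandreel and 3d-morph. The report also appears to omit the label on individual rows when the ratio is below roughly 1.01, which this sketch does not model.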

                                                     TipOfTree                 FixFloat32                                    
SunSpider:
   3d-cube                                         8.9210+-0.3206     ?      9.0063+-0.2617        ?
   3d-morph                                        7.4437+-0.0709     ?      7.5217+-0.0612        ? might be 1.0105x slower
   3d-raytrace                                     9.8159+-0.2455     ?      9.8335+-0.2222        ?
   access-binary-trees                             2.2653+-0.3076     ?      2.3541+-0.3046        ? might be 1.0392x slower
   access-fannkuch                                 6.6303+-0.0946     ?      6.6539+-0.1166        ?
   access-nbody                                    3.9617+-0.0590            3.9596+-0.0520        
   access-nsieve                                   4.3191+-0.0481            4.2357+-0.0629          might be 1.0197x faster
   bitops-3bit-bits-in-byte                        1.5426+-0.0261            1.5321+-0.0255        
   bitops-bits-in-byte                             5.6319+-0.0463     ?      5.6772+-0.0759        ?
   bitops-bitwise-and                              2.0914+-0.0497            2.0747+-0.0741        
   bitops-nsieve-bits                              3.3702+-0.0084     ?      3.3979+-0.0290        ?
   controlflow-recursive                           2.4819+-0.0209     ?      2.4888+-0.0203        ?
   crypto-aes                                      7.2849+-0.3080     ?      7.5208+-0.2676        ? might be 1.0324x slower
   crypto-md5                                      3.6109+-0.0855     ?      3.6184+-0.0726        ?
   crypto-sha1                                     2.9073+-0.0563            2.8992+-0.0426        
   date-format-tofte                              14.0093+-0.9144     ?     14.0252+-1.0001        ?
   date-format-xparb                               9.1960+-0.5965     ?      9.2519+-0.6148        ?
   math-cordic                                     3.3265+-0.0324     ?      3.3274+-0.0222        ?
   math-partial-sums                              10.1317+-0.0807           10.1278+-0.0683        
   math-spectral-norm                              2.6751+-0.0221     ?      2.6982+-0.0195        ?
   regexp-dna                                     10.0047+-0.4666     ?     10.0390+-0.4525        ?
   string-base64                                   4.8176+-0.4960     ?      4.8417+-0.4947        ?
   string-fasta                                    9.7266+-0.2202     ?      9.7408+-0.1856        ?
   string-tagcloud                                12.2484+-0.2146           12.1461+-0.2382        
   string-unpack-code                             24.9933+-0.5367           24.9517+-0.5161        
   string-validate-input                           8.2188+-0.2119     ?      8.2470+-0.2599        ?

   <arithmetic> *                                  6.9856+-0.1549     ?      7.0066+-0.1482        ? might be 1.0030x slower
   <geometric>                                     5.5771+-0.1043     ?      5.5980+-0.0988        ? might be 1.0037x slower
   <harmonic>                                      4.4669+-0.0741     ?      4.4832+-0.0696        ? might be 1.0037x slower

                                                     TipOfTree                 FixFloat32                                    
V8Spider:
   crypto                                         76.4436+-0.5328     ?     76.4480+-0.5475        ?
   deltablue                                     106.4238+-0.5406          105.8757+-0.4164        
   earley-boyer                                   71.6931+-0.7290           71.4327+-0.7473        
   raytrace                                       54.1809+-4.0743           52.7361+-2.9790          might be 1.0274x faster
   regexp                                         84.4150+-0.3854           84.2139+-0.3209        
   richards                                      100.3756+-1.2997     ?    100.3977+-1.2587        ?
   splay                                          53.0150+-2.5955           52.5163+-2.9083        

   <arithmetic>                                   78.0782+-0.6094           77.6601+-0.7159          might be 1.0054x faster
   <geometric> *                                  75.5769+-0.8001           75.0897+-0.8883          might be 1.0065x faster
   <harmonic>                                     73.0516+-0.9978           72.4918+-1.0701          might be 1.0077x faster
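
The three summary rows under each suite are the usual arithmetic, geometric, and harmonic means of the per-benchmark times. A minimal sketch using the TipOfTree column of V8Spider above; it averages the reported per-benchmark means, so the geometric and harmonic results may differ slightly from the report if the harness averages per sample instead (an assumption, the report does not say):

```python
import math

# TipOfTree means for V8Spider, copied from the table above (milliseconds).
v8spider_tip_of_tree = {
    "crypto": 76.4436,
    "deltablue": 106.4238,
    "earley-boyer": 71.6931,
    "raytrace": 54.1809,
    "regexp": 84.4150,
    "richards": 100.3756,
    "splay": 53.0150,
}

times = list(v8spider_tip_of_tree.values())
count = len(times)

arithmetic = sum(times) / count
# Geometric mean computed via logs to avoid a large intermediate product.
geometric = math.exp(sum(math.log(t) for t in times) / count)
harmonic = count / sum(1.0 / t for t in times)

print(f"<arithmetic> {arithmetic:.4f}")
print(f"<geometric>  {geometric:.4f}")
print(f"<harmonic>   {harmonic:.4f}")
```

For positive inputs the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, matching the ordering seen in every summary block of the report.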

                                                     TipOfTree                 FixFloat32                                    
Octane and V8v7:
   encrypt                                        0.40957+-0.00100          0.40935+-0.00099       
   decrypt                                        7.41818+-0.00952    ?     7.41867+-0.00698       ?
   deltablue                             x2       0.49180+-0.00504          0.48992+-0.00485       
   earley                                         0.76386+-0.00659    ?     0.77066+-0.01352       ?
   boyer                                         10.80313+-0.01884    ?    10.81146+-0.03513       ?
   raytrace                              x2       3.95038+-0.04845          3.93395+-0.07377       
   regexp                                x2      26.29313+-0.06119         26.25907+-0.06921       
   richards                              x2       0.25832+-0.00245    ?     0.25909+-0.00361       ?
   splay                                 x2       0.55109+-0.01165          0.54750+-0.00825       
   navier-stokes                         x2       9.14513+-0.09462          9.05378+-0.00704         might be 1.0101x faster
   closure                                        0.25930+-0.03320          0.25905+-0.03332       
   jquery                                         3.75395+-0.45104          3.73058+-0.44088       
   gbemu                                 x2     119.36194+-5.41325    ?   127.10538+-7.67356       ? might be 1.0649x slower
   mandreel                              x2     149.69987+-0.53947    !   152.64847+-0.93646       ! definitely 1.0197x slower
   pdfjs                                 x2      93.69051+-0.28408    ?    94.02524+-0.23860       ?
   box2d                                 x2      31.04716+-0.14735         30.91979+-0.18786       

V8v7:
   <arithmetic>                                   6.29840+-0.01509          6.28105+-0.01552         might be 1.0028x faster
   <geometric> *                                  2.06658+-0.00749          2.06183+-0.00754         might be 1.0023x faster
   <harmonic>                                     0.79120+-0.00442          0.79076+-0.00466         might be 1.0005x faster

Octane including V8v7:
   <arithmetic>                                  34.32256+-0.43934    ?    35.14939+-0.58520       ? might be 1.0241x slower
   <geometric> *                                  6.11592+-0.05184    ?     6.14345+-0.05740       ? might be 1.0045x slower
   <harmonic>                                     1.05731+-0.02084          1.05656+-0.02046         might be 1.0007x faster

                                                     TipOfTree                 FixFloat32                                    
Kraken:
   ai-astar                                       438.593+-2.494      ?     438.806+-2.807         ?
   audio-beat-detection                           210.448+-2.306            207.945+-0.693           might be 1.0120x faster
   audio-dft                                      261.910+-1.976            259.024+-1.434           might be 1.0111x faster
   audio-fft                                      122.083+-0.285            121.793+-0.184         
   audio-oscillator                               212.260+-0.487            211.789+-0.287         
   imaging-darkroom                               244.255+-0.990            243.551+-0.956         
   imaging-desaturate                             133.716+-0.121      ?     133.737+-0.159         ?
   imaging-gaussian-blur                          415.293+-0.349      ?     416.187+-0.942         ?
   json-parse-financial                            67.819+-0.139      ?      68.405+-0.881         ?
   json-stringify-tinderbox                        83.960+-0.329             83.731+-0.187         
   stanford-crypto-aes                            101.275+-0.741            101.170+-0.488         
   stanford-crypto-ccm                             97.559+-0.472      ?      97.702+-0.429         ?
   stanford-crypto-pbkdf2                         231.513+-1.800            229.923+-1.515         
   stanford-crypto-sha256-iterative               104.596+-0.317            104.596+-0.442         

   <arithmetic> *                                 194.663+-0.459            194.169+-0.213           might be 1.0025x faster
   <geometric>                                    165.505+-0.292            165.171+-0.184           might be 1.0020x faster
   <harmonic>                                     142.280+-0.163            142.182+-0.284           might be 1.0007x faster

                                                     TipOfTree                 FixFloat32                                    
JSBench:
   amazon                                          7.1667+-0.2473     ?      7.3333+-0.3128        ? might be 1.0233x slower
   facebook                                       33.9167+-1.6359           33.7500+-1.7791        
   google                                         67.5000+-1.7027           67.4167+-1.7233        
   twitter                                         8.5833+-0.3272     ?      8.7500+-0.3949        ? might be 1.0194x slower
   yahoo                                           2.9167+-0.4248     ?      3.1667+-0.3668        ? might be 1.0857x slower

   <arithmetic> *                                 24.0167+-0.7136     ?     24.0833+-0.7319        ? might be 1.0028x slower
   <geometric>                                    13.2026+-0.4882     ?     13.5325+-0.4190        ? might be 1.0250x slower
   <harmonic>                                      7.6571+-0.5848     ?      8.0983+-0.4704        ? might be 1.0576x slower

                                                     TipOfTree                 FixFloat32                                    
JSRegress:
   adapt-to-double-divide                         18.5733+-0.0751           18.5706+-0.0472        
   aliased-arguments-getbyval                      0.8060+-0.0132            0.7997+-0.0061        
   allocate-big-object                             3.5236+-1.1873     ?      3.5291+-1.2089        ?
   arity-mismatch-inlining                         0.6884+-0.0175            0.6701+-0.0054          might be 1.0274x faster
   array-access-polymorphic-structure              7.3217+-1.6776            6.8005+-1.4175          might be 1.0766x faster
   array-with-double-add                           4.7875+-0.0374            4.7685+-0.0161        
   array-with-double-increment                     3.2750+-0.0260     ?      3.2831+-0.0204        ?
   array-with-double-mul-add                       6.6465+-0.1164            6.5011+-0.0618          might be 1.0224x faster
   array-with-double-sum                           6.4356+-0.0303     ?      6.4379+-0.0239        ?
   array-with-int32-add-sub                        8.6604+-0.0328            8.6466+-0.0185        
   array-with-int32-or-double-sum                  6.5002+-0.0324     ?      6.5329+-0.0485        ?
   big-int-mul                                     4.0285+-0.0117     ?      4.0480+-0.0381        ?
   boolean-test                                    3.6082+-0.1003            3.5208+-0.0157          might be 1.0248x faster
   cast-int-to-double                             11.4141+-0.0831           11.3766+-0.0885        
   cell-argument                                  11.8764+-0.0128     ?     11.8774+-0.0146        ?
   cfg-simplify                                    3.2203+-0.0679            3.1921+-0.0433        
   cmpeq-obj-to-obj-other                          9.3500+-0.0974     ?      9.4729+-0.0909        ? might be 1.0131x slower
   constant-test                                   7.0564+-0.0859            6.9609+-0.0729          might be 1.0137x faster
   direct-arguments-getbyval                       0.7419+-0.0136     ?      0.7423+-0.0069        ?
   double-pollution-getbyval                       8.8398+-0.0497            8.8240+-0.0240        
   double-pollution-putbyoffset                    4.7581+-0.5607     ?      4.7868+-0.5496        ?
   empty-string-plus-int                          11.5215+-0.4654           11.4624+-0.4745        
   external-arguments-getbyval                     2.1474+-0.1731     ?      2.2690+-0.1965        ? might be 1.0566x slower
   external-arguments-putbyval                     3.3007+-0.2871     ?      3.3323+-0.3004        ?
   Float32Array-matrix-mult                       12.8316+-0.6926     ?     13.0303+-0.7520        ? might be 1.0155x slower
   fold-double-to-int                             18.2735+-0.1629           18.2722+-0.1860        
   function-dot-apply                              2.6172+-0.0154     ?      2.6400+-0.0548        ?
   function-test                                   4.0872+-0.0483            4.0378+-0.0406          might be 1.0122x faster
   get-by-id-chain-from-try-block                  6.1536+-0.0766            6.1259+-0.0334        
   HashMap-put-get-iterate-keys                   72.3307+-0.8827     ?     72.7020+-1.0130        ?
   HashMap-put-get-iterate                        73.9127+-0.7201     ?     73.9872+-0.8708        ?
   HashMap-string-put-get-iterate                 67.1371+-1.2249           66.8672+-1.0170        
   indexed-properties-in-objects                   3.7103+-0.0435            3.6689+-0.0139          might be 1.0113x faster
   inline-arguments-access                         1.0779+-0.0128     ?      1.0901+-0.0244        ? might be 1.0113x slower
   inline-arguments-local-escape                  21.4651+-0.1051     ?     21.5256+-0.1670        ?
   inline-get-scoped-var                           5.3498+-0.0385            5.3218+-0.0143        
   inlined-put-by-id-transition                   13.9193+-0.1986           13.8737+-0.2217        
   int-or-other-abs-then-get-by-val                7.2827+-0.0311            7.2807+-0.0267        
   int-or-other-abs-zero-then-get-by-val          30.3430+-0.1592           30.3139+-0.1402        
   int-or-other-add-then-get-by-val                8.4768+-0.0513            8.4367+-0.0168        
   int-or-other-add                                8.7178+-0.0409     ?      8.7207+-0.0615        ?
   int-or-other-div-then-get-by-val                6.5903+-0.0262     ?      6.6646+-0.0707        ? might be 1.0113x slower
   int-or-other-max-then-get-by-val                8.1960+-0.2179     ?      8.2950+-0.1983        ? might be 1.0121x slower
   int-or-other-min-then-get-by-val                6.7563+-0.0195     ?      6.7566+-0.0219        ?
   int-or-other-mod-then-get-by-val                6.5729+-0.0258     ?      6.5864+-0.0596        ?
   int-or-other-mul-then-get-by-val                5.8892+-0.0394     ?      5.9025+-0.0306        ?
   int-or-other-neg-then-get-by-val                6.5939+-0.0412            6.5658+-0.0338        
   int-or-other-neg-zero-then-get-by-val          30.2630+-0.0678           30.2171+-0.0773        
   int-or-other-sub-then-get-by-val                8.4588+-0.0528     ?      8.4760+-0.0718        ?
   int-or-other-sub                                6.7448+-0.0164     ?      6.7734+-0.0405        ?
   int-overflow-local                             10.6757+-0.0425           10.6362+-0.0555        
   Int16Array-bubble-sort                         67.5525+-2.8459     ?     74.8295+-21.5064       ? might be 1.1077x slower
   Int16Array-load-int-mul                         1.5779+-0.0232            1.5706+-0.0108        
   Int8Array-load                                  4.6478+-0.1401            4.5711+-0.0868          might be 1.0168x faster
   integer-divide                                 12.8176+-0.2071           12.6705+-0.0309          might be 1.0116x faster
   integer-modulo                                  1.8640+-0.0187            1.8572+-0.0589        
   make-indexed-storage                            3.8738+-0.5523     ?      3.8846+-0.5789        ?
   method-on-number                               19.4652+-0.3545     ?     19.6622+-0.4129        ? might be 1.0101x slower
   nested-function-parsing-random                323.4666+-11.9693         322.0079+-10.6137       
   nested-function-parsing                        48.0927+-3.0055     ?     48.5746+-3.1521        ? might be 1.0100x slower
   new-array-buffer-dead                           3.1104+-0.1121            3.0707+-0.1108          might be 1.0129x faster
   new-array-buffer-push                          12.7455+-2.0634     ?     12.8664+-2.0594        ?
   new-array-dead                                 23.4288+-0.0723     ?     23.4614+-0.0564        ?
   new-array-push                                 10.2926+-1.6250           10.2286+-1.6129        
   number-test                                     3.4521+-0.0380            3.4439+-0.0269        
   object-closure-call                             7.1848+-0.2035     ?      7.1992+-0.1862        ?
   object-test                                     3.9351+-0.0338     ?      3.9507+-0.0311        ?
   poly-stricteq                                  76.4296+-0.8444           76.2550+-0.8107        
   polymorphic-structure                          16.6998+-0.1822           16.5929+-0.0388        
   polyvariant-monomorphic-get-by-id              10.3368+-0.0431     ?     10.3518+-0.0487        ?
   rare-osr-exit-on-local                         17.0050+-0.0481     ?     17.0210+-0.0404        ?
   register-pressure-from-osr                     26.0911+-0.0405     ?     26.3013+-0.2249        ?
   simple-activation-demo                         28.7682+-0.2249           28.7620+-0.2242        
   slow-array-profile-convergence                  3.9795+-0.2066            3.9088+-0.2280          might be 1.0181x faster
   slow-convergence                                3.1895+-0.0688            3.1772+-0.0240        
   sparse-conditional                              1.1164+-0.0111     ?      1.1217+-0.0109        ?
   splice-to-remove                               41.3113+-0.1283     !     41.7906+-0.3009        ! definitely 1.0116x slower
   string-concat-object                            3.8786+-1.1705     ?      4.2467+-1.2774        ? might be 1.0949x slower
   string-concat-pair-object                       4.2080+-1.2601            4.1743+-1.2758        
   string-concat-pair-simple                      17.1067+-0.4984     ?     17.3749+-0.5046        ? might be 1.0157x slower
   string-concat-simple                           16.9203+-0.4657     ?     17.0031+-0.4794        ?
   string-cons-repeat                             12.3395+-0.7783     ?     12.4141+-0.8664        ?
   string-cons-tower                              30.8612+-18.1041          30.8386+-18.0645       
   string-hash                                     2.1724+-0.0114     ?      2.2091+-0.0319        ? might be 1.0169x slower
   string-repeat-arith                            37.2928+-0.3484           37.0894+-0.3120        
   string-sub                                     73.2475+-0.9414     ?     73.3145+-0.5919        ?
   string-test                                     3.4426+-0.0295     ?      3.4439+-0.0256        ?
   structure-hoist-over-transitions                3.4060+-0.5736            3.3916+-0.5480        
   tear-off-arguments-simple                       1.5005+-0.0098            1.4968+-0.0084        
   tear-off-arguments                              2.7613+-0.0147     ?      2.7640+-0.0209        ?
   temporal-structure                             17.3238+-0.0547           17.3025+-0.0360        
   to-int32-boolean                               25.3500+-0.0169     ?     25.4299+-0.1043        ?
   undefined-test                                  3.6424+-0.0261     ?      3.6611+-0.0318        ?

   <arithmetic>                                   17.5419+-0.3934     ?     17.6178+-0.5476        ? might be 1.0043x slower
   <geometric> *                                   8.1383+-0.1617     ?      8.1457+-0.1917        ? might be 1.0009x slower
   <harmonic>                                      4.5405+-0.0646            4.5371+-0.0763          might be 1.0007x faster

                                                     TipOfTree                 FixFloat32                                    
DSP:
   filtrr-posterize-tint                          44.8839+-0.9748           44.8740+-0.9559        
   filtrr-tint-contrast-sat-bright                64.6584+-2.3953           63.2487+-1.6434          might be 1.0223x faster
   filtrr-tint-sat-adj-contr-mult                 74.7810+-1.9508           74.4355+-2.0009        
   filtrr-blur-overlay-sat-contr                 193.9332+-5.8263          185.8268+-5.1224          might be 1.0436x faster
   filtrr-sat-blur-mult-sharpen-contr            233.8335+-4.6154          233.4521+-4.9055        
   filtrr-sepia-bias                              32.3349+-1.6908     ?     32.5578+-1.7999        ?
   route9-vp8                            x5     1038.2862+-25.6108    ?   1050.8866+-16.9175       ? might be 1.0121x slower
   starfield                             x5     1176.6958+-6.0084         1165.5170+-6.7290        
   bellard-jslinux                       x5     2764.9167+-10.5502    ?   2769.8333+-13.7874       ?
   zynaps-quake3                         x5     1166.8512+-30.1726        1155.6982+-33.0065       
   zynaps-mandelbrot                     x5     1001.4150+-5.9211         1000.8416+-5.4550        

   <arithmetic>                                 1173.7177+-7.1366         1172.5251+-6.3965          might be 1.0010x faster
   <geometric> *                                 769.8201+-4.5473          767.5457+-5.0842          might be 1.0030x faster
   <harmonic>                                    277.2825+-6.7005          276.1702+-6.6491          might be 1.0040x faster

                                                     TipOfTree                 FixFloat32                                    
All benchmarks:
   <arithmetic>                                  210.3097+-1.1193          210.2237+-1.0757          might be 1.0004x faster
   <geometric>                                    20.2304+-0.2329     ?     20.2558+-0.2689        ? might be 1.0013x slower
   <harmonic>                                      3.8932+-0.0389     ?      3.8949+-0.0359        ? might be 1.0004x slower

                                                     TipOfTree                 FixFloat32                                    
Geomean of preferred means:
   <scaled-result>                                36.9715+-0.3314           36.9668+-0.3676          might be 1.0001x faster
Comment 4 Filip Pizlo 2013-03-27 19:45:24 PDT
Landed in http://trac.webkit.org/changeset/147047