Bug 155096

Summary: [JSC] Improve how DFG zero Floating Point registers
Product: WebKit Reporter: Benjamin Poulain <benjamin>
Component: New Bugs Assignee: Benjamin Poulain <benjamin>
Status: RESOLVED FIXED    
Severity: Normal CC: annulen, commit-queue, fpizlo, keith_miller, mark.lam, msaboff, ossy, saam
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 155128    
Bug Blocks:    
Attachments:
Description Flags
Patch none
Patch none
Patch none

Description Benjamin Poulain 2016-03-06 15:02:33 PST
[JSC] Improve how DFG zero Floating Point registers
Comment 1 Benjamin Poulain 2016-03-06 15:03:59 PST
Created attachment 273148 [details]
Patch
Comment 2 Benjamin Poulain 2016-03-06 17:50:56 PST
Created attachment 273155 [details]
Patch
Comment 3 Benjamin Poulain 2016-03-06 18:18:35 PST
Created attachment 273157 [details]
Patch
Comment 4 Benjamin Poulain 2016-03-06 19:50:05 PST
Haswell:

                                                  Conf#1                    Conf#2                                      
SunSpider:
   3d-cube                                    4.8003+-0.0662     ?      5.1148+-0.3986        ? might be 1.0655x slower
   3d-morph                                   5.3795+-0.1627     ?      5.4474+-0.1276        ? might be 1.0126x slower
   3d-raytrace                                5.7165+-0.0494            5.7046+-0.1161        
   access-binary-trees                        2.2263+-0.0478     ?      2.2829+-0.0773        ? might be 1.0254x slower
   access-fannkuch                            6.1489+-0.1663     ?      6.2672+-0.2304        ? might be 1.0193x slower
   access-nbody                               2.6852+-0.0494     ?      2.8182+-0.2642        ? might be 1.0495x slower
   access-nsieve                              3.0815+-0.0878     ?      3.1353+-0.1370        ? might be 1.0175x slower
   bitops-3bit-bits-in-byte                   1.2178+-0.0370            1.1804+-0.0171          might be 1.0316x faster
   bitops-bits-in-byte                        3.3653+-0.0795     ?      3.3761+-0.0735        ?
   bitops-bitwise-and                         2.0963+-0.0427            2.0389+-0.0246          might be 1.0281x faster
   bitops-nsieve-bits                         3.0371+-0.0385     ?      3.0596+-0.0482        ?
   controlflow-recursive                      2.3646+-0.0115     ?      2.4289+-0.1070        ? might be 1.0272x slower
   crypto-aes                                 4.1038+-0.0506            4.1019+-0.0728        
   crypto-md5                                 2.6436+-0.0490     ?      2.6775+-0.1084        ? might be 1.0128x slower
   crypto-sha1                                2.3461+-0.0470     ?      2.3538+-0.0296        ?
   date-format-tofte                          6.8753+-0.1775     ?      6.9790+-0.1501        ? might be 1.0151x slower
   date-format-xparb                          5.1832+-0.1537            4.9995+-0.1369          might be 1.0367x faster
   math-cordic                                2.9384+-0.0571            2.8653+-0.0485          might be 1.0255x faster
   math-partial-sums                          5.0867+-0.2597            4.9308+-0.1106          might be 1.0316x faster
   math-spectral-norm                         2.0729+-0.0468            2.0329+-0.0318          might be 1.0197x faster
   regexp-dna                                 6.1860+-0.1495            6.1163+-0.1232          might be 1.0114x faster
   string-base64                              4.5831+-0.1906            4.4120+-0.0379          might be 1.0388x faster
   string-fasta                               6.0281+-0.1170            5.9026+-0.0574          might be 1.0213x faster
   string-tagcloud                            8.0834+-0.0289     !      8.2203+-0.1070        ! definitely 1.0169x slower
   string-unpack-code                        19.1920+-0.6869     ?     19.3735+-0.7269        ?
   string-validate-input                      4.3247+-0.1022            4.2804+-0.0574          might be 1.0103x faster

   <arithmetic>                               4.6833+-0.0423     ?      4.6962+-0.0385        ? might be 1.0027x slower

                                                  Conf#1                    Conf#2                                      
Octane:
   encrypt                                   0.16444+-0.00329          0.16126+-0.00327         might be 1.0198x faster
   decrypt                                   2.87520+-0.04421          2.85984+-0.03903       
   deltablue                        x2       0.14123+-0.00478    ?     0.14499+-0.00953       ? might be 1.0266x slower
   earley                                    0.28890+-0.00441          0.28594+-0.00168         might be 1.0103x faster
   boyer                                     5.01821+-0.13913          4.95518+-0.10778         might be 1.0127x faster
   navier-stokes                    x2       4.96636+-0.02585    ?     4.97091+-0.07059       ?
   raytrace                         x2       0.91223+-0.01860          0.89659+-0.00995         might be 1.0174x faster
   richards                         x2       0.08255+-0.00167    ?     0.08287+-0.00162       ?
   splay                            x2       0.35288+-0.00482          0.35120+-0.00347       
   regexp                           x2      20.04105+-0.48806    ?    20.60547+-0.21352       ? might be 1.0282x slower
   pdfjs                            x2      39.34641+-0.71709         39.09899+-0.55979       
   mandreel                         x2      43.48000+-0.80517    ?    43.64120+-1.10664       ?
   gbemu                            x2      24.97394+-0.33305    ?    25.21407+-0.52919       ?
   closure                                   0.57005+-0.00261    ?     0.58472+-0.01394       ? might be 1.0257x slower
   jquery                                    7.69550+-0.18560          7.59527+-0.12343         might be 1.0132x faster
   box2d                            x2       9.61655+-0.23430          9.43202+-0.14491         might be 1.0196x faster
   zlib                             x2     384.02105+-10.25770   ?   385.91153+-6.11887       ?
   typescript                       x2     668.99503+-7.72064        662.06196+-6.21916         might be 1.0105x faster

   <geometric>                               5.27345+-0.02490          5.27318+-0.02285         might be 1.0001x faster

                                                  Conf#1                    Conf#2                                      
Kraken:
   ai-astar                                   98.116+-2.810             97.417+-2.953         
   audio-beat-detection                       47.081+-0.598      ?      47.511+-0.229         ?
   audio-dft                                  98.034+-1.363      ?      98.619+-1.711         ?
   audio-fft                                  35.933+-0.425             35.731+-0.056         
   audio-oscillator                           49.728+-1.085             49.556+-1.127         
   imaging-darkroom                           62.209+-1.718             60.683+-0.750           might be 1.0251x faster
   imaging-desaturate                         46.016+-1.614             45.634+-1.152         
   imaging-gaussian-blur                      68.638+-1.509             65.777+-3.316           might be 1.0435x faster
   json-parse-financial                       37.640+-1.249             36.958+-0.737           might be 1.0185x faster
   json-stringify-tinderbox                   24.246+-0.601      ?      24.555+-0.815         ? might be 1.0128x slower
   stanford-crypto-aes                        39.997+-0.500      ?      40.044+-0.430         ?
   stanford-crypto-ccm                        37.063+-1.209             36.608+-0.643           might be 1.0124x faster
   stanford-crypto-pbkdf2                    103.551+-1.999            102.533+-1.503         
   stanford-crypto-sha256-iterative           39.235+-0.509      ?      39.895+-0.885         ? might be 1.0168x slower

   <arithmetic>                               56.249+-0.362             55.823+-0.337           might be 1.0076x faster

                                                  Conf#1                    Conf#2                                      
AsmBench:
   bigfib.cpp                               442.2368+-5.9685     ?    443.4775+-8.8077        ?
   cray.c                                   368.2074+-4.8122     ?    370.4352+-6.7999        ?
   dry.c                                    450.4179+-33.2786         438.0829+-20.7072         might be 1.0282x faster
   FloatMM.c                                722.5146+-12.2548    ?    722.6382+-11.6831       ?
   gcc-loops.cpp                           3698.1718+-42.1452    ?   3709.5201+-44.4462       ?
   n-body.c                                 822.4819+-14.0351    ?    830.5771+-16.6330       ?
   Quicksort.c                              402.4949+-8.2802     ?    403.0941+-9.3269        ?
   stepanov_container.cpp                  3332.6366+-62.7019    ?   3339.1666+-59.4226       ?
   Towers.c                                 273.3234+-5.8376          267.9595+-0.2123          might be 1.0200x faster

   <geometric>                              729.5824+-7.0101          728.0901+-6.4356          might be 1.0020x faster

                                                  Conf#1                    Conf#2                                      
Geomean of preferred means:
   <scaled-result>                           31.7270+-0.0908           31.6724+-0.1127          might be 1.0017x faster
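
For readers unfamiliar with the verdict column, here is a rough sketch of how the "might be" and "definitely" annotations appear to relate to the two columns. It is a reading of the numbers above, not the benchmark tool's actual logic: the `Measurement` type and `compare` helper are hypothetical names, the ratio is taken as slower mean over faster mean, and "definitely" is assumed to mean that the two +- intervals do not overlap. The sketch always prints a verdict, whereas the table leaves the column empty for very small differences.

```typescript
// Hypothetical helper, not part of the benchmark tooling: it re-derives the
// verdict column from two "mean +- interval" entries of the table.
interface Measurement {
  mean: number;     // reported time (lower is better)
  interval: number; // the value printed after "+-"
}

function compare(conf1: Measurement, conf2: Measurement): string {
  // Conf#2 is faster when its mean time is smaller.
  const faster = conf2.mean < conf1.mean;
  // Ratio of the slower mean to the faster mean, so it is always >= 1.
  const ratio = faster ? conf1.mean / conf2.mean : conf2.mean / conf1.mean;
  // Assumption: "definitely" is used only when the intervals are disjoint.
  const disjoint = faster
    ? conf2.mean + conf2.interval < conf1.mean - conf1.interval
    : conf1.mean + conf1.interval < conf2.mean - conf2.interval;
  const certainty = disjoint ? "definitely" : "might be";
  return `${certainty} ${ratio.toFixed(4)}x ${faster ? "faster" : "slower"}`;
}

// 3d-cube row: overlapping intervals.
console.log(compare({ mean: 4.8003, interval: 0.0662 }, { mean: 5.1148, interval: 0.3986 }));
// prints "might be 1.0655x slower"

// string-tagcloud row: the intervals just fail to overlap.
console.log(compare({ mean: 8.0834, interval: 0.0289 }, { mean: 8.2203, interval: 0.1070 }));
// prints "definitely 1.0169x slower"
```

Because the published means are rounded, the last digit of a recomputed ratio may differ from the table for some rows.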
Comment 5 Geoffrey Garen 2016-03-07 09:32:27 PST
Comment on attachment 273157 [details]
Patch

r=me
Comment 6 WebKit Commit Bot 2016-03-07 10:25:45 PST
Comment on attachment 273157 [details]
Patch

Clearing flags on attachment: 273157

Committed r197687: <http://trac.webkit.org/changeset/197687>
Comment 7 WebKit Commit Bot 2016-03-07 10:25:49 PST
All reviewed patches have been landed.  Closing bug.
Comment 8 Konstantin Tokarev 2016-03-12 11:44:18 PST
Do you know any particular test case that should fail if moveZeroToDouble does not work properly?
Comment 9 Benjamin Poulain 2016-03-12 12:29:26 PST
(In reply to comment #8)
> Do you know any particular test case that should fail if moveZeroToDouble
> does not work properly?

The "math-XXX" stress tests of JSC should cover that.
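
As an illustration of what such a test checks, here is a minimal sketch in TypeScript (with the type annotations removed it is plain JavaScript). It is not taken from the JSC test suite: `isPositiveZero`, `accumulate`, and `runZeroCheck` are hypothetical names, and it is only an assumption that a double accumulator starting at zero reaches the code path that uses moveZeroToDouble. The idea is to run the function often enough that an optimizing tier has a chance to compile it, and to verify that the zero it starts from really is +0.0.

```typescript
// Hypothetical check, not taken from the JSC test suite.
function isPositiveZero(value: number): boolean {
  // 1 / +0 is Infinity and 1 / -0 is -Infinity, so this rejects -0 as well
  // as any nonzero value left over in a register that was not cleared.
  return value === 0 && 1 / value === Infinity;
}

function accumulate(values: number[]): number {
  // Assumption: an accumulator that starts at zero and receives fractional
  // values is the kind of code where a JIT may materialize 0.0 in an FPR.
  let total = 0.0;
  for (const v of values) {
    total += v;
  }
  return total;
}

function runZeroCheck(iterations: number): void {
  for (let i = 0; i < iterations; ++i) {
    // All three addends are exact binary fractions, so the sum is exact.
    const fractional = accumulate([0.5, 0.25, 0.125]);
    if (fractional !== 0.875) {
      throw new Error("Bad sum at iteration " + i + ": " + fractional);
    }
    // With no elements, the result is whatever the zero was initialized to.
    const empty = accumulate([]);
    if (!isPositiveZero(empty)) {
      throw new Error("Expected +0 at iteration " + i + ", got " + empty);
    }
  }
}

// The iteration count is arbitrary; it only needs to be large enough to make
// the functions hot.
runZeroCheck(100000);
```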