Bug 154131 - [JSC] SqrtFloat and CeilFloat also suffer from partial register stalls
Summary: [JSC] SqrtFloat and CeilFloat also suffer from partial register stalls
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Benjamin Poulain
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-11 14:09 PST by Benjamin Poulain
Modified: 2016-02-11 15:01 PST

See Also:


Attachments
Patch (1.43 KB, patch)
2016-02-11 14:10 PST, Benjamin Poulain

Description Benjamin Poulain 2016-02-11 14:09:33 PST
[JSC] SqrtFloat and CeilFloat also suffer from partial register stalls
Comment 1 Benjamin Poulain 2016-02-11 14:10:02 PST
Created attachment 271084
Patch
Comment 2 Benjamin Poulain 2016-02-11 14:43:05 PST
Looks like it actually helped AsmBench:

                                                  Conf#1                    Conf#2                                      
Octane:
   encrypt                                   0.26580+-0.00535          0.26261+-0.00181         might be 1.0121x faster
   decrypt                                   4.88253+-0.03087          4.87441+-0.03265       
   deltablue                        x2       0.22707+-0.02065    ?     0.23526+-0.01640       ? might be 1.0360x slower
   earley                                    0.50553+-0.00370          0.50526+-0.00190       
   boyer                                     8.29859+-0.03093          8.29258+-0.14059       
   navier-stokes                    x2       6.32582+-0.01449    ?     6.32637+-0.01641       ?
   raytrace                         x2       1.51112+-0.02407          1.51109+-0.01574       
   richards                         x2       0.13826+-0.00187    ?     0.14065+-0.00125       ? might be 1.0173x slower
   splay                            x2       0.53466+-0.01274          0.53390+-0.00423       
   regexp                           x2      38.63601+-0.54019         38.61920+-0.35350       
   pdfjs                            x2      60.58862+-0.72967    ?    60.71452+-0.21419       ?
   mandreel                         x2      70.06093+-0.31377    ?    70.42351+-1.25984       ?
   gbemu                            x2      45.88083+-0.18418    ?    46.13054+-0.10402       ?
   closure                                   0.93782+-0.00608          0.93723+-0.00767       
   jquery                                   11.81939+-0.05536         11.77315+-0.10492       
   box2d                            x2      16.68482+-0.14566         16.68110+-0.14917       
   zlib                             x2     581.25326+-2.14803        578.19639+-3.10252       
   typescript                       x2    1116.63348+-8.15026    ?  1127.10419+-16.04267      ?

   <geometric>                               8.58508+-0.05871    ?     8.61839+-0.05741       ? might be 1.0039x slower

                                                  Conf#1                    Conf#2                                      
Kraken:
   ai-astar                                  149.728+-1.075      ^     147.221+-0.765         ^ definitely 1.0170x faster
   audio-beat-detection                       74.380+-0.512      ?      74.519+-0.516         ?
   audio-dft                                 128.216+-0.377            127.508+-1.973         
   audio-fft                                  60.330+-0.390      ?      60.382+-1.155         ?
   audio-oscillator                           85.936+-0.186      ?      86.273+-1.178         ?
   imaging-darkroom                           96.946+-0.359      ?      96.993+-0.440         ?
   imaging-desaturate                         84.382+-0.666             84.331+-0.158         
   imaging-gaussian-blur                     125.591+-0.756            125.500+-0.345         
   json-parse-financial                       67.640+-0.446             67.623+-0.546         
   json-stringify-tinderbox                   40.490+-0.455      ?      40.679+-0.380         ?
   stanford-crypto-aes                        62.713+-1.357      ?      63.494+-0.359         ? might be 1.0125x slower
   stanford-crypto-ccm                        59.512+-2.354      ?      61.230+-5.146         ? might be 1.0289x slower
   stanford-crypto-pbkdf2                    152.271+-0.684      ?     152.522+-2.581         ?
   stanford-crypto-sha256-iterative           59.230+-0.361      ?      59.251+-0.775         ?

   <arithmetic>                               89.098+-0.145      ?      89.109+-0.502         ? might be 1.0001x slower

                                                  Conf#1                    Conf#2                                      
AsmBench:
   bigfib.cpp                               655.1461+-20.9275    ?    665.8336+-4.7698        ? might be 1.0163x slower
   cray.c                                   608.0889+-9.1078          606.2350+-6.4679        
   dry.c                                    812.7330+-462.8484        745.3911+-318.6448        might be 1.0903x faster
   FloatMM.c                                925.2051+-64.5055         924.6276+-65.0364       
   gcc-loops.cpp                           6285.7328+-27.1328        6280.3089+-24.3046       
   n-body.c                                1615.9001+-30.6388        1604.6730+-13.7856       
   Quicksort.c                              599.0604+-4.2295          598.6930+-1.7205        
   stepanov_container.cpp                  4484.4255+-74.5501        4472.9373+-49.4676       
   Towers.c                                 384.6421+-0.7442     ?    384.9072+-0.6414        ?

   <geometric>                             1130.8891+-54.3618        1122.3520+-45.4478         might be 1.0076x faster

                                                  Conf#1                    Conf#2                                      
Geomean of preferred means:
   <scaled-result>                           95.2746+-1.5452           95.1637+-1.4671          might be 1.0012x faster
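
For readers checking the numbers above, here is a minimal sketch of how the "1.0170x faster" style figures can be reproduced from the two mean+-error columns. The function name `compare` is hypothetical. The rule that non-overlapping error intervals correspond to "definitely" (and overlapping ones to "might be") is an assumption about the benchmark script, although it is consistent with the rows shown. Because the table prints rounded means, recomputed ratios can differ from the printed ones in the last digit.

```python
def compare(mean1, err1, mean2, err2):
    """Compare two timings given as mean +- error (lower is better).

    Returns (ratio, direction, overlap), where ratio >= 1 and direction
    describes Conf#2 relative to Conf#1.
    """
    faster = mean2 < mean1
    ratio = mean1 / mean2 if faster else mean2 / mean1
    # Assumption: the intervals [mean - err, mean + err] are what the
    # report compares; overlapping intervals mean the difference is
    # not conclusive.
    overlap = (mean1 - err1) <= (mean2 + err2) and (mean2 - err2) <= (mean1 + err1)
    return ratio, ("faster" if faster else "slower"), overlap


if __name__ == "__main__":
    # Kraken ai-astar row: intervals do not overlap.
    print(compare(149.728, 1.075, 147.221, 0.765))
    # Octane encrypt row: intervals overlap.
    print(compare(0.26580, 0.00535, 0.26261, 0.00181))
```

Under this reading, only ai-astar shows a difference outside the error bars; every other row, including the AsmBench geometric mean, has overlapping intervals.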
Comment 3 WebKit Commit Bot 2016-02-11 15:01:40 PST
Comment on attachment 271084
Patch

Clearing flags on attachment: 271084

Committed r196444: <http://trac.webkit.org/changeset/196444>
Comment 4 WebKit Commit Bot 2016-02-11 15:01:43 PST
All reviewed patches have been landed.  Closing bug.