Bug 155055

Summary: [JSC] Improve codegen of Compare and Test
Product: WebKit
Component: New Bugs
Status: RESOLVED FIXED
Severity: Normal
Priority: P2
Version: WebKit Nightly Build
Hardware: Unspecified
OS: Unspecified
Reporter: Benjamin Poulain <benjamin>
Assignee: Benjamin Poulain <benjamin>
CC: commit-queue, fpizlo, keith_miller, mark.lam, msaboff, saam
Attachments:
   Description    Flags
   Patch          none
   Patch          none
   Patch          none

Description Benjamin Poulain 2016-03-04 17:08:46 PST
[JSC] Improve codegen of Compare and Test
Comment 1 Benjamin Poulain 2016-03-05 00:20:40 PST
Created attachment 273070 [details]
Patch
Comment 2 WebKit Commit Bot 2016-03-05 00:23:19 PST
Attachment 273070 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/testb3.cpp:12245:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12246:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12247:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12248:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12249:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12250:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12251:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12252:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12253:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12254:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12255:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 11 in 12 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Benjamin Poulain 2016-03-05 00:53:08 PST
Created attachment 273071 [details]
Patch
Comment 4 WebKit Commit Bot 2016-03-05 00:54:34 PST
Attachment 273071 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/testb3.cpp:12245:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12246:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12247:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12248:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12249:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12250:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12251:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12252:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12253:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12254:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12255:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 11 in 14 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Benjamin Poulain 2016-03-05 01:02:40 PST
x86:


                                                  Conf#1                    Conf#2                                      
SunSpider:
   3d-cube                                    4.9518+-0.4155            4.7860+-0.0528          might be 1.0346x faster
   3d-morph                                   5.3521+-0.0735     ?      5.3876+-0.1941        ?
   3d-raytrace                                6.0125+-0.4709            5.6209+-0.0285          might be 1.0697x faster
   access-binary-trees                        2.1999+-0.0816            2.1963+-0.0189        
   access-fannkuch                            6.0559+-0.0868            6.0529+-0.0145        
   access-nbody                               2.7230+-0.1068     ?      2.7435+-0.1645        ?
   access-nsieve                              3.0064+-0.0269     ?      3.0465+-0.0399        ? might be 1.0133x slower
   bitops-3bit-bits-in-byte                   1.1810+-0.0133     ?      1.2173+-0.0569        ? might be 1.0307x slower
   bitops-bits-in-byte                        3.3144+-0.0167     ?      3.3685+-0.0506        ? might be 1.0163x slower
   bitops-bitwise-and                         2.0472+-0.0338     ?      2.0872+-0.0519        ? might be 1.0195x slower
   bitops-nsieve-bits                         3.1018+-0.0758            3.0348+-0.0252          might be 1.0221x faster
   controlflow-recursive                      2.4260+-0.0873            2.3800+-0.0226          might be 1.0193x faster
   crypto-aes                                 4.0973+-0.1232     ?      4.2310+-0.3702        ? might be 1.0326x slower
   crypto-md5                                 2.6005+-0.0378     ?      2.6380+-0.0624        ? might be 1.0144x slower
   crypto-sha1                                2.3734+-0.0480            2.3460+-0.0227          might be 1.0117x faster
   date-format-tofte                          6.9090+-0.0555     ?      6.9279+-0.2651        ?
   date-format-xparb                          4.8919+-0.1997     ?      4.9892+-0.1943        ? might be 1.0199x slower
   math-cordic                                2.9521+-0.0302     ?      2.9824+-0.0627        ? might be 1.0103x slower
   math-partial-sums                          4.9571+-0.1167            4.8466+-0.0273          might be 1.0228x faster
   math-spectral-norm                         2.0627+-0.0833            2.0464+-0.0209        
   regexp-dna                                 6.1785+-0.2961     ?      6.2439+-0.3236        ? might be 1.0106x slower
   string-base64                              4.5036+-0.0657            4.4685+-0.0554        
   string-fasta                               6.0087+-0.1265            5.9536+-0.0913        
   string-tagcloud                            8.1602+-0.0584            8.1128+-0.0405        
   string-unpack-code                        19.0600+-0.4252           18.9983+-0.4531        
   string-validate-input                      4.2969+-0.0817     ?      4.3487+-0.1307        ? might be 1.0120x slower

   <arithmetic>                               4.6701+-0.0344            4.6559+-0.0273          might be 1.0031x faster

                                                  Conf#1                    Conf#2                                      
Octane:
   encrypt                                   0.15836+-0.00175    ?     0.16074+-0.00112       ? might be 1.0150x slower
   decrypt                                   2.84958+-0.01036    !     2.89505+-0.02112       ! definitely 1.0160x slower
   deltablue                        x2       0.14076+-0.00447          0.13712+-0.00129         might be 1.0266x faster
   earley                                    0.28378+-0.00133    ?     0.28543+-0.00253       ?
   boyer                                     4.72956+-0.09435    ?     4.79880+-0.05750       ? might be 1.0146x slower
   navier-stokes                    x2       4.93910+-0.00843    ?     4.96180+-0.01523       ?
   raytrace                         x2       0.89213+-0.00267    ?     0.89387+-0.00362       ?
   richards                         x2       0.08272+-0.00077          0.08262+-0.00085       
   splay                            x2       0.34917+-0.00153          0.34916+-0.00151       
   regexp                           x2      22.58500+-0.15579    ^    21.39319+-0.14520       ^ definitely 1.0557x faster
   pdfjs                            x2      38.89915+-0.37815    ?    39.22654+-0.37765       ?
   mandreel                         x2      42.70744+-0.20596         42.61766+-0.14258       
   gbemu                            x2      24.87460+-0.11441    !    25.23538+-0.23868       ! definitely 1.0145x slower
   closure                                   0.56781+-0.00150          0.56473+-0.00253       
   jquery                                    7.42776+-0.04502          7.42551+-0.04437       
   box2d                            x2       9.38607+-0.03564    ?     9.40558+-0.03076       ?
   zlib                             x2     386.17556+-1.95739    ?   390.07341+-3.43982       ? might be 1.0101x slower
   typescript                       x2     662.63595+-7.64406    ?   666.15288+-5.71330       ?

   <geometric>                               5.25325+-0.01343          5.24862+-0.00740         might be 1.0009x faster

                                                  Conf#1                    Conf#2                                      
Kraken:
   ai-astar                                   97.873+-2.138             94.820+-1.510           might be 1.0322x faster
   audio-beat-detection                       47.509+-0.421      ?      47.670+-0.128         ?
   audio-dft                                  96.878+-0.849      ?      97.349+-0.945         ?
   audio-fft                                  36.337+-0.691             35.799+-0.084           might be 1.0150x faster
   audio-oscillator                           49.379+-1.095             48.263+-0.108           might be 1.0231x faster
   imaging-darkroom                           60.687+-0.942      ?      61.631+-1.600         ? might be 1.0156x slower
   imaging-desaturate                         44.290+-0.121      ?      44.381+-0.155         ?
   imaging-gaussian-blur                      67.730+-1.533      ?      68.986+-1.741         ? might be 1.0185x slower
   json-parse-financial                       37.516+-0.108      ?      37.803+-0.574         ?
   json-stringify-tinderbox                   25.154+-0.299      ^      23.566+-1.125         ^ definitely 1.0674x faster
   stanford-crypto-aes                        39.878+-0.456      ?      40.252+-0.138         ?
   stanford-crypto-ccm                        36.839+-0.960      ?      37.160+-0.867         ?
   stanford-crypto-pbkdf2                    101.213+-0.570            100.777+-0.435         
   stanford-crypto-sha256-iterative           39.004+-0.301             38.719+-0.186         

   <arithmetic>                               55.735+-0.190             55.513+-0.247           might be 1.0040x faster

                                                  Conf#1                    Conf#2                                      
AsmBench:
   bigfib.cpp                               435.2185+-5.8724          432.6396+-2.7153        
   cray.c                                   366.6495+-2.3552          364.4775+-0.9322        
   dry.c                                    466.5560+-39.1096         454.9115+-29.1868         might be 1.0256x faster
   FloatMM.c                                711.5276+-4.7815     ?    718.1357+-2.2435        ?
   gcc-loops.cpp                           3669.7504+-7.4500     ?   3690.3642+-14.9276       ?
   n-body.c                                 813.5631+-3.7418          812.8051+-6.1921        
   Quicksort.c                              396.7163+-3.1319          395.0087+-1.4178        
   stepanov_container.cpp                  3336.1117+-67.9640        3297.1206+-14.4521         might be 1.0118x faster
   Towers.c                                 268.9032+-1.2291     ?    269.7010+-2.1627        ?

   <geometric>                              725.6193+-7.1426          723.0271+-4.6005          might be 1.0036x faster

                                                  Conf#1                    Conf#2                                      
Geomean of preferred means:
   <scaled-result>                           31.5596+-0.1203           31.4694+-0.0740          might be 1.0029x faster
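
For readers unfamiliar with the table format, the verdict column can be reproduced from the two means and their "+-" values. The following is a minimal sketch inferred only from the numbers in this thread, not taken from the benchmark harness itself: the names `Measurement` and `compare`, the disjoint-interval rule for "definitely", and the 1.01 cutoff for "might be" are all assumptions that happen to match the per-test rows above (summary rows such as <arithmetic> appear to print a verdict regardless of the cutoff).

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    mean: float        # reported time, lower is better
    half_width: float  # the "+-" value printed next to the mean

    @property
    def low(self) -> float:
        return self.mean - self.half_width

    @property
    def high(self) -> float:
        return self.mean + self.half_width


def compare(conf1: Measurement, conf2: Measurement, threshold: float = 1.01) -> str:
    """Describe Conf#2 relative to Conf#1 (hypothetical helper, inferred rules)."""
    faster = conf2.mean < conf1.mean
    # The ratio is always printed as a value >= 1, with the direction as a word.
    ratio = conf1.mean / conf2.mean if faster else conf2.mean / conf1.mean
    direction = "faster" if faster else "slower"

    # Assumption: "definitely" is used when the two intervals do not overlap.
    disjoint = conf1.high < conf2.low or conf2.high < conf1.low
    if disjoint:
        return f"definitely {ratio:.4f}x {direction}"
    # Assumption: overlapping intervals get a verdict only above ~1% difference.
    if ratio >= threshold:
        return f"might be {ratio:.4f}x {direction}"
    return ""


# Rows copied from the x86 results above.
print(compare(Measurement(4.9518, 0.4155), Measurement(4.7860, 0.0528)))    # 3d-cube
print(compare(Measurement(2.84958, 0.01036), Measurement(2.89505, 0.02112)))  # decrypt
```

Under these assumptions, the small per-test "might be" entries are within noise, and only the rows marked "definitely" indicate a difference larger than the reported error bars.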
Comment 6 Benjamin Poulain 2016-03-05 01:04:11 PST
And here is ARM64. We save a few registers, but evidently register pressure was not a huge problem :)

                                                  Conf#1                    Conf#2                                      
SunSpider:
   3d-cube                                   10.3243+-0.0257           10.2635+-0.0694        
   3d-morph                                   8.4085+-0.1092            8.3218+-0.1203          might be 1.0104x faster
   3d-raytrace                                9.5002+-1.1013     ?      9.5916+-0.7864        ?
   access-binary-trees                        4.2527+-0.2049     ?      4.3150+-0.1448        ? might be 1.0146x slower
   access-fannkuch                           11.4689+-1.0740     ?     12.1995+-1.0038        ? might be 1.0637x slower
   access-nbody                               4.7256+-0.0270     ?      4.7337+-0.0437        ?
   access-nsieve                              3.3900+-0.0972            3.3627+-0.2015        
   bitops-3bit-bits-in-byte                   1.7097+-0.0753     ?      1.7242+-0.0867        ?
   bitops-bits-in-byte                        4.1575+-0.0512     ?      4.1585+-0.0345        ?
   bitops-bitwise-and                         3.2239+-0.1042     ?      3.2352+-0.0187        ?
   bitops-nsieve-bits                         5.8837+-0.0666     ?      5.9077+-0.0390        ?
   controlflow-recursive                      3.3940+-0.0998     ?      3.6182+-0.7075        ? might be 1.0661x slower
   crypto-aes                                 6.4380+-0.2254            6.4365+-0.1517        
   crypto-md5                                 3.9030+-0.0983     ?      3.9640+-0.1467        ? might be 1.0156x slower
   crypto-sha1                                3.7327+-0.1299     ?      3.8057+-0.1504        ? might be 1.0196x slower
   date-format-tofte                         11.3157+-0.0916     ?     11.5099+-0.2129        ? might be 1.0172x slower
   date-format-xparb                          7.3159+-0.0638     ?      7.3850+-0.0515        ?
   math-cordic                                5.4598+-0.0644            5.4094+-0.2284        
   math-partial-sums                         12.0386+-1.2024           11.6752+-0.1541          might be 1.0311x faster
   math-spectral-norm                         3.4418+-0.5164            3.3469+-0.6529          might be 1.0284x faster
   regexp-dna                                 9.7556+-0.1859            9.7205+-0.1759        
   string-base64                              6.4913+-0.0072            6.4834+-0.0277        
   string-fasta                               9.7759+-0.1713     ?      9.8637+-0.5916        ?
   string-tagcloud                           10.8035+-0.0866           10.7655+-0.0469        
   string-unpack-code                        23.1396+-0.0555     ?     23.1617+-0.2357        ?
   string-validate-input                      6.6161+-0.2344            6.4968+-0.2255          might be 1.0184x faster

   <arithmetic>                               7.3333+-0.0453     ?      7.3637+-0.0463        ? might be 1.0041x slower

                                                  Conf#1                    Conf#2                                      
Octane:
   encrypt                                   0.19236+-0.00365          0.18600+-0.00579         might be 1.0342x faster
   decrypt                                   3.66137+-0.00885    ^     3.60235+-0.00692       ^ definitely 1.0164x faster
   deltablue                        x2       0.17275+-0.01057          0.16915+-0.00913         might be 1.0213x faster
   earley                                    0.44862+-0.05620          0.43461+-0.04565         might be 1.0322x faster
   boyer                                    10.05291+-2.77533          9.16307+-2.24909         might be 1.0971x faster
   navier-stokes                    x2       7.16546+-0.01219    ?     7.17074+-0.01713       ?
   raytrace                         x2       1.28420+-0.01694          1.27991+-0.03197       
   richards                         x2       0.11193+-0.00659    ?     0.11460+-0.00094       ? might be 1.0238x slower
   splay                            x2       0.75281+-0.04144          0.75026+-0.04435       
   regexp                           x2      32.72377+-0.78530         32.45107+-0.52304       
   pdfjs                            x2      57.14911+-0.64156    ?    59.04761+-4.98161       ? might be 1.0332x slower
   mandreel                         x2      68.73824+-0.85912         68.45163+-0.87099       
   gbemu                            x2      44.04685+-0.40260    ?    47.41480+-14.53067      ? might be 1.0765x slower
   closure                                   0.64761+-0.02159          0.64240+-0.00639       
   jquery                                    9.53881+-0.03091    ?     9.55679+-0.08651       ?
   box2d                            x2      16.09129+-0.19371         15.93278+-0.26619       
   zlib                             x2     684.53776+-7.48863        678.71967+-20.41057      
   typescript                       x2    1137.11212+-6.00707       1136.12775+-17.69377      

   <geometric>                               8.09176+-0.10935          8.07538+-0.16551         might be 1.0020x faster

                                                  Conf#1                    Conf#2                                      
Kraken:
   ai-astar                                  156.594+-7.098            153.817+-1.086           might be 1.0181x faster
   audio-beat-detection                       58.141+-0.614      ?      58.395+-0.649         ?
   audio-dft                                 139.736+-10.273           134.563+-1.019           might be 1.0384x faster
   audio-fft                                  42.161+-0.707             41.685+-0.057           might be 1.0114x faster
   audio-oscillator                           52.818+-0.481      ?      53.806+-1.285         ? might be 1.0187x slower
   imaging-darkroom                           68.952+-0.042      ?      69.433+-0.934         ?
   imaging-desaturate                         77.510+-0.146      ^      76.866+-0.195         ^ definitely 1.0084x faster
   imaging-gaussian-blur                      96.864+-4.834      ?      98.024+-0.162         ? might be 1.0120x slower
   json-parse-financial                       47.373+-1.127             46.895+-0.065           might be 1.0102x faster
   json-stringify-tinderbox                   27.189+-0.556      ?      27.233+-0.505         ?
   stanford-crypto-aes                        58.289+-0.491             58.042+-2.172         
   stanford-crypto-ccm                        45.964+-1.015      ?      46.307+-2.623         ?
   stanford-crypto-pbkdf2                    141.766+-0.804            140.435+-1.247         
   stanford-crypto-sha256-iterative           51.085+-0.072      !      51.303+-0.131         ! definitely 1.0043x slower

   <arithmetic>                               76.031+-0.989             75.486+-0.254           might be 1.0072x faster

                                                  Conf#1                    Conf#2                                      
AsmBench:
   bigfib.cpp                               688.7177+-41.6818         646.4770+-80.6129         might be 1.0653x faster
   cray.c                                   554.7190+-3.2135          551.2500+-1.4206        
   dry.c                                    465.2328+-13.9236    ?    468.8533+-12.5562       ?
   FloatMM.c                                794.2280+-19.3555    ?    804.7562+-22.0298       ? might be 1.0133x slower
   gcc-loops.cpp                           4660.3900+-8.9334     ^   4625.8010+-7.9105        ^ definitely 1.0075x faster
   n-body.c                                1756.4335+-2.9581     ?   1756.5724+-3.2973        ?
   Quicksort.c                              561.9343+-4.7473          556.9090+-9.0781        
   stepanov_container.cpp                  5519.6120+-10.1877        5504.3962+-18.4030       
   Towers.c                                 277.8829+-3.8248          276.6393+-4.2085        

   <geometric>                              999.4611+-1.0159          991.2871+-10.3108         might be 1.0082x faster

                                                  Conf#1                    Conf#2                                      
Geomean of preferred means:
   <scaled-result>                           46.0809+-0.2310           45.9278+-0.3323          might be 1.0033x faster
Comment 7 Benjamin Poulain 2016-03-05 01:10:31 PST
Created attachment 273074 [details]
Patch
Comment 8 Benjamin Poulain 2016-03-05 01:11:07 PST
*** Bug 148536 has been marked as a duplicate of this bug. ***
Comment 9 WebKit Commit Bot 2016-03-05 01:12:29 PST
Attachment 273074 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/testb3.cpp:12245:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12246:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12247:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12248:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12249:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12250:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12251:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12252:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12253:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12254:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/JavaScriptCore/b3/testb3.cpp:12255:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 11 in 14 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 10 Filip Pizlo 2016-03-05 13:51:49 PST
Comment on attachment 273074 [details]
Patch

Nice!
Comment 11 WebKit Commit Bot 2016-03-06 18:40:10 PST
Comment on attachment 273074 [details]
Patch

Clearing flags on attachment: 273074

Committed r197652: <http://trac.webkit.org/changeset/197652>
Comment 12 WebKit Commit Bot 2016-03-06 18:40:14 PST
All reviewed patches have been landed.  Closing bug.