Bug 163234 - B3->Air lowering should be able to emit complex leas on x86
Summary: B3->Air lowering should be able to emit complex leas on x86
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords:
Depends on:
Blocks: 163171
  Show dependency treegraph
 
Reported: 2016-10-10 12:47 PDT by Filip Pizlo
Modified: 2016-10-10 20:47 PDT (History)
5 users (show)

See Also:


Attachments
this might work (15.82 KB, patch)
2016-10-10 12:48 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (25.17 KB, patch)
2016-10-10 14:38 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (25.27 KB, patch)
2016-10-10 14:42 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (25.56 KB, patch)
2016-10-10 15:16 PDT, Filip Pizlo
saam: review+
Details | Formatted Diff | Diff
patch for landing (29.78 KB, patch)
2016-10-10 19:02 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2016-10-10 12:47:53 PDT
Patch forthcoming.
Comment 1 Filip Pizlo 2016-10-10 12:48:36 PDT
Created attachment 291145 [details]
this might work

I haven't tried it yet.
Comment 2 Filip Pizlo 2016-10-10 14:38:25 PDT
Created attachment 291166 [details]
the patch
Comment 3 WebKit Commit Bot 2016-10-10 14:41:25 PDT
Attachment 291166 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/testb3.cpp:13541:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
Total errors found: 1 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 Filip Pizlo 2016-10-10 14:42:17 PDT
Created attachment 291167 [details]
the patch
Comment 5 WebKit Commit Bot 2016-10-10 14:44:12 PDT
Attachment 291167 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/testb3.cpp:13541:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
Total errors found: 1 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 6 Filip Pizlo 2016-10-10 15:16:35 PDT
Created attachment 291171 [details]
the patch

Fixed some UB.
Comment 7 WebKit Commit Bot 2016-10-10 15:18:57 PDT
Attachment 291171 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/testb3.cpp:13541:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
Total errors found: 1 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Saam Barati 2016-10-10 15:46:10 PDT
Comment on attachment 291171 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291171&action=review

r=me

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:1912
> +            // Use 64-bit math to perform the shift so that <<32 does the right thing.

Can you add a test for this?

> Source/JavaScriptCore/b3/testb3.cpp:13477
> +    checkUsesInstruction(*code, "lea (%rdi,%rsi,4), %rax");

Can you also add a test for when the multiplier is 8?
Comment 9 Filip Pizlo 2016-10-10 15:53:22 PDT
Perf looks OK.


Benchmark report for SunSpider, LongSpider, Octane, Kraken, and AsmBench on murderface (MacBookPro11,5).

VMs tested:
"TipOfTree" at /Volumes/Data/secondary/OpenSource/WebKitBuild/Release/jsc (r207004)
"Things" at /Volumes/Data/quinary/OpenSource/WebKitBuild/Release/jsc (r207004)

Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to
get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                                TipOfTree                   Things                                      
SunSpider:
   3d-cube                                    4.8081+-0.2238     ?      4.8262+-0.2977        ?
   3d-morph                                   4.6882+-0.0978     ?      4.7601+-0.2533        ? might be 1.0153x slower
   3d-raytrace                                4.7714+-0.0548            4.7458+-0.0474        
   access-binary-trees                        1.9676+-0.0293            1.9565+-0.0566        
   access-fannkuch                            4.9313+-0.3643            4.7004+-0.0593          might be 1.0491x faster
   access-nbody                               2.3885+-0.0917     ?      2.4575+-0.1884        ? might be 1.0289x slower
   access-nsieve                              3.1593+-0.1057     ?      3.2587+-0.2102        ? might be 1.0314x slower
   bitops-3bit-bits-in-byte                   1.1943+-0.2344            1.1272+-0.0670          might be 1.0595x faster
   bitops-bits-in-byte                        2.7194+-0.1502            2.7065+-0.0672        
   bitops-bitwise-and                         1.9384+-0.0574     ?      2.0792+-0.3127        ? might be 1.0727x slower
   bitops-nsieve-bits                         3.0229+-0.0852     ?      3.0520+-0.1305        ?
   controlflow-recursive                      2.2252+-0.0363     ?      2.2671+-0.1093        ? might be 1.0189x slower
   crypto-aes                                 4.4934+-0.4792            4.4124+-0.3744          might be 1.0184x faster
   crypto-md5                                 2.6387+-0.0298     ?      2.6655+-0.0540        ? might be 1.0102x slower
   crypto-sha1                                2.6890+-0.0374            2.6641+-0.0297        
   date-format-tofte                          6.6390+-0.1583            6.5411+-0.1523          might be 1.0150x faster
   date-format-xparb                          4.3767+-0.0671     ?      4.5150+-0.3943        ? might be 1.0316x slower
   math-cordic                                2.6641+-0.0350            2.6508+-0.0256        
   math-partial-sums                          3.8015+-0.0254     ?      3.9691+-0.2808        ? might be 1.0441x slower
   math-spectral-norm                         2.0075+-0.0824     ?      2.0110+-0.0383        ?
   regexp-dna                                 6.0497+-0.2065     ?      6.1086+-0.2067        ?
   string-base64                              4.5950+-0.3812     ?      4.5957+-0.3264        ?
   string-fasta                               5.3043+-0.0763            5.2775+-0.0581        
   string-tagcloud                            8.3661+-0.4766     ?      8.3822+-0.5256        ?
   string-unpack-code                        18.1754+-1.3383     ?     18.2402+-1.0321        ?
   string-validate-input                      4.1697+-0.0361     ?      4.1760+-0.1303        ?

   <arithmetic>                               4.3763+-0.0727     ?      4.3902+-0.0511        ? might be 1.0032x slower

                                                TipOfTree                   Things                                      
LongSpider:
   3d-cube                                  791.8907+-11.6104         789.3553+-17.6858       
   3d-morph                                 565.1932+-3.7603          562.6833+-2.6864        
   3d-raytrace                              456.1164+-5.2722          453.4562+-4.8183        
   access-binary-trees                      774.3717+-7.9252     ?    779.0691+-1.0611        ?
   access-fannkuch                          233.1144+-11.8863         229.0910+-4.4792          might be 1.0176x faster
   access-nbody                             503.3042+-3.8408          497.9259+-6.5190          might be 1.0108x faster
   access-nsieve                            284.1951+-10.1249         278.1833+-4.1534          might be 1.0216x faster
   bitops-3bit-bits-in-byte                  31.8182+-1.2447     ?     33.1229+-2.1330        ? might be 1.0410x slower
   bitops-bits-in-byte                       81.7782+-1.9792     ?     81.8352+-1.2246        ?
   bitops-nsieve-bits                       367.6414+-4.2287     ?    371.5944+-7.1039        ? might be 1.0108x slower
   controlflow-recursive                    429.9538+-6.1434          428.6799+-4.3455        
   crypto-aes                               531.7749+-1.9364     ?    535.2872+-2.0557        ?
   crypto-md5                               453.6676+-7.6862     ?    458.8516+-2.8962        ? might be 1.0114x slower
   crypto-sha1                              588.7456+-6.7503     !    606.1940+-10.6159       ! definitely 1.0296x slower
   date-format-tofte                        332.4270+-6.3292          330.9482+-2.4590        
   date-format-xparb                        590.6508+-2.7649     ?    618.2487+-66.2687       ? might be 1.0467x slower
   hash-map                                 143.4471+-5.5083          138.0060+-4.2588          might be 1.0394x faster
   math-cordic                              421.8797+-14.5175    ?    429.3605+-13.2129       ? might be 1.0177x slower
   math-partial-sums                        282.8370+-3.0053          282.4237+-3.3276        
   math-spectral-norm                       515.3109+-4.1812          514.1577+-2.6478        
   string-base64                            479.2913+-5.1687          477.2460+-5.5475        
   string-fasta                             331.6859+-6.6387          324.6578+-3.3975          might be 1.0216x faster
   string-tagcloud                          163.1533+-2.9448          161.6974+-2.8722        

   <geometric>                              337.1272+-2.4717     ?    337.3787+-1.7645        ? might be 1.0007x slower

                                                TipOfTree                   Things                                      
Octane:
   encrypt                                   0.15235+-0.00413          0.14988+-0.00144         might be 1.0165x faster
   decrypt                                   2.71068+-0.02175    ^     2.59835+-0.03295       ^ definitely 1.0432x faster
   deltablue                        x2       0.11573+-0.00146    ?     0.11728+-0.00203       ? might be 1.0134x slower
   earley                                    0.23706+-0.00174          0.23662+-0.00292       
   boyer                                     4.19900+-0.14777          4.19524+-0.12439       
   navier-stokes                    x2       4.60230+-0.02294    ?     4.61005+-0.01454       ?
   raytrace                         x2       0.65030+-0.00117    ?     0.65509+-0.00530       ?
   richards                         x2       0.07858+-0.00116    ?     0.07873+-0.00165       ?
   splay                            x2       0.31164+-0.00299          0.30971+-0.00130       
   regexp                           x2      16.77981+-0.36453         16.67011+-0.42146       
   pdfjs                            x2      38.85408+-0.21052    ?    39.25619+-0.39671       ? might be 1.0103x slower
   mandreel                         x2      39.55364+-0.15019    ?    39.76014+-0.21737       ?
   gbemu                            x2      28.56973+-0.20802    ?    28.67272+-0.50951       ?
   closure                                   0.47103+-0.00270          0.47065+-0.00359       
   jquery                                    6.48439+-0.02125    ^     6.44658+-0.01626       ^ definitely 1.0059x faster
   box2d                            x2       8.63412+-0.08616    ?     8.65745+-0.06238       ?
   zlib                             x2     342.75494+-1.76345        341.00248+-2.91578       
   typescript                       x2     600.84261+-15.17109   ?   615.36666+-10.28872      ? might be 1.0242x slower

   <geometric>                               4.70961+-0.01883    ?     4.71523+-0.01932       ? might be 1.0012x slower

                                                TipOfTree                   Things                                      
Kraken:
   ai-astar                                   91.181+-4.377             88.676+-1.336           might be 1.0282x faster
   audio-beat-detection                       35.557+-0.457             35.508+-0.675         
   audio-dft                                  95.656+-3.304             93.495+-0.763           might be 1.0231x faster
   audio-fft                                  27.399+-0.173      ?      27.479+-0.363         ?
   audio-oscillator                           43.152+-0.492             42.877+-0.339         
   imaging-darkroom                           57.639+-1.814             55.640+-0.235           might be 1.0359x faster
   imaging-desaturate                         41.546+-0.498             41.069+-0.368           might be 1.0116x faster
   imaging-gaussian-blur                      60.018+-1.997      ?      60.225+-4.629         ?
   json-parse-financial                       31.531+-0.486      ?      32.750+-1.794         ? might be 1.0387x slower
   json-stringify-tinderbox                   21.355+-0.855             21.347+-0.918         
   stanford-crypto-aes                        34.679+-1.461             33.782+-0.171           might be 1.0265x faster
   stanford-crypto-ccm                        31.952+-1.692      ?      33.252+-1.862         ? might be 1.0407x slower
   stanford-crypto-pbkdf2                     89.356+-0.941             88.743+-0.664         
   stanford-crypto-sha256-iterative           29.510+-0.917      ?      29.746+-1.058         ?

   <arithmetic>                               49.324+-0.493             48.899+-0.579           might be 1.0087x faster

                                                TipOfTree                   Things                                      
AsmBench:
   bigfib.cpp                               405.6467+-3.5323     ?    409.2070+-5.8495        ?
   cray.c                                   357.7287+-4.7686          356.4018+-2.3540        
   dry.c                                    397.8157+-11.8652         396.8417+-6.9671        
   FloatMM.c                                657.4486+-20.8368         630.6041+-35.1956         might be 1.0426x faster
   gcc-loops.cpp                           3381.3095+-14.3842    ^   3341.0881+-17.5972       ^ definitely 1.0120x faster
   n-body.c                                 744.0028+-3.7649     ?    748.3889+-3.9711        ?
   Quicksort.c                              379.1238+-6.0500     ^    362.0960+-3.7614        ^ definitely 1.0470x faster
   stepanov_container.cpp                  3182.4114+-65.9123        3147.4892+-43.5650         might be 1.0111x faster
   Towers.c                                 242.2733+-5.1053     ?    246.2947+-9.4203        ? might be 1.0166x slower

   <geometric>                              671.8334+-6.8226          665.3898+-4.1114          might be 1.0097x faster

                                                TipOfTree                   Things                                      
Geomean of preferred means:
   <scaled-result>                           47.0355+-0.1865           46.9124+-0.1819          might be 1.0026x faster
Comment 10 Filip Pizlo 2016-10-10 15:53:55 PDT
(In reply to comment #8)
> Comment on attachment 291171 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=291171&action=review
> 
> r=me
> 
> > Source/JavaScriptCore/b3/B3LowerToAir.cpp:1912
> > +            // Use 64-bit math to perform the shift so that <<32 does the right thing.
> 
> Can you add a test for this?
> 
> > Source/JavaScriptCore/b3/testb3.cpp:13477
> > +    checkUsesInstruction(*code, "lea (%rdi,%rsi,4), %rax");
> 
> Can you also add a test for when the multiplier is 8?

Yup, I can add those tests.
Comment 11 Filip Pizlo 2016-10-10 19:02:39 PDT
Created attachment 291207 [details]
patch for landing
Comment 12 WebKit Commit Bot 2016-10-10 19:08:45 PDT
Attachment 291207 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/b3/testb3.cpp:13658:  Boolean expressions that span multiple lines should have their operators on the left side of the line instead of the right side.  [whitespace/operators] [4]
Total errors found: 1 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 13 Filip Pizlo 2016-10-10 20:47:59 PDT
Landed in https://trac.webkit.org/changeset/207039