Bug 163371

Summary: DFG and FTL should be able to use DirectCall ICs when they proved the callee or its executable
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED    
Severity: Normal CC: buildbot, commit-queue, keith_miller, mark.lam, msaboff, ossy, rniwa, saam
Priority: P2    
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
Attachments:
Description Flags
it's a start
none
a bit more
none
getting closer to done
none
adding some FTL action
none
ftl is done
none
it builds!
buildbot: commit-queue-
Archive of layout-test-results from ews104 for mac-yosemite-wk2
none
Archive of layout-test-results from ews117 for mac-yosemite
none
Archive of layout-test-results from ews101 for mac-yosemite
none
Archive of layout-test-results from ews124 for ios-simulator-wk2
none
it ran a program!
none
it runs bigger programs
none
trying to fix direct tail calls
none
tail calls work again
none
the patch
none
the patch
none
fix more things
none
less wrong
none
the patch
none
the patch
none
the patch
ggaren: review+
patch for landing none

Description Filip Pizlo 2016-10-12 17:21:28 PDT
We should definitely use these whenever we prove what the callee is.  We may also want to use them when we have a solid prediction of who the callee might be.
Comment 1 Filip Pizlo 2016-10-12 17:22:37 PDT
Created attachment 291429 [details]
it's a start

I started doodling this while I was waiting for GC tests to run.
Comment 2 Filip Pizlo 2016-10-12 19:29:52 PDT
Created attachment 291442 [details]
a bit more

Threading the notion of direct calls through DFG IR.
Comment 3 Filip Pizlo 2016-10-13 10:09:44 PDT
Created attachment 291488 [details]
getting closer to done

I think that most of the DFG 64-bit backend for DirectCall is done.

I still have lots more work to do: 32-bit DFG and FTL.
Comment 4 Filip Pizlo 2016-10-13 10:53:53 PDT
Created attachment 291494 [details]
adding some FTL action

not done yet
Comment 5 Filip Pizlo 2016-10-13 11:43:50 PDT
Created attachment 291499 [details]
ftl is done
Comment 6 Filip Pizlo 2016-10-13 13:29:59 PDT
Created attachment 291512 [details]
it builds!
Comment 7 WebKit Commit Bot 2016-10-13 13:32:22 PDT
Attachment 291512 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/bytecode/CallLinkInfo.h:201:  The parameter name "vm" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITOperations.cpp:966:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 2 in 30 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 8 Build Bot 2016-10-13 14:32:00 PDT
Comment on attachment 291512 [details]
it builds!

Attachment 291512 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/2279625

Number of test failures exceeded the failure limit.
Comment 9 Build Bot 2016-10-13 14:32:03 PDT
Created attachment 291517 [details]
Archive of layout-test-results from ews104 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 10 Build Bot 2016-10-13 14:36:24 PDT
Comment on attachment 291512 [details]
it builds!

Attachment 291512 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/2279583

Number of test failures exceeded the failure limit.
Comment 11 Build Bot 2016-10-13 14:36:27 PDT
Created attachment 291518 [details]
Archive of layout-test-results from ews117 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews117  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 12 Build Bot 2016-10-13 14:36:59 PDT
Comment on attachment 291512 [details]
it builds!

Attachment 291512 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/2279648

Number of test failures exceeded the failure limit.
Comment 13 Build Bot 2016-10-13 14:37:02 PDT
Created attachment 291519 [details]
Archive of layout-test-results from ews101 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews101  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 14 Build Bot 2016-10-13 14:37:33 PDT
Comment on attachment 291512 [details]
it builds!

Attachment 291512 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/2279617

Number of test failures exceeded the failure limit.
Comment 15 Build Bot 2016-10-13 14:37:37 PDT
Created attachment 291520 [details]
Archive of layout-test-results from ews124 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews124  Port: ios-simulator-wk2  Platform: Mac OS X 10.11.6
Comment 16 Filip Pizlo 2016-10-13 14:45:23 PDT
Created attachment 291521 [details]
it ran a program!

Seems like we can DirectCall in DFG64 and FTL.

I haven't tried DirectConstruct.

I haven't tried DirectTailCall.

I haven't tried DirectTailCallInlinedCaller.

I haven't implemented DFG32.

I haven't run any real tests.
Comment 17 Filip Pizlo 2016-10-13 16:26:05 PDT
Created attachment 291533 [details]
it runs bigger programs
Comment 18 Filip Pizlo 2016-10-14 14:22:33 PDT
Man, I totally messed up the tail call support.  I need to make sure that the slow path call happens before we destroy our call frame.
Comment 19 WebKit Commit Bot 2016-10-14 14:22:44 PDT
Attachment 291533 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/bytecode/CallLinkInfo.h:201:  The parameter name "vm" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITOperations.cpp:962:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 2 in 38 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 20 Filip Pizlo 2016-10-14 14:23:51 PDT
Created attachment 291667 [details]
trying to fix direct tail calls
Comment 21 Filip Pizlo 2016-10-14 16:30:02 PDT
Created attachment 291679 [details]
tail calls work again
Comment 22 Filip Pizlo 2016-10-15 13:04:16 PDT
Created attachment 291721 [details]
the patch
Comment 23 WebKit Commit Bot 2016-10-15 13:06:09 PDT
Attachment 291721 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/bytecode/CallLinkInfo.h:202:  The parameter name "vm" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITOperations.cpp:962:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 2 in 53 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 24 Saam Barati 2016-10-15 13:23:39 PDT
Have you run benchmarks on this patch?

Also, I'll try to review this later today or tmrw.
Comment 25 Filip Pizlo 2016-10-15 14:01:42 PDT
Created attachment 291726 [details]
the patch

Fixed some bugs.
Comment 26 WebKit Commit Bot 2016-10-15 14:04:11 PDT
Attachment 291726 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/bytecode/CallLinkInfo.h:202:  The parameter name "vm" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITOperations.cpp:962:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 2 in 53 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 27 Filip Pizlo 2016-10-15 14:04:21 PDT
(In reply to comment #24)
> Have you run benchmarks on this patch?
> 
> Also, I'll try to review this later today or tmrw.

Benchmark report for SunSpider, LongSpider, Octane, Kraken, and AsmBench on murderface (MacBookPro11,5).

VMs tested:
"TipOfTree" at /Volumes/Data/secondary/OpenSource/WebKitBuild/Release/jsc (r207376)
"Things" at /Volumes/Data/primary/OpenSource/WebKitBuild/Release/jsc (r207376)

Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to
get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                                TipOfTree                   Things                                      
SunSpider:
   3d-cube                                    4.7665+-0.2137     ?      4.8226+-0.2051        ? might be 1.0118x slower
   3d-morph                                   4.7362+-0.0965            4.6974+-0.0578        
   3d-raytrace                                4.7505+-0.1114     ?      4.8104+-0.1217        ? might be 1.0126x slower
   access-binary-trees                        1.9671+-0.0389            1.9592+-0.0352        
   access-fannkuch                            4.8027+-0.2510     ?      4.8110+-0.2442        ?
   access-nbody                               2.4098+-0.1008     ?      2.4310+-0.1062        ?
   access-nsieve                              3.1898+-0.1587     ?      3.1928+-0.0835        ?
   bitops-3bit-bits-in-byte                   1.1159+-0.0911            1.0765+-0.0155          might be 1.0366x faster
   bitops-bits-in-byte                        2.6938+-0.0356     ?      2.7190+-0.0398        ?
   bitops-bitwise-and                         1.9195+-0.0162     ?      1.9628+-0.0677        ? might be 1.0226x slower
   bitops-nsieve-bits                         3.0149+-0.0748     ?      3.0219+-0.0854        ?
   controlflow-recursive                      2.2395+-0.0225            2.2050+-0.0503          might be 1.0156x faster
   crypto-aes                                 4.3206+-0.2407            4.2364+-0.0543          might be 1.0199x faster
   crypto-md5                                 2.6148+-0.0447     ?      2.6326+-0.0596        ?
   crypto-sha1                                2.6365+-0.0324     ?      2.6891+-0.1161        ? might be 1.0200x slower
   date-format-tofte                          6.7162+-0.2029     ?      6.8406+-0.2826        ? might be 1.0185x slower
   date-format-xparb                          4.3555+-0.0766     ?      4.3696+-0.0503        ?
   math-cordic                                2.7938+-0.2009            2.6858+-0.1013          might be 1.0402x faster
   math-partial-sums                          3.8956+-0.1450            3.8562+-0.1020          might be 1.0102x faster
   math-spectral-norm                         2.0539+-0.1205            1.9498+-0.0324          might be 1.0534x faster
   regexp-dna                                 6.1190+-0.1865     ?      6.2348+-0.1837        ? might be 1.0189x slower
   string-base64                              4.4453+-0.0600     ?      4.6553+-0.3771        ? might be 1.0473x slower
   string-fasta                               5.5435+-0.4184            5.2988+-0.0605          might be 1.0462x faster
   string-tagcloud                            8.1785+-0.1726            8.1142+-0.0475        
   string-unpack-code                        17.9479+-0.4932     ?     18.4537+-0.7685        ? might be 1.0282x slower
   string-validate-input                      4.1795+-0.0632     ?      4.2424+-0.2175        ? might be 1.0150x slower

   <arithmetic>                               4.3618+-0.0349     ?      4.3834+-0.0508        ? might be 1.0050x slower

                                                TipOfTree                   Things                                      
LongSpider:
   3d-cube                                  779.8097+-13.3833         775.2314+-10.7554       
   3d-morph                                 564.5124+-5.9556     ?    565.3111+-4.3749        ?
   3d-raytrace                              451.2368+-5.8123          446.6384+-3.8572          might be 1.0103x faster
   access-binary-trees                      783.4135+-27.1677    ?    787.7517+-10.5074       ?
   access-fannkuch                          233.3810+-12.9643         229.9929+-7.4942          might be 1.0147x faster
   access-nbody                             501.0771+-3.7342          497.8612+-3.1479        
   access-nsieve                            274.6501+-2.4070     ?    280.0665+-10.0914       ? might be 1.0197x slower
   bitops-3bit-bits-in-byte                  31.9132+-1.9116     ?     32.3127+-1.5160        ? might be 1.0125x slower
   bitops-bits-in-byte                       80.9983+-1.0565     ?     83.7311+-3.5949        ? might be 1.0337x slower
   bitops-nsieve-bits                       367.1679+-4.3950     ?    368.4792+-7.2455        ?
   controlflow-recursive                    429.9335+-6.7108     ^    400.8186+-2.2474        ^ definitely 1.0726x faster
   crypto-aes                               534.5749+-6.4564     ?    535.1849+-7.8424        ?
   crypto-md5                               451.3222+-7.1409     ?    461.1316+-12.0189       ? might be 1.0217x slower
   crypto-sha1                              589.8838+-7.2732     ?    600.6090+-8.0827        ? might be 1.0182x slower
   date-format-tofte                        332.4468+-10.2948    ?    332.9252+-11.8860       ?
   date-format-xparb                        597.1398+-2.9384          593.3193+-1.1824        
   hash-map                                 136.5276+-4.1086          136.1173+-1.5773        
   math-cordic                              406.7956+-6.2529     ?    407.8101+-16.8894       ?
   math-partial-sums                        281.4867+-1.7377     ?    283.6348+-4.2371        ?
   math-spectral-norm                       515.6010+-1.0586          515.4228+-2.9267        
   string-base64                            462.9693+-3.4169     ?    472.7293+-16.2820       ? might be 1.0211x slower
   string-fasta                             324.9185+-5.3776     ?    327.2005+-5.6165        ?
   string-tagcloud                          162.9485+-4.0327          162.5833+-5.0717        

   <geometric>                              334.2752+-1.8701     ?    334.7928+-1.0391        ? might be 1.0015x slower

                                                TipOfTree                   Things                                      
Octane:
   encrypt                                   0.14896+-0.00103          0.14831+-0.00124       
   decrypt                                   2.55593+-0.03504          2.54921+-0.02650       
   deltablue                        x2       0.11758+-0.00116          0.11591+-0.00219         might be 1.0144x faster
   earley                                    0.24084+-0.00455          0.23685+-0.00118         might be 1.0169x faster
   boyer                                     4.17223+-0.07967    ^     3.93125+-0.02972       ^ definitely 1.0613x faster
   navier-stokes                    x2       4.63655+-0.04569          4.60293+-0.01362       
   raytrace                         x2       0.66010+-0.00678          0.65622+-0.01506       
   richards                         x2       0.07792+-0.00131          0.07762+-0.00138       
   splay                            x2       0.31310+-0.00289          0.31077+-0.00702       
   regexp                           x2      16.72938+-0.29279         16.15823+-0.61537         might be 1.0353x faster
   pdfjs                            x2      39.22051+-0.43549         38.83390+-0.61884       
   mandreel                         x2      40.14901+-0.25783         39.69607+-0.38816         might be 1.0114x faster
   gbemu                            x2      28.68935+-0.26424    ?    29.02162+-1.95403       ? might be 1.0116x slower
   closure                                   0.46879+-0.00151    ?     0.47252+-0.00466       ?
   jquery                                    6.44494+-0.03909          6.44257+-0.06341       
   box2d                            x2       8.62186+-0.04397          8.61886+-0.06069       
   zlib                             x2     328.81759+-14.03874   ?   335.57731+-1.11317       ? might be 1.0206x slower
   typescript                       x2     613.26416+-6.78977        601.30823+-10.00381        might be 1.0199x faster

   <geometric>                               4.70828+-0.01610          4.67003+-0.02873         might be 1.0082x faster

                                                TipOfTree                   Things                                      
Kraken:
   ai-astar                                   89.328+-2.402             88.863+-3.441         
   audio-beat-detection                       35.867+-1.495             35.221+-0.172           might be 1.0183x faster
   audio-dft                                  95.739+-3.742      ?      96.159+-2.510         ?
   audio-fft                                  27.355+-0.050             27.332+-0.043         
   audio-oscillator                           42.892+-0.550             42.888+-0.171         
   imaging-darkroom                           55.780+-0.497      ?      56.784+-2.388         ? might be 1.0180x slower
   imaging-desaturate                         41.202+-0.422             40.915+-0.314         
   imaging-gaussian-blur                      59.246+-2.627      ?      61.496+-3.173         ? might be 1.0380x slower
   json-parse-financial                       32.115+-1.068      ?      32.161+-2.091         ?
   json-stringify-tinderbox                   21.175+-1.298      ?      21.596+-1.067         ? might be 1.0199x slower
   stanford-crypto-aes                        34.270+-0.255             34.016+-0.495         
   stanford-crypto-ccm                        33.029+-1.619      ?      34.549+-2.298         ? might be 1.0460x slower
   stanford-crypto-pbkdf2                     89.326+-2.087             87.844+-0.935           might be 1.0169x faster
   stanford-crypto-sha256-iterative           29.481+-0.336      ?      29.636+-0.675         ?

   <arithmetic>                               49.058+-0.190      ?      49.247+-0.726         ? might be 1.0039x slower

                                                TipOfTree                   Things                                      
AsmBench:
   bigfib.cpp                               409.4874+-4.3776          408.6794+-1.6398        
   cray.c                                   359.6500+-5.3419          355.8020+-4.1749          might be 1.0108x faster
   dry.c                                    393.6168+-5.4804          390.7567+-4.2644        
   FloatMM.c                                625.1077+-27.2172         614.7714+-6.7022          might be 1.0168x faster
   gcc-loops.cpp                           3375.9010+-51.6729        3355.4707+-31.2944       
   n-body.c                                 746.5075+-3.5171     ?    748.6332+-2.6703        ?
   Quicksort.c                              372.0623+-7.6587     ?    378.1494+-4.7250        ? might be 1.0164x slower
   stepanov_container.cpp                  3134.2475+-25.2866        3126.8337+-13.1033       
   Towers.c                                 233.5002+-4.6557     ^    222.8184+-3.5266        ^ definitely 1.0479x faster

   <geometric>                              663.2461+-3.2493     ^    657.9628+-1.6912        ^ definitely 1.0080x faster

                                                TipOfTree                   Things                                      
Geomean of preferred means:
   <scaled-result>                           46.7519+-0.1354           46.6968+-0.1801          might be 1.0012x faster
Comment 28 Filip Pizlo 2016-10-15 14:11:52 PDT
Created attachment 291727 [details]
fix more things

I don't think it's ready for review yet.  There's still more to be done.
Comment 29 WebKit Commit Bot 2016-10-15 14:21:41 PDT
Attachment 291727 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/bytecode/CallLinkInfo.h:202:  The parameter name "vm" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITOperations.cpp:962:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 2 in 53 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 30 Filip Pizlo 2016-10-16 10:40:20 PDT
Created attachment 291751 [details]
less wrong

This thing still has failures in run-javascriptcore-tests.
Comment 31 Filip Pizlo 2016-10-16 13:32:20 PDT
Looks like I'm hitting a register allocation bug.  It's not an easy one.
Comment 32 Filip Pizlo 2016-10-16 14:58:52 PDT
Created attachment 291778 [details]
the patch

I think it works!
Comment 33 WebKit Commit Bot 2016-10-16 15:02:18 PDT
Attachment 291778 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/bytecode/CallLinkInfo.h:202:  The parameter name "vm" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITOperations.cpp:962:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 2 in 54 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 34 Filip Pizlo 2016-10-16 16:45:26 PDT
Created attachment 291784 [details]
the patch

Passes 32-bit debug tests!
Comment 35 WebKit Commit Bot 2016-10-16 16:46:58 PDT
Attachment 291784 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/bytecode/CallLinkInfo.h:202:  The parameter name "vm" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITOperations.cpp:962:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 2 in 54 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 36 Filip Pizlo 2016-10-16 16:58:16 PDT
Created attachment 291785 [details]
the patch
Comment 37 WebKit Commit Bot 2016-10-16 17:00:23 PDT
Attachment 291785 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/bytecode/CallLinkInfo.h:202:  The parameter name "vm" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/jit/JITOperations.cpp:962:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 2 in 54 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 38 Geoffrey Garen 2016-10-17 14:34:10 PDT
Comment on attachment 291785 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291785&action=review

r=me

> Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:93
>          IndexType firstNonRegIndex = m_lastPrecoloredRegisterIndex + 1;
>          for (IndexType i = firstNonRegIndex; i < m_degrees.size(); ++i) {
>              unsigned degree = m_degrees[i];
> -            if (!degree)
> -                continue;
> -
>              if (degree >= m_regsInPriorityOrder.size())
>                  addToSpill(i);
>              else if (!m_moveList[i].isEmpty())

Revert. This was a different patch.
Comment 39 Filip Pizlo 2016-10-17 14:48:23 PDT
(In reply to comment #38)
> Comment on attachment 291785 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=291785&action=review
> 
> r=me
> 
> > Source/JavaScriptCore/b3/air/AirIteratedRegisterCoalescing.cpp:93
> >          IndexType firstNonRegIndex = m_lastPrecoloredRegisterIndex + 1;
> >          for (IndexType i = firstNonRegIndex; i < m_degrees.size(); ++i) {
> >              unsigned degree = m_degrees[i];
> > -            if (!degree)
> > -                continue;
> > -
> >              if (degree >= m_regsInPriorityOrder.size())
> >                  addToSpill(i);
> >              else if (!m_moveList[i].isEmpty())
> 
> Revert. This was a different patch.

Thanks, will do.
Comment 40 Saam Barati 2016-10-17 15:52:05 PDT
Comment on attachment 291785 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291785&action=review

Patch also LGTM, but I have a few nits

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:868
> +                    Graph::parameterSlotsForArgCount(numAllocatedArgs)
> +                    <= m_jit.graph().m_parameterSlots);

Nit: I'd put this compare on one line.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:1021
> +        JITCompiler::Call call = isTail ? m_jit.nearTailCall() : m_jit.nearCall();

isTail can't be true here.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:1025
> +        if (!isTail && isX86())

isTail can't be false here.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:1034
> +        if (isTail)

ditto

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:986
> +        JITCompiler::Call call = isTail ? m_jit.nearTailCall() : m_jit.nearCall();

ditto

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:990
> +        if (!isTail && isX86())

ditto

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:999
> +        if (isTail)

ditto

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5590
> +            numAllocatedArgs = std::max(
> +                numAllocatedArgs,
> +                std::min(
> +                    static_cast<unsigned>(functionExecutable->parameterCount()) + 1,
> +                    Options::maximumDirectCallStackSize()));

This is obviously an edge case, but I don't think this code is ideal if you have something like this:
numAllocatedArgs = 10
functionExecutable->parameterCount() = 250
Options::maximumDirectCallStackSize() = 200

This will cause us to create a frame of size for 200 args, but we will still have to do fixup in the caller. It seems like we should always defer to the caller's arity fixup if they're asking for more than maximumDirectCallStackSize of arguments?

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5606
> +            auto addArgument = [&] (LValue value, VirtualRegister reg, int offset) {

Do lambdas allow default argument values?

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5615
> +                addArgument(lowJSValue(m_graph.varArgChild(node, 1 + i)), virtualRegisterForArgument(i), 0);

Should this be virtualRegisterForArgument(i + 1)? Or is |this| one of the varargs of this node?

> Source/JavaScriptCore/jit/JITOperations.cpp:962
> +    if (executable->isHostFunction()) {
> +        codePtr = executable->entrypointFor(kind, MustCheckArity);
> +    } else {

Style: no braces here.

> Source/JavaScriptCore/runtime/Options.h:303
> +    v(unsigned, maximumDirectCallStackSize, 200, Normal, nullptr) \

Nit: I'd call this "maximumDirectCallArgumentCount" since that's how you use it.
Comment 41 Saam Barati 2016-10-17 15:57:05 PDT
Comment on attachment 291785 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291785&action=review

> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5647
> +                GPRReg calleeGPR = params[!isTail].gpr();

One more nit: Can this just be: isTail ? 0 : 1.
I think what you have now requires more thinking than needed when reading it.
Comment 42 Filip Pizlo 2016-10-17 17:18:31 PDT
Thanks for the detailed review.  Responding to the changes that I made:

(In reply to comment #40)
> Comment on attachment 291785 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=291785&action=review
> 
> Patch also LGTM, but I have a few nits
> 
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:868
> > +                    Graph::parameterSlotsForArgCount(numAllocatedArgs)
> > +                    <= m_jit.graph().m_parameterSlots);
> 
> Nit: I'd put this compare on one line.
> 
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:1021
> > +        JITCompiler::Call call = isTail ? m_jit.nearTailCall() : m_jit.nearCall();
> 
> isTail can't be true here.

Fixed.

> 
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:1025
> > +        if (!isTail && isX86())
> 
> isTail can't be false here.

Fixed.

> 
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:1034
> > +        if (isTail)
> 
> ditto

Fixed.

> 
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:986
> > +        JITCompiler::Call call = isTail ? m_jit.nearTailCall() : m_jit.nearCall();
> 
> ditto

Fixed.

> 
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:990
> > +        if (!isTail && isX86())
> 
> ditto

Fixed.

> 
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:999
> > +        if (isTail)
> 
> ditto

Fixed.

> 
> > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5590
> > +            numAllocatedArgs = std::max(
> > +                numAllocatedArgs,
> > +                std::min(
> > +                    static_cast<unsigned>(functionExecutable->parameterCount()) + 1,
> > +                    Options::maximumDirectCallStackSize()));
> 
> This is obviously an edge case, but I don't think this code is ideal if you
> have something like this:
> numAllocatedArgs = 10
> functionExecutable->parameterCount() = 250
> Options::maximumDirectCallStackSize() = 200
> 
> This will cause us to create a frame of size for 200 args, but we will still
> have to do fixup in the caller. It seems like we should always defer to the
> caller's arity fixup if they're asking for more than
> maximumDirectCallStackSize of arguments?

Good catch, this should be:

unsigned numDesiredAllocatedArgs = static_cast<unsigned>(functionExecutable->parameterCount()) + 1;
if (numDesiredAllocatedArgs <= Options::maximumDirectCallStackSize())
    numAllocatedArgs = std::max(numAllocatedArgs, numDesiredAllocatedArgs);

> 
> > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5606
> > +            auto addArgument = [&] (LValue value, VirtualRegister reg, int offset) {
> 
> Do lambdas allow default argument values?

I don't know.

> 
> > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5615
> > +                addArgument(lowJSValue(m_graph.varArgChild(node, 1 + i)), virtualRegisterForArgument(i), 0);
> 
> Should this be virtualRegisterForArgument(i + 1)? Or is |this| one of the
> varargs of this node?

This is one of the varargs.

> 
> > Source/JavaScriptCore/jit/JITOperations.cpp:962
> > +    if (executable->isHostFunction()) {
> > +        codePtr = executable->entrypointFor(kind, MustCheckArity);
> > +    } else {
> 
> Style: no braces here.

Fixed.

> 
> > Source/JavaScriptCore/runtime/Options.h:303
> > +    v(unsigned, maximumDirectCallStackSize, 200, Normal, nullptr) \
> 
> Nit: I'd call this "maximumDirectCallArgumentCount" since that's how you use
> it.
Comment 43 Filip Pizlo 2016-10-17 17:31:14 PDT
Created attachment 291903 [details]
patch for landing
Comment 44 WebKit Commit Bot 2016-10-17 17:34:24 PDT
Attachment 291903 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/bytecode/CallLinkInfo.h:202:  The parameter name "vm" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 1 in 54 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 45 Saam Barati 2016-10-17 17:35:10 PDT
Comment on attachment 291785 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291785&action=review

>>> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:5590
>>> +                    Options::maximumDirectCallStackSize()));
>> 
>> This is obviously an edge case, but I don't think this code is ideal if you have something like this:
>> numAllocatedArgs = 10
>> functionExecutable->parameterCount() = 250
>> Options::maximumDirectCallStackSize() = 200
>> 
>> This will cause us to create a frame of size for 200 args, but we will still have to do fixup in the caller. It seems like we should always defer to the caller's arity fixup if they're asking for more than maximumDirectCallStackSize of arguments?
> 
> Good catch, this should be:
> 
> unsigned numDesiredAllocatedArgs = static_cast<unsigned>(functionExecutable->parameterCount()) + 1;
> if (numDesiredAllocatedArgs <= Options::maximumDirectCallStackSize())
>     numAllocatedArgs = std::max(numAllocatedArgs, numDesiredAllocatedArgs);

Oops I obviously meant callee not caller in my comment, but it sounds like you knew what I meant :)
Comment 46 Filip Pizlo 2016-10-18 11:33:10 PDT
Landed in https://trac.webkit.org/changeset/207475