Bug 77070

Summary: All DFG helpers that may call out to arbitrary JS code must know where they were called from due to inlining and call stack walking
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: dglazkov, rakuco, webkit.review.bot
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
work in progress
webkit-ews: commit-queue-
the patch
webkit-ews: commit-queue-
the patch
ggaren: review+, webkit.review.bot: commit-queue-
the patch
none
the patch webkit.review.bot: commit-queue-

Description Filip Pizlo 2012-01-25 20:55:13 PST
Consider code like:

function a(stuff) { return a.arguments; }
function b(stuff) { return a(stuff); }

If b() is called frequently, a() will be inlined into b().  The access to a.arguments will turn into a GetById of sorts.  But the arguments access will fail because the call to the DFG helper that is supposed to do the GetById will not set where in b() we are right now, so the stack walking code for finding the arguments will not know that a() is on the stack.
Comment 1 Filip Pizlo 2012-01-25 20:55:26 PST
<rdar://problem/10750834>
Comment 2 Filip Pizlo 2012-01-25 21:11:04 PST
Created attachment 124060 [details]
work in progress

This patch is actually complete but I want to take some time to write some good layout tests for all of the wonderful corner cases I found.
Comment 3 WebKit Review Bot 2012-01-25 21:13:13 PST
Attachment 124060 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/CMakeLists.txt', u'S..." exit_code: 1

Source/JavaScriptCore/interpreter/AbstractPC.h:46:  The parameter name "globalData" adds no information, so it should be removed.  [readability/parameter_name] [5]
Source/JavaScriptCore/interpreter/AbstractPC.h:46:  The parameter name "exec" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 2 in 22 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 Filip Pizlo 2012-01-25 21:22:15 PST
Performance results showing its neutrality.



[pizlo@nitroflex bencher] ./bencher TipOfTree:/Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc StoreCodeOrigin:/Volumes/Data/pizlo/senary/OpenSource/WebKitBuild/Release/jsc --remote oldmac --local
Copying TipOfTree into /Volumes/Data/pizlo/bencher/temp/benchdata...
Copying StoreCodeOrigin into /Volumes/Data/pizlo/bencher/temp/benchdata...
All VMs are in place.
Packaging benchmarking directory for remote hosts...
Sending benchmark payload to oldmac...
Running on oldmac...
376/376                                                              
Generating benchmark report at TipOfTree_StoreCodeOrigin_SunSpiderV8Kraken_oldmac_20120125_2115_benchReport.txt

Benchmark report for SunSpider, V8, and Kraken on oldmac (MacPro4,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc (r105935)
"StoreCodeOrigin" at /Volumes/Data/pizlo/senary/OpenSource/WebKitBuild/Release/jsc (r105935)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime()
function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in
milliseconds.

                                            TipOfTree            StoreCodeOrigin                                 
SunSpider:
   3d-cube                                6.9488+-0.0788    ^     6.5647+-0.0677       ^ definitely 1.0585x faster
   3d-morph                              11.0735+-0.1713         10.9503+-0.0943         might be 1.0113x faster
   3d-raytrace                            9.4707+-0.2103    ?     9.5085+-0.2293       ?
   access-binary-trees                    1.9763+-0.0141    ?     1.9801+-0.0157       ?
   access-fannkuch                        9.1079+-0.0666    ^     8.9301+-0.0333       ^ definitely 1.0199x faster
   access-nbody                           4.8524+-0.0529          4.8357+-0.0170       
   access-nsieve                          4.1552+-0.0332    ?     4.1578+-0.0370       ?
   bitops-3bit-bits-in-byte               1.5453+-0.0144          1.5415+-0.0077       
   bitops-bits-in-byte                    6.3152+-0.0251    ?     6.3397+-0.0170       ?
   bitops-bitwise-and                     3.9769+-0.0033    !     3.9874+-0.0047       ! definitely 1.0026x slower
   bitops-nsieve-bits                     6.8607+-0.0422          6.8369+-0.0314       
   controlflow-recursive                  2.8286+-0.0135    ?     2.8361+-0.0167       ?
   crypto-aes                             8.7081+-0.0975    !     8.9696+-0.1178       ! definitely 1.0300x slower
   crypto-md5                             3.0635+-0.0280          3.0564+-0.0278       
   crypto-sha1                            2.7651+-0.0235    ?     2.7742+-0.0198       ?
   date-format-tofte                     13.0011+-0.1342    ?    13.0626+-0.1332       ?
   date-format-xparb                     12.4576+-0.1808    !    13.1638+-0.1068       ! definitely 1.0567x slower
   math-cordic                            8.8122+-0.0594    ?     8.9559+-0.1724       ? might be 1.0163x slower
   math-partial-sums                     12.5997+-0.0408    ?    12.6615+-0.0385       ?
   math-spectral-norm                     3.2204+-0.0166          3.2013+-0.0081       
   regexp-dna                            10.6201+-0.1430         10.5679+-0.0659       
   string-base64                          5.4396+-0.0488    ?     5.4779+-0.0623       ?
   string-fasta                           8.5682+-0.0231    ?     8.5860+-0.0245       ?
   string-tagcloud                       15.3282+-0.0672    ^    15.1681+-0.0751       ^ definitely 1.0106x faster
   string-unpack-code                    25.4699+-0.0554    ?    25.6491+-0.1269       ?
   string-validate-input                  7.1639+-0.0709    ?     7.2312+-0.0747       ?

   <arithmetic> *                         7.9357+-0.0348    ?     7.9613+-0.0410       ? might be 1.0032x slower
   <geometric>                            6.4624+-0.0308    ?     6.4710+-0.0326       ? might be 1.0013x slower
   <harmonic>                             5.1369+-0.0256    ?     5.1370+-0.0244       ? might be 1.0000x slower

                                            TipOfTree            StoreCodeOrigin                                 
V8:
   crypto                                95.2602+-1.2474         95.0325+-0.1779       
   deltablue                            192.8642+-1.4185        190.9143+-2.0429         might be 1.0102x faster
   earley-boyer                         116.3381+-3.2398    ?   116.6156+-3.0490       ?
   raytrace                              62.4554+-0.5250         62.1242+-0.2297       
   regexp                               117.0618+-0.3430    !   118.1363+-0.3603       ! definitely 1.0092x slower
   richards                             167.7616+-0.9350        167.4195+-1.0053       
   splay                                 85.9614+-0.5781    ?    87.3899+-1.5143       ? might be 1.0166x slower

   <arithmetic>                         119.6718+-0.7339        119.6617+-0.7530         might be 1.0001x faster
   <geometric> *                        112.2914+-0.6963    ?   112.4210+-0.6841       ? might be 1.0012x slower
   <harmonic>                           105.3508+-0.6495    ?   105.5294+-0.6142       ? might be 1.0017x slower

                                            TipOfTree            StoreCodeOrigin                                 
Kraken:
   ai-astar                             899.3676+-0.8194        898.5860+-0.7611       
   audio-beat-detection                 235.7903+-0.9517    ^   234.1390+-0.3143       ^ definitely 1.0071x faster
   audio-dft                            344.0872+-0.9791    ?   344.6886+-1.0430       ?
   audio-fft                            146.4794+-0.1007    !   146.8324+-0.0532       ! definitely 1.0024x slower
   audio-oscillator                     372.3952+-3.7514    ?   374.5188+-2.5319       ?
   imaging-darkroom                     363.1993+-8.9926        362.4276+-7.3561       
   imaging-desaturate                   279.0770+-0.2465    !   287.8392+-1.5893       ! definitely 1.0314x slower
   imaging-gaussian-blur                630.1294+-0.3312    ?   632.8067+-5.4983       ?
   json-parse-financial                  79.3698+-0.1884    ?    79.7423+-0.2391       ?
   json-stringify-tinderbox              98.3397+-0.5552         97.5735+-0.3820       
   stanford-crypto-aes                  126.9323+-0.2539    ?   127.0340+-0.3985       ?
   stanford-crypto-ccm                  124.8409+-1.2446    ?   125.1225+-1.0071       ?
   stanford-crypto-pbkdf2               239.2119+-0.5470    ?   240.5413+-0.8478       ?
   stanford-crypto-sha256-iterative     110.3179+-0.5499    ?   110.8394+-0.2766       ?

   <arithmetic> *                       289.2527+-0.8314    ?   290.1922+-0.9324       ? might be 1.0032x slower
   <geometric>                          224.5557+-0.5116    ?   225.2853+-0.5708       ? might be 1.0032x slower
   <harmonic>                           180.4026+-0.2647    ?   180.8393+-0.3321       ? might be 1.0024x slower

                                            TipOfTree            StoreCodeOrigin                                 
All benchmarks:
   <arithmetic>                         108.3738+-0.3551    ?   108.6663+-0.3822       ? might be 1.0027x slower
   <geometric>                           28.4489+-0.1088    ?    28.5022+-0.1202       ? might be 1.0019x slower
   <harmonic>                             9.0290+-0.0441    ?     9.0297+-0.0424       ? might be 1.0001x slower

                                            TipOfTree            StoreCodeOrigin                                 
Geomean of preferred means:
   <scaled-result>                       63.6405+-0.2505    ?    63.8024+-0.2815       ? might be 1.0025x slower

Running locally...
376/376                                                              
Generating benchmark report at TipOfTree_StoreCodeOrigin_SunSpiderV8Kraken_nitroflex_20120125_2117_benchReport.txt

Benchmark report for SunSpider, V8, and Kraken on nitroflex (MacBookPro8,2).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc (r105935)
"StoreCodeOrigin" at /Volumes/Data/pizlo/senary/OpenSource/WebKitBuild/Release/jsc (r105935)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime()
function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in
milliseconds.

                                            TipOfTree            StoreCodeOrigin                                 
SunSpider:
   3d-cube                                5.1113+-0.0803    ?     5.2325+-0.1350       ? might be 1.0237x slower
   3d-morph                               8.7623+-0.1635          8.7519+-0.1378       
   3d-raytrace                            7.6618+-0.1789          7.5046+-0.2901         might be 1.0209x faster
   access-binary-trees                    1.6148+-0.0852          1.5181+-0.0373         might be 1.0637x faster
   access-fannkuch                        6.0782+-0.1009    !     6.4547+-0.0920       ! definitely 1.0619x slower
   access-nbody                           3.3197+-0.0734    ?     3.3604+-0.0893       ? might be 1.0123x slower
   access-nsieve                          2.9323+-0.0837          2.9261+-0.0668       
   bitops-3bit-bits-in-byte               1.3879+-0.0309    ?     1.4020+-0.0172       ? might be 1.0102x slower
   bitops-bits-in-byte                    2.3546+-0.0636    ?     2.3855+-0.0408       ? might be 1.0131x slower
   bitops-bitwise-and                     3.4221+-0.0675    !     3.6211+-0.0861       ! definitely 1.0582x slower
   bitops-nsieve-bits                     5.5194+-0.1408    ?     5.6532+-0.0953       ? might be 1.0243x slower
   controlflow-recursive                  2.0868+-0.0316          2.0301+-0.0416         might be 1.0279x faster
   crypto-aes                             7.0519+-0.2251    ?     7.1199+-0.1561       ?
   crypto-md5                             2.5047+-0.0883          2.4201+-0.0597         might be 1.0350x faster
   crypto-sha1                            2.2331+-0.0515          2.2006+-0.0487         might be 1.0148x faster
   date-format-tofte                     10.0555+-0.1892          9.8771+-0.2245         might be 1.0181x faster
   date-format-xparb                      9.5833+-0.3081          9.2216+-0.2473         might be 1.0392x faster
   math-cordic                            6.4352+-0.0924    ?     6.5132+-0.1601       ? might be 1.0121x slower
   math-partial-sums                      7.6175+-0.1906    ?     7.6280+-0.1116       ?
   math-spectral-norm                     2.3980+-0.0415          2.3813+-0.0379       
   regexp-dna                             7.8736+-0.1167          7.6310+-0.1414         might be 1.0318x faster
   string-base64                          4.5333+-0.1879          4.4438+-0.1158         might be 1.0201x faster
   string-fasta                           6.6275+-0.1638    ?     6.6987+-0.0972       ? might be 1.0107x slower
   string-tagcloud                       11.0601+-0.2346    ?    11.2794+-0.2285       ? might be 1.0198x slower
   string-unpack-code                    18.9842+-0.2672    ?    19.0081+-0.2993       ?
   string-validate-input                  5.7583+-0.1332          5.7057+-0.0937       

   <arithmetic> *                         5.8834+-0.0304    ?     5.8834+-0.0400       ? might be 1.0000x slower
   <geometric>                            4.8051+-0.0255          4.8001+-0.0229         might be 1.0010x faster
   <harmonic>                             3.8885+-0.0321          3.8688+-0.0166         might be 1.0051x faster

                                            TipOfTree            StoreCodeOrigin                                 
V8:
   crypto                                71.9521+-0.3930    ^    70.3502+-0.5188       ^ definitely 1.0228x faster
   deltablue                            142.0428+-1.3598    ?   142.0884+-1.2110       ?
   earley-boyer                          79.2531+-2.4090         77.8141+-2.0773         might be 1.0185x faster
   raytrace                              47.5646+-0.4028    ?    47.6769+-0.6173       ?
   regexp                                84.5265+-0.5830    ?    85.1192+-0.4100       ?
   richards                             121.1500+-0.7152    ^   118.5520+-0.9332       ^ definitely 1.0219x faster
   splay                                 57.3965+-0.3860    ?    58.0787+-0.6424       ? might be 1.0119x slower

   <arithmetic>                          86.2694+-0.3647         85.6685+-0.5225         might be 1.0070x faster
   <geometric> *                         80.8677+-0.3846         80.3972+-0.5209         might be 1.0059x faster
   <harmonic>                            75.9891+-0.3857         75.6687+-0.5110         might be 1.0042x faster

                                            TipOfTree            StoreCodeOrigin                                 
Kraken:
   ai-astar                             484.3776+-3.6504        479.0608+-2.8705         might be 1.0111x faster
   audio-beat-detection                 173.4070+-1.0985        172.9341+-0.6769       
   audio-dft                            296.0445+-3.8752        295.3934+-2.0775       
   audio-fft                            107.4654+-0.3741        106.4235+-0.8071       
   audio-oscillator                     271.0532+-2.9626        269.9187+-1.2035       
   imaging-darkroom                     273.6805+-5.9648        272.0281+-5.6411       
   imaging-desaturate                   210.8776+-0.5189    ?   211.2534+-1.1194       ?
   imaging-gaussian-blur                490.1310+-1.5792    ?   492.9216+-7.8859       ?
   json-parse-financial                  51.6447+-0.1727    !    53.4904+-0.1926       ! definitely 1.0357x slower
   json-stringify-tinderbox              74.3663+-0.5488    ^    73.0956+-0.4359       ^ definitely 1.0174x faster
   stanford-crypto-aes                   90.5814+-0.3143    ?    90.6512+-0.9003       ?
   stanford-crypto-ccm                   93.0167+-0.8390    !    95.6212+-1.7626       ! definitely 1.0280x slower
   stanford-crypto-pbkdf2               165.7952+-0.9580    !   168.1087+-1.2082       ! definitely 1.0140x slower
   stanford-crypto-sha256-iterative      80.3811+-0.5386    ?    80.5351+-0.3193       ?

   <arithmetic> *                       204.4873+-0.4905        204.3883+-0.8716         might be 1.0005x faster
   <geometric>                          162.3331+-0.2468    ?   162.7192+-0.4095       ? might be 1.0024x slower
   <harmonic>                           129.9590+-0.1475    !   130.7860+-0.2014       ! definitely 1.0064x slower

                                            TipOfTree            StoreCodeOrigin                                 
All benchmarks:
   <arithmetic>                          77.0144+-0.1883         76.8954+-0.3031         might be 1.0015x faster
   <geometric>                           20.8768+-0.0781         20.8614+-0.0734         might be 1.0007x faster
   <harmonic>                             6.8252+-0.0551          6.7919+-0.0286         might be 1.0049x faster

                                            TipOfTree            StoreCodeOrigin                                 
Geomean of preferred means:
   <scaled-result>                       45.9924+-0.1557         45.8952+-0.2010         might be 1.0021x faster

[pizlo@nitroflex bencher]
Comment 5 Early Warning System Bot 2012-01-25 21:26:01 PST
Comment on attachment 124060 [details]
work in progress

Attachment 124060 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/11350197
Comment 6 Filip Pizlo 2012-01-25 22:38:14 PST
Created attachment 124066 [details]
the patch

- hopefully fixed build issues, but not sure, because I don't see them on my box.

- added loads of tests.

- found one additional, related, bug that the testing revealed: inlining code was not correctly setting aside all 6 call frame slots because of an off-by-one bug.
Comment 7 Early Warning System Bot 2012-01-25 23:00:35 PST
Comment on attachment 124066 [details]
the patch

Attachment 124066 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/11163210
Comment 8 Gyuyoung Kim 2012-01-25 23:05:18 PST
Comment on attachment 124066 [details]
the patch

Attachment 124066 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/11314486
Comment 9 WebKit Review Bot 2012-01-25 23:13:51 PST
Comment on attachment 124066 [details]
the patch

Attachment 124066 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/11191222

New failing tests:
fast/js/dfg-inline-arguments-use-from-all-the-places-broken.html
Comment 10 Filip Pizlo 2012-01-25 23:30:13 PST
Created attachment 124067 [details]
the patch

What do you know, if your patch is missing a file, then all bots will be red.
Comment 11 WebKit Review Bot 2012-01-25 23:57:32 PST
Comment on attachment 124067 [details]
the patch

Attachment 124067 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/11350245

New failing tests:
fast/js/dfg-inline-arguments-use-from-all-the-places-broken.html
Comment 12 Geoffrey Garen 2012-01-26 15:02:27 PST
Comment on attachment 124067 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=124067&action=review

Need some Windows JavaScriptCore.def love:

3>JavaScriptCore.exp : error LNK2001: unresolved external symbol "public: class JSC::JSValue __thiscall JSC::Interpreter::retrieveCaller(class JSC::ExecState *,class JSC::JSFunction *)const " (?retrieveCaller@Interpreter@JSC@@QBE?AVJSValue@2@PAVExecState@2@PAVJSFunction@2@@Z)

Need to double-check that we're actually passing this test:

  fast/js/dfg-inline-arguments-use-from-all-the-places-broken.html -> unexpected text diff mismatch

> Source/JavaScriptCore/jit/JITStubs.h:287
> +    inline bool returnAddressIsInCallTrampoline(ReturnAddressPtr returnAddress)

Small quibble: I don't think it's right to name this the "call trampoline", since it's used for entry to global code as well. I would call it "returnAddressIsInCtiTrampoline", to match the C function name.
Comment 13 Filip Pizlo 2012-01-26 15:17:51 PST
(In reply to comment #12)
> (From update of attachment 124067 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=124067&action=review
> 
> Need some Windows JavaScriptCore.def love:
> 
> 3>JavaScriptCore.exp : error LNK2001: unresolved external symbol "public: class JSC::JSValue __thiscall JSC::Interpreter::retrieveCaller(class JSC::ExecState *,class JSC::JSFunction *)const " (?retrieveCaller@Interpreter@JSC@@QBE?AVJSValue@2@PAVExecState@2@PAVJSFunction@2@@Z)

Ah, that's the error!  Thanks, I was trying to grok the Windows error output and couldn't figure out out. ;-)

> 
> Need to double-check that we're actually passing this test:
> 
>   fast/js/dfg-inline-arguments-use-from-all-the-places-broken.html -> unexpected text diff mismatch

Found it.  I stupidly changed the description() text after generating the expected file.

> 
> > Source/JavaScriptCore/jit/JITStubs.h:287
> > +    inline bool returnAddressIsInCallTrampoline(ReturnAddressPtr returnAddress)
> 
> Small quibble: I don't think it's right to name this the "call trampoline", since it's used for entry to global code as well. I would call it "returnAddressIsInCtiTrampoline", to match the C function name.

Yup, that's better.  Will change.
Comment 14 Filip Pizlo 2012-01-26 15:22:46 PST
Created attachment 124188 [details]
the patch

Made changes suggested by Geoff.  Putting up for EWS.  Will land if I can get it to go green.
Comment 15 Filip Pizlo 2012-01-26 15:32:26 PST
Created attachment 124194 [details]
the patch

Second attempt, rebased patch.
Comment 16 WebKit Review Bot 2012-01-26 16:08:11 PST
Comment on attachment 124194 [details]
the patch

Attachment 124194 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/11357103

New failing tests:
fast/js/dfg-inline-arguments-use-from-all-the-places-broken.html
Comment 17 Filip Pizlo 2012-01-26 16:59:31 PST
(In reply to comment #16)
> (From update of attachment 124194 [details])
> Attachment 124194 [details] did not pass chromium-ews (chromium-xvfb):
> Output: http://queues.webkit.org/results/11357103
> 
> New failing tests:
> fast/js/dfg-inline-arguments-use-from-all-the-places-broken.html

This test passes on Mac.
Comment 18 Filip Pizlo 2012-01-26 17:15:54 PST
Landed in http://trac.webkit.org/changeset/106067