Bug 144458 - JIT call inline caches should cache calls to objects with getCallData/getConstructData traps
Summary: JIT call inline caches should cache calls to objects with getCallData/getCons...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Yusuke Suzuki
URL:
Keywords: InRadar
Depends on:
Blocks: 178064
  Show dependency treegraph
 
Reported: 2015-04-30 10:42 PDT by Filip Pizlo
Modified: 2017-11-15 13:11 PST (History)
9 users (show)

See Also:


Attachments
Patch (151.02 KB, patch)
2017-10-19 09:55 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (153.94 KB, patch)
2017-10-19 10:31 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (174.06 KB, patch)
2017-10-20 02:12 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (176.82 KB, patch)
2017-10-20 06:11 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (176.90 KB, patch)
2017-10-20 06:29 PDT, Yusuke Suzuki
no flags Details | Formatted Diff | Diff
Patch (176.68 KB, patch)
2017-10-22 08:04 PDT, Yusuke Suzuki
sbarati: review+
Details | Formatted Diff | Diff
Patch (180.57 KB, patch)
2017-11-05 22:31 PST, Yusuke Suzuki
commit-queue: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews113 for mac-elcapitan (513.04 KB, application/zip)
2017-11-05 23:24 PST, Build Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2015-04-30 10:42:49 PDT
This would be a speed-up for InternalFunction calls, since currently we take the slow path every time.
Comment 1 Yusuke Suzuki 2017-10-19 09:05:38 PDT
It turns out that we should cache InternalFunction too to optimize builtin constructors!
Comment 2 Yusuke Suzuki 2017-10-19 09:55:42 PDT
Created attachment 324243 [details]
Patch
Comment 3 Yusuke Suzuki 2017-10-19 10:05:52 PDT
I've added JSC part. WebCore and other part is not changed right now.
But I think it is good for early review for the direction.

My change is simple: InternalFunction has NativeFunction for call and construct. Our call linking mechanism now recognizes InternalFunction too. And cache it.
Comment 4 Yusuke Suzuki 2017-10-19 10:31:42 PDT
Created attachment 324248 [details]
Patch
Comment 5 Yusuke Suzuki 2017-10-19 11:19:58 PDT
Performance in Octane / Kraken is neutral since they do not use these internal functions *so* frequently.

                                                 baseline                  patched            
Octane:
   encrypt                                   0.22105+-0.00624          0.21790+-0.00242         might be 1.0145x faster
   decrypt                                   3.31623+-0.17153          3.23261+-0.05183         might be 1.0259x faster
   deltablue                        x2       0.18744+-0.00351          0.18728+-0.00358       
   earley                                    0.39254+-0.00363    ?     0.39794+-0.00688       ? might be 1.0138x slower
   boyer                                     7.15169+-0.05381          7.10940+-0.08521       
   navier-stokes                    x2       5.19215+-0.05461    ?     5.21819+-0.06407       ?
   raytrace                         x2       1.10587+-0.01281    ?     1.10979+-0.01747       ?
   richards                         x2       0.11294+-0.00050    ?     0.11534+-0.00307       ? might be 1.0212x slower
   splay                            x2       0.42048+-0.00285    ?     0.42049+-0.00402       ?
   regexp                           x2      25.52627+-0.38065    ?    25.71624+-0.50927       ?
   pdfjs                            x2      55.52976+-1.68890         55.03439+-1.19541       
   mandreel                         x2      59.38075+-0.38568    ?    60.54147+-2.59337       ? might be 1.0195x slower
   gbemu                            x2      55.71041+-2.58057    ?    56.87792+-3.20713       ? might be 1.0210x slower
   closure                                   0.76924+-0.01966          0.76212+-0.01816       
   jquery                                    9.65424+-0.29193    ?     9.70115+-0.34738       ?
   box2d                            x2      13.92739+-0.41589         13.63308+-0.43132         might be 1.0216x faster
   zlib                             x2     411.39828+-4.62837        409.83302+-5.16500       
   typescript                       x2     896.05240+-26.12225       884.70540+-28.31109        might be 1.0128x faster

   <geometric>                               6.99406+-0.03541    ?     6.99673+-0.05489       ? might be 1.0004x slower

                                                 baseline                  patched                                      
Kraken:
   ai-astar                                  155.656+-5.401      ?     159.362+-5.413         ? might be 1.0238x slower
   audio-beat-detection                       97.312+-4.552             95.973+-5.641           might be 1.0140x faster
   audio-dft                                 102.382+-2.521      ?     104.500+-2.568         ? might be 1.0207x slower
   audio-fft                                  81.170+-3.159      ?      81.608+-4.068         ?
   audio-oscillator                           69.664+-2.229             68.811+-2.180           might be 1.0124x faster
   imaging-darkroom                          118.960+-3.542      ?     120.291+-3.025         ? might be 1.0112x slower
   imaging-desaturate                         72.570+-2.137      ?      73.991+-1.949         ? might be 1.0196x slower
   imaging-gaussian-blur                     112.376+-3.677            109.583+-3.527           might be 1.0255x faster
   json-parse-financial                       63.624+-3.165      ?      63.981+-1.585         ?
   json-stringify-tinderbox                   40.194+-1.369             39.880+-1.300         
   stanford-crypto-aes                        64.451+-2.590             63.676+-1.883           might be 1.0122x faster
   stanford-crypto-ccm                        55.514+-2.347      ?      57.434+-2.512         ? might be 1.0346x slower
   stanford-crypto-pbkdf2                     96.267+-3.208             95.327+-3.317         
   stanford-crypto-sha256-iterative           32.090+-1.078      ?      33.964+-1.491         ? might be 1.0584x slower

   <arithmetic>                               83.016+-0.923      ?      83.456+-0.941         ? might be 1.0053x slower
Comment 6 Yusuke Suzuki 2017-10-20 02:12:19 PDT
Created attachment 324379 [details]
Patch
Comment 7 Build Bot 2017-10-20 03:00:17 PDT
Comment on attachment 324379 [details]
Patch

Attachment 324379 [details] did not pass jsc-ews (mac):
Output: http://webkit-queues.webkit.org/results/4930856

New failing tests:
stress/tail-call-host-call-throw.js.ftl-no-cjit-small-pool
stress/tail-call-host-call-throw.js.default
Comment 8 Yusuke Suzuki 2017-10-20 06:11:19 PDT
Created attachment 324389 [details]
Patch
Comment 9 Yusuke Suzuki 2017-10-20 06:29:16 PDT
Created attachment 324390 [details]
Patch
Comment 10 Yusuke Suzuki 2017-10-22 08:04:20 PDT
Created attachment 324527 [details]
Patch

Just rebaselining
Comment 11 Yusuke Suzuki 2017-10-23 09:27:42 PDT
Personally, *in the future*, Iā€™m considering making JSDOMConstructor InternalFunction later to accelerate custom elements, like,

class Derived extends HTMLElement {
}

case. And using InternalFunction structure cache for the resulted object.
Comment 12 Yusuke Suzuki 2017-11-03 12:27:56 PDT
Ping?
Comment 13 Saam Barati 2017-11-05 21:12:47 PST
Comment on attachment 324527 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=324527&action=review

r=me

> Source/JavaScriptCore/ChangeLog:8
> +        Previously only JSFunction is handled by CallLinkInfo's caching mechanism. This means that any

"that any" => "that"

> Source/JavaScriptCore/ChangeLog:12
> +        2. CallLinkInfo tells nothing in the higher tier JITs.

"tells nothing" => "tells us nothing"?

> Source/JavaScriptCore/ChangeLog:19
> +        for InternalFunction. Previously we do not record any information to CallLinkInfo. Except for the

"we do not" => "we did not"

> Source/JavaScriptCore/ChangeLog:22
> +        nodes for these InternalFunctions since CallLinkInfo tells nothing. This patch teaches CallLinkInfo
> +        handling InternalFunction too.

This last sentence is unneeded.

> Source/JavaScriptCore/ChangeLog:24
> +        Attached microbenchmarks show performance improvement.

nice

> Source/JavaScriptCore/bytecode/CallLinkInfo.cpp:268
> +            handleSpecificCallee(static_cast<JSFunction*>(lastSeenCallee()));

Nit: Even though we're checking the type(), I think jsCast is the more canonical cast and it also gives us even extra debug assertions.

> Source/JavaScriptCore/jit/JITOperations.cpp:944
> +            ASSERT(!!codePtr);

nit: Might as well make it RELEASE_ASSERT since this check isn't performance sensitive.

> Source/JavaScriptCore/jit/Repatch.cpp:956
> +            // FIXME: We could add a fast path for InternalFunction with closure call.

Please make a bug and link to it here or remove FIXME

> Source/JavaScriptCore/runtime/InternalFunction.cpp:95
> +    if (function->m_functionForConstruct == callHostFunctionAsConstructor)

Why use this as a flag instead of nullptr?
Comment 14 Saam Barati 2017-11-05 21:14:40 PST
Comment on attachment 324527 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=324527&action=review

> Source/JavaScriptCore/runtime/InternalFunction.cpp:-48
> -    ASSERT(methodTable(vm)->getCallData != InternalFunction::info()->methodTable.getCallData);

Do we want to assert the opposite now? That this isn't overridden? Maybe we also want to assert type() == InternalFunctionType?
Comment 15 Yusuke Suzuki 2017-11-05 22:28:22 PST
Comment on attachment 324527 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=324527&action=review

Thank you!!

>> Source/JavaScriptCore/ChangeLog:8
>> +        Previously only JSFunction is handled by CallLinkInfo's caching mechanism. This means that any
> 
> "that any" => "that"

Fixed.

>> Source/JavaScriptCore/ChangeLog:12
>> +        2. CallLinkInfo tells nothing in the higher tier JITs.
> 
> "tells nothing" => "tells us nothing"?

Fixed.

>> Source/JavaScriptCore/ChangeLog:19
>> +        for InternalFunction. Previously we do not record any information to CallLinkInfo. Except for the
> 
> "we do not" => "we did not"

Fixed.

>> Source/JavaScriptCore/ChangeLog:22
>> +        handling InternalFunction too.
> 
> This last sentence is unneeded.

Dropped.

>> Source/JavaScriptCore/bytecode/CallLinkInfo.cpp:268
>> +            handleSpecificCallee(static_cast<JSFunction*>(lastSeenCallee()));
> 
> Nit: Even though we're checking the type(), I think jsCast is the more canonical cast and it also gives us even extra debug assertions.

Changed it to `jsCast`.

>> Source/JavaScriptCore/jit/JITOperations.cpp:944
>> +            ASSERT(!!codePtr);
> 
> nit: Might as well make it RELEASE_ASSERT since this check isn't performance sensitive.

OK, fixed.

>> Source/JavaScriptCore/jit/Repatch.cpp:956
>> +            // FIXME: We could add a fast path for InternalFunction with closure call.
> 
> Please make a bug and link to it here or remove FIXME

Filed and noted the URL.

>> Source/JavaScriptCore/runtime/InternalFunction.cpp:-48
>> -    ASSERT(methodTable(vm)->getCallData != InternalFunction::info()->methodTable.getCallData);
> 
> Do we want to assert the opposite now? That this isn't overridden? Maybe we also want to assert type() == InternalFunctionType?

Sounds nice. Fixed.

>> Source/JavaScriptCore/runtime/InternalFunction.cpp:95
>> +    if (function->m_functionForConstruct == callHostFunctionAsConstructor)
> 
> Why use this as a flag instead of nullptr?

This is because we do not want to check nullptr when calling InternalFunction from JIT's stub.
Calling non-constructible InternalFunction as constructor should be super rare. Such a check should not be done when calling.
Comment 16 Yusuke Suzuki 2017-11-05 22:31:41 PST
Created attachment 326102 [details]
Patch

Patch for landing
Comment 17 WebKit Commit Bot 2017-11-05 22:33:32 PST
Comment on attachment 326102 [details]
Patch

Rejecting attachment 326102 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-03', 'validate-changelog', '--check-oops', '--non-interactive', 326102, '--port=mac']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit

ChangeLog entry in JSTests/ChangeLog contains OOPS!.

Full output: http://webkit-queues.webkit.org/results/5119174
Comment 18 Build Bot 2017-11-05 23:24:25 PST
Comment on attachment 326102 [details]
Patch

Attachment 326102 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/5119345

Number of test failures exceeded the failure limit.
Comment 19 Build Bot 2017-11-05 23:24:26 PST
Created attachment 326104 [details]
Archive of layout-test-results from ews113 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews113  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 20 Yusuke Suzuki 2017-11-06 06:40:27 PST
Committed r224487: <https://trac.webkit.org/changeset/224487>
Comment 21 Radar WebKit Bug Importer 2017-11-15 13:11:12 PST
<rdar://problem/35568940>