158083 – LLInt should support other types of prototype GetById caching.

RESOLVED WONTFIX 158083

LLInt should support other types of prototype GetById caching.

https://bugs.webkit.org/show_bug.cgi?id=158083

Summary LLInt should support other types of prototype GetById caching.

Keith Miller

Reported 2016-05-25 13:22:39 PDT

Currently, the LLInt supports prototype load caching. It might also be profitable to cache prototype misses, accessors, or customs.

Attachments
Patch (13.11 KB, patch) 2016-06-03 02:57 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Archive of layout-test-results from ews103 for mac-yosemite (980.15 KB, application/zip) 2016-06-03 03:58 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews106 for mac-yosemite-wk2 (1.28 MB, application/zip) 2016-06-03 04:01 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews114 for mac-yosemite (1.72 MB, application/zip) 2016-06-03 04:04 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews122 for ios-simulator-wk2 (670.75 KB, application/zip) 2016-06-03 04:14 PDT, Build Bot	no flags	Details
Patch (13.73 KB, patch) 2016-06-03 09:45 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (17.15 KB, patch) 2016-06-05 01:17 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (14.49 KB, patch) 2016-06-05 13:21 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Base's Benchmark output (43.03 KB, text/plain) 2016-06-05 19:56 PDT, Caio Lima	no flags	Details
Cached Accessor's benchmark output (43.03 KB, text/plain) 2016-06-05 19:57 PDT, Caio Lima	no flags	Details
Benchmark (77.44 KB, text/plain) 2016-06-06 01:47 PDT, Caio Lima	no flags	Details
Patch (23.42 KB, patch) 2016-06-13 22:52 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (21.31 KB, patch) 2016-06-13 23:08 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Archive of layout-test-results from ews100 for mac-yosemite (502.35 KB, application/zip) 2016-06-13 23:55 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews107 for mac-yosemite-wk2 (414.01 KB, application/zip) 2016-06-13 23:57 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews115 for mac-yosemite (490.82 KB, application/zip) 2016-06-13 23:57 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews124 for ios-simulator-wk2 (509.37 KB, application/zip) 2016-06-13 23:58 PDT, Build Bot	no flags	Details
Patch (20.63 KB, patch) 2016-06-15 20:30 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Archive of layout-test-results from ews101 for mac-yosemite (948.69 KB, application/zip) 2016-06-15 21:28 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews107 for mac-yosemite-wk2 (811.62 KB, application/zip) 2016-06-15 21:32 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews124 for ios-simulator-wk2 (657.22 KB, application/zip) 2016-06-15 21:36 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews117 for mac-yosemite (1.43 MB, application/zip) 2016-06-15 23:48 PDT, Build Bot	no flags	Details
Patch (20.61 KB, patch) 2016-06-16 23:27 PDT, Caio Lima	no flags	Details Formatted Diff Diff
speedometer run comparison with and without patch applied (6.04 MB, application/octet-stream) 2016-06-16 23:32 PDT, Caio Lima	no flags	Details
Archive of layout-test-results from ews103 for mac-yosemite (947.48 KB, application/zip) 2016-06-17 00:26 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews106 for mac-yosemite-wk2 (839.46 KB, application/zip) 2016-06-17 00:29 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews125 for ios-simulator-wk2 (681.04 KB, application/zip) 2016-06-17 00:36 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews116 for mac-yosemite (1.43 MB, application/zip) 2016-06-17 00:38 PDT, Build Bot	no flags	Details
Patch (23.40 KB, patch) 2016-06-17 01:06 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (23.48 KB, patch) 2016-06-17 01:34 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (20.98 KB, patch) 2016-06-18 01:55 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (20.95 KB, patch) 2016-06-28 23:59 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Archive of layout-test-results from ews126 for ios-simulator-wk2 (1.01 MB, application/zip) 2016-06-29 01:02 PDT, Build Bot	no flags	Details
Patch (21.91 KB, patch) 2016-06-30 01:15 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (21.96 KB, patch) 2016-07-11 01:31 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (22.71 KB, patch) 2016-08-02 00:21 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (22.62 KB, patch) 2016-08-02 00:34 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (22.92 KB, patch) 2016-08-06 22:27 PDT, Caio Lima	no flags	Details Formatted Diff Diff
Patch (23.03 KB, patch) 2016-10-08 22:47 PDT, Caio Lima	beidson: review-	Details Formatted Diff Diff
Show Obsolete (37) View All Add attachment proposed patch, testcase, etc.

Caio Lima

Comment 1 2016-06-02 13:12:23 PDT

Hey Keith, I am interested in work in this bug if it is possible Cold you clarify some points to me? 1. What are prototype misses? 2. When you think in cache accessors, you mean getters and setters? 3. Also, for customs It could be a big help if you post some JavaScript code as example. I am trying to debug the case for accessors and I noticed that the getter is emitted as get_by_id op and I suspect it is already being cached (I didn't check it).

Radar WebKit Bug Importer

Comment 2 2016-06-02 13:12:58 PDT

<rdar://problem/26607441>

Keith Miller

Comment 3 2016-06-02 15:57:19 PDT

(In reply to comment #1) > Hey Keith, I am interested in work in this bug if it is possible > > Cold you clarify some points to me? > > 1. What are prototype misses? > > 2. When you think in cache accessors, you mean getters and setters? > > 3. Also, for customs > > It could be a big help if you post some JavaScript code as example. > > I am trying to debug the case for accessors and I noticed that the getter is > emitted as get_by_id op and I suspect it is already being cached (I didn't > check it). Hi Caio, It would be great if you fixed this feature and I would be happy to help. Feel free to message me on irc, my nick is keith_miller, if you have any other questions. 1) prototype miss was the wrong terminology. I should have said that we should cache unset properties. Meaning, `let x = foo.bar` where `"bar" in foo === false`. I actually already handled this case in http://trac.webkit.org/changeset/201456. 2) Yep, that's right. In JavaScriptCore we use the word Accessor to mean a property with a getter and/or a setter. For example, if you did the following: foo = {}; Object.defineOwnProperty(Object.prototype, "bar", { get: () => { return 1; } }); print(foo.bar); The bytecodes will look something like: [ 9] resolve_scope loc10, loc3, print(@id0), <GlobalProperty>, 1, 0x113de7900 [ 16] get_from_scope loc6, loc10, print(@id0), 2048<ThrowIfNotFound|GlobalProperty|NotInitialization>, 122 predicting None [ 24] resolve_scope loc11, loc3, foo(@id1), <GlobalProperty>, 1, 0x113de7900 [ 31] get_from_scope loc12, loc11, foo(@id1), 2048<ThrowIfNotFound|GlobalProperty|NotInitialization>, 191 predicting None [ 39] get_by_id loc9, loc12, bar(@id2) predicting None [ 48] call loc5, loc6, 2, 16 status(Could Take Slow Path) Original; predicting None That get_by_id will attempt to lookup the property on foo by calling foo's getOwnPropertySlot function, which will set the passed PropertySlot& to Object.prototype's "bar" GetterSetter object. In this case, the PropertySlot will say that the value is an GetterSetter and we won't cache it. 3) Customs are mostly used as a performance optimization for DOM Accessors. Basically, instead of allocating a JSFunction object for all accessor properties in the DOM we use a function pointer instead. You can see the typedef, GetValueFunc, in PropertySlot.h. Customs are also used in a small number of other places for special properties that are values but change might change, although that is quite rare. I would recommend for a first pass that you implement a new opcode, I guess called op_get_by_id_proto_accessor, that just calls to a slow path. In the slow path it can load the GetterSetter object and call the getter returning the result. If you are feeling very ambitious, you could inline the code for the call in the interpreter assembly. Let me know if you plan on trying to inline the call since you will need to know the JavaScriptCore calling convention to implement it.

Caio Lima

Comment 4 2016-06-03 02:57:17 PDT

Created attachment 280434 [details] Patch

Caio Lima

Comment 5 2016-06-03 03:02:13 PDT

(In reply to comment #3) > (In reply to comment #1) > > Hey Keith, I am interested in work in this bug if it is possible > > > > Cold you clarify some points to me? > > > > 1. What are prototype misses? > > > > 2. When you think in cache accessors, you mean getters and setters? > > > > 3. Also, for customs > > > > It could be a big help if you post some JavaScript code as example. > > > > I am trying to debug the case for accessors and I noticed that the getter is > > emitted as get_by_id op and I suspect it is already being cached (I didn't > > check it). > > Hi Caio, > > It would be great if you fixed this feature and I would be happy to help. > Feel free to message me on irc, my nick is keith_miller, if you have any > other questions. > > 1) prototype miss was the wrong terminology. I should have said that we > should cache unset properties. Meaning, `let x = foo.bar` where `"bar" in > foo === false`. I actually already handled this case in > http://trac.webkit.org/changeset/201456. > > 2) Yep, that's right. In JavaScriptCore we use the word Accessor to mean a > property with a getter and/or a setter. > > For example, if you did the following: > > foo = {}; > Object.defineOwnProperty(Object.prototype, "bar", { get: () => { return 1; } > }); > print(foo.bar); > > The bytecodes will look something like: > > [ 9] resolve_scope loc10, loc3, print(@id0), <GlobalProperty>, 1, > 0x113de7900 > [ 16] get_from_scope loc6, loc10, print(@id0), > 2048<ThrowIfNotFound|GlobalProperty|NotInitialization>, 122 predicting > None > [ 24] resolve_scope loc11, loc3, foo(@id1), <GlobalProperty>, 1, > 0x113de7900 > [ 31] get_from_scope loc12, loc11, foo(@id1), > 2048<ThrowIfNotFound|GlobalProperty|NotInitialization>, 191 predicting > None > [ 39] get_by_id loc9, loc12, bar(@id2) predicting None > [ 48] call loc5, loc6, 2, 16 status(Could Take Slow Path) > Original; predicting None > > That get_by_id will attempt to lookup the property on foo by calling foo's > getOwnPropertySlot function, which will set the passed PropertySlot& to > Object.prototype's "bar" GetterSetter object. In this case, the PropertySlot > will say that the value is an GetterSetter and we won't cache it. > > 3) Customs are mostly used as a performance optimization for DOM Accessors. > Basically, instead of allocating a JSFunction object for all accessor > properties in the DOM we use a function pointer instead. You can see the > typedef, GetValueFunc, in PropertySlot.h. Customs are also used in a small > number of other places for special properties that are values but change > might change, although that is quite rare. > > I would recommend for a first pass that you implement a new opcode, I guess > called op_get_by_id_proto_accessor, that just calls to a slow path. In the > slow path it can load the GetterSetter object and call the getter returning > the result. If you are feeling very ambitious, you could inline the code for > the call in the interpreter assembly. Let me know if you plan on trying to > inline the call since you will need to know the JavaScriptCore calling > convention to implement it. Thanks for the clarification Keith. I am sending a prototype of what you described, but I definitely want to learn how can I inline a function call =).

Build Bot

Comment 6 2016-06-03 03:58:17 PDT

Comment on attachment 280434 [details] Patch Attachment 280434 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/1427950 New failing tests: webgl/1.0.2/conformance/typedarrays/typed-arrays-in-workers.html js/regress/string-prototype-search-observable-side-effects.html webgl/1.0.2/conformance/more/conformance/quickCheckAPI-B4.html fast/files/workers/worker-read-file-async.html storage/indexeddb/structured-clone-private.html fast/canvas/webgl/arraybuffer-transfer-of-control.html js/regress/string-prototype-split-observable-side-effects.html webgl/1.0.2/conformance/state/gl-get-calls.html js/regress/regexp-prototype-split-observable-side-effects.html fast/files/read-file-async.html fast/files/workers/worker-read-file-sync.html storage/indexeddb/structured-clone.html fast/canvas/webgl/array-message-passing.html js/regress/regexp-prototype-search-observable-side-effects.html js/pic/cached-named-property-getter.html fast/canvas/webgl/typed-arrays-in-workers.html fast/canvas/webgl/gl-get-calls.html

Build Bot

Comment 7 2016-06-03 03:58:22 PDT

Created attachment 280435 [details] Archive of layout-test-results from ews103 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-yosemite Platform: Mac OS X 10.10.5

Build Bot

Comment 8 2016-06-03 04:01:15 PDT

Comment on attachment 280434 [details] Patch Attachment 280434 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/1427959 New failing tests: webgl/1.0.2/conformance/typedarrays/typed-arrays-in-workers.html js/regress/string-prototype-search-observable-side-effects.html webgl/1.0.2/conformance/more/conformance/quickCheckAPI-B4.html fast/canvas/webgl/arraybuffer-transfer-of-control.html js/regress/string-prototype-split-observable-side-effects.html webgl/1.0.2/conformance/state/gl-get-calls.html js/pic/cached-named-property-getter.html js/regress/regexp-prototype-split-observable-side-effects.html fast/canvas/webgl/array-message-passing.html js/regress/regexp-prototype-search-observable-side-effects.html fast/canvas/webgl/typed-arrays-in-workers.html fast/canvas/webgl/gl-get-calls.html

Build Bot

Comment 9 2016-06-03 04:01:19 PDT

Created attachment 280436 [details] Archive of layout-test-results from ews106 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews106 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5

Build Bot

Comment 10 2016-06-03 04:04:01 PDT

Comment on attachment 280434 [details] Patch Attachment 280434 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/1427956 Number of test failures exceeded the failure limit.

Build Bot

Comment 11 2016-06-03 04:04:05 PDT

Created attachment 280437 [details] Archive of layout-test-results from ews114 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-yosemite Platform: Mac OS X 10.10.5

Build Bot

Comment 12 2016-06-03 04:14:51 PDT

Comment on attachment 280434 [details] Patch Attachment 280434 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/1427981 New failing tests: js/regress/regexp-prototype-split-observable-side-effects.html js/regress/string-prototype-search-observable-side-effects.html js/regress/string-prototype-split-observable-side-effects.html js/pic/cached-named-property-getter.html js/arraybuffer-wrappers.html js/regress/regexp-prototype-search-observable-side-effects.html

Build Bot

Comment 13 2016-06-03 04:14:56 PDT

Created attachment 280438 [details] Archive of layout-test-results from ews122 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews122 Port: ios-simulator-wk2 Platform: Mac OS X 10.11.4

Caio Lima

Comment 14 2016-06-03 09:45:58 PDT

Created attachment 280446 [details] Patch

Geoffrey Garen

Comment 15 2016-06-03 10:53:46 PDT

Lost of crashes on the mac-debug bot: fast/loader/stateobjects/state-attribute-history-getter.html [ Crash ] fast/regex/toString.html [ Crash ] fast/workers/stress-js-execution.html [ Crash ] ietestcenter/Javascript/11.1.5-0-1.html [ Crash ] ietestcenter/Javascript/11.1.5-0-2.html [ Crash ] ietestcenter/Javascript/15.10.7.1-1.html [ Crash ] ietestcenter/Javascript/15.10.7.2-1.html [ Crash ] inspector/css/stylesheet-events-multiple-documents.html [ Crash ] inspector/debugger/searchInContent-linebreaks.html [ Crash ] inspector/debugger/setBreakpoint-actions.html [ Crash ] inspector/page/searchInResources.html [ Crash ] inspector/runtime/getProperties.html [ Crash ] js/Object-create.html [ Crash ] js/array-defineOwnProperty.html [ Crash ] js/basic-computed-property-name.html [ Crash ] js/caller-property.html [ Crash ] js/dom/Object-defineProperty.html [ Crash ] js/dom/call-base-resolution.html [ Crash ] js/dom/exception-sequencing-binops2.html [ Crash ] js/kde/StringObject.html [ Crash ] js/kde/prototype_proto.html [ Crash ] js/pic/cached-getter-setter.html [ Crash ] js/pic/cached-named-property-getter.html [ Crash ] js/regress/getter-no-activation.html [ Crash ] js/regress/getter.html [ Crash ] sputnik/Conformance/15_Native_Objects/15.5_String/15.5.4/15.5.4.9_String.prototype.localeCompare/S15.5.4.9_3.html [ Crash ] sputnik/Conformance/15_Native_Objects/15.5_String/15.5.4/15.5.4.9_String.prototype.localeCompare/S15.5.4.9_A1_T1.html [ Crash ] sputnik/Conformance/15_Native_Objects/15.5_String/15.5.4/15.5.4.9_String.prototype.localeCompare/S15.5.4.9_A1_T2.html [ Crash ] streams/brand-checks.html [ Crash ] You can run these tests locally using run-webkit-tests --debug --dump-render-tree.

Keith Miller

Comment 16 2016-06-03 11:01:04 PDT

> > You can run these tests locally using run-webkit-tests --debug > --dump-render-tree. I would recommend running "run-javascriptcore-tests --debug [--no-build if you already have a debug build of jsc]" first since that doesn't require building all of WebCore and runs most of these tests.

Caio Lima

Comment 17 2016-06-03 11:07:45 PDT

(In reply to comment #15) > Lost of crashes on the mac-debug bot: > > fast/loader/stateobjects/state-attribute-history-getter.html [ Crash ] > fast/regex/toString.html [ Crash ] > fast/workers/stress-js-execution.html [ Crash ] > ietestcenter/Javascript/11.1.5-0-1.html [ Crash ] > ietestcenter/Javascript/11.1.5-0-2.html [ Crash ] > ietestcenter/Javascript/15.10.7.1-1.html [ Crash ] > ietestcenter/Javascript/15.10.7.2-1.html [ Crash ] > inspector/css/stylesheet-events-multiple-documents.html [ Crash ] > inspector/debugger/searchInContent-linebreaks.html [ Crash ] > inspector/debugger/setBreakpoint-actions.html [ Crash ] > inspector/page/searchInResources.html [ Crash ] > inspector/runtime/getProperties.html [ Crash ] > js/Object-create.html [ Crash ] > js/array-defineOwnProperty.html [ Crash ] > js/basic-computed-property-name.html [ Crash ] > js/caller-property.html [ Crash ] > js/dom/Object-defineProperty.html [ Crash ] > js/dom/call-base-resolution.html [ Crash ] > js/dom/exception-sequencing-binops2.html [ Crash ] > js/kde/StringObject.html [ Crash ] > js/kde/prototype_proto.html [ Crash ] > js/pic/cached-getter-setter.html [ Crash ] > js/pic/cached-named-property-getter.html [ Crash ] > js/regress/getter-no-activation.html [ Crash ] > js/regress/getter.html [ Crash ] > > sputnik/Conformance/15_Native_Objects/15.5_String/15.5.4/15.5.4.9_String. > prototype.localeCompare/S15.5.4.9_3.html [ Crash ] > > sputnik/Conformance/15_Native_Objects/15.5_String/15.5.4/15.5.4.9_String. > prototype.localeCompare/S15.5.4.9_A1_T1.html [ Crash ] > > sputnik/Conformance/15_Native_Objects/15.5_String/15.5.4/15.5.4.9_String. > prototype.localeCompare/S15.5.4.9_A1_T2.html [ Crash ] > streams/brand-checks.html [ Crash ] > > > You can run these tests locally using run-webkit-tests --debug > --dump-render-tree. (In reply to comment #14) > Created attachment 280446 [details] Actually, I updated this patch today morning to avoid the crashes, because my implementation was incorrect. These tests are from my first patch, I am going try to run them locally tonight, but is there one way to retrigger them here? BTW, this patch is RFC, not the final Version, because I want to inline the call instead of using the slowPath.

Keith Miller

Comment 18 2016-06-03 11:11:40 PDT

Comment on attachment 280446 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=280446&action=review > Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:703 > + JSObject* slotBase = (JSObject *) pc[6].u.pointer; Style point. This should be: JSObject* slotBase = JSCast<JSObject*>(pc[6].u.jsCell.get()); JSCast has some builtin error checking on debug builds.

Keith Miller

Comment 19 2016-06-03 11:14:31 PDT

> Actually, I updated this patch today morning to avoid the crashes, because > my implementation was incorrect. These tests are from my first patch, I am > going try to run them locally tonight, but is there one way to retrigger > them here? > > BTW, this patch is RFC, not the final Version, because I want to inline the > call instead of using the slowPath. It looks like those crashes were happening on the second patch. You can see by clicking on the mac-debug bubble and looking at the results log. Unfortunately, the only way I know to rerun the tests on the bots is to upload the patch again :/.

Caio Lima

Comment 20 2016-06-03 11:24:19 PDT

(In reply to comment #19) > > Actually, I updated this patch today morning to avoid the crashes, because > > my implementation was incorrect. These tests are from my first patch, I am > > going try to run them locally tonight, but is there one way to retrigger > > them here? > > > > BTW, this patch is RFC, not the final Version, because I want to inline the > > call instead of using the slowPath. > > > It looks like those crashes were happening on the second patch. You can see > by clicking on the mac-debug bubble and looking at the results log. > Unfortunately, the only way I know to rerun the tests on the bots is to > upload the patch again :/. No problem. I am going to fix them tonight, because I have no access to my computer right now.

Keith Miller

Comment 21 2016-06-03 18:50:42 PDT

Per your request before on inlining the getter call, I think the simplest way to implement it is to follow how we do it in PolymorphicAccess/operationGetByIdOptimize. PolymorphicAccess has the inline cache repatching code used in the JITs. The assembler in PolymorphicAccess more or less models X86 so it shouldn't be too hard to follow and translate to LLInt assembly. I'll also see if I can find some better information about the calling convention too. Let me know if you have any other questions.

Keith Miller

Comment 22 2016-06-03 19:05:31 PDT

(In reply to comment #21) > Per your request before on inlining the getter call, I think the simplest > way to implement it is to follow how we do it in > PolymorphicAccess/operationGetByIdOptimize. PolymorphicAccess has the inline > cache repatching code used in the JITs. The assembler in PolymorphicAccess > more or less models X86 so it shouldn't be too hard to follow and translate > to LLInt assembly. I'll also see if I can find some better information about > the calling convention too. Let me know if you have any other questions. Also, it might be helpful to look at JSStack.h, in particular, the CallFrameHeaderEntry enum since that represents the expected layout of a JS frame at the end of the function prologue.

Caio Lima

Comment 23 2016-06-04 18:54:51 PDT

(In reply to comment #22) > (In reply to comment #21) > > Per your request before on inlining the getter call, I think the simplest > > way to implement it is to follow how we do it in > > PolymorphicAccess/operationGetByIdOptimize. PolymorphicAccess has the inline > > cache repatching code used in the JITs. The assembler in PolymorphicAccess > > more or less models X86 so it shouldn't be too hard to follow and translate > > to LLInt assembly. I'll also see if I can find some better information about > > the calling convention too. Let me know if you have any other questions. > > Also, it might be helpful to look at JSStack.h, in particular, the > CallFrameHeaderEntry enum since that represents the expected layout of a JS > frame at the end of the function prologue. Thank you for pointing where I can look. I think I am almost done because I understand and know hot to construct the JS Frame convention for 64bits architecture, however I am missing some very important information. 1) how to perform the call to the Getter function? I searched and found a makeJavascriptCall macro, however I am not sure if it is necessary use it. Also I would like to understand the offlineassembly code "call". 2) Another point is, How can I get the getter JSFunction, since the offset cached point to a "GetterSetter"? Is it possible use "loadp GetterSetter::m_getter[t3], t4", for example? I noticed that the code in PolymorphicAccess uses GetterSetter::offsetGetter(). I suppose that this value should be placed in calle member of JS Frame Entry. I didn't find information about the OfflineAssembly, could you point me some?

Caio Lima

Comment 24 2016-06-05 01:17:46 PDT

Created attachment 280545 [details] Patch

Caio Lima

Comment 25 2016-06-05 01:21:29 PDT

In this patch I fixed the regressions and tried to inline the op_code. I am sending my experimental code commented. However, I am facing problems to get the Getter Function and store it in the Calle member of the JS Stack entry. Could you help me on this?

Keith Miller

Comment 26 2016-06-05 12:05:07 PDT

(In reply to comment #25) > In this patch I fixed the regressions and tried to inline the op_code. I am > sending my experimental code commented. However, I am facing problems to get > the Getter Function and store it in the Calle member of the JS Stack entry. > Could you help me on this? Thinking about this further I realized that in order to inline the call we would need to have CallLinkInfo for the getter Callee. That would tell us what address to call when the function we are calling gets tiered up. Unfortunately, as far as I know, we don't have any kind of CallLinkInfo for the LLInt. Without, CallLinkInfo I think you would need to call into C++ to find the call location anyway. Let me look into this further but I think it might not be worthwhile to inline normal getters. We should still be able to inline custom getters however.

Caio Lima

Comment 27 2016-06-05 13:21:22 PDT

Created attachment 280560 [details] Patch

Caio Lima

Comment 28 2016-06-05 13:27:43 PDT

Hi Guys. I cleaned the code and it is my Patch Version for caching the prototype accessor. I would like to know if It is a good idea send the implementation cache of prototype custom functions in this bug. Also, could you point me to a code that uses prototype customs?

Saam Barati

Comment 29 2016-06-05 16:31:16 PDT

Comment on attachment 280560 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=280560&action=review Have you run any benchmarks? > Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:713 > + LLINT_OP(1) = result; > + > + pc[OPCODE_LENGTH(op_get_by_id) - 1].u.profile->m_buckets[0] = JSValue::encode(result); > + LLINT_END(); I think LLINT_RETURN_PROFILED does this for you

Saam Barati

Comment 30 2016-06-05 16:35:09 PDT

View in context: https://bugs.webkit.org/attachment.cgi?id=280560&action=review Have you run any benchmarks? > Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:713 > + LLINT_OP(1) = result; > + > + pc[OPCODE_LENGTH(op_get_by_id) - 1].u.profile->m_buckets[0] = JSValue::encode(result); > + LLINT_END(); I think LLINT_RETURN_PROFILED does this for you

Caio Lima

Comment 31 2016-06-05 19:56:18 PDT

Created attachment 280572 [details] Base's Benchmark output

Caio Lima

Comment 32 2016-06-05 19:57:01 PDT

Created attachment 280573 [details] Cached Accessor's benchmark output

Caio Lima

Comment 33 2016-06-05 21:38:39 PDT

Checking the benchmark I noticed that the patch does not improve the results. The cases that I paid attention were: #JSC with the patch get-by-id-proto-or-self 11.6561+-2.5266 #JSC without the patch get-by-id-proto-or-self 13.2753+-2.3457 -------------------------------------------------------------- #JSC with the patch get-by-id-self-or-proto 13.0930+-1.7589 #JSC without the patch get-by-id-self-or-proto 12.1011+-0.7154 -------------------------------------------------------------- #JSC with the patch getter-prototype 5.8231+-0.6807 #JSC without the patch getter-prototype 5.5745+-0.2236 -------------------------------------------------------------- #JSC with the patch prototype-access-with-mutating-prototype 6.2859+-2.0017 #JSC without the patch prototype-access-with-mutating-prototype 5.3830+-0.4957 -------------------------------------------------------------- It is important notice the errors presented by the JSC run with the patch are always higher than the run without patch. IMHO it is because of the prototype cache maintenance and setup overhead. Also, It is important consider that this is implementing just the slow_path to access the prototype accessor functions.

Saam Barati

Comment 34 2016-06-05 23:08:54 PDT

(In reply to comment #32) > Created attachment 280573 [details] > Cached Accessor's benchmark output You can compare two VMs directly using the `run-jsc-benchmarks` script. This will give you nice output showing the differences so you don't have to manually compare the results. Like so: `run-jsc-benchmarks baseline:<pathToBaslineBuild>/jsc changes:<pathToChanges>/jsc` You can run individual benchmarks like (this runs only Kraken): `run-jsc-benchmarks --kraken baseline:<pathToBaslineBuild>/jsc changes:<pathToChanges>/jsc` You can make run-jsc-benchmarks run more iterations which is helpful for noisy benchmarks. The default is 4 runs. This will make kraken run for 10 runs: `run-jsc-benchmarks --kraken --outer 10 baseline:<pathToBaslineBuild>/jsc changes:<pathToChanges>/jsc` You can also provide a regex to specify sub tests inside a benchmark suite. This will run just ai-astar and audio-fft from kraken. `run-jsc-benchmarks --kraken --benchmarks "ai-astar|audio-fft" baseline:<pathToBaslineBuild>/jsc changes:<pathToChanges>/jsc` Also, you should get yourself an Octane and Kraken. They can be found on github I think. We don't bundle those benchmarks inside WebKit itself, but run-jsc-benchmarks will look for the following the following JSON file: `~/.run-jsc-benchmarks` Which specifies paths to where Octane/Kraken can be found. My `~/.run-jsc-benchmarks` looks like: ``` { "OctanePath": "/Volumes/Data/jsc-benchmarks/octane", "KrakenPath": "/Volumes/Data/jsc-benchmarks/kraken/tests/kraken-1.1" } ```

Caio Lima

Comment 35 2016-06-05 23:14:05 PDT

(In reply to comment #34) > (In reply to comment #32) > > Created attachment 280573 [details] > > Cached Accessor's benchmark output > > You can compare two VMs directly using the `run-jsc-benchmarks` script. This > will give you nice output showing the differences so you don't have to > manually compare the results. > Like so: > `run-jsc-benchmarks baseline:<pathToBaslineBuild>/jsc > changes:<pathToChanges>/jsc` > > You can run individual benchmarks like (this runs only Kraken): > `run-jsc-benchmarks --kraken baseline:<pathToBaslineBuild>/jsc > changes:<pathToChanges>/jsc` > > You can make run-jsc-benchmarks run more iterations which is helpful for > noisy benchmarks. The default is 4 runs. This will make kraken run for 10 > runs: > `run-jsc-benchmarks --kraken --outer 10 baseline:<pathToBaslineBuild>/jsc > changes:<pathToChanges>/jsc` > > You can also provide a regex to specify sub tests inside a benchmark suite. > This will run just ai-astar and audio-fft from kraken. > `run-jsc-benchmarks --kraken --benchmarks "ai-astar|audio-fft" > baseline:<pathToBaslineBuild>/jsc changes:<pathToChanges>/jsc` > > > Also, you should get yourself an Octane and Kraken. They can be found on > github I think. We don't bundle those benchmarks inside WebKit itself, but > run-jsc-benchmarks > will look for the following the following JSON file: > `~/.run-jsc-benchmarks` > Which specifies paths to where Octane/Kraken can be found. > > My `~/.run-jsc-benchmarks` looks like: > ``` > { > "OctanePath": "/Volumes/Data/jsc-benchmarks/octane", > "KrakenPath": "/Volumes/Data/jsc-benchmarks/kraken/tests/kraken-1.1" > } > ``` It is a nice information. BTW, How can I manage the DYLD_FRAMEWORK_PATH between 2 different builds?

Saam Barati

Comment 36 2016-06-05 23:45:02 PDT

(In reply to comment #35) > (In reply to comment #34) > > (In reply to comment #32) > > > Created attachment 280573 [details] > > > Cached Accessor's benchmark output > > > > You can compare two VMs directly using the `run-jsc-benchmarks` script. This > > will give you nice output showing the differences so you don't have to > > manually compare the results. > > Like so: > > `run-jsc-benchmarks baseline:<pathToBaslineBuild>/jsc > > changes:<pathToChanges>/jsc` > > > > You can run individual benchmarks like (this runs only Kraken): > > `run-jsc-benchmarks --kraken baseline:<pathToBaslineBuild>/jsc > > changes:<pathToChanges>/jsc` > > > > You can make run-jsc-benchmarks run more iterations which is helpful for > > noisy benchmarks. The default is 4 runs. This will make kraken run for 10 > > runs: > > `run-jsc-benchmarks --kraken --outer 10 baseline:<pathToBaslineBuild>/jsc > > changes:<pathToChanges>/jsc` > > > > You can also provide a regex to specify sub tests inside a benchmark suite. > > This will run just ai-astar and audio-fft from kraken. > > `run-jsc-benchmarks --kraken --benchmarks "ai-astar|audio-fft" > > baseline:<pathToBaslineBuild>/jsc changes:<pathToChanges>/jsc` > > > > > > Also, you should get yourself an Octane and Kraken. They can be found on > > github I think. We don't bundle those benchmarks inside WebKit itself, but > > run-jsc-benchmarks > > will look for the following the following JSON file: > > `~/.run-jsc-benchmarks` > > Which specifies paths to where Octane/Kraken can be found. > > > > My `~/.run-jsc-benchmarks` looks like: > > ``` > > { > > "OctanePath": "/Volumes/Data/jsc-benchmarks/octane", > > "KrakenPath": "/Volumes/Data/jsc-benchmarks/kraken/tests/kraken-1.1" > > } > > ``` > > It is a nice information. BTW, How can I manage the DYLD_FRAMEWORK_PATH > between 2 different builds? run-jsc-benchmarks should do that for you

Caio Lima

Comment 37 2016-06-06 01:47:05 PDT

Created attachment 280590 [details] Benchmark

Keith Miller

Comment 38 2016-06-08 08:57:33 PDT

Comment on attachment 280560 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=280560&action=review Looks good. I think it's almost ready to go. Saam and I have a few comments before the patch is ready to land however. > Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:659 > + if ((slot.isValue() || slot.isAccessor()) && slot.slotBase() == baseValue) { Why clear the cache if we have a self accessor? There is no cost to leaving the old value cache in place.

Caio Lima

Comment 39 2016-06-08 09:42:31 PDT

(In reply to comment #38) > Comment on attachment 280560 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=280560&action=review > > Looks good. I think it's almost ready to go. Saam and I have a few comments > before the patch is ready to land however. > > > Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:659 > > + if ((slot.isValue() || slot.isAccessor()) && slot.slotBase() == baseValue) { > > Why clear the cache if we have a self accessor? There is no cost to leaving > the old value cache in place. I see your point. Actually, the main reason I implemented it was because I though it was the case where handling a cache miss and clearing the old cache. However, if I am not wrong, It is the case just to self accessors, right? In our implementation, we just cache the prototype once (when pc[7].u.operand reaches 0) and after that, all cache misses are handled by the _llint_get_by_id_proto_acessor opcode slowPath case. The advantage to clear the cache is to avoid all these misses, however, the original _llint_get_by_id_proto_acessor also check for misses and there is no real improvement clearing the cache, right? I am going to change it. BTW, I have already an first implementation of CustomAcessors. What I need to finish is inline the CustomAcessor call to Assembly code. I think it is a good idea send them in the same patch, since the files touched are the same. As you pointed me instructions to try inline the Getter, I already have an idea what I need to do to complete this implementation.

Keith Miller

Comment 40 2016-06-08 09:49:16 PDT

(In reply to comment #39) > (In reply to comment #38) > > Comment on attachment 280560 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=280560&action=review > > > > Looks good. I think it's almost ready to go. Saam and I have a few comments > > before the patch is ready to land however. > > > > > Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:659 > > > + if ((slot.isValue() || slot.isAccessor()) && slot.slotBase() == baseValue) { > > > > Why clear the cache if we have a self accessor? There is no cost to leaving > > the old value cache in place. > > I see your point. Actually, the main reason I implemented it was because I > though it was the case where handling a cache miss and clearing the old > cache. However, if I am not wrong, It is the case just to self accessors, > right? > > In our implementation, we just cache the prototype once (when > pc[7].u.operand reaches 0) and after that, all cache misses are handled by > the _llint_get_by_id_proto_acessor opcode slowPath case. The advantage to > clear the cache is to avoid all these misses, however, the original > _llint_get_by_id_proto_acessor also check for misses and there is no real > improvement clearing the cache, right? > That's right. Unless we have something else to put in the cache or the cache becomes invalid there is no benefit to clearing it. > I am going to change it. > > BTW, I have already an first implementation of CustomAcessors. What I need > to finish is inline the CustomAcessor call to Assembly code. I think it is a > good idea send them in the same patch, since the files touched are the same. > As you pointed me instructions to try inline the Getter, I already have an > idea what I need to do to complete this implementation. Cool, sounds like a plan. Let me know if you have any other questions.

Caio Lima

Comment 41 2016-06-10 13:32:08 PDT

Hi Guys, I am trying to inline the customs getter function call but I am facing a weird error. When I set the cache Up, I put into pc[5] the pointer of the Custom Function "GetValueFunc". When I try to load it into a register in LowLevelInterpreter64.asm, its content is not the same. I am doing the following operations: loadisFromInstruction(5, t1) PrepareForCCall //load the arguments cCall4(t0) //retrun retrieving and dispatch I debugged the value present into t1 (in my architecture is the rsi) and it is not corresponding to the pointer setted in setup. To test if call was working, I manually set rsi with the value of the function address and the function was called, however some arguments were incorrect the pointers. I tried load the function pointer using: loadpFromInstruction(5, t1) And had no success too. Is anything I am doing wrong to get the pc[] elements as pointer?

Saam Barati

Comment 42 2016-06-10 22:20:57 PDT

(In reply to comment #41) > Hi Guys, > > I am trying to inline the customs getter function call but I am facing a > weird error. > > When I set the cache Up, I put into pc[5] the pointer of the Custom Function > "GetValueFunc". > > When I try to load it into a register in LowLevelInterpreter64.asm, its > content is not the same. I am doing the following operations: > > loadisFromInstruction(5, t1) > PrepareForCCall > //load the arguments > cCall4(t0) > //retrun retrieving and dispatch > > I debugged the value present into t1 (in my architecture is the rsi) and it > is not corresponding to the pointer setted in setup. To test if call was > working, I manually set rsi with the value of the function address and the > function was called, however some arguments were incorrect the pointers. > > I tried load the function pointer using: > loadpFromInstruction(5, t1) > > And had no success too. > > Is anything I am doing wrong to get the pc[] elements as pointer? Can you post the patch that you think should work? What architecture are you on?

Caio Lima

Comment 43 2016-06-10 22:41:19 PDT

(In reply to comment #42) > (In reply to comment #41) > > Hi Guys, > > > > I am trying to inline the customs getter function call but I am facing a > > weird error. > > > > When I set the cache Up, I put into pc[5] the pointer of the Custom Function > > "GetValueFunc". > > > > When I try to load it into a register in LowLevelInterpreter64.asm, its > > content is not the same. I am doing the following operations: > > > > loadisFromInstruction(5, t1) > > PrepareForCCall > > //load the arguments > > cCall4(t0) > > //retrun retrieving and dispatch > > > > I debugged the value present into t1 (in my architecture is the rsi) and it > > is not corresponding to the pointer setted in setup. To test if call was > > working, I manually set rsi with the value of the function address and the > > function was called, however some arguments were incorrect the pointers. > > > > I tried load the function pointer using: > > loadpFromInstruction(5, t1) > > > > And had no success too. > > > > Is anything I am doing wrong to get the pc[] elements as pointer? > Can you post the patch that you think should work? > What architecture are you on? Hi Saam. I just discovered that some pseudo-register are mapped to the same physical register (e.g t1 = a0 in my architecture). I fixed the register usage and it is working now in my architecture. Now the problem I am facing is that the "typedef EncodedJSValue (*GetValueFunc)(ExecState*, EncodedJSValue thisValue, PropertyName, JSObject* slotBase)" requires a PropertyName and I don't know how can I pass it in the inlined assembly. My current assembly is this one: _llint_op_get_by_id_proto_custom: traceExecution() loadisFromInstruction(2, t0) loadConstantOrVariableCell(t0, t3, .opGetByIdProtoCustomSlow) loadi JSCell::m_structureID[t3], t1 loadisFromInstruction(4, t2) bineq t2, t1, .opGetByIdProtoCustomSlow loadpFromInstruction(6, a1) loadpFromInstruction(5, t0) push t2 push PC push PB prepareStateForCCall() move cfr, a0 move 0, a2 # I am testing with CustomGetter function and this parameter is ignored move a1, a3 cCall4(t0) restoreStateAfterCCall() pop PB pop PC pop t2 storeq r0, [cfr, t2, 8] valueProfile(r0, 8, t1) dispatch(9) .opGetByIdProtoCustomSlow: callSlowPath(_llint_slow_path_get_by_id) dispatch(9)

Saam Barati

Comment 44 2016-06-11 01:43:01 PDT

A PropertyName is identical to a UniquedStringImpl* so you can just pass that. A PropertyName is a struct with a single UniquedStringImpl* field. I believe get_by_id has an index in it's instruction corresponding to its UniquedStringImpl*. You may need to write some code that loads this off the CodeBlock based on some index. I'm not sure if we have code like that already in the LLInt but it should be easy to write. You should also make sure you don't have any other unintended register aliasing for all platforms that the LLInt supports.

Caio Lima

Comment 45 2016-06-11 13:37:41 PDT

(In reply to comment #44) > A PropertyName is identical to a UniquedStringImpl* so you can just pass > that. > A PropertyName is a struct with a single UniquedStringImpl* field. > > I believe get_by_id has an index in it's instruction corresponding to its > UniquedStringImpl*. You may need to write some code that loads this off > the CodeBlock based on some index. I'm not sure if we have code like that > already > in the LLInt but it should be easy to write. Actually, I don't think so. Checking the CodeBlock code I found the method used to get the identifier based in the identifier index. const Identifier& CodeBlock::identifier(int index) const { size_t unlinkedIdentifiers = m_unlinkedCode->numberOfIdentifiers(); if (static_cast<unsigned>(index) < unlinkedIdentifiers) return m_unlinkedCode->identifier(index); } The m_unlikedCode stores the identifiers into a Vector<Identifier> m_identifiers. The problem is to revive the identifier contained in the Vector buffer. My current implementation is like this macro loadIdentifier(index, dest) loadp CodeBlock[cfr], t2 loadp CodeBlock::m_unlinkedCode[t2], t1 #t1 is pointing to UnlinkedCodeBlock addp UnlinkedCodeBlock::m_identifiers, t1 #Offset to point to m_identifiers loadis index, t2 mulp sizeof Identifier, t2 addp t2, t1 # Offset to point to m_identifiers[index] addp Identifier::m_string, t1 # m_identifiers[index].m_string loadp t1, dest end Any tip?

Caio Lima

Comment 46 2016-06-11 14:38:34 PDT

(In reply to comment #45) > The m_unlikedCode stores the identifiers into a Vector<Identifier> > m_identifiers. The problem is to revive the identifier contained in the > Vector buffer. My current implementation is like this > > macro loadIdentifier(index, dest) > loadp CodeBlock[cfr], t2 > loadp CodeBlock::m_unlinkedCode[t2], t1 #t1 is pointing to > UnlinkedCodeBlock > addp UnlinkedCodeBlock::m_identifiers, t1 #Offset to point to > m_identifiers > loadis index, t2 > mulp sizeof Identifier, t2 > addp t2, t1 # Offset to point to m_identifiers[index] > addp Identifier::m_string, t1 # m_identifiers[index].m_string > loadp t1, dest > end > > Any tip? Actually, this code working now: macro loadIdentifier(index, dest) loadp CodeBlock[cfr], t2 loadp CodeBlock::m_unlinkedCode[t2], t1 loadp UnlinkedCodeBlock::m_identifiers[t1], t2 move t2, t1 loadis index, t2 mulp sizeof Identifier, t2 addp t2, t1 loadp Identifier::m_string[t1], dest end I would like to know if there is a way to test the changes in all architectures.

Saam Barati

Comment 47 2016-06-11 14:40:25 PDT

(In reply to comment #45) > (In reply to comment #44) > > A PropertyName is identical to a UniquedStringImpl* so you can just pass > > that. > > A PropertyName is a struct with a single UniquedStringImpl* field. > > > > I believe get_by_id has an index in it's instruction corresponding to its > > UniquedStringImpl*. You may need to write some code that loads this off > > the CodeBlock based on some index. I'm not sure if we have code like that > > already > > in the LLInt but it should be easy to write. > > Actually, I don't think so. Checking the CodeBlock code I found the method > used to get the identifier based in the identifier index. > > const Identifier& CodeBlock::identifier(int index) const > { > size_t unlinkedIdentifiers = m_unlinkedCode->numberOfIdentifiers(); > if (static_cast<unsigned>(index) < unlinkedIdentifiers) > return m_unlinkedCode->identifier(index); > > } > > The m_unlikedCode stores the identifiers into a Vector<Identifier> > m_identifiers. The problem is to revive the identifier contained in the > Vector buffer. My current implementation is like this > > macro loadIdentifier(index, dest) > loadp CodeBlock[cfr], t2 > loadp CodeBlock::m_unlinkedCode[t2], t1 #t1 is pointing to > UnlinkedCodeBlock > addp UnlinkedCodeBlock::m_identifiers, t1 #Offset to point to > m_identifiers > loadis index, t2 > mulp sizeof Identifier, t2 > addp t2, t1 # Offset to point to m_identifiers[index] > addp Identifier::m_string, t1 # m_identifiers[index].m_string > loadp t1, dest > end > > Any tip? This is how we solve such problems in the LLInt and JITs in general. We often end up inlining particular functions. For the JIT, this is usually easier because it can compile constants more easily. So, your proposed solution would definitely work, but it's obviously not ideal. But I should have suggested something simpler yesterday. When you convert your get_by_id into your specialized version, you can stash the UniquedStringImpl* into the instruction stream itself if you have an unused instruction offset. Do you have such a free location?

Saam Barati

Comment 48 2016-06-11 14:42:03 PDT

(In reply to comment #46) > (In reply to comment #45) > > The m_unlikedCode stores the identifiers into a Vector<Identifier> > > m_identifiers. The problem is to revive the identifier contained in the > > Vector buffer. My current implementation is like this > > > > macro loadIdentifier(index, dest) > > loadp CodeBlock[cfr], t2 > > loadp CodeBlock::m_unlinkedCode[t2], t1 #t1 is pointing to > > UnlinkedCodeBlock > > addp UnlinkedCodeBlock::m_identifiers, t1 #Offset to point to > > m_identifiers > > loadis index, t2 > > mulp sizeof Identifier, t2 > > addp t2, t1 # Offset to point to m_identifiers[index] > > addp Identifier::m_string, t1 # m_identifiers[index].m_string > > loadp t1, dest > > end > > > > Any tip? > > Actually, this code working now: > > macro loadIdentifier(index, dest) > loadp CodeBlock[cfr], t2 > loadp CodeBlock::m_unlinkedCode[t2], t1 > loadp UnlinkedCodeBlock::m_identifiers[t1], t2 > move t2, t1 > loadis index, t2 > mulp sizeof Identifier, t2 > addp t2, t1 > loadp Identifier::m_string[t1], dest > end > > I would like to know if there is a way to test the changes in all > architectures. Which architecture are you running locally?

Keith Miller

Comment 49 2016-06-11 14:48:14 PDT

(In reply to comment #47) > But I should have suggested something simpler yesterday. > When you convert your get_by_id into your specialized version, you > can stash the UniquedStringImpl* into the instruction stream itself if > you have an unused instruction offset. Do you have such a free location? This is a good idea. We have an invariant that changing a Accessor on an object requires transitioning the structure of that object. This means that you can use, for example, pc[4] to hold the function pointer and pc[5] to point to the UniquedStringImpl* of the property. (In reply to comment #46) > I would like to know if there is a way to test the changes in all > architectures. It can be pretty tricky. I would recommend having the bot do it since, IIRC, we test all the architectures we care about.

Caio Lima

Comment 50 2016-06-11 15:17:47 PDT

(In reply to comment #49) > (In reply to comment #47) > > But I should have suggested something simpler yesterday. > > When you convert your get_by_id into your specialized version, you > > can stash the UniquedStringImpl* into the instruction stream itself if > > you have an unused instruction offset. Do you have such a free location? > > This is a good idea. We have an invariant that changing a Accessor on an > object requires transitioning the structure of that object. This means that > you can use, for example, pc[4] to hold the function pointer and pc[5] to > point to the UniquedStringImpl* of the property. Actually, my first thinking was in that way. However, pc[4] is used to check if the inline is still valid. _llint_op_get_by_id_proto_custom: traceExecution() loadisFromInstruction(2, t0) loadConstantOrVariableCell(t0, t3, .opGetByIdProtoCustomSlow) loadi JSCell::m_structureID[t3], t1 loadisFromInstruction(4, t2) bineq t2, t1, .opGetByIdProtoCustomSlow This way, I think it is not possible use pc[4] for that. Could you confirm it? If it is possible, I think it is the best way to implement it. Another strategy i had in mind was create a C struct to store customGetter pointer and UniquedStringImpl and "pc[5].u.cstruct" point to it. However, it is not efficient, since it is going to double the size of pc[].u. A way to improve it should store a pointer to this struct, however It is not possible to track this pointer and free its memory. Other solution I was thinking was create an property into slot.slotBase() to store the identifier, however it is going to create a waste of space to other objects. So, the most viable solution I thought was the last one I commented above. > (In reply to comment #46) > > I would like to know if there is a way to test the changes in all > > architectures. > > It can be pretty tricky. I would recommend having the bot do it since, IIRC, > we test all the architectures we care about. No problem. I asked because you could have a way to do it without submitting a patch. BTW my current architecture is x86_64 running Mac OS X.

Saam Barati

Comment 51 2016-06-11 15:27:02 PDT

Saam Barati

Comment 52 2016-06-11 15:29:57 PDT

You won't be able to test ARM on your machine, but you can also test x86 32-bit by building JavaScript core like: Tools/Scripts/build-jsc --32-bit --release (or --debug) This will give you some more coverage also because x86-32 doesn't have argument registers.

Caio Lima

Comment 53 2016-06-13 22:52:03 PDT

Created attachment 281234 [details] Patch

Caio Lima

Comment 54 2016-06-13 23:05:16 PDT

Hi Guys. It is not the final patch (It is not ready yet and it is breaking). I need some help from you. I am running my implementation of CustomAccessor prototype inline, however I think I am missing some important information. I created a slowPath version to make debug easier and evaluate the performance gain without inlining it to assembly. The problem that I am facing is because some customGetters are failing because the slot.slotBase() is not the correct parameter to them. In PropertySlot.cpp I found this code: JSValue PropertySlot::customGetter(ExecState* exec, PropertyName propertyName) const { // FIXME: Remove this differences in custom values and custom accessors. // https://bugs.webkit.org/show_bug.cgi?id=158014 JSValue thisValue = m_attributes & CustomAccessor ? m_thisValue : JSValue(slotBase()); return JSValue::decode(m_data.custom.getValue(exec, JSValue::encode(thisValue), propertyName)); } The biggest problem I am facing is because m_thisValue.getObject() is not a valid pointer when called by the inlined version and I am not able to store EncodedJSValue to pc[6]. I spent yesterday night and today night trying to figure out what can I do. Saam, about store the UniquedStringImpl into a pc slot, it is not possible because all slot are in use.

Caio Lima

Comment 55 2016-06-13 23:08:11 PDT

Created attachment 281235 [details] Patch

Build Bot

Comment 56 2016-06-13 23:55:42 PDT

Comment on attachment 281235 [details] Patch Attachment 281235 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/1498748 Number of test failures exceeded the failure limit.

Build Bot

Comment 57 2016-06-13 23:55:48 PDT

Created attachment 281237 [details] Archive of layout-test-results from ews100 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-yosemite Platform: Mac OS X 10.10.5

Build Bot

Comment 58 2016-06-13 23:57:05 PDT

Comment on attachment 281235 [details] Patch Attachment 281235 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/1498751 Number of test failures exceeded the failure limit.

Build Bot

Comment 59 2016-06-13 23:57:09 PDT

Created attachment 281238 [details] Archive of layout-test-results from ews107 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5

Build Bot

Comment 60 2016-06-13 23:57:29 PDT

Comment on attachment 281235 [details] Patch Attachment 281235 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/1498731 Number of test failures exceeded the failure limit.

Build Bot

Comment 61 2016-06-13 23:57:33 PDT

Created attachment 281239 [details] Archive of layout-test-results from ews115 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews115 Port: mac-yosemite Platform: Mac OS X 10.10.5

Build Bot

Comment 62 2016-06-13 23:58:22 PDT

Comment on attachment 281235 [details] Patch Attachment 281235 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/1498736 Number of test failures exceeded the failure limit.

Build Bot

Comment 63 2016-06-13 23:58:24 PDT

Created attachment 281240 [details] Archive of layout-test-results from ews124 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews124 Port: ios-simulator-wk2 Platform: Mac OS X 10.11.4

Caio Lima

Comment 64 2016-06-14 11:12:18 PDT

Hi Guys. Finally I finished a version that is passing in the webkit-test for x86_64 on Mac. Now I need to fix the 32bit version and send to review. However, I am testing the performance to check the improvement of this implementation. As a Benchmark for the Custom inlined I am using the following tool: http://browserbench.org/Speedometer/ I already have some results but I didn't Analyze them. I am running this tool with the --release version of the baseline version and with the path applied. In the overall score, both versions are close and I couldn't notice difference in the results. I am going to improve the analysis to check some behaviors of this benchmark with the prototype cache implementation (e.g.cache miss rate). I keep you informed. Probably today I am going to send my final version to review today.

Caio Lima

Comment 65 2016-06-15 20:30:45 PDT

Created attachment 281432 [details] Patch

Build Bot

Comment 66 2016-06-15 21:28:16 PDT

Comment on attachment 281432 [details] Patch Attachment 281432 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/1510609 New failing tests: http/tests/dom/set-document-location-host-to-accepted-values.html http/tests/dom/set-document-location-hostname-to-accepted-values.html

Build Bot

Comment 67 2016-06-15 21:28:20 PDT

Created attachment 281436 [details] Archive of layout-test-results from ews101 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-yosemite Platform: Mac OS X 10.10.5

Build Bot

Comment 68 2016-06-15 21:32:25 PDT

Comment on attachment 281432 [details] Patch Attachment 281432 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/1510612 New failing tests: http/tests/dom/set-document-location-host-to-accepted-values.html http/tests/dom/set-document-location-hostname-to-accepted-values.html

Build Bot

Comment 69 2016-06-15 21:32:29 PDT

Created attachment 281437 [details] Archive of layout-test-results from ews107 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5

Build Bot

Comment 70 2016-06-15 21:36:18 PDT

Comment on attachment 281432 [details] Patch Attachment 281432 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/1510615 New failing tests: http/tests/dom/set-document-location-host-to-accepted-values.html http/tests/dom/set-document-location-hostname-to-accepted-values.html

Build Bot

Comment 71 2016-06-15 21:36:23 PDT

Created attachment 281438 [details] Archive of layout-test-results from ews124 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews124 Port: ios-simulator-wk2 Platform: Mac OS X 10.11.4

Build Bot

Comment 72 2016-06-15 23:48:44 PDT

Comment on attachment 281432 [details] Patch Attachment 281432 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/1511098 New failing tests: http/tests/dom/set-document-location-host-to-accepted-values.html http/tests/dom/set-document-location-hostname-to-accepted-values.html

Build Bot

Comment 73 2016-06-15 23:48:49 PDT

Created attachment 281446 [details] Archive of layout-test-results from ews117 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews117 Port: mac-yosemite Platform: Mac OS X 10.10.5

Caio Lima

Comment 74 2016-06-16 23:27:26 PDT

Created attachment 281539 [details] Patch

Caio Lima

Comment 75 2016-06-16 23:32:07 PDT

Created attachment 281540 [details] speedometer run comparison with and without patch applied It is the result of speedometer benchmark comparing baseline and the patch applied.

Caio Lima

Comment 76 2016-06-16 23:44:42 PDT

(In reply to comment #75) > Created attachment 281540 [details] > speedometer run comparison with and without patch applied > > It is the result of speedometer benchmark comparing baseline and the patch > applied. I noticed that there is no real gain in this benchmark in terms of performance for the custom accessor caching. So I decided to check the cache miss ratio of it to verify the behavior of this benchmark. The summary results I've got are: ---------------------------------------------------------------------------- Current CacheMiss Ratio:145467/319000 ---------------------------------------------------------------------------- This Ration is 0.45600940438871473. I am not convinced that this benchmark is a good choice to test the performance gain of this patch. What I am doing right now is logging my tonight usage to collect the CacheMiss Ration using facebook, youtube and other real web applications. I am going to send you the results soon.

Build Bot

Comment 77 2016-06-17 00:26:46 PDT

Comment on attachment 281539 [details] Patch Attachment 281539 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/1516671 New failing tests: http/tests/dom/set-document-location-host-to-accepted-values.html http/tests/dom/set-document-location-hostname-to-accepted-values.html

Build Bot

Comment 78 2016-06-17 00:26:51 PDT

Created attachment 281542 [details] Archive of layout-test-results from ews103 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-yosemite Platform: Mac OS X 10.10.5

Build Bot

Comment 79 2016-06-17 00:28:58 PDT

Comment on attachment 281539 [details] Patch Attachment 281539 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/1516670 New failing tests: http/tests/dom/set-document-location-host-to-accepted-values.html http/tests/dom/set-document-location-hostname-to-accepted-values.html

Build Bot

Comment 80 2016-06-17 00:29:02 PDT

Created attachment 281543 [details] Archive of layout-test-results from ews106 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews106 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5

Build Bot

Comment 81 2016-06-17 00:36:10 PDT

Comment on attachment 281539 [details] Patch Attachment 281539 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/1516674 New failing tests: http/tests/dom/set-document-location-host-to-accepted-values.html http/tests/dom/set-document-location-hostname-to-accepted-values.html

Build Bot

Comment 82 2016-06-17 00:36:15 PDT

Created attachment 281544 [details] Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.11.4

Build Bot

Comment 83 2016-06-17 00:38:33 PDT

Comment on attachment 281539 [details] Patch Attachment 281539 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/1516676 New failing tests: http/tests/dom/set-document-location-host-to-accepted-values.html http/tests/dom/set-document-location-hostname-to-accepted-values.html

Build Bot

Comment 84 2016-06-17 00:38:38 PDT

Created attachment 281545 [details] Archive of layout-test-results from ews116 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews116 Port: mac-yosemite Platform: Mac OS X 10.10.5

Caio Lima

Comment 85 2016-06-17 01:06:13 PDT

Created attachment 281547 [details] Patch

WebKit Commit Bot

Comment 86 2016-06-17 01:08:49 PDT

Attachment 281547 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:64: Alphabetical sorting problem. [build/include_order] [4] ERROR: Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:1666: Missing space after , [whitespace/comma] [3] ERROR: Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:1675: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] ERROR: Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:1687: Missing space after , [whitespace/comma] [3] ERROR: Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:1688: Missing space after , [whitespace/comma] [3] Total errors found: 5 in 17 files If any of these errors are false positives, please file a bug against check-webkit-style.

Caio Lima

Comment 87 2016-06-17 01:13:37 PDT

(In reply to comment #85) > Created attachment 281547 [details] > Patch Hey Guys, I am sorry to post this patch, but it is the version that I am using to get Cache Miss Information. The strange behavior here is that this patch is not failing in tests in my local machine and I want to test it here: http/tests/dom/set-document-location-host-to-accepted-values.html http/tests/dom/set-document-location-hostname-to-accepted-values.html However, the difference between this and the last version is just the call of Debug Functions... Do you guys can imagine why it is happening? BTW, the test is failing because the line of one CONSOLE MESSAGE is different (It is expected to be 66 and it is being 17). Sorry for the delay in this patch and for this question, but I am trying to solve this specific bug since yesterday with no success. Regards, Caio Lima.

Caio Lima

Comment 88 2016-06-17 01:34:31 PDT

Created attachment 281550 [details] Patch

Caio Lima

Comment 89 2016-06-18 01:55:30 PDT

Created attachment 281613 [details] Patch

Caio Lima

Comment 90 2016-06-18 14:00:42 PDT

Hi Guys. Finally, the patch is now Green! However I am in doubt if it is working for other architectures (Win, GTK, ios), because I saw some comments in the code about call C functions in Windows environment. Thi is my final version to review =)

Caio Lima

Comment 91 2016-06-22 00:12:30 PDT

New information about Cache Miss Ratio. I traced the execution of my Safari since Sunday afternoon until now and I got the following result: ---------------------------------------------------------------------------- Current CacheMiss Ratio:15510/37691 ---------------------------------------------------------------------------- Almost 70% of my usage was in Facebook, Youtube and Google. It is really close to results of speedometer benchmark, what validate the result of the last run.

Caio Lima

Comment 92 2016-06-26 15:53:19 PDT

Ping Review

Keith Miller

Comment 93 2016-06-28 11:44:49 PDT

Comment on attachment 281613 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=281613&action=review Looks good! I just have a couple of comments/questions. > Source/JavaScriptCore/ChangeLog:12 > + offset of a object to call a getter function and get_by_id_proto_custo typo: git_by_id_proto_custom > Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:1299 > + bineq 0, dest, .loadEncodedThisValueDone Does this get assembled to: test: dest, dest jz: .loadEncodedThisValueDone ? If not, we should probably try to figure out how to make that happen.

Caio Lima

Comment 94 2016-06-28 23:28:21 PDT

(In reply to comment #93) > Comment on attachment 281613 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=281613&action=review > > Looks good! I just have a couple of comments/questions. > > > Source/JavaScriptCore/ChangeLog:12 > > + offset of a object to call a getter function and get_by_id_proto_custo > > typo: git_by_id_proto_custom Nice Catch =) > > Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:1299 > > + bineq 0, dest, .loadEncodedThisValueDone > > Does this get assembled to: > test: dest, dest > jz: .loadEncodedThisValueDone > ? If not, we should probably try to figure out how to make that happen. Yes, Here is the code compiled: 0x100dbaf1f <+13003>: movq 0x30(%r13,%r8,8), %rsi 0x100dbaf24 <+13008>: testl %esi, %esi 0x100dbaf26 <+13010>: jne 0x100dbaf2f ; <+13019> 0x100dbaf2c <+13016>: movq %rcx, %rsi 0x100dbaf2f <+13019>: movq 0x28(%r13,%r8,8), %rax I am going to rebase the code and send the Patch updated

Caio Lima

Comment 95 2016-06-28 23:59:28 PDT

Created attachment 282325 [details] Patch

Build Bot

Comment 96 2016-06-29 01:02:38 PDT

Comment on attachment 282325 [details] Patch Attachment 282325 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/1592963 New failing tests: animations/multiple-backgrounds.html

Build Bot

Comment 97 2016-06-29 01:02:43 PDT

Created attachment 282331 [details] Archive of layout-test-results from ews126 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews126 Port: ios-simulator-wk2 Platform: Mac OS X 10.11.4

Keith Miller

Comment 98 2016-06-29 09:31:55 PDT

Comment on attachment 282325 [details] Patch r=me. That failing test is just noise.

WebKit Commit Bot

Comment 99 2016-06-29 09:53:48 PDT

Comment on attachment 282325 [details] Patch Clearing flags on attachment: 282325 Committed r202627: <http://trac.webkit.org/changeset/202627>

WebKit Commit Bot

Comment 100 2016-06-29 09:53:57 PDT

All reviewed patches have been landed. Closing bug.

Ryan Haddad

Comment 101 2016-06-29 11:13:22 PDT

This change appears to have broken the CLoop and iOS device build https://build.webkit.org/builders/Apple%20El%20Capitan%20LLINT%20CLoop%20%28BuildAndTest%29/builds/7534 https://build.webkit.org/builders/Apple%20iOS%209%20Release%20%28Build%29/builds/5792/

WebKit Commit Bot

Comment 102 2016-06-29 11:23:29 PDT

Re-opened since this is blocked by bug 159266

Caio Lima

Comment 103 2016-06-30 01:15:54 PDT

Created attachment 282418 [details] Patch

Caio Lima

Comment 104 2016-06-30 01:17:51 PDT

(In reply to comment #103) > Created attachment 282418 [details] > Patch Considering ARM64 and C_LOOP cases.

Caio Lima

Comment 105 2016-07-07 23:52:08 PDT

Ping review. I think now it is ok now, because I am considering the last errors reported to ARM64 and C_LOOP. The last problems were: 1. ARM64 doesn't support "push <reg>", so it is necessary push two registers at the same time. 2. C_LOOP doesn't support call operations, So I added a slow case to it.

Keith Miller

Comment 106 2016-07-10 12:59:46 PDT

Comment on attachment 282418 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=282418&action=review > Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:686 > + } else if (UNLIKELY(pc[7].u.operand && (slot.isValue() || slot.isUnset() || ((slot.isAccessor() || slot.isCustom()) && (slot.slotBase() != baseValue))))) { I think this could just be: if (UNLIKELY(pc[7].u.operand && (slot.slotBase() != baseValue))))) > Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:1476 > + push t1 # Load arg3 JSObject * > + push t2 # Load arg2 PropertyName > + move CellTag, t3 > + if BIG_ENDIAN > + push t1 # Load arg1 Payload of EncodedJSValue > + push t3 # Load arg1 Tag of EncodedJSValue > + else > + push t3 # Load arg1 Tag of EncodedJSValue > + push t1 # Load arg1 Payload of EncodedJSValue > + end > + push cfr # Loading exec Looking at this closer. I don't think this is going to work on ARM. The ARM calling convention passes the first four arguments in r0-r3. Additionally, depending on whether or not the device is using the EABI 64-bit values need to start in r0 or r2. > Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:1333 > + loadp Callee[cfr], t0 > + andp MarkedBlockMask, t0, t1 > + loadp MarkedBlock::m_weakSet + WeakSet::m_vm[t1], t1 It is probably worthwhile to make this a macro in LowLevelInterpreter.asm, loadVM(register) or something.

Caio Lima

Comment 107 2016-07-11 01:31:09 PDT

Created attachment 283306 [details] Patch

Keith Miller

Comment 108 2016-07-21 11:46:24 PDT

Comment on attachment 283306 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=283306&action=review > Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:1478 > + loadEncodedThisValue(t3, a1) > + push t5, cfr > + loadp 20[PC], t5 > + # Inlining the GetValueFunc call > + move t2, a3 # Load arg2 PropertyName > + if BIG_ENDIAN > + move a1, a2 # Load arg1 Payload of EncodedJSValue > + move CellTag, a1 # Load arg1 Tag of EncodedJSValue > + else > + move CellTag, a2 # Load arg1 Tag of EncodedJSValue > + end > + move cfr, a0 # Loading exec > + cCall4(t5) Looks good. Although I still think you need to handle the EABI. For the EABI you'll need to push the PropertyName onto the stack put the EncodedJSValue into r2-r3. r1 should be empty. > Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:1322 > +macro loadEncodedThisValue(baseValue, dest) > + loadpFromInstruction(6, dest) > + bineq 0, dest, .loadEncodedThisValueDone > + move baseValue, dest > + .loadEncodedThisValueDone: > +end I think this should be bpneq. bineq only compares the low 32 bits. Also, You could move it to LowLevelInterpreter.asm

Caio Lima

Comment 109 2016-07-31 17:34:15 PDT

(In reply to comment #108) > Comment on attachment 283306 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=283306&action=review > > > Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:1478 > > + loadEncodedThisValue(t3, a1) > > + push t5, cfr > > + loadp 20[PC], t5 > > + # Inlining the GetValueFunc call > > + move t2, a3 # Load arg2 PropertyName > > + if BIG_ENDIAN > > + move a1, a2 # Load arg1 Payload of EncodedJSValue > > + move CellTag, a1 # Load arg1 Tag of EncodedJSValue > > + else > > + move CellTag, a2 # Load arg1 Tag of EncodedJSValue > > + end > > + move cfr, a0 # Loading exec > > + cCall4(t5) > > Looks good. Although I still think you need to handle the EABI. For the EABI > you'll need to push the PropertyName onto the stack put the EncodedJSValue > into r2-r3. r1 should be empty. Thank you to inform me about it. Which architectures use EABI? I suppose ARM, but I would like to be sure before send a new patch. Also, do you mind point me where you found this information? > > Source/JavaScriptCore/llint/LowLevelInterpreter64.asm:1322 > > +macro loadEncodedThisValue(baseValue, dest) > > + loadpFromInstruction(6, dest) > > + bineq 0, dest, .loadEncodedThisValueDone > > + move baseValue, dest > > + .loadEncodedThisValueDone: > > +end > > I think this should be bpneq. bineq only compares the low 32 bits. Also, You > could move it to LowLevelInterpreter.asm Nice catch, Thanks!

Keith Miller

Comment 110 2016-08-01 11:16:45 PDT

(In reply to comment #109) > (In reply to comment #108) > > > > Looks good. Although I still think you need to handle the EABI. For the EABI > > you'll need to push the PropertyName onto the stack put the EncodedJSValue > > into r2-r3. r1 should be empty. > > Thank you to inform me about it. Which architectures use EABI? I suppose > ARM, but I would like to be sure before send a new patch. Also, do you mind > point me where you found this information? EABI AFAIK is used for both MIPS and ARM. I don't have any documentation for ARM I have some documentation for MIPS, which I will email to you.

Caio Lima

Comment 111 2016-08-02 00:21:26 PDT

Created attachment 285079 [details] Patch

Caio Lima

Comment 112 2016-08-02 00:34:13 PDT

Created attachment 285082 [details] Patch

Keith Miller

Comment 113 2016-08-05 10:44:25 PDT

Comment on attachment 285082 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=285082&action=review > Source/JavaScriptCore/llint/LLIntSlowPaths.cpp:679 > + if (slot.isValue()) { I don't think you need this check. This block is dominated by a check that the slot is a value. > Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:1474 > + if ARM or ARMv7 or ARMv7_TRADITIONAL or MIPS I think you want. if (COMPILER_SUPPORTS(EABI) and ARM) or MIPS. Not all ARM devices use the EABI.

Caio Lima

Comment 114 2016-08-06 22:27:42 PDT

Created attachment 285522 [details] Patch

Caio Lima

Comment 115 2016-08-14 16:51:22 PDT

Ping Review

Caio Lima

Comment 116 2016-10-08 22:47:19 PDT

Created attachment 291039 [details] Patch

Caio Lima

Comment 117 2016-10-08 23:42:55 PDT

(In reply to comment #116) > Created attachment 291039 [details] > Patch Rebased Patch

Geoffrey Garen

Comment 118 2016-10-11 13:53:55 PDT

Comment on attachment 291039 [details] Patch I remember Keith did a patch similar to this in the LLInt for prototype values and we didn't see a measurable speedup from it. Should we be doing optimizations like this in the LLInt? Is it worth the complexity and maintenance cost? What's the best benchmark result we have from this? Do we have any results that are not tailored micro benchmarks?

Keith Miller

Comment 119 2016-10-11 13:57:41 PDT

(In reply to comment #118) > Comment on attachment 291039 [details] > Patch > > I remember Keith did a patch similar to this in the LLInt for prototype > values and we didn't see a measurable speedup from it. > > Should we be doing optimizations like this in the LLInt? Is it worth the > complexity and maintenance cost? > > What's the best benchmark result we have from this? Do we have any results > that are not tailored micro benchmarks? I think I agree with Geoff here. Looking at this patch again, it's difficult to understand exactly what it's doing. If we don't have a measurable speedup from this on a non-microbenchmark, I'm not sure we should take the patch.

Caio Lima

Comment 120 2016-10-11 14:13:11 PDT

(In reply to comment #118) > Comment on attachment 291039 [details] > Patch > > I remember Keith did a patch similar to this in the LLInt for prototype > values and we didn't see a measurable speedup from it. > > Should we be doing optimizations like this in the LLInt? Is it worth the > complexity and maintenance I think I can't agree more. It is bringing a complex with practically no performance gains. I remember last time I ran Speedometer in x86_64 and I didn't saw gains. The point is that we can get better improvements if JIT=false, but it doesn't represent real world usage. > What's the best benchmark result we have from this? Do we have any results > that are not tailored micro benchmarks? I don't. What do you think if we just cache getters and forget about customGetter inline? The patch will be much more simpler and we also can take advantage on "super" access in a further case. I can take a look on that and see if it is worth or not.

Geoffrey Garen

Comment 121 2016-10-11 16:03:41 PDT

> What do you think if we just cache getters and forget about customGetter > inline? The patch will be much more simpler and we also can take advantage > on "super" access in a further case. I can take a look on that and see if it > is worth or not. If we have consensus that the performance motivation is not super strong, then I think we should adopt LLInt optimizations or not based on weighing their complexity. I don't know exactly what an optimization that kept caching but skipped customGetter would look like, but if it were simple enough I think it might be worth it.

Yusuke Suzuki

Comment 122 2017-07-20 19:44:51 PDT

I think non-JIT performance is important. We already retired some old JIT arches like SH4. And we do not actively maintain the existing but non-major JIT arch like ARM6. If we would like to freely focus on existing high-performance JIT (lovely x64, ARM64!), I think we should keep LLInt fast for non-JIT. In addition to slightly improve the performance of LLInt in JIT supported environments, it helps us to easily say that "OK, you do not need to worry about your non-JIT supported arch performance. LLInt is already super fast!".

Brady Eidson

Comment 123 2018-02-14 10:34:27 PST

Comment on attachment 291039 [details] Patch Patches that have been up for review since 2016 are almost certainly too stale to be relevant to trunk in their current form. If this patch is still important please rebase it and post it for review again.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution WONTFIX

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component JavaScriptCore

Assignee

Nobody

Reported

2016-05-25 13:22 PDT

Modified

2018-02-14 12:19 PST History

CC List

14 users Show

URL

Keywords InRadar

Depends on

159266

Blocks

Dependencies

tree graph