RESOLVED FIXED 164258
We should have a more concise way of determining when we're varargs calling a function using rest parameters
https://bugs.webkit.org/show_bug.cgi?id=164258
Summary We should have a more concise way of determining when we're varargs calling a...
Saam Barati
Reported 2016-10-31 17:39:26 PDT
For example, we emit the entire iterator protocol inside bytecode. The byte code contains a loop with a few property accesses and function calls. This makes it effectively impossible to do smart things for code like this: ``` function foo(...args) { return bar(a, b, ...args); } ``` We should be able to describe such programs at a higher level. After speaking with Filip, we have an idea for describing what we want the IR to look like: ``` a: ... something b: ... something args: CreateRest(...) X: NewArrayWithSpread(@a, @b, @args, BitSet[0, 0, 1]) Y: CallVarargs(@callee, @this, @X) Z: Return(@Y) ``` Where the BitSet inside node X has a bit for each argument. If the bit is zero, it means we take the argument as-is. If the bit is 1, it means we perform a spread operation on it. Having this high-level IR will make doing escape analysis for the spreading of CreateRest much easier.
Attachments
WIP (31.76 KB, patch)
2016-11-02 19:22 PDT, Saam Barati
no flags
WIP (32.67 KB, patch)
2016-11-02 19:35 PDT, Saam Barati
no flags
WIP (64.75 KB, patch)
2016-11-03 18:28 PDT, Saam Barati
no flags
WIP (73.72 KB, patch)
2016-11-04 14:31 PDT, Saam Barati
no flags
WIP (80.42 KB, patch)
2016-11-04 16:34 PDT, Saam Barati
no flags
WIP (81.59 KB, patch)
2016-11-04 19:17 PDT, Saam Barati
no flags
WIP (90.71 KB, patch)
2016-11-06 10:47 PST, Saam Barati
no flags
WIP (100.12 KB, patch)
2016-11-06 17:42 PST, Saam Barati
no flags
WIP (109.31 KB, patch)
2016-11-08 11:28 PST, Saam Barati
no flags
patch (117.38 KB, patch)
2016-11-08 17:05 PST, Saam Barati
buildbot: commit-queue-
Archive of layout-test-results from ews114 for mac-yosemite (2.01 MB, application/zip)
2016-11-09 13:24 PST, Build Bot
no flags
patch (125.31 KB, patch)
2016-11-09 18:05 PST, Saam Barati
ysuzuki: review+
patch for landing (129.55 KB, patch)
2016-11-10 23:23 PST, Saam Barati
no flags
patch for landing v2 (129.55 KB, patch)
2016-11-11 18:24 PST, Saam Barati
no flags
Saam Barati
Comment 1 2016-11-02 14:00:44 PDT
This proposal does not abide by JS semantics. For example, consider: [...f(), ...g()] The ordering should be r1 = call f spread r1 r2 = call g spread r2 After talking with Fil, we have a new design direction, which will take an argument, spread it, and produce a fixed buffer of JSValues. The buffer will be immutable, and we'll call it something like JSFixedArray or JSPureArray So we will now emit bytecode like this for this array: [...a, b, ...c] op_spread locA op_resolve locB op_spread locC op_new_array_cat locA, locB, locC, BitVector[1,0,1] Where the spread bytecode operation will create a JSFixedArray. In the DFG, we could transform these into something like ArrayClone() when we realize the input will be an array and that the array iterator protocol isn't observable.
Saam Barati
Comment 2 2016-11-02 15:57:31 PDT
I'm beginning work on this now.
Saam Barati
Comment 3 2016-11-02 19:22:02 PDT
Created attachment 293735 [details] WIP it begins
Saam Barati
Comment 4 2016-11-02 19:35:51 PDT
Saam Barati
Comment 5 2016-11-03 18:28:10 PDT
Created attachment 293842 [details] WIP More. I just started setting up the watchpoints we'll want to use as well as some initial DFG support.
Saam Barati
Comment 6 2016-11-04 14:31:19 PDT
Created attachment 293929 [details] WIP So this patch is a speedup for normal looking arrays, but a slowdown for all other things. I need to find a way to make it faster for non-arrays. It's about a 1-2% slowdown on ES6SampleBench/Basic It's about a 12-15% speedup on ES6SampleBench/Air
Saam Barati
Comment 7 2016-11-04 16:34:59 PDT
Created attachment 293958 [details] WIP Ok, now it's faster on both benchmarks in SampleBench. Now it's time to make the compiler smarter about Spread, and to make allocating a JSFixedArray not require a C call. After all this, we could probably make new_array_with_spread faster by dynamically allocating an array instead of calling out to C code as well. With the current WIP, we are ~7% faster on SampleBench. I think I can probably increase this speedup.
Saam Barati
Comment 8 2016-11-04 19:17:25 PDT
Created attachment 293974 [details] WIP Spread now has an inline allocation path in the FTL. Interestingly, this is making Air faster, but Basic slower. I bet I need to support double indexing types to make this faster for Basic.
Saam Barati
Comment 9 2016-11-06 10:47:32 PST
Created attachment 294027 [details] WIP some more
Saam Barati
Comment 10 2016-11-06 17:42:53 PST
Created attachment 294039 [details] WIP some more things for having a bad time.
Saam Barati
Comment 11 2016-11-08 11:28:24 PST
Created attachment 294171 [details] WIP Almost done, just need to do clean up work now. It passes all JSC tests in release.
Saam Barati
Comment 12 2016-11-08 17:05:10 PST
Saam Barati
Comment 13 2016-11-08 17:05:33 PST
I still need to write the 32-bit DFG implementation, but besides that, this patch is ready to be reviewed by folks.
WebKit Commit Bot
Comment 14 2016-11-08 17:08:09 PST
Attachment 294204 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:29: Alphabetical sorting problem. [build/include_order] [4] ERROR: Source/JavaScriptCore/bytecode/ObjectPropertyConditionSet.h:160: The parameter name "object" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 53 files If any of these errors are false positives, please file a bug against check-webkit-style.
Saam Barati
Comment 15 2016-11-08 17:13:47 PST
(In reply to comment #13) > I still need to write the 32-bit DFG implementation, but besides that, this > patch is ready to be reviewed by folks. Also, I'm writing some tests and micro benchmarks. But again, it's ready for some feedback.
Saam Barati
Comment 16 2016-11-08 17:48:45 PST
Also, this new code is slower if spreading an array of size 0/1/2. It's about the same spreading an array of size 3. And it only gets faster and faster as we spread larger arrays. I think I opened a bug that will address this issue: https://bugs.webkit.org/show_bug.cgi?id=164536 My guess is that we're being hurt by the extra memcpy and the overall work being done surrounding the memcpy loop when spreading tiny arrays (like the extra allocation for JSFixedArray). I think it's probably worth fixing this in a follow-up, since this patch is already quite large. To see that this is really a pathology for tiny arrays: - spreading an array of size 6 is 50% faster - spreading an array of size 5 is 5% faster - spreading an array of size 500 is 2.5x faster.
Build Bot
Comment 17 2016-11-09 13:24:36 PST
Comment on attachment 294204 [details] patch Attachment 294204 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/2484860 New failing tests: js/basic-spread.html
Build Bot
Comment 18 2016-11-09 13:24:40 PST
Created attachment 294276 [details] Archive of layout-test-results from ews114 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-yosemite Platform: Mac OS X 10.10.5
Saam Barati
Comment 19 2016-11-09 18:05:36 PST
Created attachment 294316 [details] patch Now with a 32-bit implementation.
WebKit Commit Bot
Comment 20 2016-11-09 18:09:02 PST
Attachment 294316 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:29: Alphabetical sorting problem. [build/include_order] [4] ERROR: Source/JavaScriptCore/bytecode/ObjectPropertyConditionSet.h:160: The parameter name "object" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 59 files If any of these errors are false positives, please file a bug against check-webkit-style.
Mark Lam
Comment 21 2016-11-10 08:46:14 PST
Comment on attachment 294316 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=294316&action=review > Source/JavaScriptCore/ChangeLog:130 > + * runtime/ArrayIteratorAdaptiveWatchpoint.cpp: Added. Need to add this to CMakeLists.txt. > Source/JavaScriptCore/ChangeLog:140 > + * runtime/JSFixedArray.cpp: Added. Need to add this to CMakeLists.txt.
Saam Barati
Comment 22 2016-11-10 12:41:35 PST
(In reply to comment #21) > Comment on attachment 294316 [details] > patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=294316&action=review > > > Source/JavaScriptCore/ChangeLog:130 > > + * runtime/ArrayIteratorAdaptiveWatchpoint.cpp: Added. > > Need to add this to CMakeLists.txt. > > > Source/JavaScriptCore/ChangeLog:140 > > + * runtime/JSFixedArray.cpp: Added. > > Need to add this to CMakeLists.txt. Yeah will fix.
Yusuke Suzuki
Comment 23 2016-11-10 14:05:12 PST
Comment on attachment 294316 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=294316&action=review Still looking. Quick comments. > Source/JavaScriptCore/ChangeLog:52 > + This patch is an ~8% speedup on ES6SampleBench on my MBP. Awesome! > Source/JavaScriptCore/builtins/IteratorHelpers.js:26 > +function performIteration(iterable) It is useful if we can have a URL to the spec. > Source/JavaScriptCore/builtins/IteratorHelpers.js:43 > +} Nice! > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:3050 > + RELEASE_ASSERT(argv.size() == 1 || argv[argv.size() - 1]->index() == argv[argv.size() - 2]->index() - 1); OK. As the same to CallArguments. > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:3065 > + for (ElementNode* node = elements; node; node = node->next()) { > + if (node->value()->isSpreadExpression()) { > + ExpressionNode* expression = static_cast<SpreadExpressionNode*>(node->value())->expression(); > + RefPtr<RegisterID> tmp = newTemporary(); > + emitNode(tmp.get(), expression); > + > + emitOpcode(op_spread); > + instructions().append(argv[i].get()->index()); > + instructions().append(tmp.get()->index()); OK, the result of op_spread becomes the one element of op_new_array_with_spread. > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:3079 > + instructions().append(bitVectorIndex); And bit vector holds which element is produced from op_spread. > Source/JavaScriptCore/jit/AssemblyHelpers.h:1559 > MarkedSpace::Subspace& subspace = vm()->heap.template subspaceForObjectOfType<ClassType>(); Name of "subspaceForObjectOfType" is a bit weird. But the logic itself is OK. > Source/JavaScriptCore/runtime/ArrayIteratorAdaptiveWatchpoint.cpp:42 > + m_globalObject->arrayIteratorProtocolWatchpoint().fireAll(m_globalObject->vm(), StringFireDetail("Array iterator protocol changed.")); OK, once this adaptive watchpoint is fired, it fires the array iterator protocol watchpoint set. And it later throw away code assuming this invariant. > Source/JavaScriptCore/runtime/CommonSlowPaths.cpp:993 > + JSValue value = values[-i]; OK, registers are ordered like this. > Source/JavaScriptCore/runtime/CommonSlowPaths.cpp:1057 > + JSArray* array; > + { > + JSFunction* iterationFunction = globalObject->iteratorProtocolFunction(); > + CallData callData; > + CallType callType = JSC::getCallData(iterationFunction, callData); > + ASSERT(callType != CallType::None); > + > + MarkedArgumentBuffer arguments; > + arguments.append(iterable); > + JSValue arrayResult = call(exec, iterationFunction, callType, callData, jsNull(), arguments); > + CHECK_EXCEPTION(); > + array = jsCast<JSArray*>(arrayResult); > + } > + > + RETURN(JSFixedArray::createFromArray(exec, vm, globalObject, array)); Could you add a test for this case? > Source/JavaScriptCore/runtime/JSFixedArray.h:34 > + typedef JSCell Base; If we do not add any structure transition onto this, we can make this StrucutreIsImmortal. > Source/JavaScriptCore/runtime/JSFixedArray.h:48 > + unsigned length = array->length(); Let's create the `JSFixedArray* result` here. All the returned JSFixedArray is created in the same form `JSFixedArray* result = JSFixedArray::create(vm, globalObject->fixedArrayStructure(), length);`, right? And I think we can insert if (!length) return result; here. > Source/JavaScriptCore/runtime/JSFixedArray.h:50 > + JSFixedArray* result = JSFixedArray::create(vm, globalObject->fixedArrayStructure(), length); Move this to the prologue of this function. > Source/JavaScriptCore/runtime/JSFixedArray.h:60 > + JSFixedArray* result = JSFixedArray::create(vm, globalObject->fixedArrayStructure(), length); Move this to the prologue of this function. > Source/JavaScriptCore/runtime/JSFixedArray.h:70 > + if (!length) > + return JSFixedArray::create(vm, globalObject->fixedArrayStructure(), 0); Move this to the prologue of this function. > Source/JavaScriptCore/runtime/JSFixedArray.h:73 > + JSFixedArray* result = JSFixedArray::create(vm, globalObject->fixedArrayStructure(), length); Move this to the prologue of this function. > Source/JavaScriptCore/runtime/JSGlobalObject.cpp:943 > + PropertySlot slot(arrayIteratorPrototype, PropertySlot::InternalMethodType::Get); > + bool result = arrayIteratorPrototype->getOwnPropertySlot(arrayIteratorPrototype, exec, m_vm.propertyNames->next, slot); > + RELEASE_ASSERT(result); > + RELEASE_ASSERT(!scope.exception()); > + RELEASE_ASSERT(slot.isCacheableValue()); > + JSValue nextFunction = slot.getValue(exec, m_vm.propertyNames->next); > + RELEASE_ASSERT(!scope.exception()); > + RELEASE_ASSERT(jsDynamicCast<JSFunction*>(nextFunction)); > + > + ObjectPropertyCondition condition = generateConditionForSelfEquivalence(m_vm, nullptr, arrayIteratorPrototype, m_vm.propertyNames->next.impl()); > + RELEASE_ASSERT(condition.requiredValue() == nextFunction); > + > + bool isWatchable = condition.isWatchable(PropertyCondition::EnsureWatchability); > + RELEASE_ASSERT(isWatchable); // We allow this to install the necessary watchpoints. > + > + m_arrayIteratorPrototypeNext = std::make_unique<ArrayIteratorAdaptiveWatchpoint>(condition, this); > + m_arrayIteratorPrototypeNext->install(); > + } > + > + { > + ArrayPrototype* arrayPrototype = this->arrayPrototype(); > + PropertySlot slot(arrayPrototype, PropertySlot::InternalMethodType::Get); > + bool result = arrayPrototype->getOwnPropertySlot(arrayPrototype, exec, m_vm.propertyNames->iteratorSymbol, slot); > + RELEASE_ASSERT(result); > + RELEASE_ASSERT(slot.isCacheableValue()); > + RELEASE_ASSERT(!scope.exception()); > + JSValue symbolIteratorFunction = slot.getValue(exec, m_vm.propertyNames->iteratorSymbol); > + RELEASE_ASSERT(!scope.exception()); > + RELEASE_ASSERT(jsDynamicCast<JSFunction*>(symbolIteratorFunction)); > + > + ObjectPropertyCondition condition = generateConditionForSelfEquivalence(m_vm, nullptr, arrayPrototype, m_vm.propertyNames->iteratorSymbol.impl()); > + RELEASE_ASSERT(condition); > + RELEASE_ASSERT(condition.requiredValue() == symbolIteratorFunction); > + > + bool isWatchable = condition.isWatchable(PropertyCondition::EnsureWatchability); // We allow this to install necessary watchpoints. > + RELEASE_ASSERT(isWatchable); > + > + m_arrayPrototypeSymbolIteratorWatchpoint = std::make_unique<ArrayIteratorAdaptiveWatchpoint>(condition, this); > + m_arrayPrototypeSymbolIteratorWatchpoint->install(); These operations are duplicate. And it is quite useful. Could you extract this to a static function? > Source/JavaScriptCore/runtime/JSType.h:52 > + JSFixedArrayType, Nice.
Yusuke Suzuki
Comment 24 2016-11-10 15:28:29 PST
Comment on attachment 294316 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=294316&action=review r=me > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7002 > + cellResult(resultGPR, node); At some point, we can profile indexing type and emit only one code instead. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7046 > + Edge use = m_jit.graph().m_varArgChildren[node->firstChild() + i]; You can use m_jit.graph().varArgChild(node, i); > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7056 > + compileAllocateNewArrayWithSize(m_jit.graph().globalObjectFor(node->origin.semantic), resultGPR, lengthGPR, ArrayWithContiguous, shouldAllowForArrayStorageStructureForLargeArrays); We may have a chance to create non contiguous array version at some point. (Like, DoubleShape array). > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7069 > + Edge use = m_jit.graph().m_varArgChildren[node->firstChild() + i]; Ditto. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7103 > + } OK, copying. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7117 > + Edge use = m_jit.graph().m_varArgChildren[node->firstChild() + i]; Ditto. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4326 > + Edge use = m_graph.m_varArgChildren[m_node->firstChild() + i]; You can use m_graph.varArgChild(node, i); > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4339 > + Edge use = m_graph.m_varArgChildren[m_node->firstChild() + i]; You can use m_graph.varArgChild(node, i); > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4395 > + Edge use = m_graph.m_varArgChildren[m_node->firstChild() + i]; You can use m_graph.varArgChild(node, i);
Saam Barati
Comment 25 2016-11-10 21:47:51 PST
Comment on attachment 294316 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=294316&action=review >> Source/JavaScriptCore/builtins/IteratorHelpers.js:26 >> +function performIteration(iterable) > > It is useful if we can have a URL to the spec. Will do. >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7002 >> + cellResult(resultGPR, node); > > At some point, we can profile indexing type and emit only one code instead. Yeah. I have a bug open for that. >> Source/JavaScriptCore/runtime/ArrayIteratorAdaptiveWatchpoint.cpp:42 >> + m_globalObject->arrayIteratorProtocolWatchpoint().fireAll(m_globalObject->vm(), StringFireDetail("Array iterator protocol changed.")); > > OK, once this adaptive watchpoint is fired, it fires the array iterator protocol watchpoint set. And it later throw away code assuming this invariant. Yeah. I'll add a test for this. >> Source/JavaScriptCore/runtime/JSFixedArray.h:34 >> + typedef JSCell Base; > > If we do not add any structure transition onto this, we can make this StrucutreIsImmortal. Yeah I will do that and also make the Structure a member of the VM instead of the global object >> Source/JavaScriptCore/runtime/JSGlobalObject.cpp:943 >> + m_arrayPrototypeSymbolIteratorWatchpoint->install(); > > These operations are duplicate. And it is quite useful. Could you extract this to a static function? I'll make a local lambda to abstract the operation.
Saam Barati
Comment 26 2016-11-10 23:23:33 PST
Created attachment 294477 [details] patch for landing
WebKit Commit Bot
Comment 27 2016-11-10 23:25:56 PST
Attachment 294477 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:29: Alphabetical sorting problem. [build/include_order] [4] ERROR: Source/JavaScriptCore/bytecode/ObjectPropertyConditionSet.h:160: The parameter name "object" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 65 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Commit Bot
Comment 28 2016-11-11 00:36:41 PST
Comment on attachment 294477 [details] patch for landing Clearing flags on attachment: 294477 Committed r208584: <http://trac.webkit.org/changeset/208584>
WebKit Commit Bot
Comment 29 2016-11-11 00:36:47 PST
All reviewed patches have been landed. Closing bug.
Chris Dumez
Comment 30 2016-11-11 10:24:30 PST
Reverted r208584 for reason: Seems to have regressed Speedometer by 1% on Mac Committed r208592: <http://trac.webkit.org/changeset/208592>
Saam Barati
Comment 31 2016-11-11 11:00:23 PST
(In reply to comment #30) > Reverted r208584 for reason: > > Seems to have regressed Speedometer by 1% on Mac > > Committed r208592: <http://trac.webkit.org/changeset/208592> Ok. I'm looking into this.
Saam Barati
Comment 32 2016-11-11 11:07:05 PST
(In reply to comment #31) > (In reply to comment #30) > > Reverted r208584 for reason: > > > > Seems to have regressed Speedometer by 1% on Mac > > > > Committed r208592: <http://trac.webkit.org/changeset/208592> > > Ok. I'm looking into this. I can reproduce this locally.
Saam Barati
Comment 33 2016-11-11 18:23:30 PST
(In reply to comment #32) > (In reply to comment #31) > > (In reply to comment #30) > > > Reverted r208584 for reason: > > > > > > Seems to have regressed Speedometer by 1% on Mac > > > > > > Committed r208592: <http://trac.webkit.org/changeset/208592> > > > > Ok. I'm looking into this. > > I can reproduce this locally. I actually built two webkits, one with and one without the patch, to compare them locally, and I wasn't able to reproduce the regression. Also, speedometer doesn't ever emit a spread, so it's very unlikely that my patch is a regression. Also, since my patch does work in JSGlobalObject creation, I wrote a tiny benchmark which allocates JSGlobalObjects in a loop, and my patch isn't a slow down there either. So I'm going to reland and see what the perf bots think.
Saam Barati
Comment 34 2016-11-11 18:24:03 PST
Created attachment 294575 [details] patch for landing v2
WebKit Commit Bot
Comment 35 2016-11-11 18:26:27 PST
Attachment 294575 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:29: Alphabetical sorting problem. [build/include_order] [4] ERROR: Source/JavaScriptCore/bytecode/ObjectPropertyConditionSet.h:160: The parameter name "object" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 65 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Commit Bot
Comment 36 2016-11-11 19:01:18 PST
Comment on attachment 294575 [details] patch for landing v2 Clearing flags on attachment: 294575 Committed r208637: <http://trac.webkit.org/changeset/208637>
WebKit Commit Bot
Comment 37 2016-11-11 19:01:25 PST
All reviewed patches have been landed. Closing bug.
Chris Dumez
Comment 38 2016-11-12 09:33:16 PST
(In reply to comment #37) > All reviewed patches have been landed. Closing bug. No obvious regression the second time this landed, all clear.
Note You need to log in before you can comment on or make changes to this bug.