For example, we emit the entire iterator protocol inside bytecode. The byte code contains a loop with a few property accesses and function calls. This makes it effectively impossible to do smart things for code like this: ``` function foo(...args) { return bar(a, b, ...args); } ``` We should be able to describe such programs at a higher level. After speaking with Filip, we have an idea for describing what we want the IR to look like: ``` a: ... something b: ... something args: CreateRest(...) X: NewArrayWithSpread(@a, @b, @args, BitSet[0, 0, 1]) Y: CallVarargs(@callee, @this, @X) Z: Return(@Y) ``` Where the BitSet inside node X has a bit for each argument. If the bit is zero, it means we take the argument as-is. If the bit is 1, it means we perform a spread operation on it. Having this high-level IR will make doing escape analysis for the spreading of CreateRest much easier.
This proposal does not abide by JS semantics. For example, consider: [...f(), ...g()] The ordering should be r1 = call f spread r1 r2 = call g spread r2 After talking with Fil, we have a new design direction, which will take an argument, spread it, and produce a fixed buffer of JSValues. The buffer will be immutable, and we'll call it something like JSFixedArray or JSPureArray So we will now emit bytecode like this for this array: [...a, b, ...c] op_spread locA op_resolve locB op_spread locC op_new_array_cat locA, locB, locC, BitVector[1,0,1] Where the spread bytecode operation will create a JSFixedArray. In the DFG, we could transform these into something like ArrayClone() when we realize the input will be an array and that the array iterator protocol isn't observable.
I'm beginning work on this now.
Created attachment 293735 [details] WIP it begins
Created attachment 293736 [details] WIP
Created attachment 293842 [details] WIP More. I just started setting up the watchpoints we'll want to use as well as some initial DFG support.
Created attachment 293929 [details] WIP So this patch is a speedup for normal looking arrays, but a slowdown for all other things. I need to find a way to make it faster for non-arrays. It's about a 1-2% slowdown on ES6SampleBench/Basic It's about a 12-15% speedup on ES6SampleBench/Air
Created attachment 293958 [details] WIP Ok, now it's faster on both benchmarks in SampleBench. Now it's time to make the compiler smarter about Spread, and to make allocating a JSFixedArray not require a C call. After all this, we could probably make new_array_with_spread faster by dynamically allocating an array instead of calling out to C code as well. With the current WIP, we are ~7% faster on SampleBench. I think I can probably increase this speedup.
Created attachment 293974 [details] WIP Spread now has an inline allocation path in the FTL. Interestingly, this is making Air faster, but Basic slower. I bet I need to support double indexing types to make this faster for Basic.
Created attachment 294027 [details] WIP some more
Created attachment 294039 [details] WIP some more things for having a bad time.
Created attachment 294171 [details] WIP Almost done, just need to do clean up work now. It passes all JSC tests in release.
Created attachment 294204 [details] patch
I still need to write the 32-bit DFG implementation, but besides that, this patch is ready to be reviewed by folks.
Attachment 294204 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:29: Alphabetical sorting problem. [build/include_order] [4] ERROR: Source/JavaScriptCore/bytecode/ObjectPropertyConditionSet.h:160: The parameter name "object" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 53 files If any of these errors are false positives, please file a bug against check-webkit-style.
(In reply to comment #13) > I still need to write the 32-bit DFG implementation, but besides that, this > patch is ready to be reviewed by folks. Also, I'm writing some tests and micro benchmarks. But again, it's ready for some feedback.
Also, this new code is slower if spreading an array of size 0/1/2. It's about the same spreading an array of size 3. And it only gets faster and faster as we spread larger arrays. I think I opened a bug that will address this issue: https://bugs.webkit.org/show_bug.cgi?id=164536 My guess is that we're being hurt by the extra memcpy and the overall work being done surrounding the memcpy loop when spreading tiny arrays (like the extra allocation for JSFixedArray). I think it's probably worth fixing this in a follow-up, since this patch is already quite large. To see that this is really a pathology for tiny arrays: - spreading an array of size 6 is 50% faster - spreading an array of size 5 is 5% faster - spreading an array of size 500 is 2.5x faster.
Comment on attachment 294204 [details] patch Attachment 294204 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/2484860 New failing tests: js/basic-spread.html
Created attachment 294276 [details] Archive of layout-test-results from ews114 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-yosemite Platform: Mac OS X 10.10.5
Created attachment 294316 [details] patch Now with a 32-bit implementation.
Attachment 294316 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:29: Alphabetical sorting problem. [build/include_order] [4] ERROR: Source/JavaScriptCore/bytecode/ObjectPropertyConditionSet.h:160: The parameter name "object" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 59 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 294316 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=294316&action=review > Source/JavaScriptCore/ChangeLog:130 > + * runtime/ArrayIteratorAdaptiveWatchpoint.cpp: Added. Need to add this to CMakeLists.txt. > Source/JavaScriptCore/ChangeLog:140 > + * runtime/JSFixedArray.cpp: Added. Need to add this to CMakeLists.txt.
(In reply to comment #21) > Comment on attachment 294316 [details] > patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=294316&action=review > > > Source/JavaScriptCore/ChangeLog:130 > > + * runtime/ArrayIteratorAdaptiveWatchpoint.cpp: Added. > > Need to add this to CMakeLists.txt. > > > Source/JavaScriptCore/ChangeLog:140 > > + * runtime/JSFixedArray.cpp: Added. > > Need to add this to CMakeLists.txt. Yeah will fix.
Comment on attachment 294316 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=294316&action=review Still looking. Quick comments. > Source/JavaScriptCore/ChangeLog:52 > + This patch is an ~8% speedup on ES6SampleBench on my MBP. Awesome! > Source/JavaScriptCore/builtins/IteratorHelpers.js:26 > +function performIteration(iterable) It is useful if we can have a URL to the spec. > Source/JavaScriptCore/builtins/IteratorHelpers.js:43 > +} Nice! > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:3050 > + RELEASE_ASSERT(argv.size() == 1 || argv[argv.size() - 1]->index() == argv[argv.size() - 2]->index() - 1); OK. As the same to CallArguments. > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:3065 > + for (ElementNode* node = elements; node; node = node->next()) { > + if (node->value()->isSpreadExpression()) { > + ExpressionNode* expression = static_cast<SpreadExpressionNode*>(node->value())->expression(); > + RefPtr<RegisterID> tmp = newTemporary(); > + emitNode(tmp.get(), expression); > + > + emitOpcode(op_spread); > + instructions().append(argv[i].get()->index()); > + instructions().append(tmp.get()->index()); OK, the result of op_spread becomes the one element of op_new_array_with_spread. > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:3079 > + instructions().append(bitVectorIndex); And bit vector holds which element is produced from op_spread. > Source/JavaScriptCore/jit/AssemblyHelpers.h:1559 > MarkedSpace::Subspace& subspace = vm()->heap.template subspaceForObjectOfType<ClassType>(); Name of "subspaceForObjectOfType" is a bit weird. But the logic itself is OK. > Source/JavaScriptCore/runtime/ArrayIteratorAdaptiveWatchpoint.cpp:42 > + m_globalObject->arrayIteratorProtocolWatchpoint().fireAll(m_globalObject->vm(), StringFireDetail("Array iterator protocol changed.")); OK, once this adaptive watchpoint is fired, it fires the array iterator protocol watchpoint set. And it later throw away code assuming this invariant. > Source/JavaScriptCore/runtime/CommonSlowPaths.cpp:993 > + JSValue value = values[-i]; OK, registers are ordered like this. > Source/JavaScriptCore/runtime/CommonSlowPaths.cpp:1057 > + JSArray* array; > + { > + JSFunction* iterationFunction = globalObject->iteratorProtocolFunction(); > + CallData callData; > + CallType callType = JSC::getCallData(iterationFunction, callData); > + ASSERT(callType != CallType::None); > + > + MarkedArgumentBuffer arguments; > + arguments.append(iterable); > + JSValue arrayResult = call(exec, iterationFunction, callType, callData, jsNull(), arguments); > + CHECK_EXCEPTION(); > + array = jsCast<JSArray*>(arrayResult); > + } > + > + RETURN(JSFixedArray::createFromArray(exec, vm, globalObject, array)); Could you add a test for this case? > Source/JavaScriptCore/runtime/JSFixedArray.h:34 > + typedef JSCell Base; If we do not add any structure transition onto this, we can make this StrucutreIsImmortal. > Source/JavaScriptCore/runtime/JSFixedArray.h:48 > + unsigned length = array->length(); Let's create the `JSFixedArray* result` here. All the returned JSFixedArray is created in the same form `JSFixedArray* result = JSFixedArray::create(vm, globalObject->fixedArrayStructure(), length);`, right? And I think we can insert if (!length) return result; here. > Source/JavaScriptCore/runtime/JSFixedArray.h:50 > + JSFixedArray* result = JSFixedArray::create(vm, globalObject->fixedArrayStructure(), length); Move this to the prologue of this function. > Source/JavaScriptCore/runtime/JSFixedArray.h:60 > + JSFixedArray* result = JSFixedArray::create(vm, globalObject->fixedArrayStructure(), length); Move this to the prologue of this function. > Source/JavaScriptCore/runtime/JSFixedArray.h:70 > + if (!length) > + return JSFixedArray::create(vm, globalObject->fixedArrayStructure(), 0); Move this to the prologue of this function. > Source/JavaScriptCore/runtime/JSFixedArray.h:73 > + JSFixedArray* result = JSFixedArray::create(vm, globalObject->fixedArrayStructure(), length); Move this to the prologue of this function. > Source/JavaScriptCore/runtime/JSGlobalObject.cpp:943 > + PropertySlot slot(arrayIteratorPrototype, PropertySlot::InternalMethodType::Get); > + bool result = arrayIteratorPrototype->getOwnPropertySlot(arrayIteratorPrototype, exec, m_vm.propertyNames->next, slot); > + RELEASE_ASSERT(result); > + RELEASE_ASSERT(!scope.exception()); > + RELEASE_ASSERT(slot.isCacheableValue()); > + JSValue nextFunction = slot.getValue(exec, m_vm.propertyNames->next); > + RELEASE_ASSERT(!scope.exception()); > + RELEASE_ASSERT(jsDynamicCast<JSFunction*>(nextFunction)); > + > + ObjectPropertyCondition condition = generateConditionForSelfEquivalence(m_vm, nullptr, arrayIteratorPrototype, m_vm.propertyNames->next.impl()); > + RELEASE_ASSERT(condition.requiredValue() == nextFunction); > + > + bool isWatchable = condition.isWatchable(PropertyCondition::EnsureWatchability); > + RELEASE_ASSERT(isWatchable); // We allow this to install the necessary watchpoints. > + > + m_arrayIteratorPrototypeNext = std::make_unique<ArrayIteratorAdaptiveWatchpoint>(condition, this); > + m_arrayIteratorPrototypeNext->install(); > + } > + > + { > + ArrayPrototype* arrayPrototype = this->arrayPrototype(); > + PropertySlot slot(arrayPrototype, PropertySlot::InternalMethodType::Get); > + bool result = arrayPrototype->getOwnPropertySlot(arrayPrototype, exec, m_vm.propertyNames->iteratorSymbol, slot); > + RELEASE_ASSERT(result); > + RELEASE_ASSERT(slot.isCacheableValue()); > + RELEASE_ASSERT(!scope.exception()); > + JSValue symbolIteratorFunction = slot.getValue(exec, m_vm.propertyNames->iteratorSymbol); > + RELEASE_ASSERT(!scope.exception()); > + RELEASE_ASSERT(jsDynamicCast<JSFunction*>(symbolIteratorFunction)); > + > + ObjectPropertyCondition condition = generateConditionForSelfEquivalence(m_vm, nullptr, arrayPrototype, m_vm.propertyNames->iteratorSymbol.impl()); > + RELEASE_ASSERT(condition); > + RELEASE_ASSERT(condition.requiredValue() == symbolIteratorFunction); > + > + bool isWatchable = condition.isWatchable(PropertyCondition::EnsureWatchability); // We allow this to install necessary watchpoints. > + RELEASE_ASSERT(isWatchable); > + > + m_arrayPrototypeSymbolIteratorWatchpoint = std::make_unique<ArrayIteratorAdaptiveWatchpoint>(condition, this); > + m_arrayPrototypeSymbolIteratorWatchpoint->install(); These operations are duplicate. And it is quite useful. Could you extract this to a static function? > Source/JavaScriptCore/runtime/JSType.h:52 > + JSFixedArrayType, Nice.
Comment on attachment 294316 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=294316&action=review r=me > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7002 > + cellResult(resultGPR, node); At some point, we can profile indexing type and emit only one code instead. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7046 > + Edge use = m_jit.graph().m_varArgChildren[node->firstChild() + i]; You can use m_jit.graph().varArgChild(node, i); > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7056 > + compileAllocateNewArrayWithSize(m_jit.graph().globalObjectFor(node->origin.semantic), resultGPR, lengthGPR, ArrayWithContiguous, shouldAllowForArrayStorageStructureForLargeArrays); We may have a chance to create non contiguous array version at some point. (Like, DoubleShape array). > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7069 > + Edge use = m_jit.graph().m_varArgChildren[node->firstChild() + i]; Ditto. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7103 > + } OK, copying. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7117 > + Edge use = m_jit.graph().m_varArgChildren[node->firstChild() + i]; Ditto. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4326 > + Edge use = m_graph.m_varArgChildren[m_node->firstChild() + i]; You can use m_graph.varArgChild(node, i); > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4339 > + Edge use = m_graph.m_varArgChildren[m_node->firstChild() + i]; You can use m_graph.varArgChild(node, i); > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4395 > + Edge use = m_graph.m_varArgChildren[m_node->firstChild() + i]; You can use m_graph.varArgChild(node, i);
Comment on attachment 294316 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=294316&action=review >> Source/JavaScriptCore/builtins/IteratorHelpers.js:26 >> +function performIteration(iterable) > > It is useful if we can have a URL to the spec. Will do. >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7002 >> + cellResult(resultGPR, node); > > At some point, we can profile indexing type and emit only one code instead. Yeah. I have a bug open for that. >> Source/JavaScriptCore/runtime/ArrayIteratorAdaptiveWatchpoint.cpp:42 >> + m_globalObject->arrayIteratorProtocolWatchpoint().fireAll(m_globalObject->vm(), StringFireDetail("Array iterator protocol changed.")); > > OK, once this adaptive watchpoint is fired, it fires the array iterator protocol watchpoint set. And it later throw away code assuming this invariant. Yeah. I'll add a test for this. >> Source/JavaScriptCore/runtime/JSFixedArray.h:34 >> + typedef JSCell Base; > > If we do not add any structure transition onto this, we can make this StrucutreIsImmortal. Yeah I will do that and also make the Structure a member of the VM instead of the global object >> Source/JavaScriptCore/runtime/JSGlobalObject.cpp:943 >> + m_arrayPrototypeSymbolIteratorWatchpoint->install(); > > These operations are duplicate. And it is quite useful. Could you extract this to a static function? I'll make a local lambda to abstract the operation.
Created attachment 294477 [details] patch for landing
Attachment 294477 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:29: Alphabetical sorting problem. [build/include_order] [4] ERROR: Source/JavaScriptCore/bytecode/ObjectPropertyConditionSet.h:160: The parameter name "object" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 65 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 294477 [details] patch for landing Clearing flags on attachment: 294477 Committed r208584: <http://trac.webkit.org/changeset/208584>
All reviewed patches have been landed. Closing bug.
Reverted r208584 for reason: Seems to have regressed Speedometer by 1% on Mac Committed r208592: <http://trac.webkit.org/changeset/208592>
(In reply to comment #30) > Reverted r208584 for reason: > > Seems to have regressed Speedometer by 1% on Mac > > Committed r208592: <http://trac.webkit.org/changeset/208592> Ok. I'm looking into this.
(In reply to comment #31) > (In reply to comment #30) > > Reverted r208584 for reason: > > > > Seems to have regressed Speedometer by 1% on Mac > > > > Committed r208592: <http://trac.webkit.org/changeset/208592> > > Ok. I'm looking into this. I can reproduce this locally.
(In reply to comment #32) > (In reply to comment #31) > > (In reply to comment #30) > > > Reverted r208584 for reason: > > > > > > Seems to have regressed Speedometer by 1% on Mac > > > > > > Committed r208592: <http://trac.webkit.org/changeset/208592> > > > > Ok. I'm looking into this. > > I can reproduce this locally. I actually built two webkits, one with and one without the patch, to compare them locally, and I wasn't able to reproduce the regression. Also, speedometer doesn't ever emit a spread, so it's very unlikely that my patch is a regression. Also, since my patch does work in JSGlobalObject creation, I wrote a tiny benchmark which allocates JSGlobalObjects in a loop, and my patch isn't a slow down there either. So I'm going to reland and see what the perf bots think.
Created attachment 294575 [details] patch for landing v2
Attachment 294575 [details] did not pass style-queue: ERROR: Source/JavaScriptCore/runtime/JSGlobalObjectInlines.h:29: Alphabetical sorting problem. [build/include_order] [4] ERROR: Source/JavaScriptCore/bytecode/ObjectPropertyConditionSet.h:160: The parameter name "object" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 65 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 294575 [details] patch for landing v2 Clearing flags on attachment: 294575 Committed r208637: <http://trac.webkit.org/changeset/208637>
(In reply to comment #37) > All reviewed patches have been landed. Closing bug. No obvious regression the second time this landed, all clear.