Summary: | [DFG] Add ArrayIndexOf intrinsic | ||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Yusuke Suzuki <ysuzuki> | ||||||||||||||||||||||||||
Component: | JavaScriptCore | Assignee: | Yusuke Suzuki <ysuzuki> | ||||||||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||||||||
Severity: | Normal | CC: | buildbot, clopez, commit-queue, fpizlo, jlewis3, keith_miller, mark.lam, msaboff, ossy, ryanhaddad, saam | ||||||||||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||||||||||
Version: | WebKit Nightly Build | ||||||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||||||
Bug Depends on: | 173259 | ||||||||||||||||||||||||||||
Bug Blocks: | 173157 | ||||||||||||||||||||||||||||
Attachments: |
|
Description
Yusuke Suzuki
2017-05-20 15:32:35 PDT
Created attachment 311966 [details]
Patch
Int32/Double
Created attachment 311967 [details]
Patch
Int32/Double
Created attachment 311968 [details]
Patch
Int32/Double
Created attachment 312105 [details]
WIP
WIP, adding test cases right now and checking the logic. And I would like to see various perf results. (I think indexOf is also used in Octane / Kraken. idk). With updated ARES-6 Babylon, we see 11% perf win in steady state. It's good. Since Array#indexOf is frequently used API, optimizing it carefuly is super nice. Running... Babylon ( 1 to go) firstIteration: 46.09 +- 4.05 ms averageWorstCase: 24.06 +- 1.34 ms steadyState: 8.39 +- 0.27 ms Running... Babylon ( 1 to go) firstIteration: 44.09 +- 4.82 ms averageWorstCase: 23.07 +- 3.41 ms steadyState: 7.55 +- 0.41 ms And some of SixSpeed using indexOf also show significant perf win. baseline patched map-set-lookup.es5 687.6649+-8.2699 ^ 174.2516+-6.6156 ^ definitely 3.9464x faster map-set.es5 42.8671+-2.5895 ^ 33.6080+-0.8809 ^ definitely 1.2755x faster map-set-object.es5 63.6580+-1.2433 ^ 49.1197+-0.8750 ^ definitely 1.2960x faster Comment on attachment 312105 [details] WIP Attachment 312105 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/3883366 New failing tests: imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.html webgl/1.0.2/conformance/attribs/gl-vertexattribpointer.html imported/w3c/web-platform-tests/dom/nodes/Node-compareDocumentPosition.html webgl/1.0.3/conformance/extensions/angle-instanced-arrays-out-of-bounds.html Created attachment 312118 [details]
Archive of layout-test-results from ews117 for mac-elcapitan
The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews117 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Created attachment 312215 [details]
Patch
OK, it works. Early reviews are welcome
Comment on attachment 312215 [details] Patch Attachment 312215 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/3889768 New failing tests: imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.html webgl/1.0.2/conformance/attribs/gl-vertexattribpointer.html imported/w3c/web-platform-tests/dom/nodes/Node-compareDocumentPosition.html webgl/1.0.3/conformance/extensions/angle-instanced-arrays-out-of-bounds.html Created attachment 312236 [details]
Archive of layout-test-results from ews116 for mac-elcapitan
The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews116 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Created attachment 312310 [details]
Patch
Created attachment 312311 [details]
Patch
Comment on attachment 312311 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312311&action=review > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2398 > + if (!arrayMode.isJSArray()) > + return false; > + > + if (arrayMode.arrayClass() != Array::OriginalArray) > + return false; One possible additional check is `doesConversion()`. By checking it, we can avoid emitting Arrayify/ArrayifyToStructure. But, for example, In operation just uses blessArrayOperation, and it may involve Arrayify. What do you think of? I think adding if (arrayMode.doesConversion()) return fasle; would be nice here. Created attachment 312442 [details]
Patch
Comment on attachment 312442 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312442&action=review > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2398 > + if (arrayMode.arrayClass() != Array::OriginalArray) > + return false; Please add a test for this. Perhaps also a test where you set "indexOf" on base array. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2413 > + // FIXME: We could easily relax the Array/Object.prototype transition as long as we OSR exitted if we saw a hole. Please file a bug or remove > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2421 > + m_graph.watchpoints().addLazily(globalObject->havingABadTimeWatchpoint()); > + m_graph.watchpoints().addLazily(arrayPrototypeTransition); > + m_graph.watchpoints().addLazily(objectPrototypeTransition); Please add tests for all three of these. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2430 > + addVarArgChild(addToGraph(GetButterfly, array)); Why do we do this here if FixupPhase will insert a GetButterfly inside blessArrayOperation? > Source/JavaScriptCore/dfg/DFGClobberize.h:547 > + case ArrayIndexOf: { > + read(MiscFields); > + read(JSCell_indexingType); > + read(JSCell_structureID); > + read(JSObject_butterfly); > + read(Butterfly_publicLength); Maybe in a follow up, but we should write a CSE rule for this. > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1027 > + Edge& child1 = m_graph.varArgChild(node, 0); lets give this a better name > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1030 > + fixEdge<KnownCellUse>(child1); How is this KnownCell? > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1034 > + Edge& child2 = m_graph.varArgChild(node, 1); lets give this a better name > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1046 > + case Array::Double: { > + if (child2->shouldSpeculateNumber()) > + fixEdge<DoubleRepUse>(child2); > + break; > + } > + case Array::Int32: { > + if (child2->shouldSpeculateInt32()) > + fixEdge<Int32Use>(child2); > + break; > + } If you wanted to be really clever, you could constant fold these operations to return -1 if you prove that child2 is not a number. However, I think a program that does this is probably uncommon. > Source/JavaScriptCore/dfg/DFGOperations.cpp:1939 > + if (value.isObject() && asObject(value) == search) Why do you need to check isObject here? Why can't you just compare the bits since object equality is just bit equality > Source/JavaScriptCore/dfg/DFGOperations.cpp:1979 > + auto data = butterfly->contiguousDouble().data(); I think it'd be helpful to have the type here. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7495 > + auto holeValue = m_jit.branchTest64(MacroAssembler::Zero, tempGPR); Maybe it'd be faster if we didn't unbox child2? You could remove a branch this way by not having to check for a hole. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7568 > + case ObjectUse: { > + SpeculateCellOperand target(this, child2); > + > + GPRReg targetGPR = target.gpr(); > + > + speculateObject(child2, targetGPR); > + > + flushRegisters(); > + > + callOperation(operationArrayIndexOfObject, lengthGPR, storageGPR, targetGPR, indexGPR); > + m_jit.exceptionCheck(); > + > + int32Result(lengthGPR, node); > + break; > + } I think this can be almost an identical loop to the Int32 case. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4126 > + case ObjectUse: > + setInt32(vmCall(Int32, m_out.operation(operationArrayIndexOfObject), m_callFrame, storage, lowObject(child2), startIndex)); > + break; Ditto here Comment on attachment 312442 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312442&action=review Thank you! >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2398 >> + return false; > > Please add a test for this. Perhaps also a test where you set "indexOf" on base array. Added. >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2413 >> + // FIXME: We could easily relax the Array/Object.prototype transition as long as we OSR exitted if we saw a hole. > > Please file a bug or remove Filed in https://bugs.webkit.org/show_bug.cgi?id=173171. >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2421 >> + m_graph.watchpoints().addLazily(objectPrototypeTransition); > > Please add tests for all three of these. OK, added. >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2430 >> + addVarArgChild(addToGraph(GetButterfly, array)); > > Why do we do this here if FixupPhase will insert a GetButterfly inside blessArrayOperation? Ah, yeah, we can just insert empty node (nullptr) here. Fixed. And since we only emit ArrayIndexOf with Array::{Int32, Double, Contiguous) cases, blessArrayOperation always insert GetButterfly in Fixup. >> Source/JavaScriptCore/dfg/DFGClobberize.h:547 >> + read(Butterfly_publicLength); > > Maybe in a follow up, but we should write a CSE rule for this. Filed. https://bugs.webkit.org/show_bug.cgi?id=173173 >> Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1027 >> + Edge& child1 = m_graph.varArgChild(node, 0); > > lets give this a better name Renamed it to array. >> Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1030 >> + fixEdge<KnownCellUse>(child1); > > How is this KnownCell? Since blessArrayOperation ensures it by emitting CheckStructure or CheckArray. >> Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1034 >> + Edge& child2 = m_graph.varArgChild(node, 1); > > lets give this a better name Renamed it to searchElement. >> Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1046 >> + } > > If you wanted to be really clever, you could constant fold these operations to return -1 if you prove that child2 is not a number. However, I think a program that does this is probably uncommon. By emitting various speculation edges (like, CellUse, BooleanUse, OtherUse), we can constant fold this. Maybe, the priority is a bit low because it is typically uncommon. I'll file this in a separate bug. Filed. https://bugs.webkit.org/show_bug.cgi?id=173176 >> Source/JavaScriptCore/dfg/DFGOperations.cpp:1939 >> + if (value.isObject() && asObject(value) == search) > > Why do you need to check isObject here? Why can't you just compare the bits since object equality is just bit equality Yeah, that's fine. But as you pointed, we can emit the code in DFG code directly for that. Fixed. >> Source/JavaScriptCore/dfg/DFGOperations.cpp:1979 >> + auto data = butterfly->contiguousDouble().data(); > > I think it'd be helpful to have the type here. Changed this `auto` to `const double*`. >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7495 >> + auto holeValue = m_jit.branchTest64(MacroAssembler::Zero, tempGPR); > > Maybe it'd be faster if we didn't unbox child2? You could remove a branch this way by not having to check for a hole. Yeah. For 64bit environment, we can drop one branch. I think it would be nice. Anyway, we need to speculate Int32 here because the comming value can be Double (but content is int32). So ManualSpeculation is one possible implemenation. Fixing... >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7568 >> + } > > I think this can be almost an identical loop to the Int32 case. Cool. Fixed. >> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4126 >> + break; > > Ditto here Fixed. Created attachment 312479 [details]
Patch
Created attachment 312485 [details]
Patch
32bit fix
Comment on attachment 312485 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312485&action=review r=me with a few comments. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7516 > + case DoubleRepUse: { Nit: Worth an assertion that the array mode is double here. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7556 > + case ObjectUse: { Style nit: This is almost identical to the int32 loop above (besides the 32-bit tag code). Maybe you can abstract it? > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4097 > + LValue index = m_out.phi(Int32, initialStartIndex); Nit: It might be worth doing 64 bit math and having the zeroExtend outside the loop. (Only if we actually end up emitting machine instructions to zero extend inside the loop) > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4098 > + ValueFromBlock notFoundResult = m_out.anchor(m_out.constInt32(-1)); What code do we generate here? Do we end up emitting a Mov, -1, Reg, inside the loop every time? If so, it might be faster to branch to a block that sets not found result. Comment on attachment 312485 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312485&action=review >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7516 >> + case DoubleRepUse: { > > Nit: Worth an assertion that the array mode is double here. Also, same for the FTLLower code, and maybe also worth doing for Int32 use kind Comment on attachment 312485 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312485&action=review Thank you! >>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7516 >>> + case DoubleRepUse: { >> >> Nit: Worth an assertion that the array mode is double here. > > Also, same for the FTLLower code, and maybe also worth doing for Int32 use kind That's fine. Added. >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7556 >> + case ObjectUse: { > > Style nit: This is almost identical to the int32 loop above (besides the 32-bit tag code). Maybe you can abstract it? OK, fixed. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4087 > + searchElement = boxInt32(lowInt32(searchElementEdge)); I also changed it to speculate(searchElementEdge); searchElement = lowJSValue(searchElementEdge, ManualOperandSpeculation); This generates a bit efficient code. >> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4097 >> + LValue index = m_out.phi(Int32, initialStartIndex); > > Nit: It might be worth doing 64 bit math and having the zeroExtend outside the loop. (Only if we actually end up emitting machine instructions to zero extend inside the loop) Yeah, I changed it to perform zeroExtend out of the loop and use it as pointerType(). And finally, I performed castToInt32 to setInt32. It works fine. >> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4098 >> + ValueFromBlock notFoundResult = m_out.anchor(m_out.constInt32(-1)); > > What code do we generate here? Do we end up emitting a Mov, -1, Reg, inside the loop every time? If so, it might be faster to branch to a block that sets not found result. Yeah, I fixed it by moving it out of the loop. Committed r218084: <http://trac.webkit.org/changeset/218084> (In reply to Yusuke Suzuki from comment #22) > Committed r218084: <http://trac.webkit.org/changeset/218084> It broke the AArch64 (at least Linux) build: https://build.webkit.org/builders/JSCOnly%20Linux%20AArch64%20Release/builds/991 ../../Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp: In lambda function: ../../Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7518:159: error: no matching function for call to 'JSC::DFG::JITCompiler::branch64(JSC::MacroAssemblerARM64::RelationalCondition, JSC::AbstractMacroAssembler<JSC::ARM64Assembler, JSC::MacroAssemblerARM64>::BaseIndex, JSC::GPRReg&)' ../../Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7518:159: note: candidates are: ... note: I just noticed the build failure, no time and plan to fix it myself. Committed r218093: <http://trac.webkit.org/changeset/218093> Re-opened since this is blocked by bug 173259 Committed r218099: <http://trac.webkit.org/changeset/218099> 32bit bot fails with assertion. But it looks to me that this assertion is not correct. https://bugs.webkit.org/show_bug.cgi?id=112380 Code is like, SpeculateInt32Operand searchElement(this, searchElementEdge); GPRReg searchElementGPR = searchElement.gpr(); m_jit.zeroExtend32ToPtr(lengthGPR, lengthGPR); m_jit.zeroExtend32ToPtr(indexGPR, indexGPR); auto loop = m_jit.label(); ... m_jit.jump().linkTo(loop, &m_jit); The problem is that m_jit.zeroExtend32ToPtr() becomes empty in 32bit environment. And it accidentally causes assertion failure of DFG_REGISTER_ALLOCATION_VALIDATION. However, I think the above code is valid. And this assertion should not be fired ideally. I'll put clearRegisterAllocationOffsets() for now. Committed r218113: <http://trac.webkit.org/changeset/218113> (In reply to Yusuke Suzuki from comment #28) > Committed r218113: <http://trac.webkit.org/changeset/218113> This took care of half of them, but there are still 20 tests failing on 32-bit JSC bots: https://build.webkit.org/builders/Apple%20Sierra%2032-bit%20JSC%20%28BuildAndTest%29/builds/952 Committed r218312: <http://trac.webkit.org/changeset/218312> (In reply to Ryan Haddad from comment #29) > (In reply to Yusuke Suzuki from comment #28) > > Committed r218113: <http://trac.webkit.org/changeset/218113> > > This took care of half of them, but there are still 20 tests failing on > 32-bit JSC bots: > https://build.webkit.org/builders/Apple%20Sierra%2032- > bit%20JSC%20%28BuildAndTest%29/builds/952 OK, now it is recovered. |