RESOLVED FIXED 172421
[DFG] Add ArrayIndexOf intrinsic
https://bugs.webkit.org/show_bug.cgi?id=172421
Summary [DFG] Add ArrayIndexOf intrinsic
Yusuke Suzuki
Reported 2017-05-20 15:32:35 PDT
By doing so, we can easily optimize comparison code. Since I see this pattern in SixSpeed and ARES-6 Babylon, currently, I'll just attempt to specialize String and Object. It is worth adding for the other types like Int32, Double.
Attachments
Patch (28.10 KB, patch)
2017-06-04 11:06 PDT, Yusuke Suzuki
no flags
Patch (29.19 KB, patch)
2017-06-04 11:40 PDT, Yusuke Suzuki
no flags
Patch (29.11 KB, patch)
2017-06-04 12:08 PDT, Yusuke Suzuki
no flags
WIP (40.72 KB, patch)
2017-06-06 12:20 PDT, Yusuke Suzuki
no flags
Archive of layout-test-results from ews117 for mac-elcapitan (2.16 MB, application/zip)
2017-06-06 13:58 PDT, Build Bot
no flags
Patch (45.43 KB, patch)
2017-06-07 13:17 PDT, Yusuke Suzuki
no flags
Archive of layout-test-results from ews116 for mac-elcapitan (2.24 MB, application/zip)
2017-06-07 15:01 PDT, Build Bot
no flags
Patch (48.01 KB, patch)
2017-06-08 09:25 PDT, Yusuke Suzuki
no flags
Patch (48.01 KB, patch)
2017-06-08 09:28 PDT, Yusuke Suzuki
no flags
Patch (52.14 KB, patch)
2017-06-09 09:39 PDT, Yusuke Suzuki
no flags
Patch (60.65 KB, patch)
2017-06-09 13:22 PDT, Yusuke Suzuki
no flags
Patch (60.91 KB, patch)
2017-06-09 13:39 PDT, Yusuke Suzuki
saam: review+
Yusuke Suzuki
Comment 1 2017-06-04 11:06:10 PDT
Created attachment 311966 [details] Patch Int32/Double
Yusuke Suzuki
Comment 2 2017-06-04 11:40:02 PDT
Created attachment 311967 [details] Patch Int32/Double
Yusuke Suzuki
Comment 3 2017-06-04 12:08:25 PDT
Created attachment 311968 [details] Patch Int32/Double
Yusuke Suzuki
Comment 4 2017-06-06 12:20:12 PDT
Yusuke Suzuki
Comment 5 2017-06-06 12:27:21 PDT
WIP, adding test cases right now and checking the logic. And I would like to see various perf results. (I think indexOf is also used in Octane / Kraken. idk). With updated ARES-6 Babylon, we see 11% perf win in steady state. It's good. Since Array#indexOf is frequently used API, optimizing it carefuly is super nice. Running... Babylon ( 1 to go) firstIteration: 46.09 +- 4.05 ms averageWorstCase: 24.06 +- 1.34 ms steadyState: 8.39 +- 0.27 ms Running... Babylon ( 1 to go) firstIteration: 44.09 +- 4.82 ms averageWorstCase: 23.07 +- 3.41 ms steadyState: 7.55 +- 0.41 ms And some of SixSpeed using indexOf also show significant perf win. baseline patched map-set-lookup.es5 687.6649+-8.2699 ^ 174.2516+-6.6156 ^ definitely 3.9464x faster map-set.es5 42.8671+-2.5895 ^ 33.6080+-0.8809 ^ definitely 1.2755x faster map-set-object.es5 63.6580+-1.2433 ^ 49.1197+-0.8750 ^ definitely 1.2960x faster
Build Bot
Comment 6 2017-06-06 13:58:35 PDT
Comment on attachment 312105 [details] WIP Attachment 312105 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/3883366 New failing tests: imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.html webgl/1.0.2/conformance/attribs/gl-vertexattribpointer.html imported/w3c/web-platform-tests/dom/nodes/Node-compareDocumentPosition.html webgl/1.0.3/conformance/extensions/angle-instanced-arrays-out-of-bounds.html
Build Bot
Comment 7 2017-06-06 13:58:37 PDT
Created attachment 312118 [details] Archive of layout-test-results from ews117 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews117 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Yusuke Suzuki
Comment 8 2017-06-07 13:17:48 PDT
Created attachment 312215 [details] Patch OK, it works. Early reviews are welcome
Build Bot
Comment 9 2017-06-07 15:01:09 PDT
Comment on attachment 312215 [details] Patch Attachment 312215 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/3889768 New failing tests: imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.html webgl/1.0.2/conformance/attribs/gl-vertexattribpointer.html imported/w3c/web-platform-tests/dom/nodes/Node-compareDocumentPosition.html webgl/1.0.3/conformance/extensions/angle-instanced-arrays-out-of-bounds.html
Build Bot
Comment 10 2017-06-07 15:01:10 PDT
Created attachment 312236 [details] Archive of layout-test-results from ews116 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews116 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Yusuke Suzuki
Comment 11 2017-06-08 09:25:40 PDT
Yusuke Suzuki
Comment 12 2017-06-08 09:28:52 PDT
Yusuke Suzuki
Comment 13 2017-06-09 08:07:02 PDT
Comment on attachment 312311 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312311&action=review > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2398 > + if (!arrayMode.isJSArray()) > + return false; > + > + if (arrayMode.arrayClass() != Array::OriginalArray) > + return false; One possible additional check is `doesConversion()`. By checking it, we can avoid emitting Arrayify/ArrayifyToStructure. But, for example, In operation just uses blessArrayOperation, and it may involve Arrayify. What do you think of? I think adding if (arrayMode.doesConversion()) return fasle; would be nice here.
Yusuke Suzuki
Comment 14 2017-06-09 09:39:01 PDT
Saam Barati
Comment 15 2017-06-09 11:19:37 PDT
Comment on attachment 312442 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312442&action=review > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2398 > + if (arrayMode.arrayClass() != Array::OriginalArray) > + return false; Please add a test for this. Perhaps also a test where you set "indexOf" on base array. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2413 > + // FIXME: We could easily relax the Array/Object.prototype transition as long as we OSR exitted if we saw a hole. Please file a bug or remove > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2421 > + m_graph.watchpoints().addLazily(globalObject->havingABadTimeWatchpoint()); > + m_graph.watchpoints().addLazily(arrayPrototypeTransition); > + m_graph.watchpoints().addLazily(objectPrototypeTransition); Please add tests for all three of these. > Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2430 > + addVarArgChild(addToGraph(GetButterfly, array)); Why do we do this here if FixupPhase will insert a GetButterfly inside blessArrayOperation? > Source/JavaScriptCore/dfg/DFGClobberize.h:547 > + case ArrayIndexOf: { > + read(MiscFields); > + read(JSCell_indexingType); > + read(JSCell_structureID); > + read(JSObject_butterfly); > + read(Butterfly_publicLength); Maybe in a follow up, but we should write a CSE rule for this. > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1027 > + Edge& child1 = m_graph.varArgChild(node, 0); lets give this a better name > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1030 > + fixEdge<KnownCellUse>(child1); How is this KnownCell? > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1034 > + Edge& child2 = m_graph.varArgChild(node, 1); lets give this a better name > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1046 > + case Array::Double: { > + if (child2->shouldSpeculateNumber()) > + fixEdge<DoubleRepUse>(child2); > + break; > + } > + case Array::Int32: { > + if (child2->shouldSpeculateInt32()) > + fixEdge<Int32Use>(child2); > + break; > + } If you wanted to be really clever, you could constant fold these operations to return -1 if you prove that child2 is not a number. However, I think a program that does this is probably uncommon. > Source/JavaScriptCore/dfg/DFGOperations.cpp:1939 > + if (value.isObject() && asObject(value) == search) Why do you need to check isObject here? Why can't you just compare the bits since object equality is just bit equality > Source/JavaScriptCore/dfg/DFGOperations.cpp:1979 > + auto data = butterfly->contiguousDouble().data(); I think it'd be helpful to have the type here. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7495 > + auto holeValue = m_jit.branchTest64(MacroAssembler::Zero, tempGPR); Maybe it'd be faster if we didn't unbox child2? You could remove a branch this way by not having to check for a hole. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7568 > + case ObjectUse: { > + SpeculateCellOperand target(this, child2); > + > + GPRReg targetGPR = target.gpr(); > + > + speculateObject(child2, targetGPR); > + > + flushRegisters(); > + > + callOperation(operationArrayIndexOfObject, lengthGPR, storageGPR, targetGPR, indexGPR); > + m_jit.exceptionCheck(); > + > + int32Result(lengthGPR, node); > + break; > + } I think this can be almost an identical loop to the Int32 case. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4126 > + case ObjectUse: > + setInt32(vmCall(Int32, m_out.operation(operationArrayIndexOfObject), m_callFrame, storage, lowObject(child2), startIndex)); > + break; Ditto here
Yusuke Suzuki
Comment 16 2017-06-09 12:00:35 PDT
Comment on attachment 312442 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312442&action=review Thank you! >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2398 >> + return false; > > Please add a test for this. Perhaps also a test where you set "indexOf" on base array. Added. >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2413 >> + // FIXME: We could easily relax the Array/Object.prototype transition as long as we OSR exitted if we saw a hole. > > Please file a bug or remove Filed in https://bugs.webkit.org/show_bug.cgi?id=173171. >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2421 >> + m_graph.watchpoints().addLazily(objectPrototypeTransition); > > Please add tests for all three of these. OK, added. >> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:2430 >> + addVarArgChild(addToGraph(GetButterfly, array)); > > Why do we do this here if FixupPhase will insert a GetButterfly inside blessArrayOperation? Ah, yeah, we can just insert empty node (nullptr) here. Fixed. And since we only emit ArrayIndexOf with Array::{Int32, Double, Contiguous) cases, blessArrayOperation always insert GetButterfly in Fixup. >> Source/JavaScriptCore/dfg/DFGClobberize.h:547 >> + read(Butterfly_publicLength); > > Maybe in a follow up, but we should write a CSE rule for this. Filed. https://bugs.webkit.org/show_bug.cgi?id=173173 >> Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1027 >> + Edge& child1 = m_graph.varArgChild(node, 0); > > lets give this a better name Renamed it to array. >> Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1030 >> + fixEdge<KnownCellUse>(child1); > > How is this KnownCell? Since blessArrayOperation ensures it by emitting CheckStructure or CheckArray. >> Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1034 >> + Edge& child2 = m_graph.varArgChild(node, 1); > > lets give this a better name Renamed it to searchElement. >> Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1046 >> + } > > If you wanted to be really clever, you could constant fold these operations to return -1 if you prove that child2 is not a number. However, I think a program that does this is probably uncommon. By emitting various speculation edges (like, CellUse, BooleanUse, OtherUse), we can constant fold this. Maybe, the priority is a bit low because it is typically uncommon. I'll file this in a separate bug. Filed. https://bugs.webkit.org/show_bug.cgi?id=173176 >> Source/JavaScriptCore/dfg/DFGOperations.cpp:1939 >> + if (value.isObject() && asObject(value) == search) > > Why do you need to check isObject here? Why can't you just compare the bits since object equality is just bit equality Yeah, that's fine. But as you pointed, we can emit the code in DFG code directly for that. Fixed. >> Source/JavaScriptCore/dfg/DFGOperations.cpp:1979 >> + auto data = butterfly->contiguousDouble().data(); > > I think it'd be helpful to have the type here. Changed this `auto` to `const double*`. >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7495 >> + auto holeValue = m_jit.branchTest64(MacroAssembler::Zero, tempGPR); > > Maybe it'd be faster if we didn't unbox child2? You could remove a branch this way by not having to check for a hole. Yeah. For 64bit environment, we can drop one branch. I think it would be nice. Anyway, we need to speculate Int32 here because the comming value can be Double (but content is int32). So ManualSpeculation is one possible implemenation. Fixing... >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7568 >> + } > > I think this can be almost an identical loop to the Int32 case. Cool. Fixed. >> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4126 >> + break; > > Ditto here Fixed.
Yusuke Suzuki
Comment 17 2017-06-09 13:22:18 PDT
Yusuke Suzuki
Comment 18 2017-06-09 13:39:03 PDT
Created attachment 312485 [details] Patch 32bit fix
Saam Barati
Comment 19 2017-06-11 13:55:24 PDT
Comment on attachment 312485 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312485&action=review r=me with a few comments. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7516 > + case DoubleRepUse: { Nit: Worth an assertion that the array mode is double here. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7556 > + case ObjectUse: { Style nit: This is almost identical to the int32 loop above (besides the 32-bit tag code). Maybe you can abstract it? > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4097 > + LValue index = m_out.phi(Int32, initialStartIndex); Nit: It might be worth doing 64 bit math and having the zeroExtend outside the loop. (Only if we actually end up emitting machine instructions to zero extend inside the loop) > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4098 > + ValueFromBlock notFoundResult = m_out.anchor(m_out.constInt32(-1)); What code do we generate here? Do we end up emitting a Mov, -1, Reg, inside the loop every time? If so, it might be faster to branch to a block that sets not found result.
Saam Barati
Comment 20 2017-06-11 13:56:13 PDT
Comment on attachment 312485 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312485&action=review >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7516 >> + case DoubleRepUse: { > > Nit: Worth an assertion that the array mode is double here. Also, same for the FTLLower code, and maybe also worth doing for Int32 use kind
Yusuke Suzuki
Comment 21 2017-06-11 20:36:27 PDT
Comment on attachment 312485 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312485&action=review Thank you! >>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7516 >>> + case DoubleRepUse: { >> >> Nit: Worth an assertion that the array mode is double here. > > Also, same for the FTLLower code, and maybe also worth doing for Int32 use kind That's fine. Added. >> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7556 >> + case ObjectUse: { > > Style nit: This is almost identical to the int32 loop above (besides the 32-bit tag code). Maybe you can abstract it? OK, fixed. > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4087 > + searchElement = boxInt32(lowInt32(searchElementEdge)); I also changed it to speculate(searchElementEdge); searchElement = lowJSValue(searchElementEdge, ManualOperandSpeculation); This generates a bit efficient code. >> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4097 >> + LValue index = m_out.phi(Int32, initialStartIndex); > > Nit: It might be worth doing 64 bit math and having the zeroExtend outside the loop. (Only if we actually end up emitting machine instructions to zero extend inside the loop) Yeah, I changed it to perform zeroExtend out of the loop and use it as pointerType(). And finally, I performed castToInt32 to setInt32. It works fine. >> Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4098 >> + ValueFromBlock notFoundResult = m_out.anchor(m_out.constInt32(-1)); > > What code do we generate here? Do we end up emitting a Mov, -1, Reg, inside the loop every time? If so, it might be faster to branch to a block that sets not found result. Yeah, I fixed it by moving it out of the loop.
Yusuke Suzuki
Comment 22 2017-06-11 20:58:32 PDT
Csaba Osztrogonác
Comment 23 2017-06-12 02:39:32 PDT
(In reply to Yusuke Suzuki from comment #22) > Committed r218084: <http://trac.webkit.org/changeset/218084> It broke the AArch64 (at least Linux) build: https://build.webkit.org/builders/JSCOnly%20Linux%20AArch64%20Release/builds/991 ../../Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp: In lambda function: ../../Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7518:159: error: no matching function for call to 'JSC::DFG::JITCompiler::branch64(JSC::MacroAssemblerARM64::RelationalCondition, JSC::AbstractMacroAssembler<JSC::ARM64Assembler, JSC::MacroAssemblerARM64>::BaseIndex, JSC::GPRReg&)' ../../Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:7518:159: note: candidates are: ... note: I just noticed the build failure, no time and plan to fix it myself.
Yusuke Suzuki
Comment 24 2017-06-12 04:52:08 PDT
WebKit Commit Bot
Comment 25 2017-06-12 05:15:14 PDT
Re-opened since this is blocked by bug 173259
Yusuke Suzuki
Comment 26 2017-06-12 06:28:04 PDT
Yusuke Suzuki
Comment 27 2017-06-12 11:39:54 PDT
32bit bot fails with assertion. But it looks to me that this assertion is not correct. https://bugs.webkit.org/show_bug.cgi?id=112380 Code is like, SpeculateInt32Operand searchElement(this, searchElementEdge); GPRReg searchElementGPR = searchElement.gpr(); m_jit.zeroExtend32ToPtr(lengthGPR, lengthGPR); m_jit.zeroExtend32ToPtr(indexGPR, indexGPR); auto loop = m_jit.label(); ... m_jit.jump().linkTo(loop, &m_jit); The problem is that m_jit.zeroExtend32ToPtr() becomes empty in 32bit environment. And it accidentally causes assertion failure of DFG_REGISTER_ALLOCATION_VALIDATION. However, I think the above code is valid. And this assertion should not be fired ideally. I'll put clearRegisterAllocationOffsets() for now.
Yusuke Suzuki
Comment 28 2017-06-12 11:42:53 PDT
Ryan Haddad
Comment 29 2017-06-14 13:24:51 PDT
(In reply to Yusuke Suzuki from comment #28) > Committed r218113: <http://trac.webkit.org/changeset/218113> This took care of half of them, but there are still 20 tests failing on 32-bit JSC bots: https://build.webkit.org/builders/Apple%20Sierra%2032-bit%20JSC%20%28BuildAndTest%29/builds/952
Yusuke Suzuki
Comment 30 2017-06-14 20:31:47 PDT
Yusuke Suzuki
Comment 31 2017-06-15 00:03:02 PDT
(In reply to Ryan Haddad from comment #29) > (In reply to Yusuke Suzuki from comment #28) > > Committed r218113: <http://trac.webkit.org/changeset/218113> > > This took care of half of them, but there are still 20 tests failing on > 32-bit JSC bots: > https://build.webkit.org/builders/Apple%20Sierra%2032- > bit%20JSC%20%28BuildAndTest%29/builds/952 OK, now it is recovered.
Note You need to log in before you can comment on or make changes to this bug.