150038 – Use JITSubGenerator to support UntypedUse operands for op_sub in the DFG

RESOLVED FIXED 150038

Use JITSubGenerator to support UntypedUse operands for op_sub in the DFG

https://bugs.webkit.org/show_bug.cgi?id=150038

Summary Use JITSubGenerator to support UntypedUse operands for op_sub in the DFG

Mark Lam

Reported 2015-10-12 11:14:04 PDT

Patch coming.

Attachments
patch 1: almost there. (15.09 KB, patch) 2015-10-12 11:17 PDT, Mark Lam	no flags	Details Formatted Diff Diff
patch 2 (25.63 KB, patch) 2015-10-13 11:02 PDT, Mark Lam	ggaren: review-	Details Formatted Diff Diff
patch 2 32-bit benchmark results (63.12 KB, text/plain) 2015-10-13 11:13 PDT, Mark Lam	no flags	Details
patch 2 64-bit benchmark results (64.16 KB, text/plain) 2015-10-13 11:13 PDT, Mark Lam	no flags	Details
patch 3 (30.96 KB, patch) 2015-10-15 14:34 PDT, Mark Lam	ggaren: review+	Details Formatted Diff Diff
patch3: 32-bit benchmark results 1 (63.85 KB, text/plain) 2015-10-15 15:47 PDT, Mark Lam	no flags	Details
patch3: 32-bit benchmark results 2 (63.81 KB, text/plain) 2015-10-15 15:48 PDT, Mark Lam	no flags	Details
patch3: 64 -bit benchmark results 1 (64.32 KB, text/plain) 2015-10-15 15:48 PDT, Mark Lam	no flags	Details
patch3: 64 -bit benchmark results 2 (64.17 KB, text/plain) 2015-10-15 15:48 PDT, Mark Lam	no flags	Details
patch for landing. (30.72 KB, patch) 2015-10-16 16:03 PDT, Mark Lam	no flags	Details Formatted Diff Diff
Show Obsolete (9) View All Add attachment proposed patch, testcase, etc.

Mark Lam

Comment 1 2015-10-12 11:17:00 PDT

Created attachment 262906 [details] patch 1: almost there. This patch works and passes tests so far. Still need to reason a bit more about how I compute the ResultType. Also need to run benchmarks.

Mark Lam

Comment 2 2015-10-13 11:02:44 PDT

Created attachment 262994 [details] patch 2

Mark Lam

Comment 3 2015-10-13 11:13:19 PDT

Created attachment 262996 [details] patch 2 32-bit benchmark results

Mark Lam

Comment 4 2015-10-13 11:13:48 PDT

Created attachment 262997 [details] patch 2 64-bit benchmark results

Mark Lam

Comment 5 2015-10-13 11:16:13 PDT

In the benchmark results, string-sub (of JSRegress) shows a significant speed up. On 32-bit, int-or-other-sub (of JSRegress) shows a significant slow down. I'll re-run the benchmarks to make sure, and also look into int-or-other-sub to see what's happening here.

Geoffrey Garen

Comment 6 2015-10-13 11:59:00 PDT

Comment on attachment 262994 [details] patch 2 View in context: https://bugs.webkit.org/attachment.cgi?id=262994&action=review Code looks generally correct, with some comments below. Please add a regression test that verifies we have a speedup in some case where subtraction previously caused the DFG problems. (Or does your current test do this? If so, please post its result.) > Source/JavaScriptCore/dfg/DFGAbstractValue.cpp:497 > +ResultType AbstractValue::asResultType() const I would call this "resultType" instead of "asResultType". We use "as" to mean a cast, but this is a just an access to a field, with a type conversion. > Source/JavaScriptCore/dfg/DFGAbstractValue.cpp:512 > + ResultType::Type bits = 0; > + if (isType(SpecInt32)) > + bits |= (ResultType::TypeInt32 | ResultType::TypeMaybeNumber); > + if (isType(SpecBytecodeDouble)) > + bits |= ResultType::TypeMaybeNumber; > + if (isType(SpecString)) > + bits |= ResultType::TypeMaybeString; > + if (isType(SpecBoolean)) > + bits |= ResultType::TypeMaybeBool; > + if (isType(SpecOther)) > + bits |= ResultType::TypeMaybeNull; > + if (!bits) > + bits |= ResultType::TypeMaybeOther; > + return ResultType(bits); This code duplicates the logic of the static functions in the ResultType class. This logic is involved, and somewhat surprising at first read. So, I think it's a liability to duplicate this stuff. You should use the ResultType helper functions instead. For example, if I know that I'm an Int32, it's pretty surprising that I need to "ResultType::TypeInt32 | ResultType::TypeMaybeNumber". But it wouldn't be surprising at all to "ResultType::numberTypeIsInt32()". Once you switch to helper functions, you'll notice that you have a small problem with undefined. SpecOther means "undefined or null". But your ResultType only says "null". So, you need to add a resultType helper function for what to do with undefined. I think SpecBytecodeNumber misses some cases of number -- namely, Int52 and impure NaN. Don't you want SpecFullNumber? You shouldn't use |= logic here. "isType" means "I am this and only this". So, you will never |= with anything else. > Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp:1464 > + case UntypedUse: > + compileForceOSRExit(); > + break; I think I suggested the force OSR exit here -- but I think it would be better to edit FT::canCompile instead so that we don't waste time FTL compiling such a function in the first place. > Source/JavaScriptCore/jit/AssemblyHelpers.h:969 > + void unboxDouble(JSValueRegs regs, FPRReg destFPR, GPRReg scratchGPR, FPRReg) Instead of a comment, let's use naming to distinguish from the destructive function. (In future, as you've suggested, we can use a type, which is even better than a name or a comment.) > Source/JavaScriptCore/jit/AssemblyHelpers.h:1018 > + // This version is non-destructive to the incoming regs. > + void unboxDouble(JSValueRegs regs, FPRReg destFPR, GPRReg, FPRReg scratchFPR) Ditto.

Geoffrey Garen

Comment 7 2015-10-13 12:06:54 PDT

> Please add a regression test that verifies we have a speedup in some case > where subtraction previously caused the DFG problems. It looks like string-sub does just enough to test this.

Mark Lam

Comment 8 2015-10-13 15:27:37 PDT

(In reply to comment #6) > Please add a regression test that verifies we have a speedup in some case > where subtraction previously caused the DFG problems. (Or does your current > test do this? If so, please post its result.) Yes, string-sub.js (in JSRegress) should exercise this code. I'll take a look to see if I need to add something more. > > Source/JavaScriptCore/dfg/DFGAbstractValue.cpp:497 > > +ResultType AbstractValue::asResultType() const > > I would call this "resultType" instead of "asResultType". We use "as" to > mean a cast, but this is a just an access to a field, with a type conversion. Fixed. > > Source/JavaScriptCore/dfg/DFGAbstractValue.cpp:512 > ... > This code duplicates the logic of the static functions in the ResultType > class. This logic is involved, and somewhat surprising at first read. So, I > think it's a liability to duplicate this stuff. You should use the > ResultType helper functions instead. From re-reading the code, I see that the resultType values are only every set once during Node creation in the BytecodeGenerator. I've changed the ResultType struct to reflect this by making its constructor private. This ensures that I don't have to check for cases where the ResultType::Type bits are OR'ed together other than those which ResultType already provided factory methods for. With that, I can now switch to using the ResultType factory methods only. > Once you switch to helper functions, you'll notice that you have a small > problem with undefined. SpecOther means "undefined or null". But your > ResultType only says "null". So, you need to add a resultType helper > function for what to do with undefined. It appears that there is currently no way to test for is definitely TypeMaybeNull. The snippets also currently don't need to know if the type is null. I'll punt on this for now by lumping it into ResultType::unknownType() (which does include TypeMaybeNull). If there's a need in the future, we can re-visit this. > I think SpecBytecodeNumber misses some cases of number -- namely, Int52 and > impure NaN. Don't you want SpecFullNumber? No. The ResultType is for JSValues, which do not include Int52 and impure NaNs. Similarly, the snippet generators which work with ResultType will not be able to handle Int52 and impure NaNs (and the baseline JIT code which they come from don't either). We could re-write the snippet generators to be based on SpeculatedType instead of ResultType, and have the BaselineJIT map ResultType to SpeculatedType instead. However, to stay on task with the migration to snippets, I suggest holding off on adding supporting for Int52 and NaNs, and only do so in a subsequent pass if needed. Back to the result type conversion here, we actually want SpecBytecodeNumber. Patch 2 was erroneously using SpecBytecodeDouble. Will fix it to use SpecBytecodeNumber instead. > You shouldn't use |= logic here. "isType" means "I am this and only this". > So, you will never |= with anything else. Got it, and I'm now enforcing it by making the ResultType constructor private. > > Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp:1464 > > + case UntypedUse: > > + compileForceOSRExit(); > > + break; > > I think I suggested the force OSR exit here -- but I think it would be > better to edit FT::canCompile instead so that we don't waste time FTL > compiling such a function in the first place. Fixed. > > Source/JavaScriptCore/jit/AssemblyHelpers.h:969 > > + void unboxDouble(JSValueRegs regs, FPRReg destFPR, GPRReg scratchGPR, FPRReg) > > Instead of a comment, let's use naming to distinguish from the destructive > function. (In future, as you've suggested, we can use a type, which is even > better than a name or a comment.) > > > Source/JavaScriptCore/jit/AssemblyHelpers.h:1018 > > + // This version is non-destructive to the incoming regs. > > + void unboxDouble(JSValueRegs regs, FPRReg destFPR, GPRReg, FPRReg scratchFPR) > > Ditto. Fixed. I renamed them to unboxDoubleNonDestructive(). I'm not crazy about that name, but it should do until I can introduce read-only registers.

Geoffrey Garen

Comment 9 2015-10-13 16:23:31 PDT

> > I think SpecBytecodeNumber misses some cases of number -- namely, Int52 and > > impure NaN. Don't you want SpecFullNumber? > > No. The ResultType is for JSValues, which do not include Int52 and impure > NaNs. Similarly, the snippet generators which work with ResultType will not > be able to handle Int52 and impure NaNs (and the baseline JIT code which > they come from don't either). We could re-write the snippet generators to > be based on SpeculatedType instead of ResultType, and have the BaselineJIT > map ResultType to SpeculatedType instead. However, to stay on task with the > migration to snippets, I suggest holding off on adding supporting for Int52 > and NaNs, and only do so in a subsequent pass if needed. Yes, it's the right call to stay on task and hold off on adding to snippets. But let's consider Int52 for a minute. Let's say my input is speculated to be Int52. Of course there's no reason to tell the snippet generator "Int52", since it doesn't know what Int52 means. But surely, since the snippet does not understand Int52, before I run the snippet, I must convert my input to JSValue, right? If so, is the snippet generator not guaranteed to see a JSValue that is a number? The same goes for signaling NaN. If I have one, don't I need to convert to non-signaling NaN JSValue before running the snippet?

Mark Lam

Comment 10 2015-10-14 09:39:16 PDT

(In reply to comment #9) > But let's consider Int52 for a minute. Let's say my input is speculated to > be Int52. Of course there's no reason to tell the snippet generator "Int52", > since it doesn't know what Int52 means. But surely, since the snippet does > not understand Int52, before I run the snippet, I must convert my input to > JSValue, right? If so, is the snippet generator not guaranteed to see a > JSValue that is a number? > > The same goes for signaling NaN. If I have one, don't I need to convert to > non-signaling NaN JSValue before running the snippet? If I understand you correctly, you're telling me that I should add code to the DFG ArithSub generator to convert Int52s and impure NaNs to JSValues before calling the snippet. I'll look into that.

Geoffrey Garen

Comment 11 2015-10-14 09:47:03 PDT

> If I understand you correctly, you're telling me that I should add code to > the DFG ArithSub generator to convert Int52s and impure NaNs to JSValues > before calling the snippet. I'll look into that. I think this is automatic by virtue of UntypedUse providing JSValues. (See DFGUseKind.h.)

Filip Pizlo

Comment 12 2015-10-14 10:13:57 PDT

(In reply to comment #11) > > If I understand you correctly, you're telling me that I should add code to > > the DFG ArithSub generator to convert Int52s and impure NaNs to JSValues > > before calling the snippet. I'll look into that. > > I think this is automatic by virtue of UntypedUse providing JSValues. (See > DFGUseKind.h.) Exactly. There is no way for an UntypedUse edge to see either Int52 or those parts of FullNumber that aren't in BytecodeNunber. UntyprdUse means that you are dealing with a JSValye. There is no value in trying convert Int52 or non-BytecodeNunber numbers to JsVavlues as some kind of special case of this code. The compiler already converts int52s and non-bytecode numbers to JSValues if they get used by a UntypedUse.

Geoffrey Garen

Comment 13 2015-10-14 10:18:36 PDT

Thinking about this more, the ResultType accessor should probably ASSERT if your AbstractValue is not a JSValue type, since ResultType is a bytecode concept, and it implies being a JSValue. A ResultType that corresponds to Int52 is a contradiction.

Mark Lam

Comment 14 2015-10-14 10:45:59 PDT

(In reply to comment #12) > There is no way for an UntypedUse edge to see either Int52 or those parts of > FullNumber that aren't in BytecodeNunber. UntyprdUse means that you are > dealing with a JSValye. OK, I think I see what you guys are getting at. There's a detail here that I missed, and therefore my patch has a bug too. In the fixup phase, I only fixed the subtract node to be of type NodeResultJS, but I neglected to fix up the child edges to be of UntypedUse. That bug will be fixed in my next patch. (In reply to comment #13) > Thinking about this more, the ResultType accessor should probably ASSERT if > your AbstractValue is not a JSValue type, since ResultType is a bytecode > concept, and it implies being a JSValue. A ResultType that corresponds to > Int52 is a contradiction. Sounds good. Will do.

Filip Pizlo

Comment 15 2015-10-14 12:03:05 PDT

(In reply to comment #13) > Thinking about this more, the ResultType accessor should probably ASSERT if > your AbstractValue is not a JSValue type, since ResultType is a bytecode > concept, and it implies being a JSValue. A ResultType that corresponds to > Int52 is a contradiction. Yeah, you could assert that the AbstractValue type is a subset of SpecBytecodeTop, or even stricter, SpecHeapTop. Like this: ASSERT(isType(SpecHeapTop));

Mark Lam

Comment 16 2015-10-15 14:34:01 PDT

Created attachment 263191 [details] patch 3 Patch 3 incorporates all the feedback, and have passed all the tests. Preliminary benchmark results seem to be noisy. I'm re-running the benchmarks now with a cleanly booted machine to see if I can more consistent results.

Mark Lam

Comment 17 2015-10-15 15:47:53 PDT

Created attachment 263206 [details] patch3: 32-bit benchmark results 1

Mark Lam

Comment 18 2015-10-15 15:48:12 PDT

Created attachment 263207 [details] patch3: 32-bit benchmark results 2

Mark Lam

Comment 19 2015-10-15 15:48:37 PDT

Created attachment 263208 [details] patch3: 64 -bit benchmark results 1

Mark Lam

Comment 20 2015-10-15 15:48:57 PDT

Created attachment 263209 [details] patch3: 64 -bit benchmark results 2

Mark Lam

Comment 21 2015-10-15 15:52:35 PDT

From the patch 3 benchmark results, the aggregate results all show no significant difference. The only individual values that are "definitely" different are all from JSRegress as follows: 32-bit Run 1: JSRegress arguments-strict-mode 153.8041+-5.4704 ^ 146.4424+-0.8439 ^ definitely 1.0503x faster fold-double-to-int 43.5707+-2.4937 ! 47.8533+-0.9335 ! definitely 1.0983x slower string-sub 186.9781+-7.3105 ^ 149.8940+-4.3438 ^ definitely 1.2474x faster 32-bit Run 2: JSRegress infer-closure-const-then-mov 23.7246+-1.0659 ! 25.1477+-0.2112 ! definitely 1.0600x slower string-sub 184.3235+-4.9505 ^ 147.9168+-3.5522 ^ definitely 1.2461x faster 64-bit Run 1: JSRegress create-lots-of-functions 13.2767+-0.8298 ! 14.8483+-0.3443 ! definitely 1.1184x slower get-by-id-chain-from-try-block 3.0848+-0.0438 ! 3.2379+-0.1021 ! definitely 1.0496x slower in-one-case-true 16.6160+-1.1027 ^ 14.3440+-0.7501 ^ definitely 1.1584x faster string-repeat-arith 35.5699+-1.0970 ^ 34.0988+-0.3381 ^ definitely 1.0431x faster string-sub 74.8361+-5.4934 ^ 47.2620+-2.7527 ^ definitely 1.5834x faster 64-bit Run 2: JSRegress array-prototype-map 110.7988+-1.9023 ! 116.7427+-3.8079 ! definitely 1.0536x slower Float64Array-to-Int16Array-set 82.3625+-2.4934 ! 94.1563+-2.6538 ! definitely 1.1432x slower get_callee_polymorphic 4.4044+-0.2873 ! 5.2390+-0.5333 ! definitely 1.1895x slower repeat-multi-get-by-offset 27.7042+-0.5242 ! 30.7776+-0.5496 ! definitely 1.1109x slower richards-try-catch 330.9247+-5.0731 ^ 323.6898+-1.8041 ^ definitely 1.0224x faster sink-function 13.2079+-1.9366 ^ 10.7634+-0.1808 ^ definitely 1.2271x faster string-sub 74.4352+-3.4954 ^ 47.4148+-3.4284 ^ definitely 1.5699x faster substring-concat 51.1590+-1.2356 ! 57.2363+-4.2831 ! definitely 1.1188x slower There is no consistency in the results except for string-sub which hows a 24% speed up on 32-bit and about 57% speed up on 64-bit. The string-sub code does exercise the code that was change in this patch.

Geoffrey Garen

Comment 22 2015-10-15 16:27:12 PDT

Comment on attachment 263191 [details] patch 3 View in context: https://bugs.webkit.org/attachment.cgi?id=263191&action=review r=me with some fixes below > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:188 > + if (op == ArithSub > + && (node->child1()->shouldSpeculateUntypedForArithmetic() > + || node->child2()->shouldSpeculateUntypedForArithmetic()) This test asks, "*can* either side be non-number?" We want to ask "*is* either side known to be non-number?" For values that can be number or not, we think it's optimal to try number first, and then recompile the generic path if we exit. > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:189 > + && m_graph.hasExitSite(node->origin.semantic, BadType)) { Let's change this to ||. In other words, if, for any reason, speculation has been causing enough OSR exits to recompile, then the generic code is best. > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:191 > + fixEdge<UntypedUse>(node->child2()); Please fix child1 edge. > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:194 > + } So, to summarize, the test we want is: if (op == ArithSub && (shouldSpeculateUntypedForArithmetic(node->child1(), node->child2() || m_graph.hasExitSite(node->origin.semantic, BadType)) and for shouldSpeculateUntypedForArithmetic: check both nodes and for Node::shouldSpeculateUntypedForArithmetic return !(value & SpecFullNumber); > Source/WTF/wtf/Platform.h:867 > +#define ENABLE_MASM_PROBE 1 Revert.

Mark Lam

Comment 23 2015-10-16 16:03:31 PDT

Created attachment 263344 [details] patch for landing. (In reply to comment #22) > and for Node::shouldSpeculateUntypedForArithmetic > return !(value & SpecFullNumber); Turns out we need to check for SpecBoolean as well: return !(value & (SpecFullNumber | SpecBoolean)); There were 2 JSRegress components (minus-boolean.js and minus-boolean-double.js) that showed significant degradations without this. Adding the check for SpecBoolean makes them happy again. I will land as soon as my test and full benchmark runs complete.

Mark Lam

Comment 24 2015-10-16 16:27:33 PDT

Landed in r191224: <http://trac.webkit.org/r191224>.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Local Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component JavaScriptCore

Assignee

Mark Lam

Reported

2015-10-12 11:14 PDT

Modified

2015-10-16 16:27 PDT History

CC List

8 users Show

URL

Keywords

Depends on

Blocks