SpeculativeJIT::shouldSpeculateInteger(NodeIndex, NodeIndex) currently returns false if either child node's prediction or static type is double. But that ignores cases where the prediction says that a node can be double, but is not necessarily always double. This is responsible for 3d-cube always failing speculation.
Created attachment 107145 [details] the patch
Created attachment 107146 [details] the patch - disable tiering
Comment on attachment 107146 [details] the patch - disable tiering View in context: https://bugs.webkit.org/attachment.cgi?id=107146&action=review > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:385 > + bool shouldNotSpeculateInteger(NodeIndex nodeIndex) I think the function would be clearer with a small “why” comment in it. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:392 > + if ((info.registerFormat() | DataFormatJS) == DataFormatJSDouble > + || (info.spillFormat() | DataFormatJS) == DataFormatJSDouble) I think this would read better on a single line. I think this would read better with a small inline helper function that does the (format | DataFormatJS) == DataFormatJSDouble ditty, with a pleasant name.
Latest numbers, after fixing review: Benchmark report for SunSpider, V8, and Kraken. VMs tested: "TipOfTreeDyn" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc "CarefulNumber" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc Collected 30 samples per benchmark/VM, with 10 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTreeDyn CarefulNumber SunSpider: 3d-cube 8.9150+-0.0876 ^ 7.7930+-0.0804 ^ definitely 1.1440x faster 3d-morph 7.5370+-0.0902 7.4855+-0.0884 3d-raytrace 7.5360+-0.0977 ? 7.6311+-0.0839 ? might be 1.0126x slower access-binary-trees 2.3354+-0.0372 2.3135+-0.0356 access-fannkuch 11.6601+-0.1098 11.5304+-0.1130 might be 1.0112x faster access-nbody 4.2218+-0.0413 ? 4.2356+-0.1036 ? access-nsieve 2.6112+-0.0352 2.5944+-0.0388 bitops-3bit-bits-in-byte 1.6676+-0.0246 1.6569+-0.0244 bitops-bits-in-byte 3.1623+-0.0323 ? 3.2116+-0.0469 ? might be 1.0156x slower bitops-bitwise-and 3.6386+-0.0488 3.5991+-0.0619 might be 1.0110x faster bitops-nsieve-bits 5.4136+-0.0893 5.2968+-0.0505 might be 1.0221x faster controlflow-recursive 1.9968+-0.0268 1.9722+-0.0360 might be 1.0125x faster crypto-aes 7.0017+-0.1240 ? 7.1120+-0.1322 ? might be 1.0158x slower crypto-md5 2.7654+-0.0302 2.7281+-0.0393 might be 1.0137x faster crypto-sha1 2.2182+-0.0332 ? 2.2460+-0.0375 ? might be 1.0126x slower date-format-tofte 10.5351+-0.6234 10.2592+-0.1074 might be 1.0269x faster date-format-xparb 8.5383+-0.1030 ? 8.5800+-0.0866 ? math-cordic 6.2498+-0.0663 6.1611+-0.0573 might be 1.0144x faster math-partial-sums 7.3615+-0.0760 ? 7.3637+-0.0856 ? math-spectral-norm 2.6108+-0.0393 2.5595+-0.0199 might be 1.0200x faster regexp-dna 10.7612+-0.0966 10.7408+-0.0993 string-base64 5.7395+-0.0641 5.7239+-0.0994 string-fasta 8.1985+-0.0748 8.1984+-0.0780 string-tagcloud 12.0131+-0.1120 11.8731+-0.1400 might be 1.0118x faster string-unpack-code 18.8384+-0.1434 ? 18.8713+-0.1348 ? string-validate-input 6.6322+-0.0814 6.5925+-0.0770 <arithmetic> 6.5446+-0.0304 ^ 6.4742+-0.0158 ^ definitely 1.0109x faster <geometric> 5.3974+-0.0176 ^ 5.3469+-0.0141 ^ definitely 1.0094x faster <harmonic> 4.3934+-0.0150 ^ 4.3596+-0.0174 ^ definitely 1.0077x faster TipOfTreeDyn CarefulNumber V8: crypto 86.3936+-0.5222 86.3028+-0.5295 deltablue 258.3522+-1.0382 258.0643+-0.8435 earley-boyer 95.5925+-0.4492 ? 95.7731+-0.5052 ? raytrace 79.5224+-1.7899 ^ 77.2884+-0.2945 ^ definitely 1.0289x faster regexp 108.1042+-0.4954 ! 109.5729+-0.5820 ! definitely 1.0136x slower richards 218.5178+-0.9283 216.9693+-0.7558 splay 102.8957+-0.5062 102.4142+-0.4951 <arithmetic> 135.6255+-0.3987 135.1979+-0.2049 <geometric> 122.4050+-0.4774 121.9538+-0.1870 <harmonic> 112.7881+-0.5597 112.2687+-0.2024 TipOfTreeDyn CarefulNumber Kraken: ai-astar 1142.1542+-16.8952 ? 1147.8947+-5.4359 ? audio-beat-detection 471.3633+-2.0838 ? 474.5726+-2.0400 ? audio-dft 428.1801+-2.8580 426.9048+-3.1434 audio-fft 372.7386+-3.3193 ^ 367.4533+-1.4467 ^ definitely 1.0144x faster audio-oscillator 356.0643+-1.2339 ? 358.4645+-1.3424 ? imaging-darkroom 512.1170+-1.3347 ? 512.1290+-1.4023 ? imaging-desaturate 220.0034+-0.8839 ? 220.7472+-1.5117 ? imaging-gaussian-blur 1748.7651+-5.6377 1739.4238+-5.6024 json-parse-financial 49.5538+-0.3005 49.4683+-0.2822 json-stringify-tinderbox 69.9394+-0.4919 69.3957+-0.2913 stanford-crypto-aes 146.0683+-0.7982 ? 146.5923+-0.5881 ? stanford-crypto-ccm 113.7544+-0.5470 113.6844+-0.5826 stanford-crypto-pbkdf2 405.6344+-2.1291 ? 405.7448+-1.5264 ? stanford-crypto-sha256-iterative 152.0317+-0.7024 ? 152.3759+-0.7158 ? <arithmetic> 442.0263+-1.3448 441.7751+-0.4788 <geometric> 283.2216+-0.4956 283.1418+-0.3289 <harmonic> 181.6587+-0.4115 181.4143+-0.3218 TipOfTreeDyn CarefulNumber All benchmarks: <arithmetic> 155.4874+-0.4033 155.3099+-0.1591 <geometric> 27.9512+-0.0539 ^ 27.7888+-0.0422 ^ definitely 1.0058x faster <harmonic> 7.7594+-0.0259 ^ 7.7007+-0.0300 ^ definitely 1.0076x faster
Created attachment 107217 [details] the patch - fix review, did some refactoring
Comment on attachment 107217 [details] the patch - fix review, did some refactoring View in context: https://bugs.webkit.org/attachment.cgi?id=107217&action=review r=me but please don't check in the accidental change to Platform.h. > Source/JavaScriptCore/ChangeLog:38 > + * dfg/DFGGenerationInfo.h: > + (JSC::DFG::isJSFormat): > + (JSC::DFG::isJSInteger): > + (JSC::DFG::isJSDouble): > + (JSC::DFG::isJSCell): > + (JSC::DFG::isJSBoolean): > + (JSC::DFG::GenerationInfo::isJSFormat): > + (JSC::DFG::GenerationInfo::isJSInteger): > + (JSC::DFG::GenerationInfo::isJSDouble): > + (JSC::DFG::GenerationInfo::isJSCell): > + (JSC::DFG::GenerationInfo::isJSBoolean): > + * dfg/DFGJITCodeGenerator.cpp: > + (JSC::DFG::JITCodeGenerator::isKnownInteger): > + (JSC::DFG::JITCodeGenerator::isKnownNumeric): > + (JSC::DFG::JITCodeGenerator::isKnownCell): > + (JSC::DFG::JITCodeGenerator::isKnownNotInteger): > + (JSC::DFG::JITCodeGenerator::isKnownBoolean): > + * dfg/DFGJITCodeGenerator.h: > + * dfg/DFGNonSpeculativeJIT.cpp: > + (JSC::DFG::NonSpeculativeJIT::compile): > + * dfg/DFGSpeculativeJIT.h: > + (JSC::DFG::SpeculativeJIT::isInteger): > + (JSC::DFG::SpeculativeJIT::shouldSpeculateDouble): > + (JSC::DFG::SpeculativeJIT::shouldNotSpeculateInteger): > + (JSC::DFG::SpeculativeJIT::shouldSpeculateInteger): It's great to put line-by-line comments next to this auto-generated helper text, but in cases where you don't do that, I think it's better just to delete the text. > Source/JavaScriptCore/dfg/DFGGenerationInfo.h:124 > +inline bool isJSFormat(DataFormat format, DataFormat expectedFormat) > +{ > + ASSERT(expectedFormat & DataFormatJS); > + return (format | DataFormatJS) == expectedFormat; > +} This is way better! > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:400 > + return !(shouldNotSpeculateInteger(op1) || shouldNotSpeculateInteger(op2)) && (shouldSpeculateInteger(op1) || shouldSpeculateInteger(op2)); The fight between shouldNotSpeculateInteger() and shouldSpeculateInteger() is a bit mind-bending, but I don't have a specific suggestion to make right now. > Source/JavaScriptCore/wtf/Platform.h:968 > -#define ENABLE_TIERED_COMPILATION 0 > +#define ENABLE_TIERED_COMPILATION 1 Hey man, if you want to enable it so badly, why don't you just submit a patch? ;)
(In reply to comment #6) > (From update of attachment 107217 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=107217&action=review > > r=me but please don't check in the accidental change to Platform.h. Ooops! > > > Source/JavaScriptCore/ChangeLog:38 > > + * dfg/DFGGenerationInfo.h: > > + (JSC::DFG::isJSFormat): > > + (JSC::DFG::isJSInteger): > > + (JSC::DFG::isJSDouble): > > + (JSC::DFG::isJSCell): > > + (JSC::DFG::isJSBoolean): > > + (JSC::DFG::GenerationInfo::isJSFormat): > > + (JSC::DFG::GenerationInfo::isJSInteger): > > + (JSC::DFG::GenerationInfo::isJSDouble): > > + (JSC::DFG::GenerationInfo::isJSCell): > > + (JSC::DFG::GenerationInfo::isJSBoolean): > > + * dfg/DFGJITCodeGenerator.cpp: > > + (JSC::DFG::JITCodeGenerator::isKnownInteger): > > + (JSC::DFG::JITCodeGenerator::isKnownNumeric): > > + (JSC::DFG::JITCodeGenerator::isKnownCell): > > + (JSC::DFG::JITCodeGenerator::isKnownNotInteger): > > + (JSC::DFG::JITCodeGenerator::isKnownBoolean): > > + * dfg/DFGJITCodeGenerator.h: > > + * dfg/DFGNonSpeculativeJIT.cpp: > > + (JSC::DFG::NonSpeculativeJIT::compile): > > + * dfg/DFGSpeculativeJIT.h: > > + (JSC::DFG::SpeculativeJIT::isInteger): > > + (JSC::DFG::SpeculativeJIT::shouldSpeculateDouble): > > + (JSC::DFG::SpeculativeJIT::shouldNotSpeculateInteger): > > + (JSC::DFG::SpeculativeJIT::shouldSpeculateInteger): > > It's great to put line-by-line comments next to this auto-generated helper text, but in cases where you don't do that, I think it's better just to delete the text. OK - I'll switch this to putting comments next to the auto-generated text. > > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:400 > > + return !(shouldNotSpeculateInteger(op1) || shouldNotSpeculateInteger(op2)) && (shouldSpeculateInteger(op1) || shouldSpeculateInteger(op2)); > > The fight between shouldNotSpeculateInteger() and shouldSpeculateInteger() is a bit mind-bending, but I don't have a specific suggestion to make right now. > > > Source/JavaScriptCore/wtf/Platform.h:968 > > -#define ENABLE_TIERED_COMPILATION 0 > > +#define ENABLE_TIERED_COMPILATION 1 > > Hey man, if you want to enable it so badly, why don't you just submit a patch? ;) Turns out that this patch is probably a prerequisite for enabling it, but it may not be sufficient. In browser, we're seeing somewhat different speculations with awkward behavior as a result. This patch will fix part, but not all, of that.
(In reply to comment #6) > > Source/JavaScriptCore/ChangeLog:38 > > + * dfg/DFGGenerationInfo.h: > > + (JSC::DFG::isJSFormat): > > + (JSC::DFG::isJSInteger): > > + (JSC::DFG::isJSDouble): > > + (JSC::DFG::isJSCell): > > + (JSC::DFG::isJSBoolean): > > + (JSC::DFG::GenerationInfo::isJSFormat): > > + (JSC::DFG::GenerationInfo::isJSInteger): > > + (JSC::DFG::GenerationInfo::isJSDouble): > > + (JSC::DFG::GenerationInfo::isJSCell): > > + (JSC::DFG::GenerationInfo::isJSBoolean): > > + * dfg/DFGJITCodeGenerator.cpp: > > + (JSC::DFG::JITCodeGenerator::isKnownInteger): > > + (JSC::DFG::JITCodeGenerator::isKnownNumeric): > > + (JSC::DFG::JITCodeGenerator::isKnownCell): > > + (JSC::DFG::JITCodeGenerator::isKnownNotInteger): > > + (JSC::DFG::JITCodeGenerator::isKnownBoolean): > > + * dfg/DFGJITCodeGenerator.h: > > + * dfg/DFGNonSpeculativeJIT.cpp: > > + (JSC::DFG::NonSpeculativeJIT::compile): > > + * dfg/DFGSpeculativeJIT.h: > > + (JSC::DFG::SpeculativeJIT::isInteger): > > + (JSC::DFG::SpeculativeJIT::shouldSpeculateDouble): > > + (JSC::DFG::SpeculativeJIT::shouldNotSpeculateInteger): > > + (JSC::DFG::SpeculativeJIT::shouldSpeculateInteger): > > It's great to put line-by-line comments next to this auto-generated helper text, but in cases where you don't do that, I think it's better just to delete the text. I’m not sure I agree. Some people like to be able to search for modified functions by deleting the change log. Lets not belabor the point in this patch, but might want to talk to others, perhaps including Maciej, about this.
Created attachment 107234 [details] the patch - fix review, and hardened shouldSpeculateInteger() even more Here are the latest perf numbers: Benchmark report for SunSpider, V8, and Kraken. VMs tested: "TipOfTree" at /Volumes/Data/pizlo/octonary/OpenSource/WebKitBuild/Release/jsc "TipOfTreeDyn" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc "CarefulNumber" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree TipOfTreeDyn CarefulNumber CarefulNumber v. TipOfTree SunSpider: 3d-cube 7.7000+-0.1110 ! 9.0728+-0.2063 ^ 7.7518+-0.1422 ? 3d-morph 7.4553+-0.1671 ? 7.5690+-0.1943 7.4558+-0.1516 ? 3d-raytrace 7.6957+-0.2284 7.5390+-0.1541 ? 7.7188+-0.2071 ? access-binary-trees 2.2838+-0.0962 ? 2.4106+-0.1220 2.2463+-0.0583 might be 1.0167x faster access-fannkuch 11.7680+-0.1695 11.6634+-0.1932 ? 11.7183+-0.2692 access-nbody 4.2794+-0.0558 4.2430+-0.1707 4.1893+-0.0973 might be 1.0215x faster access-nsieve 2.6066+-0.1488 2.5871+-0.0419 2.5849+-0.0392 bitops-3bit-bits-in-byte 1.7161+-0.0694 1.6783+-0.0393 ? 1.6990+-0.0539 might be 1.0101x faster bitops-bits-in-byte 3.2165+-0.0406 ? 3.2256+-0.0493 3.1281+-0.0669 might be 1.0283x faster bitops-bitwise-and 3.6744+-0.0978 3.5958+-0.1036 3.5818+-0.0980 might be 1.0258x faster bitops-nsieve-bits 5.4681+-0.1144 5.4025+-0.1006 5.3010+-0.1118 might be 1.0315x faster controlflow-recursive 1.9870+-0.0448 ? 2.0253+-0.0299 1.9628+-0.0425 might be 1.0123x faster crypto-aes 6.6115+-0.1839 ! 7.3610+-0.4124 7.1159+-0.3231 ? might be 1.0763x slower crypto-md5 2.9036+-0.0785 2.7903+-0.0977 2.7666+-0.1011 might be 1.0495x faster crypto-sha1 2.3284+-0.0785 2.2361+-0.0427 ^ 2.1552+-0.0330 ^ definitely 1.0804x faster date-format-tofte 10.3591+-0.4120 10.3206+-0.2688 10.2628+-0.2429 date-format-xparb 8.4283+-0.2384 ? 8.9335+-0.3087 8.5139+-0.1523 ? might be 1.0102x slower math-cordic 6.3591+-0.1185 6.3367+-0.0792 ^ 6.1589+-0.0684 ^ definitely 1.0325x faster math-partial-sums 7.7329+-0.1796 ^ 7.3818+-0.1621 7.3804+-0.1093 ^ definitely 1.0478x faster math-spectral-norm 2.5507+-0.1063 ? 2.6231+-0.0423 2.6071+-0.0585 ? might be 1.0221x slower regexp-dna 10.8829+-0.2408 ? 11.0444+-0.3196 10.9296+-0.1949 ? string-base64 6.0589+-0.2107 5.8801+-0.2482 ? 5.9180+-0.1902 might be 1.0238x faster string-fasta 7.3014+-0.1081 ! 8.3146+-0.2175 8.3007+-0.1999 ! definitely 1.1369x slower string-tagcloud 12.1434+-0.4217 12.0108+-0.3340 11.9435+-0.3274 might be 1.0167x faster string-unpack-code 19.1970+-0.5470 19.0773+-0.4865 18.8295+-0.3295 might be 1.0195x faster string-validate-input 7.1812+-0.2631 6.8250+-0.2467 6.7450+-0.3059 might be 1.0647x faster <arithmetic> 6.5342+-0.0469 ! 6.6211+-0.0351 ^ 6.4987+-0.0275 <geometric> 5.4040+-0.0381 ? 5.4584+-0.0191 ^ 5.3546+-0.0229 <harmonic> 4.4180+-0.0393 ? 4.4379+-0.0164 ^ 4.3547+-0.0310 might be 1.0145x faster TipOfTree TipOfTreeDyn CarefulNumber CarefulNumber v. TipOfTree V8: crypto 92.6777+-1.0560 ^ 85.8287+-0.6768 84.6729+-1.3058 ^ definitely 1.0945x faster deltablue 271.7779+-2.4911 ^ 258.1274+-2.1974 257.1388+-2.2389 ^ definitely 1.0569x faster earley-boyer 95.3850+-0.7488 ? 95.9525+-0.9046 ? 96.9092+-1.7912 ? might be 1.0160x slower raytrace 78.7186+-0.7819 77.7404+-0.5089 76.6954+-0.5746 ^ definitely 1.0264x faster regexp 108.6307+-0.6516 ? 109.0979+-1.1529 107.6484+-0.5817 richards 231.8681+-2.1057 ^ 218.4648+-1.6917 218.0566+-1.9601 ^ definitely 1.0633x faster splay 101.9574+-0.4023 ? 102.9560+-0.8176 ? 102.9860+-0.8933 ? might be 1.0101x slower <arithmetic> 140.1451+-0.4827 ^ 135.4525+-0.3950 134.8725+-0.5967 ^ definitely 1.0391x faster <geometric> 125.3196+-0.3113 ^ 122.1319+-0.3559 121.4995+-0.4868 ^ definitely 1.0314x faster <harmonic> 114.7691+-0.3329 ^ 112.4043+-0.3475 111.7151+-0.4317 ^ definitely 1.0273x faster TipOfTree TipOfTreeDyn CarefulNumber CarefulNumber v. TipOfTree Kraken: ai-astar 1115.9489+-13.6925 ? 1127.2781+-4.3517 ! 1148.8542+-9.6147 ! definitely 1.0295x slower audio-beat-detection 483.6177+-5.5121 ^ 468.2725+-3.1906 ? 472.6012+-2.7622 ^ definitely 1.0233x faster audio-dft 435.9011+-11.7825 422.9108+-2.6337 ? 425.0530+-5.1124 might be 1.0255x faster audio-fft 368.5819+-1.3054 ? 369.9146+-3.2159 ^ 365.7559+-0.3816 ^ definitely 1.0077x faster audio-oscillator 381.7322+-1.6193 ^ 356.1275+-2.5445 353.8487+-2.4433 ^ definitely 1.0788x faster imaging-darkroom 536.6514+-4.1188 ^ 511.1556+-3.5697 509.2351+-2.1197 ^ definitely 1.0538x faster imaging-desaturate 622.3301+-2.8078 ^ 219.1983+-2.3835 218.1094+-0.8011 ^ definitely 2.8533x faster imaging-gaussian-blur 1737.6811+-4.6308 1735.7990+-10.7049 ? 1742.4213+-8.1005 ? json-parse-financial 49.0553+-0.3802 ? 49.3396+-0.5432 ! 50.7809+-0.6047 ! definitely 1.0352x slower json-stringify-tinderbox 69.8076+-1.5159 69.1354+-0.3432 68.6172+-0.2549 might be 1.0173x faster stanford-crypto-aes 146.7155+-2.4606 144.8160+-0.6303 ? 145.1716+-0.4200 might be 1.0106x faster stanford-crypto-ccm 114.1883+-2.5628 114.1723+-1.0068 ^ 112.0443+-0.7322 might be 1.0191x faster stanford-crypto-pbkdf2 333.9798+-2.2400 ! 401.9650+-2.1643 401.8433+-2.0952 ! definitely 1.2032x slower stanford-crypto-sha256-iterative 132.0898+-0.7086 ! 150.4716+-0.4734 ^ 148.9359+-0.9755 ! definitely 1.1275x slower <arithmetic> 466.3058+-1.7094 ^ 438.6111+-0.8533 ? 440.2337+-1.1670 ^ definitely 1.0592x faster <geometric> 300.2954+-1.1813 ^ 281.3736+-0.5503 ? 281.4741+-0.5687 ^ definitely 1.0669x faster <harmonic> 185.3853+-0.8586 ^ 180.5717+-0.6557 ? 181.0842+-0.6153 ^ definitely 1.0238x faster TipOfTree TipOfTreeDyn CarefulNumber CarefulNumber v. TipOfTree All benchmarks: <arithmetic> 163.3869+-0.4742 ^ 154.4866+-0.2504 ? 154.8158+-0.3607 ^ definitely 1.0554x faster <geometric> 28.5619+-0.0953 ^ 28.0616+-0.0556 ^ 27.7467+-0.0759 ^ definitely 1.0294x faster <harmonic> 7.8052+-0.0677 ? 7.8354+-0.0283 ^ 7.6916+-0.0537 might be 1.0148x faster
Comment on attachment 107234 [details] the patch - fix review, and hardened shouldSpeculateInteger() even more View in context: https://bugs.webkit.org/attachment.cgi?id=107234&action=review r=me but please remove stray printfs. > Source/JavaScriptCore/ChangeLog:9 > + This is a 17% speed-up on 3b-cube. Typo: "3b". > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:390 > + printf("%d is a double constant.\n", nodeIndex); Stray printf. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:400 > + printf("%d produces a double.\n", nodeIndex); And here. > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:405 > + printf("%d has a double prediction.\n", nodeIndex); And here.
Landed in r95060.