RESOLVED FIXED169914
[JSC][DFG] Propagate AnyIntAsDouble information carefully to utilize it in fixup
https://bugs.webkit.org/show_bug.cgi?id=169914
Summary [JSC][DFG] Propagate AnyIntAsDouble information carefully to utilize it in fixup
Yusuke Suzuki
Reported 2017-03-21 06:22:54 PDT
[JSC][DFG] Propagate AnyIntAsDouble information carefully to utilize it in fixup
Attachments
Patch (5.39 KB, patch)
2017-03-21 06:23 PDT, Yusuke Suzuki
no flags
Patch (5.08 KB, patch)
2017-03-21 07:57 PDT, Yusuke Suzuki
buildbot: commit-queue-
WIP (10.72 KB, patch)
2017-03-21 13:43 PDT, Yusuke Suzuki
no flags
WIP (9.87 KB, patch)
2017-03-21 14:06 PDT, Yusuke Suzuki
no flags
WIP: Patch becomes much solid (6.37 KB, patch)
2017-03-21 15:41 PDT, Yusuke Suzuki
no flags
WIP: Cleaning up right now (9.71 KB, patch)
2017-03-21 18:04 PDT, Yusuke Suzuki
no flags
Patch (11.18 KB, patch)
2017-03-22 13:50 PDT, Yusuke Suzuki
saam: review+
Yusuke Suzuki
Comment 1 2017-03-21 06:23:32 PDT
Created attachment 305000 [details] Patch WIP
Yusuke Suzuki
Comment 2 2017-03-21 06:31:49 PDT
We should investigate the heap prediction to offer proper Double speculation for the result of GetByVal. By doing so, we can propagate AnyIntAsDouble information in prediction propagation phase. And later, in fixup, we can select ArithAdd(Int52, Int52) for such integers. The current implementation is sooooooooo fragile, it's WIP. We should select appropriate ValueAdd fixup super carefully. For example, when both values are AnyIntAsDouble, we have a chance to make it to ArithAdd(Int52, Int52). But if it always overflows, we should select ArithAdd(Double, Double). Good news: With the current WIP, we can gain perf improvement in kraken crypto benchmarks. baseline patched stanford-crypto-aes 58.704+-2.227 56.238+-2.149 might be 1.0438x faster stanford-crypto-ccm 52.842+-3.927 51.027+-2.285 might be 1.0356x faster stanford-crypto-pbkdf2 130.434+-4.405 ^ 94.416+-2.397 ^ definitely 1.3815x faster stanford-crypto-sha256-iterative 44.315+-0.970 ^ 33.151+-0.708 ^ definitely 1.3368x faster <arithmetic> 71.574+-1.483 ^ 58.708+-0.960 ^ definitely 1.2191x faster
Yusuke Suzuki
Comment 3 2017-03-21 07:57:56 PDT
Created attachment 305003 [details] Patch WIP
Yusuke Suzuki
Comment 4 2017-03-21 08:35:41 PDT
kraken-imaging-darkroom shows 20% regression. Nice catch, let's investigate what happens.
Filip Pizlo
Comment 5 2017-03-21 08:48:24 PDT
Comment on attachment 305003 [details] Patch This is super interesting. It's a good idea if it becomes not-fragile!
Build Bot
Comment 6 2017-03-21 09:09:17 PDT
Comment on attachment 305003 [details] Patch Attachment 305003 [details] did not pass jsc-ews (mac): Output: http://webkit-queues.webkit.org/results/3376907 New failing tests: stress/exit-from-getter.js.ftl-no-cjit-no-inline-validate stress/exit-from-getter.js.no-cjit-collect-continuously jsc-layout-tests.yaml/js/script-tests/dfg-int-overflow-in-loop.js.layout-dfg-eager-no-cjit jsc-layout-tests.yaml/js/script-tests/dfg-int-overflow-in-loop.js.layout-no-ftl stress/exit-from-getter.js.dfg-eager-no-cjit-validate jsc-layout-tests.yaml/js/script-tests/dfg-int-overflow-in-loop.js.layout-no-llint jsc-layout-tests.yaml/js/script-tests/dfg-int-overflow-in-loop.js.layout-no-cjit stress/exit-from-getter.js.ftl-eager stress/exit-from-getter.js.ftl-no-cjit-validate-sampling-profiler stress/exit-from-getter.js.ftl-no-cjit-no-put-stack-validate stress/exit-from-getter.js.no-cjit-validate-phases
Yusuke Suzuki
Comment 7 2017-03-21 09:14:23 PDT
(In reply to comment #4) > kraken-imaging-darkroom shows 20% regression. Nice catch, let's investigate > what happens. We should honor DoubleConstant. It is offered by SourceCodeRepresentation::Double. It means that users explicitly write 1.0 instead of 1. In that case, handling it as double should be preferable.
Yusuke Suzuki
Comment 8 2017-03-21 13:43:54 PDT
Yusuke Suzuki
Comment 9 2017-03-21 14:06:59 PDT
Yusuke Suzuki
Comment 10 2017-03-21 14:32:00 PDT
Comment on attachment 305026 [details] WIP View in context: https://bugs.webkit.org/attachment.cgi?id=305026&action=review Haha, we found bunch of related issues! > Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:478 > + forNode(node).setType(SpecAnyInt); Note: should be SpecAnyInt since the above constant value can be any int. > Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:408 > + changed |= mergePrediction(SpecBytecodeNumber); Note: Oops! > Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:2568 > + speculate(NegativeZero, noValue(), 0, m_out.isZero64(result)); Note: Oops! part 2.
Yusuke Suzuki
Comment 11 2017-03-21 15:41:27 PDT
Created attachment 305038 [details] WIP: Patch becomes much solid
Yusuke Suzuki
Comment 12 2017-03-21 16:49:58 PDT
(In reply to Yusuke Suzuki from comment #7) > (In reply to comment #4) > > kraken-imaging-darkroom shows 20% regression. Nice catch, let's investigate > > what happens. > > We should honor DoubleConstant. It is offered by > SourceCodeRepresentation::Double. It means that users explicitly write 1.0 > instead of 1. In that case, handling it as double should be preferable. That's rather caused by Int52Rep's incorrect AI result.
Yusuke Suzuki
Comment 13 2017-03-21 18:04:29 PDT
Created attachment 305055 [details] WIP: Cleaning up right now
Yusuke Suzuki
Comment 14 2017-03-21 18:06:38 PDT
Performance numbers. Seems good. Benchmark report for SunSpider, Octane, and Kraken on sakura-trick. VMs tested: "baseline" at /home/yusukesuzuki/dev/WebKit/WebKitBuild/ic-dfg-tot/Release/bin/jsc "patched" at /home/yusukesuzuki/dev/WebKit/WebKitBuild/ic-dfg-value-add/Release/bin/jsc Collected 50 samples per benchmark/VM, with 50 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. baseline patched SunSpider: 3d-cube 9.0468+-0.4192 8.8036+-0.4336 might be 1.0276x faster 3d-morph 25.1012+-3.2549 ? 26.3637+-3.4597 ? might be 1.0503x slower 3d-raytrace 8.9271+-0.4894 ? 9.1181+-0.4437 ? might be 1.0214x slower access-binary-trees 4.1115+-0.2068 ? 4.1644+-0.2009 ? might be 1.0129x slower access-fannkuch 11.3397+-0.6874 10.7369+-0.6946 might be 1.0561x faster access-nbody 5.1591+-0.2796 ? 5.1814+-0.2438 ? access-nsieve 6.6204+-0.3874 6.3539+-0.3811 might be 1.0419x faster bitops-3bit-bits-in-byte 2.3271+-0.1316 ? 2.4111+-0.1198 ? might be 1.0361x slower bitops-bits-in-byte 5.7656+-0.2966 5.4785+-0.3563 might be 1.0524x faster bitops-bitwise-and 4.2864+-0.2719 ? 4.5494+-0.2811 ? might be 1.0614x slower bitops-nsieve-bits 6.1865+-0.3359 6.1447+-0.3429 controlflow-recursive 5.0490+-0.2080 5.0349+-0.2069 crypto-aes 7.7566+-0.4096 ? 8.2666+-0.3546 ? might be 1.0657x slower crypto-md5 5.0399+-0.2592 4.8526+-0.2656 might be 1.0386x faster crypto-sha1 5.0899+-0.2812 ? 5.3179+-0.3035 ? might be 1.0448x slower date-format-tofte 12.7879+-0.6442 ? 13.2374+-0.6718 ? might be 1.0352x slower date-format-xparb 9.6938+-0.4552 ? 9.7779+-0.4665 ? math-cordic 5.5725+-0.2822 ? 5.7865+-0.2725 ? might be 1.0384x slower math-partial-sums 11.1509+-0.5344 ? 11.3622+-0.5034 ? might be 1.0189x slower math-spectral-norm 4.1114+-0.2003 ? 4.1346+-0.1848 ? regexp-dna 12.4260+-0.7087 ? 12.6531+-0.6848 ? might be 1.0183x slower string-base64 8.0524+-0.3953 7.7479+-0.4451 might be 1.0393x faster string-fasta 10.7088+-0.5774 10.5849+-0.6265 might be 1.0117x faster string-tagcloud 15.0000+-0.6955 ? 15.8826+-0.5997 ? might be 1.0588x slower string-unpack-code 33.6959+-1.1267 ^ 31.2100+-1.0928 ^ definitely 1.0796x faster string-validate-input 7.7340+-0.3487 7.5341+-0.3061 might be 1.0265x faster <arithmetic> 9.3362+-0.1705 9.3342+-0.1536 might be 1.0002x faster baseline patched Octane: encrypt 0.22284+-0.00726 0.21788+-0.00208 might be 1.0228x faster decrypt 3.19350+-0.03107 3.15163+-0.01135 might be 1.0133x faster deltablue x2 0.19005+-0.00489 0.18991+-0.00378 earley 0.38626+-0.00634 0.38608+-0.00754 boyer 6.81411+-0.05849 6.78422+-0.04951 navier-stokes x2 5.13978+-0.02146 ? 5.14224+-0.02263 ? raytrace x2 1.07106+-0.00759 ? 1.07823+-0.00639 ? richards x2 0.10447+-0.00262 0.10354+-0.00372 splay x2 0.41359+-0.00187 ? 0.41372+-0.00254 ? regexp x2 24.17902+-0.57214 23.88639+-0.34467 might be 1.0123x faster pdfjs x2 52.99913+-1.12541 51.55391+-0.97258 might be 1.0280x faster mandreel x2 61.73522+-2.28842 59.43819+-0.62873 might be 1.0386x faster gbemu x2 47.52163+-1.43188 47.43213+-1.47181 closure 0.76502+-0.01333 0.75352+-0.01460 might be 1.0153x faster jquery 9.76510+-0.24014 ? 9.84420+-0.24252 ? box2d x2 13.43593+-0.27014 13.22219+-0.36775 might be 1.0162x faster zlib x2 467.78036+-4.54568 466.03177+-5.53952 typescript x2 874.60354+-18.01238 856.91126+-17.50218 might be 1.0206x faster <geometric> 6.84285+-0.03025 6.77972+-0.03310 might be 1.0093x faster baseline patched Kraken: ai-astar 156.991+-4.220 ? 160.604+-3.616 ? might be 1.0230x slower audio-beat-detection 52.085+-2.036 ? 54.572+-2.393 ? might be 1.0478x slower audio-dft 104.422+-1.980 102.996+-2.037 might be 1.0138x faster audio-fft 40.316+-0.724 40.158+-0.751 audio-oscillator 71.545+-1.686 70.198+-2.657 might be 1.0192x faster imaging-darkroom 117.680+-2.308 ? 120.226+-2.270 ? might be 1.0216x slower imaging-desaturate 73.859+-3.123 73.527+-1.867 imaging-gaussian-blur 107.189+-2.530 107.122+-2.535 json-parse-financial 62.838+-1.349 60.334+-1.798 might be 1.0415x faster json-stringify-tinderbox 39.174+-1.706 ? 39.392+-0.957 ? stanford-crypto-aes 58.237+-1.560 56.834+-1.717 might be 1.0247x faster stanford-crypto-ccm 48.910+-2.176 ? 50.467+-1.892 ? might be 1.0318x slower stanford-crypto-pbkdf2 128.624+-2.491 ^ 91.417+-1.972 ^ definitely 1.4070x faster stanford-crypto-sha256-iterative 43.015+-0.876 ^ 31.761+-0.818 ^ definitely 1.3543x faster <arithmetic> 78.920+-0.666 ^ 75.686+-0.493 ^ definitely 1.0427x faster baseline patched Geomean of preferred means: <scaled-result> 17.1374+-0.1164 ^ 16.8491+-0.1070 ^ definitely 1.0171x faster
Saam Barati
Comment 15 2017-03-22 12:06:46 PDT
Comment on attachment 305055 [details] WIP: Cleaning up right now View in context: https://bugs.webkit.org/attachment.cgi?id=305055&action=review > Source/JavaScriptCore/bytecode/SpeculatedType.h:439 > +inline bool isAnyIntSpeculationForAdd(SpeculatedType value) > +{ > + return !!value && (value & (SpecAnyInt | SpecAnyIntAsDouble)) == value; > +} Style nit: This seems specialized enough to the DFG's use case that I'd probably make this a lambda inside the function that calls this. The reason is that the name implies to me that this might definitely be the case, but what's really happening is we're making an optimistic assumption.
Yusuke Suzuki
Comment 16 2017-03-22 13:08:38 PDT
Comment on attachment 305055 [details] WIP: Cleaning up right now View in context: https://bugs.webkit.org/attachment.cgi?id=305055&action=review >> Source/JavaScriptCore/bytecode/SpeculatedType.h:439 >> +} > > Style nit: This seems specialized enough to the DFG's use case that I'd probably make this a lambda inside the function that calls this. The reason is that the name implies to me that this might definitely be the case, but what's really happening is we're making an optimistic assumption. Thanks. OK, I've moved this to DFGGraph::addShouldSpeculateAnyInt.
Yusuke Suzuki
Comment 17 2017-03-22 13:50:52 PDT
Yusuke Suzuki
Comment 18 2017-03-22 13:51:08 PDT
OK, ready.
Saam Barati
Comment 19 2017-03-22 18:17:04 PDT
Comment on attachment 305124 [details] Patch r=me
Yusuke Suzuki
Comment 20 2017-03-22 23:01:51 PDT
Yusuke Suzuki
Comment 21 2017-03-23 00:51:01 PDT
Track the darkroom regression in https://bugs.webkit.org/show_bug.cgi?id=169998
Note You need to log in before you can comment on or make changes to this bug.