RESOLVED FIXED 157399
[JSC] Get rid of NonNegZeroDouble, it is broken
https://bugs.webkit.org/show_bug.cgi?id=157399
Summary [JSC] Get rid of NonNegZeroDouble, it is broken
Benjamin Poulain
Reported 2016-05-05 16:48:01 PDT
[JSC] Get rid of NonNegZeroDouble, it is broken
Attachments
Patch (11.65 KB, patch)
2016-05-05 17:23 PDT, Benjamin Poulain
no flags
Patch for landing (11.69 KB, patch)
2016-05-05 18:27 PDT, Benjamin Poulain
no flags
Benjamin Poulain
Comment 1 2016-05-05 17:23:08 PDT
Mark Lam
Comment 2 2016-05-05 17:30:17 PDT
Comment on attachment 278208 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=278208&action=review r=me > Source/JavaScriptCore/ChangeLog:14 > + garanteed to end up in a recompile loop. typo: /garanteed/guaranteed/. > Source/JavaScriptCore/ChangeLog:17 > + -We speculate you have Int32 depsite producing doubles. typo: /depsite/despite/. > Source/JavaScriptCore/ChangeLog:18 > + -We OSR exit on an other node (ValueToInt32 for example) from the result of this ArithMul. typo: /an other/another/.
Mark Lam
Comment 3 2016-05-05 17:30:59 PDT
Can we have perf numbers to confirm that this is a good thing?
Benjamin Poulain
Comment 4 2016-05-05 17:50:29 PDT
Sure: Conf#1 Conf#2 SunSpider: 3d-cube 8.0846+-0.3743 8.0459+-0.2575 3d-morph 7.8293+-0.2526 7.8116+-0.1000 3d-raytrace 8.6219+-0.1723 8.6099+-0.0754 access-binary-trees 3.1553+-0.1817 2.9854+-0.1235 might be 1.0569x faster access-fannkuch 8.6970+-0.1854 ? 8.7535+-0.2042 ? access-nbody 4.2295+-0.2341 4.1943+-0.2971 access-nsieve 4.4164+-0.1774 ? 4.4925+-0.1428 ? might be 1.0172x slower bitops-3bit-bits-in-byte 1.5233+-0.0870 ? 1.5759+-0.1377 ? might be 1.0345x slower bitops-bits-in-byte 4.8295+-0.0882 4.8259+-0.1795 bitops-bitwise-and 2.5401+-0.4048 ? 2.5502+-0.2171 ? bitops-nsieve-bits 4.5153+-0.1806 4.4999+-0.2230 controlflow-recursive 3.3527+-0.0785 3.3500+-0.1449 crypto-aes 6.6942+-0.2688 ? 6.9943+-0.3257 ? might be 1.0448x slower crypto-md5 3.6370+-0.1232 ? 3.7021+-0.1336 ? might be 1.0179x slower crypto-sha1 3.3739+-0.2113 ? 3.4261+-0.1425 ? might be 1.0155x slower date-format-tofte 12.5109+-0.5631 ? 12.8644+-0.2872 ? might be 1.0283x slower date-format-xparb 7.3212+-0.4179 7.2598+-0.2875 math-cordic 4.4948+-0.1218 4.4470+-0.2341 might be 1.0107x faster math-partial-sums 8.3100+-0.2988 ? 8.3120+-0.3920 ? math-spectral-norm 3.2461+-0.1463 3.1270+-0.1298 might be 1.0381x faster regexp-dna 9.9422+-0.1440 ? 10.1365+-0.1793 ? might be 1.0195x slower string-base64 6.6079+-0.2937 ? 6.8386+-0.7408 ? might be 1.0349x slower string-fasta 8.7205+-0.2405 8.5977+-0.1590 might be 1.0143x faster string-tagcloud 13.7374+-0.5479 13.4935+-0.5171 might be 1.0181x faster string-unpack-code 27.5786+-1.2333 27.5617+-0.4642 string-validate-input 6.4800+-0.2418 ? 6.6505+-0.4393 ? might be 1.0263x slower <arithmetic> 7.0942+-0.0951 ? 7.1195+-0.0483 ? might be 1.0036x slower Conf#1 Conf#2 Octane: encrypt 0.27384+-0.00265 0.27241+-0.00445 decrypt 4.95010+-0.01478 ? 4.99333+-0.17006 ? deltablue x2 0.22430+-0.01941 0.20671+-0.00270 might be 1.0851x faster earley 0.50916+-0.00116 0.50767+-0.00636 boyer 8.61669+-0.15549 8.55408+-0.09662 navier-stokes x2 6.45209+-0.04611 6.43432+-0.01428 raytrace x2 1.38773+-0.04821 ? 1.38773+-0.03789 ? richards x2 0.13988+-0.00163 ? 0.14312+-0.00345 ? might be 1.0231x slower splay x2 0.52022+-0.00754 0.51874+-0.00838 regexp x2 27.47515+-0.22017 27.36919+-0.38902 pdfjs x2 60.43762+-0.91144 60.07478+-1.49832 mandreel x2 67.93454+-1.61191 67.26845+-0.09297 gbemu x2 55.01648+-2.00744 ? 55.26581+-1.82016 ? closure 0.83052+-0.00235 0.82970+-0.00841 jquery 10.47478+-0.13644 ? 10.51687+-0.18635 ? box2d x2 16.15290+-0.14111 16.11744+-0.20800 zlib x2 544.71757+-3.02098 544.45734+-4.92492 typescript x2 1065.56360+-7.26251 1058.31647+-10.26854 <geometric> 8.29946+-0.03544 8.24984+-0.02481 might be 1.0060x faster Conf#1 Conf#2 Kraken: ai-astar 132.196+-0.666 131.749+-0.676 audio-beat-detection 65.659+-1.021 65.544+-0.916 audio-dft 128.693+-0.780 ? 129.680+-1.228 ? audio-fft 54.094+-0.580 ? 55.206+-1.691 ? might be 1.0206x slower audio-oscillator 78.290+-0.406 ? 78.573+-0.708 ? imaging-darkroom 96.707+-0.360 96.335+-0.341 imaging-desaturate 92.778+-0.873 92.473+-0.714 imaging-gaussian-blur 118.468+-0.573 ^ 105.053+-4.718 ^ definitely 1.1277x faster json-parse-financial 61.865+-0.263 ^ 61.173+-0.318 ^ definitely 1.0113x faster json-stringify-tinderbox 39.372+-0.298 39.231+-0.392 stanford-crypto-aes 59.222+-0.259 ? 59.385+-0.613 ? stanford-crypto-ccm 53.943+-2.839 ? 54.580+-0.890 ? might be 1.0118x slower stanford-crypto-pbkdf2 142.284+-1.253 140.631+-0.595 might be 1.0117x faster stanford-crypto-sha256-iterative 55.831+-0.902 ? 56.101+-0.846 ? <arithmetic> 84.243+-0.430 ^ 83.265+-0.319 ^ definitely 1.0117x faster Conf#1 Conf#2 AsmBench: bigfib.cpp 676.8022+-8.1014 653.6763+-22.9224 might be 1.0354x faster cray.c 572.9987+-5.4750 570.0806+-5.9688 dry.c 658.0568+-29.5948 657.9347+-28.2415 FloatMM.c 889.4448+-1.4215 889.3693+-0.5424 gcc-loops.cpp 6420.7363+-39.9265 6390.6307+-16.4554 n-body.c 1551.2932+-82.5923 ? 1576.6259+-82.3750 ? might be 1.0163x slower Quicksort.c 599.7567+-5.7173 ? 600.9503+-3.5734 ? stepanov_container.cpp 4477.7452+-44.8298 4437.8970+-72.5955 Towers.c 390.1205+-0.4479 ? 390.3469+-0.3121 ? <geometric> 1100.4780+-11.7008 1096.1716+-3.9390 might be 1.0039x faster Conf#1 Conf#2 Geomean of preferred means: <scaled-result> 48.3353+-0.2945 48.1180+-0.0780 might be 1.0045x faster
Benjamin Poulain
Comment 5 2016-05-05 18:27:54 PDT
Created attachment 278216 [details] Patch for landing
WebKit Commit Bot
Comment 6 2016-05-05 19:30:13 PDT
Comment on attachment 278216 [details] Patch for landing Clearing flags on attachment: 278216 Committed r200502: <http://trac.webkit.org/changeset/200502>
WebKit Commit Bot
Comment 7 2016-05-05 19:30:18 PDT
All reviewed patches have been landed. Closing bug.
Filip Pizlo
Comment 8 2016-05-06 12:22:15 PDT
Comment on attachment 278216 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=278216&action=review > Source/JavaScriptCore/ChangeLog:9 > + The profile "NonNegZeroDouble" is fundamentally broken. Why is it fundamentally broken?
Filip Pizlo
Comment 9 2016-05-06 12:27:31 PDT
Comment on attachment 278216 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=278216&action=review > Source/JavaScriptCore/bytecode/ValueProfile.h:224 > - NonNegZeroDouble = 1 << 0, > - NegZeroDouble = 1 << 1, > - NonNumber = 1 << 2, > - Int32Overflow = 1 << 3, > - Int52Overflow = 1 << 4, > + NegZeroDouble = 1 << 0, > + NonNumber = 1 << 1, > + Int32Overflow = 1 << 2, > + Int52Overflow = 1 << 3, I really don't like what you did here. Prior to your change, this had a clear story to tell: NonNegZeroDouble and NegZeroDouble can be used together to form the set of doubles. Now all you can do here is detect if something is NegZeroDouble. I get that the previous profiling was incomplete, but that's not a good reason to break it even more.
Filip Pizlo
Comment 10 2016-05-06 12:37:44 PDT
Comment on attachment 278216 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=278216&action=review > Source/JavaScriptCore/ChangeLog:26 > + In LLINT, the flag is only set on the slow path. > + Since double*double is on the fast path, those cases are ignored. It's OK for this flag to never be set by the LLInt, because the code must have been profiled in the baseline JIT before making it to the DFG. > Source/JavaScriptCore/ChangeLog:30 > + In Baseline, the flag is set for any case that falls back on double > + multiplication. BUT, the DFG flag was only set for nodes that spend > + many iteration in slow path, which obviously does not apply to double*double. I'm confused by this. Are you saying that the problem here is that this occurred so rarely that the baseline JIT never saw it, and the only time we would see it was upon OSR exit from DFG? If that's the case then I think it's fixable. We need the DFG to immediately speculate that the result of the Mul is Int32, right after executing it. That speculation should feed a BadResultType ExitKind back to the op_mul. Then the DFG won't make the same mistake upon recompile.
Filip Pizlo
Comment 11 2016-05-06 12:51:20 PDT
Comment on attachment 278216 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=278216&action=review > Source/JavaScriptCore/ChangeLog:14 > + The problem is you are likely to mispredict, and when you do, you are > + guaranteed to end up in a recompile loop. You introduced a different kind of bug in this patch. If a program multiplies integers but one of the integers comes from an object with a valueOf method, then all downstream logic that touches either that integer or any integer derived from it via any computation will see a double instead. This will also lead to recompilations, since in baseline, the operation would have returned integers and so all downstream uses would report integers in their profiling. Then when the multiplication tiers up into the DFG, it will start producing doubles. Then everything else will have to recompile. It's great that your rollout fixed a bug on one website without hurting any of synthetic benchmarks. But for changes to the profiling, I think we have to be extra careful. We should avoid introducing this kind of int->double pollution as much as possible. It's as bad as recompilations. And in this case, you could have just fixed the profiler to catch the case where the DFG exited. It's not hard to catch. We already have support for this for ValueProfiles.
Benjamin Poulain
Comment 12 2016-05-06 13:58:51 PDT
(In reply to comment #11) > Comment on attachment 278216 [details] > Patch for landing > > View in context: > https://bugs.webkit.org/attachment.cgi?id=278216&action=review > > > Source/JavaScriptCore/ChangeLog:14 > > + The problem is you are likely to mispredict, and when you do, you are > > + guaranteed to end up in a recompile loop. > > You introduced a different kind of bug in this patch. If a program > multiplies integers but one of the integers comes from an object with a > valueOf method, then all downstream logic that touches either that integer > or any integer derived from it via any computation will see a double > instead. This will also lead to recompilations, since in baseline, the > operation would have returned integers and so all downstream uses would > report integers in their profiling. Then when the multiplication tiers up > into the DFG, it will start producing doubles. Then everything else will > have to recompile. > > It's great that your rollout fixed a bug on one website without hurting any > of synthetic benchmarks. But for changes to the profiling, I think we have > to be extra careful. We should avoid introducing this kind of int->double > pollution as much as possible. It's as bad as recompilations. And in this > case, you could have just fixed the profiler to catch the case where the DFG > exited. It's not hard to catch. We already have support for this for > ValueProfiles. If the goal of NonNegZeroDouble was to type the fallback snippet properly, that is not at all what the original patch did. Again, if that the original patch did not do any of what it was supposed to, I do not believe it is worth trying to salvage it. Better reimplement this properly.
Filip Pizlo
Comment 13 2016-05-06 14:16:50 PDT
Comment on attachment 278216 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=278216&action=review >>> Source/JavaScriptCore/ChangeLog:14 >>> + guaranteed to end up in a recompile loop. >> >> You introduced a different kind of bug in this patch. If a program multiplies integers but one of the integers comes from an object with a valueOf method, then all downstream logic that touches either that integer or any integer derived from it via any computation will see a double instead. This will also lead to recompilations, since in baseline, the operation would have returned integers and so all downstream uses would report integers in their profiling. Then when the multiplication tiers up into the DFG, it will start producing doubles. Then everything else will have to recompile. >> >> It's great that your rollout fixed a bug on one website without hurting any of synthetic benchmarks. But for changes to the profiling, I think we have to be extra careful. We should avoid introducing this kind of int->double pollution as much as possible. It's as bad as recompilations. And in this case, you could have just fixed the profiler to catch the case where the DFG exited. It's not hard to catch. We already have support for this for ValueProfiles. > > If the goal of NonNegZeroDouble was to type the fallback snippet properly, that is not at all what the original patch did. > > Again, if that the original patch did not do any of what it was supposed to, I do not believe it is worth trying to salvage it. Better reimplement this properly. I don't know why you say that it's "not at all" what the original patch did. Clearly, it's setting the bit if the baseline JIT sees double. Do you dispute this?
Note You need to log in before you can comment on or make changes to this bug.