Everyone should exit backwards.
The plan:

- SetLocals: split up MovHint from SetLocal. When we lower bytecode instruction B to nodes N1..Nn, we place a MovHint at the end of the node sequence for B, and then we place a SetLocal at the start of the node sequence for B+1. We do this by having the bytecode parser keep a list of deferred SetLocals. The MovHints shouldn't have VariableAccessData. SetLocals should never turn into MovHints. For flushed and captured locals, we rely on the fact that exiting between the MovHint and SetLocal will cause the value to get flushed anyway. For captured locals the situation is actually particularly humorous because we will have a SetLocal associated with the bytecode instruction *after* the captured_mov. That's all fine.
- UInt32ToDouble: reveal it in bytecode. No more forward exit, problem solved.
- DoubleAsInt32: we could be clever for this one. But it's not worth it. Just make it exit backwards and place Phantoms right after it.
- GetByVal on Uint32Array: just use a backwards exit.
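To make the deferred-SetLocal idea in the first bullet concrete, here is a minimal toy sketch in Python. It is not the DFGByteCodeParser code: `ToyParser`, `Node`, `lower`, and `begin_instruction` are hypothetical names invented for illustration, and the model ignores VariableAccessData, flushing, and captured locals. It only shows the ordering: the MovHint lands at the end of the node sequence for instruction B, and the matching SetLocal lands at the start of the sequence for B+1.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    op: str              # e.g. "Add", "MovHint", "SetLocal" (toy opcodes)
    bytecode_index: int  # instruction this node is attributed to
    operand: str = ""    # local being hinted or stored, if any


@dataclass
class ToyParser:
    """Hypothetical stand-in for a bytecode parser; an assumption, not WebKit's API."""
    nodes: list = field(default_factory=list)
    deferred_set_locals: list = field(default_factory=list)

    def begin_instruction(self, index):
        # SetLocals deferred by the previous instruction are emitted first,
        # so they are attributed to the instruction that is starting now.
        for local in self.deferred_set_locals:
            self.nodes.append(Node("SetLocal", index, local))
        self.deferred_set_locals.clear()

    def lower(self, index, ops, result_local=None):
        self.begin_instruction(index)
        for op in ops:
            self.nodes.append(Node(op, index))
        if result_local is not None:
            # The MovHint closes the node sequence for this instruction;
            # the actual store is postponed to the next instruction.
            self.nodes.append(Node("MovHint", index, result_local))
            self.deferred_set_locals.append(result_local)


parser = ToyParser()
parser.lower(0, ["GetLocal", "Add"], result_local="loc1")
parser.lower(1, ["Return"])
for node in parser.nodes:
    print(node.bytecode_index, node.op, node.operand)
```

In this toy model every node that can exit while lowering instruction 0 sits before the MovHint, and the SetLocal carries index 1. A real parser would also need to flush any still-deferred SetLocals at the end of a basic block, which the sketch leaves out.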
Created attachment 219944 [details] first parts
Created attachment 219968 [details] more!
Created attachment 219974 [details]
got something that makes some sense

But I think that to really do this right, I need to finally fix the PhantomArguments nonsense. https://bugs.webkit.org/show_bug.cgi?id=126218
Created attachment 219998 [details] getting close
Created attachment 220006 [details] passing some tests
Created attachment 220009 [details] rebased
Comment on attachment 220009 [details]
rebased

View in context: https://bugs.webkit.org/attachment.cgi?id=220009&action=review

Yay code death!

> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:586
> + case MovHint:
> + case ZombieHint:

Is it okay to hit these two now? Previously we would have error'd out
Comment on attachment 220009 [details]
rebased

View in context: https://bugs.webkit.org/attachment.cgi?id=220009&action=review

>> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:586
>> + case ZombieHint:
>
> Is it okay to hit these two now? Previously we would have error'd out

Yeah, that's the whole point of this patch. ByteCodeParser now inserts MovHints. That's why there is so much code death. That of course implies that phases shouldn't assert that there aren't any MovHints. I do the same for ZombieHint even though ByteCodeParser doesn't insert it, just because asserting otherwise doesn't buy anything.
Created attachment 220011 [details]
it's passing *so* many tests

I think that it's all basically in place. I just need to port to 32-bit (easy) and to FTL (more code death!).
(In reply to comment #10)
> Created an attachment (id=220011) [details]
> it's passing *so* many tests
>
> I think that it's all basically in place. I just need to port to 32-bit (easy) and to FTL (more code death!).

Oh yeah, and I still have to check if this makes the tests from https://bug-125523-attachments.webkit.org/attachment.cgi?id=219283 pass.
Created attachment 220033 [details] the patch
Attachment 220033 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h', u'Source/JavaScriptCore/dfg/DFGArrayifySlowPathGenerator.h', u'Source/JavaScriptCore/dfg/DFGBackwardsPropagationPhase.cpp', u'Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp', u'Source/JavaScriptCore/dfg/DFGClobberize.h', u'Source/JavaScriptCore/dfg/DFGCommon.h', u'Source/JavaScriptCore/dfg/DFGConstantFoldingPhase.cpp', u'Source/JavaScriptCore/dfg/DFGDCEPhase.cpp', u'Source/JavaScriptCore/dfg/DFGFixupPhase.cpp', u'Source/JavaScriptCore/dfg/DFGLICMPhase.cpp', u'Source/JavaScriptCore/dfg/DFGMinifiedNode.cpp', u'Source/JavaScriptCore/dfg/DFGMinifiedNode.h', u'Source/JavaScriptCore/dfg/DFGNode.cpp', u'Source/JavaScriptCore/dfg/DFGNode.h', u'Source/JavaScriptCore/dfg/DFGNodeFlags.cpp', u'Source/JavaScriptCore/dfg/DFGNodeFlags.h', u'Source/JavaScriptCore/dfg/DFGNodeType.h', u'Source/JavaScriptCore/dfg/DFGOSRAvailabilityAnalysisPhase.cpp', u'Source/JavaScriptCore/dfg/DFGOSREntrypointCreationPhase.cpp', u'Source/JavaScriptCore/dfg/DFGOSRExit.cpp', u'Source/JavaScriptCore/dfg/DFGOSRExit.h', u'Source/JavaScriptCore/dfg/DFGOSRExitBase.cpp', u'Source/JavaScriptCore/dfg/DFGOSRExitBase.h', u'Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp', u'Source/JavaScriptCore/dfg/DFGSSAConversionPhase.cpp', u'Source/JavaScriptCore/dfg/DFGSafeToExecute.h', u'Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp', u'Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h', u'Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp', u'Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp', u'Source/JavaScriptCore/dfg/DFGTypeCheckHoistingPhase.cpp', u'Source/JavaScriptCore/dfg/DFGValidate.cpp', u'Source/JavaScriptCore/dfg/DFGVariableEventStream.cpp', u'Source/JavaScriptCore/ftl/FTLCapabilities.cpp', u'Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp', 
u'Source/JavaScriptCore/ftl/FTLOSRExit.cpp', u'Source/JavaScriptCore/ftl/FTLOSRExit.h', u'Source/JavaScriptCore/tests/stress/dead-int32-to-double.js', u'Source/JavaScriptCore/tests/stress/dead-uint32-to-number.js', '--commit-queue']" exit_code: 1
ERROR: Source/JavaScriptCore/dfg/DFGDCEPhase.cpp:212: When wrapping a line, only indent 4 spaces. [whitespace/indent] [3]
Total errors found: 1 in 25 files

If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 220033 [details]
the patch

Clearing r?. Needs some more work.
I still have some performance regressions to investigate.

Benchmark report for SunSpider, LongSpider, V8Spider, Octane, Kraken, and JSRegress on oldmac (MacPro4,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc (r161072)
"NoMoreForward" at /Volumes/Data/fromMiniMe/tertiary/OpenSource/WebKitBuild/Release/jsc (r161072)

Collected 10 samples per benchmark/VM, with 10 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

   TipOfTree   NoMoreForward

SunSpider:
   3d-cube   7.7477+-0.0655 ?   7.8040+-0.0684 ?
   3d-morph   8.8383+-0.0580 ?   8.9054+-0.0531 ?
   3d-raytrace   8.9696+-0.1516 ?   9.1067+-0.1551 ?   might be 1.0153x slower
   access-binary-trees   2.1243+-0.0205 ?   2.1502+-0.0291 ?   might be 1.0122x slower
   access-fannkuch   7.9958+-0.0946 !   8.2260+-0.0626 !   definitely 1.0288x slower
   access-nbody   4.2594+-0.0067 !   4.2879+-0.0114 !   definitely 1.0067x slower
   access-nsieve   5.0340+-0.0481   5.0122+-0.0430
   bitops-3bit-bits-in-byte   1.8939+-0.0178 ?   1.9051+-0.0063 ?
   bitops-bits-in-byte   7.1334+-0.1183 ?   7.1619+-0.1152 ?
   bitops-bitwise-and   3.0377+-0.0584 ?   3.0417+-0.0516 ?
   bitops-nsieve-bits   4.6469+-0.0057 !   4.7000+-0.0050 !   definitely 1.0114x slower
   controlflow-recursive   3.1760+-0.0202 ?   3.2100+-0.0231 ?   might be 1.0107x slower
   crypto-aes   5.5861+-0.0542 ?   5.6480+-0.0338 ?   might be 1.0111x slower
   crypto-md5   3.3692+-0.0226 !   3.4390+-0.0236 !   definitely 1.0207x slower
   crypto-sha1   3.0251+-0.0303 ?   3.0389+-0.0060 ?
   date-format-tofte   12.0573+-0.1279   11.8568+-0.2270   might be 1.0169x faster
   date-format-xparb   8.7362+-0.1108   8.6558+-0.0770
   math-cordic   4.2879+-0.0179 ?   4.3172+-0.0228 ?
   math-partial-sums   10.2010+-0.1525 ?   10.2306+-0.1136 ?
   math-spectral-norm   2.7587+-0.0068 !   2.8089+-0.0131 !   definitely 1.0182x slower
   regexp-dna   12.9280+-0.1087 ?   13.0476+-0.1113 ?
   string-base64   5.7965+-0.0799 !   5.9296+-0.0330 !   definitely 1.0230x slower
   string-fasta   10.5137+-0.0996   10.4503+-0.0870
   string-tagcloud   15.5365+-0.1118   15.4937+-0.1666
   string-unpack-code   31.4263+-0.1543 ?   31.5493+-0.4988 ?
   string-validate-input   7.0883+-0.0931 ?   7.1331+-0.1378 ?

   <arithmetic> *   7.6218+-0.0197 ?   7.6581+-0.0203 ?   might be 1.0048x slower
   <geometric>   6.1040+-0.0154 !   6.1457+-0.0123 !   definitely 1.0068x slower
   <harmonic>   5.0086+-0.0125 !   5.0509+-0.0131 !   definitely 1.0084x slower

   TipOfTree   NoMoreForward

LongSpider:
   3d-cube   2687.1113+-6.1242   2680.3029+-5.6173
   3d-morph   1499.6746+-1.3573 ?   1499.8336+-1.0011 ?
   3d-raytrace   1517.9514+-21.0093   1510.8118+-5.6789
   access-binary-trees   2455.1927+-15.6564 ?   2471.5546+-14.5309 ?
   access-fannkuch   665.4782+-0.3369 !   671.7787+-4.0167 !   definitely 1.0095x slower
   access-nbody   1496.0047+-0.8569 ?   1496.3180+-1.0250 ?
   access-nsieve   1548.7951+-4.6670 ?   1551.2909+-2.6881 ?
   bitops-3bit-bits-in-byte   125.9864+-0.0956 ?   126.0005+-0.1283 ?
   bitops-bits-in-byte   603.3671+-4.8342   598.4732+-4.1046
   bitops-nsieve-bits   1049.4697+-0.4431 ?   1050.0849+-0.6175 ?
   controlflow-recursive   1473.0839+-0.4837   1472.3620+-0.7985
   crypto-aes   1661.9899+-3.5560 !   1690.0070+-3.3103 !   definitely 1.0169x slower
   crypto-md5   1164.5818+-0.8966 !   1238.1303+-1.0405 !   definitely 1.0632x slower
   crypto-sha1   1611.6601+-18.5114 ?   1625.2485+-5.9088 ?
   date-format-tofte   1230.4447+-10.3433 ^   1206.8781+-4.0405 ^   definitely 1.0195x faster
   date-format-xparb   1484.8765+-13.0692 ^   1445.6019+-9.1896 ^   definitely 1.0272x faster
   math-cordic   1737.2906+-2.4877 ?   1738.6585+-2.2659 ?
   math-partial-sums   1308.6686+-2.0908   1308.5303+-1.8876
   math-spectral-norm   1826.3281+-1.3018   1825.7732+-0.2902
   string-base64   591.2958+-2.5032 ?   595.3154+-2.3568 ?
   string-fasta   995.9428+-3.7675 ?   1001.3261+-5.0387 ?
   string-tagcloud   392.1862+-1.7731 ^   388.8681+-1.4390 ^   definitely 1.0085x faster

   <arithmetic>   1323.9718+-1.6218 !   1326.9613+-0.9489 !   definitely 1.0023x slower
   <geometric> *   1130.5281+-1.3345 !   1132.9652+-0.7583 !   definitely 1.0022x slower
   <harmonic>   819.4104+-0.8964 ?   820.3031+-0.6844 ?   might be 1.0011x slower

   TipOfTree   NoMoreForward

V8Spider:
   crypto   79.3324+-0.2074 !   80.0579+-0.1768 !   definitely 1.0091x slower
   deltablue   98.2991+-0.8466 !   100.1585+-0.8703 !   definitely 1.0189x slower
   earley-boyer   73.1583+-0.9094 !   75.7814+-0.4329 !   definitely 1.0359x slower
   raytrace   44.0949+-0.2261 !   45.9286+-0.3006 !   definitely 1.0416x slower
   regexp   100.4728+-0.6226 ?   100.5200+-0.1890 ?
   richards   134.7356+-1.5732   134.2745+-1.1417
   splay   46.0965+-0.3838 ?   47.1427+-1.3224 ?   might be 1.0227x slower

   <arithmetic>   82.3128+-0.2474 !   83.4091+-0.2853 !   definitely 1.0133x slower
   <geometric> *   76.7532+-0.1966 !   78.1133+-0.3346 !   definitely 1.0177x slower
   <harmonic>   71.2801+-0.1825 !   72.8575+-0.4323 !   definitely 1.0221x slower

   TipOfTree   NoMoreForward

Octane and V8v7:
   encrypt   0.46544+-0.00027 !   0.46913+-0.00093 !   definitely 1.0079x slower
   decrypt   8.62226+-0.09987   8.61537+-0.00791
   deltablue x2   0.56304+-0.00545 ?   0.56964+-0.00791 ?   might be 1.0117x slower
   earley   0.90549+-0.00742 ?   0.91750+-0.00601 ?   might be 1.0133x slower
   boyer   12.46417+-0.03947 !   12.93956+-0.05310 !   definitely 1.0381x slower
   raytrace x2   4.28739+-0.02931 ?   4.35041+-0.06543 ?   might be 1.0147x slower
   regexp x2   32.93311+-0.21436   32.90265+-0.12887
   richards x2   0.43088+-0.00764 ?   0.44163+-0.00548 ?   might be 1.0250x slower
   splay x2   0.64101+-0.01104   0.63656+-0.00364
   navier-stokes x2   10.69854+-0.00584 !   10.78366+-0.00282 !   definitely 1.0080x slower
   closure   0.43408+-0.00082 !   0.43603+-0.00074 !   definitely 1.0045x slower
   jquery   6.38149+-0.03133   6.36050+-0.01041
   gbemu x2   71.40098+-1.05727   70.68938+-0.34214   might be 1.0101x faster
   mandreel x2   136.42504+-0.55428 ?   136.72298+-0.16404 ?
   pdfjs x2   102.04133+-0.35905 ^   101.22505+-0.20627 ^   definitely 1.0081x faster
   box2d x2   35.01198+-0.37891 ?   35.07940+-0.13972 ?

V8v7:
   <arithmetic>   7.59783+-0.03294 ?   7.64442+-0.01865 ?   might be 1.0061x slower
   <geometric> *   2.51091+-0.01153 !   2.53612+-0.00768 !   definitely 1.0100x slower
   <harmonic>   1.03357+-0.00834 ?   1.04545+-0.00589 ?   might be 1.0115x slower

Octane including V8v7:
   <arithmetic>   31.46690+-0.12749   31.40541+-0.03596   might be 1.0020x faster
   <geometric> *   6.96917+-0.02671 ?   7.00508+-0.01477 ?   might be 1.0052x slower
   <harmonic>   1.43961+-0.00994 ?   1.45456+-0.00704 ?   might be 1.0104x slower

   TipOfTree   NoMoreForward

Kraken:
   ai-astar   494.227+-0.804 !   499.995+-4.909 !   definitely 1.0117x slower
   audio-beat-detection   224.197+-1.173 !   228.236+-2.271 !   definitely 1.0180x slower
   audio-dft   290.598+-1.527   289.962+-0.837
   audio-fft   130.681+-0.129 !   131.309+-0.251 !   definitely 1.0048x slower
   audio-oscillator   244.058+-0.396   243.848+-0.324
   imaging-darkroom   286.226+-0.618   286.161+-0.510
   imaging-desaturate   158.360+-0.150 ?   158.485+-0.228 ?
   imaging-gaussian-blur   362.678+-0.179 !   363.279+-0.302 !   definitely 1.0017x slower
   json-parse-financial   80.266+-0.284 ?   81.052+-1.038 ?
   json-stringify-tinderbox   103.755+-0.232 ?   104.104+-0.227 ?
   stanford-crypto-aes   90.973+-0.411 !   93.992+-0.432 !   definitely 1.0332x slower
   stanford-crypto-ccm   101.318+-0.761   100.943+-1.055
   stanford-crypto-pbkdf2   260.082+-2.108 ?   261.418+-1.031 ?
   stanford-crypto-sha256-iterative   114.879+-0.493   114.336+-0.471

   <arithmetic> *   210.164+-0.252 !   211.223+-0.493 !   definitely 1.0050x slower
   <geometric>   180.416+-0.207 !   181.393+-0.347 !   definitely 1.0054x slower
   <harmonic>   155.986+-0.173 !   157.002+-0.409 !   definitely 1.0065x slower

   TipOfTree   NoMoreForward

JSRegress:
   adapt-to-double-divide   22.8042+-0.0961 ?   22.8468+-0.0777 ?
   aliased-arguments-getbyval   1.0011+-0.0151 ?   1.0183+-0.0032 ?   might be 1.0172x slower
   allocate-big-object   3.0356+-0.0097 !   3.0737+-0.0167 !   definitely 1.0126x slower
   arity-mismatch-inlining   0.9662+-0.0064 !   0.9829+-0.0057 !   definitely 1.0173x slower
   array-access-polymorphic-structure   10.4376+-0.3544   10.0325+-0.1296   might be 1.0404x faster
   array-nonarray-polymorhpic-access   58.2338+-0.1714   58.0463+-0.2219
   array-with-double-add   5.7222+-0.0879 ?   5.7970+-0.0188 ?   might be 1.0131x slower
   array-with-double-increment   4.3060+-0.0595 ?   4.3683+-0.0105 ?   might be 1.0145x slower
   array-with-double-mul-add   6.8309+-0.0548 ?   6.8941+-0.0265 ?
   array-with-double-sum   8.0272+-0.1116 ?   8.0797+-0.0377 ?
   array-with-int32-add-sub   10.4290+-0.0460 ?   10.4942+-0.1256 ?
   array-with-int32-or-double-sum   8.0125+-0.0731   7.9787+-0.1047
   ArrayBuffer-DataView-alloc-large-long-lived   118.1131+-0.8454   118.0374+-1.2965
   ArrayBuffer-DataView-alloc-long-lived   30.6087+-0.1267 ?   30.7952+-0.1732 ?
   ArrayBuffer-Int32Array-byteOffset   6.2424+-0.1378   6.0867+-0.0533   might be 1.0256x faster
   ArrayBuffer-Int8Array-alloc-huge-long-lived   215.8761+-1.8303 ?   216.5015+-1.6644 ?
   ArrayBuffer-Int8Array-alloc-large-long-lived-fragmented   165.7724+-0.9543 ?   166.7524+-0.9831 ?
   ArrayBuffer-Int8Array-alloc-large-long-lived   118.2743+-0.6912 ?   119.2394+-1.1310 ?
   ArrayBuffer-Int8Array-alloc-long-lived-buffer   48.5395+-0.2610 !   49.7598+-0.3603 !   definitely 1.0251x slower
   ArrayBuffer-Int8Array-alloc-long-lived   30.3642+-0.1388 !   30.7505+-0.1622 !   definitely 1.0127x slower
   ArrayBuffer-Int8Array-alloc   26.3659+-0.2710 ?   26.5725+-0.2117 ?
   asmjs_bool_bug   9.2633+-0.1033   9.2190+-0.0759
   basic-set   19.9261+-0.1190 ?   20.3165+-0.4320 ?   might be 1.0196x slower
   big-int-mul   5.6289+-0.1977 !   6.0430+-0.0279 !   definitely 1.0736x slower
   boolean-test   4.4472+-0.0086 !   4.4889+-0.0197 !   definitely 1.0094x slower
   branch-fold   5.0073+-0.0160 ?   5.0132+-0.0075 ?
   by-val-generic   12.7380+-0.1388 ^   12.4039+-0.1768 ^   definitely 1.0269x faster
   captured-assignments   0.6461+-0.0205 ?   0.6488+-0.0114 ?
   cast-int-to-double   12.5385+-0.1219 ?   12.6116+-0.0658 ?
   cell-argument   16.3613+-0.3326   16.1082+-0.3328   might be 1.0157x faster
   cfg-simplify   3.9956+-0.0112 ?   4.0206+-0.0217 ?
   chain-custom-getter   162.7431+-7.5380   160.2406+-5.6751   might be 1.0156x faster
   chain-getter-access   496.6991+-5.0602 ?   497.0593+-4.0655 ?
   cmpeq-obj-to-obj-other   12.9000+-0.4403 ^   12.0499+-0.3175 ^   definitely 1.0706x faster
   constant-test   8.9683+-0.1175 ?   9.0479+-0.0857 ?
   DataView-custom-properties   125.4796+-0.7799   125.3826+-0.8890
   delay-tear-off-arguments-strictmode   3.6424+-0.0062 !   3.6924+-0.0097 !   definitely 1.0137x slower
   destructuring-arguments-length   176.2143+-1.2171 ?   176.5282+-1.2420 ?
   destructuring-arguments   8.9885+-0.0788   8.9101+-0.0633
   destructuring-swap   8.6646+-0.0640 ?   8.7525+-0.0527 ?   might be 1.0101x slower
   direct-arguments-getbyval   0.8694+-0.0051 ?   0.8967+-0.0312 ?   might be 1.0313x slower
   double-get-by-val-out-of-bounds   7.4979+-0.1009   7.4261+-0.0589
   double-pollution-getbyval   11.1429+-0.0657   11.0724+-0.0957
   double-pollution-putbyoffset   6.0551+-0.0296 ?   6.1104+-0.0272 ?
   double-to-int32-typed-array-no-inline   2.5860+-0.0073 !   2.6481+-0.0096 !   definitely 1.0240x slower
   double-to-int32-typed-array   2.2361+-0.0271 !   2.3395+-0.0423 !   definitely 1.0462x slower
   double-to-uint32-typed-array-no-inline   2.7458+-0.0053 !   2.8321+-0.0254 !   definitely 1.0314x slower
   double-to-uint32-typed-array   2.4729+-0.0307 !   2.5811+-0.0065 !   definitely 1.0437x slower
   empty-string-plus-int   10.9704+-0.2005 ?   11.0245+-0.0641 ?
   emscripten-cube2hash   55.5230+-0.3342 ?   55.6238+-0.4500 ?
   emscripten-memops   7074.5014+-36.2745 ?   7112.5207+-92.2659 ?
   external-arguments-getbyval   2.1418+-0.0162 ?   2.1692+-0.0237 ?   might be 1.0128x slower
   external-arguments-putbyval   3.0666+-0.0106 ?   3.1054+-0.0342 ?   might be 1.0126x slower
   fixed-typed-array-storage-var-index   1.4052+-0.0038 !   1.4284+-0.0082 !   definitely 1.0165x slower
   fixed-typed-array-storage   0.9940+-0.0042 ?   1.0324+-0.0358 ?   might be 1.0386x slower
   Float32Array-matrix-mult   6.5979+-0.0394 ?   6.6286+-0.0279 ?
   Float32Array-to-Float64Array-set   93.3661+-0.5651   92.7816+-0.6265
   Float64Array-alloc-long-lived   103.7919+-0.8400   103.3107+-0.4230
   Float64Array-to-Int16Array-set   116.8621+-0.4076 ?   117.8234+-1.0954 ?
   fold-double-to-int   20.5558+-0.2363 ?   20.7628+-0.1452 ?   might be 1.0101x slower
   for-of-iterate-array-entries   8.5566+-0.1032 ?   8.6473+-0.1208 ?   might be 1.0106x slower
   for-of-iterate-array-keys   3.4777+-0.0318   3.4653+-0.0341
   for-of-iterate-array-values   3.0015+-0.0818   2.9753+-0.0557
   function-dot-apply   3.1788+-0.0093 ^   3.1453+-0.0126 ^   definitely 1.0106x faster
   function-test   4.9035+-0.0423   4.8976+-0.0554
   get-by-id-chain-from-try-block   7.9991+-0.1037   7.9137+-0.0770   might be 1.0108x faster
   get-by-id-proto-or-self   25.8817+-0.2404 ?   26.1238+-0.2486 ?
   get-by-id-self-or-proto   23.1910+-0.5496 ?   23.5511+-0.6282 ?   might be 1.0155x slower
   get-by-val-out-of-bounds   7.2849+-0.0901   7.2012+-0.0649   might be 1.0116x faster
   get_callee_monomorphic   4.9301+-0.0292 ?   4.9861+-0.0356 ?   might be 1.0114x slower
   get_callee_polymorphic   4.6803+-0.0129 !   4.9943+-0.0217 !   definitely 1.0671x slower
   global-var-const-infer-fire-from-opt   1.0909+-0.0197   1.0803+-0.0361
   global-var-const-infer   0.8118+-0.0057 !   0.8255+-0.0078 !   definitely 1.0169x slower
   HashMap-put-get-iterate-keys   42.0478+-0.2254 !   42.4932+-0.2126 !   definitely 1.0106x slower
   HashMap-put-get-iterate   53.7629+-0.2044 !   54.6027+-0.2035 !   definitely 1.0156x slower
   HashMap-string-put-get-iterate   50.6761+-0.3177 !   51.5645+-0.2001 !   definitely 1.0175x slower
   imul-double-only   17.7637+-0.1207 ?   17.7743+-0.1270 ?
   imul-int-only   14.9138+-0.0589 ?   14.9456+-0.1433 ?
   imul-mixed   22.2844+-0.9489   21.8356+-0.1119   might be 1.0206x faster
   in-four-cases   25.8803+-0.0858 ?   25.9571+-0.0955 ?
   in-one-case-false   12.0892+-0.0722 ?   12.1152+-0.0663 ?
   in-one-case-true   12.0147+-0.0983 ?   12.1002+-0.1127 ?
   in-two-cases   12.7623+-0.1040 ?   12.9287+-0.1085 ?   might be 1.0130x slower
   indexed-properties-in-objects   4.2472+-0.0282   4.2346+-0.0035
   infer-closure-const-then-mov-no-inline   15.3678+-0.1175   15.2948+-0.0736
   infer-closure-const-then-mov   28.8934+-0.0958 ?   29.0201+-0.0673 ?
   infer-closure-const-then-put-to-scope-no-inline   17.8260+-0.0732 ?   17.9267+-0.0482 ?
   infer-closure-const-then-put-to-scope   35.9740+-0.3123 ?   36.0262+-0.1364 ?
   infer-closure-const-then-reenter-no-inline   84.3804+-0.0990 ?   84.4148+-0.1377 ?
   infer-closure-const-then-reenter   36.1998+-0.2893   36.0303+-0.1269
   infer-one-time-closure-ten-vars   29.0426+-0.1155   28.9502+-0.1259
   infer-one-time-closure-two-vars   28.7890+-0.0649 ?   28.8195+-0.0926 ?
   infer-one-time-closure   28.7509+-0.1572   28.7422+-0.0902
   infer-one-time-deep-closure   58.3938+-0.3419 ?   58.6988+-0.1272 ?
   inline-arguments-access   1.6503+-0.0072 !   1.7205+-0.0148 !   definitely 1.0425x slower
   inline-arguments-aliased-access   1.7576+-0.0087 !   1.8380+-0.0045 !   definitely 1.0458x slower
   inline-arguments-local-escape   23.0905+-0.1376   23.0662+-0.2389
   inline-get-scoped-var   7.5136+-0.0730 ?   7.5300+-0.0972 ?
   inlined-put-by-id-transition   15.3927+-0.3418 ?   15.3933+-0.2914 ?
   int-or-other-abs-then-get-by-val   9.5739+-0.1165   9.5406+-0.1100
   int-or-other-abs-zero-then-get-by-val   37.3492+-0.1092   37.2312+-0.2290
   int-or-other-add-then-get-by-val   10.5988+-0.0915   10.5663+-0.1209
   int-or-other-add   11.0795+-0.1178   10.9834+-0.0651
   int-or-other-div-then-get-by-val   6.4177+-0.0332 ?   6.4579+-0.0282 ?
   int-or-other-max-then-get-by-val   8.8967+-0.2316   8.8514+-0.1131
   int-or-other-min-then-get-by-val   7.0613+-0.0961 ?   7.1271+-0.0789 ?
   int-or-other-mod-then-get-by-val   6.2482+-0.0718 ?   6.3042+-0.0193 ?
   int-or-other-mul-then-get-by-val   6.6008+-0.0909 ?   6.6739+-0.0292 ?   might be 1.0111x slower
   int-or-other-neg-then-get-by-val   8.0528+-0.0610   8.0152+-0.0833
   int-or-other-neg-zero-then-get-by-val   37.1738+-0.2402   36.8975+-0.1139
   int-or-other-sub-then-get-by-val   10.6293+-0.1375 ?   10.6710+-0.0716 ?
   int-or-other-sub   8.9853+-0.0533 ?   8.9919+-0.1109 ?
   int-overflow-local   6.4569+-0.0583 ?   6.4864+-0.1060 ?
   Int16Array-alloc-long-lived   68.0220+-0.4474   67.9823+-0.3662
   Int16Array-bubble-sort-with-byteLength   48.9381+-0.1183 ?   48.9974+-0.0853 ?
   Int16Array-bubble-sort   47.9278+-0.1856   47.9051+-0.1562
   Int16Array-load-int-mul   1.8156+-0.0050 ?   1.8227+-0.0026 ?
   Int16Array-to-Int32Array-set   91.5646+-0.8243 ?   92.0332+-1.0601 ?
   Int32Array-alloc-huge-long-lived   704.0210+-4.3887 ?   706.0086+-3.8831 ?
   Int32Array-alloc-huge   807.3482+-7.7661 ?   807.4237+-7.0678 ?
   Int32Array-alloc-large-long-lived   973.7106+-8.3239   970.6329+-8.7517
   Int32Array-alloc-large   45.4632+-0.9344   44.4254+-0.9908   might be 1.0234x faster
   Int32Array-alloc-long-lived   80.8620+-0.3737   80.8612+-0.5846
   Int32Array-alloc   4.5197+-0.0146 ?   4.5387+-0.0101 ?
   Int32Array-Int8Array-view-alloc   14.8767+-0.0943 !   15.1146+-0.0483 !   definitely 1.0160x slower
   int52-spill   12.7927+-0.1164 ?   12.9876+-0.1671 ?   might be 1.0152x slower
   Int8Array-alloc-long-lived   67.3659+-0.6365   67.0750+-0.4802
   Int8Array-load-with-byteLength   5.0477+-0.0445 ?   5.0545+-0.0450 ?
   Int8Array-load   5.0586+-0.0067   5.0455+-0.0588
   integer-divide   15.1311+-0.0934   15.0140+-0.0835
   integer-modulo   2.0642+-0.0123 ?   2.0738+-0.0106 ?
   large-int-captured   9.7840+-0.0954 !   9.9757+-0.0839 !   definitely 1.0196x slower
   large-int-neg   26.1343+-0.2149 ?   26.2443+-0.1737 ?
   large-int   23.0567+-0.1655 ?   23.1449+-0.1516 ?
   logical-not   10.8586+-0.1949   10.7267+-0.2202   might be 1.0123x faster
   lots-of-fields   12.5766+-0.1127 ?   12.6680+-0.0906 ?
   make-indexed-storage   4.3902+-0.0336   4.3576+-0.0257
   make-rope-cse   6.1280+-0.0709   6.1274+-0.0818
   marsaglia-larger-ints   111.9740+-0.2957 ?   112.0046+-0.1504 ?
   marsaglia-osr-entry   46.9785+-0.1613 ?   47.2933+-0.5532 ?
   marsaglia   463.8864+-0.4146   462.9979+-0.5833
   method-on-number   29.8324+-0.4967 ?   29.9110+-0.4583 ?
   negative-zero-divide   0.4418+-0.0380   0.4300+-0.0031   might be 1.0277x faster
   negative-zero-modulo   0.4099+-0.0071 ?   0.4132+-0.0071 ?
   negative-zero-negate   0.3977+-0.0185   0.3948+-0.0040
   nested-function-parsing-random   383.2315+-0.7706 !   384.9427+-0.5413 !   definitely 1.0045x slower
   nested-function-parsing   47.5037+-0.1065 ?   47.6521+-0.0725 ?
   new-array-buffer-dead   3.7642+-0.0327 ?   3.7851+-0.0102 ?
   new-array-buffer-push   10.5419+-0.1578 ?   10.5525+-0.1210 ?
   new-array-dead   28.5546+-0.1110 ?   28.5808+-0.1205 ?
   new-array-push   6.9786+-0.0927   6.9369+-0.0389
   number-test   4.4021+-0.0104 ?   4.4260+-0.0288 ?
   object-closure-call   13.4216+-0.0982 ?   13.4546+-0.1205 ?
   object-test   4.7474+-0.0225 ?   4.7662+-0.0168 ?
   poly-stricteq   87.2970+-0.9793 !   93.8995+-1.4552 !   definitely 1.0756x slower
   polymorphic-structure   20.4840+-0.1597 !   21.4524+-0.1071 !   definitely 1.0473x slower
   polyvariant-monomorphic-get-by-id   12.0462+-0.1008 ?   12.0762+-0.1456 ?
   proto-custom-getter   157.7228+-0.0933 ?   167.2331+-9.6984 ?   might be 1.0603x slower
   proto-getter-access   496.8927+-4.1834   496.2913+-2.9376
   put-by-id   19.5437+-0.2232 ?   19.6845+-0.4045 ?
   put-by-val-large-index-blank-indexing-type   21.0910+-0.2060   20.7707+-0.1255   might be 1.0154x faster
   put-by-val-machine-int   3.3564+-0.0342 ?   3.3677+-0.0081 ?
   rare-osr-exit-on-local   20.1830+-0.1061 ?   20.3437+-0.0660 ?
   register-pressure-from-osr   31.3390+-0.1198 ?   31.3535+-0.1157 ?
   simple-activation-demo   35.2670+-0.1328   35.2654+-0.2005
   simple-custom-getter   500.1667+-0.3404 !   534.2697+-31.8058 !   definitely 1.0682x slower
   simple-getter-access   782.9094+-6.7829 ?   786.8956+-8.3196 ?
   slow-array-profile-convergence   4.1095+-0.0376 ?   4.1139+-0.0286 ?
   slow-convergence   4.4710+-0.0293 !   4.5951+-0.0180 !   definitely 1.0278x slower
   sparse-conditional   1.4906+-0.0286 ?   1.5030+-0.0265 ?
   splice-to-remove   76.7857+-0.1722 ?   77.1254+-0.1903 ?
   stepanov_container   10169.4107+-18.1421 !   10354.9833+-76.5706 !   definitely 1.0182x slower
   string-concat-object   3.2545+-0.0524   3.2530+-0.0402
   string-concat-pair-object   3.1700+-0.0308   3.1691+-0.0183
   string-concat-pair-simple   17.1379+-0.3838 ?   17.2014+-0.2467 ?
   string-concat-simple   17.2894+-0.3898   17.1671+-0.3293
   string-cons-repeat   10.8613+-0.0603 ?   10.8642+-0.0315 ?
   string-cons-tower   11.3203+-0.0423 ?   11.3925+-0.1251 ?
   string-equality   42.6110+-0.1146 ?   42.8052+-0.4006 ?
   string-get-by-val-big-char   12.7328+-0.1040 ?   12.9569+-0.1219 ?   might be 1.0176x slower
   string-get-by-val-out-of-bounds-insane   6.1063+-0.7018   5.8819+-0.1419   might be 1.0382x faster
   string-get-by-val-out-of-bounds   5.3093+-0.0548 ?   5.3127+-0.0731 ?
   string-get-by-val   4.9370+-0.0618 ?   4.9765+-0.0213 ?
   string-hash   2.7766+-0.0052 ^   2.7069+-0.0104 ^   definitely 1.0258x faster
   string-long-ident-equality   39.0940+-0.1409 ?   39.3713+-0.3394 ?
   string-repeat-arith   50.4718+-0.3955   50.0146+-0.3680
   string-sub   105.1262+-0.8467   104.9879+-0.3750
   string-test   4.3470+-0.0862 ?   4.4140+-0.0194 ?   might be 1.0154x slower
   string-var-equality   73.6421+-4.8252   70.1406+-0.3491   might be 1.0499x faster
   structure-hoist-over-transitions   3.5282+-0.0206 ?   3.5528+-0.0291 ?
   switch-char-constant   3.5120+-0.0493 ?   3.5169+-0.0211 ?
   switch-char   8.1482+-0.0648 ?   8.1774+-0.0382 ?
   switch-constant   9.3343+-0.1172 ?   9.5277+-0.1042 ?   might be 1.0207x slower
   switch-string-basic-big-var   20.4855+-0.2578 ?   20.6018+-0.1379 ?
   switch-string-basic-big   22.3702+-1.3497   21.9566+-1.1229   might be 1.0188x faster
   switch-string-basic-var   20.2953+-0.0554 ?   20.3510+-0.1062 ?
   switch-string-basic   21.9674+-0.7937 ?   22.0119+-0.7842 ?
   switch-string-big-length-tower-var   29.1287+-0.2100   29.0724+-0.1701
   switch-string-length-tower-var   21.8794+-0.2325 ?   21.9596+-0.2173 ?
   switch-string-length-tower   16.5185+-0.1242 ?   16.5854+-0.0667 ?
   switch-string-short   16.6181+-0.1483   16.6005+-0.1572
   switch   13.5613+-0.1128 ?   13.5805+-0.1100 ?
   tear-off-arguments-simple   2.3709+-0.0062 !   2.4853+-0.0024 !   definitely 1.0483x slower
   tear-off-arguments   3.6440+-0.0079 !   3.7214+-0.0089 !   definitely 1.0212x slower
   temporal-structure   17.2226+-0.1039   17.1660+-0.0760
   to-int32-boolean   21.6201+-0.1567   21.5986+-0.1536
   undefined-test   4.5878+-0.0581 ?   4.6379+-0.0154 ?   might be 1.0109x slower
   weird-inlining-const-prop   2.3704+-0.0100 !   2.4334+-0.0232 !   definitely 1.0266x slower

   <arithmetic>   133.8280+-0.2644 !   135.1805+-0.6977 !   definitely 1.0101x slower
   <geometric> *   14.7482+-0.0154 !   14.8209+-0.0109 !   definitely 1.0049x slower
   <harmonic>   5.2911+-0.0312 !   5.3335+-0.0073 !   definitely 1.0080x slower

   TipOfTree   NoMoreForward

All benchmarks:
   <arithmetic>   202.9650+-0.2227 !   204.1826+-0.5335 !   definitely 1.0060x slower
   <geometric>   20.4908+-0.0166 !   20.5979+-0.0129 !   definitely 1.0052x slower
   <harmonic>   4.8264+-0.0202 !   4.8688+-0.0091 !   definitely 1.0088x slower

   TipOfTree   NoMoreForward

Geomean of preferred means:
   <scaled-result>   49.2589+-0.0385 !   49.5846+-0.0640 !   definitely 1.0066x slower
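For readers unfamiliar with the report format, the verdict column can be reproduced from the two mean+-interval pairs. The sketch below is an inference from the rows in this report, not the benchmark harness's actual code: the function name `compare`, the interval-overlap rule for "definitely", and the 1% cutoff for "might be" are all assumptions that happen to match the per-benchmark rows shown here (the summary `<arithmetic>`/`<geometric>`/`<harmonic>` rows appear to print "might be" below that cutoff, which this sketch does not model).

```python
def compare(base_mean, base_ci, new_mean, new_ci, threshold=1.01):
    """Return a verdict string in the style of the per-benchmark report rows.

    Each measurement is mean +- half-width of its 95% confidence interval.
    Assumed rule: disjoint intervals give "definitely"; overlapping intervals
    give "might be" only when the ratio reaches `threshold`.
    """
    slower = new_mean > base_mean
    ratio = new_mean / base_mean if slower else base_mean / new_mean
    word = "slower" if slower else "faster"

    # Two closed intervals overlap when each one starts before the other ends.
    overlap = (base_mean - base_ci <= new_mean + new_ci
               and new_mean - new_ci <= base_mean + base_ci)

    if not overlap:
        return f"definitely {ratio:.4f}x {word}"
    if ratio >= threshold:
        return f"might be {ratio:.4f}x {word}"
    return ""


# SunSpider access-fannkuch from the report: 7.9958+-0.0946 vs 8.2260+-0.0626
print(compare(7.9958, 0.0946, 8.2260, 0.0626))
# SunSpider 3d-raytrace: 8.9696+-0.1516 vs 9.1067+-0.1551
print(compare(8.9696, 0.1516, 9.1067, 0.1551))
```

Under these assumptions the `!` and `^` markers correspond to the "definitely slower" and "definitely faster" cases, and `?` marks a slower-looking result whose intervals still overlap.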
OK - it appears that the entire regression is due to graph size increase. I will look at ways of reducing the graph size or compensating in other ways. I think this is totally solvable and I should have expected this - what ToT is doing is essentially introducing a lot of IR awkwardness just to allow a SetLocal and a MovHint to be the same node. But there are probably better ways of achieving the same effect...
(In reply to comment #16)
> OK - it appears that the entire regression is due to graph size increase. I will look at ways of reducing the graph size or compensating in other ways. I think this is totally solvable and I should have expected this - what ToT is doing is essentially introducing a lot of IR awkwardness just to allow a SetLocal and a MovHint to be the same node. But there are probably better ways of achieving the same effect...

Actually, this appears to be because I broke peephole branch optimization! I have a fix and am testing it now...
OK, new performance numbers. I believe these are solid enough to land. Basically, we have a *slight* compile-time regression - but in return for this we get a lot of compiler sanity. I'll still play with the compile-time regression but I think even if I don't find a solution, this is ready to go.

Benchmark report for SunSpider, LongSpider, V8Spider, Octane, Kraken, and JSRegress on oldmac (MacPro4,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc (r161072)
"NoMoreForward" at /Volumes/Data/fromMiniMe/tertiary/OpenSource/WebKitBuild/Release/jsc (r161072)

Collected 10 samples per benchmark/VM, with 10 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

   TipOfTree   NoMoreForward

SunSpider:
   3d-cube   7.7370+-0.0407 ?   7.7981+-0.0376 ?
   3d-morph   8.8103+-0.0665 ?   8.9125+-0.0736 ?   might be 1.0116x slower
   3d-raytrace   8.9546+-0.1388 ?   9.0668+-0.1115 ?   might be 1.0125x slower
   access-binary-trees   2.1135+-0.0108 ?   2.1317+-0.0171 ?
   access-fannkuch   8.0254+-0.0728 ?   8.1135+-0.0431 ?   might be 1.0110x slower
   access-nbody   4.2910+-0.0589   4.2756+-0.0072
   access-nsieve   4.9995+-0.0422 ?   5.0113+-0.0473 ?
   bitops-3bit-bits-in-byte   1.8870+-0.0182 ?   1.9072+-0.0048 ?   might be 1.0107x slower
   bitops-bits-in-byte   7.1521+-0.0826 ?   7.1591+-0.0648 ?
   bitops-bitwise-and   3.0063+-0.0811 ?   3.0806+-0.0288 ?   might be 1.0247x slower
   bitops-nsieve-bits   4.6589+-0.0228 ?   4.6891+-0.0127 ?
   controlflow-recursive   3.1725+-0.0256 ?   3.1865+-0.0253 ?
   crypto-aes   5.5718+-0.0262 !   5.6269+-0.0241 !   definitely 1.0099x slower
   crypto-md5   3.3458+-0.0113 !   3.4098+-0.0271 !   definitely 1.0191x slower
   crypto-sha1   3.0045+-0.0146 ?   3.0386+-0.0480 ?   might be 1.0114x slower
   date-format-tofte   12.1441+-0.1843   12.0038+-0.2073   might be 1.0117x faster
   date-format-xparb   8.7549+-0.0591   8.7179+-0.0574
   math-cordic   4.2854+-0.0238   4.2826+-0.0513
   math-partial-sums   10.1684+-0.0649 ?   10.2828+-0.1159 ?   might be 1.0113x slower
   math-spectral-norm   2.7670+-0.0101 !   2.7984+-0.0041 !   definitely 1.0113x slower
   regexp-dna   12.9764+-0.0814   12.9704+-0.0810
   string-base64   5.7913+-0.0293 !   5.9404+-0.0357 !   definitely 1.0258x slower
   string-fasta   10.3990+-0.0442 ?   10.4592+-0.0934 ?
   string-tagcloud   15.5646+-0.1599   15.5487+-0.1611
   string-unpack-code   31.3105+-0.1309   31.1102+-0.1265
   string-validate-input   7.0962+-0.0773 ?   7.1452+-0.0911 ?

   <arithmetic> *   7.6149+-0.0181 ?   7.6410+-0.0195 ?   might be 1.0034x slower
   <geometric>   6.0962+-0.0123 !   6.1370+-0.0123 !   definitely 1.0067x slower
   <harmonic>   4.9978+-0.0083 !   5.0422+-0.0085 !   definitely 1.0089x slower

   TipOfTree   NoMoreForward

LongSpider:
   3d-cube   2687.8397+-5.1710   2680.8112+-7.2138
   3d-morph   1499.7998+-1.5775 ?   1500.1987+-1.8401 ?
   3d-raytrace   1513.2198+-10.5350   1512.8543+-22.4506
   access-binary-trees   2452.5323+-18.3547 ?   2471.6895+-10.5027 ?
   access-fannkuch   665.1397+-0.3717 ^   658.0128+-0.5343 ^   definitely 1.0108x faster
   access-nbody   1496.1419+-0.8738   1496.0277+-0.7509
   access-nsieve   1548.6860+-4.3524 ?   1548.7072+-3.7352 ?
   bitops-3bit-bits-in-byte   125.9811+-0.1221 ?   126.3846+-0.7199 ?
   bitops-bits-in-byte   602.9040+-3.8246   600.7204+-4.0931
   bitops-nsieve-bits   1050.5088+-0.9646   1049.7785+-0.8109
   controlflow-recursive   1473.0015+-0.5669   1472.4836+-0.5062
   crypto-aes   1661.2781+-4.9991 !   1687.3493+-8.6841 !   definitely 1.0157x slower
   crypto-md5   1171.2685+-16.2691 !   1240.0371+-3.5152 !   definitely 1.0587x slower
   crypto-sha1   1680.7231+-131.3260   1624.6520+-4.8496   might be 1.0345x faster
   date-format-tofte   1232.2434+-7.0994   1218.1908+-7.1591   might be 1.0115x faster
   date-format-xparb   1514.3265+-70.5821   1460.1894+-24.2241   might be 1.0371x faster
   math-cordic   1736.5387+-0.4024 ?   1740.4658+-4.4304 ?
math-partial-sums 1309.0154+-0.8354 1308.2350+-1.8905 math-spectral-norm 1826.2699+-0.4439 ? 1826.3374+-0.5837 ? string-base64 591.1884+-3.0089 ? 594.4468+-2.2112 ? string-fasta 1000.5397+-9.1203 997.1832+-5.4043 string-tagcloud 392.0066+-1.9437 390.4587+-1.8079 <arithmetic> 1328.6888+-6.1195 1327.5097+-1.3461 might be 1.0009x faster <geometric> * 1133.7120+-3.8268 1133.1715+-0.9633 might be 1.0005x faster <harmonic> 820.5279+-1.5168 ? 820.8324+-1.4959 ? might be 1.0004x slower TipOfTree NoMoreForward V8Spider: crypto 79.2281+-0.2133 ! 80.1773+-0.2365 ! definitely 1.0120x slower deltablue 98.1150+-0.5293 ! 99.8364+-0.6674 ! definitely 1.0175x slower earley-boyer 72.8347+-0.3468 ! 74.1789+-0.3027 ! definitely 1.0185x slower raytrace 44.4347+-0.8700 ! 46.2000+-0.1712 ! definitely 1.0397x slower regexp 100.7230+-0.9069 ? 101.0771+-0.7770 ? richards 134.9995+-1.3492 ^ 131.3913+-0.8325 ^ definitely 1.0275x faster splay 46.3375+-0.4017 46.3261+-0.3574 <arithmetic> 82.3818+-0.2923 ? 82.7410+-0.1528 ? might be 1.0044x slower <geometric> * 76.8586+-0.2977 ! 77.5538+-0.1029 ! definitely 1.0090x slower <harmonic> 71.4418+-0.3575 ! 72.3810+-0.1012 ! definitely 1.0131x slower TipOfTree NoMoreForward Octane and V8v7: encrypt 0.46555+-0.00047 ! 0.46791+-0.00026 ! definitely 1.0051x slower decrypt 8.57806+-0.02120 ? 8.60319+-0.01067 ? deltablue x2 0.56448+-0.00656 ? 0.56603+-0.00820 ? earley 0.90517+-0.00683 ? 0.90991+-0.00576 ? boyer 12.46218+-0.04809 ? 12.49413+-0.03108 ? raytrace x2 4.30932+-0.03232 4.30880+-0.06815 regexp x2 32.81055+-0.12539 32.71625+-0.05464 richards x2 0.43346+-0.00574 0.43291+-0.00555 splay x2 0.63757+-0.00400 0.63755+-0.00774 navier-stokes x2 10.69834+-0.00480 ! 10.83884+-0.12310 ! definitely 1.0131x slower closure 0.43340+-0.00086 0.43253+-0.00121 jquery 6.35426+-0.01745 6.35001+-0.00692 gbemu x2 72.03080+-0.69386 ? 76.66918+-6.07321 ? might be 1.0644x slower mandreel x2 136.48272+-0.56480 135.79060+-0.13469 pdfjs x2 101.66569+-0.26926 ? 
102.13125+-0.35527 ? box2d x2 34.96345+-0.18150 34.82066+-0.16753 V8v7: <arithmetic> 7.58240+-0.01811 ? 7.59224+-0.01566 ? might be 1.0013x slower <geometric> * 2.51157+-0.00795 ? 2.51754+-0.00982 ? might be 1.0024x slower <harmonic> 1.03512+-0.00637 ? 1.03661+-0.00721 ? might be 1.0014x slower Octane including V8v7: <arithmetic> 31.47659+-0.06347 ? 31.81084+-0.50006 ? might be 1.0106x slower <geometric> * 6.97104+-0.01774 ? 7.00905+-0.05269 ? might be 1.0055x slower <harmonic> 1.44113+-0.00741 ? 1.44261+-0.00837 ? might be 1.0010x slower TipOfTree NoMoreForward Kraken: ai-astar 494.019+-0.733 ? 494.499+-0.455 ? audio-beat-detection 224.665+-1.088 ? 226.655+-1.722 ? audio-dft 292.093+-5.819 291.118+-1.829 audio-fft 130.618+-0.072 ! 131.250+-0.150 ! definitely 1.0048x slower audio-oscillator 244.391+-0.487 244.240+-0.257 imaging-darkroom 286.164+-0.266 ? 287.068+-2.095 ? imaging-desaturate 158.497+-0.143 158.472+-0.255 imaging-gaussian-blur 362.890+-0.342 ? 362.996+-0.130 ? json-parse-financial 80.569+-0.270 ? 80.704+-0.398 ? json-stringify-tinderbox 103.807+-0.239 ? 104.291+-0.479 ? stanford-crypto-aes 91.267+-0.615 ? 92.045+-0.358 ? stanford-crypto-ccm 102.191+-0.768 101.887+-1.234 stanford-crypto-pbkdf2 261.040+-1.543 ? 261.411+-2.033 ? stanford-crypto-sha256-iterative 114.681+-0.417 114.557+-0.622 <arithmetic> * 210.492+-0.463 ? 210.800+-0.339 ? might be 1.0015x slower <geometric> 180.760+-0.332 ? 181.098+-0.282 ? might be 1.0019x slower <harmonic> 156.340+-0.301 ? 156.688+-0.269 ? might be 1.0022x slower TipOfTree NoMoreForward JSRegress: adapt-to-double-divide 22.7979+-0.1339 ? 22.8154+-0.1452 ? aliased-arguments-getbyval 0.9994+-0.0057 ! 1.0141+-0.0030 ! definitely 1.0146x slower allocate-big-object 3.0500+-0.0203 ? 3.0582+-0.0208 ? arity-mismatch-inlining 0.9727+-0.0234 ? 0.9857+-0.0071 ? might be 1.0133x slower array-access-polymorphic-structure 10.7340+-0.3908 10.5264+-0.4635 might be 1.0197x faster array-nonarray-polymorhpic-access 57.9910+-0.1089 ? 
58.2064+-0.1749 ? array-with-double-add 5.7874+-0.0192 5.7823+-0.0554 array-with-double-increment 4.3426+-0.0390 ? 4.3695+-0.0242 ? array-with-double-mul-add 6.8497+-0.0426 ? 6.8502+-0.0596 ? array-with-double-sum 8.0224+-0.1199 ? 8.0914+-0.1033 ? array-with-int32-add-sub 10.4396+-0.0853 ? 10.4668+-0.1118 ? array-with-int32-or-double-sum 8.0059+-0.1016 ? 8.0376+-0.0297 ? ArrayBuffer-DataView-alloc-large-long-lived 118.6983+-0.8969 118.2920+-0.9466 ArrayBuffer-DataView-alloc-long-lived 30.6334+-0.1587 ? 30.8729+-0.1941 ? ArrayBuffer-Int32Array-byteOffset 6.3819+-0.1446 ^ 6.0424+-0.0185 ^ definitely 1.0562x faster ArrayBuffer-Int8Array-alloc-huge-long-lived 216.2365+-1.3293 ? 217.8689+-4.1216 ? ArrayBuffer-Int8Array-alloc-large-long-lived-fragmented 165.9004+-1.3675 ? 166.4141+-1.0180 ? ArrayBuffer-Int8Array-alloc-large-long-lived 119.5908+-1.7590 118.9133+-1.8228 ArrayBuffer-Int8Array-alloc-long-lived-buffer 49.1861+-0.8636 49.1026+-0.2429 ArrayBuffer-Int8Array-alloc-long-lived 30.3829+-0.1682 ? 30.7534+-0.2245 ? might be 1.0122x slower ArrayBuffer-Int8Array-alloc 26.3094+-0.1785 ? 26.3680+-0.1488 ? asmjs_bool_bug 9.1604+-0.1709 ? 9.2350+-0.0880 ? basic-set 19.9898+-0.1402 ? 20.0876+-0.1816 ? big-int-mul 5.5323+-0.0259 ! 6.0817+-0.0219 ! definitely 1.0993x slower boolean-test 4.4518+-0.0208 ? 4.4762+-0.0685 ? branch-fold 4.9649+-0.0591 ? 4.9724+-0.0796 ? by-val-generic 12.7185+-0.1698 12.5507+-0.1173 might be 1.0134x faster captured-assignments 0.6499+-0.0191 0.6346+-0.0043 might be 1.0241x faster cast-int-to-double 12.4343+-0.1006 ? 12.6085+-0.0991 ? might be 1.0140x slower cell-argument 15.7150+-0.4607 ? 16.2198+-0.3605 ? might be 1.0321x slower cfg-simplify 4.0055+-0.0199 ? 4.0105+-0.0070 ? 
chain-custom-getter 164.1987+-7.3788 160.4344+-5.5917 might be 1.0235x faster chain-getter-access 495.5537+-5.7366 494.2574+-2.5374 cmpeq-obj-to-obj-other 12.6098+-0.3796 12.3684+-0.5670 might be 1.0195x faster constant-test 8.9679+-0.0973 8.9381+-0.0603 DataView-custom-properties 125.4395+-0.6597 ? 126.5586+-0.9850 ? delay-tear-off-arguments-strictmode 3.6449+-0.0132 ! 3.6981+-0.0306 ! definitely 1.0146x slower destructuring-arguments-length 176.2906+-1.4916 ? 176.3650+-1.8792 ? destructuring-arguments 8.8388+-0.1017 ? 9.0079+-0.0873 ? might be 1.0191x slower destructuring-swap 8.6667+-0.0458 ? 8.7604+-0.0977 ? might be 1.0108x slower direct-arguments-getbyval 0.8738+-0.0223 ? 0.8916+-0.0265 ? might be 1.0204x slower double-get-by-val-out-of-bounds 7.3917+-0.0575 ? 7.4300+-0.0431 ? double-pollution-getbyval 11.1245+-0.0752 11.0324+-0.0911 double-pollution-putbyoffset 6.0822+-0.0243 ! 6.3514+-0.2296 ! definitely 1.0443x slower double-to-int32-typed-array-no-inline 2.5898+-0.0164 ! 2.6479+-0.0078 ! definitely 1.0224x slower double-to-int32-typed-array 2.2179+-0.0117 ! 2.3100+-0.0116 ! definitely 1.0415x slower double-to-uint32-typed-array-no-inline 2.7513+-0.0115 ! 2.8169+-0.0101 ! definitely 1.0238x slower double-to-uint32-typed-array 2.4429+-0.0227 ! 2.5688+-0.0130 ! definitely 1.0516x slower empty-string-plus-int 10.8813+-0.0580 ! 11.1137+-0.1353 ! definitely 1.0214x slower emscripten-cube2hash 55.3991+-0.1541 ? 55.6979+-0.2447 ? emscripten-memops 7054.4150+-1.8440 ? 7073.0583+-37.9971 ? external-arguments-getbyval 2.1675+-0.0665 ? 2.1775+-0.0157 ? external-arguments-putbyval 3.0630+-0.0113 ? 3.0913+-0.0267 ? fixed-typed-array-storage-var-index 1.4035+-0.0038 ! 1.4206+-0.0058 ! definitely 1.0122x slower fixed-typed-array-storage 0.9901+-0.0033 ! 1.0103+-0.0062 ! definitely 1.0204x slower Float32Array-matrix-mult 6.5740+-0.0341 ? 6.6176+-0.0301 ? Float32Array-to-Float64Array-set 92.9186+-0.6410 ? 94.3479+-1.7139 ? 
might be 1.0154x slower Float64Array-alloc-long-lived 103.2693+-0.8573 ? 103.8233+-0.8255 ? Float64Array-to-Int16Array-set 116.9203+-0.5726 116.7661+-0.6653 fold-double-to-int 20.6336+-0.1862 ? 20.7148+-0.2728 ? for-of-iterate-array-entries 8.6304+-0.0697 ? 8.8116+-0.1821 ? might be 1.0210x slower for-of-iterate-array-keys 3.4427+-0.0490 ? 3.4604+-0.0416 ? for-of-iterate-array-values 3.0082+-0.0468 2.9680+-0.0362 might be 1.0135x faster function-dot-apply 3.1789+-0.0066 ^ 3.1371+-0.0074 ^ definitely 1.0133x faster function-test 4.8959+-0.0529 ? 4.9256+-0.0412 ? get-by-id-chain-from-try-block 7.9464+-0.1101 ? 8.0974+-0.1104 ? might be 1.0190x slower get-by-id-proto-or-self 25.9583+-0.1939 ? 26.0492+-0.2332 ? get-by-id-self-or-proto 23.7204+-0.6895 23.5455+-0.6211 get-by-val-out-of-bounds 7.2490+-0.0916 ? 7.2790+-0.0573 ? get_callee_monomorphic 4.9536+-0.0753 ? 4.9573+-0.0892 ? get_callee_polymorphic 4.6743+-0.0114 ! 4.8473+-0.0176 ! definitely 1.0370x slower global-var-const-infer-fire-from-opt 1.0627+-0.0573 ? 1.0785+-0.0271 ? might be 1.0148x slower global-var-const-infer 0.8148+-0.0023 ? 0.8177+-0.0034 ? HashMap-put-get-iterate-keys 42.2033+-0.2869 ? 42.6549+-0.3028 ? might be 1.0107x slower HashMap-put-get-iterate 53.6342+-0.2413 ! 54.4598+-0.1587 ! definitely 1.0154x slower HashMap-string-put-get-iterate 51.4289+-0.9065 ? 52.4558+-1.2172 ? might be 1.0200x slower imul-double-only 18.1764+-0.9016 17.8054+-0.1354 might be 1.0208x faster imul-int-only 14.9094+-0.0562 ? 14.9713+-0.0721 ? imul-mixed 21.8089+-0.1310 ? 21.8315+-0.1342 ? in-four-cases 25.9292+-0.0762 ? 25.9538+-0.1179 ? in-one-case-false 12.0834+-0.1413 ? 12.2130+-0.1551 ? might be 1.0107x slower in-one-case-true 12.1514+-0.1243 ? 12.1708+-0.1179 ? in-two-cases 12.8937+-0.0821 ? 12.9872+-0.1155 ? indexed-properties-in-objects 4.2305+-0.0101 ? 4.2578+-0.0258 ? infer-closure-const-then-mov-no-inline 15.3599+-0.0731 15.3459+-0.1020 infer-closure-const-then-mov 28.9018+-0.0913 ? 29.0352+-0.0611 ? 
infer-closure-const-then-put-to-scope-no-inline 17.8592+-0.0883 17.8444+-0.0645 infer-closure-const-then-put-to-scope 35.8394+-0.2004 ? 36.1314+-0.2261 ? infer-closure-const-then-reenter-no-inline 84.4838+-0.1440 ? 84.4934+-0.1485 ? infer-closure-const-then-reenter 36.1502+-0.4437 35.9537+-0.2690 infer-one-time-closure-ten-vars 28.9633+-0.0816 ? 29.0746+-0.1196 ? infer-one-time-closure-two-vars 28.8226+-0.0885 ? 28.8870+-0.1003 ? infer-one-time-closure 28.7719+-0.0978 ? 28.9587+-0.3235 ? infer-one-time-deep-closure 58.6751+-0.0750 58.5470+-0.1391 inline-arguments-access 1.6618+-0.0060 ! 1.7106+-0.0070 ! definitely 1.0294x slower inline-arguments-aliased-access 1.7622+-0.0036 ! 1.8337+-0.0152 ! definitely 1.0406x slower inline-arguments-local-escape 23.3260+-0.4183 23.1793+-0.1748 inline-get-scoped-var 7.4388+-0.1023 ? 7.4947+-0.0895 ? inlined-put-by-id-transition 15.5449+-0.2883 ? 15.6068+-0.3108 ? int-or-other-abs-then-get-by-val 9.4265+-0.1164 ? 9.5470+-0.1233 ? might be 1.0128x slower int-or-other-abs-zero-then-get-by-val 37.3098+-0.1732 ! 38.0776+-0.1568 ! definitely 1.0206x slower int-or-other-add-then-get-by-val 10.6672+-0.0744 10.6438+-0.0880 int-or-other-add 10.9677+-0.0837 10.9650+-0.1418 int-or-other-div-then-get-by-val 6.4069+-0.0626 6.3636+-0.1060 int-or-other-max-then-get-by-val 8.8351+-0.1671 ? 8.8781+-0.1878 ? int-or-other-min-then-get-by-val 7.0791+-0.0623 ? 7.1161+-0.0314 ? int-or-other-mod-then-get-by-val 6.2676+-0.0130 ? 6.3010+-0.0250 ? int-or-other-mul-then-get-by-val 6.6462+-0.0337 ? 6.6624+-0.0548 ? int-or-other-neg-then-get-by-val 7.9849+-0.0905 ? 8.0055+-0.0801 ? int-or-other-neg-zero-then-get-by-val 36.9437+-0.0988 ! 38.0377+-0.3241 ! definitely 1.0296x slower int-or-other-sub-then-get-by-val 10.6721+-0.1657 ? 10.6946+-0.0762 ? int-or-other-sub 9.0009+-0.0858 ? 9.0273+-0.0968 ? int-overflow-local 6.5084+-0.0229 ! 6.5537+-0.0214 ! definitely 1.0070x slower Int16Array-alloc-long-lived 67.7097+-0.4817 ? 67.9172+-0.4538 ? 
Int16Array-bubble-sort-with-byteLength 48.8478+-0.1200 ? 48.9618+-0.0911 ? Int16Array-bubble-sort 47.8567+-0.0629 ? 47.8974+-0.1543 ? Int16Array-load-int-mul 1.8168+-0.0098 ? 1.8305+-0.0181 ? Int16Array-to-Int32Array-set 92.9159+-1.1802 ^ 88.7736+-0.4653 ^ definitely 1.0467x faster Int32Array-alloc-huge-long-lived 705.6639+-3.8145 ? 705.6671+-2.7574 ? Int32Array-alloc-huge 801.4405+-6.7298 ? 811.1174+-7.0368 ? might be 1.0121x slower Int32Array-alloc-large-long-lived 963.9104+-6.8320 ? 976.2686+-10.6170 ? might be 1.0128x slower Int32Array-alloc-large 45.3267+-1.1212 44.9958+-0.9438 Int32Array-alloc-long-lived 80.5703+-0.6999 ? 80.8712+-0.4614 ? Int32Array-alloc 4.5289+-0.0208 ? 4.5326+-0.0174 ? Int32Array-Int8Array-view-alloc 14.8802+-0.0509 ! 15.0645+-0.0708 ! definitely 1.0124x slower int52-spill 12.8061+-0.2173 12.7028+-0.1395 Int8Array-alloc-long-lived 67.4405+-0.6475 67.4135+-0.4403 Int8Array-load-with-byteLength 5.0603+-0.0071 ! 5.0794+-0.0094 ! definitely 1.0038x slower Int8Array-load 5.0245+-0.0598 ? 5.0713+-0.0076 ? integer-divide 15.0958+-0.0718 15.0498+-0.1198 integer-modulo 2.0687+-0.0087 2.0660+-0.0106 large-int-captured 9.8063+-0.0981 ? 9.9441+-0.1006 ? might be 1.0141x slower large-int-neg 26.2267+-0.1478 ? 26.2848+-0.2060 ? large-int 23.1520+-0.1020 ? 23.1905+-0.1095 ? logical-not 10.7481+-0.1921 10.7373+-0.2451 lots-of-fields 12.6284+-0.0912 ? 12.7479+-0.0749 ? make-indexed-storage 4.2975+-0.1355 ? 4.3597+-0.0309 ? might be 1.0145x slower make-rope-cse 6.1626+-0.0564 6.0941+-0.0655 might be 1.0112x faster marsaglia-larger-ints 112.7334+-1.5025 111.8839+-0.1912 marsaglia-osr-entry 47.1420+-0.1466 47.1069+-0.0916 marsaglia 463.5552+-0.2812 ? 463.6060+-0.1824 ? method-on-number 29.5792+-0.2109 ? 29.5941+-0.1213 ? negative-zero-divide 0.4253+-0.0029 ? 0.4281+-0.0028 ? negative-zero-modulo 0.4110+-0.0056 0.4085+-0.0021 negative-zero-negate 0.3990+-0.0159 0.3943+-0.0041 might be 1.0120x faster nested-function-parsing-random 382.9724+-0.6238 ? 
383.6150+-0.6712 ? nested-function-parsing 47.8245+-0.6605 ? 47.8551+-0.0906 ? new-array-buffer-dead 3.8254+-0.0630 3.7745+-0.0116 might be 1.0135x faster new-array-buffer-push 10.7291+-0.1681 10.6652+-0.1418 new-array-dead 28.5250+-0.0590 ? 28.5450+-0.1026 ? new-array-push 6.8914+-0.0359 ? 6.9116+-0.0641 ? number-test 4.3814+-0.0454 ? 4.4179+-0.0299 ? object-closure-call 13.4986+-0.0633 13.4043+-0.0685 object-test 4.7361+-0.0127 4.7352+-0.0660 poly-stricteq 86.9439+-0.2516 86.8925+-0.3621 polymorphic-structure 20.6187+-0.2777 ! 21.4896+-0.3501 ! definitely 1.0422x slower polyvariant-monomorphic-get-by-id 11.9526+-0.0692 ? 11.9696+-0.0823 ? proto-custom-getter 160.2282+-5.6546 160.1928+-5.7260 proto-getter-access 493.0336+-4.9171 ? 495.6034+-2.9902 ? put-by-id 19.5899+-0.3073 ? 19.6366+-0.2312 ? put-by-val-large-index-blank-indexing-type 21.1444+-0.3351 20.7504+-0.1802 might be 1.0190x faster put-by-val-machine-int 3.3443+-0.0101 ! 3.3997+-0.0342 ! definitely 1.0166x slower rare-osr-exit-on-local 20.3027+-0.0934 20.2775+-0.0909 register-pressure-from-osr 31.3348+-0.0872 ? 31.4094+-0.1102 ? simple-activation-demo 35.4655+-0.4542 35.2814+-0.0993 simple-custom-getter 509.7294+-21.3770 ? 516.7259+-25.0231 ? might be 1.0137x slower simple-getter-access 787.2423+-6.3859 ? 791.0633+-6.8770 ? slow-array-profile-convergence 4.0780+-0.0393 4.0495+-0.0129 slow-convergence 4.4935+-0.0154 ! 4.5860+-0.0215 ! definitely 1.0206x slower sparse-conditional 1.4686+-0.0037 ! 1.4956+-0.0139 ! definitely 1.0184x slower splice-to-remove 76.7661+-0.1755 ? 77.3004+-0.8766 ? stepanov_container 10170.5719+-34.1377 ? 10184.3170+-20.3829 ? string-concat-object 3.2507+-0.0330 3.2418+-0.0208 string-concat-pair-object 3.1466+-0.0131 ? 3.1815+-0.0242 ? might be 1.0111x slower string-concat-pair-simple 16.9644+-0.3235 ? 17.0852+-0.3991 ? string-concat-simple 17.3857+-0.2123 17.2326+-0.3637 string-cons-repeat 10.8186+-0.0366 ? 10.9452+-0.1449 ? 
might be 1.0117x slower string-cons-tower 11.3058+-0.0517 ? 11.3214+-0.0406 ? string-equality 42.9872+-0.3900 42.9047+-0.3442 string-get-by-val-big-char 12.7058+-0.1745 ? 12.9361+-0.1202 ? might be 1.0181x slower string-get-by-val-out-of-bounds-insane 5.8717+-0.0388 ? 5.9194+-0.1803 ? string-get-by-val-out-of-bounds 5.3476+-0.0141 ^ 5.2462+-0.0791 ^ definitely 1.0193x faster string-get-by-val 4.9828+-0.0310 4.9449+-0.0697 string-hash 2.7825+-0.0211 ^ 2.6974+-0.0038 ^ definitely 1.0315x faster string-long-ident-equality 39.0970+-0.1167 ? 39.2494+-0.2871 ? string-repeat-arith 50.6061+-0.4740 ? 51.2458+-1.8089 ? might be 1.0126x slower string-sub 105.5550+-1.1782 105.3269+-0.6876 string-test 4.3789+-0.0655 ? 4.4287+-0.0419 ? might be 1.0114x slower string-var-equality 70.1328+-0.1737 ? 70.2046+-0.2452 ? structure-hoist-over-transitions 3.5490+-0.0447 3.5341+-0.0151 switch-char-constant 3.5353+-0.0710 ? 3.5414+-0.0458 ? switch-char 8.1656+-0.0835 ? 8.2875+-0.0562 ? might be 1.0149x slower switch-constant 9.3335+-0.1342 ? 9.4512+-0.1343 ? might be 1.0126x slower switch-string-basic-big-var 20.4201+-0.0225 ? 20.4668+-0.1011 ? switch-string-basic-big 22.1810+-1.2241 21.7833+-1.1704 might be 1.0183x faster switch-string-basic-var 20.3556+-0.1500 20.3348+-0.1209 switch-string-basic 21.9028+-0.9033 ? 22.2716+-0.3790 ? might be 1.0168x slower switch-string-big-length-tower-var 29.3390+-0.4496 29.1082+-0.1524 switch-string-length-tower-var 21.9693+-0.1025 21.9319+-0.0613 switch-string-length-tower 16.5774+-0.0909 16.5717+-0.0790 switch-string-short 16.6155+-0.0597 16.5795+-0.0951 switch 13.6382+-0.1323 ? 13.6925+-0.1206 ? tear-off-arguments-simple 2.3758+-0.0055 ! 2.4732+-0.0050 ! definitely 1.0410x slower tear-off-arguments 3.6354+-0.0054 ! 3.7269+-0.0376 ! definitely 1.0252x slower temporal-structure 17.1844+-0.0934 ? 17.2437+-0.0669 ? to-int32-boolean 21.4797+-0.1436 ? 21.5412+-0.1489 ? undefined-test 4.6071+-0.0183 ! 4.6383+-0.0088 ! 
definitely 1.0068x slower weird-inlining-const-prop 2.3755+-0.0090 ! 2.4166+-0.0115 ! definitely 1.0173x slower <arithmetic> 133.7356+-0.2293 ? 134.0905+-0.3003 ? might be 1.0027x slower <geometric> * 14.7447+-0.0197 ! 14.8143+-0.0118 ! definitely 1.0047x slower <harmonic> 5.2821+-0.0221 ! 5.3155+-0.0080 ! definitely 1.0063x slower TipOfTree NoMoreForward All benchmarks: <arithmetic> 203.2610+-0.4389 ? 203.4725+-0.2671 ? might be 1.0010x slower <geometric> 20.4923+-0.0200 ! 20.5853+-0.0216 ! definitely 1.0045x slower <harmonic> 4.8220+-0.0169 ? 4.8462+-0.0111 ? might be 1.0050x slower TipOfTree NoMoreForward Geomean of preferred means: <scaled-result> 49.2988+-0.0424 ! 49.4926+-0.0847 ! definitely 1.0039x slower
Created attachment 220074 [details] the patch
Hmmm, it appears that there is a gbemu issue. Here's the gbemu performance on my MBP with concurrent JIT disabled. I will continue to look at this. I suspect that the fix for this will be as small as the fix for the last perf pathology, so it's still useful to have the patch reviewed in its current form.

[pizlo@dethklok OpenSource] JSC_enableConcurrentJIT=false /Volumes/Data/pizlo/primary/Internal/Tools/Scripts/run-jsc-benchmarks TipOfTree:/Volumes/Data/pizlo/primary/OpenSource/WebKitBuild/Release/jsc NoMoreForward:WebKitBuild/Release/jsc --benchmark gbemu --outer 10
72/72
Generating benchmark report at /Volumes/Data/pizlo/tertiary/OpenSource/TipOfTree_NoMoreForward_Octane_dethklok_20131228_0914_report.txt
And raw data at /Volumes/Data/pizlo/tertiary/OpenSource/TipOfTree_NoMoreForward_Octane_dethklok_20131228_0914.json

Benchmark report for Octane on dethklok (MacBookPro9,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/primary/OpenSource/WebKitBuild/Release/jsc (r161072)
"NoMoreForward" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc (r161072)

Collected 10 samples per benchmark/VM, with 10 VM invocations per benchmark. Emitted a call to gc()
between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the
jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution
times with 95% confidence intervals in milliseconds.

                       TipOfTree                 NoMoreForward
Octane and V8v7:
   gbemu          x2   43.64871+-1.18154    !    58.01383+-0.18044    ! definitely 1.3291x slower

V8v7:
   <arithmetic>        ERROR                     ERROR
   <geometric> *       ERROR                     ERROR
   <harmonic>          ERROR                     ERROR

Octane including V8v7:
   <arithmetic>        43.64871+-1.18154    !    58.01383+-0.18044    ! definitely 1.3291x slower
   <geometric> *       43.64871+-1.18154    !    58.01383+-0.18044    ! definitely 1.3291x slower
   <harmonic>          43.64871+-1.18154    !    58.01383+-0.18044    ! definitely 1.3291x slower
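For anyone reading these reports for the first time, here is a rough sketch of how the "definitely" and "might be" verdicts seem to follow from the means and 95% confidence intervals. This is inferred from the numbers in this thread, not taken from the run-jsc-benchmarks source: the function name `classify` and the 1.01 threshold are assumptions, and the sketch ignores the summary rows, which appear to print a ratio regardless of its size.

```python
def classify(base_mean, base_ci, new_mean, new_ci, threshold=1.01):
    """Guess the verdict string for one benchmark row.

    Assumption: "definitely" means the two confidence intervals do not
    overlap; "might be" means they overlap but the ratio of the means is
    at least `threshold`. Both rules are inferred from the reports here.
    """
    base_lo, base_hi = base_mean - base_ci, base_mean + base_ci
    new_lo, new_hi = new_mean - new_ci, new_mean + new_ci

    slower = new_mean > base_mean
    ratio = new_mean / base_mean if slower else base_mean / new_mean
    direction = "slower" if slower else "faster"

    # Disjoint intervals: the difference is treated as significant.
    if new_lo > base_hi or new_hi < base_lo:
        return f"definitely {ratio:.4f}x {direction}"
    # Overlapping intervals: only mention differences of about 1% or more.
    if ratio >= threshold:
        return f"might be {ratio:.4f}x {direction}"
    return ""


# gbemu row from the report above.
print(classify(43.64871, 1.18154, 58.01383, 0.18044))
```

Under these assumptions the gbemu row comes out as "definitely 1.3291x slower", matching the report: the intervals are nowhere near overlapping.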
(In reply to comment #20) It appears that, for some reason, the second time that we DFG-compile gbemu, it takes a lot longer to do it. run-jsc-benchmarks will run gbemu in a sandbox and then run it again in a sandbox; it's that second run that gets measured. And the first run appears to go as fast with this change as without. But the second run regressed. Furthermore, if I allow more per-run warm-up, then the regression disappears - so it's the warm-up of the second run that is the slow part. Also the slow-down is greater with concurrent JIT. So, the compiler is slower to compile the second time around. Interesting. Still investigating.
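To make the warm-up observation concrete, here is a toy model of why extra warm-up can hide a slowdown that is confined to the early iterations of a run. Everything in it is hypothetical: the function `measured_time`, its parameters, and the millisecond values are made up for illustration and are not how run-jsc-benchmarks is implemented.

```python
def measured_time(steady_ms, early_extra_ms, early_iterations, warmup_iterations):
    """Time of the first measured iteration in a toy benchmark run.

    Assumption: the first `early_iterations` iterations each pay an extra
    `early_extra_ms` (for example, time spent waiting on compilation), and
    the harness discards `warmup_iterations` iterations before measuring.
    """
    first_measured_index = warmup_iterations  # iterations are numbered from 0
    still_paying = first_measured_index < early_iterations
    return steady_ms + (early_extra_ms if still_paying else 0.0)


# Hypothetical numbers: 40 ms steady state, 15 ms extra in the first 2 iterations.
print(measured_time(40.0, 15.0, early_iterations=2, warmup_iterations=1))  # extra cost is measured
print(measured_time(40.0, 15.0, early_iterations=2, warmup_iterations=3))  # extra cost is warmed away
```

With one warm-up iteration the measured iteration still carries the extra cost; with three it does not, which is the same shape as the regression disappearing when more per-run warm-up is allowed.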
(In reply to comment #21)
> So, the compiler is slower to compile the second time around. Interesting. Still investigating.

Found the issue. And boy was I wrong. It appears that MovHint wasn't accounted for in TypeCheckHoisting. It's a mystery to me, why it was the second execution of gbemu that was affected and not the first.
Latest performance numbers. All throughput regressions have been taken care of. There is a *slight* compile-time regression, and I think we should eat it. Benchmark report for SunSpider, LongSpider, V8Spider, Octane, Kraken, and JSRegress on oldmac (MacPro4,1). VMs tested: "TipOfTree" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc (r161072) "NoMoreForward" at /Volumes/Data/fromMiniMe/tertiary/OpenSource/WebKitBuild/Release/jsc (r161072) Collected 10 samples per benchmark/VM, with 10 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree NoMoreForward SunSpider: 3d-cube 7.7707+-0.0871 ? 7.8824+-0.1566 ? might be 1.0144x slower 3d-morph 8.8038+-0.1084 ? 8.8590+-0.0708 ? 3d-raytrace 8.9614+-0.1284 ? 9.0334+-0.0684 ? access-binary-trees 2.1138+-0.0141 ? 2.1619+-0.0691 ? might be 1.0227x slower access-fannkuch 7.9563+-0.0835 ? 8.1200+-0.0969 ? might be 1.0206x slower access-nbody 4.2587+-0.0066 ! 4.2919+-0.0236 ! definitely 1.0078x slower access-nsieve 5.0052+-0.0292 ? 5.0285+-0.0354 ? bitops-3bit-bits-in-byte 1.8965+-0.0174 ? 1.9208+-0.0267 ? might be 1.0128x slower bitops-bits-in-byte 7.2181+-0.0623 7.1711+-0.0499 bitops-bitwise-and 3.0131+-0.0589 ? 3.0677+-0.0380 ? might be 1.0181x slower bitops-nsieve-bits 4.6268+-0.0544 ! 4.7203+-0.0218 ! definitely 1.0202x slower controlflow-recursive 3.2028+-0.0191 3.1906+-0.0221 crypto-aes 5.5542+-0.0093 ! 5.5987+-0.0321 ! definitely 1.0080x slower crypto-md5 3.3373+-0.0099 ! 3.3952+-0.0118 ! definitely 1.0173x slower crypto-sha1 3.0063+-0.0087 ! 3.0380+-0.0227 ! definitely 1.0105x slower date-format-tofte 12.1569+-0.2034 12.0294+-0.1489 might be 1.0106x faster date-format-xparb 8.7315+-0.1043 8.6873+-0.1072 math-cordic 4.2968+-0.0323 ? 4.3562+-0.0960 ? 
might be 1.0138x slower math-partial-sums 10.2246+-0.1458 ? 10.2996+-0.0990 ? math-spectral-norm 2.7763+-0.0286 ? 2.7953+-0.0054 ? regexp-dna 12.9987+-0.0911 ? 13.0022+-0.1049 ? string-base64 5.7871+-0.0302 ! 5.9697+-0.0647 ! definitely 1.0315x slower string-fasta 10.4870+-0.0763 10.4587+-0.0828 string-tagcloud 15.6906+-0.1642 15.5980+-0.1377 string-unpack-code 31.3225+-0.1378 31.1885+-0.0810 string-validate-input 7.0844+-0.0522 ? 7.1627+-0.1045 ? might be 1.0111x slower <arithmetic> * 7.6262+-0.0126 ! 7.6549+-0.0140 ! definitely 1.0038x slower <geometric> 6.1031+-0.0079 ! 6.1503+-0.0164 ! definitely 1.0077x slower <harmonic> 5.0040+-0.0088 ! 5.0568+-0.0199 ! definitely 1.0105x slower TipOfTree NoMoreForward LongSpider: 3d-cube 2686.7349+-5.6732 2680.4095+-4.9259 3d-morph 1499.3344+-1.8816 ? 1500.0261+-1.7393 ? 3d-raytrace 1513.7138+-11.5130 1498.6762+-5.4061 might be 1.0100x faster access-binary-trees 2457.6009+-26.4828 ? 2483.5420+-31.6203 ? might be 1.0106x slower access-fannkuch 664.9965+-0.3552 ^ 658.9932+-3.9321 ^ definitely 1.0091x faster access-nbody 1496.0395+-0.9849 1495.7793+-0.8817 access-nsieve 1547.0229+-3.5808 ? 1551.3904+-5.1761 ? bitops-3bit-bits-in-byte 125.8471+-0.0819 ? 125.9064+-0.0997 ? bitops-bits-in-byte 601.2117+-2.5347 ? 601.9310+-5.6803 ? bitops-nsieve-bits 1069.3648+-30.0227 1049.4630+-0.6587 might be 1.0190x faster controlflow-recursive 1473.0030+-0.4945 ? 1480.3208+-17.9566 ? crypto-aes 1667.9490+-15.4315 1664.7693+-10.0508 crypto-md5 1165.4767+-1.4507 ! 1232.3499+-0.7388 ! definitely 1.0574x slower crypto-sha1 1611.4396+-6.0027 1608.3483+-3.7429 date-format-tofte 1232.8215+-9.5833 1224.2186+-6.8218 date-format-xparb 1500.9848+-16.2641 ^ 1456.3645+-13.0716 ^ definitely 1.0306x faster math-cordic 1736.6626+-1.1812 ? 1750.5305+-23.1365 ? math-partial-sums 1308.8438+-1.7715 ? 1309.0471+-2.3986 ? math-spectral-norm 1833.2356+-13.4300 1826.6121+-0.4708 string-base64 590.2314+-2.2888 589.4660+-2.0555 string-fasta 992.8766+-7.1027 ? 
996.0589+-4.7846 ?
   string-tagcloud   391.4510+-1.6624   389.6454+-1.3961

   <arithmetic>   1325.7656+-1.4958 ?   1326.0840+-2.4506 ?   might be 1.0002x slower
   <geometric> *   1131.7151+-1.2368   1131.5651+-1.6968   might be 1.0001x faster
   <harmonic>   819.4391+-0.5969   819.0512+-0.7493   might be 1.0005x faster

TipOfTree NoMoreForward
V8Spider:
   crypto   79.7771+-1.1944 ?   79.9386+-0.1388 ?
   deltablue   98.0596+-0.4655 !   99.6041+-0.5674 !   definitely 1.0158x slower
   earley-boyer   73.0492+-1.2485 ?   74.0662+-0.4442 ?   might be 1.0139x slower
   raytrace   44.3350+-0.3555 !   46.1064+-0.2998 !   definitely 1.0400x slower
   regexp   100.1912+-0.1154   100.1866+-0.3198
   richards   134.1635+-1.5732 ^   131.1238+-0.6062 ^   definitely 1.0232x faster
   splay   46.7470+-1.1223 ?   46.7613+-0.2829 ?

   <arithmetic>   82.3318+-0.3292 ?   82.5410+-0.1494 ?   might be 1.0025x slower
   <geometric> *   76.9007+-0.3472 !   77.4393+-0.1201 !   definitely 1.0070x slower
   <harmonic>   71.5489+-0.4120 !   72.3610+-0.1201 !   definitely 1.0114x slower

TipOfTree NoMoreForward
Octane and V8v7:
   encrypt   0.46568+-0.00030 !   0.46824+-0.00035 !   definitely 1.0055x slower
   decrypt   8.57894+-0.00702 ?   8.58905+-0.01351 ?
   deltablue x2   0.56313+-0.00338 ?   0.56842+-0.00608 ?
   earley   0.90842+-0.00680   0.90778+-0.00782
   boyer   12.44349+-0.03388 ?   12.51499+-0.04530 ?
   raytrace x2   4.31395+-0.05929 ?   4.36641+-0.05867 ?   might be 1.0122x slower
   regexp x2   32.90332+-0.13148   32.87480+-0.26387
   richards x2   0.44132+-0.00713   0.43653+-0.00977   might be 1.0110x faster
   splay x2   0.63789+-0.00369   0.63783+-0.00411
   navier-stokes x2   10.76385+-0.13875 ?   10.76497+-0.14090 ?
   closure   0.43352+-0.00080   0.43241+-0.00109
   jquery   6.35243+-0.00955   6.34127+-0.00999
   gbemu x2   71.65755+-1.10893 ?   71.76569+-1.21169 ?
   mandreel x2   136.22167+-0.97786   135.78187+-0.61829
   pdfjs x2   102.02107+-0.54935 ?   102.26446+-0.37790 ?
   box2d x2   34.97106+-0.38590 ?   35.04815+-0.44866 ?

V8v7:
   <arithmetic>   7.60272+-0.02015 ?   7.61112+-0.03893 ?   might be 1.0011x slower
   <geometric> *   2.52008+-0.00734 ?   2.52493+-0.01534 ?   might be 1.0019x slower
   <harmonic>   1.04057+-0.00538   1.04050+-0.00915   might be 1.0001x faster

Octane including V8v7:
   <arithmetic>   31.46816+-0.11611 ?   31.47200+-0.10195 ?   might be 1.0001x slower
   <geometric> *   6.98359+-0.01601 ?   6.99220+-0.03204 ?   might be 1.0012x slower
   <harmonic>   1.44768+-0.00640   1.44709+-0.01101   might be 1.0004x faster

TipOfTree NoMoreForward
Kraken:
   ai-astar   494.245+-0.733 ?   494.726+-0.459 ?
   audio-beat-detection   224.925+-0.958 ?   224.986+-1.830 ?
   audio-dft   290.340+-0.812   289.982+-0.441
   audio-fft   130.697+-0.117 ?   130.729+-0.162 ?
   audio-oscillator   244.433+-0.715   243.920+-0.430
   imaging-darkroom   287.738+-3.945   287.587+-3.982
   imaging-desaturate   158.368+-0.257 ?   158.590+-0.263 ?
   imaging-gaussian-blur   362.673+-0.211 ?   364.774+-3.893 ?
   json-parse-financial   80.624+-0.516 ?   80.812+-0.405 ?
   json-stringify-tinderbox   103.774+-0.317 ?   104.257+-0.405 ?
   stanford-crypto-aes   91.329+-0.384 ?   91.781+-0.402 ?
   stanford-crypto-ccm   102.150+-0.896   101.729+-1.292
   stanford-crypto-pbkdf2   260.090+-2.305 ?   261.708+-3.676 ?
   stanford-crypto-sha256-iterative   115.559+-1.789   115.070+-1.015

   <arithmetic> *   210.496+-0.389 ?   210.761+-0.467 ?   might be 1.0013x slower
   <geometric>   180.824+-0.315 ?   181.010+-0.391 ?   might be 1.0010x slower
   <harmonic>   156.447+-0.267 ?   156.602+-0.385 ?   might be 1.0010x slower

TipOfTree NoMoreForward
JSRegress:
   adapt-to-double-divide   22.7427+-0.1268 ?   22.8070+-0.0983 ?
   aliased-arguments-getbyval   0.9999+-0.0130 ?   1.0202+-0.0154 ?   might be 1.0203x slower
   allocate-big-object   3.0367+-0.0098 ?   3.0629+-0.0212 ?
   arity-mismatch-inlining   0.9637+-0.0040 !   0.9775+-0.0055 !   definitely 1.0143x slower
   array-access-polymorphic-structure   10.3905+-0.4230 ?   10.5821+-0.4108 ?   might be 1.0184x slower
   array-nonarray-polymorhpic-access   58.2568+-0.1692 ?   58.3820+-0.3411 ?
   array-with-double-add   5.8134+-0.0376   5.8071+-0.0138
   array-with-double-increment   4.3291+-0.0299 ?   4.3531+-0.0468 ?
   array-with-double-mul-add   6.8623+-0.0894   6.8239+-0.0837
   array-with-double-sum   7.9760+-0.1243 ?   8.0129+-0.1034 ?
   array-with-int32-add-sub   10.3805+-0.0869 ?   10.4847+-0.0746 ?   might be 1.0100x slower
   array-with-int32-or-double-sum   8.0200+-0.0366 ?   8.0634+-0.0453 ?
   ArrayBuffer-DataView-alloc-large-long-lived   118.2480+-1.0564 ?   118.5870+-0.5100 ?
   ArrayBuffer-DataView-alloc-long-lived   30.6127+-0.1521 ?   30.9072+-0.1556 ?
   ArrayBuffer-Int32Array-byteOffset   6.3678+-0.1423 ^   6.0835+-0.1164 ^   definitely 1.0467x faster
   ArrayBuffer-Int8Array-alloc-huge-long-lived   217.4163+-3.4889   216.6251+-1.9894
   ArrayBuffer-Int8Array-alloc-large-long-lived-fragmented   166.4321+-0.8320   166.1120+-1.3279
   ArrayBuffer-Int8Array-alloc-large-long-lived   117.6469+-1.2687 ?   118.2269+-1.1736 ?
   ArrayBuffer-Int8Array-alloc-long-lived-buffer   48.9378+-0.5204 ?   49.2790+-0.2798 ?
   ArrayBuffer-Int8Array-alloc-long-lived   30.4206+-0.1132 ?   30.6008+-0.1131 ?
   ArrayBuffer-Int8Array-alloc   26.4590+-0.3060 ?   26.4628+-0.2584 ?
   asmjs_bool_bug   9.3014+-0.0801   9.2731+-0.0676
   basic-set   19.8900+-0.1735   19.8848+-0.2593
   big-int-mul   5.5439+-0.0200   5.5230+-0.0711
   boolean-test   4.4316+-0.0494 ?   4.4874+-0.0189 ?   might be 1.0126x slower
   branch-fold   4.9963+-0.0069 !   5.0220+-0.0089 !   definitely 1.0051x slower
   by-val-generic   12.6913+-0.1427   12.5254+-0.1162   might be 1.0132x faster
   captured-assignments   0.6374+-0.0040 ?   0.6655+-0.0433 ?   might be 1.0441x slower
   cast-int-to-double   12.4756+-0.1211 ?   12.5516+-0.1285 ?
   cell-argument   15.7582+-0.3975 ?   15.8793+-0.4241 ?
   cfg-simplify   4.0086+-0.0393 ?   4.0105+-0.0181 ?
   chain-custom-getter   162.8260+-7.4455   160.2959+-5.6059   might be 1.0158x faster
   chain-getter-access   500.2018+-6.5513   495.1475+-5.1070   might be 1.0102x faster
   cmpeq-obj-to-obj-other   12.8920+-0.4636   12.3184+-0.4649   might be 1.0466x faster
   constant-test   9.0254+-0.0721   8.9036+-0.1302   might be 1.0137x faster
   DataView-custom-properties   126.3695+-0.8445   126.2512+-0.6817
   delay-tear-off-arguments-strictmode   3.6470+-0.0143 !   3.6884+-0.0052 !   definitely 1.0113x slower
   destructuring-arguments-length   175.6819+-2.3269 ?   177.5190+-1.8969 ?   might be 1.0105x slower
   destructuring-arguments   8.8559+-0.0928 ?   8.9486+-0.1074 ?   might be 1.0105x slower
   destructuring-swap   8.7100+-0.0828 ?   8.7927+-0.0745 ?
   direct-arguments-getbyval   0.8763+-0.0129 ?   0.8823+-0.0124 ?
   double-get-by-val-out-of-bounds   7.4235+-0.0701   7.3951+-0.0587
   double-pollution-getbyval   11.1518+-0.0671   11.1029+-0.0897
   double-pollution-putbyoffset   6.0825+-0.0410 !   6.4924+-0.2225 !   definitely 1.0674x slower
   double-to-int32-typed-array-no-inline   2.5928+-0.0194 !   2.6488+-0.0083 !   definitely 1.0216x slower
   double-to-int32-typed-array   2.2288+-0.0131 !   2.2998+-0.0051 !   definitely 1.0319x slower
   double-to-uint32-typed-array-no-inline   2.7502+-0.0133 !   2.8129+-0.0155 !   definitely 1.0228x slower
   double-to-uint32-typed-array   2.4529+-0.0255 !   2.5709+-0.0071 !   definitely 1.0481x slower
   empty-string-plus-int   10.8984+-0.0792 !   11.1141+-0.1159 !   definitely 1.0198x slower
   emscripten-cube2hash   55.6869+-0.5751 ?   55.7474+-0.7519 ?
   emscripten-memops   7055.2471+-1.2636   7052.4138+-12.4688
   external-arguments-getbyval   2.1319+-0.0170 !   2.1682+-0.0152 !   definitely 1.0170x slower
   external-arguments-putbyval   3.0689+-0.0109 ?   3.1305+-0.1219 ?   might be 1.0201x slower
   fixed-typed-array-storage-var-index   1.4140+-0.0192 ?   1.4203+-0.0075 ?
   fixed-typed-array-storage   0.9939+-0.0056 !   1.0188+-0.0095 !   definitely 1.0250x slower
   Float32Array-matrix-mult   6.5604+-0.0596 ?   6.6268+-0.0154 ?   might be 1.0101x slower
   Float32Array-to-Float64Array-set   92.9188+-0.5070 ?   94.3849+-1.5358 ?   might be 1.0158x slower
   Float64Array-alloc-long-lived   103.7086+-0.5559 ?   103.7479+-0.4745 ?
   Float64Array-to-Int16Array-set   117.6815+-1.6012   116.7557+-0.6259
   fold-double-to-int   20.6970+-0.1931   20.6781+-0.2092
   for-of-iterate-array-entries   8.6394+-0.1221 ?   8.6831+-0.1296 ?
   for-of-iterate-array-keys   3.4580+-0.0432 ?   3.4594+-0.0564 ?
   for-of-iterate-array-values   2.9744+-0.0583 ?   3.0058+-0.0373 ?   might be 1.0105x slower
   function-dot-apply   3.1969+-0.0387 ^   3.1339+-0.0065 ^   definitely 1.0201x faster
   function-test   4.8656+-0.0585 ?   4.9009+-0.0606 ?
   get-by-id-chain-from-try-block   7.9878+-0.0888 ?   8.0966+-0.0901 ?   might be 1.0136x slower
   get-by-id-proto-or-self   25.9654+-0.3094   25.8907+-0.2234
   get-by-id-self-or-proto   23.8658+-0.7419 ?   24.0109+-0.6687 ?
   get-by-val-out-of-bounds   7.2371+-0.0506 ?   7.2829+-0.0596 ?
   get_callee_monomorphic   4.8865+-0.0491 ?   5.0059+-0.1125 ?   might be 1.0245x slower
   get_callee_polymorphic   4.6928+-0.0215 !   4.8507+-0.0247 !   definitely 1.0336x slower
   global-var-const-infer-fire-from-opt   1.0951+-0.0362   1.0463+-0.0461   might be 1.0466x faster
   global-var-const-infer   0.8142+-0.0054 ?   0.8197+-0.0050 ?
   HashMap-put-get-iterate-keys   41.9820+-0.1880 !   42.6426+-0.3409 !   definitely 1.0157x slower
   HashMap-put-get-iterate   53.6611+-0.3064 !   54.4919+-0.3173 !   definitely 1.0155x slower
   HashMap-string-put-get-iterate   51.1104+-0.6247 !   52.4024+-0.3899 !   definitely 1.0253x slower
   imul-double-only   17.7427+-0.1347 ?   17.7727+-0.0620 ?
   imul-int-only   14.8241+-0.1606 ?   14.9155+-0.0984 ?
   imul-mixed   21.8230+-0.0612 ?   21.8854+-0.1352 ?
   in-four-cases   26.0243+-0.1555   25.9413+-0.1530
   in-one-case-false   12.1930+-0.2111   12.0696+-0.0880   might be 1.0102x faster
   in-one-case-true   12.1702+-0.1818   12.0379+-0.0915   might be 1.0110x faster
   in-two-cases   12.8646+-0.1009   12.8434+-0.1182
   indexed-properties-in-objects   4.2034+-0.0486 ?   4.2369+-0.0148 ?
   infer-closure-const-then-mov-no-inline   15.3924+-0.1092 ?   15.4089+-0.1186 ?
   infer-closure-const-then-mov   28.9156+-0.0993 ?   29.1696+-0.3726 ?
   infer-closure-const-then-put-to-scope-no-inline   17.8068+-0.1231 ?   17.8792+-0.1363 ?
   infer-closure-const-then-put-to-scope   35.9304+-0.2532 ?   35.9792+-0.2175 ?
   infer-closure-const-then-reenter-no-inline   84.3723+-0.1810 ?   84.4449+-0.1023 ?
   infer-closure-const-then-reenter   36.0461+-0.3386 ?   36.3387+-0.2174 ?
   infer-one-time-closure-ten-vars   29.0508+-0.0669   29.0302+-0.0652
   infer-one-time-closure-two-vars   28.8723+-0.0558   28.8016+-0.1769
   infer-one-time-closure   28.7812+-0.1184   28.7693+-0.1046
   infer-one-time-deep-closure   58.3046+-0.3957 ?   58.4623+-0.3578 ?
   inline-arguments-access   1.6652+-0.0217 !   1.7322+-0.0364 !   definitely 1.0402x slower
   inline-arguments-aliased-access   1.7682+-0.0159 !   1.8452+-0.0230 !   definitely 1.0436x slower
   inline-arguments-local-escape   23.0904+-0.2141 ?   23.5501+-0.3133 ?   might be 1.0199x slower
   inline-get-scoped-var   7.4664+-0.1039 ?   7.4814+-0.0594 ?
   inlined-put-by-id-transition   15.4099+-0.3138 ?   15.5037+-0.1969 ?
   int-or-other-abs-then-get-by-val   9.5045+-0.0561 ?   9.5367+-0.1196 ?
   int-or-other-abs-zero-then-get-by-val   37.5077+-0.4739 ?   38.2695+-0.3040 ?   might be 1.0203x slower
   int-or-other-add-then-get-by-val   10.6737+-0.0944   10.6133+-0.0755
   int-or-other-add   11.1091+-0.1866   10.9473+-0.1322   might be 1.0148x faster
   int-or-other-div-then-get-by-val   6.4487+-0.0293   6.4325+-0.0258
   int-or-other-max-then-get-by-val   8.8599+-0.2454 ?   8.9147+-0.2137 ?
   int-or-other-min-then-get-by-val   7.0839+-0.0805   7.0781+-0.0967
   int-or-other-mod-then-get-by-val   6.2138+-0.0775 ?   6.3002+-0.0245 ?   might be 1.0139x slower
   int-or-other-mul-then-get-by-val   6.6526+-0.0342 ?   6.6610+-0.0733 ?
   int-or-other-neg-then-get-by-val   7.9921+-0.0823 ?   8.0008+-0.0156 ?
   int-or-other-neg-zero-then-get-by-val   36.9343+-0.1202 !   37.7557+-0.2722 !   definitely 1.0222x slower
   int-or-other-sub-then-get-by-val   10.6918+-0.0845   10.6610+-0.1030
   int-or-other-sub   8.9964+-0.0875 ?   9.0404+-0.0926 ?
   int-overflow-local   6.4880+-0.0818 ?   6.5154+-0.0468 ?
   Int16Array-alloc-long-lived   67.5609+-0.4870 ?   68.2969+-0.4374 ?   might be 1.0109x slower
   Int16Array-bubble-sort-with-byteLength   48.9817+-0.1798 ?   49.2759+-0.6091 ?
   Int16Array-bubble-sort   47.9183+-0.1540 ?   47.9507+-0.1160 ?
   Int16Array-load-int-mul   1.8140+-0.0066 ?   1.8306+-0.0112 ?
   Int16Array-to-Int32Array-set   91.7691+-0.8777   89.8468+-1.9884   might be 1.0214x faster
   Int32Array-alloc-huge-long-lived   704.7210+-1.9232   702.6489+-3.0684
   Int32Array-alloc-huge   801.8112+-9.1708 ?   812.0109+-5.8820 ?   might be 1.0127x slower
   Int32Array-alloc-large-long-lived   974.6950+-9.3393   971.4815+-9.2652
   Int32Array-alloc-large   45.5516+-1.0413   45.5341+-1.0939
   Int32Array-alloc-long-lived   80.8311+-0.9737   80.7990+-0.4171
   Int32Array-alloc   4.5260+-0.0065 ?   4.5369+-0.0088 ?
   Int32Array-Int8Array-view-alloc   14.9161+-0.0569 !   15.0835+-0.0517 !   definitely 1.0112x slower
   int52-spill   12.7781+-0.1083   12.7601+-0.1259
   Int8Array-alloc-long-lived   67.3676+-0.4653   67.2302+-0.6704
   Int8Array-load-with-byteLength   5.0598+-0.0062 ?   5.0705+-0.0068 ?
   Int8Array-load   5.0234+-0.0556 ?   5.0482+-0.0598 ?
   integer-divide   15.0283+-0.1240 ?   15.1857+-0.0876 ?   might be 1.0105x slower
   integer-modulo   2.0616+-0.0101 ?   2.0706+-0.0088 ?
   large-int-captured   9.8416+-0.0969 !   9.9855+-0.0311 !   definitely 1.0146x slower
   large-int-neg   26.1120+-0.1628 ?   26.2685+-0.2313 ?
   large-int   23.1722+-0.1403   23.1427+-0.1821
   logical-not   10.6172+-0.2857   10.5604+-0.2835
   lots-of-fields   12.6732+-0.1015 ?   12.6803+-0.1372 ?
   make-indexed-storage   4.3678+-0.0466   4.2985+-0.1309   might be 1.0161x faster
   make-rope-cse   6.1243+-0.0644 ?   6.1931+-0.1501 ?   might be 1.0112x slower
   marsaglia-larger-ints   111.9808+-0.1205 ?   112.0232+-0.3242 ?
   marsaglia-osr-entry   46.9977+-0.1310   46.9961+-0.1573
   marsaglia   466.1451+-5.7049 ?   466.3019+-6.0786 ?
   method-on-number   30.1759+-0.8705   29.8363+-0.6955   might be 1.0114x faster
   negative-zero-divide   0.4271+-0.0081 ?   0.4285+-0.0033 ?
   negative-zero-modulo   0.4068+-0.0021 ?   0.4129+-0.0042 ?   might be 1.0149x slower
   negative-zero-negate   0.3924+-0.0040 ?   0.3936+-0.0022 ?
   nested-function-parsing-random   381.9590+-0.8553 !   383.1846+-0.3498 !   definitely 1.0032x slower
   nested-function-parsing   47.5205+-0.0543 !   47.8634+-0.1086 !   definitely 1.0072x slower
   new-array-buffer-dead   3.7685+-0.0438 ?   3.8131+-0.0460 ?   might be 1.0118x slower
   new-array-buffer-push   10.6199+-0.1555   10.6150+-0.2313
   new-array-dead   28.7165+-0.3851   28.5535+-0.1193
   new-array-push   6.9363+-0.0560 ?   6.9612+-0.0578 ?
   number-test   4.4178+-0.0303 ?   4.4439+-0.0271 ?
   object-closure-call   13.5192+-0.0782 ?   13.5818+-0.0949 ?
   object-test   4.7393+-0.0185 ?   4.7488+-0.0270 ?
   poly-stricteq   87.3225+-0.6516   86.8721+-0.5674
   polymorphic-structure   20.8079+-0.5797 ?   21.3885+-0.0805 ?   might be 1.0279x slower
   polyvariant-monomorphic-get-by-id   12.0300+-0.1178   12.0234+-0.1440
   proto-custom-getter   157.7544+-0.0608 ?   161.1137+-5.7821 ?   might be 1.0213x slower
   proto-getter-access   494.2127+-4.0239 ?   494.4868+-4.9632 ?
   put-by-id   19.8108+-0.2635   19.6645+-0.3838
   put-by-val-large-index-blank-indexing-type   21.0484+-0.2499   20.7587+-0.1179   might be 1.0140x faster
   put-by-val-machine-int   3.3388+-0.0076 !   3.3696+-0.0150 !   definitely 1.0092x slower
   rare-osr-exit-on-local   20.2268+-0.1453 ?   20.3922+-0.1128 ?
   register-pressure-from-osr   31.3952+-0.0791 ?   31.4557+-0.0710 ?
   simple-activation-demo   35.2751+-0.1041   35.2554+-0.1019
   simple-custom-getter   517.8108+-26.9562   509.9667+-21.4592   might be 1.0154x faster
   simple-getter-access   790.1361+-10.3817   790.0570+-4.7424
   slow-array-profile-convergence   4.1169+-0.0651   4.0492+-0.0194   might be 1.0167x faster
   slow-convergence   4.4743+-0.0258 !   4.6213+-0.0573 !   definitely 1.0329x slower
   sparse-conditional   1.4738+-0.0081 ?   1.5029+-0.0213 ?   might be 1.0197x slower
   splice-to-remove   76.8394+-0.1419 ?   77.2979+-0.9268 ?
   stepanov_container   10162.0477+-18.7154 ?   10203.5277+-56.2890 ?
   string-concat-object   3.2269+-0.0124 ?   3.2407+-0.0190 ?
   string-concat-pair-object   3.1847+-0.0469 ?   3.1940+-0.0430 ?
   string-concat-pair-simple   17.0686+-0.2588 ?   17.3539+-0.2940 ?   might be 1.0167x slower
   string-concat-simple   17.5087+-0.2340   17.2734+-0.4672   might be 1.0136x faster
   string-cons-repeat   10.8532+-0.0430 ?   10.8821+-0.0429 ?
   string-cons-tower   11.3388+-0.0340 ?   11.3790+-0.0412 ?
   string-equality   42.7407+-0.2897 ?   42.7701+-0.1059 ?
   string-get-by-val-big-char   12.6794+-0.0757 !   12.8562+-0.0573 !   definitely 1.0139x slower
   string-get-by-val-out-of-bounds-insane   5.8974+-0.1030   5.7710+-0.1212   might be 1.0219x faster
   string-get-by-val-out-of-bounds   5.3206+-0.0652 ?   5.3341+-0.0320 ?
   string-get-by-val   4.9626+-0.0237   4.9374+-0.0342
   string-hash   2.7838+-0.0205 ?   2.7897+-0.0041 ?
   string-long-ident-equality   39.1227+-0.0819 ?   39.1705+-0.2771 ?
   string-repeat-arith   50.3878+-0.5745 ?   50.4924+-0.3274 ?
   string-sub   104.8552+-0.8673 ?   104.8911+-0.4349 ?
   string-test   4.3379+-0.0549 ?   4.4135+-0.0229 ?   might be 1.0174x slower
   string-var-equality   71.8989+-3.4223   70.0436+-0.1551   might be 1.0265x faster
   structure-hoist-over-transitions   3.5131+-0.0088 !   3.5463+-0.0229 !   definitely 1.0095x slower
   switch-char-constant   3.4924+-0.0062 !   3.5066+-0.0074 !   definitely 1.0041x slower
   switch-char   8.1731+-0.0399   8.1526+-0.0580
   switch-constant   9.4787+-0.1028   9.3911+-0.1064
   switch-string-basic-big-var   20.3829+-0.0961 ?   20.4764+-0.1473 ?
   switch-string-basic-big   22.2445+-1.0974 ?   22.3371+-1.3547 ?
   switch-string-basic-var   20.1984+-0.0861 ?   20.3442+-0.0758 ?
   switch-string-basic   21.8081+-0.7406   21.5628+-0.7483   might be 1.0114x faster
   switch-string-big-length-tower-var   29.4119+-0.4712   29.0871+-0.0982   might be 1.0112x faster
   switch-string-length-tower-var   22.0609+-0.2785   21.9725+-0.1338
   switch-string-length-tower   16.5795+-0.0798   16.5795+-0.1014
   switch-string-short   16.6371+-0.2308   16.5920+-0.1364
   switch   13.6430+-0.1834   13.6220+-0.1305
   tear-off-arguments-simple   2.3823+-0.0118 !   2.5251+-0.1111 !   definitely 1.0600x slower
   tear-off-arguments   3.6465+-0.0148 !   3.7044+-0.0063 !   definitely 1.0159x slower
   temporal-structure   17.1798+-0.1177 ?   17.2641+-0.1325 ?
   to-int32-boolean   21.4377+-0.1297 !   21.7617+-0.1010 !   definitely 1.0151x slower
   undefined-test   4.6360+-0.0533 ?   4.6463+-0.0200 ?
   weird-inlining-const-prop   2.3766+-0.0133 !   2.4241+-0.0103 !   definitely 1.0200x slower

   <arithmetic>   133.8209+-0.1897 ?   134.0238+-0.2902 ?   might be 1.0015x slower
   <geometric> *   14.7484+-0.0139 !   14.8129+-0.0157 !   definitely 1.0044x slower
   <harmonic>   5.2757+-0.0061 !   5.3253+-0.0184 !   definitely 1.0094x slower

TipOfTree NoMoreForward
All benchmarks:
   <arithmetic>   203.1066+-0.1605 ?   203.2889+-0.2777 ?   might be 1.0009x slower
   <geometric>   20.4989+-0.0148 !   20.5802+-0.0218 !   definitely 1.0040x slower
   <harmonic>   4.8251+-0.0078 !   4.8572+-0.0147 !   definitely 1.0066x slower

TipOfTree NoMoreForward
Geomean of preferred means:
   <scaled-result>   49.3180+-0.0403 !   49.4617+-0.0643 !   definitely 1.0029x slower
Created attachment 220080 [details] the patch This fixes the gbemu regression by adding MovHint to the list of ignored opcodes in TypeCheckHoisting.
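For readers who don't have the patch open, the shape of that fix is roughly the following. This is only a sketch: the enum, the function name, and the other opcode in the ignore list are hypothetical stand-ins, not the actual TypeCheckHoisting code. The idea is that MovHint joins the set of opcodes whose use of a local is skipped when the phase decides whether a check can be hoisted.

```cpp
// Hypothetical, simplified stand-ins; NOT the real JavaScriptCore declarations.
enum class NodeType { GetLocal, SetLocal, MovHint, CheckStructure, Phantom };

// Returns true when a node of this kind should be skipped while the phase
// counts uses of a local. Which opcodes besides MovHint belong here is an
// assumption made for illustration.
constexpr bool isIgnoredByTypeCheckHoisting(NodeType op)
{
    switch (op) {
    case NodeType::Phantom:
    case NodeType::MovHint: // the parser now emits MovHints, so skip them too
        return true;
    default:
        return false;
    }
}
```

With MovHint in that list, the MovHints that ByteCodeParser now inserts no longer count as uses that block hoisting.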
Landed in http://trac.webkit.org/changeset/161126