RESOLVED FIXED 144001
Make FunctionRareData allocation thread-safe
https://bugs.webkit.org/show_bug.cgi?id=144001
Basile Clement
Reported 2015-04-21 10:54:38 PDT
Patch forthcoming.
Attachments
Add fences in FunctionRareData::rareData() and FunctionRareData::rareData(ExecState, size_t inlineCapacity) (2.58 KB, patch)
2015-04-21 19:28 PDT, Basile Clement
no flags
JSFunction's rare data + ObjectAllocationProfile's Structure&Allocator fenced (3.86 KB, patch)
2015-04-22 10:39 PDT, Basile Clement
no flags
Patch w/ ChangeLog (6.42 KB, patch)
2015-04-22 13:16 PDT, Basile Clement
no flags
Also ObjectAllocationProfile::isNull to private scope (6.80 KB, patch)
2015-04-23 13:27 PDT, Basile Clement
fpizlo: review+
fpizlo: commit-queue+
Too many different working directories are bad for your health (6.36 KB, patch)
2015-04-23 13:41 PDT, Basile Clement
no flags
Basile Clement
Comment 1 2015-04-21 19:28:02 PDT
Created attachment 251293 Add fences in FunctionRareData::rareData() and FunctionRareData::rareData(ExecState, size_t inlineCapacity)
Basile Clement
Comment 2 2015-04-22 10:39:59 PDT
Created attachment 251336 JSFunction's rare data + ObjectAllocationProfile's Structure&Allocator fenced
Basile Clement
Comment 3 2015-04-22 11:19:41 PDT
I think the current fences are enough; here is my reasoning.

The two things we want to prevent are:
1. A thread seeing a pointer to a not-yet-fully-created rare data from a JSFunction
2. A thread seeing a pointer to a not-yet-fully-created Structure from an ObjectAllocationProfile

For 1., only the JS thread can be creating the rare data (in runtime/CommonSlowPaths.cpp or in dfg/DFGOperations.cpp), so we don't need to worry about concurrent writes, and we don't need any fences when *reading* the rare data from the JS thread. Thus we only need a storeStoreFence between the rare data creation and the assignment to m_rareData in JSFunction::createAndInitializeRareData() to ensure that when the store to m_rareData is issued, the rare data has been properly created. For the DFG compilation threads, the only place they can access the rare data is through JSFunction::rareData(), so we only need a loadLoadFence there to ensure that when we see a non-null pointer in m_rareData, the pointed-to object will be seen as a fully created FunctionRareData.

For 2., the structure is created in ObjectAllocationProfile::initialize() (which appears to be called only by the JS thread as well, in bytecode/CodeBlock.cpp and on rare data initialization, which always happens in the JS thread), and read through ObjectAllocationProfile::structure() and ObjectAllocationProfile::inlineCapacity(). Following the same reasoning, we put a storeStoreFence in ObjectAllocationProfile::initialize() and a loadLoadFence in ObjectAllocationProfile::structure() (and change ObjectAllocationProfile::inlineCapacity() to go through ObjectAllocationProfile::structure()).

We don't need a fence in ObjectAllocationProfile::clear() because clearing the structure is an atomic change. Also notice that we don't care about the ObjectAllocationProfile's allocator, as it is only used by ObjectAllocationProfile::initialize() and ObjectAllocationProfile::clear(), which always run in the JS thread.

ObjectAllocationProfile::isNull() could cause some trouble, but it looks like it is only used in the ObjectAllocationProfile::clear() check, which is also in the JS thread (trying to build w/ isNull() private to verify this).
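The single-writer publication pattern argued for above can be sketched in portable C++. This is only an illustration, not the patch: the `Function` and `RareData` types and the `runPublicationDemo()` helper are hypothetical, and `std::atomic_thread_fence` with release/acquire ordering is used as an assumed stand-in for WTF's storeStoreFence and loadLoadFence (the standard fences are at least as strong, but they are not the same primitives).

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for FunctionRareData; the fields are illustrative only.
struct RareData {
    int inlineCapacity = 0;
    bool initialized = false;
};

// Hypothetical stand-in for JSFunction.
class Function {
public:
    ~Function() { delete m_rareData.load(std::memory_order_relaxed); }

    // Writer side: assumed to run on a single thread (the "JS thread").
    RareData* createAndInitializeRareData(int inlineCapacity)
    {
        RareData* data = new RareData;
        data->inlineCapacity = inlineCapacity;
        data->initialized = true;
        // Stand-in for storeStoreFence: the initializing stores above must
        // not be reordered after the store that publishes the pointer.
        std::atomic_thread_fence(std::memory_order_release);
        m_rareData.store(data, std::memory_order_relaxed);
        return data;
    }

    // Reader side: may run on another thread (a "compilation thread").
    RareData* rareData() const
    {
        RareData* data = m_rareData.load(std::memory_order_relaxed);
        // Stand-in for loadLoadFence: loads through the returned pointer
        // must not be reordered before the load of the pointer itself.
        std::atomic_thread_fence(std::memory_order_acquire);
        return data;
    }

private:
    std::atomic<RareData*> m_rareData { nullptr };
};

// Returns true if the reader thread observed a fully initialized object.
bool runPublicationDemo()
{
    Function function;
    bool sawFullyInitialized = false;
    std::thread reader([&] {
        RareData* data = nullptr;
        while (!(data = function.rareData()))
            std::this_thread::yield();
        sawFullyInitialized = data->initialized && data->inlineCapacity == 8;
    });
    function.createAndInitializeRareData(8);
    reader.join();
    return sawFullyInitialized;
}
```

Calling `runPublicationDemo()` from a `main` should return true on every run. Without the two fences, a reader could in principle see the non-null pointer before the initializing stores, which is the failure mode items 1 and 2 describe.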
Basile Clement
Comment 4 2015-04-22 12:58:54 PDT
I was expecting this to be perf-neutral, but apparently it isn't? I have run the Kraken benchmark several times w/ consistent results, but these numbers still seem really weird to me.

Benchmark report for SunSpider, LongSpider, V8Spider, Octane, Kraken, JSRegress, AsmBench, and CompressionBench on Basiles-MacBook-Pro (MacBookPro11,3).

VMs tested:
"Conf#1" at /Volumes/Data/SVN/Baseline/OpenSource/WebKitBuild/Release/jsc (r183114)
"Conf#2" at /Volumes/Data/SVN/WIP/OpenSource/WebKitBuild/Release/jsc (r183114)

Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                      Conf#1                Conf#2
SunSpider:
   3d-cube  4.4353+-0.1513  ?  4.6088+-0.4794  ?  might be 1.0391x slower
   3d-morph  5.5881+-0.3794  5.3280+-0.2217  might be 1.0488x faster
   3d-raytrace  5.1925+-0.1924  ?  5.4993+-0.4735  ?  might be 1.0591x slower
   access-binary-trees  2.0345+-0.1422  ?  2.0935+-0.1290  ?  might be 1.0290x slower
   access-fannkuch  5.4790+-0.1607  ?  5.5982+-0.3416  ?  might be 1.0218x slower
   access-nbody  2.6063+-0.2451  2.5146+-0.0757  might be 1.0365x faster
   access-nsieve  3.3501+-0.1169  3.2323+-0.1266  might be 1.0364x faster
   bitops-3bit-bits-in-byte  1.4320+-0.0569  ?  1.5517+-0.2076  ?  might be 1.0836x slower
   bitops-bits-in-byte  3.4915+-0.3205  3.2898+-0.2889  might be 1.0613x faster
   bitops-bitwise-and  2.0900+-0.1273  ?  2.1259+-0.2040  ?  might be 1.0172x slower
   bitops-nsieve-bits  3.1307+-0.1492  ?  3.1919+-0.1255  ?  might be 1.0195x slower
   controlflow-recursive  2.1903+-0.1806  ^  1.9284+-0.0382  ^  definitely 1.1358x faster
   crypto-aes  3.8271+-0.6925  ?  3.8767+-0.4590  ?  might be 1.0129x slower
   crypto-md5  2.1478+-0.0529  ?  2.1944+-0.1973  ?  might be 1.0217x slower
   crypto-sha1  2.5313+-0.4351  2.5241+-0.2240
   date-format-tofte  7.2346+-0.5479  7.0985+-0.2518  might be 1.0192x faster
   date-format-xparb  4.8761+-0.1878  4.8739+-0.1462
   math-cordic  2.8952+-0.1415  2.7555+-0.0219  might be 1.0507x faster
   math-partial-sums  4.6932+-0.3237  ?  4.9536+-0.4962  ?  might be 1.0555x slower
   math-spectral-norm  1.8273+-0.0941  ?  1.8685+-0.2294  ?  might be 1.0226x slower
   regexp-dna  6.8654+-0.5735  6.6967+-0.2889  might be 1.0252x faster
   string-base64  4.4054+-0.3068  4.3634+-0.4383
   string-fasta  6.2585+-0.3565  6.0610+-0.4245  might be 1.0326x faster
   string-tagcloud  8.6238+-0.1777  ?  8.8418+-0.3017  ?  might be 1.0253x slower
   string-unpack-code  20.1375+-1.0708  19.1556+-0.4400  might be 1.0513x faster
   string-validate-input  4.5809+-0.1479  ?  4.7644+-0.6369  ?  might be 1.0401x slower
   <arithmetic>  4.6894+-0.0408  4.6535+-0.0318  might be 1.0077x faster

                      Conf#1                Conf#2
LongSpider:
   3d-cube  832.3554+-15.6521  831.0989+-14.3630
   3d-morph  1627.3334+-33.1397  1621.9746+-26.7117
   3d-raytrace  694.9163+-16.7973  ?  701.6835+-7.6839  ?
   access-binary-trees  883.6753+-7.8280  ?  892.4621+-6.1909  ?
   access-fannkuch  277.7843+-4.7717  ?  285.4883+-12.9451  ?  might be 1.0277x slower
   access-nbody  585.0649+-14.4911  ?  606.5778+-21.0410  ?  might be 1.0368x slower
   access-nsieve  688.0964+-15.0250  ?  688.3172+-9.0926  ?
   bitops-3bit-bits-in-byte  43.7256+-2.6037  43.2418+-0.9959  might be 1.0112x faster
   bitops-bits-in-byte  93.0723+-3.2182  ?  93.9191+-5.0233  ?
   bitops-nsieve-bits  659.8131+-14.5542  ?  667.2841+-10.6044  ?  might be 1.0113x slower
   controlflow-recursive  493.0334+-23.9063  485.2687+-17.4088  might be 1.0160x faster
   crypto-aes  582.7811+-16.5314  ?  602.8807+-20.1678  ?  might be 1.0345x slower
   crypto-md5  564.9390+-12.6393  562.1348+-8.9322
   crypto-sha1  615.6455+-22.0219  614.1862+-12.7002
   date-format-tofte  542.0933+-12.4481  ?  554.2762+-11.1277  ?  might be 1.0225x slower
   date-format-xparb  681.9327+-9.1361  671.1212+-12.7448  might be 1.0161x faster
   math-cordic  537.7730+-10.6876  ?  542.5652+-16.6267  ?
   math-partial-sums  462.7460+-8.4181  ?  465.7054+-10.7027  ?
   math-spectral-norm  592.6882+-12.5627  ?  608.7194+-17.4317  ?  might be 1.0270x slower
   string-base64  340.3506+-4.4849  ?  357.1724+-15.4188  ?  might be 1.0494x slower
   string-fasta  404.1642+-6.5204  398.4764+-5.3441  might be 1.0143x faster
   string-tagcloud  206.3975+-3.3353  ?  210.4167+-8.5157  ?  might be 1.0195x slower
   <geometric>  460.1916+-3.0129  ?  464.3788+-3.0547  ?  might be 1.0091x slower

                      Conf#1                Conf#2
V8Spider:
   crypto  54.0349+-2.3798  ?  54.3095+-2.6325  ?
   deltablue  81.9799+-2.8865  ?  84.8230+-7.8300  ?  might be 1.0347x slower
   earley-boyer  41.9612+-1.4242  ?  43.6582+-1.8849  ?  might be 1.0404x slower
   raytrace  33.8180+-3.4671  32.9440+-2.1068  might be 1.0265x faster
   regexp  81.3009+-4.1615  78.2463+-3.9613  might be 1.0390x faster
   richards  78.2319+-2.0493  76.0116+-2.9068  might be 1.0292x faster
   splay  39.9360+-2.8778  36.8094+-2.4762  might be 1.0849x faster
   <geometric>  55.3134+-0.9687  54.5553+-1.1820  might be 1.0139x faster

                      Conf#1                Conf#2
Octane:
   encrypt  0.21501+-0.00498  0.21261+-0.00310  might be 1.0113x faster
   decrypt  3.65038+-0.07951  3.63887+-0.04564
   deltablue x2  0.17133+-0.00322  ?  0.17166+-0.00195  ?
   earley  0.45896+-0.00373  ?  0.46667+-0.00666  ?  might be 1.0168x slower
   boyer  6.27579+-0.16581  6.21558+-0.06588
   navier-stokes x2  5.47150+-0.08585  5.41433+-0.04577  might be 1.0106x faster
   raytrace x2  1.08489+-0.06070  ?  1.12508+-0.04542  ?  might be 1.0370x slower
   richards x2  0.10282+-0.00119  0.09898+-0.00300  might be 1.0388x faster
   splay x2  0.37825+-0.01238  0.37564+-0.00729
   regexp x2  29.81069+-1.03843  ?  30.10510+-0.99517  ?
   pdfjs x2  40.44447+-0.42128  ?  40.63084+-0.46205  ?
   mandreel x2  48.65925+-0.57836  48.20755+-0.58695
   gbemu x2  36.29409+-0.80857  ?  37.29404+-2.90085  ?  might be 1.0276x slower
   closure  0.51882+-0.01557  ?  0.53285+-0.02216  ?  might be 1.0270x slower
   jquery  6.57081+-0.15217  6.52869+-0.14826
   box2d x2  11.03558+-0.16908  ?  11.05483+-0.17349  ?
   zlib x2  374.78502+-23.58295  ?  380.38261+-5.38599  ?  might be 1.0149x slower
   typescript x2  683.28288+-10.84315  ?  687.96252+-13.16831  ?
   <geometric>  6.13569+-0.02919  ?  6.15439+-0.06829  ?  might be 1.0030x slower

                      Conf#1                Conf#2
Kraken:
   ai-astar  434.666+-22.080  ^  303.440+-7.230  ^  definitely 1.4325x faster
   audio-beat-detection  100.509+-0.998  ?  103.678+-4.352  ?  might be 1.0315x slower
   audio-dft  174.509+-2.672  ?  174.850+-10.737  ?
   audio-fft  82.892+-1.443  81.713+-3.502  might be 1.0144x faster
   audio-oscillator  188.824+-5.888  ?  191.953+-10.438  ?  might be 1.0166x slower
   imaging-darkroom  99.557+-2.777  ?  101.775+-2.967  ?  might be 1.0223x slower
   imaging-desaturate  61.896+-3.449  ?  63.477+-3.888  ?  might be 1.0256x slower
   imaging-gaussian-blur  92.599+-4.179  ?  92.918+-3.284  ?
   json-parse-financial  39.892+-1.750  ?  40.628+-3.869  ?  might be 1.0185x slower
   json-stringify-tinderbox  54.225+-2.224  ?  55.813+-3.232  ?  might be 1.0293x slower
   stanford-crypto-aes  59.995+-1.289  ?  60.090+-4.229  ?
   stanford-crypto-ccm  51.436+-5.030  ?  51.614+-4.797  ?
   stanford-crypto-pbkdf2  162.319+-6.810  156.031+-2.616  might be 1.0403x faster
   stanford-crypto-sha256-iterative  52.126+-1.789  ?  53.658+-1.888  ?  might be 1.0294x slower
   <arithmetic>  118.246+-1.662  ^  109.403+-1.025  ^  definitely 1.0808x faster

                      Conf#1                Conf#2
JSRegress:
   abs-boolean  2.4167+-0.0478  ?  2.8230+-0.7388  ?  might be 1.1681x slower
   adapt-to-double-divide  16.8736+-1.0876  16.7283+-0.5923
   aliased-arguments-getbyval  1.1803+-0.1241  ?  1.4742+-0.5141  ?  might be 1.2489x slower
   allocate-big-object  2.7050+-0.5142  2.5432+-0.1782  might be 1.0636x faster
   arguments-named-and-reflective  12.1755+-0.7318  11.8265+-0.6290  might be 1.0295x faster
   arguments-out-of-bounds  11.0167+-0.8931  ?  11.1595+-0.5796  ?  might be 1.0130x slower
   arguments-strict-mode  10.5916+-0.6123  10.4898+-0.7511
   arguments  9.6276+-0.4310  9.3225+-0.5027  might be 1.0327x faster
   arity-mismatch-inlining  0.9342+-0.1754  0.8137+-0.0471  might be 1.1481x faster
   array-access-polymorphic-structure  6.9198+-1.0319  6.2931+-0.5897  might be 1.0996x faster
   array-nonarray-polymorhpic-access  29.2566+-1.3096  29.0380+-1.2310
   array-prototype-every  85.5088+-2.4437  85.1174+-2.0088
   array-prototype-forEach  82.9918+-1.8932  ?  84.7477+-5.5181  ?  might be 1.0212x slower
   array-prototype-map  93.4983+-4.7102  91.3628+-4.8674  might be 1.0234x faster
   array-prototype-some  84.4172+-4.3054  ?  86.4408+-3.8639  ?  might be 1.0240x slower
   array-splice-contiguous  43.9795+-2.3918  ?  44.0281+-1.9634  ?
   array-with-double-add  3.5132+-0.1293  3.4739+-0.1188  might be 1.0113x faster
   array-with-double-increment  3.1622+-0.1935  ?  3.2814+-0.2808  ?  might be 1.0377x slower
   array-with-double-mul-add  4.2835+-0.1722  ?  4.3679+-0.2506  ?  might be 1.0197x slower
   array-with-double-sum  3.4087+-0.2409  3.2313+-0.0377  might be 1.0549x faster
   array-with-int32-add-sub  5.9524+-0.2121  ?  6.0045+-0.3445  ?
   array-with-int32-or-double-sum  3.5087+-0.3663  3.4414+-0.3217  might be 1.0196x faster
   ArrayBuffer-DataView-alloc-large-long-lived  29.5104+-2.5135  29.2418+-2.7326
   ArrayBuffer-DataView-alloc-long-lived  13.3108+-1.3882  ?  13.3159+-0.7134  ?
   ArrayBuffer-Int32Array-byteOffset  3.8005+-0.2229  3.7770+-0.2651
   ArrayBuffer-Int8Array-alloc-large-long-lived  31.2364+-3.2885  28.3175+-1.0144  might be 1.1031x faster
   ArrayBuffer-Int8Array-alloc-long-lived-buffer  21.8736+-1.3013  ?  22.3775+-1.1947  ?  might be 1.0230x slower
   ArrayBuffer-Int8Array-alloc-long-lived  12.9710+-1.0949  12.0900+-0.2507  might be 1.0729x faster
   ArrayBuffer-Int8Array-alloc  10.7196+-0.8474  ?  11.4655+-3.1165  ?  might be 1.0696x slower
   asmjs_bool_bug  7.7023+-0.3996  ?  7.8442+-0.8108  ?  might be 1.0184x slower
   assign-custom-setter-polymorphic  2.5416+-0.1326  ?  2.6241+-0.3422  ?  might be 1.0325x slower
   assign-custom-setter  3.7146+-0.3140  3.6717+-0.1589  might be 1.0117x faster
   basic-set  7.7248+-0.3996  ?  8.1038+-0.8269  ?  might be 1.0491x slower
   big-int-mul  3.6631+-0.3389  3.6445+-0.3519
   boolean-test  3.1919+-0.7690  2.8603+-0.0607  might be 1.1159x faster
   branch-fold  3.6932+-0.1170  3.6468+-0.1287  might be 1.0127x faster
   by-val-generic  7.6178+-0.1708  7.4012+-0.1909  might be 1.0293x faster
   call-spread-apply  27.7551+-1.2217  ?  27.9482+-1.4726  ?
   call-spread-call  22.8534+-0.6498  ?  22.9357+-0.8219  ?
   captured-assignments  0.4224+-0.0301  0.3942+-0.0422  might be 1.0717x faster
   cast-int-to-double  4.9629+-0.1371  ?  5.0190+-0.3199  ?  might be 1.0113x slower
   cell-argument  6.5756+-0.3714  6.5167+-0.5087
   cfg-simplify  2.9827+-0.0461  ?  3.0035+-0.2543  ?
   chain-getter-access  9.5176+-0.2910  9.3426+-0.3532  might be 1.0187x faster
   cmpeq-obj-to-obj-other  12.2694+-0.8453  12.0483+-1.4419  might be 1.0183x faster
   constant-test  4.5533+-0.0464  !  4.6884+-0.0769  !  definitely 1.0297x slower
   create-lots-of-functions  10.0383+-1.9324  ?  10.0472+-0.5893  ?
   DataView-custom-properties  35.1123+-3.1334  ?  35.8569+-3.8184  ?  might be 1.0212x slower
   deconstructing-parameters-overridden-by-function  0.4677+-0.1327  ?  0.5406+-0.2202  ?  might be 1.1561x slower
   delay-tear-off-arguments-strictmode  12.7437+-0.5581  ?  12.8787+-0.3374  ?  might be 1.0106x slower
   deltablue-varargs  149.3834+-1.5594  ?  152.2106+-3.8107  ?  might be 1.0189x slower
   destructuring-arguments  14.9521+-1.6548  ?  15.4644+-1.3411  ?  might be 1.0343x slower
   destructuring-swap  4.7980+-0.1448  4.7359+-0.1229  might be 1.0131x faster
   direct-arguments-getbyval  1.1604+-0.0934  ?  1.1986+-0.1247  ?  might be 1.0330x slower
   div-boolean-double  5.4127+-0.1479  ?  5.4220+-0.1604  ?
   div-boolean  8.5641+-0.1970  8.5043+-0.1811
   double-get-by-val-out-of-bounds  4.2892+-0.2478  ?  4.4900+-0.2344  ?  might be 1.0468x slower
   double-pollution-getbyval  9.2493+-0.5698  8.9575+-0.2614  might be 1.0326x faster
   double-pollution-putbyoffset  3.9415+-0.1636  ?  5.3841+-3.4205  ?  might be 1.3660x slower
   double-to-int32-typed-array-no-inline  2.0848+-0.0961  2.0504+-0.0869  might be 1.0168x faster
   double-to-int32-typed-array  1.6521+-0.0420  ?  1.7169+-0.1121  ?  might be 1.0392x slower
   double-to-uint32-typed-array-no-inline  2.2035+-0.1788  2.2030+-0.1939
   double-to-uint32-typed-array  1.7822+-0.0766  1.7768+-0.0435
   elidable-new-object-dag  38.8936+-4.9598  37.5815+-3.4315  might be 1.0349x faster
   elidable-new-object-roflcopter  42.0613+-3.1489  40.9705+-3.9674  might be 1.0266x faster
   elidable-new-object-then-call  33.8572+-2.9101  33.3957+-1.5640  might be 1.0138x faster
   elidable-new-object-tree  40.0110+-3.3110  ?  40.3750+-3.3847  ?
   empty-string-plus-int  4.8594+-0.2043  ?  5.1221+-0.5582  ?  might be 1.0541x slower
   emscripten-cube2hash  29.4589+-0.9935  29.2051+-1.6557
   exit-length-on-plain-object  13.1971+-0.7219  ?  13.7371+-0.7052  ?  might be 1.0409x slower
   external-arguments-getbyval  1.2583+-0.1615  ?  1.3683+-0.2537  ?  might be 1.0875x slower
   external-arguments-putbyval  2.2881+-0.4736  2.2376+-0.1925  might be 1.0226x faster
   fixed-typed-array-storage-var-index  1.2150+-0.1091  1.2031+-0.1363
   fixed-typed-array-storage  0.9455+-0.3363  0.8676+-0.1480  might be 1.0898x faster
   Float32Array-matrix-mult  3.9729+-0.3510  3.8196+-0.0688  might be 1.0402x faster
   Float32Array-to-Float64Array-set  50.8818+-3.1112  ?  52.5493+-3.3516  ?  might be 1.0328x slower
   Float64Array-alloc-long-lived  68.5358+-7.2021  ?  69.2657+-5.3305  ?  might be 1.0106x slower
   Float64Array-to-Int16Array-set  61.7324+-2.1826  ?  63.4322+-2.2190  ?  might be 1.0275x slower
   fold-double-to-int  12.9622+-0.3145  ?  13.0689+-0.4892  ?
   fold-get-by-id-to-multi-get-by-offset-rare-int  10.4813+-0.7112  9.9407+-0.6032  might be 1.0544x faster
   fold-get-by-id-to-multi-get-by-offset  7.9404+-0.7041  7.8822+-0.4075
   fold-multi-get-by-offset-to-get-by-offset  8.5785+-1.0104  7.6311+-1.2708  might be 1.1241x faster
   fold-multi-get-by-offset-to-poly-get-by-offset  7.3379+-1.1628  6.9028+-1.1903  might be 1.0630x faster
   fold-multi-put-by-offset-to-poly-put-by-offset  7.1982+-0.7196  ?  7.7133+-1.5306  ?  might be 1.0716x slower
   fold-multi-put-by-offset-to-put-by-offset  4.3297+-0.7220  4.3253+-0.5187
   fold-multi-put-by-offset-to-replace-or-transition-put-by-offset  9.5320+-1.0301  8.9991+-0.8375  might be 1.0592x faster
   fold-put-by-id-to-multi-put-by-offset  8.7173+-0.6786  8.4754+-0.6840  might be 1.0285x faster
   fold-put-structure  4.1141+-0.6161  3.7417+-0.2409  might be 1.0995x faster
   for-of-iterate-array-entries  4.1770+-0.1260  ?  4.2598+-0.1427  ?  might be 1.0198x slower
   for-of-iterate-array-keys  3.4222+-0.4347  3.2863+-0.1032  might be 1.0413x faster
   for-of-iterate-array-values  3.8211+-0.6362  3.3150+-0.1867  might be 1.1527x faster
   fround  18.9028+-1.2775  18.4106+-1.0793  might be 1.0267x faster
   ftl-library-inlining-dataview  62.4405+-1.9515  ?  62.8280+-2.3589  ?
   ftl-library-inlining  119.3035+-4.4261  116.3825+-4.2533  might be 1.0251x faster
   function-dot-apply  2.0017+-0.0707  ?  2.0693+-0.0447  ?  might be 1.0337x slower
   function-test  2.9897+-0.1040  ?  3.0708+-0.2172  ?  might be 1.0271x slower
   function-with-eval  98.1050+-6.6907  97.3850+-2.4958
   gcse-poly-get-less-obvious  15.2463+-0.4843  ?  15.3335+-0.7207  ?
   gcse-poly-get  16.9200+-0.4172  16.9054+-0.8368
   gcse  3.8682+-0.2160  ?  3.9013+-0.1458  ?
   get-by-id-bimorphic-check-structure-elimination-simple  2.7301+-0.1596  ?  2.8475+-0.4200  ?  might be 1.0430x slower
   get-by-id-bimorphic-check-structure-elimination  6.0665+-0.4300  5.9204+-0.1839  might be 1.0247x faster
   get-by-id-chain-from-try-block  7.0569+-0.5146  6.8019+-0.2569  might be 1.0375x faster
   get-by-id-check-structure-elimination  4.7083+-0.6591  4.5090+-0.0962  might be 1.0442x faster
   get-by-id-proto-or-self  14.5393+-0.6130  ?  15.1705+-1.0026  ?  might be 1.0434x slower
   get-by-id-quadmorphic-check-structure-elimination-simple  2.9860+-0.1442  2.9550+-0.1453  might be 1.0105x faster
   get-by-id-self-or-proto  15.5522+-1.4049  15.4950+-0.7216
   get-by-val-out-of-bounds  4.0091+-0.1338  ?  4.2081+-0.2444  ?  might be 1.0496x slower
   get_callee_monomorphic  2.4390+-0.2349  ?  2.5741+-0.2855  ?  might be 1.0554x slower
   get_callee_polymorphic  3.5576+-0.1683  ?  3.6161+-0.3018  ?  might be 1.0164x slower
   getter-no-activation  4.9987+-0.3392  ?  5.1001+-0.4686  ?  might be 1.0203x slower
   getter-richards  115.4798+-6.8709  ?  121.8240+-7.8064  ?  might be 1.0549x slower
   getter  5.2617+-0.2036  ?  5.7078+-0.5992  ?  might be 1.0848x slower
   global-var-const-infer-fire-from-opt  0.8716+-0.1633  ?  0.9450+-0.1441  ?  might be 1.0842x slower
   global-var-const-infer  0.8320+-0.1120  ?  0.9356+-0.1235  ?  might be 1.1245x slower
   HashMap-put-get-iterate-keys  24.3960+-0.7000  24.0010+-0.5066  might be 1.0165x faster
   HashMap-put-get-iterate  24.6828+-1.0917  ?  24.7902+-0.5964  ?
   HashMap-string-put-get-iterate  25.5018+-2.9609  23.8171+-1.4325  might be 1.0707x faster
   hoist-make-rope  8.9097+-1.5407  ?  9.1958+-0.9647  ?  might be 1.0321x slower
   hoist-poly-check-structure-effectful-loop  4.4798+-0.6212  4.2600+-0.1759  might be 1.0516x faster
   hoist-poly-check-structure  3.3848+-0.2386  ?  3.3925+-0.1887  ?
   imul-double-only  6.9112+-0.4667  ?  7.7718+-1.2264  ?  might be 1.1245x slower
   imul-int-only  9.0255+-1.0025  8.6974+-0.3245  might be 1.0377x faster
   imul-mixed  7.0025+-0.4039  6.9930+-0.6957
   in-four-cases  16.1306+-0.6578  15.8482+-0.6254  might be 1.0178x faster
   in-one-case-false  8.7791+-0.6551  ?  8.8798+-1.0469  ?  might be 1.0115x slower
   in-one-case-true  8.9371+-0.5134  ?  9.2900+-1.0707  ?  might be 1.0395x slower
   in-two-cases  9.4810+-1.1853  9.1688+-0.8586  might be 1.0340x faster
   indexed-properties-in-objects  3.3002+-0.4603  3.0675+-0.5033  might be 1.0759x faster
   infer-closure-const-then-mov-no-inline  3.1357+-0.0943  3.1094+-0.1607
   infer-closure-const-then-mov  18.2635+-1.2450  17.5389+-0.6736  might be 1.0413x faster
   infer-closure-const-then-put-to-scope-no-inline  14.2855+-1.2141  13.7454+-1.3018  might be 1.0393x faster
   infer-closure-const-then-put-to-scope  25.1483+-1.5993  24.2054+-0.7731  might be 1.0390x faster
   infer-closure-const-then-reenter-no-inline  63.0570+-1.6493  ?  65.1399+-2.4728  ?  might be 1.0330x slower
   infer-closure-const-then-reenter  26.2822+-1.8759  24.5947+-0.7131  might be 1.0686x faster
   infer-constant-global-property  32.7149+-2.3522  31.6355+-0.7032  might be 1.0341x faster
   infer-constant-property  2.6896+-0.2072  2.5971+-0.0224  might be 1.0356x faster
   infer-one-time-closure-ten-vars  8.9458+-0.8092  ?  9.1481+-0.6326  ?  might be 1.0226x slower
   infer-one-time-closure-two-vars  8.3568+-0.4844  8.3114+-0.2080
   infer-one-time-closure  8.2540+-0.5320  8.0311+-0.2146  might be 1.0278x faster
   infer-one-time-deep-closure  13.4673+-0.5099  ?  13.6945+-0.6222  ?  might be 1.0169x slower
   inline-arguments-access  4.7100+-1.2382  4.1682+-0.2451  might be 1.1300x faster
   inline-arguments-aliased-access  3.8919+-0.1417  ?  4.0637+-0.3933  ?  might be 1.0441x slower
   inline-arguments-local-escape  4.2133+-0.2358  ?  4.2351+-0.6076  ?
   inline-get-scoped-var  4.7782+-0.2465  4.6715+-0.1311  might be 1.0228x faster
   inlined-put-by-id-transition  10.4067+-1.2255  ?  10.8631+-0.9509  ?  might be 1.0439x slower
   int-or-other-abs-then-get-by-val  4.9307+-0.4399  4.6764+-0.1379  might be 1.0544x faster
   int-or-other-abs-zero-then-get-by-val  17.0365+-1.0412  16.9250+-0.4475
   int-or-other-add-then-get-by-val  4.1591+-0.3232  3.9647+-0.0693  might be 1.0490x faster
   int-or-other-add  4.9397+-0.1705  ?  4.9594+-0.2071  ?
   int-or-other-div-then-get-by-val  3.9275+-0.0851  ?  4.4276+-0.5979  ?  might be 1.1273x slower
   int-or-other-max-then-get-by-val  4.1619+-0.3147  4.1224+-0.2884
   int-or-other-min-then-get-by-val  4.0415+-0.1020  ?  4.1794+-0.2763  ?  might be 1.0341x slower
   int-or-other-mod-then-get-by-val  3.7520+-0.1917  3.7306+-0.1455
   int-or-other-mul-then-get-by-val  3.7437+-0.1302  ?  3.8400+-0.1927  ?  might be 1.0257x slower
   int-or-other-neg-then-get-by-val  4.3582+-0.1546  ?  4.5470+-0.3540  ?  might be 1.0433x slower
   int-or-other-neg-zero-then-get-by-val  16.6822+-0.3658  ?  16.9685+-0.6315  ?  might be 1.0172x slower
   int-or-other-sub-then-get-by-val  4.1799+-0.2960  4.1213+-0.1811  might be 1.0142x faster
   int-or-other-sub  3.5047+-0.3617  ?  3.5776+-0.3126  ?  might be 1.0208x slower
   int-overflow-local  4.1756+-0.2734  4.0972+-0.1606  might be 1.0191x faster
   Int16Array-alloc-long-lived  50.0232+-3.5200  49.4485+-5.4940  might be 1.0116x faster
   Int16Array-bubble-sort-with-byteLength  18.7257+-1.9756  18.0610+-0.4210  might be 1.0368x faster
   Int16Array-bubble-sort  17.6900+-0.6539  17.5111+-0.4134  might be 1.0102x faster
   Int16Array-load-int-mul  1.4528+-0.0929  1.4059+-0.0468  might be 1.0333x faster
   Int16Array-to-Int32Array-set  47.7124+-1.7959  46.6479+-1.8089  might be 1.0228x faster
   Int32Array-alloc-large  14.2405+-2.0780  13.1156+-0.9148  might be 1.0858x faster
   Int32Array-alloc-long-lived  52.6025+-2.9181  ?  53.1608+-2.5637  ?  might be 1.0106x slower
   Int32Array-alloc  2.8603+-0.2361  ?  2.8843+-0.2397  ?
   Int32Array-Int8Array-view-alloc  6.8857+-1.1322  6.3975+-0.2348  might be 1.0763x faster
   int52-spill  6.0297+-0.2357  5.8714+-0.2345  might be 1.0270x faster
   Int8Array-alloc-long-lived  44.1742+-3.1438  ?  45.1311+-5.0734  ?  might be 1.0217x slower
   Int8Array-load-with-byteLength  3.5967+-0.3884  3.5354+-0.2613  might be 1.0173x faster
   Int8Array-load  3.3956+-0.1306  ?  3.5472+-0.2857  ?  might be 1.0446x slower
   integer-divide  11.1696+-0.6322  11.1178+-0.3585
   integer-modulo  1.7227+-0.1354  ?  1.7573+-0.2303  ?  might be 1.0201x slower
   large-int-captured  4.2350+-0.3715  ?  4.2855+-0.2902  ?  might be 1.0119x slower
   large-int-neg  15.0682+-0.3768  ?  15.1535+-0.5773  ?
   large-int  14.2997+-0.2745  ?  15.3568+-1.2258  ?  might be 1.0739x slower
   logical-not  4.4715+-0.3244  4.2217+-0.2594  might be 1.0592x faster
   lots-of-fields  10.6672+-0.7729  ?  10.9335+-0.5022  ?  might be 1.0250x slower
   make-indexed-storage  2.7259+-0.2566  ?  3.0432+-0.2270  ?  might be 1.1164x slower
   make-rope-cse  3.9144+-0.2405  3.7549+-0.1004  might be 1.0425x faster
   marsaglia-larger-ints  35.2551+-1.8262  ?  35.5552+-2.5509  ?
   marsaglia-osr-entry  21.7434+-0.7198  ?  22.5769+-1.3098  ?  might be 1.0383x slower
   max-boolean  2.7700+-0.0831  2.6952+-0.0599  might be 1.0278x faster
   method-on-number  17.7487+-1.3370  17.2074+-0.3127  might be 1.0315x faster
   min-boolean  2.7546+-0.1671  2.6766+-0.1225  might be 1.0291x faster
   minus-boolean-double  3.1838+-0.2049  3.1503+-0.1162  might be 1.0106x faster
   minus-boolean  2.4468+-0.3131  2.3403+-0.0537  might be 1.0455x faster
   misc-strict-eq  32.0657+-3.0548  30.9404+-0.6344  might be 1.0364x faster
   mod-boolean-double  11.6159+-0.6137  11.5835+-0.3905
   mod-boolean  8.5605+-0.0809  ?  8.7320+-0.4243  ?  might be 1.0200x slower
   mul-boolean-double  3.8188+-0.2741  3.7526+-0.2638  might be 1.0176x faster
   mul-boolean  2.9638+-0.1526  2.7923+-0.0512  might be 1.0614x faster
   neg-boolean  3.0915+-0.0774  ?  3.1283+-0.1542  ?  might be 1.0119x slower
   negative-zero-divide  0.3129+-0.0254  0.3111+-0.0286
   negative-zero-modulo  0.3678+-0.1594  0.2966+-0.0231  might be 1.2401x faster
   negative-zero-negate  0.3048+-0.0708  0.2813+-0.0251  might be 1.0835x faster
   nested-function-parsing  35.4219+-2.8303  34.0065+-0.9362  might be 1.0416x faster
   new-array-buffer-dead  98.6464+-4.5235  96.4295+-5.6975  might be 1.0230x faster
   new-array-buffer-push  6.0331+-0.2927  ?  6.6875+-1.3320  ?  might be 1.1085x slower
   new-array-dead  13.8396+-0.7789  ?  15.3411+-1.4017  ?  might be 1.1085x slower
   new-array-push  3.5367+-0.1052  3.4415+-0.3015  might be 1.0277x faster
   no-inline-constructor  111.0593+-2.3227  106.3672+-4.3100  might be 1.0441x faster
   number-test  2.9524+-0.1849  2.9148+-0.1279  might be 1.0129x faster
   object-closure-call  5.1482+-0.4793  5.0535+-0.2463  might be 1.0187x faster
   object-test  2.8591+-0.0648  ?  2.9816+-0.3384  ?  might be 1.0428x slower
   obvious-sink-pathology-taken  111.9813+-4.2488  109.4659+-6.0788  might be 1.0230x faster
   obvious-sink-pathology  108.2312+-7.0478  104.2853+-6.4327  might be 1.0378x faster
   obviously-elidable-new-object  30.7822+-2.3513  ?  31.8637+-1.5855  ?  might be 1.0351x slower
   plus-boolean-arith  2.4429+-0.1379  2.4294+-0.1210
   plus-boolean-double  3.2247+-0.1970  3.1247+-0.0968  might be 1.0320x faster
   plus-boolean  2.5488+-0.0699  ?  2.6059+-0.1741  ?  might be 1.0224x slower
   poly-chain-access-different-prototypes-simple  2.8197+-0.2496  ?  2.8395+-0.1731  ?
   poly-chain-access-different-prototypes  2.4945+-0.0609  ?  2.6626+-0.1697  ?  might be 1.0674x slower
   poly-chain-access-simpler  2.7429+-0.1150  2.6925+-0.0250  might be 1.0187x faster
   poly-chain-access  2.6879+-0.3927  2.6584+-0.3185  might be 1.0111x faster
   poly-stricteq  54.8625+-2.3990  ?  55.5979+-3.7765  ?  might be 1.0134x slower
   polymorphic-array-call  1.2641+-0.4155  1.2065+-0.1826  might be 1.0478x faster
   polymorphic-get-by-id  2.8175+-0.0427  ?  2.8947+-0.1137  ?  might be 1.0274x slower
   polymorphic-put-by-id  26.9848+-1.1718  26.1658+-0.6853  might be 1.0313x faster
   polymorphic-structure  13.6088+-0.5545  ?  13.8648+-0.2574  ?  might be 1.0188x slower
   polyvariant-monomorphic-get-by-id  7.1073+-1.3167  7.0912+-0.7548
   proto-getter-access  9.4326+-0.3320  ?  9.4626+-0.4573  ?
   put-by-id-replace-and-transition  8.1292+-0.3976  ?  8.9200+-1.5409  ?  might be 1.0973x slower
   put-by-id-slightly-polymorphic  2.7738+-0.4763  2.7209+-0.2497  might be 1.0194x faster
   put-by-id  10.2069+-0.3066  ?  10.8580+-1.2601  ?  might be 1.0638x slower
   put-by-val-direct  0.3853+-0.0213  0.3809+-0.0430  might be 1.0115x faster
   put-by-val-large-index-blank-indexing-type  5.7926+-0.5807  5.3792+-0.6324  might be 1.0769x faster
   put-by-val-machine-int  2.7236+-0.2734  2.5609+-0.1057  might be 1.0635x faster
   rare-osr-exit-on-local  15.1539+-0.4465  ?  15.3346+-0.3351  ?  might be 1.0119x slower
   register-pressure-from-osr  17.8601+-0.2759  ?  18.4645+-1.4084  ?  might be 1.0338x slower
   setter  5.7569+-0.3490  5.7215+-0.1335
   simple-activation-demo  25.7653+-1.1622  ?  27.1953+-2.1483  ?  might be 1.0555x slower
   simple-getter-access  12.6530+-0.7928  12.3588+-0.2583  might be 1.0238x faster
   simple-poly-call-nested  9.6672+-0.8043  9.4338+-0.3212  might be 1.0247x faster
   simple-poly-call  1.2802+-0.2142  1.2644+-0.2409  might be 1.0125x faster
   sin-boolean  19.2299+-1.9395  ?  19.5778+-2.1724  ?  might be 1.0181x slower
   singleton-scope  65.5336+-2.1845  ?  68.9616+-4.7326  ?  might be 1.0523x slower
   sinkable-new-object-dag  61.6789+-4.1304  ?  62.1704+-3.5265  ?
   sinkable-new-object-taken  48.3709+-2.9668  48.2065+-2.4714
   sinkable-new-object  34.1097+-2.3564  33.1188+-2.6903  might be 1.0299x faster
   slow-array-profile-convergence  2.6855+-0.2409  2.6090+-0.1579  might be 1.0294x faster
   slow-convergence  2.5417+-0.2627  2.5209+-0.1934
   sorting-benchmark  23.2951+-1.1998  22.1348+-0.7030  might be 1.0524x faster
   sparse-conditional  1.1222+-0.0467  ?  1.3260+-0.5395  ?  might be 1.1816x slower
   splice-to-remove  14.9962+-0.8462  14.2574+-0.3521  might be 1.0518x faster
   string-char-code-at  15.3922+-1.1045  ?  15.8472+-0.6751  ?  might be 1.0296x slower
   string-concat-object  2.2163+-0.1407  ?  2.5015+-0.1543  ?  might be 1.1287x slower
   string-concat-pair-object  2.2235+-0.1529  ?  2.2640+-0.1439  ?  might be 1.0182x slower
   string-concat-pair-simple  10.0750+-0.8607  9.9739+-0.8822  might be 1.0101x faster
   string-concat-simple  10.4784+-1.0256  10.1696+-1.1498  might be 1.0304x faster
   string-cons-repeat  7.3133+-0.8916  7.1814+-0.6686  might be 1.0184x faster
   string-cons-tower  7.6784+-1.1937  7.4449+-1.1918  might be 1.0314x faster
   string-equality  15.2191+-0.2651  ?  15.7612+-0.5652  ?  might be 1.0356x slower
   string-get-by-val-big-char  7.2392+-1.4160  6.7185+-0.1084  might be 1.0775x faster
   string-get-by-val-out-of-bounds-insane  3.3553+-0.2733  ?  3.4134+-0.2755  ?  might be 1.0173x slower
   string-get-by-val-out-of-bounds  4.1635+-0.1638  ?  4.2227+-0.3365  ?  might be 1.0142x slower
   string-get-by-val  2.8412+-0.1197  ?  2.9137+-0.2432  ?  might be 1.0255x slower
   string-hash  1.9895+-0.3024  1.8647+-0.0790  might be 1.0669x faster
   string-long-ident-equality  13.1139+-0.7039  12.9958+-0.4658
   string-out-of-bounds  10.7131+-0.3183  ?  11.3811+-0.9807  ?  might be 1.0624x slower
   string-repeat-arith  27.6097+-0.5811  27.2865+-1.0052  might be 1.0118x faster
   string-sub  59.5830+-4.5310  55.0962+-2.2548  might be 1.0814x faster
   string-test  2.6607+-0.0699  ?  2.6846+-0.0984  ?
   string-var-equality  26.9023+-1.2809  26.8898+-1.6893
   structure-hoist-over-transitions  2.4460+-0.1960  ?  2.4542+-0.1483  ?
   substring-concat-weird  37.0975+-1.8696  ?  37.9767+-2.6965  ?  might be 1.0237x slower
   substring-concat  38.9295+-1.0435  ?  40.2652+-3.5428  ?  might be 1.0343x slower
   substring  46.7250+-1.4061  46.1477+-2.4334  might be 1.0125x faster
   switch-char-constant  2.6931+-0.1853  2.6267+-0.0442  might be 1.0252x faster
   switch-char  6.5503+-0.8534  6.1363+-0.7513  might be 1.0675x faster
   switch-constant  8.4766+-0.6868  8.3603+-1.4272  might be 1.0139x faster
   switch-string-basic-big-var  13.5207+-0.4127  ?  13.9292+-0.9656  ?  might be 1.0302x slower
   switch-string-basic-big  13.4317+-1.2721  ?  13.6340+-0.5171  ?  might be 1.0151x slower
   switch-string-basic-var  13.3404+-0.7639  ?  13.9514+-1.0679  ?  might be 1.0458x slower
   switch-string-basic  12.0618+-0.2756  ?  12.3399+-0.2263  ?  might be 1.0231x slower
   switch-string-big-length-tower-var  19.5382+-0.8883  19.3960+-0.6414
   switch-string-length-tower-var  13.5964+-0.3844  ?  14.3957+-1.6032  ?  might be 1.0588x slower
   switch-string-length-tower  12.3685+-0.3465  12.3418+-1.0001
   switch-string-short  12.0588+-0.2471  11.8566+-0.2656  might be 1.0171x faster
   switch  13.0385+-1.6912  12.7917+-1.3384  might be 1.0193x faster
   tear-off-arguments-simple  3.1045+-0.1385  ?  3.1536+-0.3092  ?  might be 1.0158x slower
   tear-off-arguments  4.5518+-0.4994  4.4827+-0.5349  might be 1.0154x faster
   temporal-structure  12.5994+-0.3181  ?  12.9323+-0.9252  ?  might be 1.0264x slower
   to-int32-boolean  13.2832+-0.5729  ?  13.4223+-0.4928  ?  might be 1.0105x slower
   try-catch-get-by-val-cloned-arguments  13.7481+-0.6119  ?  14.4667+-0.6819  ?  might be 1.0523x slower
   try-catch-get-by-val-direct-arguments  6.3370+-0.9073  6.0754+-0.2570  might be 1.0431x faster
   try-catch-get-by-val-scoped-arguments  7.4062+-0.3904  7.3315+-0.7334  might be 1.0102x faster
   undefined-property-access  235.4623+-6.2249  ?  236.4904+-8.8459  ?
   undefined-test  2.8496+-0.1264  ?  3.0185+-0.2179  ?  might be 1.0593x slower
   unprofiled-licm  15.3687+-0.9217  15.0091+-0.7997  might be 1.0240x faster
   varargs-call  14.4001+-0.1857  ?  14.6958+-0.2039  ?  might be 1.0205x slower
   varargs-construct-inline  19.7086+-1.6777  19.4824+-1.6487  might be 1.0116x faster
   varargs-construct  32.7043+-2.3200  31.8022+-1.8111  might be 1.0284x faster
   varargs-inline  8.7878+-0.2908  ?  9.1469+-1.0000  ?  might be 1.0409x slower
   varargs-strict-mode  9.8503+-0.4560  9.6945+-0.3129  might be 1.0161x faster
   varargs  9.5750+-0.1729  ?  9.7299+-0.5440  ?  might be 1.0162x slower
   weird-inlining-const-prop  2.0314+-0.2221  ?  2.3809+-0.4849  ?  might be 1.1720x slower
   <geometric>  8.0533+-0.0405  8.0521+-0.0274  might be 1.0002x faster

                      Conf#1                Conf#2
AsmBench:
   bigfib.cpp  471.7764+-3.5017  ?  487.9275+-12.7784  ?  might be 1.0342x slower
   cray.c  424.0862+-8.3635  ?  440.9827+-34.3008  ?  might be 1.0398x slower
   dry.c  462.8022+-18.7109  ?  464.1502+-37.2670  ?
   FloatMM.c  754.9574+-23.9107  750.7539+-15.4918
   gcc-loops.cpp  3793.5697+-36.0436  ?  4015.4735+-488.3612  ?  might be 1.0585x slower
   n-body.c  887.3397+-34.1997  ?  889.0302+-8.3727  ?
   Quicksort.c  446.3313+-11.9112  436.6346+-7.8387  might be 1.0222x faster
   stepanov_container.cpp  10277.3720+-15907.1606  4092.1757+-91.8203  might be 2.5115x faster
   Towers.c  256.8579+-4.7076  ?  259.0417+-7.1674  ?
   <geometric>  820.3285+-92.5167  792.0148+-9.7260  might be 1.0357x faster

                      Conf#1                Conf#2
CompressionBench:
   huffman  321.3388+-5.9491  ?  323.2857+-12.3924  ?
   arithmetic-simple  371.3127+-7.7131  ?  382.4933+-30.4509  ?  might be 1.0301x slower
   arithmetic-precise  288.2277+-1.7735  ?  293.5511+-7.8305  ?  might be 1.0185x slower
   arithmetic-complex-precise  284.7605+-6.0316  ?  290.5964+-7.4715  ?  might be 1.0205x slower
   arithmetic-precise-order-0  394.0160+-8.3638  387.5076+-6.2389  might be 1.0168x faster
   arithmetic-precise-order-1  316.5880+-6.3783  ?  320.7105+-8.6065  ?  might be 1.0130x slower
   arithmetic-precise-order-2  367.7725+-9.4551  367.7529+-10.7545
   arithmetic-simple-order-1  363.1527+-8.8705  359.3105+-8.9938  might be 1.0107x faster
   arithmetic-simple-order-2  423.5934+-10.9974  411.1817+-14.4727  might be 1.0302x faster
   lz-string  358.4060+-8.6524  ?  358.7869+-12.2153  ?
   <geometric>  346.1999+-2.7604  ?  347.1796+-4.7292  ?  might be 1.0028x slower

                      Conf#1                Conf#2
Geomean of preferred means:
   <scaled-result>  61.2170+-0.6778  ^  60.3380+-0.1998  ^  definitely 1.0146x faster
Basile Clement
Comment 5 2015-04-22 13:15:20 PDT
Reran AsmBench due to huge hiccup.

Benchmark report for AsmBench on Basiles-MacBook-Pro (MacBookPro11,3).

VMs tested:
"Conf#1" at /Volumes/Data/SVN/Baseline/OpenSource/WebKitBuild/Release/jsc (r183123)
"Conf#2" at /Volumes/Data/SVN/WIP/OpenSource/WebKitBuild/Release/jsc (r183123)

Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                         Conf#1                 Conf#2
bigfib.cpp               472.8986+-3.8969    ?  478.9491+-6.4586    ? might be 1.0128x slower
cray.c                   424.3024+-9.7129       422.5944+-14.4131
dry.c                    456.4883+-23.2965   ?  465.4514+-15.5756   ? might be 1.0196x slower
FloatMM.c                752.8911+-12.8045   ?  757.6528+-17.3431   ?
gcc-loops.cpp            3787.7625+-63.5631  ?  3802.5506+-37.2358  ?
n-body.c                 895.4360+-15.5858   ?  905.3407+-30.6225   ? might be 1.0111x slower
Quicksort.c              437.3180+-9.9509    ?  444.9053+-8.9211    ? might be 1.0173x slower
stepanov_container.cpp   4103.7755+-78.1070     4065.5324+-67.4782
Towers.c                 259.8154+-8.6392       251.0401+-4.2857      might be 1.0350x faster

<geometric>              781.6373+-12.5174   ?  783.5936+-6.2165    ? might be 1.0025x slower
Basile Clement
Comment 6 2015-04-22 13:16:14 PDT
Created attachment 251361 [details] Patch w/ ChangeLog
Basile Clement
Comment 7 2015-04-23 13:27:11 PDT
Created attachment 251478 [details] Also ObjectAllocationProfile::isNull to private scope

The previous benchmarking issues were apparently present for unrelated reasons. I am getting weird results:

Basiles-MacBook-Pro:OpenSource elarnon$ JSC_useFTLJIT1=true /Volumes/Data/SVN/Baseline/Internal/Tools/Scripts/run-jsc-benchmarks /Volumes/Data/SVN/Baseline/OpenSource/WebKitBuild/Release/jsc --outer 6 --kraken --benchmark astar
32/32
Generating benchmark report at /Volumes/Data/SVN/Baseline/OpenSource/Kraken_Basiles-MacBook-Pro_20150423_1135_report.txt
And raw data at /Volumes/Data/SVN/Baseline/OpenSource/Kraken_Basiles-MacBook-Pro_20150423_1135.json

Benchmark report for Kraken on Basiles-MacBook-Pro (MacBookPro11,3).

VMs tested:
"Conf#1" at /Volumes/Data/SVN/Baseline/OpenSource/WebKitBuild/Release/jsc (r183152)

Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                 Conf#1
ai-astar         424.889+-7.874

<arithmetic>     424.889+-7.874

(note the useFTLJIT1=true, so the option has no effect, just triggers the parsing code)

VS

Basiles-MacBook-Pro:OpenSource elarnon$ /Volumes/Data/SVN/Baseline/Internal/Tools/Scripts/run-jsc-benchmarks /Volumes/Data/SVN/Baseline/OpenSource/WebKitBuild/Release/jsc --outer 6 --kraken --benchmark astar
32/32
Generating benchmark report at /Volumes/Data/SVN/Baseline/OpenSource/Kraken_Basiles-MacBook-Pro_20150423_1132_report.txt
And raw data at /Volumes/Data/SVN/Baseline/OpenSource/Kraken_Basiles-MacBook-Pro_20150423_1132.json

Benchmark report for Kraken on Basiles-MacBook-Pro (MacBookPro11,3).

VMs tested:
"Conf#1" at /Volumes/Data/SVN/Baseline/OpenSource/WebKitBuild/Release/jsc (r183152)

Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                 Conf#1
ai-astar         591.127+-5.260

<arithmetic>     591.127+-5.260

Note that this is on the *Baseline*, i.e. w/o patches applied (actually the timings are similar w/ the patch applied but reversed, i.e. around 600 w/ options and around 420 w/ no options). I discussed this with mlam, who thinks this might be memory alignment issues, which are in any case not related to this patch.

I have also run tests for Kraken/ai-astar (which was the only test w/ significant difference) against r183069:

Benchmark report for Kraken on Basiles-MacBook-Pro (MacBookPro11,3).

VMs tested:
"Conf#1" at /Volumes/Data/BenchTest/ICloudFix/OpenSource/WebKitBuild/Release/jsc
"Conf#2" at /Volumes/Data/BenchTest/RemoveNode/OpenSource/WebKitBuild/Release/jsc
"Conf#3" at /Volumes/Data/BenchTest/NoDeallocate/OpenSource/WebKitBuild/Release/jsc

Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                 Conf#1               Conf#2               Conf#3               Conf#3 v. Conf#1
ai-astar         399.571+-4.196    ?  400.163+-3.402       399.722+-4.901    ?

<arithmetic>     399.571+-4.196    ?  400.163+-3.402       399.722+-4.901    ? might be 1.0004x slower

(Conf#1 is r183069, Conf#2 is r183069 + https://bugs.webkit.org/show_bug.cgi?id=143999, and Conf#3 is Conf#2 + https://bugs.webkit.org/show_bug.cgi?id=144000 + this patch).

So I think this can safely be considered perf-neutral.
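The perf-neutral reading above can be sanity-checked by hand from the reported numbers. The following is a minimal sketch, assuming each cell reads as `mean+-halfWidth` of a 95% confidence interval in milliseconds. `Measurement`, `intervalsOverlap`, and `slowdownRatio` are hypothetical names introduced here, and the overlap test is only a rough heuristic, not necessarily the rule run-jsc-benchmarks uses to choose between "might be" and "definitely".

```cpp
#include <cassert>

// Hypothetical representation of one "mean+-halfWidth" cell from a report.
struct Measurement {
    double mean;      // mean execution time, in milliseconds
    double halfWidth; // assumed half-width of the 95% confidence interval
};

// True when the two confidence intervals share at least one point.
bool intervalsOverlap(const Measurement& a, const Measurement& b)
{
    assert(a.halfWidth >= 0 && b.halfWidth >= 0);
    return a.mean - a.halfWidth <= b.mean + b.halfWidth
        && b.mean - b.halfWidth <= a.mean + a.halfWidth;
}

// Ratio of the means; a value above 1 means b took longer than a.
double slowdownRatio(const Measurement& a, const Measurement& b)
{
    assert(a.mean > 0);
    return b.mean / a.mean;
}
```

With the figures quoted in this comment, `{399.571, 4.196}` (Conf#1) and `{399.722, 4.901}` (Conf#3) overlap and differ by a ratio of roughly 1.0004, whereas the two baseline runs `{424.889, 7.874}` and `{591.127, 5.260}` do not overlap at all.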
Basile Clement
Comment 8 2015-04-23 13:41:59 PDT
Created attachment 251483 [details] Too many different working directories are bad for your health
Geoffrey Garen
Comment 9 2015-04-23 14:16:04 PDT
Comment on attachment 251483 [details] Too many different working directories are bad for your health

View in context: https://bugs.webkit.org/attachment.cgi?id=251483&action=review

> Source/JavaScriptCore/runtime/JSFunction.h:127
> + FunctionRareData* rareData = m_rareData.get();
>
> - return m_rareData.get()->allocationStructure();
> + // The JS thread may be concurrently creating the rare data
> + // If we see it, we want to ensure it has been properly created
> + WTF::loadLoadFence();

Sometimes, we load this pointer in JIT code. Does our JIT code need to include a load-load fence?
Basile Clement
Comment 10 2015-04-23 14:28:38 PDT
Comment on attachment 251483 [details] Too many different working directories are bad for your health

View in context: https://bugs.webkit.org/attachment.cgi?id=251483&action=review

>> Source/JavaScriptCore/runtime/JSFunction.h:127
>> + WTF::loadLoadFence();
>
> Sometimes, we load this pointer in JIT code. Does our JIT code need to include a load-load fence?

The load-load fence is only needed if another thread may be writing to that pointer concurrently. Since the only thread that writes to the pointer is the JS thread, which is also the only thread running the JIT code (as far as I understand), no fence should be needed there.
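For readers following the fence discussion, the publish/observe pattern under review can be sketched in portable C++. This is a minimal sketch under stated assumptions, not the patch itself: `RareData`, `g_rareData`, `createAndPublish`, and `readCapacityOrMinusOne` are hypothetical names, and `std::atomic_thread_fence` with release/acquire ordering stands in for `WTF::storeStoreFence()` and `WTF::loadLoadFence()` (the standard fences are at least as strong, so this approximates the idea rather than reproducing the same instructions).

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for FunctionRareData; the field is illustrative only.
struct RareData {
    int inlineCapacity = 0;
};

// Hypothetical stand-in for JSFunction::m_rareData.
std::atomic<RareData*> g_rareData{nullptr};

// Writer side (the JS thread in the discussion):
// fully initialize the object, fence, then publish the pointer.
void createAndPublish(int capacity)
{
    assert(capacity >= 0);
    RareData* data = new RareData; // intentionally never freed in this sketch
    data->inlineCapacity = capacity;
    // Plays the role of the store-store fence: the initialization above
    // must become visible before the pointer store below.
    std::atomic_thread_fence(std::memory_order_release);
    g_rareData.store(data, std::memory_order_relaxed);
}

// Reader side (a concurrent compilation thread in the discussion):
// load the pointer, fence, then read through it.
int readCapacityOrMinusOne()
{
    RareData* data = g_rareData.load(std::memory_order_relaxed);
    // Plays the role of the load-load fence: the field read below
    // must not be satisfied before the pointer load above.
    std::atomic_thread_fence(std::memory_order_acquire);
    if (!data)
        return -1; // not published yet
    return data->inlineCapacity;
}
```

A reader running on the same thread as the writer, which is how the JIT code case is described above, observes its own stores in program order and so would not need the reader-side fence.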
WebKit Commit Bot
Comment 11 2015-04-23 14:57:33 PDT
Comment on attachment 251483 [details] Too many different working directories are bad for your health

Clearing flags on attachment: 251483

Committed r183212: <http://trac.webkit.org/changeset/183212>
WebKit Commit Bot
Comment 12 2015-04-23 14:57:37 PDT
All reviewed patches have been landed. Closing bug.