Bug 112780

Summary: DFG implementation of op_strcat should inline rope allocations
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCore Assignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED    
Severity: Normal CC: barraclough, ggaren, mark.lam, mhahnenberg, msaboff, oliver, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
the patch oliver: review+

Filip Pizlo
Reported 2013-03-20 00:45:54 PDT
Patch forthcoming.
Attachments
the patch (36.71 KB, patch)
2013-03-20 01:06 PDT, Filip Pizlo
oliver: review+
Filip Pizlo
Comment 1 2013-03-20 00:51:40 PDT
OMG.

Benchmark report for SunSpider, V8Spider, Octane, Kraken, and JSRegress on oldmac (MacPro4,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc (r146250)
"StrCat" at /Volumes/Data/fromMiniMe/secondary/OpenSource/WebKitBuild/Release/jsc (r146250)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                           TipOfTree               StrCat

SunSpider:
   3d-cube                                 9.1020+-0.1405          9.0950+-0.1850
   3d-morph                                8.7251+-0.1225      ?   8.8282+-0.1520      ? might be 1.0118x slower
   3d-raytrace                             10.4436+-0.1461         10.4374+-0.1566
   access-binary-trees                     1.9471+-0.0509          1.9353+-0.0111
   access-fannkuch                         7.7616+-0.1153          7.7001+-0.1067
   access-nbody                            4.6520+-0.0112      ?   4.6531+-0.0095      ?
   access-nsieve                           4.9701+-0.0401          4.9238+-0.0813
   bitops-3bit-bits-in-byte                1.8327+-0.0085          1.8307+-0.0101
   bitops-bits-in-byte                     7.0533+-0.1048      ?   7.0751+-0.1076      ?
   bitops-bitwise-and                      2.5455+-0.0763      ?   2.6443+-0.0709      ? might be 1.0388x slower
   bitops-nsieve-bits                      4.1817+-0.0232          4.1740+-0.0162
   controlflow-recursive                   3.1199+-0.0449          3.1174+-0.0490
   crypto-aes                              7.6646+-0.1196      ?   7.7703+-0.1513      ? might be 1.0138x slower
   crypto-md5                              3.9127+-0.0297      !   4.0287+-0.0361      ! definitely 1.0296x slower
   crypto-sha1                             3.2083+-0.0224          3.1921+-0.0152
   date-format-tofte                       14.8310+-0.1966     ?   14.8933+-0.1365     ?
   date-format-xparb                       10.5381+-0.2785     ^   9.3676+-0.1994      ^ definitely 1.1250x faster
   math-cordic                             4.0293+-0.0119      ?   4.0297+-0.0107      ?
   math-partial-sums                       12.4347+-0.1101     ?   12.4886+-0.1218     ?
   math-spectral-norm                      3.1625+-0.0169          3.1514+-0.0139
   regexp-dna                              11.4363+-0.2102         11.3769+-0.1685
   string-base64                           4.8215+-0.0351          4.7259+-0.0886        might be 1.0202x faster
   string-fasta                            10.8107+-0.1140     ?   10.8940+-0.1457     ?
   string-tagcloud                         14.3466+-0.3045         14.0300+-0.2068       might be 1.0226x faster
   string-unpack-code                      26.7938+-0.0828     !   27.2721+-0.1211     ! definitely 1.0179x slower
   string-validate-input                   7.5661+-0.1156          7.4003+-0.0780        might be 1.0224x faster

   <arithmetic> *                          7.7650+-0.0574          7.7321+-0.0674        might be 1.0043x faster
   <geometric>                             6.2588+-0.0384          6.2363+-0.0459        might be 1.0036x faster
   <harmonic>                              5.0462+-0.0299          5.0432+-0.0279        might be 1.0006x faster

                                           TipOfTree               StrCat

V8Spider:
   crypto                                  87.5234+-0.2011     ?   88.3779+-1.6128     ?
   deltablue                               125.2600+-0.4073    ?   125.2918+-0.8755    ?
   earley-boyer                            82.7613+-0.3771         82.7290+-0.2061
   raytrace                                61.5320+-0.1550     !   61.9204+-0.1081     ! definitely 1.0063x slower
   regexp                                  101.8963+-0.5501        101.4172+-0.1028
   richards                                119.0732+-0.3724    ?   119.1839+-0.5371    ?
   splay                                   56.6075+-0.4767     ^   48.7913+-0.2259     ^ definitely 1.1602x faster

   <arithmetic>                            90.6648+-0.1710     ^   89.6731+-0.1945     ^ definitely 1.0111x faster
   <geometric> *                           87.2010+-0.1745     ^   85.5134+-0.1948     ^ definitely 1.0197x faster
   <harmonic>                              83.7071+-0.1899     ^   81.1221+-0.1878     ^ definitely 1.0319x faster

                                           TipOfTree               StrCat

Octane and V8v7:
   encrypt                                 0.46706+-0.00054    ?   0.46774+-0.00056    ?
   decrypt                                 8.65021+-0.02235    ?   8.67169+-0.06606    ?
   deltablue x2                            0.56827+-0.00067        0.56742+-0.00166
   earley                                  0.88077+-0.00188    !   0.90650+-0.00392    ! definitely 1.0292x slower
   boyer                                   12.79319+-0.03411   ?   12.81921+-0.04082   ?
   raytrace x2                             4.48527+-0.05895        4.43804+-0.03000      might be 1.0106x faster
   regexp x2                               32.37329+-0.15671       32.25172+-0.14972
   richards x2                             0.30726+-0.00048        0.30637+-0.00106
   splay x2                                0.70344+-0.00925    ^   0.64376+-0.02271    ^ definitely 1.0927x faster
   navier-stokes x2                        10.81820+-0.01309       10.79608+-0.01563
   closure                                 0.30878+-0.03588    ?   0.30972+-0.03426    ?
   jquery                                  4.39308+-0.55456    ?   4.41432+-0.55612    ?
   gbemu x2                                252.03606+-16.94385     251.16266+-16.24598
   box2d x2                                31.63886+-0.18567       31.46117+-0.08973

V8v7:
   <arithmetic>                            7.58142+-0.02224        7.55450+-0.02116      might be 1.0036x faster
   <geometric> *                           2.45074+-0.00605    ^   2.42239+-0.00884    ^ definitely 1.0117x faster
   <harmonic>                              0.93920+-0.00210    ^   0.92490+-0.00497    ^ definitely 1.0155x faster

Octane including V8v7:
   <arithmetic>                            31.51611+-1.55121       31.40198+-1.49411     might be 1.0036x faster
   <geometric> *                           4.39657+-0.06955        4.35776+-0.06437      might be 1.0089x faster
   <harmonic>                              1.06504+-0.02038        1.05233+-0.01717      might be 1.0121x faster

                                           TipOfTree               StrCat

Kraken:
   ai-astar                                494.695+-0.353      ?   495.012+-0.590      ?
   audio-beat-detection                    246.476+-2.484          246.077+-2.210
   audio-dft                               312.825+-0.970          312.310+-1.964
   audio-fft                               144.105+-0.240          143.736+-0.145
   audio-oscillator                        234.673+-1.084          234.629+-1.034
   imaging-darkroom                        291.928+-1.033          290.231+-0.802
   imaging-desaturate                      160.432+-0.333          160.428+-0.308
   imaging-gaussian-blur                   398.305+-0.991          397.517+-0.641
   json-parse-financial                    79.658+-0.365           79.631+-0.155
   json-stringify-tinderbox                100.691+-0.306      ?   101.113+-0.543      ?
   stanford-crypto-aes                     97.129+-0.381           96.490+-0.544
   stanford-crypto-ccm                     105.133+-4.045          104.265+-4.112
   stanford-crypto-pbkdf2                  273.904+-7.458          268.631+-0.808        might be 1.0196x faster
   stanford-crypto-sha256-iterative        117.754+-2.031      ^   115.072+-0.131      ^ definitely 1.0233x faster

   <arithmetic> *                          218.408+-0.856          217.510+-0.467        might be 1.0041x faster
   <geometric>                             186.885+-0.895          186.008+-0.689        might be 1.0047x faster
   <harmonic>                              160.695+-0.915          159.914+-0.861        might be 1.0049x faster

                                           TipOfTree               StrCat

JSRegress:
   adapt-to-double-divide                  22.4650+-0.0973     ?   22.6001+-0.1221     ?
   aliased-arguments-getbyval              0.9029+-0.0087      ?   0.9087+-0.0103      ?
   allocate-big-object                     2.5204+-0.0474      ?   2.6371+-0.1047      ? might be 1.0463x slower
   arity-mismatch-inlining                 0.7669+-0.0108      ?   0.7744+-0.0219      ?
   array-access-polymorphic-structure      7.1098+-0.0810          7.0706+-0.0928
   array-with-double-add                   5.7874+-0.0919      ?   5.7987+-0.0900      ?
   array-with-double-increment             4.2175+-0.0912          4.1231+-0.0101        might be 1.0229x faster
   array-with-double-mul-add               6.9980+-0.0931      ?   7.0709+-0.1059      ? might be 1.0104x slower
   array-with-double-sum                   7.9035+-0.0950          7.8891+-0.1105
   array-with-int32-add-sub                10.5194+-0.1474         10.4017+-0.1086       might be 1.0113x faster
   array-with-int32-or-double-sum          7.9755+-0.1145      ?   7.9937+-0.0935      ?
   big-int-mul                             4.9911+-0.0155      ?   5.0021+-0.0173      ?
   boolean-test                            4.3989+-0.0565      ?   4.4199+-0.0680      ?
   cast-int-to-double                      13.9444+-0.1181         13.9162+-0.1042
   cell-argument                           14.4922+-0.1202         14.3659+-0.0764
   cfg-simplify                            3.9303+-0.0930      ?   4.0021+-0.0150      ? might be 1.0183x slower
   cmpeq-obj-to-obj-other                  11.1266+-0.2263     !   11.5914+-0.1692     ! definitely 1.0418x slower
   constant-test                           8.5508+-0.1374          8.4728+-0.1135
   direct-arguments-getbyval               0.8325+-0.0093          0.8320+-0.0098
   double-pollution-getbyval               10.6867+-0.1196     ?   10.7506+-0.1226     ?
   double-pollution-putbyoffset            5.0264+-0.0251          5.0231+-0.0249
   external-arguments-getbyval             2.2082+-0.0389          2.1934+-0.0410
   external-arguments-putbyval             3.3122+-0.0171      ?   3.3415+-0.0699      ?
   Float32Array-matrix-mult                13.8183+-0.0865     !   14.4195+-0.1432     ! definitely 1.0435x slower
   fold-double-to-int                      22.0176+-0.2537         21.8695+-0.1613
   function-dot-apply                      3.1733+-0.0080      ?   3.1754+-0.0096      ?
   function-test                           4.9984+-0.0560          4.9837+-0.1162
   get-by-id-chain-from-try-block          7.4270+-0.0960      ?   7.5099+-0.0977      ? might be 1.0112x slower
   HashMap-put-get-iterate-keys            88.5555+-0.5321     ?   89.8767+-0.9993     ? might be 1.0149x slower
   HashMap-put-get-iterate                 91.6294+-0.8363         90.5546+-0.8644       might be 1.0119x faster
   HashMap-string-put-get-iterate          73.6103+-0.4110         73.3356+-0.3383
   indexed-properties-in-objects           4.5354+-0.0180      ?   4.5420+-0.0374      ?
   inline-arguments-access                 1.2481+-0.0065          1.2465+-0.0097
   inline-arguments-local-escape           23.2782+-0.1161     ^   22.9261+-0.1571     ^ definitely 1.0154x faster
   inline-get-scoped-var                   6.6146+-0.0857      ?   6.6248+-0.0877      ?
   inlined-put-by-id-transition            16.7545+-0.3331         16.6549+-0.1323
   int-or-other-abs-then-get-by-val        8.7932+-0.0988      ?   8.9186+-0.1068      ? might be 1.0143x slower
   int-or-other-abs-zero-then-get-by-val   37.0613+-0.1350     ?   37.3534+-0.3862     ?
   int-or-other-add-then-get-by-val        10.2542+-0.1285     ?   10.2806+-0.1248     ?
   int-or-other-add                        10.4798+-0.0966     ?   10.5149+-0.1229     ?
   int-or-other-div-then-get-by-val        7.9403+-0.0946      ?   7.9640+-0.0814      ?
   int-or-other-max-then-get-by-val        10.0088+-0.2449         9.9383+-0.2083
   int-or-other-min-then-get-by-val        8.1825+-0.1114      ?   8.1835+-0.1020      ?
   int-or-other-mod-then-get-by-val        8.0365+-0.1108          8.0016+-0.1055
   int-or-other-mul-then-get-by-val        7.2223+-0.0971      ?   7.2227+-0.1034      ?
   int-or-other-neg-then-get-by-val        8.1501+-0.0902      ?   8.1594+-0.1232      ?
   int-or-other-neg-zero-then-get-by-val   36.4814+-0.1262         36.4060+-0.1169
   int-or-other-sub-then-get-by-val        10.2348+-0.1096     ?   10.2845+-0.1287     ?
   int-or-other-sub                        8.2182+-0.1126          8.1580+-0.1015
   int-overflow-local                      12.8913+-0.1131         12.8646+-0.1150
   Int16Array-bubble-sort                  49.7224+-0.4382         49.4922+-0.2366
   Int16Array-load-int-mul                 1.8789+-0.0074      ?   1.8830+-0.0066      ?
   Int8Array-load                          4.8646+-0.0419      ?   4.8714+-0.0239      ?
   integer-divide                          15.1984+-0.1290         15.1204+-0.1107
   integer-modulo                          2.0599+-0.0125      ?   2.0610+-0.0150      ?
   make-indexed-storage                    3.9300+-0.0424          3.9054+-0.0447
   method-on-number                        23.8250+-0.5492         23.4125+-0.4873       might be 1.0176x faster
   nested-function-parsing-random          376.1677+-13.1717   ?   377.4356+-13.0783   ?
   nested-function-parsing                 51.1583+-1.1442     ^   47.8857+-1.1675     ^ definitely 1.0683x faster
   new-array-buffer-dead                   3.6232+-0.0125      ?   3.6266+-0.0173      ?
   new-array-buffer-push                   10.4755+-0.1594         10.3989+-0.1856
   new-array-dead                          28.2948+-0.1251         28.2491+-0.0811
   new-array-push                          7.1196+-0.1813          6.9540+-0.0700        might be 1.0238x faster
   number-test                             4.3065+-0.0908      ?   4.3227+-0.0573      ?
   object-closure-call                     8.3433+-0.0916          8.3339+-0.1054
   object-test                             4.9050+-0.0548      ?   4.9232+-0.1054      ?
   poly-stricteq                           91.5765+-1.1989         90.8259+-0.2760
   polymorphic-structure                   20.1160+-0.1612         20.0110+-0.1295
   polyvariant-monomorphic-get-by-id       12.5509+-0.1449         12.5053+-0.1203
   rare-osr-exit-on-local                  20.6214+-0.1457         20.5618+-0.1147
   register-pressure-from-osr              31.5523+-0.1350     ?   31.5747+-0.1140     ?
   simple-activation-demo                  34.4323+-0.1305     ?   34.4448+-0.1393     ?
   slow-array-profile-convergence          4.3552+-0.0278          4.3467+-0.0201
   slow-convergence                        3.7944+-0.0081      ?   3.8006+-0.0109      ?
   sparse-conditional                      1.3154+-0.0115          1.3125+-0.0139
   splice-to-remove                        50.4247+-0.1741         50.3684+-0.1691
   string-concat-object                    5.5094+-0.0579      ^   2.7209+-0.0145      ^ definitely 2.0248x faster
   string-concat-pair-object               2.7271+-0.0297      ^   2.6626+-0.0188      ^ definitely 1.0242x faster
   string-concat-pair-simple               17.9229+-0.2238     ^   17.2733+-0.1407     ^ definitely 1.0376x faster
   string-concat-simple                    44.9397+-0.2664     ^   16.9350+-0.1740     ^ definitely 2.6537x faster
   string-cons-repeat                      10.1017+-0.0206     ?   10.1288+-0.0274     ?
   string-cons-tower                       10.9276+-0.0291     ?   10.9645+-0.0564     ?
   string-hash                             2.6490+-0.0112      ?   2.6492+-0.0094      ?
   string-repeat-arith                     45.7644+-0.1524     ^   45.3655+-0.1909     ^ definitely 1.0088x faster
   string-sub                              89.3981+-0.8347     ^   87.0744+-1.3783     ^ definitely 1.0267x faster
   string-test                             4.2867+-0.0246      ?   4.3133+-0.0555      ?
   structure-hoist-over-transitions        3.3272+-0.0755          3.2727+-0.0276        might be 1.0166x faster
   tear-off-arguments-simple               1.7767+-0.0109          1.7759+-0.0108
   tear-off-arguments                      3.3767+-0.0100      ?   3.3871+-0.0095      ?
   temporal-structure                      20.9459+-0.1200         20.8629+-0.1132
   to-int32-boolean                        27.1413+-0.1167         27.1114+-0.0943
   undefined-test                          4.5538+-0.0417      ?   4.5629+-0.0409      ?

   <arithmetic>                            20.2635+-0.1591     ^   19.8569+-0.1546     ^ definitely 1.0205x faster
   <geometric> *                           9.3117+-0.0247      ^   9.1322+-0.0242      ^ definitely 1.0197x faster
   <harmonic>                              5.1612+-0.0135      ^   5.1031+-0.0212      ^ definitely 1.0114x faster

                                           TipOfTree               StrCat

All benchmarks:
   <arithmetic>                            40.0736+-0.3417         39.6992+-0.3252       might be 1.0094x faster
   <geometric>                             11.2751+-0.0537     ^   11.1165+-0.0531     ^ definitely 1.0143x faster
   <harmonic>                              3.6747+-0.0349          3.6368+-0.0304        might be 1.0104x faster

                                           TipOfTree               StrCat

Geomean of preferred means:
   <scaled-result>                         22.7195+-0.1281     ^   22.4654+-0.1228     ^ definitely 1.0113x faster
Filip Pizlo
Comment 2 2013-03-20 01:06:55 PDT
Created attachment 193996 [details] the patch
Oliver Hunt
Comment 3 2013-03-20 01:35:18 PDT
Comment on attachment 193996 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=193996&action=review

> Source/JavaScriptCore/dfg/DFGOperations.cpp:1576
> +    JSGlobalData& globalData = exec->globalData();

#if CPU(X86)
RELEASE_ASSERT_NOT_REACHED();
#endif
?
Filip Pizlo
Comment 4 2013-03-20 01:55:19 PDT
(In reply to comment #3)
> (From update of attachment 193996 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=193996&action=review
> 
> > Source/JavaScriptCore/dfg/DFGOperations.cpp:1576
> > +    JSGlobalData& globalData = exec->globalData();
> 
> #if CPU(X86)
> RELEASE_ASSERT_NOT_REACHED();
> #endif
> ?

I could do that, but would it help? There's nothing wrong with calling that function on X86. The fact that it won't be called is a detail that is orthogonal to the DFGOperations interface.
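To make the trade-off in this exchange concrete, here is a small standalone sketch of the two options. It is not the code from the patch. `SKETCH_CPU_X86`, `sketchReleaseAssertNotReached`, `concatGuarded`, and `concatUnguarded` are hypothetical stand-ins, and plain `std::string` concatenation stands in for whatever the real operation does.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical stand-in for WebKit's CPU(X86) check: true on 32-bit x86 builds.
#if defined(__i386__) || defined(_M_IX86)
#define SKETCH_CPU_X86 1
#else
#define SKETCH_CPU_X86 0
#endif

// Hypothetical stand-in for RELEASE_ASSERT_NOT_REACHED(): crash in every build mode.
[[noreturn]] inline void sketchReleaseAssertNotReached() { std::abort(); }

// Option from comment #3: document "never called on X86" by crashing if it is.
std::string concatGuarded(const std::string& a, const std::string& b)
{
#if SKETCH_CPU_X86
    sketchReleaseAssertNotReached();
#endif
    return a + b;
}

// Option from comment #4: the helper is valid on every CPU, so it carries no
// guard. Whether the compiler ever emits a call to it is the caller's concern.
std::string concatUnguarded(const std::string& a, const std::string& b)
{
    return a + b;
}
```

On any build other than 32-bit x86 the two functions behave identically. The guard only matters if a future caller starts using the helper on X86, at which point the guarded version would crash even though the helper itself is correct there.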
Filip Pizlo
Comment 5 2013-03-20 13:32:14 PDT