| Summary: | Make slowPathAllocsBetweenGCs a runtime option | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Mark Lam <mark.lam> | ||||||
| Component: | JavaScriptCore | Assignee: | Mark Lam <mark.lam> | ||||||
| Status: | RESOLVED FIXED | ||||||||
| Severity: | Normal | CC: | fpizlo, ggaren, mhahnenberg, mmirman, msaboff, oliver | ||||||
| Priority: | P2 | ||||||||
| Version: | 528+ (Nightly build) | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Attachments: |
|
||||||||
|
Description
Mark Lam
2014-04-24 11:26:22 PDT
Created attachment 230097 [details]
the patch
Comment on attachment 230097 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=230097&action=review

I'd like to see performance numbers for this change.

> Source/JavaScriptCore/heap/MarkedAllocator.cpp:159
> +    static unsigned allocationCount = 0;
> +    if (!allocationCount) {
> +        if (!m_heap->isDeferred())
> +            m_heap->collectAllGarbage();
> +        ASSERT(m_heap->m_operationInProgress == NoOperation);
> +    }
> +    if (++allocationCount >= Options::collectOnEveryAllocation())
> +        allocationCount = 0;

This is sort of an odd way to write this. Why not trigger a GC when you exceed the limit rather than when you hit 0?

(In reply to comment #2)
> (From update of attachment 230097 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=230097&action=review
>
> I'd like to see performance numbers for this change.
>
> > Source/JavaScriptCore/heap/MarkedAllocator.cpp:159
> > +    static unsigned allocationCount = 0;
> > +    if (!allocationCount) {
> > +        if (!m_heap->isDeferred())
> > +            m_heap->collectAllGarbage();
> > +        ASSERT(m_heap->m_operationInProgress == NoOperation);
> > +    }
> > +    if (++allocationCount >= Options::collectOnEveryAllocation())
> > +        allocationCount = 0;
>
> This is sort of an odd way to write this. Why not trigger a GC when you exceed the limit rather than when you hit 0?

Just a heuristic based on my experience of testing with collections on every 100 allocations. I found that collecting on the first allocation rather than the last makes it a lot more likely that I’ll see issues (based on our regression tests as the workload). Some short-running tests may not get to the 100th slow path allocation before the test ends.

Also, collectOnEveryAllocation is a less than ideal name for this because that's not what it does. Perhaps a better name would be numberOfAllocSlowPathsBeforeCollect or something like that? Less verbose would be good, but an accurate name is more important.
It also might be helpful to add something like this to CopiedSpace as well.

Comment on attachment 230097 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=230097&action=review

r=me with good perf numbers

>>> Source/JavaScriptCore/heap/MarkedAllocator.cpp:159
>>> +        allocationCount = 0;
>>
>> This is sort of an odd way to write this. Why not trigger a GC when you exceed the limit rather than when you hit 0?
>
> Just a heuristic based on my experience of testing with collections on every 100 allocations. I found that collecting on the first allocation rather than the last makes it a lot more likely that I’ll see issues (based on our regression tests as the workload). For some short running tests, they may not get to the 100th slow path allocation before the test ends.

Fair enough. You should factor this out into an ALWAYS_INLINE method so that we keep the main method relatively clean. I know you're just replacing old crufty code, but let's improve things while we're here :-)

Oh, one more thing I forgot! It would be cool to add a mode to run-jsc-stress-tests with this enabled on some reasonable setting so we get more GC coverage during our normal test harness runs.

So many demands. =) Ok, to summarize:
1. collect on allocations in CopiedSpace.
2. run-jsc-stress-tests case to stress some GC action.

Would other folks agree with adding this case to the stress tests given that the nature of this test is slow?

(In reply to comment #8)
> So many demands. =) Ok, to summarize:
> 1. collect on allocations in CopiedSpace.
> 2. run-jsc-stress-tests case to stress some GC action.

These can be followup bugs if you want to pad your stats :-) In fact, it's probably better if they are since they're orthogonal to this bug.

> Would other folks agree with adding this case to the stress tests given that the nature of this test is slow?

You could avoid making it part of the default run and instead just make it part of a "very stressful" mode.
I know we've talked about something like this for a while.

Benchmark results say we are neutral. Will upload the updated patch and hopefully land shortly.
Conf#1 Conf#2
SunSpider:
3d-cube 7.7817+-0.3281 ? 8.2096+-1.3606 ? might be 1.0550x slower
3d-morph 8.4812+-1.0369 ? 9.6898+-4.6944 ? might be 1.1425x slower
3d-raytrace 11.5082+-2.8843 9.6127+-1.2812 might be 1.1972x faster
access-binary-trees 2.6550+-0.6632 ? 2.7865+-0.7214 ? might be 1.0495x slower
access-fannkuch 7.8917+-0.8625 7.8015+-0.6520 might be 1.0116x faster
access-nbody 4.4300+-1.0925 ? 4.8876+-1.0792 ? might be 1.1033x slower
access-nsieve 4.9244+-1.3837 4.4075+-0.0423 might be 1.1173x faster
bitops-3bit-bits-in-byte 2.3371+-0.5959 2.0787+-0.0116 might be 1.1243x faster
bitops-bits-in-byte 3.6168+-0.1941 3.5460+-0.0218 might be 1.0200x faster
bitops-bitwise-and 3.1810+-0.6017 2.9653+-0.0650 might be 1.0727x faster
bitops-nsieve-bits 5.0885+-0.0589 ? 5.1084+-0.0839 ?
controlflow-recursive 2.5657+-0.0380 ? 3.3945+-2.5128 ? might be 1.3230x slower
crypto-aes 6.9799+-2.7119 6.5001+-2.6254 might be 1.0738x faster
crypto-md5 3.6060+-0.6929 ? 4.0301+-1.0396 ? might be 1.1176x slower
crypto-sha1 3.7060+-1.4035 3.6785+-0.7393
date-format-tofte 11.4237+-0.4750 ? 12.4314+-1.3862 ? might be 1.0882x slower
date-format-xparb 9.1884+-0.8705 9.1500+-0.9057
math-cordic 4.6897+-1.9326 4.3546+-0.9816 might be 1.0770x faster
math-partial-sums 7.5208+-1.7282 ? 8.1153+-1.3507 ? might be 1.0790x slower
math-spectral-norm 2.7354+-0.2828 ? 3.2899+-1.0157 ? might be 1.2027x slower
regexp-dna 10.6552+-0.8363 10.6185+-0.7766
string-base64 6.4170+-1.7555 5.9673+-0.2326 might be 1.0754x faster
string-fasta 9.4849+-0.2355 ? 9.9951+-0.4049 ? might be 1.0538x slower
string-tagcloud 14.0712+-1.3022 ? 14.9808+-2.3676 ? might be 1.0646x slower
string-unpack-code 30.7645+-3.9564 ? 31.5425+-4.0303 ? might be 1.0253x slower
string-validate-input 6.7678+-0.5161 ? 6.8790+-1.1584 ? might be 1.0164x slower
<arithmetic> * 7.4028+-0.3936 ? 7.5393+-0.2117 ? might be 1.0184x slower
<geometric> 6.0253+-0.2874 ? 6.1280+-0.3136 ? might be 1.0170x slower
<harmonic> 5.0649+-0.2447 ? 5.1580+-0.3539 ? might be 1.0184x slower
Conf#1 Conf#2
LongSpider:
3d-cube 1857.9495+-15.0811 ? 1866.6677+-32.3299 ?
3d-morph 1238.3037+-13.8758 ? 1275.7288+-128.9616 ? might be 1.0302x slower
3d-raytrace 1222.5578+-30.8440 ? 1293.4274+-280.3974 ? might be 1.0580x slower
access-binary-trees 1329.3123+-28.3387 ? 1389.5810+-135.1811 ? might be 1.0453x slower
access-fannkuch 543.8162+-2.4885 ? 543.8214+-1.9855 ?
access-nbody 1168.7630+-5.0817 ? 1173.0695+-9.3087 ?
access-nsieve 1237.9380+-33.6346 ? 1261.2711+-7.8720 ? might be 1.0188x slower
bitops-3bit-bits-in-byte 134.8546+-2.1557 ? 135.4371+-1.8951 ?
bitops-bits-in-byte 215.7848+-11.8690 ? 222.2466+-36.8085 ? might be 1.0299x slower
bitops-nsieve-bits 1133.7550+-20.8954 1128.2193+-32.8758
controlflow-recursive 604.7928+-3.6893 ? 605.1859+-7.8722 ?
crypto-aes 1424.5351+-15.7268 ? 1434.2581+-6.1408 ?
crypto-md5 1178.8233+-15.0178 ? 1182.0520+-20.7265 ?
crypto-sha1 1416.6429+-12.7029 1414.4310+-3.3047
date-format-tofte 1143.2686+-256.8228 1073.5927+-29.6544 might be 1.0649x faster
date-format-xparb 1396.4704+-48.5130 ? 1432.3767+-97.5210 ? might be 1.0257x slower
math-cordic 1423.0532+-28.5076 1408.4755+-6.7751 might be 1.0103x faster
math-partial-sums 848.6508+-4.8391 847.5743+-11.9581
math-spectral-norm 1376.6771+-330.2995 1293.3749+-53.8112 might be 1.0644x faster
string-base64 542.2126+-117.0719 ? 545.4697+-146.3643 ?
string-fasta 926.9550+-22.1414 914.6357+-16.3680 might be 1.0135x faster
string-tagcloud 338.2670+-7.5392 ? 342.1763+-13.2518 ? might be 1.0116x slower
<arithmetic> 1031.9720+-19.0559 ? 1035.5942+-6.3347 ? might be 1.0035x slower
<geometric> * 881.0723+-16.3946 ? 884.4750+-7.7875 ? might be 1.0039x slower
<harmonic> 658.9682+-11.7429 ? 662.8566+-15.8704 ? might be 1.0059x slower
Conf#1 Conf#2
V8Spider:
crypto 69.9048+-4.1645 ? 70.7333+-3.4173 ? might be 1.0119x slower
deltablue 101.1406+-39.7781 91.2475+-5.0083 might be 1.1084x faster
earley-boyer 65.9218+-3.0490 ? 66.0215+-4.6824 ?
raytrace 57.0201+-23.3816 ? 57.5493+-12.6118 ?
regexp 97.2420+-25.5873 88.6986+-1.1290 might be 1.0963x faster
richards 98.4443+-2.7928 ? 115.5652+-46.9034 ? might be 1.1739x slower
splay 48.7642+-1.9023 48.3143+-3.0282
<arithmetic> 76.9197+-3.3334 76.8757+-6.2474 might be 1.0006x faster
<geometric> * 73.6663+-1.9546 ? 73.7040+-4.2066 ? might be 1.0005x slower
<harmonic> 70.5878+-2.6731 ? 70.8193+-3.7366 ? might be 1.0033x slower
Conf#1 Conf#2
Octane and V8v7:
encrypt 0.40158+-0.10244 0.36916+-0.00114 might be 1.0878x faster
decrypt 6.71789+-0.03932 6.71385+-0.06757
deltablue x2 0.46008+-0.00552 ? 0.46483+-0.00628 ? might be 1.0103x slower
earley 0.70934+-0.00493 ? 0.71613+-0.00964 ?
boyer 9.31738+-0.13290 ? 9.38592+-0.10754 ?
navier-stokes x2 9.51646+-0.03267 ? 9.54048+-0.08468 ?
raytrace x2 3.70691+-0.27539 3.52496+-0.07128 might be 1.0516x faster
regexp x2 27.47273+-0.22018 27.28569+-0.30541
richards x2 0.25877+-0.00362 ? 0.26166+-0.00347 ? might be 1.0112x slower
splay x2 0.65997+-0.02005 0.65940+-0.00578
pdfjs x2 83.53057+-0.99396 ? 84.59044+-1.17326 ? might be 1.0127x slower
mandreel x2 128.86821+-7.11868 128.43458+-3.74882
gbemu x2 74.22887+-11.32429 ? 74.48753+-11.22750 ?
closure 0.78517+-0.00624 ? 0.78672+-0.01304 ?
jquery 11.67210+-2.83831 10.84885+-0.31685 might be 1.0759x faster
box2d x2 24.65797+-0.46701 24.51883+-0.07576
zlib x2 843.10042+-33.68579 836.15651+-73.20082
typescript x2 1144.93622+-50.43383 ? 1165.61444+-45.09676 ? might be 1.0181x slower
V8v7:
<arithmetic> 6.33100+-0.05173 6.29119+-0.02317 might be 1.0063x faster
<geometric> * 2.05649+-0.03423 2.04033+-0.00605 might be 1.0079x faster
<harmonic> 0.79500+-0.01966 0.79244+-0.00471 might be 1.0032x faster
Octane including V8v7:
<arithmetic> 157.07993+-2.82761 ? 157.99664+-8.35162 ? might be 1.0058x slower
<geometric> * 12.13662+-0.15402 12.07393+-0.23821 might be 1.0052x faster
<harmonic> 1.38641+-0.03171 1.38210+-0.00749 might be 1.0031x faster
Conf#1 Conf#2
Kraken:
ai-astar 341.860+-2.798 340.678+-3.426
audio-beat-detection 184.256+-1.739 ? 185.440+-3.195 ?
audio-dft 308.765+-13.410 305.883+-15.414
audio-fft 122.492+-49.100 106.389+-0.717 might be 1.1514x faster
audio-oscillator 218.122+-5.223 ? 218.203+-4.089 ?
imaging-darkroom 256.919+-4.533 255.501+-1.907
imaging-desaturate 139.108+-0.574 138.844+-3.743
imaging-gaussian-blur 326.481+-176.703 268.797+-3.180 might be 1.2146x faster
json-parse-financial 65.803+-3.706 ? 70.209+-3.340 ? might be 1.0670x slower
json-stringify-tinderbox 89.832+-10.574 ? 101.337+-30.933 ? might be 1.1281x slower
stanford-crypto-aes 82.302+-33.081 ? 83.899+-40.824 ? might be 1.0194x slower
stanford-crypto-ccm 87.922+-2.415 86.757+-4.534 might be 1.0134x faster
stanford-crypto-pbkdf2 240.082+-94.425 209.218+-2.996 might be 1.1475x faster
stanford-crypto-sha256-iterative 77.190+-1.074 ? 87.154+-25.380 ? might be 1.1291x slower
<arithmetic> * 181.510+-10.614 175.594+-2.702 might be 1.0337x faster
<geometric> 154.345+-5.817 152.932+-5.366 might be 1.0092x faster
<harmonic> 131.274+-5.395 ? 132.886+-7.138 ? might be 1.0123x slower
Conf#1 Conf#2
JSRegress:
adapt-to-double-divide 26.5853+-1.3726 25.9360+-1.3263 might be 1.0250x faster
aliased-arguments-getbyval 1.3508+-0.4089 1.3353+-0.1187 might be 1.0116x faster
allocate-big-object 3.3343+-1.0991 3.0991+-0.4356 might be 1.0759x faster
arity-mismatch-inlining 1.1399+-0.1252 ? 1.1502+-0.2081 ?
array-access-polymorphic-structure 10.1639+-4.0119 8.8299+-0.2458 might be 1.1511x faster
array-nonarray-polymorhpic-access 50.8994+-2.6047 ? 51.4412+-2.1470 ? might be 1.0106x slower
array-prototype-every 111.0372+-2.1073 110.7119+-2.1238
array-prototype-forEach 109.1503+-1.6150 ? 113.6860+-4.7336 ? might be 1.0416x slower
array-prototype-map 135.9296+-6.8767 135.4051+-4.4987
array-prototype-some 110.5018+-3.1442 110.4952+-1.7417
array-with-double-add 5.1561+-0.0429 ? 5.4432+-0.9607 ? might be 1.0557x slower
array-with-double-increment 4.3077+-0.5172 4.0900+-0.1127 might be 1.0532x faster
array-with-double-mul-add 6.4768+-2.6226 6.3420+-1.3743 might be 1.0212x faster
array-with-double-sum 5.4218+-2.8657 4.4393+-0.2268 might be 1.2213x faster
array-with-int32-add-sub 9.7018+-1.7519 9.6859+-1.0100
array-with-int32-or-double-sum 4.5615+-0.3718 4.4307+-0.0717 might be 1.0295x faster
ArrayBuffer-DataView-alloc-large-long-lived 92.7383+-21.1111 ? 94.5363+-18.2296 ? might be 1.0194x slower
ArrayBuffer-DataView-alloc-long-lived 27.3530+-0.4370 ? 27.4035+-0.4026 ?
ArrayBuffer-Int32Array-byteOffset 5.3277+-0.9605 5.0772+-1.1157 might be 1.0493x faster
ArrayBuffer-Int8Array-alloc-large-long-lived 87.8235+-3.7037 ? 89.3667+-2.8378 ? might be 1.0176x slower
ArrayBuffer-Int8Array-alloc-long-lived-buffer 42.4662+-1.8608 ? 44.5422+-3.2409 ? might be 1.0489x slower
ArrayBuffer-Int8Array-alloc-long-lived 27.2574+-1.2893 ? 40.7502+-26.2666 ? might be 1.4950x slower
ArrayBuffer-Int8Array-alloc 24.3102+-1.8089 23.4233+-0.3014 might be 1.0379x faster
asmjs_bool_bug 10.2275+-1.6232 9.9374+-0.9731 might be 1.0292x faster
assign-custom-setter-polymorphic 5.3958+-2.8372 4.2377+-0.0790 might be 1.2733x faster
assign-custom-setter 6.7116+-1.8186 5.7207+-0.1553 might be 1.1732x faster
basic-set 15.2944+-1.4699 ? 16.8418+-1.6224 ? might be 1.1012x slower
big-int-mul 5.3982+-2.3876 4.8770+-0.5635 might be 1.1069x faster
boolean-test 3.9315+-0.0601 3.9216+-0.0528
branch-fold 5.3407+-0.3900 5.2039+-0.1548 might be 1.0263x faster
by-val-generic 14.1953+-3.0894 12.2887+-0.4783 might be 1.1551x faster
call-spread-apply 21.1329+-2.0698 19.9612+-1.6778 might be 1.0587x faster
call-spread-call 9.4270+-2.2975 ? 10.0299+-3.5196 ? might be 1.0640x slower
captured-assignments 0.7473+-0.2279 0.6880+-0.0339 might be 1.0861x faster
cast-int-to-double 13.8560+-3.3016 12.7221+-1.0309 might be 1.0891x faster
cell-argument 16.7746+-0.7704 ? 16.8671+-1.9028 ?
cfg-simplify 3.9363+-0.1383 3.8776+-0.2038 might be 1.0151x faster
chain-getter-access 32.0134+-0.7059 ? 42.8976+-20.4539 ? might be 1.3400x slower
cmpeq-obj-to-obj-other 12.2512+-1.0254 ? 12.3747+-0.9741 ? might be 1.0101x slower
constant-test 6.7241+-1.0113 ? 7.2228+-1.5912 ? might be 1.0742x slower
DataView-custom-properties 102.6498+-28.4165 95.2300+-3.8282 might be 1.0779x faster
delay-tear-off-arguments-strictmode 3.8663+-0.6316 3.6145+-0.1341 might be 1.0697x faster
destructuring-arguments 7.5680+-1.1250 7.1902+-0.2746 might be 1.0525x faster
destructuring-swap 7.2413+-1.4801 6.7482+-0.3006 might be 1.0731x faster
direct-arguments-getbyval 1.0186+-0.0260 ? 1.1976+-0.5724 ? might be 1.1758x slower
double-get-by-val-out-of-bounds 8.1046+-0.9512 8.0263+-1.7558
double-pollution-getbyval 13.4520+-1.4616 13.4288+-1.9079
double-pollution-putbyoffset 6.7491+-1.6974 5.6722+-0.0553 might be 1.1899x faster
double-to-int32-typed-array-no-inline 3.2407+-1.1429 ? 3.2752+-0.9846 ? might be 1.0106x slower
double-to-int32-typed-array 2.4253+-0.0577 ? 2.4659+-0.0630 ? might be 1.0168x slower
double-to-uint32-typed-array-no-inline 3.2659+-0.7862 2.8707+-0.0955 might be 1.1377x faster
double-to-uint32-typed-array 3.0692+-1.3075 2.6465+-0.2561 might be 1.1597x faster
empty-string-plus-int 9.8101+-1.5952 9.5392+-0.4874 might be 1.0284x faster
emscripten-cube2hash 66.7296+-10.0378 63.3163+-12.9618 might be 1.0539x faster
external-arguments-getbyval 1.9769+-0.0874 ? 2.3238+-1.0649 ? might be 1.1755x slower
external-arguments-putbyval 2.7413+-0.2124 ? 2.7703+-0.1554 ? might be 1.0106x slower
fixed-typed-array-storage-var-index 1.8513+-0.3844 1.8207+-0.4535 might be 1.0168x faster
fixed-typed-array-storage 1.1812+-0.0593 1.1570+-0.0289 might be 1.0209x faster
Float32Array-matrix-mult 8.2892+-3.3549 6.9694+-1.0974 might be 1.1894x faster
Float32Array-to-Float64Array-set 80.3307+-0.6767 ^ 71.7085+-3.4909 ^ definitely 1.1202x faster
Float64Array-alloc-long-lived 94.0060+-3.4542 ? 94.2178+-4.4826 ?
Float64Array-to-Int16Array-set 122.9407+-75.6291 93.7740+-1.1519 might be 1.3110x faster
fold-double-to-int 20.7211+-3.5298 ? 24.0407+-12.4710 ? might be 1.1602x slower
for-of-iterate-array-entries 9.3400+-0.9179 8.6447+-0.2646 might be 1.0804x faster
for-of-iterate-array-keys 3.8580+-1.0004 3.6834+-0.2588 might be 1.0474x faster
for-of-iterate-array-values 3.4532+-0.9667 3.1402+-0.1228 might be 1.0997x faster
fround 32.1973+-0.2235 ? 33.4987+-1.3024 ? might be 1.0404x slower
function-dot-apply 1.8180+-0.1370 ? 2.0289+-0.4967 ? might be 1.1160x slower
function-test 4.5516+-0.3610 4.4225+-0.1952 might be 1.0292x faster
function-with-eval 31.7252+-3.1305 30.0603+-2.1529 might be 1.0554x faster
get-by-id-chain-from-try-block 8.1958+-0.5394 8.0446+-0.1962 might be 1.0188x faster
get-by-id-proto-or-self 20.4640+-0.5515 ? 21.5950+-3.2453 ? might be 1.0553x slower
get-by-id-self-or-proto 21.5588+-1.3150 ? 22.1125+-1.0984 ? might be 1.0257x slower
get-by-val-out-of-bounds 7.4294+-0.9574 ? 8.6473+-4.1948 ? might be 1.1639x slower
get_callee_monomorphic 5.1578+-0.7696 ? 5.4516+-1.8727 ? might be 1.0570x slower
get_callee_polymorphic 4.8195+-0.8840 4.5848+-0.2326 might be 1.0512x faster
getter 17.4755+-2.2433 16.7350+-0.5555 might be 1.0442x faster
global-var-const-infer-fire-from-opt 1.3748+-0.2713 1.3028+-0.1516 might be 1.0552x faster
global-var-const-infer 1.0248+-0.1205 0.9642+-0.0273 might be 1.0628x faster
HashMap-put-get-iterate-keys 41.7585+-4.0984 41.5412+-2.8311
HashMap-put-get-iterate 41.8647+-1.0272 ? 50.3089+-25.3522 ? might be 1.2017x slower
HashMap-string-put-get-iterate 44.7075+-6.2275 ? 45.0616+-6.5395 ?
imul-double-only 15.8456+-2.5819 ? 18.5297+-10.3589 ? might be 1.1694x slower
imul-int-only 14.7005+-1.7614 ? 14.9564+-1.4794 ? might be 1.0174x slower
imul-mixed 19.4137+-1.9534 19.2422+-2.5853
in-four-cases 24.2115+-10.6961 21.1573+-0.6499 might be 1.1444x faster
in-one-case-false 11.1131+-0.8385 ? 11.3940+-1.1181 ? might be 1.0253x slower
in-one-case-true 13.3086+-6.5538 11.9175+-2.5792 might be 1.1167x faster
in-two-cases 11.4745+-0.5842 ? 12.0227+-0.9180 ? might be 1.0478x slower
indexed-properties-in-objects 4.3527+-0.7218 4.1464+-0.9367 might be 1.0498x faster
infer-closure-const-then-mov-no-inline 5.4289+-3.2787 4.2347+-0.0646 might be 1.2820x faster
infer-closure-const-then-mov 26.8688+-2.7559 ? 27.4850+-1.8953 ? might be 1.0229x slower
infer-closure-const-then-put-to-scope-no-inline 17.7740+-1.6431 ? 17.8256+-1.8367 ?
infer-closure-const-then-put-to-scope 30.9261+-0.9087 30.1460+-0.8932 might be 1.0259x faster
infer-closure-const-then-reenter-no-inline 84.4408+-3.2891 ? 86.1025+-5.5076 ? might be 1.0197x slower
infer-closure-const-then-reenter 32.0818+-2.8805 ? 34.7264+-12.0811 ? might be 1.0824x slower
infer-one-time-closure-ten-vars 26.1993+-3.0638 25.5443+-1.6031 might be 1.0256x faster
infer-one-time-closure-two-vars 25.0657+-0.8285 ? 25.5281+-1.4406 ? might be 1.0184x slower
infer-one-time-closure 25.3208+-1.0584 ? 25.3891+-1.6030 ?
infer-one-time-deep-closure 50.5080+-2.0417 50.0031+-3.1110 might be 1.0101x faster
inline-arguments-access 1.8079+-0.2869 ? 1.8617+-0.5070 ? might be 1.0298x slower
inline-arguments-aliased-access 1.8713+-0.0511 ? 2.0662+-0.5325 ? might be 1.1042x slower
inline-arguments-local-escape 16.8954+-0.7978 ? 17.5120+-0.3287 ? might be 1.0365x slower
inline-get-scoped-var 6.0897+-1.5772 5.5700+-0.5221 might be 1.0933x faster
inlined-put-by-id-transition 11.4883+-0.3000 ? 12.2261+-0.7939 ? might be 1.0642x slower
int-or-other-abs-then-get-by-val 9.3752+-5.3183 7.9805+-0.4666 might be 1.1748x faster
int-or-other-abs-zero-then-get-by-val 39.3630+-18.8871 32.8015+-4.1172 might be 1.2000x faster
int-or-other-add-then-get-by-val 11.7188+-2.0976 10.9802+-0.4705 might be 1.0673x faster
int-or-other-add 9.9643+-0.8790 ? 10.3979+-1.6176 ? might be 1.0435x slower
int-or-other-div-then-get-by-val 7.6581+-3.5966 7.1395+-1.2478 might be 1.0726x faster
int-or-other-max-then-get-by-val 7.9525+-1.5630 ? 8.7600+-4.9233 ? might be 1.1015x slower
int-or-other-min-then-get-by-val 7.8059+-0.9442 7.7828+-1.2924
int-or-other-mod-then-get-by-val 6.9943+-1.1702 ? 8.1813+-2.8848 ? might be 1.1697x slower
int-or-other-mul-then-get-by-val 6.6740+-0.7408 ? 7.6552+-1.8065 ? might be 1.1470x slower
int-or-other-neg-then-get-by-val 7.8483+-2.8591 ? 8.3329+-1.2615 ? might be 1.0618x slower
int-or-other-neg-zero-then-get-by-val 31.6415+-1.7283 29.8420+-2.5352 might be 1.0603x faster
int-or-other-sub-then-get-by-val 10.6354+-0.1073 ? 10.6717+-0.3674 ?
int-or-other-sub 7.9955+-1.3759 7.9533+-1.2905
int-overflow-local 7.0939+-2.2971 5.8370+-0.0554 might be 1.2153x faster
Int16Array-alloc-long-lived 69.2947+-0.8772 67.5840+-1.7406 might be 1.0253x faster
Int16Array-bubble-sort-with-byteLength 32.6938+-2.3920 ? 34.3080+-1.7015 ? might be 1.0494x slower
Int16Array-bubble-sort 28.5263+-4.0900 ? 28.8425+-4.5226 ? might be 1.0111x slower
Int16Array-load-int-mul 1.9990+-0.2565 1.9376+-0.0578 might be 1.0317x faster
Int16Array-to-Int32Array-set 89.5992+-39.9506 70.7845+-1.9310 might be 1.2658x faster
Int32Array-alloc-large 31.4408+-2.0799 30.5198+-1.9677 might be 1.0302x faster
Int32Array-alloc-long-lived 75.9565+-2.9072 74.9927+-1.1918 might be 1.0129x faster
Int32Array-alloc 4.6470+-1.1440 4.1216+-0.4228 might be 1.1275x faster
Int32Array-Int8Array-view-alloc 12.7440+-0.8115 ? 13.3265+-1.0827 ? might be 1.0457x slower
int52-spill 10.8069+-0.7233 ? 11.5554+-2.0249 ? might be 1.0693x slower
Int8Array-alloc-long-lived 63.8255+-1.1077 ? 63.8987+-2.1864 ?
Int8Array-load-with-byteLength 4.7138+-0.7073 4.5664+-0.5640 might be 1.0323x faster
Int8Array-load 4.9595+-1.2430 4.5480+-0.3524 might be 1.0905x faster
integer-divide 17.4471+-1.7175 16.5241+-0.6360 might be 1.0559x faster
integer-modulo 2.5934+-1.6519 2.2083+-0.4094 might be 1.1744x faster
large-int-captured 9.1987+-1.0069 8.9847+-0.4825 might be 1.0238x faster
large-int-neg 23.6665+-3.1706 ? 23.7142+-2.5426 ?
large-int 26.3954+-15.4044 21.3854+-1.0682 might be 1.2343x faster
logical-not 6.4646+-1.0448 6.0062+-0.4903 might be 1.0763x faster
lots-of-fields 13.9201+-2.3576 13.5471+-2.7176 might be 1.0275x faster
make-indexed-storage 3.8641+-0.2683 ? 4.4921+-1.0177 ? might be 1.1625x slower
make-rope-cse 6.4275+-1.4481 ? 6.6870+-1.5049 ? might be 1.0404x slower
marsaglia-larger-ints 98.9485+-2.5556 ? 99.3793+-2.2440 ?
marsaglia-osr-entry 41.8600+-0.8444 ? 43.2228+-5.4257 ? might be 1.0326x slower
method-on-number 27.2859+-1.1045 26.7730+-2.6778 might be 1.0192x faster
misc-strict-eq 64.6840+-7.1466 62.0667+-3.4867 might be 1.0422x faster
negative-zero-divide 0.5828+-0.1612 ? 0.6077+-0.2390 ? might be 1.0427x slower
negative-zero-modulo 0.5094+-0.0130 ? 0.5704+-0.1834 ? might be 1.1198x slower
negative-zero-negate 0.5148+-0.0159 ? 0.5204+-0.0148 ? might be 1.0110x slower
nested-function-parsing 38.3182+-2.5144 37.9318+-1.4442 might be 1.0102x faster
new-array-buffer-dead 4.0557+-0.0164 ? 4.0897+-0.1318 ?
new-array-buffer-push 9.8289+-1.6838 8.9083+-0.2580 might be 1.1033x faster
new-array-dead 31.1060+-1.6179 30.1218+-0.5609 might be 1.0327x faster
new-array-push 6.4183+-0.0564 ? 6.7667+-0.8311 ? might be 1.0543x slower
number-test 4.2139+-1.0026 3.8815+-0.0832 might be 1.0856x faster
object-closure-call 9.1504+-1.1262 ? 9.5311+-1.2198 ? might be 1.0416x slower
object-test 4.2000+-0.2132 4.1095+-0.0541 might be 1.0220x faster
poly-stricteq 84.4606+-4.4393 84.4077+-2.9810
polymorphic-get-by-id 3.9060+-0.1554 ? 3.9498+-0.2273 ? might be 1.0112x slower
polymorphic-put-by-id 72.8988+-68.5369 ? 95.8370+-80.6429 ? might be 1.3147x slower
polymorphic-structure 28.0064+-18.0453 ? 28.0096+-17.5880 ?
polyvariant-monomorphic-get-by-id 12.1135+-2.5919 10.7445+-0.5947 might be 1.1274x faster
proto-getter-access 33.6077+-2.3691 33.0453+-3.2087 might be 1.0170x faster
put-by-id 16.5244+-1.2533 16.3435+-0.9697 might be 1.0111x faster
put-by-val-large-index-blank-indexing-type 9.4885+-0.3463 ? 9.5186+-0.7328 ?
put-by-val-machine-int 3.3607+-0.2915 ? 3.4525+-0.2176 ? might be 1.0273x slower
rare-osr-exit-on-local 19.3826+-0.8860 ? 21.5028+-6.3982 ? might be 1.1094x slower
register-pressure-from-osr 29.9227+-1.1443 ? 30.7917+-2.4698 ? might be 1.0290x slower
setter 19.2878+-2.5324 19.0494+-3.5336 might be 1.0125x faster
simple-activation-demo 34.5583+-1.8805 ? 36.2892+-2.1257 ? might be 1.0501x slower
simple-getter-access 50.9291+-1.9527 ? 51.6914+-3.5013 ? might be 1.0150x slower
slow-array-profile-convergence 3.9481+-0.1939 3.8962+-0.0614 might be 1.0133x faster
slow-convergence 5.0742+-0.8411 4.3039+-0.2412 might be 1.1790x faster
sparse-conditional 1.5156+-0.0270 ? 1.5833+-0.2322 ? might be 1.0447x slower
splice-to-remove 63.1953+-1.2919 62.7690+-3.2119
string-char-code-at 26.8528+-1.9617 24.6282+-4.4934 might be 1.0903x faster
string-concat-object 2.8815+-0.0217 ? 3.1141+-0.6523 ? might be 1.0807x slower
string-concat-pair-object 3.4343+-0.8053 3.1948+-1.0026 might be 1.0749x faster
string-concat-pair-simple 14.0681+-1.8277 ? 14.8300+-2.1296 ? might be 1.0542x slower
string-concat-simple 13.8769+-0.2379 ? 13.9775+-0.1005 ?
string-cons-repeat 10.4382+-1.9957 ? 10.6436+-1.8100 ? might be 1.0197x slower
string-cons-tower 10.6697+-1.5501 ? 10.7988+-2.1100 ? might be 1.0121x slower
string-equality 39.2385+-0.5030 ? 40.8249+-3.0975 ? might be 1.0404x slower
string-get-by-val-big-char 11.9070+-1.3326 ? 12.2754+-1.6915 ? might be 1.0309x slower
string-get-by-val-out-of-bounds-insane 6.5706+-1.6813 ? 7.6985+-1.5752 ? might be 1.1717x slower
string-get-by-val-out-of-bounds 5.9410+-1.9276 5.5321+-0.5011 might be 1.0739x faster
string-get-by-val 3.9114+-0.0683 ? 4.3478+-1.4484 ? might be 1.1116x slower
string-hash 2.6592+-0.2756 2.5626+-0.0183 might be 1.0377x faster
string-long-ident-equality 34.9924+-1.0943 ? 37.8195+-2.5732 ? might be 1.0808x slower
string-repeat-arith 47.6771+-3.3100 43.0648+-1.5542 might be 1.1071x faster
string-sub 89.9006+-4.2412 86.5112+-1.9659 might be 1.0392x faster
string-test 3.7381+-0.0997 ? 3.7490+-0.1849 ?
string-var-equality 59.9330+-3.7681 ? 60.0725+-1.7292 ?
structure-hoist-over-transitions 3.5838+-1.0085 3.3077+-0.0441 might be 1.0835x faster
switch-char-constant 3.1700+-0.2602 3.1155+-0.0355 might be 1.0175x faster
switch-char 8.3749+-2.3614 8.3381+-1.2628
switch-constant 10.0244+-1.6401 9.6490+-0.7354 might be 1.0389x faster
switch-string-basic-big-var 20.6093+-0.4876 ? 21.2503+-1.4227 ? might be 1.0311x slower
switch-string-basic-big 20.2893+-1.1519 20.0833+-0.9829 might be 1.0103x faster
switch-string-basic-var 21.6278+-0.8927 20.5326+-1.3016 might be 1.0533x faster
switch-string-basic 18.0966+-0.8079 18.0683+-0.5337
switch-string-big-length-tower-var 27.5045+-0.6520 ? 28.0016+-0.8188 ? might be 1.0181x slower
switch-string-length-tower-var 21.0065+-0.5750 ? 21.9965+-2.8253 ? might be 1.0471x slower
switch-string-length-tower 18.1628+-1.4607 ? 18.4920+-2.3722 ? might be 1.0181x slower
switch-string-short 19.5737+-3.8429 18.6371+-1.0606 might be 1.0503x faster
switch 17.0074+-2.2226 16.3795+-0.3643 might be 1.0383x faster
tear-off-arguments-simple 2.6013+-0.0840 ? 2.7831+-0.3933 ? might be 1.0699x slower
tear-off-arguments 3.9955+-0.4599 3.9250+-0.2602 might be 1.0180x faster
temporal-structure 18.2529+-1.6863 ? 19.6742+-3.8070 ? might be 1.0779x slower
to-int32-boolean 21.1107+-4.6526 19.9608+-3.0242 might be 1.0576x faster
undefined-test 4.3229+-1.1569 4.1130+-0.6682 might be 1.0510x faster
unprofiled-licm 59.8765+-6.3399 58.6620+-3.3857 might be 1.0207x faster
weird-inlining-const-prop 2.6855+-0.6112 2.5662+-0.1349 might be 1.0465x faster
<arithmetic> 22.1598+-0.6929 22.0114+-0.5846 might be 1.0067x faster
<geometric> * 11.4939+-0.2032 11.3916+-0.1575 might be 1.0090x faster
<harmonic> 5.5682+-0.1129 5.5424+-0.0724 might be 1.0047x faster
Conf#1 Conf#2
AsmBench:
bigfib.cpp 1466.8200+-14.3095 1463.6599+-23.9301
cray.c 53.6849+-4.1114 ? 55.8937+-1.8710 ? might be 1.0411x slower
dry.c 1278.2316+-41.4835 1261.2917+-8.1168 might be 1.0134x faster
FloatMM.c 1828.5541+-11.5089 ? 1833.6780+-8.2449 ?
gcc-loops.cpp 3165.5955+-50.1891 ? 3270.8739+-329.1033 ? might be 1.0333x slower
n-body.c 2123.0682+-14.5961 ? 2126.6964+-3.6637 ?
Quicksort.c 104.7718+-0.4877 ? 120.3850+-26.7873 ? might be 1.1490x slower
stepanov_container.cpp 7294.9308+-152.7661 7220.3734+-97.6040 might be 1.0103x faster
Towers.c 74.0829+-3.5863 ? 74.0936+-3.4701 ?
<arithmetic> 1932.1933+-22.3593 ? 1936.3273+-32.3755 ? might be 1.0021x slower
<geometric> * 744.1055+-4.0407 ? 759.4225+-13.9067 ? might be 1.0206x slower
<harmonic> 201.6047+-4.2082 ? 210.5416+-8.7023 ? might be 1.0443x slower
Conf#1 Conf#2
All benchmarks:
<arithmetic> 164.9232+-1.3712 ? 165.0275+-1.5218 ? might be 1.0006x slower
<geometric> 19.3808+-0.2470 19.2919+-0.1689 might be 1.0046x faster
<harmonic> 4.9128+-0.0795 4.9015+-0.0209 might be 1.0023x faster
Conf#1 Conf#2
Geomean of preferred means:
<scaled-result> 70.9432+-1.0831 70.9029+-0.7597 might be 1.0006x faster
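To make the collect-first heuristic from the review discussion concrete, here is a minimal sketch, not the patch itself. The two function names and the `slowPathAllocsBetweenGCs` parameter are hypothetical stand-ins (the parameter name is borrowed from this bug's summary). Only the counter logic of the first function follows the quoted MarkedAllocator.cpp snippet; the second function is the alternative the reviewer asked about.

```cpp
// Illustrative only: counts how many test collections each strategy would
// trigger over a run of `slowPathAllocs` slow-path allocations.

// Strategy from the quoted patch: collect when the counter is 0, then wrap
// the counter once it reaches the configured interval.
unsigned collectionsWhenCollectingFirst(unsigned slowPathAllocs, unsigned slowPathAllocsBetweenGCs)
{
    unsigned allocationCount = 0;
    unsigned collections = 0;
    for (unsigned i = 0; i < slowPathAllocs; ++i) {
        if (!allocationCount)
            ++collections; // stands in for m_heap->collectAllGarbage()
        if (++allocationCount >= slowPathAllocsBetweenGCs)
            allocationCount = 0;
    }
    return collections;
}

// Alternative raised in review: collect only once the limit is reached.
unsigned collectionsWhenCollectingLast(unsigned slowPathAllocs, unsigned slowPathAllocsBetweenGCs)
{
    unsigned allocationCount = 0;
    unsigned collections = 0;
    for (unsigned i = 0; i < slowPathAllocs; ++i) {
        if (++allocationCount >= slowPathAllocsBetweenGCs) {
            ++collections;
            allocationCount = 0;
        }
    }
    return collections;
}
```

Under these assumptions, a short test that only reaches 50 slow-path allocations with an interval of 100 gets one collection with the first strategy and none with the second, which is the situation described in the reply about short-running tests.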
Created attachment 230103 [details]
revised patch
Comment on attachment 230103 [details]
revised patch

View in context: https://bugs.webkit.org/attachment.cgi?id=230103&action=review

r=me

> Source/JavaScriptCore/heap/MarkedAllocator.h:53
> +    ALWAYS_INLINE void doTestCollectionsIfNeeded();

Do you need ALWAYS_INLINE here? I've never seen it in a header before.

(In reply to comment #12)
> > Source/JavaScriptCore/heap/MarkedAllocator.h:53
> > +    ALWAYS_INLINE void doTestCollectionsIfNeeded();
>
> Do you need ALWAYS_INLINE here? I've never seen it in a header before.

ALWAYS_INLINE is used in headers everywhere. Just grep for it in header files and you’ll see.

Thanks for the review. Landed in r167772: <http://trac.webkit.org/r167772>. |
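For readers unfamiliar with the pattern being reviewed, the following is a rough sketch of a force-inline helper declared in a class and called from a slow-path method. It is not the landed code: `SKETCH_ALWAYS_INLINE` and `SketchAllocator` are hypothetical names, the macro is built from the common compiler attributes rather than copied from WebKit, and the helper only counts where a test collection would happen. Only the helper name `doTestCollectionsIfNeeded` comes from the quoted header line.

```cpp
// Hypothetical force-inline macro based on widely available compiler
// attributes; an assumption for this sketch, not WebKit's definition.
#if defined(_MSC_VER)
#define SKETCH_ALWAYS_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE inline __attribute__((__always_inline__))
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

class SketchAllocator {
public:
    explicit SketchAllocator(unsigned interval) : m_interval(interval) { }

    void allocateSlowCase();
    unsigned testCollections() const { return m_testCollections; }

private:
    // Declared force-inline so the slow path stays readable without an extra call.
    SKETCH_ALWAYS_INLINE void doTestCollectionsIfNeeded();

    unsigned m_interval;
    unsigned m_count { 0 };
    unsigned m_testCollections { 0 };
};

SKETCH_ALWAYS_INLINE void SketchAllocator::doTestCollectionsIfNeeded()
{
    if (!m_count)
        ++m_testCollections; // stands in for triggering a full collection
    if (++m_count >= m_interval)
        m_count = 0;
}

inline void SketchAllocator::allocateSlowCase()
{
    doTestCollectionsIfNeeded();
    // The real slow-path allocation work would follow here.
}
```

Keeping the counter logic in its own helper matches the review request to keep the main method relatively clean; whether forcing inlining matters for performance here is not something this sketch demonstrates.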