Bug 88840 - GC should be 1.7X faster
Summary: GC should be 1.7X faster
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Geoffrey Garen
URL:
Keywords:
Depends on: 88966
Blocks:
Reported: 2012-06-11 22:57 PDT by Geoffrey Garen
Modified: 2012-06-12 23:13 PDT

See Also:


Attachments
Patch (21.27 KB, patch)
2012-06-11 23:35 PDT, Geoffrey Garen
oliver: review+

Description Geoffrey Garen 2012-06-11 22:57:41 PDT
GC should be 1.7X faster
Comment 1 Geoffrey Garen 2012-06-11 23:00:59 PDT
Highlights:

1.7X reduction in average pause on gc-benchmark.html on a Mac Pro: 105ms vs 62ms.

1.12X speedup on v8-splay (the V8Real splay row below; the V8 suite's splay row shows 1.21X).

Possible small regression on SunSpider, but not statistically significant.

Benchmark report for SunSpider, V8, V8Real, Kraken, JSBench, JSRegress, and DSP on garen (MacPro5,1).

VMs tested:
"SPADE" at /Volumes/Big/ggaren/Downloads/Safari-MountainLion-Production-Curie-119909-43390.app/Contents/Resources/DumpRenderTree
"PATCH" at /Volumes/Big/ggaren/webkit/WebKitBuild/Release/DumpRenderTree (r119908)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to
get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.
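
For reference, here is a minimal sketch of how a figure such as 8.5830+-0.1405 and the "might be"/"definitely" annotations could be derived. It assumes a Student's t interval over the samples and assumes "definitely" means the two intervals do not overlap; the report confirms neither. The names Estimate, estimate95, and compare are hypothetical and are not part of the benchmark harness.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// A sample mean with its 95% confidence half-width, e.g. "8.5830+-0.1405".
struct Estimate {
    double mean;
    double halfWidth;
};

// Two-sided 95% Student's t critical value for 11 degrees of freedom
// (12 samples). Assumption: the harness uses a t-distribution.
const double tCritical95For12Samples = 2.201;

Estimate estimate95(const std::vector<double>& samples, double tCritical)
{
    assert(samples.size() >= 2);
    const double n = static_cast<double>(samples.size());

    double sum = 0;
    for (double sample : samples)
        sum += sample;
    const double mean = sum / n;

    // Sample variance with Bessel's correction (divide by n - 1).
    double squaredDeviations = 0;
    for (double sample : samples)
        squaredDeviations += (sample - mean) * (sample - mean);
    const double standardDeviation = std::sqrt(squaredDeviations / (n - 1));

    return { mean, tCritical * standardDeviation / std::sqrt(n) };
}

// Times are in milliseconds, so a lower mean is faster.
std::string compare(const Estimate& baseline, const Estimate& patch)
{
    const bool patchIsFaster = patch.mean < baseline.mean;
    const bool intervalsOverlap =
        baseline.mean - baseline.halfWidth <= patch.mean + patch.halfWidth
        && patch.mean - patch.halfWidth <= baseline.mean + baseline.halfWidth;

    const std::string certainty = intervalsOverlap ? "might be " : "definitely ";
    return certainty + (patchIsFaster ? "faster" : "slower");
}
```

Fed the V8 splay row (144.2647+-11.0317 vs 119.3962+-12.2720), compare() reports "definitely faster"; fed the SunSpider 3d-cube row, it reports "might be slower". The report also appears to leave overlapping rows unannotated when the ratio is under roughly 1.01x; this sketch does not model that.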

                                                    SPADE                     PATCH                                       
SunSpider:
   3d-cube                                      8.5830+-0.1405     ?      9.3066+-0.6883        ? might be 1.0843x slower
   3d-morph                                     8.6177+-0.1482     ?      8.6985+-0.0973        ?
   3d-raytrace                                 15.2287+-0.6303     ?     15.2385+-0.6271        ?
   access-binary-trees                          2.3384+-0.4223     ?      2.7461+-0.5536        ? might be 1.1743x slower
   access-fannkuch                              7.6331+-0.0680     ?      7.6771+-0.0479        ?
   access-nbody                                 4.6996+-0.0765     ?      4.7859+-0.0354        ? might be 1.0184x slower
   access-nsieve                                3.6986+-0.0436     ?      3.7721+-0.0729        ? might be 1.0199x slower
   bitops-3bit-bits-in-byte                     1.5707+-0.0080     ?      1.5955+-0.0252        ? might be 1.0158x slower
   bitops-bits-in-byte                          6.5760+-0.0391            6.5643+-0.0353        
   bitops-bitwise-and                           2.7999+-0.1013     ?      2.8158+-0.0815        ?
   bitops-nsieve-bits                           3.6562+-0.0232     !      3.6960+-0.0095        ! definitely 1.0109x slower
   controlflow-recursive                        2.9061+-0.0245     ?      2.9222+-0.0246        ?
   crypto-aes                                  10.0641+-0.5468     ?     10.8969+-0.8429        ? might be 1.0827x slower
   crypto-md5                                   4.2083+-0.0921     ?      4.2382+-0.1361        ?
   crypto-sha1                                  3.5431+-0.0680     !      3.6564+-0.0185        ! definitely 1.0320x slower
   date-format-tofte                           15.6157+-1.4840           15.4828+-1.6521        
   date-format-xparb                           12.9708+-0.1381     ?     13.6844+-1.1520        ? might be 1.0550x slower
   math-cordic                                  5.1981+-0.0536     ?      5.2323+-0.0491        ?
   math-partial-sums                           10.5900+-0.0850     ?     10.6056+-0.0157        ?
   math-spectral-norm                           3.4542+-0.0605     ?      3.5117+-0.0154        ? might be 1.0166x slower
   regexp-dna                                  11.4805+-0.0649     ?     11.7938+-0.3619        ? might be 1.0273x slower
   string-base64                                6.6503+-0.8510     ?      6.7046+-0.8126        ?
   string-fasta                                 9.3593+-0.4904     ?      9.4326+-0.4028        ?
   string-tagcloud                             15.7902+-0.3367           15.7115+-0.4151        
   string-unpack-code                          27.1805+-1.0656           26.5491+-1.0049          might be 1.0238x faster
   string-validate-input                        9.9656+-0.6400            9.5112+-0.7653          might be 1.0478x faster

   <arithmetic> *                               8.2453+-0.1518     ?      8.3396+-0.2877        ? might be 1.0114x slower
   <geometric>                                  6.5643+-0.0814     ?      6.6848+-0.1922        ? might be 1.0184x slower
   <harmonic>                                   5.1823+-0.0700     ?      5.3088+-0.1312        ? might be 1.0244x slower

                                                    SPADE                     PATCH                                       
V8:
   crypto                                      87.8475+-0.7467           87.8045+-0.7383        
   deltablue                                  172.8938+-1.7931          172.3493+-1.7812        
   earley-boyer                               101.2235+-1.5644     ?    101.5495+-1.8055        ?
   raytrace                                    69.5681+-0.3166     ?     70.1683+-2.1174        ?
   regexp                                     102.5394+-0.3300     !    104.3305+-0.3496        ! definitely 1.0175x slower
   richards                                   146.3581+-1.0942     ?    147.1016+-2.0144        ?
   splay                                      144.2647+-11.0317    ^    119.3962+-12.2720       ^ definitely 1.2083x faster

   <arithmetic>                               117.8136+-1.1902     ^    114.6714+-1.4067        ^ definitely 1.0274x faster
   <geometric> *                              112.6310+-0.9225     ^    110.0133+-1.3500        ^ definitely 1.0238x faster
   <harmonic>                                 107.6217+-0.6607     ^    105.6110+-1.2751        ^ definitely 1.0190x faster

                                                    SPADE                     PATCH                                       
V8Real:
   encrypt                                     0.47470+-0.00099    ?     0.47477+-0.00082       ?
   decrypt                                     8.23235+-0.00980    ?     8.23890+-0.01067       ?
   deltablue                          x2       0.90939+-0.01345    ?     0.91183+-0.01432       ?
   earley                                      2.44956+-0.01556    ?     2.46486+-0.02022       ?
   boyer                                      15.97746+-0.10361    ?    16.15869+-0.14813       ? might be 1.0113x slower
   raytrace                           x2       6.26569+-0.06272    ?     6.32463+-0.04388       ?
   regexp                             x2      30.72591+-0.09405    ?    30.76099+-0.06738       ?
   richards                           x2       0.40930+-0.00952          0.40682+-0.00740       
   splay                              x2       1.13597+-0.02031    ^     1.01809+-0.03479       ^ definitely 1.1158x faster

   <arithmetic>                                7.57333+-0.01659    ?     7.58442+-0.01926       ? might be 1.0015x slower
   <geometric> *                               2.68480+-0.01298    ^     2.64905+-0.01572       ^ definitely 1.0135x faster
   <harmonic>                                  1.17302+-0.01180    ^     1.15142+-0.00867       ^ definitely 1.0188x faster

                                                    SPADE                     PATCH                                       
Kraken:
   ai-astar                                    927.713+-3.869            926.424+-3.120         
   audio-beat-detection                        241.742+-3.215      ?     242.090+-4.505         ?
   audio-dft                                   320.833+-0.427      ?     320.974+-0.771         ?
   audio-fft                                   146.010+-0.387            145.769+-0.339         
   audio-oscillator                            273.660+-0.212            273.610+-0.327         
   imaging-darkroom                            331.959+-2.721      ?     331.970+-2.382         ?
   imaging-desaturate                          257.814+-0.332            257.678+-0.166         
   imaging-gaussian-blur                       502.870+-0.165      ?     504.113+-2.315         ?
   json-parse-financial                         76.072+-0.283      ^      75.543+-0.080         ^ definitely 1.0070x faster
   json-stringify-tinderbox                     96.523+-0.427      ?      97.317+-0.477         ?
   stanford-crypto-aes                         102.722+-0.650            102.187+-0.541         
   stanford-crypto-ccm                         112.508+-0.409      ?     112.643+-0.376         ?
   stanford-crypto-pbkdf2                      227.194+-0.954            226.091+-0.785         
   stanford-crypto-sha256-iterative            111.009+-0.213            110.499+-0.433         

   <arithmetic> *                              266.331+-0.437            266.208+-0.551           might be 1.0005x faster
   <geometric>                                 206.972+-0.232            206.798+-0.348           might be 1.0008x faster
   <harmonic>                                  168.153+-0.146            167.921+-0.225           might be 1.0014x faster

                                                    SPADE                     PATCH                                       
JSBench:
   amazon                                      20.8333+-0.2473           20.6667+-0.3128        
   facebook                                    78.8333+-2.2962     ?     81.5833+-2.3378        ? might be 1.0349x slower
   google                                     111.5833+-2.0536     ?    111.8333+-1.8541        ?
   twitter                                     60.9167+-0.4248     ^     59.3333+-0.3128        ^ definitely 1.0267x faster
   yahoo                                       25.6667+-0.4138     ?     26.0833+-0.4248        ? might be 1.0162x slower

   <arithmetic> *                              59.5667+-0.7485     ?     59.9000+-0.9486        ? might be 1.0056x slower
   <geometric>                                 49.1269+-0.4701     ?     49.3125+-0.6902        ? might be 1.0038x slower
   <harmonic>                                  39.9751+-0.3130     ?     40.0608+-0.5170        ? might be 1.0021x slower

                                                    SPADE                     PATCH                                       
JSRegress:
   adapt-to-double-divide                      83.3630+-0.0237     !     83.4670+-0.0737        ! definitely 1.0012x slower
   aliased-arguments-getbyval                   0.9438+-0.0245     ?      0.9580+-0.0168        ? might be 1.0151x slower
   arity-mismatch-inlining                      0.7349+-0.0287            0.7267+-0.0136          might be 1.0113x faster
   big-int-mul                                 10.0781+-0.0257     ?     10.1085+-0.0219        ?
   boolean-test                                 4.0164+-0.0126     ?      4.0522+-0.0256        ?
   cast-int-to-double                          14.1922+-0.0215     ?     14.2338+-0.0351        ?
   cfg-simplify                                 3.5019+-0.0127     ?      3.5164+-0.0133        ?
   cmpeq-obj-to-obj-other                      16.1665+-0.2244           16.1618+-0.5207        
   constant-test                                7.8481+-0.0155     ?      7.8666+-0.0082        ?
   direct-arguments-getbyval                    0.8645+-0.0149     ?      0.8756+-0.0117        ? might be 1.0129x slower
   double-pollution-getbyval                   10.1094+-0.0282     ?     10.1474+-0.0164        ?
   double-pollution-putbyoffset                 5.0111+-0.0786     ?      5.0267+-0.0773        ?
   external-arguments-getbyval                  2.6587+-0.3422     ?      2.6905+-0.3176        ? might be 1.0120x slower
   external-arguments-putbyval                  3.9350+-0.6730     ?      4.0129+-0.6519        ? might be 1.0198x slower
   Float32Array-matrix-mult                    13.3589+-1.1334           13.1731+-1.1115          might be 1.0141x faster
   fold-double-to-int                          42.8418+-1.0133     ?     44.4507+-1.9277        ? might be 1.0376x slower
   function-dot-apply                           3.5925+-0.0181     !      3.6583+-0.0180        ! definitely 1.0183x slower
   function-test                                4.9003+-0.0471            4.8732+-0.0610        
   inline-arguments-access                      1.2773+-0.0144     ?      1.2877+-0.0038        ?
   inline-arguments-local-escape               33.2692+-4.6584     ?     33.5490+-4.5307        ?
   int-overflow-local                         104.7309+-0.0833          104.6487+-0.1333        
   Int16Array-bubble-sort                      80.0115+-0.2301           79.7546+-1.6814        
   Int16Array-load-int-mul                      2.0172+-0.0201     ?      2.0255+-0.0119        ?
   Int8Array-load                               5.1788+-0.0778            5.1478+-0.0118        
   integer-divide                              16.0347+-0.0165     ?     16.0573+-0.0164        ?
   method-on-number                           231.4135+-4.9584     ^    211.2290+-3.5086        ^ definitely 1.0956x faster
   number-test                                  3.9184+-0.0366     ?      3.9268+-0.0218        ?
   object-test                                  4.2765+-0.0342     ?      4.3429+-0.0438        ? might be 1.0155x slower
   poly-stricteq                               94.1316+-1.2360           93.5036+-0.4059        
   rare-osr-exit-on-local                     201.5588+-0.3274          201.3859+-0.2713        
   simple-activation-demo                      45.6224+-0.1255     ?     45.7235+-0.1803        ?
   slow-convergence                            89.5075+-0.1528     ?     89.5298+-0.1226        ?
   sparse-conditional                           1.3877+-0.0156     ?      1.3905+-0.0120        ?
   string-hash                                  5.4583+-0.0251     ?      5.4798+-0.0131        ?
   string-test                                  3.8887+-0.0146     !      3.9280+-0.0147        ! definitely 1.0101x slower
   tear-off-arguments                           0.9541+-0.0193     ?      0.9617+-0.0155        ?
   to-int32-boolean                            26.9931+-0.0650     ?     26.9961+-0.0169        ?
   undefined-test                               4.2910+-0.0176     !      4.3289+-0.0184        ! definitely 1.0088x slower

   <arithmetic>                                31.1589+-0.2293     ^     30.6631+-0.2177        ^ definitely 1.0162x faster
   <geometric> *                                9.1655+-0.1023     ?      9.1864+-0.1010        ? might be 1.0023x slower
   <harmonic>                                   3.5706+-0.0184     ?      3.5940+-0.0138        ? might be 1.0066x slower

                                                    SPADE                     PATCH                                       
DSP:
   filtrr-posterize-tint                       50.5980+-0.6100     ?     50.7318+-0.5787        ?
   filtrr-tint-contrast-sat-bright             79.0194+-0.9351     ?     79.9662+-1.1456        ? might be 1.0120x slower
   filtrr-tint-sat-adj-contr-mult             103.4475+-0.3864     !    107.6216+-3.3997        ! definitely 1.0403x slower
   filtrr-blur-overlay-sat-contr              260.2373+-4.6067     ?    264.2078+-4.6535        ? might be 1.0153x slower
   filtrr-sat-blur-mult-sharpen-contr         326.9078+-5.5831     ?    327.4784+-5.4996        ?
   filtrr-sepia-bias                           36.0644+-0.6724           35.7091+-0.8061        
   route9-vp8                         x5     1388.5253+-15.9151    ?   1395.0577+-7.1459        ?

   <arithmetic>                               708.9910+-7.2687     ?    712.8185+-3.5027        ? might be 1.0054x slower
   <geometric> *                              337.8135+-1.8839     ?    340.4040+-2.2165        ? might be 1.0077x slower
   <harmonic>                                 136.9157+-1.1923     ?    137.5429+-1.7390        ? might be 1.0046x slower

                                                    SPADE                     PATCH                                       
All benchmarks:
   <arithmetic>                               123.0826+-0.6766     ?    123.1158+-0.3829        ? might be 1.0003x slower
   <geometric>                                 18.9278+-0.0993     ?     18.9744+-0.1599        ? might be 1.0025x slower
   <harmonic>                                   4.1141+-0.0199            4.1085+-0.0214          might be 1.0014x faster

                                                    SPADE                     PATCH                                       
Geomean of preferred means:
   <scaled-result>                             38.3671+-0.1316           38.3038+-0.2069          might be 1.0017x faster
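
The three summary rows per suite can be sketched as below. This assumes the "*" marks each suite's preferred mean and that the final <scaled-result> is the geometric mean of those preferred means. That reading is consistent with the numbers above, but the report does not state it. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

double arithmeticMean(const std::vector<double>& values)
{
    assert(!values.empty());
    double sum = 0;
    for (double value : values)
        sum += value;
    return sum / values.size();
}

// Computed via logarithms to avoid overflow in a long product.
// Values must be positive, which holds for execution times.
double geometricMean(const std::vector<double>& values)
{
    assert(!values.empty());
    double logSum = 0;
    for (double value : values)
        logSum += std::log(value);
    return std::exp(logSum / values.size());
}

double harmonicMean(const std::vector<double>& values)
{
    assert(!values.empty());
    double reciprocalSum = 0;
    for (double value : values)
        reciprocalSum += 1 / value;
    return values.size() / reciprocalSum;
}
```

Applying arithmeticMean to the seven SPADE means in the V8 table gives 117.8136, matching the reported <arithmetic> row. Applying geometricMean to the seven starred SPADE means (8.2453, 112.6310, 2.68480, 266.331, 59.5667, 9.1655, 337.8135) gives about 38.37, close to the reported 38.3671. A small difference is expected if the harness averages per-sample results rather than per-benchmark means.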
Comment 2 Geoffrey Garen 2012-06-11 23:35:15 PDT
Created attachment 147019
Patch
Comment 3 WebKit Review Bot 2012-06-11 23:37:39 PDT
Attachment 147019 did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/ChangeLog', u'Source..." exit_code: 1
Source/JavaScriptCore/heap/MarkStack.h:46:  Alphabetical sorting problem.  [build/include_order] [4]
Total errors found: 1 in 9 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 Oliver Hunt 2012-06-12 11:39:07 PDT
Comment on attachment 147019
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=147019&action=review

Makes me wonder if we should be looking at other Mutex users and seeing if spinlocks would be more appropriate...
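
For readers unfamiliar with the trade-off, a minimal spinlock sketch in standard C++ follows. It is not the lock used in the patch; the class and function names are hypothetical. A spinlock busy-waits instead of blocking in the kernel, which tends to pay off only when critical sections are very short and contention is low.

```cpp
#include <atomic>
#include <cassert>

class SpinLock {
public:
    // Busy-wait until the flag was previously clear, i.e. we acquired it.
    void lock()
    {
        while (m_flag.test_and_set(std::memory_order_acquire)) { }
    }

    // Single attempt; returns true if the lock was acquired.
    bool tryLock()
    {
        return !m_flag.test_and_set(std::memory_order_acquire);
    }

    void unlock()
    {
        m_flag.clear(std::memory_order_release);
    }

private:
    std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
};

// Example critical section: the lock is held for a single increment.
void incrementShared(SpinLock& lock, unsigned& counter)
{
    lock.lock();
    ++counter;
    lock.unlock();
}
```

Whether a given Mutex user would benefit depends on how long it holds the lock; a waiter on a long critical section burns a whole core while spinning.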

> Source/JavaScriptCore/runtime/Options.cpp:198
> +    // We don't scale so well beyond 6.
> +    if (cpusToUse > 6)
> +        cpusToUse = 6;

Can't we use a constant here rather than a magic number that gets repeated?
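
One possible shape for that suggestion, as a sketch only: the constant and function names here are hypothetical, and the surrounding Options.cpp code is not reproduced.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical name. The value 6 comes from the patch comment
// "We don't scale so well beyond 6."
static const unsigned maxUsefulGCThreads = 6;

// Cap the requested CPU count at the point where scaling stops paying off.
unsigned clampGCThreads(unsigned cpusToUse)
{
    assert(cpusToUse >= 1);
    return std::min(cpusToUse, maxUsefulGCThreads);
}
```

Naming the limit once means the comment and the comparison cannot drift apart if the value is retuned later.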
Comment 5 Geoffrey Garen 2012-06-12 19:06:55 PDT
Committed r120149: <http://trac.webkit.org/changeset/120149>