Bug 149433 - VMs should share GC threads
Summary: VMs should share GC threads
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: Other
Hardware: All All
Importance: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords:
Depends on: 149435 149509 149512
Blocks: 149432
Reported: 2015-09-21 16:34 PDT by Filip Pizlo
Modified: 2015-09-26 13:14 PDT
CC List: 14 users

See Also:


Attachments
work in progress (66.25 KB, patch)
2015-09-23 12:30 PDT, Filip Pizlo
no flags
sort of passing some tests, I guess (72.38 KB, patch)
2015-09-23 15:12 PDT, Filip Pizlo
no flags
maybe the patch (78.24 KB, patch)
2015-09-24 15:04 PDT, Filip Pizlo
no flags
getter better (85.10 KB, patch)
2015-09-24 18:00 PDT, Filip Pizlo
no flags
the patch (87.90 KB, patch)
2015-09-24 18:50 PDT, Filip Pizlo
ggaren: review+
patch for landing (91.82 KB, patch)
2015-09-25 13:08 PDT, Filip Pizlo
no flags

Description Filip Pizlo 2015-09-21 16:34:23 PDT
...
Comment 1 Mark Hahnenberg 2015-09-22 12:31:56 PDT
+1
Comment 2 Filip Pizlo 2015-09-23 12:30:07 PDT
Created attachment 261835
work in progress
Comment 3 Filip Pizlo 2015-09-23 15:12:17 PDT
Created attachment 261843
sort of passing some tests, I guess
Comment 4 Filip Pizlo 2015-09-24 15:04:05 PDT
Created attachment 261897
maybe the patch

I'm still running some tests.  But it looks pretty good so far.
Comment 5 WebKit Commit Bot 2015-09-24 15:05:37 PDT
Attachment 261897 did not pass style-queue:


ERROR: Source/WTF/wtf/ParallelHelperPool.cpp:176:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/runtime/CodeCache.h:38:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Source/JavaScriptCore/heap/HeapHelperPool.cpp:30:  Bad include order. Mixing system and custom headers.  [build/include_order] [4]
ERROR: Source/JavaScriptCore/heap/Heap.cpp:548:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/heap/Heap.cpp:633:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/runtime/VM.h:68:  Alphabetical sorting problem.  [build/include_order] [4]
Total errors found: 6 in 17 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 6 Filip Pizlo 2015-09-24 17:55:46 PDT
Performance is OK:


Benchmark report for SunSpider, LongSpider, V8Spider, Octane, Kraken, JSRegress, AsmBench, and CompressionBench on shakezilla (MacBookPro11,3).

VMs tested:
"TipOfTree" at /Volumes/Data/secondary/OpenSource/WebKitBuild/Release/jsc (r190217)
"ParallelHelperPool" at /Volumes/Data/quartary/OpenSource/WebKitBuild/Release/jsc (r190217)

Collected 7 samples per benchmark/VM, with 7 VM invocations per benchmark. Emitted a call to gc() between sample measurements.
Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level
timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.
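
A note on reading the annotations in the result columns: the report does not say how its intervals or verdicts are computed, so the following is only a sketch under stated assumptions. It assumes the "+-" value is the half-width of a Student t interval over the 7 samples, that "definitely" is printed when the two intervals do not overlap, and that "might be" is printed when they overlap but the ratio of means is at least 1.01. The last two rules are inferred from the rows in this report, not taken from the benchmark tool. The names `mean_and_half_width`, `verdict`, and `threshold` are hypothetical.

```python
import math
import statistics

# Two-sided 95% Student t critical value for 6 degrees of freedom (7 samples).
T_95_DF6 = 2.447


def mean_and_half_width(samples):
    """Return (mean, half-width of the 95% confidence interval).

    Assumption: a Student t interval; the report does not state its method.
    """
    if len(samples) != 7:
        raise ValueError("T_95_DF6 only applies to exactly 7 samples")
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, T_95_DF6 * std_err


def verdict(base, base_hw, new, new_hw, threshold=1.01):
    """Classify `new` against `base`; times are in ms, lower is better.

    Assumption: the rules and the 1.01 threshold are inferred from the
    rows of this report, not from the benchmark tool's source.
    """
    faster = new < base
    ratio = base / new if faster else new / base
    word = "faster" if faster else "slower"
    # Two closed intervals overlap when each one starts before the other ends.
    overlap = (base - base_hw) <= (new + new_hw) and (new - new_hw) <= (base + base_hw)
    if not overlap:
        return f"definitely {ratio:.4f}x {word}"
    if ratio >= threshold:
        return f"might be {ratio:.4f}x {word}"
    return ""


# SunSpider regexp-dna row: 6.4210+-0.1994 vs 6.0502+-0.1595
print(verdict(6.4210, 0.1994, 6.0502, 0.1595))
```

With the regexp-dna numbers this prints "definitely 1.0613x faster", matching that row. Under these assumed rules a "definitely" verdict can carry a very small ratio (as in stanford-crypto-pbkdf2), because it only says the intervals are disjoint, not that the difference is large.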

                                                        TipOfTree             ParallelHelperPool                                
SunSpider:
   3d-cube                                            4.8238+-0.4948            4.4259+-0.1112          might be 1.0899x faster
   3d-morph                                           5.3545+-0.2622            5.2012+-0.1000          might be 1.0295x faster
   3d-raytrace                                        5.3653+-0.1587            5.2177+-0.2311          might be 1.0283x faster
   access-binary-trees                                2.1610+-0.1281     ?      2.2318+-0.2849        ? might be 1.0327x slower
   access-fannkuch                                    5.4899+-0.1947     ?      5.5426+-0.1371        ?
   access-nbody                                       2.5440+-0.1140     ?      2.5831+-0.1806        ? might be 1.0154x slower
   access-nsieve                                      3.0718+-0.0730     ?      3.0813+-0.1555        ?
   bitops-3bit-bits-in-byte                           1.1482+-0.0227     ?      1.1590+-0.0418        ?
   bitops-bits-in-byte                                3.1865+-0.0434     ?      3.2571+-0.1015        ? might be 1.0222x slower
   bitops-bitwise-and                                 1.9811+-0.0172     ?      2.0200+-0.1145        ? might be 1.0196x slower
   bitops-nsieve-bits                                 2.9163+-0.0590     ?      2.9771+-0.1171        ? might be 1.0209x slower
   controlflow-recursive                              2.3349+-0.0745            2.3206+-0.0430        
   crypto-aes                                         3.8618+-0.0679     ?      3.9394+-0.2317        ? might be 1.0201x slower
   crypto-md5                                         2.4306+-0.0634     ?      2.5313+-0.1136        ? might be 1.0414x slower
   crypto-sha1                                        2.4447+-0.1530            2.4429+-0.1040        
   date-format-tofte                                  6.6284+-0.1152     ?      6.7748+-0.2771        ? might be 1.0221x slower
   date-format-xparb                                  4.7655+-0.1693            4.5829+-0.1946          might be 1.0398x faster
   math-cordic                                        2.8865+-0.1388            2.8842+-0.1256        
   math-partial-sums                                  4.7640+-0.0731     ?      4.7845+-0.2007        ?
   math-spectral-norm                                 1.9986+-0.1796            1.9368+-0.0857          might be 1.0319x faster
   regexp-dna                                         6.4210+-0.1994     ^      6.0502+-0.1595        ^ definitely 1.0613x faster
   string-base64                                      4.4975+-0.2186            4.4594+-0.1558        
   string-fasta                                       5.8044+-0.1032     ?      5.8735+-0.1774        ? might be 1.0119x slower
   string-tagcloud                                    7.8370+-0.0885     ?      7.8931+-0.0769        ?
   string-unpack-code                                18.5619+-0.5746     ?     18.7104+-0.4670        ?
   string-validate-input                              4.5297+-0.1527     ?      4.5637+-0.4200        ?

   <arithmetic>                                       4.5311+-0.0236            4.5171+-0.0506          might be 1.0031x faster

                                                        TipOfTree             ParallelHelperPool                                
LongSpider:
   3d-cube                                          800.6923+-9.6531          799.8176+-2.4010        
   3d-morph                                        1485.1298+-2.1368         1484.4645+-3.0170        
   3d-raytrace                                      589.2321+-4.3652     ?    590.7834+-5.1502        ?
   access-binary-trees                              785.2797+-3.3423     ?    787.6647+-6.4111        ?
   access-fannkuch                                  273.3021+-3.7824     ?    274.9242+-4.6492        ?
   access-nbody                                     504.3896+-0.7826          503.6922+-0.6525        
   access-nsieve                                    365.2946+-11.0123    ?    366.2232+-9.7443        ?
   bitops-3bit-bits-in-byte                          33.7894+-0.7732           33.6817+-0.4180        
   bitops-bits-in-byte                               73.9784+-1.3815           73.8790+-0.6959        
   bitops-nsieve-bits                               398.2521+-2.1836     ?    399.3892+-3.1114        ?
   controlflow-recursive                            425.8974+-4.6171          423.2567+-0.6182        
   crypto-aes                                       558.9363+-12.6580         553.3800+-10.7937         might be 1.0100x faster
   crypto-md5                                       431.8727+-3.2255     ?    441.8658+-26.4278       ? might be 1.0231x slower
   crypto-sha1                                      632.6772+-5.6073          628.0345+-10.0830       
   date-format-tofte                                500.2225+-6.0276     ?    504.2688+-6.1823        ?
   date-format-xparb                                658.1238+-1.8384     ?    700.6277+-119.7695      ? might be 1.0646x slower
   hash-map                                         148.6700+-1.6807     ?    149.1348+-1.4700        ?
   math-cordic                                      474.6093+-1.2511          473.9519+-0.5247        
   math-partial-sums                                457.5010+-0.7586     ^    455.0258+-1.5119        ^ definitely 1.0054x faster
   math-spectral-norm                               545.2324+-0.4119     ?    547.1794+-2.7131        ?
   string-base64                                    349.8429+-2.4775     !    357.3467+-2.8458        ! definitely 1.0214x slower
   string-fasta                                     362.0520+-2.0066     ?    364.2637+-1.2252        ?
   string-tagcloud                                  174.2948+-1.6912     ?    175.5440+-2.0972        ?

   <geometric>                                      380.9535+-0.9168     ?    382.6343+-3.2999        ? might be 1.0044x slower

                                                        TipOfTree             ParallelHelperPool                                
V8Spider:
   crypto                                            47.7898+-1.8614           47.6309+-1.1581        
   deltablue                                         79.6906+-3.3930     ?     79.8958+-2.8654        ?
   earley-boyer                                      41.1604+-1.1718           40.7277+-0.7002          might be 1.0106x faster
   raytrace                                          30.8145+-1.3700     ?     31.0355+-1.3871        ?
   regexp                                            61.8230+-1.0898           61.5825+-1.4560        
   richards                                          54.3931+-1.1869           54.3314+-1.7836        
   splay                                             34.6575+-1.2398     ?     36.0469+-0.9957        ? might be 1.0401x slower

   <geometric>                                       47.6916+-0.7557     ?     47.8969+-0.2706        ? might be 1.0043x slower

                                                        TipOfTree             ParallelHelperPool                                
Octane:
   encrypt                                           0.16343+-0.00267          0.16310+-0.00195       
   decrypt                                           2.94817+-0.05449          2.89314+-0.00637         might be 1.0190x faster
   deltablue                                x2       0.13779+-0.00205    ?     0.13879+-0.00254       ?
   earley                                            0.30340+-0.00789          0.30223+-0.00443       
   boyer                                             4.30783+-0.02409          4.29217+-0.03977       
   navier-stokes                            x2       4.82042+-0.00795          4.80779+-0.01098       
   raytrace                                 x2       0.87676+-0.05835          0.85806+-0.00420         might be 1.0218x faster
   richards                                 x2       0.08839+-0.00068    ?     0.08871+-0.00080       ?
   splay                                    x2       0.33154+-0.00416          0.33099+-0.00137       
   regexp                                   x2      23.95248+-0.37488    ?    24.35841+-0.29745       ? might be 1.0169x slower
   pdfjs                                    x2      36.95942+-0.52543         36.75775+-0.45513       
   mandreel                                 x2      41.85882+-0.49120    ?    42.35157+-0.36371       ? might be 1.0118x slower
   gbemu                                    x2      33.13506+-5.06346         32.03116+-2.35316         might be 1.0345x faster
   closure                                           0.57249+-0.00310          0.57035+-0.00235       
   jquery                                            7.23814+-0.02116          7.21110+-0.02052       
   box2d                                    x2       9.10411+-0.06195          9.06213+-0.06407       
   zlib                                     x2     386.51659+-4.52678        385.91389+-5.79452       
   typescript                               x2     652.22147+-8.99379    ?   654.91033+-13.74540      ?

   <geometric>                                       5.31726+-0.07076          5.30404+-0.02898         might be 1.0025x faster

                                                        TipOfTree             ParallelHelperPool                                
Kraken:
   ai-astar                                          126.137+-0.702            126.066+-0.502         
   audio-beat-detection                               49.756+-0.358             49.647+-0.209         
   audio-dft                                          95.089+-0.585      ?      95.293+-0.163         ?
   audio-fft                                          35.076+-0.672             35.034+-0.461         
   audio-oscillator                                   55.413+-0.717      ?      55.541+-0.980         ?
   imaging-darkroom                                   59.730+-0.164      ?      59.848+-0.183         ?
   imaging-desaturate                                 47.560+-0.227             47.531+-0.376         
   imaging-gaussian-blur                              85.038+-0.487      ?      88.892+-6.168         ? might be 1.0453x slower
   json-parse-financial                               37.283+-0.456      ?      38.749+-1.274         ? might be 1.0393x slower
   json-stringify-tinderbox                           23.404+-1.308             21.919+-0.694           might be 1.0678x faster
   stanford-crypto-aes                                40.918+-0.919      ?      40.918+-1.438         ?
   stanford-crypto-ccm                                35.441+-1.153      ?      36.464+-1.711         ? might be 1.0289x slower
   stanford-crypto-pbkdf2                             93.509+-0.208      !      93.882+-0.145         ! definitely 1.0040x slower
   stanford-crypto-sha256-iterative                   36.001+-0.640      ?      36.955+-1.952         ? might be 1.0265x slower

   <arithmetic>                                       58.597+-0.130      ?      59.053+-0.559         ? might be 1.0078x slower

                                                        TipOfTree             ParallelHelperPool                                
JSRegress:
   abc-forward-loop-equal                            29.3565+-1.0327     ?     31.0222+-1.8528        ? might be 1.0567x slower
   abc-postfix-backward-loop                         29.7912+-0.5513           29.3668+-0.5314          might be 1.0145x faster
   abc-simple-backward-loop                          29.4310+-1.0981           29.3534+-0.7962        
   abc-simple-forward-loop                           29.7808+-1.5927           29.0624+-0.4607          might be 1.0247x faster
   abc-skippy-loop                                   21.6326+-0.3919           21.0678+-0.5134          might be 1.0268x faster
   abs-boolean                                        2.4848+-0.1656            2.4536+-0.0799          might be 1.0127x faster
   adapt-to-double-divide                            16.2429+-0.1942     ?     16.4609+-0.5834        ? might be 1.0134x slower
   aliased-arguments-getbyval                         1.1821+-0.0281     ?      1.3010+-0.3082        ? might be 1.1006x slower
   allocate-big-object                                2.5917+-0.2934            2.4178+-0.1343          might be 1.0719x faster
   arguments-named-and-reflective                    10.6881+-0.1507           10.4298+-0.2074          might be 1.0248x faster
   arguments-out-of-bounds                            9.2026+-0.2504     ?      9.2072+-0.3817        ?
   arguments-strict-mode                              9.5392+-0.6549     ?      9.5483+-0.3914        ?
   arguments                                          8.2852+-0.2653     ?      8.4510+-0.3825        ? might be 1.0200x slower
   arity-mismatch-inlining                            0.8222+-0.0106     ?      0.8616+-0.0376        ? might be 1.0480x slower
   array-access-polymorphic-structure                 7.0726+-0.3831            7.0293+-0.2144        
   array-nonarray-polymorhpic-access                 23.6395+-0.5155     ?     23.7437+-0.8642        ?
   array-prototype-every                             74.5624+-0.6370     ?     75.6059+-1.0363        ? might be 1.0140x slower
   array-prototype-forEach                           73.7615+-1.0234     ?     73.8851+-0.9513        ?
   array-prototype-map                               79.7467+-0.9320     ?     80.2307+-0.6464        ?
   array-prototype-reduce                            70.2190+-0.8770     ?     71.3728+-1.1883        ? might be 1.0164x slower
   array-prototype-reduceRight                       70.8346+-1.5496     ?     71.4663+-1.3753        ?
   array-prototype-some                              74.4315+-0.8592     ?     74.7282+-0.8082        ?
   array-splice-contiguous                           21.0511+-0.5749           20.3212+-0.6597          might be 1.0359x faster
   array-with-double-add                              3.3735+-0.0578     ?      3.4865+-0.1107        ? might be 1.0335x slower
   array-with-double-increment                        3.1378+-0.1577            3.0481+-0.0639          might be 1.0294x faster
   array-with-double-mul-add                          4.1968+-0.0944     ?      4.2897+-0.2413        ? might be 1.0221x slower
   array-with-double-sum                              3.2120+-0.1081     ?      3.2314+-0.0786        ?
   array-with-int32-add-sub                           5.8442+-0.1968     ?      5.8567+-0.2952        ?
   array-with-int32-or-double-sum                     3.2271+-0.0461     ?      3.3396+-0.1125        ? might be 1.0349x slower
   ArrayBuffer-DataView-alloc-large-long-lived   
                                                     25.7962+-0.8155     ?     26.2505+-1.3011        ? might be 1.0176x slower
   ArrayBuffer-DataView-alloc-long-lived             11.8668+-0.5699     ?     12.0131+-0.5715        ? might be 1.0123x slower
   ArrayBuffer-Int32Array-byteOffset                  3.5920+-0.1720            3.4819+-0.0327          might be 1.0316x faster
   ArrayBuffer-Int8Array-alloc-large-long-lived   
                                                     26.4563+-0.9379           25.8607+-0.7490          might be 1.0230x faster
   ArrayBuffer-Int8Array-alloc-long-lived-buffer   
                                                     19.2873+-0.8727     ?     19.7725+-0.5821        ? might be 1.0252x slower
   ArrayBuffer-Int8Array-alloc-long-lived            11.2677+-0.5516     ?     11.9112+-0.9822        ? might be 1.0571x slower
   ArrayBuffer-Int8Array-alloc                        9.3154+-0.2028     ?      9.5407+-0.3360        ? might be 1.0242x slower
   arrowfunction-call                                10.4272+-0.1500     ?     10.7147+-0.1907        ? might be 1.0276x slower
   asmjs_bool_bug                                     7.4775+-0.1389     ?      7.6511+-0.2795        ? might be 1.0232x slower
   assign-custom-setter-polymorphic                   2.4796+-0.0398            2.4779+-0.1156        
   assign-custom-setter                               3.4798+-0.0856     ^      3.2400+-0.0417        ^ definitely 1.0740x faster
   basic-set                                          7.1322+-0.2514     ?      7.4105+-0.5650        ? might be 1.0390x slower
   big-int-mul                                        3.4929+-0.0375     ?      3.5125+-0.0412        ?
   boolean-test                                       3.0570+-0.0552            3.0556+-0.0806        
   branch-fold                                        3.5759+-0.0483     ?      3.6265+-0.0553        ? might be 1.0141x slower
   branch-on-string-as-boolean                       17.1311+-0.6498           16.7138+-0.4266          might be 1.0250x faster
   by-val-generic                                     2.4874+-0.0899     ?      2.5834+-0.2736        ? might be 1.0386x slower
   call-spread-apply                                 27.0971+-0.8009     ?     27.5159+-1.2324        ? might be 1.0155x slower
   call-spread-call                                  21.4626+-0.4063     ?     21.5661+-0.6216        ?
   captured-assignments                               0.5231+-0.2430            0.4132+-0.0053          might be 1.2662x faster
   cast-int-to-double                                 5.0504+-0.0742     ?      5.2079+-0.2798        ? might be 1.0312x slower
   cell-argument                                      5.9909+-0.3708            5.8680+-0.1939          might be 1.0210x faster
   cfg-simplify                                       2.8473+-0.0468     ?      2.8748+-0.0681        ?
   chain-getter-access                                8.3989+-0.1335            8.2409+-0.1293          might be 1.0192x faster
   cmpeq-obj-to-obj-other                            12.6030+-1.0444           11.9971+-1.2182          might be 1.0505x faster
   constant-test                                      4.8196+-0.0917     ?      4.8638+-0.1182        ?
   create-lots-of-functions                           9.4813+-0.4761            9.2423+-0.4429          might be 1.0259x faster
   cse-new-array-buffer                               2.3040+-0.1265     ?      2.3085+-0.0937        ?
   cse-new-array                                      2.4287+-0.0913     ?      2.6374+-0.3116        ? might be 1.0859x slower
   DataView-custom-properties                        30.8669+-1.1588           30.6830+-1.0097        
   delay-tear-off-arguments-strictmode               12.3640+-0.4357     ?     12.3947+-0.4274        ?
   deltablue-varargs                                167.8660+-3.5312     ?    172.7589+-3.2891        ? might be 1.0291x slower
   destructuring-arguments                          159.1486+-0.6103     ?    160.2453+-1.2824        ?
   destructuring-parameters-overridden-by-function   
                                                      0.4861+-0.0836     ?      0.5148+-0.1033        ? might be 1.0590x slower
   destructuring-swap                                 4.8428+-0.2323            4.7849+-0.0678          might be 1.0121x faster
   direct-arguments-getbyval                          1.1347+-0.0166     !      1.2075+-0.0500        ! definitely 1.0642x slower
   div-boolean-double                                 5.2413+-0.1098     ?      5.3645+-0.1080        ? might be 1.0235x slower
   div-boolean                                        8.2072+-0.2929     ?      8.2744+-0.2825        ?
   double-get-by-val-out-of-bounds                    4.4515+-0.0672     ?      4.4810+-0.1781        ?
   double-pollution-getbyval                          8.7360+-0.0920            8.6736+-0.1610        
   double-pollution-putbyoffset                       4.0387+-0.5099            3.5668+-0.1647          might be 1.1323x faster
   double-real-use                                   24.5575+-1.4997           23.6573+-0.5163          might be 1.0381x faster
   double-to-int32-typed-array-no-inline              2.2629+-0.1695            2.2012+-0.1551          might be 1.0280x faster
   double-to-int32-typed-array                        2.0365+-0.0276     ?      2.0759+-0.0821        ? might be 1.0194x slower
   double-to-uint32-typed-array-no-inline             2.2050+-0.0554     ?      2.3474+-0.2306        ? might be 1.0646x slower
   double-to-uint32-typed-array                       2.0994+-0.0866            2.0953+-0.0165        
   elidable-new-object-dag                           33.9170+-0.7471           33.5063+-0.5747          might be 1.0123x faster
   elidable-new-object-roflcopter                    32.4588+-0.2753     ?     32.4715+-1.1477        ?
   elidable-new-object-then-call                     31.9255+-1.1010           31.6553+-0.9619        
   elidable-new-object-tree                          37.6223+-1.0741           36.9362+-0.4600          might be 1.0186x faster
   empty-string-plus-int                              4.7040+-0.0544     ?      4.7133+-0.0937        ?
   emscripten-cube2hash                              27.6944+-1.7620           27.2388+-0.6391          might be 1.0167x faster
   exit-length-on-plain-object                       16.3444+-1.2514           15.6522+-0.5977          might be 1.0442x faster
   external-arguments-getbyval                        1.2690+-0.0883     ?      1.3101+-0.0735        ? might be 1.0324x slower
   external-arguments-putbyval                        2.2081+-0.1511     ?      2.2174+-0.1200        ?
   fixed-typed-array-storage-var-index                1.3475+-0.3490            1.2414+-0.0788          might be 1.0855x faster
   fixed-typed-array-storage                          0.9848+-0.0544            0.9353+-0.0766          might be 1.0529x faster
   Float32Array-matrix-mult                           3.9841+-0.0731     ?      4.0958+-0.4146        ? might be 1.0281x slower
   Float32Array-to-Float64Array-set                  46.8357+-0.8639           46.7759+-0.7813        
   Float64Array-alloc-long-lived                     59.5727+-1.9765     ?     60.0234+-1.3327        ?
   Float64Array-to-Int16Array-set                    56.7699+-1.0771           56.2013+-0.6626          might be 1.0101x faster
   fold-double-to-int                                12.1760+-0.1663           12.1534+-0.1360        
   fold-get-by-id-to-multi-get-by-offset-rare-int   
                                                     10.4807+-0.8839     ?     11.1325+-1.0295        ? might be 1.0622x slower
   fold-get-by-id-to-multi-get-by-offset             10.6729+-1.5534           10.0416+-0.5594          might be 1.0629x faster
   fold-multi-get-by-offset-to-get-by-offset   
                                                      9.2352+-0.5870            9.1383+-0.7223          might be 1.0106x faster
   fold-multi-get-by-offset-to-poly-get-by-offset   
                                                      9.2409+-1.0132            9.0216+-0.5751          might be 1.0243x faster
   fold-multi-put-by-offset-to-poly-put-by-offset   
                                                      9.1681+-1.2521     ?      9.9054+-1.4483        ? might be 1.0804x slower
   fold-multi-put-by-offset-to-put-by-offset   
                                                     10.8474+-1.7473           10.8092+-0.9561        
   fold-multi-put-by-offset-to-replace-or-transition-put-by-offset   
                                                      9.1004+-0.2344     ?      9.3192+-0.5691        ? might be 1.0240x slower
   fold-put-by-id-to-multi-put-by-offset             11.0763+-0.9677     ?     12.6783+-2.5683        ? might be 1.1446x slower
   fold-put-by-val-with-string-to-multi-put-by-offset   
                                                     10.8389+-0.6064     ?     11.0068+-0.7783        ? might be 1.0155x slower
   fold-put-by-val-with-symbol-to-multi-put-by-offset   
                                                     10.4877+-0.7344     ?     10.9761+-0.7447        ? might be 1.0466x slower
   fold-put-structure                                 7.4094+-1.1789     ?      8.1991+-0.3357        ? might be 1.1066x slower
   for-of-iterate-array-entries                      10.7965+-0.1067           10.7600+-0.2025        
   for-of-iterate-array-keys                          3.4501+-0.0748     ?      3.5382+-0.4152        ? might be 1.0255x slower
   for-of-iterate-array-values                        3.6132+-0.4172            3.3318+-0.1059          might be 1.0845x faster
   fround                                            18.2111+-0.4730           18.1421+-1.1409        
   ftl-library-inlining-dataview                     56.4142+-0.2190           55.8224+-0.6345          might be 1.0106x faster
   ftl-library-inlining                              95.5581+-0.5499           95.2924+-0.4687        
   function-call                                     10.8105+-0.2711     ?     10.8851+-0.1592        ?
   function-dot-apply                                 2.1627+-0.1397            2.0641+-0.1477          might be 1.0478x faster
   function-test                                      2.7878+-0.2491            2.7455+-0.0944          might be 1.0154x faster
   function-with-eval                                91.5505+-0.3059     ?     93.4033+-1.9011        ? might be 1.0202x slower
   gcse-poly-get-less-obvious                        20.3363+-0.3851     ?     20.3853+-0.3501        ?
   gcse-poly-get                                     22.4937+-1.0273     ?     22.6642+-0.8397        ?
   gcse                                               3.5188+-0.1518            3.4017+-0.0348          might be 1.0344x faster
   get-by-id-bimorphic-check-structure-elimination-simple   
                                                      2.6298+-0.0891            2.6172+-0.0825        
   get-by-id-bimorphic-check-structure-elimination   
                                                      4.7826+-0.1820            4.7096+-0.0998          might be 1.0155x faster
   get-by-id-chain-from-try-block                     2.5400+-0.2495     ?      2.5651+-0.3229        ?
   get-by-id-check-structure-elimination              3.9961+-0.1522            3.9094+-0.0709          might be 1.0222x faster
   get-by-id-proto-or-self                           15.1374+-0.3066     ?     15.5473+-0.5754        ? might be 1.0271x slower
   get-by-id-quadmorphic-check-structure-elimination-simple   
                                                      2.9500+-0.1272            2.8846+-0.0726          might be 1.0227x faster
   get-by-id-self-or-proto                           15.7680+-1.0238     ?     15.8248+-0.5483        ?
   get-by-val-out-of-bounds                           4.3104+-0.1146     ?      4.3688+-0.2105        ? might be 1.0136x slower
   get-by-val-with-string-bimorphic-check-structure-elimination-simple   
                                                      2.7129+-0.0379            2.7096+-0.1101        
   get-by-val-with-string-bimorphic-check-structure-elimination   
                                                      5.8228+-0.0281     ?      5.8603+-0.1184        ?
   get-by-val-with-string-chain-from-try-block   
                                                      2.4559+-0.1264            2.3769+-0.0045          might be 1.0332x faster
   get-by-val-with-string-check-structure-elimination   
                                                      5.2392+-0.0832            5.2046+-0.1634        
   get-by-val-with-string-proto-or-self              15.4528+-0.6247     ?     15.7096+-0.5122        ? might be 1.0166x slower
   get-by-val-with-string-quadmorphic-check-structure-elimination-simple   
                                                      3.1660+-0.2759            3.1572+-0.1307        
   get-by-val-with-string-self-or-proto              15.9467+-0.6466           15.7133+-0.8377          might be 1.0149x faster
   get-by-val-with-symbol-bimorphic-check-structure-elimination-simple   
                                                      3.0986+-0.0303            3.0863+-0.0356        
   get-by-val-with-symbol-bimorphic-check-structure-elimination   
                                                     12.3212+-0.0864           12.2513+-0.0412        
   get-by-val-with-symbol-chain-from-try-block   
                                                      2.4749+-0.1629     ?      2.5348+-0.2313        ? might be 1.0242x slower
   get-by-val-with-symbol-check-structure-elimination   
                                                     11.4203+-0.4333           11.2365+-0.1053          might be 1.0164x faster
   get-by-val-with-symbol-proto-or-self              15.5873+-0.7831           15.5483+-0.7293        
   get-by-val-with-symbol-quadmorphic-check-structure-elimination-simple   
                                                      4.0209+-0.4042            3.7020+-0.1585          might be 1.0861x faster
   get-by-val-with-symbol-self-or-proto              15.8728+-0.7238           15.8668+-1.0346        
   get_callee_monomorphic                             2.3732+-0.1104     ?      2.5224+-0.5199        ? might be 1.0629x slower
   get_callee_polymorphic                             3.4611+-0.3077            3.3736+-0.3556          might be 1.0259x faster
   getter-no-activation                               4.7616+-0.1553            4.7422+-0.2508        
   getter-prototype                                   7.8935+-0.1788     ?      7.9819+-0.2259        ? might be 1.0112x slower
   getter-richards                                  119.0602+-9.9869          114.7559+-8.3484          might be 1.0375x faster
   getter                                             5.3080+-0.4459     ?      5.3659+-0.4727        ? might be 1.0109x slower
   global-object-access-with-mutating-structure   
                                                      5.6175+-0.1955            5.5761+-0.1240        
   global-var-const-infer-fire-from-opt               0.8607+-0.0801     ?      0.8703+-0.0821        ? might be 1.0112x slower
   global-var-const-infer                             0.6394+-0.0318     ?      0.6530+-0.0203        ? might be 1.0214x slower
   hard-overflow-check-equal                         26.2897+-0.5881     ?     26.7191+-0.8121        ? might be 1.0163x slower
   hard-overflow-check                               26.6345+-1.0546     ?     26.6432+-0.2225        ?
   HashMap-put-get-iterate-keys                      25.7031+-1.9463     ?     26.5482+-1.8008        ? might be 1.0329x slower
   HashMap-put-get-iterate                           28.2694+-1.1328           27.5800+-0.6482          might be 1.0250x faster
   HashMap-string-put-get-iterate                    24.4550+-1.0591           23.7402+-0.6055          might be 1.0301x faster
   hoist-make-rope                                    9.1133+-0.7831     ?      9.2104+-0.9708        ? might be 1.0107x slower
   hoist-poly-check-structure-effectful-loop   
                                                      3.7140+-0.1865            3.5447+-0.0450          might be 1.0478x faster
   hoist-poly-check-structure                         3.1646+-0.0978     ?      3.1914+-0.1601        ?
   imul-double-only                                   7.8158+-0.1566            7.7408+-0.1084        
   imul-int-only                                      9.0356+-0.3665            8.7873+-0.8767          might be 1.0283x faster
   imul-mixed                                         6.8087+-0.7229     ?      6.9012+-0.2755        ? might be 1.0136x slower
   in-four-cases                                     17.0128+-0.4362     ?     17.1189+-0.5557        ?
   in-one-case-false                                  9.4655+-0.2469     ?     10.0965+-0.8116        ? might be 1.0667x slower
   in-one-case-true                                   9.9453+-0.8318            9.5506+-0.4274          might be 1.0413x faster
   in-two-cases                                      10.1339+-0.6603            9.5269+-0.1093          might be 1.0637x faster
   indexed-properties-in-objects                      2.7344+-0.0289     ?      2.7851+-0.1134        ? might be 1.0185x slower
   infer-closure-const-then-mov-no-inline             3.6414+-0.1464            3.5841+-0.0362          might be 1.0160x faster
   infer-closure-const-then-mov                      18.8162+-0.6625     ?     18.8378+-0.4839        ?
   infer-closure-const-then-put-to-scope-no-inline   
                                                     10.8014+-0.2542           10.7742+-0.1370        
   infer-closure-const-then-put-to-scope             22.6386+-0.9609     ?     23.0738+-1.1797        ? might be 1.0192x slower
   infer-closure-const-then-reenter-no-inline   
                                                     44.9486+-0.7900     ?     45.2117+-0.5123        ?
   infer-closure-const-then-reenter                  22.6184+-0.5018     ?     23.8609+-0.9626        ? might be 1.0549x slower
   infer-constant-global-property                     3.3894+-0.0511            3.3758+-0.0732        
   infer-constant-property                            2.7150+-0.2658            2.6262+-0.0830          might be 1.0338x faster
   infer-one-time-closure-ten-vars                    7.9693+-0.5260            7.7623+-0.1160          might be 1.0267x faster
   infer-one-time-closure-two-vars                    7.6501+-0.5744            7.6404+-0.6334        
   infer-one-time-closure                             7.2405+-0.1119            7.1614+-0.3940          might be 1.0111x faster
   infer-one-time-deep-closure                       10.6788+-0.2469     ?     10.7645+-0.3737        ?
   inline-arguments-access                            3.6673+-0.2454            3.6465+-0.1146        
   inline-arguments-aliased-access                    3.6794+-0.0924     ?      3.8422+-0.3734        ? might be 1.0442x slower
   inline-arguments-local-escape                      3.7056+-0.3390            3.5457+-0.0605          might be 1.0451x faster
   inline-get-scoped-var                              4.5418+-0.1585            4.5095+-0.1070        
   inlined-put-by-id-transition                       9.7186+-1.6817            8.9215+-0.4415          might be 1.0893x faster
   inlined-put-by-val-with-string-transition   
                                                     42.6241+-1.1743           42.0691+-1.5529          might be 1.0132x faster
   inlined-put-by-val-with-symbol-transition   
                                                     43.3025+-2.2422           42.4315+-1.4180          might be 1.0205x faster
   int-or-other-abs-then-get-by-val                   4.4167+-0.0576     ?      4.5500+-0.1379        ? might be 1.0302x slower
   int-or-other-abs-zero-then-get-by-val             15.3950+-0.3113           15.3335+-0.2753        
   int-or-other-add-then-get-by-val                   4.0481+-0.1070     ?      4.0682+-0.1162        ?
   int-or-other-add                                   4.9826+-0.2172     ?      5.0635+-0.1760        ? might be 1.0162x slower
   int-or-other-div-then-get-by-val                   3.8110+-0.1744            3.7661+-0.0845          might be 1.0119x faster
   int-or-other-max-then-get-by-val                   3.8939+-0.0909     ?      4.0365+-0.2729        ? might be 1.0366x slower
   int-or-other-min-then-get-by-val                   3.7689+-0.0564            3.7610+-0.0532        
   int-or-other-mod-then-get-by-val                   3.5532+-0.1756            3.5467+-0.1878        
   int-or-other-mul-then-get-by-val                   3.6596+-0.0891            3.5919+-0.0789          might be 1.0189x faster
   int-or-other-neg-then-get-by-val                   3.9577+-0.0902     ?      3.9670+-0.0326        ?
   int-or-other-neg-zero-then-get-by-val             15.6159+-0.3045           15.4045+-0.3922          might be 1.0137x faster
   int-or-other-sub-then-get-by-val                   3.9610+-0.0415     ?      4.0350+-0.0530        ? might be 1.0187x slower
   int-or-other-sub                                   3.5075+-0.1072            3.4165+-0.1048          might be 1.0267x faster
   int-overflow-local                                 4.0496+-0.0315     ?      4.1213+-0.0915        ? might be 1.0177x slower
   Int16Array-alloc-long-lived                       43.2995+-0.8622           42.8477+-1.1305          might be 1.0105x faster
   Int16Array-bubble-sort-with-byteLength            17.1314+-0.4944           16.6690+-0.1648          might be 1.0277x faster
   Int16Array-bubble-sort                            16.5985+-0.3058     ?     16.9073+-0.9682        ? might be 1.0186x slower
   Int16Array-load-int-mul                            1.4413+-0.0634     ?      1.5373+-0.1972        ? might be 1.0666x slower
   Int16Array-to-Int32Array-set                      44.4871+-0.5405           43.1043+-0.9482          might be 1.0321x faster
   Int32Array-alloc-large                            11.9762+-0.7968           11.9623+-0.6352        
   Int32Array-alloc-long-lived                       48.5172+-1.3361           48.1144+-1.2252        
   Int32Array-alloc                                   2.8298+-0.2230     ?      2.8300+-0.1564        ?
   Int32Array-Int8Array-view-alloc                    5.8675+-0.1196     ?      5.9424+-0.2765        ? might be 1.0128x slower
   int52-spill                                        4.6311+-0.3230     ?      4.6459+-0.2634        ?
   Int8Array-alloc-long-lived                        38.1646+-1.0949     ?     38.3647+-1.1864        ?
   Int8Array-load-with-byteLength                     3.3700+-0.0562     ?      3.7047+-0.5697        ? might be 1.0993x slower
   Int8Array-load                                     3.4621+-0.1709            3.4060+-0.1011          might be 1.0165x faster
   integer-divide                                    10.3601+-0.1310           10.3494+-0.2497        
   integer-modulo                                     1.5836+-0.0377     ?      1.5914+-0.0411        ?
   is-boolean-fold-tricky                             3.8313+-0.2410     ?      3.8393+-0.2044        ?
   is-boolean-fold                                    2.7354+-0.2065            2.6140+-0.0324          might be 1.0464x faster
   is-function-fold-tricky-internal-function   
                                                      9.5305+-0.1155     ?      9.6943+-0.3340        ? might be 1.0172x slower
   is-function-fold-tricky                            4.2410+-0.4476            4.0590+-0.0645          might be 1.0448x faster
   is-function-fold                                   2.6386+-0.0755     ?      2.6755+-0.0855        ? might be 1.0140x slower
   is-number-fold-tricky                              3.9211+-0.0345     ?      3.9917+-0.1681        ? might be 1.0180x slower
   is-number-fold                                     2.5924+-0.0381     ?      2.6210+-0.0675        ? might be 1.0110x slower
   is-object-or-null-fold-functions                   2.7622+-0.1320            2.7206+-0.1188          might be 1.0153x faster
   is-object-or-null-fold-less-tricky                 3.9877+-0.0408     ?      4.0920+-0.1602        ? might be 1.0262x slower
   is-object-or-null-fold-tricky                      5.5365+-1.1037            4.7620+-0.1858          might be 1.1626x faster
   is-object-or-null-fold                             2.7199+-0.0986            2.6864+-0.0216          might be 1.0125x faster
   is-object-or-null-trickier-function                3.9971+-0.0499     ?      4.0245+-0.0481        ?
   is-object-or-null-trickier-internal-function   
                                                      9.8190+-0.1828     !     10.1550+-0.1470        ! definitely 1.0342x slower
   is-object-or-null-tricky-function                  4.0066+-0.0463     ?      4.2652+-0.4422        ? might be 1.0645x slower
   is-object-or-null-tricky-internal-function   
                                                      7.3351+-0.1328     !      7.6739+-0.0982        ! definitely 1.0462x slower
   is-string-fold-tricky                              3.9874+-0.0531     ?      3.9973+-0.1352        ?
   is-string-fold                                     2.6693+-0.1171            2.6491+-0.0238        
   is-undefined-fold-tricky                           3.3232+-0.0672     ?      3.3423+-0.0531        ?
   is-undefined-fold                                  2.6231+-0.0520     ?      2.6419+-0.0519        ?
   JSONP-negative-0                                   0.2604+-0.0330            0.2484+-0.0081          might be 1.0484x faster
   large-int-captured                                 4.2973+-0.3169            4.2202+-0.2129          might be 1.0183x faster
   large-int-neg                                     14.7954+-0.8013           13.9581+-0.1734          might be 1.0600x faster
   large-int                                         13.4844+-0.7982           13.0754+-0.3290          might be 1.0313x faster
   load-varargs-elimination                          20.6490+-0.6290           20.3594+-0.8424          might be 1.0142x faster
   logical-not-weird-types                            2.9992+-0.0327     ?      3.0846+-0.1091        ? might be 1.0285x slower
   logical-not                                        4.4164+-0.1109            4.3474+-0.1050          might be 1.0159x faster
   lots-of-fields                                     9.5087+-0.5004            9.4936+-0.5264        
   make-indexed-storage                               2.7808+-0.2482            2.7718+-0.2194        
   make-rope-cse                                      3.9319+-0.4784            3.5344+-0.0372          might be 1.1125x faster
   marsaglia-larger-ints                             32.9067+-1.5389           32.3362+-1.6979          might be 1.0176x faster
   marsaglia-osr-entry                               21.2575+-0.5225     ?     21.3472+-0.4127        ?
   math-with-out-of-bounds-array-values              21.5580+-0.3866     ?     21.6646+-1.2230        ?
   max-boolean                                        2.6102+-0.1065     ?      2.6421+-0.0489        ? might be 1.0122x slower
   method-on-number                                  15.4083+-0.4428     ?     15.5264+-0.4749        ?
   min-boolean                                        2.5866+-0.0313     ?      2.6546+-0.0955        ? might be 1.0263x slower
   minus-boolean-double                               3.1091+-0.0569     ?      3.1620+-0.1375        ? might be 1.0170x slower
   minus-boolean                                      2.3064+-0.0218     ?      2.3627+-0.0922        ? might be 1.0244x slower
   misc-strict-eq                                    28.2740+-0.8269     ?     29.5530+-1.9318        ? might be 1.0452x slower
   mod-boolean-double                                11.1554+-0.3663           11.1373+-0.1828        
   mod-boolean                                        8.3034+-0.1700     ?      8.4493+-0.3886        ? might be 1.0176x slower
   mul-boolean-double                                 3.6550+-0.1095            3.5960+-0.0607          might be 1.0164x faster
   mul-boolean                                        2.8509+-0.1725            2.8058+-0.0727          might be 1.0161x faster
   neg-boolean                                        3.0974+-0.0596            3.0641+-0.0341          might be 1.0108x faster
   negative-zero-divide                               0.3608+-0.0335            0.3603+-0.0762        
   negative-zero-modulo                               0.3303+-0.0161     ?      0.3787+-0.0794        ? might be 1.1465x slower
   negative-zero-negate                               0.3476+-0.0566            0.3187+-0.0024          might be 1.0904x faster
   nested-function-parsing                           44.3195+-0.5547     ?     44.3249+-0.6894        ?
   new-array-buffer-dead                             86.6065+-0.3025     ?     87.4377+-0.9717        ?
   new-array-buffer-push                              6.6640+-1.0166            6.1062+-0.1240          might be 1.0914x faster
   new-array-dead                                    15.1968+-1.0878           14.7859+-0.6634          might be 1.0278x faster
   new-array-push                                     3.4681+-0.1090            3.3714+-0.0456          might be 1.0287x faster
   no-inline-constructor                             30.9118+-0.7755     ?     30.9155+-0.7164        ?
   number-test                                        3.0039+-0.0382     ?      3.0677+-0.0561        ? might be 1.0213x slower
   object-closure-call                                4.8431+-0.0256     ?      4.8966+-0.1231        ? might be 1.0111x slower
   object-get-own-property-symbols-on-large-array   
                                                      4.2254+-0.2372            4.0466+-0.1704          might be 1.0442x faster
   object-test                                        2.6862+-0.0459     ?      2.7699+-0.0484        ? might be 1.0311x slower
   obvious-sink-pathology-taken                      97.5746+-0.6149     ?     97.6515+-0.6388        ?
   obvious-sink-pathology                            92.8748+-1.1597     ?     92.9670+-1.2502        ?
   obviously-elidable-new-object                     28.5708+-0.6228     ?     28.8054+-0.7981        ?
   plus-boolean-arith                                 2.3924+-0.1264     ?      2.4629+-0.1670        ? might be 1.0294x slower
   plus-boolean-double                                3.1053+-0.0514            3.0969+-0.0473        
   plus-boolean                                       2.5821+-0.1540            2.5259+-0.0501          might be 1.0222x faster
   poly-chain-access-different-prototypes-simple   
                                                      3.1890+-0.0427            3.1623+-0.0310        
   poly-chain-access-different-prototypes             3.2613+-0.0501     ?      3.2665+-0.0944        ?
   poly-chain-access-simpler                          3.2534+-0.1208     ?      3.6232+-0.8147        ? might be 1.1137x slower
   poly-chain-access                                  3.2681+-0.0795     ?      3.3290+-0.1545        ? might be 1.0186x slower
   poly-stricteq                                     48.8830+-0.1800     ?     48.9623+-0.4356        ?
   polymorphic-array-call                             1.3306+-0.0548            1.2556+-0.0601          might be 1.0597x faster
   polymorphic-get-by-id                              2.8398+-0.0416     ?      2.8664+-0.0854        ?
   polymorphic-put-by-id                             28.2633+-1.4489           28.1894+-1.7182        
   polymorphic-put-by-val-with-string                28.5860+-0.4248     ?     29.0419+-1.6790        ? might be 1.0160x slower
   polymorphic-put-by-val-with-symbol                29.0743+-0.5832           29.0348+-1.0786        
   polymorphic-structure                             14.0579+-3.0330           12.4952+-0.4418          might be 1.1251x faster
   polyvariant-monomorphic-get-by-id                  7.5096+-1.2529            6.3908+-0.8487          might be 1.1751x faster
   proto-getter-access                                8.3421+-0.2234     ?      8.3998+-0.3137        ?
   prototype-access-with-mutating-prototype           5.4835+-0.2098            5.4172+-0.2232          might be 1.0122x faster
   put-by-id-replace-and-transition                   7.6007+-0.4711     ?      7.7330+-0.7646        ? might be 1.0174x slower
   put-by-id-slightly-polymorphic                     2.6693+-0.0202     ?      2.7060+-0.0292        ? might be 1.0138x slower
   put-by-id                                         10.1565+-0.5094            9.8857+-0.4965          might be 1.0274x faster
   put-by-val-direct                                  0.3335+-0.0106     ?      0.3800+-0.0618        ? might be 1.1396x slower
   put-by-val-large-index-blank-indexing-type   
                                                      5.2738+-0.2928     ?      5.7930+-0.8042        ? might be 1.0985x slower
   put-by-val-machine-int                             2.5371+-0.1509            2.4472+-0.0720          might be 1.0367x faster
   put-by-val-with-string-replace-and-transition   
                                                     10.4202+-0.2969     ?     10.7699+-0.4777        ? might be 1.0336x slower
   put-by-val-with-string-slightly-polymorphic   
                                                      2.9110+-0.0316     ?      3.0379+-0.1894        ? might be 1.0436x slower
   put-by-val-with-string                            10.1071+-0.3918     ?     10.5052+-0.6443        ? might be 1.0394x slower
   put-by-val-with-symbol-replace-and-transition   
                                                     11.8419+-0.4317           11.7856+-0.4352        
   put-by-val-with-symbol-slightly-polymorphic   
                                                      3.2135+-0.1890     ?      3.3250+-0.2135        ? might be 1.0347x slower
   put-by-val-with-symbol                            10.2437+-0.4533     ?     10.9224+-0.8477        ? might be 1.0663x slower
   rare-osr-exit-on-local                            13.8951+-0.4876           13.3469+-0.0870          might be 1.0411x faster
   raytrace-with-empty-try-catch                      5.1696+-0.1871            5.0948+-0.0404          might be 1.0147x faster
   raytrace-with-try-catch                            9.9105+-0.1863     ?      9.9575+-0.2816        ?
   register-pressure-from-osr                        16.4480+-0.2775     ?     16.5224+-0.4097        ?
   repeat-multi-get-by-offset                        20.4210+-0.7246     ?     20.4836+-0.5127        ?
   richards-empty-try-catch                          74.6801+-0.7977           74.1591+-1.0262        
   richards-try-catch                               241.4964+-2.1454          240.9025+-1.9898        
   setter-prototype                                   7.8885+-0.3935     ?      7.9046+-0.3373        ?
   setter                                             5.6474+-0.4507            5.5835+-0.6598          might be 1.0114x faster
   simple-activation-demo                            24.1150+-0.4533           24.0153+-0.4877        
   simple-getter-access                              10.7463+-0.4197     ?     10.8721+-0.3903        ? might be 1.0117x slower
   simple-poly-call-nested                            9.1348+-0.3713            8.5956+-0.6041          might be 1.0627x faster
   simple-poly-call                                   1.2584+-0.0267     ?      1.3909+-0.1834        ? might be 1.1053x slower
   sin-boolean                                       20.2623+-2.0284           18.7060+-1.6909          might be 1.0832x faster
   singleton-scope                                   55.4791+-0.5753     ?     56.1047+-0.7566        ? might be 1.0113x slower
   sink-function                                     10.0931+-0.5567     ?     10.4807+-1.2054        ? might be 1.0384x slower
   sink-huge-activation                              16.9272+-0.9447           16.8241+-0.9365        
   sinkable-new-object-dag                           53.3750+-0.5963     ?     55.5417+-2.7107        ? might be 1.0406x slower
   sinkable-new-object-taken                         44.1523+-0.8778           43.8390+-1.0670        
   sinkable-new-object                               29.5261+-0.3168     ?     29.6653+-0.6363        ?
   slow-array-profile-convergence                     2.5245+-0.1113            2.5113+-0.0671        
   slow-convergence                                   2.3451+-0.0537            2.3012+-0.0371          might be 1.0191x faster
   slow-ternaries                                    17.9653+-1.0141           17.4788+-0.4184          might be 1.0278x faster
   sorting-benchmark                                 16.9148+-0.7482     ?     17.0519+-0.8636        ?
   sparse-conditional                                 1.1746+-0.0539     ?      1.1852+-0.1140        ?
   splice-to-remove                                  12.2147+-0.4477     ?     12.3866+-0.4591        ? might be 1.0141x slower
   string-char-code-at                               12.9472+-0.1404     ?     13.1440+-0.2408        ? might be 1.0152x slower
   string-concat-object                               2.2061+-0.1055            2.1649+-0.0612          might be 1.0190x faster
   string-concat-pair-object                          2.1234+-0.1601            2.1231+-0.0818        
   string-concat-pair-simple                          9.3406+-0.4451     ?      9.7102+-0.9358        ? might be 1.0396x slower
   string-concat-simple                               9.7193+-0.4496     ?     10.5088+-2.1247        ? might be 1.0812x slower
   string-cons-repeat                                 6.3597+-0.1519     ?      6.4084+-0.1793        ?
   string-cons-tower                                  6.6364+-0.1780            6.5843+-0.3248        
   string-equality                                   15.0114+-0.3327           14.9855+-0.3474        
   string-get-by-val-big-char                         6.5044+-0.0943            6.4815+-0.0621        
   string-get-by-val-out-of-bounds-insane             3.2540+-0.3156            3.0799+-0.1214          might be 1.0565x faster
   string-get-by-val-out-of-bounds                    4.1152+-0.3561            3.9595+-0.2400          might be 1.0393x faster
   string-get-by-val                                  2.8127+-0.0427            2.7889+-0.0166        
   string-hash                                        1.9093+-0.1614            1.8412+-0.0526          might be 1.0370x faster
   string-long-ident-equality                        13.5992+-0.9678           13.2061+-0.2427          might be 1.0298x faster
   string-out-of-bounds                              10.2600+-0.6198           10.2426+-0.5000        
   string-repeat-arith                               27.1294+-1.9850           26.4701+-0.5211          might be 1.0249x faster
   string-sub                                        53.9452+-0.5260     ?     54.1053+-0.5833        ?
   string-test                                        2.8825+-0.0913     ?      2.8994+-0.1128        ?
   string-var-equality                               26.5474+-0.6164     ?     27.1690+-1.5696        ? might be 1.0234x slower
   structure-hoist-over-transitions                   2.3041+-0.0899     ?      2.3405+-0.1026        ? might be 1.0158x slower
   substring-concat-weird                            35.9076+-0.7507     ?     36.0751+-1.8344        ?
   substring-concat                                  39.2244+-0.9722     ?     39.3648+-0.9156        ?
   substring                                         44.4754+-0.6751     ?     44.7293+-1.0340        ?
   switch-char-constant                               2.6519+-0.0535     ?      2.7227+-0.0715        ? might be 1.0267x slower
   switch-char                                        5.9419+-0.9988     ?      6.2082+-0.9391        ? might be 1.0448x slower
   switch-constant                                    8.3178+-1.0719     ?      8.3855+-0.8487        ?
   switch-string-basic-big-var                       14.5553+-0.4162           14.2636+-0.1317          might be 1.0204x faster
   switch-string-basic-big                           15.0913+-0.3046     ?     15.1725+-0.1661        ?
   switch-string-basic-var                           13.4137+-0.2847     ?     13.5043+-0.4941        ?
   switch-string-basic                               12.5241+-0.5112           12.5144+-0.3866        
   switch-string-big-length-tower-var                18.2237+-0.8138           18.1006+-0.4993        
   switch-string-length-tower-var                    13.2668+-0.3905           13.2432+-0.3911        
   switch-string-length-tower                        11.2128+-0.1285     ?     11.3698+-0.0545        ? might be 1.0140x slower
   switch-string-short                               11.3896+-0.2214           11.2258+-0.1083          might be 1.0146x faster
   switch                                            11.0766+-0.8017     ?     11.1951+-0.8196        ? might be 1.0107x slower
   tear-off-arguments-simple                          3.1184+-0.1647     ?      3.4198+-0.7459        ? might be 1.0967x slower
   tear-off-arguments                                 4.2604+-0.4418            3.9475+-0.0788          might be 1.0793x faster
   temporal-structure                                12.2130+-0.7872           11.6562+-0.1047          might be 1.0478x faster
   to-int32-boolean                                  12.8454+-0.6960           12.5456+-0.2090          might be 1.0239x faster
   try-catch-get-by-val-cloned-arguments              9.6792+-0.4189            9.5939+-0.2517        
   try-catch-get-by-val-direct-arguments              2.2733+-0.2505            2.2324+-0.2693          might be 1.0183x faster
   try-catch-get-by-val-scoped-arguments              4.9942+-0.2145            4.6955+-0.1419          might be 1.0636x faster
   typed-array-get-set-by-val-profiling              27.4358+-0.5585           26.8394+-0.4618          might be 1.0222x faster
   undefined-property-access                        216.0598+-0.6995     ?    216.4635+-1.4428        ?
   undefined-test                                     2.9856+-0.0898     ?      3.0613+-0.0809        ? might be 1.0254x slower
   unprofiled-licm                                    9.6752+-0.3929            9.3211+-0.4896          might be 1.0380x faster
   v8-raytrace-with-empty-try-catch                  49.2170+-0.4580     ?     49.5795+-1.9464        ?
   v8-raytrace-with-try-catch                        61.7539+-0.5760           61.5331+-0.7321        
   varargs-call                                      12.7864+-0.1292     ?     13.0188+-0.2522        ? might be 1.0182x slower
   varargs-construct-inline                          21.8156+-0.8146           21.5449+-0.6859          might be 1.0126x faster
   varargs-construct                                 20.5729+-0.6290     ?     21.1206+-0.7438        ? might be 1.0266x slower
   varargs-inline                                     8.6338+-0.1354     ?      8.6540+-0.2113        ?
   varargs-strict-mode                                9.6808+-0.3116            9.6558+-0.3358        
   varargs                                            9.4506+-0.1032            9.4299+-0.0304        
   weird-inlining-const-prop                          2.1017+-0.0881     ?      2.1657+-0.1970        ? might be 1.0305x slower

   <geometric>                                        7.9726+-0.0181            7.9709+-0.0132          might be 1.0002x faster

                                                        TipOfTree             ParallelHelperPool                                
AsmBench:
   bigfib.cpp                                       442.1815+-6.3538     ?    453.0791+-13.7661       ? might be 1.0246x slower
   cray.c                                           392.3013+-3.3147          392.1994+-4.0855        
   dry.c                                            417.9313+-7.8421          415.9226+-7.4424        
   FloatMM.c                                        681.3876+-3.5137     ?    681.4820+-2.3939        ?
   gcc-loops.cpp                                   3408.7941+-16.8878    ?   3415.2036+-33.6737       ?
   n-body.c                                         819.6172+-0.6425     ?    862.8707+-105.3475      ? might be 1.0528x slower
   Quicksort.c                                      404.9588+-4.2404          403.1143+-4.7886        
   stepanov_container.cpp                          3476.1975+-33.0324        3468.5560+-26.9007       
   Towers.c                                         233.8833+-2.2038          232.3964+-1.1485        

   <geometric>                                      709.2614+-2.3591     ?    713.4667+-7.4889        ? might be 1.0059x slower

                                                        TipOfTree             ParallelHelperPool                                
CompressionBench:
   huffman                                           59.1054+-1.4146     ?     61.6995+-6.2506        ? might be 1.0439x slower
   arithmetic-simple                                272.2116+-1.4485     ^    269.0524+-0.4276        ^ definitely 1.0117x faster
   arithmetic-precise                               241.2559+-1.2868     ?    241.3825+-0.7072        ?
   arithmetic-complex-precise                       243.7722+-2.3799          242.4104+-1.9398        
   arithmetic-precise-order-0                       278.2844+-1.9047          278.0792+-1.0005        
   arithmetic-precise-order-1                       297.6564+-2.1971     ?    299.7858+-5.4466        ?
   arithmetic-precise-order-2                       341.1390+-3.0461     ?    342.3434+-2.3227        ?
   arithmetic-simple-order-1                        317.5185+-2.1918          317.3589+-1.2325        
   arithmetic-simple-order-2                        369.9989+-5.6354     ?    370.6637+-5.8619        ?
   lz-string                                        310.2207+-5.0437          306.0996+-1.4824          might be 1.0135x faster

   <geometric>                                      250.5231+-0.8076     ?    251.0277+-2.0638        ? might be 1.0020x slower

                                                        TipOfTree             ParallelHelperPool                                
Geomean of preferred means:
   <scaled-result>                                   49.5484+-0.1158     ?     49.6636+-0.1532        ? might be 1.0023x slower
Comment 7 Filip Pizlo 2015-09-24 18:00:20 PDT
Created attachment 261907 [details]
getter better

It passes tests. It includes the build changes for VS/CMake. It performs well.

Now I just need to check that it works with two VMs.
Comment 8 WebKit Commit Bot 2015-09-24 18:03:04 PDT
Attachment 261907 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/heap/HeapHelperPool.cpp:30:  Bad include order. Mixing system and custom headers.  [build/include_order] [4]
ERROR: Source/WTF/wtf/ParallelHelperPool.cpp:181:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/runtime/CodeCache.h:38:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Source/JavaScriptCore/heap/Heap.cpp:550:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/heap/Heap.cpp:635:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/runtime/VM.h:68:  Alphabetical sorting problem.  [build/include_order] [4]
Total errors found: 6 in 23 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 9 Filip Pizlo 2015-09-24 18:50:06 PDT
Created attachment 261910 [details]
the patch
Comment 10 WebKit Commit Bot 2015-09-24 18:53:02 PDT
Attachment 261910 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/API/tests/testapi.mm:495:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/API/tests/testapi.mm:496:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/API/tests/testapi.mm:497:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/API/tests/testapi.mm:498:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/API/tests/testapi.mm:499:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/API/tests/testapi.mm:500:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/heap/HeapHelperPool.cpp:30:  Bad include order. Mixing system and custom headers.  [build/include_order] [4]
ERROR: Source/WTF/wtf/ParallelHelperPool.cpp:181:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/runtime/CodeCache.h:38:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Source/JavaScriptCore/heap/Heap.cpp:550:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/heap/Heap.cpp:635:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/runtime/VM.h:68:  Alphabetical sorting problem.  [build/include_order] [4]
Total errors found: 12 in 24 files


Comment 11 Alexey Proskuryakov 2015-09-24 18:59:26 PDT
Are there any tests to unskip with this? I think that some worker tests may have been skipped since GC started to use many threads.
Comment 12 Geoffrey Garen 2015-09-25 11:11:32 PDT
Comment on attachment 261910 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=261910&action=review

> Source/JavaScriptCore/heap/Heap.cpp:604
> +    m_helperClient.finish();

Shouldn't we doSomeHelping at some point? It seems like the main thread will not participate in marking unless we do some helping.

> Source/WTF/ChangeLog:39
> +        join after the other threads have already started. it's not advisable to make algorithmic

It's

> Source/WTF/wtf/ParallelHelperPool.cpp:56
> +void ParallelHelperClient::setTask(RefPtr<ParallelHelperTask> task)

I don't think the setter naming idiom works here. The post condition for a setter is that the value you set should now be visible in the member variable. But this function will clear the value you set before returning.

Let's call this runTask.

> Source/WTF/wtf/ParallelHelperPool.cpp:83
> +void ParallelHelperClient::runTaskInParallel(RefPtr<ParallelHelperTask> task)

Let's call this runTaskSynchronously. We don't need parallel in the function name since it's in the class name.

> Source/WTF/wtf/ParallelHelperPool.cpp:180
> +            "Parallel Helper Thread",

Let's namespace this since it will show up in process-global output. WTF? WebKit?

> Source/WTF/wtf/ParallelHelperPool.h:61
> +// run this task on potentially many threads. The pool may not run the task on any threads, of all

of => if

> Source/WTF/wtf/ParallelHelperPool.h:169
> +class ParallelHelperClient {

One class per file please.

> Source/WTF/wtf/ParallelHelperPool.h:170
> +    WTF_MAKE_NONCOPYABLE(ParallelHelperClient); WTF_MAKE_FAST_ALLOCATED;

Two lines please.

> Source/WTF/wtf/ParallelHelperPool.h:181
> +    void setTask(RefPtr<ParallelHelperTask>);
> +
> +    template<typename Functor>
> +    void setFunction(const Functor& functor)
> +    {
> +        setTask(createParallelHelperTask(functor));
> +    }

I think you can eliminate this task vs function distinction and the ParallelHelperTask and ParallelHelperTaskFunctor abstraction by using std::function.

> Source/WTF/wtf/ParallelHelperPool.h:218
> +class ParallelHelperPool : public ThreadSafeRefCounted<ParallelHelperPool> {

One class per file please.

"Helper" is a bit vague, as is "Helper client".

Maybe we can call these "ParallelTaskRunner (the pool) and ParallelTask (the 'client')". (Those names are currently sort of taken by the functor classes, but see my suggestion to remove those classes, or we could just rename them to ParallelTaskRunnable and ParallelTaskFunctor.)

> Source/WTF/wtf/ParallelHelperPool.h:224
> +    void addThreads(unsigned numThreads);
> +    void ensureThreads(unsigned numThreads);

Nobody seems to call these :(.

> Source/WTF/wtf/WeakRandom.h:100
> +        uint64_t cutoff = (static_cast<uint64_t>(UINT_MAX) + 1) / limit * limit;
> +        for (;;) {
> +            uint64_t value = getUint32();
> +            if (value >= cutoff)
> +                continue;
> +            return value % limit;
> +        }

I don't get this. What values are we trying to defend against? Should % always be safe as long as we're dealing with uint32?
Comment 13 Filip Pizlo 2015-09-25 11:49:54 PDT
(In reply to comment #12)
> Comment on attachment 261910 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=261910&action=review
> 
> > Source/JavaScriptCore/heap/Heap.cpp:604
> > +    m_helperClient.finish();
> 
> Shouldn't we doSomeHelping at some point? It seems like the main thread will
> not participate in marking unless we do some helping.

No, that would be wrong.  Parallel marking uses a different marking algorithm on the helper threads than on the main thread.  doSomeHelping() would run marking-in-helper instead of marking-on-main-thread.

Note that this is not a change in behavior from what we do on trunk.  Parallel GC has always used different code for the marking helpers.

> 
> > Source/WTF/ChangeLog:39
> > +        join after the other threads have already started. it's not advisable to make algorithmic
> 
> It's
> 
> > Source/WTF/wtf/ParallelHelperPool.cpp:56
> > +void ParallelHelperClient::setTask(RefPtr<ParallelHelperTask> task)
> 
> I don't think the setter naming idiom works here. The post condition for a
> setter is that the value you set should now be visible in the member
> variable. But this function will clear the value you set before returning.

No, it won't.  setTask() doesn't clear the task.

> 
> Let's call this runTask.

That would be wrong, because setTask() does not run the task.

I believe that setTask() is the correct name, since it just sets m_task and notifies the helper threads.

> 
> > Source/WTF/wtf/ParallelHelperPool.cpp:83
> > +void ParallelHelperClient::runTaskInParallel(RefPtr<ParallelHelperTask> task)
> 
> Let's call this runTaskSynchronously. We don't need parallel in the function
> name since it's in the class name.

I really don't like calling it runTaskSynchronously.  "Synchronously" implies that there is some synchronization being performed, and could easily confuse people into thinking that the task will only run in one thread at a time.  In fact, the task will experience parallelism - the same task instance will most likely be run from more than one thread.  I don't mind that "Parallel" is both in the method name and the class name, since it's a useful reminder.

Perhaps a better name is "runTask()", if you really think that the redundant use of "Parallel" should be avoided.

> 
> > Source/WTF/wtf/ParallelHelperPool.cpp:180
> > +            "Parallel Helper Thread",
> 
> Let's namespace this since it will show up in process-global output. WTF?
> WebKit?

Good point, I'll use "WTF Parallel Helper Thread".

> 
> > Source/WTF/wtf/ParallelHelperPool.h:61
> > +// run this task on potentially many threads. The pool may not run the task on any threads, of all
> 
> of => if
> 
> > Source/WTF/wtf/ParallelHelperPool.h:169
> > +class ParallelHelperClient {
> 
> One class per file please.

I'm not sure if that's a good idea.  This header is included from everything that includes Heap.h, which is basically everything in the whole project.  I've heard that build times are somewhat related to the number of headers that need to be included, and I've had requests in the past to coalesce headers that are included everywhere for the sake of WebCore build performance.

That's one of the reasons why I used one header file.  I also think that because of the tight coupling between these classes, one file simplifies things a bit.

> 
> > Source/WTF/wtf/ParallelHelperPool.h:170
> > +    WTF_MAKE_NONCOPYABLE(ParallelHelperClient); WTF_MAKE_FAST_ALLOCATED;
> 
> Two lines please.
> 
> > Source/WTF/wtf/ParallelHelperPool.h:181
> > +    void setTask(RefPtr<ParallelHelperTask>);
> > +
> > +    template<typename Functor>
> > +    void setFunction(const Functor& functor)
> > +    {
> > +        setTask(createParallelHelperTask(functor));
> > +    }
> 
> I think you can eliminate this task vs function distinction and the
> ParallelHelperTask and ParallelHelperTaskFunctor abstraction by using
> std::function.

I don't think that's a good idea, unless you are proposing wrapping std::function inside a ref-counted object, which is exactly what ParallelHelperTaskFunctor does.

Notice that a key part of the helper thread algorithm is: get a reference to the current task while holding the lock, then release the lock, then run the task.  If we were using std::function, the "getting a reference" would require doing a copy of the std::function's payload.  I don't like that, since it means that the helper thread worklist algorithm's performance would be related to the number of variables that you closed over.

Also, copying std::function's that have more than ~3 captured variables requires system malloc.  That would mean that we'd be calling system malloc while holding the single lock that protects the helper pool.  That seems like it will cause scalability problems in the future.
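To make the locking pattern described above concrete, here is a minimal sketch. It is not the code from the patch: SharedTask and Client are hypothetical names, std::shared_ptr stands in for WTF's RefPtr, and runCurrentTask() is an invented helper that mimics what one helper thread would do.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a ref-counted task. The closure is stored once,
// inside the heap-allocated task object, and is never copied afterwards.
struct SharedTask {
    std::function<void()> run;
};

// Hypothetical stand-in for the client that owns the current task.
class Client {
public:
    // Publishes a task; in this sketch it only stores the reference.
    void setTask(std::shared_ptr<SharedTask> task)
    {
        std::lock_guard<std::mutex> locker(m_lock);
        m_task = std::move(task);
    }

    // What a helper thread would do: take a reference while holding the
    // lock, release the lock, then run the task outside the lock.
    bool runCurrentTask()
    {
        std::shared_ptr<SharedTask> task;
        {
            std::lock_guard<std::mutex> locker(m_lock);
            task = m_task; // Reference-count bump only; the closure is not copied.
        }
        if (!task)
            return false;
        assert(task->run); // The closure must have been set.
        task->run();
        return true;
    }

private:
    std::mutex m_lock;
    std::shared_ptr<SharedTask> m_task;
};
```

The work done under the lock is a reference-count increment, so it does not depend on how many variables the closure captured. Storing a std::function directly in the client would instead require copying it under the lock to get the same lifetime guarantee.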

> 
> > Source/WTF/wtf/ParallelHelperPool.h:218
> > +class ParallelHelperPool : public ThreadSafeRefCounted<ParallelHelperPool> {
> 
> One class per file please.
> 
> "Helper" is a bit vague, as is "Helper client".
> 
> Maybe we can call these "ParallelTaskRunner (the pool) and ParallelTask (the
> 'client')". (Those names are currently sort of taken by the functor classes,
> but see my suggestion to remove those classes, or we could just rename them
> to ParallelTaskRunnable and ParallelTaskFunctor.)

I'm fine with maybe changing the names of things, but I'd like to preserve the performance guarantees by using a reference-counted functor rather than std::function.

> 
> > Source/WTF/wtf/ParallelHelperPool.h:224
> > +    void addThreads(unsigned numThreads);
> > +    void ensureThreads(unsigned numThreads);
> 
> Nobody seems to call these :(.

See heap/HeapHelperPool.cpp.  You're right that it only calls one of them.  I can remove the one that it doesn't call.

> 
> > Source/WTF/wtf/WeakRandom.h:100
> > +        uint64_t cutoff = (static_cast<uint64_t>(UINT_MAX) + 1) / limit * limit;
> > +        for (;;) {
> > +            uint64_t value = getUint32();
> > +            if (value >= cutoff)
> > +                continue;
> > +            return value % limit;
> > +        }
> 
> I don't get this. What values are we trying to defend against? Should %
> always be safe as long as we're dealing with uint32?

The % is "safe" but it would lead to bias.  For example, if you just did random.getUint32() % 3, then you'd be more likely to get 0 than 2, because 2^32 is not a multiple of 3.  One easy way to defend against this is to reject return values from getUint32() that are at or above the cutoff, which is the largest multiple of the limit that fits in the 2^32 possible values.  That's what I'm doing here.  It doesn't matter *too* much but I figured that if I'm putting WeakRandom into WTF and adding a getUint32(limit) API, I might as well do it the right way.
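The bias argument can be checked with a small standalone sketch. This is not the WeakRandom code itself: boundedUint32 and the nextUint32 parameter are hypothetical names, and the generator is passed in so the rejection step can be exercised deterministically. For limit = 3, 2^32 = 3 * 1431655765 + 1, so a plain modulo would map one extra input value to 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the loop quoted from WeakRandom.h.
// nextUint32 is any callable that returns uniformly distributed 32-bit values.
template<typename Gen>
uint32_t boundedUint32(Gen& nextUint32, uint32_t limit)
{
    assert(limit > 0);
    // Number of distinct values a 32-bit generator can produce (2^32).
    const uint64_t range = static_cast<uint64_t>(UINT32_MAX) + 1;
    // Largest multiple of limit that does not exceed range.
    const uint64_t cutoff = range / limit * limit;
    for (;;) {
        uint64_t value = nextUint32();
        if (value >= cutoff)
            continue; // Reject the incomplete bucket at the top of the range.
        return static_cast<uint32_t>(value % limit);
    }
}
```

With limit = 3 the cutoff is 4294967295, so only the single value UINT32_MAX is rejected and each of 0, 1 and 2 is produced by exactly 1431655765 inputs. When the limit is a power of two, the cutoff equals 2^32 and nothing is ever rejected.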
Comment 14 Filip Pizlo 2015-09-25 12:08:20 PDT
Geoff and I talked in person.  I'll move ParallelHelperTask into its own header, called Task.  I'll make some of the other fixes.

(In reply to comment #13)
> (In reply to comment #12)
> > Comment on attachment 261910 [details]
> > the patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=261910&action=review
> > 
> > > Source/JavaScriptCore/heap/Heap.cpp:604
> > > +    m_helperClient.finish();
> > 
> > Shouldn't we doSomeHelping at some point? It seems like the main thread will
> > not participate in marking unless we do some helping.
> 
> No, that would be wrong.  Parallel marking uses a different marking
> algorithm on the helper threads than on the main thread.  doSomeHelping()
> would run marking-in-helper instead of marking-on-main-thread.
> 
> Note that this is not a change in behavior from what we do on trunk. 
> Parallel GC has always used different code for the marking helpers.

We'll keep this the same.

> 
> > 
> > > Source/WTF/ChangeLog:39
> > > +        join after the other threads have already started. it's not advisable to make algorithmic
> > 
> > It's

I'll fix this.

> > 
> > > Source/WTF/wtf/ParallelHelperPool.cpp:56
> > > +void ParallelHelperClient::setTask(RefPtr<ParallelHelperTask> task)
> > 
> > I don't think the setter naming idiom works here. The post condition for a
> > setter is that the value you set should now be visible in the member
> > variable. But this function will clear the value you set before returning.
> 
> No, it won't.  setTask() doesn't clear the task.
> 
> > 
> > Let's call this runTask.
> 
> That would be wrong, because setTask() does not run the task.
> 
> I believe that setTask() is the correct name, since it just sets m_task and
> notifies the helper threads.
> 

We'll keep this the same.

> > 
> > > Source/WTF/wtf/ParallelHelperPool.cpp:83
> > > +void ParallelHelperClient::runTaskInParallel(RefPtr<ParallelHelperTask> task)
> > 
> > Let's call this runTaskSynchronously. We don't need parallel in the function
> > name since it's in the class name.
> 
> I really don't like calling it runTaskSynchronously.  "Synchronously"
> implies that there is some synchronization being performed, and could easily
> confuse people into thinking that the task will only run in one thread at a
> time.  In fact, the task will experience parallelism - the same task
> instance will most likely be run from more than one thread.  I don't mind
> that "Parallel" is both in the method name and the class name, since it's a
> useful reminder.
> 
> Perhaps a better name is "runTask()", if you really think that the redundant
> use of "Parallel" should be avoided.
> 

We'll keep this the same.

> > 
> > > Source/WTF/wtf/ParallelHelperPool.cpp:180
> > > +            "Parallel Helper Thread",
> > 
> > Let's namespace this since it will show up in process-global output. WTF?
> > WebKit?
> 
> Good point, I'll use "WTF Parallel Helper Thread".
> 

I'll fix this.

> > 
> > > Source/WTF/wtf/ParallelHelperPool.h:61
> > > +// run this task on potentially many threads. The pool may not run the task on any threads, of all
> > 
> > of => if

I'll fix this.

> > 
> > > Source/WTF/wtf/ParallelHelperPool.h:169
> > > +class ParallelHelperClient {
> > 
> > One class per file please.
> 
> I'm not sure if that's a good idea.  This header is included from everything
> that includes Heap.h, which is basically everything in the whole project. 
> I've heard that build times are somewhat related to the number of headers
> that need to be included, and I've had requests in the past to coalesce
> headers that are included everywhere for the sake of WebCore build
> performance.
> 
> That's one of the reasons why I used one header file.  I also think that
> because of the tight coupling between these classes, one file simplifies
> things a bit.

I'll move task into its own header.

> 
> > 
> > > Source/WTF/wtf/ParallelHelperPool.h:170
> > > +    WTF_MAKE_NONCOPYABLE(ParallelHelperClient); WTF_MAKE_FAST_ALLOCATED;
> > 
> > Two lines please.

I'll fix this.

> > 
> > > Source/WTF/wtf/ParallelHelperPool.h:181
> > > +    void setTask(RefPtr<ParallelHelperTask>);
> > > +
> > > +    template<typename Functor>
> > > +    void setFunction(const Functor& functor)
> > > +    {
> > > +        setTask(createParallelHelperTask(functor));
> > > +    }
> > 
> > I think you can eliminate this task vs function distinction and the
> > ParallelHelperTask and ParallelHelperTaskFunctor abstraction by using
> > std::function.
> 
> I don't think that's a good idea, unless you are proposing wrapping
> std::function inside a ref-counted object, which is exactly what
> ParallelHelperTaskFunctor does.
> 
> Notice that a key part of the helper thread algorithm is: get a reference to
> the current task while holding the lock, then release the lock, then run the
> task.  If we were using std::function, the "getting a reference" would
> require doing a copy of the std::function's payload.  I don't like that,
> since it means that the helper thread worklist algorithm's performance would
> be related to the number of variables that you closed over.
> 
> Also, copying std::function's that have more than ~3 captured variables
> requires system malloc.  That would mean that we'd be calling system malloc
> while holding the single lock that protects the helper pool.  That seems
> like it will cause scalability problems in the future.

We'll keep this the same.

> 
> > 
> > > Source/WTF/wtf/ParallelHelperPool.h:218
> > > +class ParallelHelperPool : public ThreadSafeRefCounted<ParallelHelperPool> {
> > 
> > One class per file please.
> > 
> > "Helper" is a bit vague, as is "Helper client".
> > 
> > Maybe we can call these "ParallelTaskRunner (the pool) and ParallelTask (the
> > 'client')". (Those names are currently sort of taken by the functor classes,
> > but see my suggestion to remove those classes, or we could just rename them
> > to ParallelTaskRunnable and ParallelTaskFunctor.)
> 
> I'm fine with maybe changing the names of things, but I'd like to preserve
> the performance guarantees by using a reference-counted functor rather than
> std::function.
> 

We'll keep this the same.

> > 
> > > Source/WTF/wtf/ParallelHelperPool.h:224
> > > +    void addThreads(unsigned numThreads);
> > > +    void ensureThreads(unsigned numThreads);
> > 
> > Nobody seems to call these :(.
> 
> See heap/HeapHelperPool.cpp.  You're right that it only calls one of them. 
> I can remove the one that it doesn't call.
> 

I'll remove addThreads().

> > 
> > > Source/WTF/wtf/WeakRandom.h:100
> > > +        uint64_t cutoff = (static_cast<uint64_t>(UINT_MAX) + 1) / limit * limit;
> > > +        for (;;) {
> > > +            uint64_t value = getUint32();
> > > +            if (value >= cutoff)
> > > +                continue;
> > > +            return value % limit;
> > > +        }
> > 
> > I don't get this. What values are we trying to defend against? Should %
> > always be safe as long as we're dealing with uint32?
> 
> The % is "safe" but it would lead to bias.  For example, if you just did
> random.getUint32() % 3, then you'd be more likely to get 0 than 2, because
> 2^32 is not a multiple of 3.  One easy way to defend against this is to
> reject return values from getUint32() that are at or above the cutoff,
> which is the largest multiple of the limit that fits in the 2^32 possible
> values.  That's what I'm doing here.  It doesn't matter *too* much but I
> figured that if I'm putting WeakRandom into WTF and adding a
> getUint32(limit) API, I might as well do it the right way.

We'll keep this the same.
Comment 15 Filip Pizlo 2015-09-25 13:08:56 PDT
Created attachment 261934 [details]
patch for landing
Comment 16 WebKit Commit Bot 2015-09-25 13:32:34 PDT
Attachment 261934 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/API/tests/testapi.mm:495:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/API/tests/testapi.mm:496:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/API/tests/testapi.mm:497:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/API/tests/testapi.mm:498:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/API/tests/testapi.mm:499:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/API/tests/testapi.mm:500:  Weird number of spaces at line-start.  Are you using a 4-space indent?  [whitespace/indent] [3]
ERROR: Source/JavaScriptCore/heap/HeapHelperPool.cpp:30:  Bad include order. Mixing system and custom headers.  [build/include_order] [4]
ERROR: Source/WTF/wtf/ParallelHelperPool.cpp:173:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/runtime/CodeCache.h:38:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Source/JavaScriptCore/heap/Heap.cpp:550:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/heap/Heap.cpp:635:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/runtime/VM.h:68:  Alphabetical sorting problem.  [build/include_order] [4]
Total errors found: 12 in 25 files


Comment 17 Filip Pizlo 2015-09-26 11:08:11 PDT
Landed in http://trac.webkit.org/changeset/190267
Comment 19 Filip Pizlo 2015-09-26 11:38:34 PDT
(In reply to comment #18)
> This broke Windows builds:
> https://build.webkit.org/builders/Apple%20Win%20Debug%20%28Build%29/builds/
> 91251/steps/compile-webkit/logs/stdio

Fixing it.
Comment 20 Filip Pizlo 2015-09-26 11:41:59 PDT
(In reply to comment #19)
> (In reply to comment #18)
> > This broke Windows builds:
> > https://build.webkit.org/builders/Apple%20Win%20Debug%20%28Build%29/builds/
> > 91251/steps/compile-webkit/logs/stdio
> 
> Fixing it.

Fixed in http://trac.webkit.org/changeset/190268
Comment 21 Ryosuke Niwa 2015-09-26 12:04:52 PDT
Now lots of layout tests are crashing in Debug builds:
https://build.webkit.org/results/Apple%20Mavericks%20Debug%20WK1%20(Tests)/r190267%20(15885)/results.html

e.g.
stderr:
ASSERTION FAILED: isMainThreadOrGCThread()
/Volumes/Data/slave/mavericks-debug/build/Source/WebCore/dom/ShadowRoot.h(134) : WebCore::ContainerNode *WebCore::Node::parentOrShadowHostNode() const
1   0x109b75830 WTFCrash
2   0x10e461995 WebCore::Node::parentOrShadowHostNode() const
3   0x10eff6e42 WebCore::root(WebCore::Node*)
4   0x10f1e8375 WebCore::root(WebCore::Node&)
5   0x10f35c9f9 WebCore::JSNode::visitAdditionalChildren(JSC::SlotVisitor&)
6   0x10f35af76 WebCore::JSNode::visitChildren(JSC::JSCell*, JSC::SlotVisitor&)
7   0x109a94688 JSC::visitChildren(JSC::SlotVisitor&, JSC::JSCell const*)
8   0x109a94558 JSC::SlotVisitor::drain()
9   0x109a94acc JSC::SlotVisitor::drainFromShared(JSC::SlotVisitor::SharedDrainMode)
10  0x109638e58 JSC::Heap::markRoots(double, void*, void*, int (&) [37])::$_0::operator()() const
11  0x109638c5c WTF::SharedTaskFunctor<JSC::Heap::markRoots(double, void*, void*, int (&) [37])::$_0>::run()
12  0x109bc3497 WTF::ParallelHelperClient::runTask(WTF::RefPtr<WTF::SharedTask>)
13  0x109bc3c52 WTF::ParallelHelperPool::helperThreadBody()
14  0x109bc4ca8 WTF::ParallelHelperPool::didMakeWorkAvailable(WTF::Locker<WTF::LockBase> const&)::$_0::operator()() const
15  0x109bc4c7c std::__1::__function::__func<WTF::ParallelHelperPool::didMakeWorkAvailable(WTF::Locker<WTF::LockBase> const&)::$_0, std::__1::allocator<WTF::ParallelHelperPool::didMakeWorkAvailable(WTF::Locker<WTF::LockBase> const&)::$_0>, void ()>::operator()()
16  0x10963b7da std::__1::function<void ()>::operator()() const
17  0x109be1c2e WTF::threadEntryPoint(void*)
18  0x109be3548 WTF::wtfThreadEntryPoint(void*)
19  0x7fff8b58e899 _pthread_body
20  0x7fff8b58e72a _pthread_struct_init
21  0x7fff8b592fc9 thread_start
Comment 22 Filip Pizlo 2015-09-26 12:58:28 PDT
(In reply to comment #21)

Looking...
Comment 23 Filip Pizlo 2015-09-26 13:01:41 PDT
(In reply to comment #22)

Fix on the way.
Comment 24 Filip Pizlo 2015-09-26 13:07:15 PDT
(In reply to comment #23)

Here is a fairly principled fix: https://bugs.webkit.org/show_bug.cgi?id=149582

It's got enough code that it should probably be reviewed.  If I don't see a review soon, I will land a simpler fix that just calls WTF::registerGCThread().
Comment 25 Filip Pizlo 2015-09-26 13:14:06 PDT
(In reply to comment #24)
> 
> Here is a fairly principled fix:
> https://bugs.webkit.org/show_bug.cgi?id=149582
> 
> It's got enough code that it should probably be reviewed.  If I don't see a
> review soon, I will land a simpler fix that just calls
> WTF::registerGCThread().

Landed the simpler fix in http://trac.webkit.org/changeset/190269
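
For readers following the assertion above: the trace shows ASSERTION FAILED: isMainThreadOrGCThread() firing on a ParallelHelperPool helper thread, and the simpler fix has that thread call WTF::registerGCThread(). The following is a minimal, self-contained sketch of that idea, not WTF's actual code. The namespace sketch, the thread_local flag, mainThreadID, and runOnHelper() are hypothetical names introduced only for illustration; only the names registerGCThread and isMainThreadOrGCThread come from the thread.

```cpp
#include <thread>

namespace sketch {

// Hypothetical stand-in for per-thread "I am a GC thread" state.
// Assumption: one flag per thread, false until the thread registers itself.
thread_local bool isGCThreadFlag = false;

// Assumption: this namespace-scope variable is initialized on the main thread
// during static initialization, so it records the main thread's id.
const std::thread::id mainThreadID = std::this_thread::get_id();

// Marks the calling thread as a GC thread.
void registerGCThread()
{
    isGCThreadFlag = true;
}

// True on the main thread, or on any thread that called registerGCThread().
bool isMainThreadOrGCThread()
{
    return isGCThreadFlag || std::this_thread::get_id() == mainThreadID;
}

// Hypothetical helper: spawns a thread that optionally registers itself,
// then reports what the check returned on that thread.
bool runOnHelper(bool registerFirst)
{
    bool result = false;
    std::thread helper([&result, registerFirst] {
        if (registerFirst)
            registerGCThread();
        result = isMainThreadOrGCThread();
    });
    helper.join(); // join() makes the write to result visible to the caller
    return result;
}

} // namespace sketch
```

In this sketch, runOnHelper(false) models a helper thread that never registered, so the check fails there as in the Debug layout tests, while runOnHelper(true) models the same thread after it registers itself as a GC thread before doing marking work.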