Bug 86024 - Enh: Hash Const JSString in Backing Stores to Save Memory
Summary: Enh: Hash Const JSString in Backing Stores to Save Memory
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Michael Saboff
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2012-05-09 15:28 PDT by Michael Saboff
Modified: 2012-07-12 16:12 PDT (History)
4 users (show)

See Also:


Attachments
Patch for Review (11.62 KB, patch)
2012-05-09 15:45 PDT, Michael Saboff
gustavo: commit-queue-
Details | Formatted Diff | Diff
Speculative build fix (11.65 KB, patch)
2012-05-09 17:33 PDT, Michael Saboff
fpizlo: review+
Details | Formatted Diff | Diff
Patch with fix for string hash ASSERT failures (9.85 KB, patch)
2012-06-29 09:40 PDT, Michael Saboff
ggaren: review-
Details | Formatted Diff | Diff
Updated patch with changes suggested by reviewer (13.36 KB, patch)
2012-07-02 16:00 PDT, Michael Saboff
oliver: review-
Details | Formatted Diff | Diff
Patch with suggested fixes (13.63 KB, patch)
2012-07-02 17:15 PDT, Michael Saboff
gyuyoung.kim: commit-queue-
Details | Formatted Diff | Diff
FFixed build issues. (13.64 KB, patch)
2012-07-02 17:37 PDT, Michael Saboff
oliver: review+
Details | Formatted Diff | Diff
Fixed a minor bug and improved new strings since last hash const logic (15.14 KB, patch)
2012-07-03 11:54 PDT, Michael Saboff
oliver: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Saboff 2012-05-09 15:28:49 PDT
JavaScript code often creates multiple strings that contain the same actual text. Poorly written JavaScript is especially prone to this. This enhancement is to coalesce multiple JSStrings that contain the same text into one common string. Given that JSStrings are immutable, this is possible.

The easiest path to start doing this is to hash const strings that are in the backing store for arrays and objects.
Comment 1 Michael Saboff 2012-05-09 15:45:14 PDT
Created attachment 141035 [details]
Patch for Review

Willing to consider other names for JSString::m_isHashConstSingleton.  This is some what descriptive of a string with the flag set.  Another name could be m_dontHashConstString which is descriptive of its purpose of the flag.
Comment 2 Gustavo Noronha (kov) 2012-05-09 15:55:03 PDT
Comment on attachment 141035 [details]
Patch for Review

Attachment 141035 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/12652843
Comment 3 Build Bot 2012-05-09 16:07:18 PDT
Comment on attachment 141035 [details]
Patch for Review

Attachment 141035 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/12652841
Comment 4 Gyuyoung Kim 2012-05-09 17:13:18 PDT
Comment on attachment 141035 [details]
Patch for Review

Attachment 141035 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/12652862
Comment 5 Michael Saboff 2012-05-09 17:33:40 PDT
Created attachment 141063 [details]
Speculative build fix

Enclosed call to MarkStackThreadSharedData::resetChildren() inside parallel GC #if.
Comment 6 Michael Saboff 2012-05-10 10:32:03 PDT
Committed r116659: <http://trac.webkit.org/changeset/116659>
Comment 7 Antti Koivisto 2012-05-11 05:53:30 PDT
This seem to be hitting asserts, bug 86203
Comment 8 Michael Saboff 2012-05-11 16:33:09 PDT
Rolling out change.
Comment 10 Michael Saboff 2012-06-26 12:27:16 PDT
<rdar://problem/11752030>
Comment 11 Michael Saboff 2012-06-26 17:22:23 PDT
(In reply to comment #10)
> <rdar://problem/11752030>

Actually the original is <rdar://problem/11410050>
Comment 12 Michael Saboff 2012-06-29 09:40:38 PDT
Created attachment 150197 [details]
Patch with fix for string hash ASSERT failures

Updated the code for the case where two marking threads are in the process of hash const'ing the same string.  Added a tryHashConstLock() to JSString.  When a thread fails to get the lock, it gives up hash con sting the string.  The winning thread will be able to hash const.  This eliminated several ASSERT failure modes when two threads try to hash a string at nearly the same time.
Comment 13 Geoffrey Garen 2012-06-29 11:37:20 PDT
What's the performance cost of doing this on every garbage collection? What's the cost when there are no strings in the heap, and what's the cost when the heap is 100% unique strings?

Is is possible to limit how often we do this string unique-ing?
Comment 14 Michael Saboff 2012-06-29 11:57:43 PDT
(In reply to comment #13)
> What's the performance cost of doing this on every garbage collection? What's the cost when there are no strings in the heap, and what's the cost when the heap is 100% unique strings?
> 
> Is is possible to limit how often we do this string unique-ing?

The performance appears to be neutral overall.  v8-splay with DRT looks slower, but using JSC shows it's faster.  See below.  The no string case is an additional value.isString() check that will fail.  

The current patch has a hash const kill bit so that strings aren't continuously const'ed.  This does mean that we can never get to 100% unique, but it reduces the ongoing const'ing cost.  We may want to change this in the future to a const count instead of a single bit to further improve uniqueness.

We could limit how often by using a new string since last const threshold or by const'ing in a separate thread that has its own timer.  

Benchmark report for SunSpider, V8, V8Real, Kraken, JSBench, JSRegress, and DSP on msaboff-pro (MacPro5,1).

VMs tested:
"Base" at /Volumes/Data/src/webkit.baseline/WebKitBuild/Release/DumpRenderTree (r121160)
"HashConst" at /Volumes/Data/src/webkit/WebKitBuild/Release/DumpRenderTree (r121160)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to
get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                                     Base                   HashConst                                     
SunSpider:
   3d-cube                                      8.6762+-0.6578     ?      8.6762+-0.6311        ?
   3d-morph                                     7.5697+-0.0574     ?      7.5991+-0.0497        ?
   3d-raytrace                                 14.1252+-0.5765           14.0391+-0.6123        
   access-binary-trees                          2.2856+-0.4548     ?      2.2994+-0.4653        ?
   access-fannkuch                              6.7956+-0.1100            6.7222+-0.0411          might be 1.0109x faster
   access-nbody                                 4.1792+-0.0387     ?      4.1858+-0.0307        ?
   access-nsieve                                3.2875+-0.0473            3.2794+-0.0554        
   bitops-3bit-bits-in-byte                     1.3716+-0.0110     ?      1.3736+-0.0088        ?
   bitops-bits-in-byte                          5.7174+-0.0531            5.7145+-0.0367        
   bitops-bitwise-and                           2.2786+-0.0278            2.2595+-0.0074        
   bitops-nsieve-bits                           3.2478+-0.0489            3.2254+-0.0389        
   controlflow-recursive                        2.5888+-0.0212     ?      2.6184+-0.0345        ? might be 1.0114x slower
   crypto-aes                                  10.0999+-0.7407           10.0631+-0.7233        
   crypto-md5                                   3.6877+-0.1098     ?      3.6952+-0.1158        ?
   crypto-sha1                                  3.1664+-0.0248     ?      3.1930+-0.0264        ?
   date-format-tofte                           13.9546+-1.3730           13.6739+-1.3103          might be 1.0205x faster
   date-format-xparb                           12.2760+-0.9493     ?     12.5290+-0.9808        ? might be 1.0206x slower
   math-cordic                                  4.1717+-0.0217     ?      4.1975+-0.0397        ?
   math-partial-sums                            9.0324+-0.0220     ?      9.0649+-0.0334        ?
   math-spectral-norm                           3.1152+-0.0256            3.0927+-0.0179        
   regexp-dna                                  10.5605+-0.3807           10.4890+-0.3909        
   string-base64                                5.7906+-0.6927     ?      5.8786+-0.6728        ? might be 1.0152x slower
   string-fasta                                 8.1582+-0.4101     ?      8.1639+-0.3869        ?
   string-tagcloud                             13.9253+-0.3059     ?     14.0120+-0.3686        ?
   string-unpack-code                          23.1382+-0.8883     ?     23.4292+-0.9274        ? might be 1.0126x slower
   string-validate-input                        8.2838+-0.6484     ?      8.4563+-0.7052        ? might be 1.0208x slower

   <arithmetic> *                               7.3648+-0.2383     ?      7.3820+-0.1900        ? might be 1.0023x slower
   <geometric>                                  5.8449+-0.1537     ?      5.8539+-0.1233        ? might be 1.0015x slower
   <harmonic>                                   4.5904+-0.0998     ?      4.5962+-0.0929        ? might be 1.0013x slower

                                                     Base                   HashConst                                     
V8:
   crypto                                      77.2960+-0.6339           77.2859+-0.7442        
   deltablue                                  138.4858+-2.4387          137.4746+-0.8139        
   earley-boyer                                88.4461+-1.3296     ?     88.9693+-1.4255        ?
   raytrace                                    61.7561+-2.8397     ?     62.7029+-2.9255        ? might be 1.0153x slower
   regexp                                      88.4419+-0.4629     ?     88.7320+-0.3942        ?
   richards                                   127.2255+-1.0954          126.9527+-0.6654        
   splay                                      105.2859+-10.6871    ?    109.6264+-10.3128       ? might be 1.0412x slower

   <arithmetic>                                98.1339+-1.3156     ?     98.8205+-1.2472        ? might be 1.0070x slower
   <geometric> *                               94.7000+-1.3117     ?     95.4741+-1.1706        ? might be 1.0082x slower
   <harmonic>                                  91.3628+-1.3391     ?     92.1949+-1.1579        ? might be 1.0091x slower

                                                     Base                   HashConst                                     
V8Real:
   encrypt                                     0.40804+-0.00034    ?     0.40864+-0.00038       ?
   decrypt                                     7.20393+-0.07552          7.15369+-0.01838       
   deltablue                          x2       0.68745+-0.00404    ?     0.68968+-0.00394       ?
   earley                                      2.15036+-0.01002          2.13805+-0.01112       
   boyer                                      13.95076+-0.09289    ?    14.11211+-0.14308       ? might be 1.0116x slower
   raytrace                           x2       5.13411+-0.04329          5.04956+-0.04231         might be 1.0167x faster
   regexp                             x2      26.85510+-0.21172         26.59170+-0.05988       
   richards                           x2       0.34534+-0.00371    ?     0.34651+-0.00489       ?
   splay                              x2       0.79769+-0.00748    !     0.85507+-0.00308       ! definitely 1.0719x slower

   <arithmetic>                                6.52518+-0.03322          6.49125+-0.01718         might be 1.0052x faster
   <geometric> *                               2.19447+-0.00635    !     2.21014+-0.00636       ! definitely 1.0071x slower
   <harmonic>                                  0.94595+-0.00452    !     0.95828+-0.00546       ! definitely 1.0130x slower

                                                     Base                   HashConst                                     
Kraken:
   ai-astar                                    803.357+-1.288            796.699+-10.284        
   audio-beat-detection                        207.808+-2.674      ?     208.747+-2.841         ?
   audio-dft                                   276.963+-1.532      ?     278.441+-1.212         ?
   audio-fft                                   127.160+-0.363            126.917+-0.222         
   audio-oscillator                            239.677+-0.294            239.299+-0.400         
   imaging-darkroom                            299.388+-1.299            296.911+-1.565         
   imaging-desaturate                          225.037+-0.617            224.603+-0.426         
   imaging-gaussian-blur                       438.760+-0.404      ?     439.820+-1.571         ?
   json-parse-financial                         67.232+-0.215      !      69.052+-0.945         ! definitely 1.0271x slower
   json-stringify-tinderbox                     84.603+-0.408      ?      84.641+-0.274         ?
   stanford-crypto-aes                          89.359+-0.954             88.687+-0.559         
   stanford-crypto-ccm                          99.134+-0.334      !      99.934+-0.428         ! definitely 1.0081x slower
   stanford-crypto-pbkdf2                      198.419+-1.137      ?     199.866+-1.133         ?
   stanford-crypto-sha256-iterative             96.168+-0.290             95.958+-0.320         

   <arithmetic> *                              232.362+-0.281            232.113+-0.852           might be 1.0011x faster
   <geometric>                                 180.849+-0.276      ?     181.141+-0.281         ? might be 1.0016x slower
   <harmonic>                                  147.084+-0.246      !     147.674+-0.215         ! definitely 1.0040x slower

                                                     Base                   HashConst                                     
JSBench:
   amazon                                      18.2500+-0.2874           18.2500+-0.2874        
   facebook                                    71.3333+-2.0692     ?     71.8333+-2.1304        ?
   google                                      95.4167+-1.3670     ?     96.4167+-1.6359        ? might be 1.0105x slower
   twitter                                     52.3333+-0.3128           52.2500+-0.2874        
   yahoo                                       22.0833+-0.1834     ?     22.3333+-0.3128        ? might be 1.0113x slower

   <arithmetic> *                              51.8833+-0.5266     ?     52.2167+-0.5153        ? might be 1.0064x slower
   <geometric>                                 42.7856+-0.3461     ?     43.0159+-0.3605        ? might be 1.0054x slower
   <harmonic>                                  34.7854+-0.2544     ?     34.9502+-0.3244        ? might be 1.0047x slower

                                                     Base                   HashConst                                     
JSRegress:
   adapt-to-double-divide                      72.6140+-0.0975     ?     72.6762+-0.1291        ?
   aliased-arguments-getbyval                   0.8366+-0.0105     ?      0.8426+-0.0183        ?
   arity-mismatch-inlining                      0.6412+-0.0075     ?      0.6433+-0.0152        ?
   big-int-mul                                  8.4775+-0.0666     ?      8.4849+-0.0677        ?
   boolean-test                                 3.5251+-0.0725            3.4780+-0.0278          might be 1.0135x faster
   cast-int-to-double                          12.1047+-0.0604           12.0942+-0.0462        
   cfg-simplify                                 3.0606+-0.0178     ?      3.1044+-0.0317        ? might be 1.0143x slower
   cmpeq-obj-to-obj-other                      13.6976+-1.2376     ?     13.8000+-1.1877        ?
   constant-test                                6.8605+-0.0168            6.8537+-0.0235        
   direct-arguments-getbyval                    0.7708+-0.0161     ?      0.7748+-0.0116        ?
   double-pollution-getbyval                    8.8611+-0.0588     ?      8.8782+-0.0548        ?
   double-pollution-putbyoffset                 5.1556+-0.8489            4.4103+-0.0805          might be 1.1690x faster
   external-arguments-getbyval                  2.0881+-0.2468     ?      2.1221+-0.2789        ? might be 1.0163x slower
   external-arguments-putbyval                  3.5392+-0.5579     ?      3.5740+-0.5779        ?
   Float32Array-matrix-mult                    12.0812+-0.9324           11.8524+-1.0004          might be 1.0193x faster
   fold-double-to-int                          38.0619+-1.2160           37.5368+-1.3019          might be 1.0140x faster
   function-dot-apply                           2.7338+-0.0318            2.7226+-0.0249        
   function-test                                4.3704+-0.0179     !      4.4758+-0.0476        ! definitely 1.0241x slower
   inline-arguments-access                      1.0929+-0.0210     ?      1.1041+-0.0194        ? might be 1.0103x slower
   inline-arguments-local-escape               28.8715+-3.9518     ?     29.1550+-4.0264        ?
   int-overflow-local                          85.6392+-0.1476     ?     85.8113+-0.2798        ?
   Int16Array-bubble-sort                      68.2007+-0.0937     !     69.0125+-0.1352        ! definitely 1.0119x slower
   Int16Array-load-int-mul                      1.7721+-0.0142     ?      1.7760+-0.0200        ?
   Int8Array-load                               4.4804+-0.0950     ?      4.6011+-0.1646        ? might be 1.0269x slower
   integer-divide                              13.2724+-0.0415     ?     13.3018+-0.0949        ?
   method-on-number                           184.2202+-1.1002     !    193.3059+-3.5646        ! definitely 1.0493x slower
   new-array-dead                              23.4163+-0.0934           23.3786+-0.0476        
   new-array-push                              18.0233+-2.4322           17.9705+-2.4527        
   number-test                                  3.4411+-0.0510            3.4260+-0.0237        
   object-test                                  3.7845+-0.0390     !      3.8687+-0.0326        ! definitely 1.0223x slower
   poly-stricteq                               79.6322+-0.1355     ?     80.1471+-0.8916        ?
   rare-osr-exit-on-local                     178.8125+-0.1769     ?    178.9686+-0.2493        ?
   simple-activation-demo                      40.5452+-0.6801           40.2360+-0.1038        
   slow-convergence                            79.2538+-0.1947     ?     79.2710+-0.2052        ?
   sparse-conditional                           1.0725+-0.0171     ?      1.0910+-0.0162        ? might be 1.0173x slower
   string-hash                                  4.2034+-0.0123     ?      4.2177+-0.0368        ?
   string-test                                  3.4052+-0.0247            3.4027+-0.0225        
   tear-off-arguments                           2.7744+-0.0228     ?      2.7791+-0.0202        ?
   to-int32-boolean                            23.1648+-0.0178     ?     23.3756+-0.2807        ?
   undefined-test                               3.7333+-0.0302     ?      3.7437+-0.0342        ?

   <arithmetic>                                26.3073+-0.1262     ?     26.5567+-0.1423        ? might be 1.0095x slower
   <geometric> *                                8.5237+-0.0578     ?      8.5365+-0.0699        ? might be 1.0015x slower
   <harmonic>                                   3.4470+-0.0209     ?      3.4603+-0.0207        ? might be 1.0039x slower

                                                     Base                   HashConst                                     
DSP:
   filtrr-posterize-tint                       44.7237+-0.5850           44.6182+-0.5461        
   filtrr-tint-contrast-sat-bright             70.5503+-1.0706           70.4246+-0.9774        
   filtrr-tint-sat-adj-contr-mult              91.5707+-0.5243     ?     92.2560+-0.5432        ?
   filtrr-blur-overlay-sat-contr              231.3993+-9.9191          225.1883+-2.5697          might be 1.0276x faster
   filtrr-sat-blur-mult-sharpen-contr         282.3200+-6.5660     ?    283.7610+-6.1809        ?
   filtrr-sepia-bias                           31.4936+-0.5809           31.4092+-0.6969        
   route9-vp8                         x5     1092.6734+-8.0539     ?   1103.7610+-11.1229       ? might be 1.0101x slower

   <arithmetic>                               565.0386+-4.4458     ?    569.6784+-5.0556        ? might be 1.0082x slower
   <geometric> *                              282.7366+-1.8769     ?    283.5124+-1.2426        ? might be 1.0027x slower
   <harmonic>                                 119.9613+-0.9288          119.7971+-1.1168          might be 1.0014x faster

                                                     Base                   HashConst                                     
All benchmarks:
   <arithmetic>                               100.4271+-0.4212     ?    100.9739+-0.4847        ? might be 1.0054x slower
   <geometric>                                 16.5144+-0.1072     ?     16.5628+-0.0990        ? might be 1.0029x slower
   <harmonic>                                   3.6021+-0.0206     ?      3.6293+-0.0144        ? might be 1.0075x slower

                                                     Base                   HashConst                                     
Geomean of preferred means:
   <scaled-result>                             33.1934+-0.0806     ?     33.3246+-0.0780        ? might be 1.0040x slower

Benchmark report for V8 and V8Real on msaboff-pro (MacPro5,1).

VMs tested:
"Base" at /Volumes/Data/src/webkit.baseline/WebKitBuild/Release/jsc (r121160)
"HashConst" at /Volumes/Data/src/webkit/WebKitBuild/Release/jsc (r121160)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc()
between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the
jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution
times with 95% confidence intervals in milliseconds.

                                 Base                   HashConst                                     
V8:
   splay                   83.1748+-1.5902     ^     75.0682+-0.7650        ^ definitely 1.1080x faster

   <arithmetic>            83.1748+-1.5902     ^     75.0682+-0.7650        ^ definitely 1.1080x faster
   <geometric> *           83.1748+-1.5902     ^     75.0682+-0.7650        ^ definitely 1.1080x faster
   <harmonic>              83.1748+-1.5902     ^     75.0682+-0.7650        ^ definitely 1.1080x faster

                                 Base                   HashConst                                     
V8Real:
   splay          x2       0.93576+-0.01412    ^     0.81143+-0.00869       ^ definitely 1.1532x faster

   <arithmetic>            0.93576+-0.01412    ^     0.81143+-0.00869       ^ definitely 1.1532x faster
   <geometric> *           0.93576+-0.01412    ^     0.81143+-0.00869       ^ definitely 1.1532x faster
   <harmonic>              0.93576+-0.01412    ^     0.81143+-0.00869       ^ definitely 1.1532x faster

                                 Base                   HashConst                                     
All benchmarks:
   <arithmetic>            28.3488+-0.5350     ^     25.5637+-0.2509        ^ definitely 1.1089x faster
   <geometric>              4.1758+-0.0602     ^      3.6696+-0.0192        ^ definitely 1.1380x faster
   <harmonic>               1.3958+-0.0210     ^      1.2106+-0.0128        ^ definitely 1.1530x faster

                                 Base                   HashConst                                     
Geomean of preferred means:
   <scaled-result>         27.8960+-0.4164     ^     24.6779+-0.0960        ^ definitely 1.1304x faster
Comment 15 Geoffrey Garen 2012-06-29 17:10:00 PDT
>    splay                              x2       0.79769+-0.00748    !     0.85507+-0.00308       ! definitely 1.0719x slower

This result seems to say that we've made GC slower. 

>    splay          x2       0.93576+-0.01412    ^     0.81143+-0.00869       ^ definitely 1.1532x faster

This result seems to say that we've made GC faster.

I'd like more data on this. Please post results from the GC benchmark I sent you.
Comment 16 Geoffrey Garen 2012-06-29 17:10:56 PDT
Comment on attachment 150197 [details]
Patch with fix for string hash ASSERT failures

View in context: https://bugs.webkit.org/attachment.cgi?id=150197&action=review

r- because of the changes suggested below, and because we need data to figure out if this patch slows down GC.

> Source/JavaScriptCore/heap/MarkStack.cpp:532
> +    do {

Should be:

unsigned currentFlags = m_flags;
unsigned newFlags = currentFlags | s_hashConstLock;
if (!WTF::weakCompareAndSwap(&m_flags, currentFlags, newFlags))
    return false;
WTF::memoryBarrierAfterLock();
return true;

(tryLock should only try once; weak memory ordering platforms like ARM require fences for locks.)

> Source/JavaScriptCore/heap/MarkStack.cpp:552
> +ALWAYS_INLINE void MarkStack::internalAppend(JSValue* slot)

Let's put a comment here explaining that we're specifically excluding all visits except for visits to object and array backing stores. That way, if someone refactors this code, they'll maintain that requirement.

> Source/JavaScriptCore/heap/MarkStack.cpp:560
> +    if (value.isString()) {

Let's cast to JSCell* before doing this test, to avoid testing isCell() twice. Maybe the compiler will optimize that out for us, but let's not rely on that.

> Source/JavaScriptCore/heap/MarkStack.cpp:562
> +        if ((string->length() > 1) && !string->isRope() && string->tryHashConstLock()) {

Let's use a helper function named shouldTryHashConst, which does all our tests for us:
- length > 1
- !isRope
- !isHashConstSingleton

> Source/JavaScriptCore/runtime/JSString.h:179
> +        void releaseHashConstLock() { m_flags &= ~s_hashConstLock; }

This needs WTF::memoryBarrierBeforeUnlock().
Comment 17 Geoffrey Garen 2012-06-29 17:15:07 PDT
> We could limit how often by using a new string since last const threshold

This sounds good to me.
Comment 18 Michael Saboff 2012-07-02 16:00:36 PDT
Created attachment 150492 [details]
Updated patch with changes suggested by reviewer

These changes eliminate any question of performance regression, especially on v8-splay.  The GC benchmark was of marginal use as the numbers seem to worsen over time.  I compared before and after numbers after 30 seconds from when the benchmark page loaded to show that the patch is very similar to the baseline.

Benchmark report for SunSpider, V8, V8Real, Kraken, JSBench, JSRegress, and DSP on msaboff-pro (MacPro5,1).

VMs tested:
"Base" at /Volumes/Data/src/webkit.baseline/WebKitBuild/Release/DumpRenderTree (r121689)
"HashConst" at /Volumes/Data/src/webkit.work/WebKitBuild/Release/DumpRenderTree (r121689)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to
get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                                     Base                   HashConst                                     
SunSpider:
   3d-cube                                      8.0544+-0.4843     ?      8.3723+-0.5080        ? might be 1.0395x slower
   3d-morph                                     7.7863+-0.1308     ?      7.7893+-0.1079        ?
   3d-raytrace                                 11.9975+-0.4714           11.9100+-0.5007        
   access-binary-trees                          2.3276+-0.4307            2.3102+-0.4476        
   access-fannkuch                              6.8067+-0.0173     ?      6.8502+-0.0684        ?
   access-nbody                                 4.2047+-0.0456            4.2043+-0.0487        
   access-nsieve                                3.3556+-0.0625            3.2679+-0.0484          might be 1.0268x faster
   bitops-3bit-bits-in-byte                     1.3841+-0.0150     ?      1.3927+-0.0187        ?
   bitops-bits-in-byte                          5.8172+-0.0789            5.7890+-0.0753        
   bitops-bitwise-and                           2.2743+-0.0150            2.2694+-0.0208        
   bitops-nsieve-bits                           3.4734+-0.0399            3.4345+-0.0146          might be 1.0113x faster
   controlflow-recursive                        2.5510+-0.0552            2.4986+-0.0183          might be 1.0210x faster
   crypto-aes                                   9.3500+-0.5134            9.3166+-0.5694        
   crypto-md5                                   3.5942+-0.1247            3.5358+-0.1064          might be 1.0165x faster
   crypto-sha1                                  3.0049+-0.0210     ?      3.0059+-0.0359        ?
   date-format-tofte                           13.4962+-0.9437     ?     13.7100+-1.0121        ? might be 1.0158x slower
   date-format-xparb                           11.8243+-0.6616     ?     12.1386+-0.7611        ? might be 1.0266x slower
   math-cordic                                  4.5366+-0.1117            4.5309+-0.0218        
   math-partial-sums                           10.0159+-0.0891     ?     10.0468+-0.0861        ?
   math-spectral-norm                           3.0866+-0.0181     ?      3.1119+-0.0697        ?
   regexp-dna                                  10.5978+-0.3863           10.5248+-0.3782        
   string-base64                                5.5663+-0.5029            5.4636+-0.4370          might be 1.0188x faster
   string-fasta                                 7.9419+-0.2639            7.9352+-0.2706        
   string-tagcloud                             13.7991+-0.2104     ?     13.8195+-0.2604        ?
   string-unpack-code                          23.5503+-0.5986           22.9889+-0.4340          might be 1.0244x faster
   string-validate-input                        8.7396+-0.5395            8.5304+-0.6068          might be 1.0245x faster

   <arithmetic> *                               7.2745+-0.1588            7.2595+-0.1732          might be 1.0021x faster
   <geometric>                                  5.8200+-0.1240            5.8038+-0.1309          might be 1.0028x faster
   <harmonic>                                   4.6040+-0.0944            4.5845+-0.0965          might be 1.0042x faster

                                                     Base                   HashConst                                     
V8:
   crypto                                      75.3963+-0.4597           75.1714+-0.6040        
   deltablue                                  138.4143+-0.5702     ?    139.5984+-1.4878        ?
   earley-boyer                                89.3543+-1.6485     ?     89.5104+-1.3849        ?
   raytrace                                    60.1360+-1.7843     ?     62.0452+-2.8892        ? might be 1.0317x slower
   regexp                                      88.1610+-0.4656           87.9296+-0.3581        
   richards                                   127.5962+-0.6307     ?    128.5957+-2.0536        ?
   splay                                      107.2903+-14.2292         104.4436+-13.9688         might be 1.0273x faster

   <arithmetic>                                98.0498+-1.9328     ?     98.1849+-1.5802        ? might be 1.0014x slower
   <geometric> *                               94.3256+-1.6731     ?     94.5058+-1.3205        ? might be 1.0019x slower
   <harmonic>                                  90.7006+-1.3669     ?     90.9741+-0.9908        ? might be 1.0030x slower

                                                     Base                   HashConst                                     
V8Real:
   encrypt                                     0.40839+-0.00114          0.40832+-0.00145       
   decrypt                                     7.04424+-0.01547          7.04140+-0.00548       
   deltablue                          x2       0.66529+-0.00875          0.66333+-0.00769       
   earley                                      2.14376+-0.01243    ?     2.15854+-0.01314       ?
   boyer                                      13.95471+-0.03918         13.95424+-0.05520       
   raytrace                           x2       5.17177+-0.04428          5.13838+-0.04901       
   regexp                             x2      26.59154+-0.08747    ?    26.72360+-0.14260       ?
   richards                           x2       0.34671+-0.00355          0.34661+-0.00257       
   splay                              x2       0.74072+-0.00383    ^     0.71081+-0.00533       ^ definitely 1.0421x faster

   <arithmetic>                                6.47023+-0.01220    ?     6.48057+-0.02214       ? might be 1.0016x slower
   <geometric> *                               2.15791+-0.00625    ^     2.14476+-0.00649       ^ definitely 1.0061x faster
   <harmonic>                                  0.92915+-0.00427          0.92163+-0.00407         might be 1.0082x faster

                                                     Base                   HashConst                                     
Kraken:
   ai-astar                                    794.943+-11.039           794.457+-11.576        
   audio-beat-detection                        197.114+-1.162      ?     206.202+-11.136        ? might be 1.0461x slower
   audio-dft                                   276.644+-0.650      ?     277.280+-2.925         ?
   audio-fft                                   119.476+-0.449            119.453+-0.265         
   audio-oscillator                            243.566+-0.434      ?     244.615+-0.621         ?
   imaging-darkroom                            286.999+-1.415      ?     287.954+-2.334         ?
   imaging-desaturate                          226.815+-0.484            226.659+-0.238         
   imaging-gaussian-blur                       437.352+-1.542      ?     437.453+-2.217         ?
   json-parse-financial                         66.794+-0.230             66.649+-0.262         
   json-stringify-tinderbox                     86.361+-0.918      ?      87.221+-2.289         ?
   stanford-crypto-aes                          87.739+-1.164             86.656+-0.328           might be 1.0125x faster
   stanford-crypto-ccm                          90.307+-0.296      ?      90.763+-0.321         ?
   stanford-crypto-pbkdf2                      196.523+-0.834      ?     196.833+-1.849         ?
   stanford-crypto-sha256-iterative             96.218+-0.810      ?      96.330+-0.289         ?

   <arithmetic> *                              229.061+-0.818      ?     229.895+-1.476         ? might be 1.0036x slower
   <geometric>                                 177.581+-0.290      ?     178.262+-0.859         ? might be 1.0038x slower
   <harmonic>                                  144.159+-0.259      ?     144.510+-0.554         ? might be 1.0024x slower

                                                     Base                   HashConst                                     
JSBench:
   amazon                                      18.0833+-0.3272     ?     18.4167+-0.3272        ? might be 1.0184x slower
   facebook                                    70.2500+-1.2154     ?     70.4167+-1.5435        ?
   google                                      96.1667+-0.7082           96.0000+-1.1494        
   twitter                                     53.0833+-0.4248     ^     52.2500+-0.2874        ^ definitely 1.0159x faster
   yahoo                                       22.5000+-0.3318           22.1667+-0.2473          might be 1.0150x faster

   <arithmetic> *                              52.0167+-0.3272           51.8500+-0.3759          might be 1.0032x faster
   <geometric>                                 42.9280+-0.2570           42.8264+-0.3306          might be 1.0024x faster
   <harmonic>                                  34.8989+-0.2832     ?     34.9147+-0.3509        ? might be 1.0005x slower

                                                     Base                   HashConst                                     
JSRegress:
   adapt-to-double-divide                      73.0448+-0.2528     ?     73.3176+-0.5147        ?
   aliased-arguments-getbyval                   0.8252+-0.0153            0.8226+-0.0089        
   arity-mismatch-inlining                      0.6705+-0.0089            0.6600+-0.0139          might be 1.0159x faster
   big-int-mul                                  8.4999+-0.0774            8.4502+-0.0511        
   boolean-test                                 3.5269+-0.0315            3.5131+-0.0294        
   cast-int-to-double                          12.0900+-0.0306     ?     12.1129+-0.0621        ?
   cfg-simplify                                 3.0696+-0.0116     ?      3.0845+-0.0310        ?
   cmpeq-obj-to-obj-other                       8.6143+-0.3413            8.4200+-0.0558          might be 1.0231x faster
   constant-test                                6.8671+-0.0285            6.8647+-0.0593        
   direct-arguments-getbyval                    0.7700+-0.0082            0.7698+-0.0149        
   double-pollution-getbyval                    9.2234+-0.0308            9.2041+-0.0153        
   double-pollution-putbyoffset                 4.7886+-0.6114     ?      5.1147+-0.8163        ? might be 1.0681x slower
   external-arguments-getbyval                  2.1439+-0.2492            2.1198+-0.2117          might be 1.0114x faster
   external-arguments-putbyval                  3.7456+-0.4770     ?      3.8689+-0.4926        ? might be 1.0329x slower
   Float32Array-matrix-mult                    11.3750+-0.5663     ?     11.7441+-0.6744        ? might be 1.0324x slower
   fold-double-to-int                          38.1196+-0.6664           37.6831+-0.3176          might be 1.0116x faster
   function-dot-apply                           2.6112+-0.0324     ^      2.5639+-0.0080        ^ definitely 1.0185x faster
   function-test                                4.0352+-0.0778     ?      4.0651+-0.0418        ?
   inline-arguments-access                      1.1399+-0.0121     ?      1.1551+-0.0184        ? might be 1.0134x slower
   inline-arguments-local-escape               22.8221+-0.3672     ?     23.1902+-0.3512        ? might be 1.0161x slower
   int-overflow-local                          85.7052+-0.2149           85.5160+-0.1651        
   Int16Array-bubble-sort                      66.9235+-1.3807           66.3555+-0.2462        
   Int16Array-load-int-mul                      1.7750+-0.0209            1.7710+-0.0257        
   Int8Array-load                               4.5036+-0.1126            4.4873+-0.0903        
   integer-divide                              12.8903+-0.0449     ?     12.9670+-0.1169        ?
   method-on-number                           189.3689+-1.0929     ?    191.4051+-2.0277        ? might be 1.0108x slower
   new-array-dead                              23.3886+-0.0952     ?     23.4266+-0.1534        ?
   new-array-push                              10.8416+-1.6158           10.8325+-1.6281        
   number-test                                  3.4584+-0.0255            3.4408+-0.0363        
   object-test                                  3.8524+-0.0439     ?      3.9079+-0.0321        ? might be 1.0144x slower
   poly-stricteq                               79.1842+-0.8449     ?     80.9045+-1.7510        ? might be 1.0217x slower
   rare-osr-exit-on-local                     175.2612+-1.1155          173.8743+-0.6069        
   simple-activation-demo                      40.1248+-0.4975           39.9025+-0.5046        
   slow-convergence                            78.5798+-0.2815     ?     78.8012+-0.2196        ?
   sparse-conditional                           1.0674+-0.0153            1.0600+-0.0132        
   string-hash                                  4.2261+-0.0215     ?      4.2363+-0.0447        ?
   string-test                                  3.4048+-0.0327            3.3857+-0.0127        
   tear-off-arguments                           2.9834+-0.0114     ?      2.9946+-0.0141        ?
   to-int32-boolean                            23.1747+-0.0693           23.1611+-0.0303        
   undefined-test                               3.7980+-0.0602            3.7848+-0.0540        

   <arithmetic>                                25.8124+-0.0986     ?     25.8735+-0.1072        ? might be 1.0024x slower
   <geometric> *                                8.2821+-0.0552     ?      8.2997+-0.0463        ? might be 1.0021x slower
   <harmonic>                                   3.4528+-0.0129            3.4489+-0.0130          might be 1.0011x faster

                                                     Base                   HashConst                                     
DSP:
   filtrr-posterize-tint                       44.1847+-0.7029           43.9408+-0.6471        
   filtrr-tint-contrast-sat-bright             70.3021+-0.5723     ?     71.8130+-2.5226        ? might be 1.0215x slower
   filtrr-tint-sat-adj-contr-mult              90.3652+-0.6997     ?     92.1827+-1.9895        ? might be 1.0201x slower
   filtrr-blur-overlay-sat-contr              223.4736+-4.6838     ?    224.0483+-4.6761        ?
   filtrr-sat-blur-mult-sharpen-contr         274.9528+-1.7327     ?    277.0566+-2.0776        ?
   filtrr-sepia-bias                           33.1090+-0.4371     ?     33.1363+-0.5593        ?
   route9-vp8                         x5     1081.0710+-10.5430        1077.9987+-6.9789        
   starfield                          x5     1172.4700+-9.5640     ?   1176.9836+-6.9101        ?

   <arithmetic>                               750.2558+-5.9386     ?    751.0681+-3.3128        ? might be 1.0011x slower
   <geometric> *                              438.4555+-2.5311     ?    439.8283+-1.7897        ? might be 1.0031x slower
   <harmonic>                                 168.1541+-1.2405     ?    168.9105+-1.9142        ? might be 1.0045x slower

                                                     Base                   HashConst                                     
All benchmarks:
   <arithmetic>                               143.1933+-0.8390     ?    143.4145+-0.4087        ? might be 1.0015x slower
   <geometric>                                 19.3711+-0.1199     ?     19.3760+-0.1130        ? might be 1.0003x slower
   <harmonic>                                   3.7289+-0.0144            3.7107+-0.0102          might be 1.0049x faster

                                                     Base                   HashConst                                     
Geomean of preferred means:
   <scaled-result>                             34.9745+-0.1795           34.9713+-0.1493          might be 1.0001x faster
Comment 19 Oliver Hunt 2012-07-02 16:09:54 PDT
Comment on attachment 150492 [details]
Updated patch with changes suggested by reviewer

View in context: https://bugs.webkit.org/attachment.cgi?id=150492&action=review

> Source/JavaScriptCore/ChangeLog:68
> - 
> -        * runtime/JSArray.cpp:
> + * runtime/JSArray.cpp:

?

> Source/JavaScriptCore/runtime/JSString.h:169
> +        static JS_EXPORTDATA unsigned s_newStringsSinceLastGC;

This isn't thread safe -- we have multiple heaps independtly being GC'd so this needs to be a per-heap field.

> Source/JavaScriptCore/runtime/JSString.h:195
> +        static const unsigned s_hashConstLock = 1u << 2;
> +        static const unsigned s_isHashConstSingleton = 1u << 1;
> +        static const unsigned s_is8Bit = 1u;

Would be nicer (and less likely for the compiler to do anything stupid) as:
enum {
    hashConstLock = 1u << 2,
    isHashConstSingleton = 1u << 1,
    s_is8Bit = 1u
};
Comment 20 Michael Saboff 2012-07-02 17:15:38 PDT
Created attachment 150503 [details]
Patch with suggested fixes
Comment 21 Gyuyoung Kim 2012-07-02 17:20:50 PDT
Comment on attachment 150503 [details]
Patch with suggested fixes

Attachment 150503 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/13124314
Comment 22 Michael Saboff 2012-07-02 17:37:11 PDT
Created attachment 150505 [details]
FFixed build issues.
Comment 23 Michael Saboff 2012-07-03 11:54:14 PDT
Created attachment 150656 [details]
Fixed a minor bug and improved new strings since last hash const logic

Fixed that the strings since last Hash Const threshold boolean is in MarkStackThreadSharedData and not MarkStack.

Change the clearing of the number of new strings since last GC is only done at the end of a GC where we actually hash const'ed strings.
Comment 24 Michael Saboff 2012-07-03 15:57:14 PDT
Committed r121806: <http://trac.webkit.org/changeset/121806>
Comment 25 Geoffrey Garen 2012-07-12 16:12:02 PDT
> The GC benchmark was of marginal use as the numbers seem to worsen over time.  I compared before and after numbers after 30 seconds from when the benchmark page loaded to show that the patch is very similar to the baseline.

Can you give more detail here? What's the rate of worsening over time? What were the two numbers you measured, which were "very similar"?