Bug 84907 - Memory wasted in JSString for non-rope strings
: Memory wasted in JSString for non-rope strings
Status: RESOLVED FIXED
: WebKit
JavaScriptCore
: 528+ (Nightly build)
: All All
: P2 Normal
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2012-04-25 16:52 PST by
Modified: 2012-04-27 16:50 PST (History)


Attachments
Proposed Patch (23.38 KB, patch)
2012-04-25 17:08 PST, Michael Saboff
gns: commit‑queue-
Review Patch | Details | Formatted Diff | Diff
Patch with Speculative Build Fix (23.62 KB, patch)
2012-04-26 09:59 PST, Michael Saboff
buildbot: commit‑queue-
Review Patch | Details | Formatted Diff | Diff
Updated patch with speculative Windows fix (24.52 KB, patch)
2012-04-26 12:21 PST, Michael Saboff
buildbot: commit‑queue-
Review Patch | Details | Formatted Diff | Diff
Another Windows build fix - added new symbol to exports (24.62 KB, patch)
2012-04-26 13:35 PST, Michael Saboff
ggaren: review-
Review Patch | Details | Formatted Diff | Diff
Patch updated from review comments (26.46 KB, patch)
2012-04-27 10:53 PST, Michael Saboff
no flags Review Patch | Details | Formatted Diff | Diff
Updated patch after talking to Geoff about isRope / isJSRopeObject() (23.59 KB, patch)
2012-04-27 11:08 PST, Michael Saboff
ggaren: review+
Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2012-04-25 16:52:14 PST
JSString effectively has four pointers to data, a UString, which contains a pointer to a StringImpl, and 3 pointers to other JSStrings in the case the string is a rope made up of other strings.

For strings created from a UString or as a substring of another JSString, the three "fiber" pointers are not used.

We can save the space of these three pointers by splitting JSString into a class that has a UString and another class that is a rope.
------- Comment #1 From 2012-04-25 17:08:57 PST -------
Created an attachment (id=138902) [details]
Proposed Patch

This appears overall neutral on performance tests.  The only test that looks like it might slow down in SunSpider string-base64.  Other SunSpider tests seem to offset that loss so that SunSpider nets out with no change.

Benchmark report for SunSpider, V8, and Kraken on msaboff-pro (MacPro5,1).

VMs tested:
"Baseline" at /Volumes/Data/src/webkit.baseline/WebKitBuild/Release/jsc (r115220)
"StringSplit" at /Volumes/Data/src/webkit.work/WebKitBuild/Release/jsc (r115220)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime()
function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in
milliseconds.

                                             Baseline              StringSplit                                   
SunSpider:
   3d-cube                                7.1923+-0.0878          7.1591+-0.1012       
   3d-morph                               7.3342+-0.1166    ?     7.3962+-0.0787       ?
   3d-raytrace                            9.6550+-0.1114          9.5425+-0.0897         might be 1.0118x faster
   access-binary-trees                    1.8062+-0.0146    ?     1.8078+-0.0155       ?
   access-fannkuch                        7.4821+-0.0565    ?     7.5589+-0.0891       ? might be 1.0103x slower
   access-nbody                           3.8797+-0.0511    ?     3.8995+-0.0594       ?
   access-nsieve                          3.6100+-0.0518    ?     3.6761+-0.0453       ? might be 1.0183x slower
   bitops-3bit-bits-in-byte               1.3877+-0.0126          1.3839+-0.0155       
   bitops-bits-in-byte                    5.4218+-0.0903          5.3617+-0.0706         might be 1.0112x faster
   bitops-bitwise-and                     3.4084+-0.0385    ?     3.4516+-0.0456       ? might be 1.0127x slower
   bitops-nsieve-bits                     3.3156+-0.0421    !     3.3741+-0.0106       ! definitely 1.0176x slower
   controlflow-recursive                  2.4107+-0.0339    ?     2.4555+-0.0269       ? might be 1.0185x slower
   crypto-aes                             8.1091+-0.0905          8.0742+-0.1588       
   crypto-md5                             3.3620+-0.0474          3.3256+-0.0533         might be 1.0110x faster
   crypto-sha1                            2.6600+-0.0604    ?     2.6639+-0.0488       ?
   date-format-tofte                     11.1965+-0.2238    ?    11.2499+-0.1706       ?
   date-format-xparb                     11.0137+-0.1682         10.7141+-0.1453         might be 1.0280x faster
   math-cordic                            4.1943+-0.0542          4.1473+-0.0333         might be 1.0113x faster
   math-partial-sums                      9.7927+-0.1006          9.7874+-0.0972       
   math-spectral-norm                     2.8539+-0.0344          2.8480+-0.0384       
   regexp-dna                             9.7621+-0.0957    ?     9.7827+-0.1204       ?
   string-base64                          4.7095+-0.0715    !     4.8647+-0.0832       ! definitely 1.0330x slower
   string-fasta                           7.3884+-0.0862          7.3108+-0.1465         might be 1.0106x faster
   string-tagcloud                       12.9977+-0.1691         12.9032+-0.1906       
   string-unpack-code                    21.6388+-0.3067         21.5812+-0.2760       
   string-validate-input                  6.6081+-0.1529          6.5152+-0.1037         might be 1.0143x faster

   <arithmetic> *                         6.6612+-0.0349          6.6475+-0.0364         might be 1.0021x faster
   <geometric>                            5.4346+-0.0310    ?     5.4368+-0.0296       ? might be 1.0004x slower
   <harmonic>                             4.3992+-0.0283    ?     4.4095+-0.0260       ? might be 1.0023x slower

                                             Baseline              StringSplit                                   
V8:
   crypto                                75.6874+-0.6223         75.3308+-0.5333       
   deltablue                            154.9205+-1.1508    ?   155.4980+-1.4269       ?
   earley-boyer                          95.1838+-1.3352         94.8811+-1.4211       
   raytrace                              53.1343+-0.2090    ?    53.3965+-0.4181       ?
   regexp                                95.1276+-0.4268         94.7119+-0.4922       
   richards                             141.0346+-1.0650        139.7111+-0.5126       
   splay                                 90.4939+-11.5016   ?    90.6910+-10.9277      ?

   <arithmetic>                         100.7975+-1.8471        100.6029+-1.8123         might be 1.0019x faster
   <geometric> *                         95.2155+-1.8432         95.0883+-1.8053         might be 1.0013x faster
   <harmonic>                            89.7821+-1.6562         89.7419+-1.6340         might be 1.0004x faster

                                             Baseline              StringSplit                                   
Kraken:
   ai-astar                             821.3973+-10.8685   ?   822.8782+-10.8859      ?
   audio-beat-detection                 196.3768+-1.2337        195.9786+-1.0787       
   audio-dft                            289.4762+-2.0124    ?   291.9755+-3.5931       ?
   audio-fft                            118.2653+-0.4089    ?   118.5623+-0.3563       ?
   audio-oscillator                     305.7648+-1.4518    ?   307.1338+-1.4232       ?
   imaging-darkroom                     297.0695+-8.0269        296.6311+-8.5114       
   imaging-desaturate                   221.4177+-0.3711    ?   221.4266+-0.4326       ?
   imaging-gaussian-blur                459.9391+-2.2574        458.4150+-0.6549       
   json-parse-financial                  63.0999+-0.2543    ^    62.0548+-0.2780       ^ definitely 1.0168x faster
   json-stringify-tinderbox              79.7542+-1.2215    ^    77.9880+-0.3781       ^ definitely 1.0226x faster
   stanford-crypto-aes                   84.5476+-0.4595         83.8717+-0.4741       
   stanford-crypto-ccm                   77.8462+-0.8944         76.5901+-0.7101         might be 1.0164x faster
   stanford-crypto-pbkdf2               186.2330+-0.8928    ?   187.4214+-0.9307       ?
   stanford-crypto-sha256-iterative      89.8800+-0.2906         89.8685+-0.5059       

   <arithmetic> *                       235.0763+-1.0246        235.0568+-1.2835         might be 1.0001x faster
   <geometric>                          176.4131+-0.5488        175.8282+-0.5446         might be 1.0033x faster
   <harmonic>                           138.4484+-0.3933    ^   137.3959+-0.2900       ^ definitely 1.0077x faster

                                             Baseline              StringSplit                                   
All benchmarks:
   <arithmetic>                          88.7200+-0.4915         88.6777+-0.5338         might be 1.0005x faster
   <geometric>                           23.4720+-0.1403         23.4495+-0.1426         might be 1.0010x faster
   <harmonic>                             7.7185+-0.0497    ?     7.7350+-0.0458       ? might be 1.0021x slower

                                             Baseline              StringSplit                                   
Geomean of preferred means:
   <scaled-result>                       53.0219+-0.4465         52.9607+-0.4533         might be 1.0012x faster
------- Comment #2 From 2012-04-25 17:16:32 PST -------
(From update of attachment 138902 [details])
Attachment 138902 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/12524875
------- Comment #3 From 2012-04-25 17:22:54 PST -------
(From update of attachment 138902 [details])
Attachment 138902 [details] did not pass qt-wk2-ews (qt):
Output: http://queues.webkit.org/results/12524874
------- Comment #4 From 2012-04-25 17:32:15 PST -------
(From update of attachment 138902 [details])
Attachment 138902 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/12525690
------- Comment #5 From 2012-04-25 18:03:26 PST -------
(From update of attachment 138902 [details])
Attachment 138902 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/12535412
------- Comment #6 From 2012-04-25 18:19:31 PST -------
(From update of attachment 138902 [details])
Attachment 138902 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/12520724
------- Comment #7 From 2012-04-25 19:33:32 PST -------
(From update of attachment 138902 [details])
Attachment 138902 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/12518741
------- Comment #8 From 2012-04-26 09:59:38 PST -------
Created an attachment (id=139017) [details]
Patch with Speculative Build Fix

clang doesn't produce the errors.

Added jsString() and jsStringFromArguments() as friends to JSRopeString
------- Comment #9 From 2012-04-26 10:21:14 PST -------
(From update of attachment 139017 [details])
Attachment 139017 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/12529895
------- Comment #10 From 2012-04-26 12:21:14 PST -------
Created an attachment (id=139041) [details]
Updated patch with speculative Windows fix
------- Comment #11 From 2012-04-26 12:58:37 PST -------
(From update of attachment 139041 [details])
Attachment 139041 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/12490900
------- Comment #12 From 2012-04-26 13:35:27 PST -------
Created an attachment (id=139058) [details]
Another Windows build fix - added new symbol to exports
------- Comment #13 From 2012-04-26 18:24:24 PST -------
(From update of attachment 139058 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=139058&action=review

Looks good, but some small tweaks required before landing.

> Source/JavaScriptCore/runtime/JSString.cpp:63
> +    if (thisObject->isJSRopeStringObject())
> +        JSRopeString::visitChildren(thisObject, visitor);
> +}
> +
> +void JSRopeString::visitChildren(JSString* thisObject, SlotVisitor& visitor)

It's slightly dirty pool to have two classes with distinct GC visit functions but identical ClassInfos. ClassInfo is supposed to contain a pointer to an object's visitChildren function. It's also against our convention to have a subclass "visitChildren" function that never calls its base class's "visitChildren" function.

I think a slightly better way to represent this is to fold the rope-style marking code into JSString::visitChildren. That way, JSString's ClassInfo is still "true", and JSRopeString is just an implementation detail for the fact that a JSString may be variable-sized.

> Source/JavaScriptCore/runtime/JSString.cpp:65
> +    JSRopeString* thisObjectAsRope = reinterpret_cast<JSRopeString*>(thisObject);

static_cast is the C++ cast for casting a base class pointer to a subclass pointer.

> Source/JavaScriptCore/runtime/JSString.cpp:157
> +            JSRopeString* currentFiberAsRope = reinterpret_cast<JSRopeString*>(currentFiber);

static_cast is the C++ cast for casting a base class pointer to a subclass pointer.

> Source/JavaScriptCore/runtime/JSString.cpp:186
> +            JSRopeString* currentFiberAsRope = reinterpret_cast<JSRopeString*>(currentFiber);

static_cast is the C++ cast for casting a base class pointer to a subclass pointer.

> Source/JavaScriptCore/runtime/JSString.h:167
> +        bool isRope() const { return m_value.isNull(); }
> +        bool is8Bit() const { return m_is8Bit; }
> +        bool isJSRopeStringObject() const { return m_isJSRopeStringObject; }

I'd rather not have two different ways to ask "Am I a rope?" -- It's hard to tell which one to use when. Instead, please always use the existing isRope() function, and remove the new one.

> Source/JavaScriptCore/runtime/JSString.h:349
> +            reinterpret_cast<const JSRopeString*>(this)->resolveRope(exec);

static_cast is the C++ cast for casting a base class pointer to a subclass pointer.

> Source/JavaScriptCore/runtime/JSString.h:356
> +            reinterpret_cast<const JSRopeString*>(this)->resolveRope(0);

static_cast is the C++ cast for casting a base class pointer to a subclass pointer.

> Source/JavaScriptCore/runtime/JSString.h:364
> +            return reinterpret_cast<JSRopeString*>(this)->getIndexSlowCase(exec, i);

static_cast is the C++ cast for casting a base class pointer to a subclass pointer.
------- Comment #14 From 2012-04-27 10:53:09 PST -------
Created an attachment (id=139228) [details]
Patch updated from review comments

(In reply to comment #13)
> (From update of attachment 139058 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=139058&action=review
> 
> Looks good, but some small tweaks required before landing.
> 
> > Source/JavaScriptCore/runtime/JSString.cpp:63
> > +    if (thisObject->isJSRopeStringObject())
> > +        JSRopeString::visitChildren(thisObject, visitor);
> > +}
> > +
> > +void JSRopeString::visitChildren(JSString* thisObject, SlotVisitor& visitor)
> 
> It's slightly dirty pool to have two classes with distinct GC visit functions but identical ClassInfos. ClassInfo is supposed to contain a pointer to an object's visitChildren function. It's also against our convention to have a subclass "visitChildren" function that never calls its base class's "visitChildren" function.
> 
> I think a slightly better way to represent this is to fold the rope-style marking code into JSString::visitChildren. That way, JSString's ClassInfo is still "true", and JSRopeString is just an implementation detail for the fact that a JSString may be variable-sized.

Modified static JSRopeString::visitChildren() to JSRopeString::visitFibers().

> > Source/JavaScriptCore/runtime/JSString.cpp:65
> > +    JSRopeString* thisObjectAsRope = reinterpret_cast<JSRopeString*>(thisObject);
> 
> static_cast is the C++ cast for casting a base class pointer to a subclass pointer.

Fixed all cited reinterpret_casts to static_casts.

> > Source/JavaScriptCore/runtime/JSString.h:167
> > +        bool isRope() const { return m_value.isNull(); }
> > +        bool is8Bit() const { return m_is8Bit; }
> > +        bool isJSRopeStringObject() const { return m_isJSRopeStringObject; }
> 
> I'd rather not have two different ways to ask "Am I a rope?" -- It's hard to tell which one to use when. Instead, please always use the existing isRope() function, and remove the new one.

They actually serve two different purposes.  isRope() is really isUnresolvedRope() so I changed it's name.  When a JSRopeString is created, isRope() (now isUnresolvedRope()) is true, but when the rope is resolved, it becomes false.  isJSRopeStringObject() will be true throughout the life of the JSRopeObject and is used only in JSString::visitChildren() to make sure we visit the fibers.
------- Comment #15 From 2012-04-27 11:08:53 PST -------
Created an attachment (id=139233) [details]
Updated patch after talking to Geoff about isRope / isJSRopeObject()

Eliminated isJSRopeObject() and reverted isUnresolvedRope() to isRope().
------- Comment #16 From 2012-04-27 15:45:54 PST -------
(From update of attachment 139233 [details])
r=me
------- Comment #17 From 2012-04-27 16:50:47 PST -------
Committed r115516: <http://trac.webkit.org/changeset/115516>