Bug 95810 - 16 bit JSRopeString up converts an 8 bit fibers to 16 bits during resolution
Summary: 16 bit JSRopeString up converts an 8 bit fibers to 16 bits during resolution
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Michael Saboff
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-04 18:29 PDT by Michael Saboff
Modified: 2012-09-06 18:28 PDT (History)
2 users (show)

See Also:


Attachments
Patch (3.17 KB, patch)
2012-09-05 14:17 PDT, Michael Saboff
benjamin: review+
benjamin: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Saboff 2012-09-04 18:29:29 PDT
A JSRopeString can contain both 8 and 16 bit fibers.  If there is one 16 bit fiber, the resulting resolved JSRopeString will also be 16 bits.  The problem is that during resolution an 8 bit fiber is first up converted itself to 16 bits and then treated like a 16 bit fiber.  The up conversion will use extra memory and hurt performance for the common case that the string is only the fiber of one JSRopeString.  Instead the up conversion should be done directly as part of the resolution.
Comment 1 Michael Saboff 2012-09-05 14:17:39 PDT
Created attachment 162332 [details]
Patch
Comment 2 Benjamin Poulain 2012-09-05 15:24:26 PDT
This looks awesome but it could have an impact on performance (if the same strings are used repeatedly).

Resolving ropes is va performance sensitive area. Have you verified the benchmarks?
Comment 3 Michael Saboff 2012-09-05 15:43:12 PDT
(In reply to comment #2)
> This looks awesome but it could have an impact on performance (if the same strings are used repeatedly).
> 
> Resolving ropes is va performance sensitive area. Have you verified the benchmarks?

I haven't verified performance.  I'll verify with SunSpider, V8 and Kraken and Dromaeo string tests.  Any others you want?
Comment 4 Benjamin Poulain 2012-09-05 16:02:29 PDT
> I haven't verified performance.  I'll verify with SunSpider, V8 and Kraken and Dromaeo string tests.  Any others you want?

I think SunSpider should do the trick. It has good coverage for strings.

I have heard Filip has a good tool to test the benchmarks, better ping him.
Comment 5 Sam Weinig 2012-09-05 22:26:50 PDT
Comment on attachment 162332 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=162332&action=review

> Source/JavaScriptCore/runtime/JSString.cpp:208
> +            StringImpl::copyChars(position, string->characters(), length);

Should this call characters16() to avoid the extra branch?
Comment 6 Benjamin Poulain 2012-09-05 22:36:54 PDT
> > Source/JavaScriptCore/runtime/JSString.cpp:208
> > +            StringImpl::copyChars(position, string->characters(), length);
> 
> Should this call characters16() to avoid the extra branch?

Yep, you are right.
Comment 7 Michael Saboff 2012-09-06 10:11:10 PDT
(In reply to comment #4)
> > I haven't verified performance.  I'll verify with SunSpider, V8 and Kraken and Dromaeo string tests.  Any others you want?
> 
> I think SunSpider should do the trick. It has good coverage for strings.
> 
> I have heard Filip has a good tool to test the benchmarks, better ping him.

Here are the results for SunSpider and V8:

VMs tested:
"BaseLine" at /Volumes/Data/src/webkit.baseline/WebKitBuild/Release/DumpRenderTree (r127525)
"95810" at /Volumes/Data/src/webkit/WebKitBuild/Release/DumpRenderTree (r127525)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between
sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific
preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence
intervals in milliseconds.

                                         BaseLine                   95810                                       
SunSpider:
   3d-cube                            8.5670+-0.5119     ?      8.9346+-0.5255        ? might be 1.0429x slower
   3d-morph                           8.2499+-0.1111            8.2225+-0.1247        
   3d-raytrace                       12.2917+-0.3964     ?     12.6080+-0.3119        ? might be 1.0257x slower
   access-binary-trees                2.2971+-0.4272     ?      2.3416+-0.4374        ? might be 1.0194x slower
   access-fannkuch                    6.8547+-0.0200     ?      6.8976+-0.0840        ?
   access-nbody                       4.2014+-0.0695     ?      4.2518+-0.0845        ? might be 1.0120x slower
   access-nsieve                      3.4389+-0.0473            3.3642+-0.0569          might be 1.0222x faster
   bitops-3bit-bits-in-byte           1.3670+-0.0139            1.3652+-0.0141        
   bitops-bits-in-byte                5.7495+-0.0125            5.7362+-0.0115        
   bitops-bitwise-and                 2.1783+-0.0872     ?      2.2376+-0.0641        ? might be 1.0272x slower
   bitops-nsieve-bits                 3.5173+-0.0171     ?      3.5415+-0.0363        ?
   controlflow-recursive              2.5044+-0.0069            2.5032+-0.0191        
   crypto-aes                         9.7332+-0.5201     ?      9.7493+-0.5317        ?
   crypto-md5                         3.9798+-0.1409            3.9523+-0.1279        
   crypto-sha1                        3.1893+-0.0437     ?      3.1950+-0.0661        ?
   date-format-tofte                 14.3015+-0.6807     ?     14.5108+-0.9050        ? might be 1.0146x slower
   date-format-xparb                 11.9714+-0.6316           11.4960+-0.1827          might be 1.0414x faster
   math-cordic                        4.2393+-0.0759     ?      4.2980+-0.0703        ? might be 1.0138x slower
   math-partial-sums                 10.1456+-0.0613           10.1065+-0.0189        
   math-spectral-norm                 3.0143+-0.0189            3.0132+-0.0168        
   regexp-dna                        10.4226+-0.3497           10.4187+-0.3661        
   string-base64                      5.9329+-0.4557            5.7017+-0.4059          might be 1.0405x faster
   string-fasta                       8.1435+-0.2758     ?      8.1897+-0.2468        ?
   string-tagcloud                   14.0988+-0.3792           13.7988+-0.2379          might be 1.0217x faster
   string-unpack-code                23.4974+-0.5536           23.4540+-0.5765        
   string-validate-input              8.7743+-0.5667            8.5747+-0.5756          might be 1.0233x faster

   <arithmetic> *                     7.4100+-0.1313            7.4024+-0.1264          might be 1.0010x faster
   <geometric>                        5.9050+-0.1044     ?      5.9076+-0.1061        ? might be 1.0004x slower
   <harmonic>                         4.6297+-0.0826     ?      4.6426+-0.0884        ? might be 1.0028x slower

                                         BaseLine                   95810                                       
V8Real:
   encrypt                           0.40161+-0.00098    ?     0.40167+-0.00059       ?
   decrypt                           6.94212+-0.01194          6.93901+-0.01066       
   deltablue                x2       0.57076+-0.00408    ?     0.57362+-0.00593       ?
   earley                            1.23547+-0.01809          1.22780+-0.00944       
   boyer                            13.76285+-0.06688         13.71783+-0.06507       
   raytrace                 x2       4.53568+-0.03303    ?     4.55272+-0.04241       ?
   regexp                   x2      26.63586+-0.08221    ?    26.68346+-0.11123       ?
   richards                 x2       0.27006+-0.00167          0.26990+-0.00202       
   splay                    x2       0.74024+-0.00986          0.73810+-0.00648       
   navier-stokes            x2      12.73918+-0.01668    ?    12.74327+-0.02574       ?

   <arithmetic>                      7.08285+-0.01590    ?     7.08803+-0.01532       ? might be 1.0007x slower
   <geometric> *                     2.42818+-0.00593    ?     2.42897+-0.00709       ? might be 1.0003x slower
   <harmonic>                        0.89872+-0.00298    ?     0.89887+-0.00443       ? might be 1.0002x slower

                                         BaseLine                   95810                                       
All benchmarks:
   <arithmetic>                       7.2854+-0.0847            7.2826+-0.0803          might be 1.0004x faster
   <geometric>                        4.2089+-0.0487     ?      4.2105+-0.0484        ? might be 1.0004x slower
   <harmonic>                         1.7932+-0.0108     ?      1.7946+-0.0122        ? might be 1.0008x slower

                                         BaseLine                   95810                                       
Geomean of preferred means:
   <scaled-result>                   13.4128+-0.1306           13.4080+-0.1223          might be 1.0004x faster
Comment 8 Michael Saboff 2012-09-06 10:11:51 PDT
(In reply to comment #6)
> > > Source/JavaScriptCore/runtime/JSString.cpp:208
> > > +            StringImpl::copyChars(position, string->characters(), length);
> > 
> > Should this call characters16() to avoid the extra branch?
> 
> Yep, you are right.

Made this change locally.  Want a new patch?
Comment 9 Benjamin Poulain 2012-09-06 13:16:28 PDT
Comment on attachment 162332 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=162332&action=review

The patch is great. And the bench are good so let's land this :)

Please fix the characters() call and check 2 tiny things:

> Source/WTF/wtf/text/StringImpl.h:541
> +    static void copyChars(UChar* destination, const LChar* source, unsigned numCharacters)
> +    {
> +        if (numCharacters == 1) {
> +            *destination = *source;
> +            return;
> +        }
> +        
> +        for (unsigned i = 0; i < numCharacters; ++i)
> +            destination[i] = source[i];
> +    }
> +

I have the feeling the other copyChars is always inlined. Could you check and add ALWAYS_INLINE if necessary?

I am not sure the case numCharacters == 1 helps much here because you don't have as much overhead with short-ish strings.
Comment 10 Michael Saboff 2012-09-06 18:28:40 PDT
Committed r127809: <http://trac.webkit.org/changeset/127809>