Bug 96028 - equalIgnoringCase of two StringImpls doesn't handle 8 bit strings
Summary: equalIgnoringCase of two StringImpls doesn't handle 8 bit strings
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Michael Saboff
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-06 14:47 PDT by Michael Saboff
Modified: 2012-09-07 10:46 PDT (History)
2 users (show)

See Also:


Attachments
Patch (2.40 KB, patch)
2012-09-06 15:35 PDT, Michael Saboff
benjamin: review+
benjamin: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Saboff 2012-09-06 14:47:45 PDT
equalIgnoringCase(StringImpl*, StringImpl*) is implemented by calling CaseFoldingHash::equal(a, b) which always up converts 8 bit strings.
Comment 1 Michael Saboff 2012-09-06 15:35:46 PDT
Created attachment 162606 [details]
Patch
Comment 2 Michael Saboff 2012-09-06 15:39:50 PDT
SunSpider and V8 Performance results of posted patch. Seems neutral.


Benchmark report for SunSpider and V8Real on msaboff-pro (MacPro5,1).

VMs tested:
"BaseLine" at /Volumes/Data/src/webkit.baseline/WebKitBuild/Release/DumpRenderTree (r127525)
"96028" at /Volumes/Data/src/webkit/WebKitBuild/Release/DumpRenderTree (r127762)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between
sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific
preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence
intervals in milliseconds.

                                         BaseLine                   96028                                       
SunSpider:
   3d-cube                            8.6759+-0.4965     ?      8.6795+-0.4975        ?
   3d-morph                           8.1207+-0.0540     ?      8.2320+-0.1022        ? might be 1.0137x slower
   3d-raytrace                       12.4447+-0.3669     ?     12.7408+-0.2827        ? might be 1.0238x slower
   access-binary-trees                2.3152+-0.4143     ?      2.3615+-0.4420        ? might be 1.0200x slower
   access-fannkuch                    6.9096+-0.0546            6.8998+-0.0728        
   access-nbody                       4.1547+-0.0387     ?      4.3159+-0.1864        ? might be 1.0388x slower
   access-nsieve                      3.4240+-0.0603            3.4100+-0.0361        
   bitops-3bit-bits-in-byte           1.3655+-0.0132     ?      1.3787+-0.0229        ?
   bitops-bits-in-byte                5.7560+-0.0294     ?      5.7835+-0.0546        ?
   bitops-bitwise-and                 2.3053+-0.0559            2.1998+-0.1298          might be 1.0480x faster
   bitops-nsieve-bits                 3.5439+-0.0450            3.5358+-0.0341        
   controlflow-recursive              2.5098+-0.0199            2.4791+-0.0399          might be 1.0124x faster
   crypto-aes                         9.7697+-0.5576            9.7663+-0.5242        
   crypto-md5                         3.9863+-0.1273            3.9119+-0.1399          might be 1.0190x faster
   crypto-sha1                        3.2103+-0.0622            3.1315+-0.0479          might be 1.0252x faster
   date-format-tofte                 14.9243+-1.1552           14.1839+-0.6570          might be 1.0522x faster
   date-format-xparb                 12.1777+-0.6900           11.5475+-0.2127          might be 1.0546x faster
   math-cordic                        4.2617+-0.0855     ?      4.2981+-0.0387        ?
   math-partial-sums                 10.3130+-0.2316           10.1840+-0.0735          might be 1.0127x faster
   math-spectral-norm                 3.0180+-0.0339            3.0162+-0.0184        
   regexp-dna                        10.4587+-0.3506     ?     10.4712+-0.3430        ?
   string-base64                      5.5117+-0.0758     ?      5.8878+-0.5140        ? might be 1.0682x slower
   string-fasta                       8.1848+-0.2835     ?      8.3168+-0.2465        ? might be 1.0161x slower
   string-tagcloud                   13.7753+-0.2121           13.6639+-0.2892        
   string-unpack-code                23.5257+-0.5769           23.2775+-0.4931          might be 1.0107x faster
   string-validate-input              8.8146+-0.5770            8.7567+-0.5374        

   <arithmetic> *                     7.4407+-0.1668            7.4012+-0.1276          might be 1.0053x faster
   <geometric>                        5.9266+-0.1177            5.9129+-0.1067          might be 1.0023x faster
   <harmonic>                         4.6557+-0.0864            4.6442+-0.0923          might be 1.0025x faster

                                         BaseLine                   96028                                       
V8Real:
   encrypt                           0.40173+-0.00063    ?     0.40197+-0.00114       ?
   decrypt                           6.93700+-0.01300    ?     6.94133+-0.01487       ?
   deltablue                x2       0.56658+-0.00450    ?     0.56968+-0.00429       ?
   earley                            1.23068+-0.00589    ^     1.21994+-0.00352       ^ definitely 1.0088x faster
   boyer                            13.66712+-0.05575    ?    13.67038+-0.05082       ?
   raytrace                 x2       4.55525+-0.03219    ?     4.55528+-0.02875       ?
   regexp                   x2      26.79253+-0.07617         26.62287+-0.09437       
   richards                 x2       0.26984+-0.00132          0.26955+-0.00128       
   splay                    x2       0.74498+-0.00724          0.74428+-0.00661       
   navier-stokes            x2      12.73651+-0.01486    ?    12.73830+-0.00966       ?

   <arithmetic>                      7.09799+-0.01237          7.07710+-0.01179         might be 1.0030x faster
   <geometric> *                     2.42901+-0.00434          2.42707+-0.00476         might be 1.0008x faster
   <harmonic>                        0.89800+-0.00301    ?     0.89814+-0.00259       ? might be 1.0002x slower

                                         BaseLine                   96028                                       
All benchmarks:
   <arithmetic>                       7.3101+-0.1044            7.2777+-0.0809          might be 1.0045x faster
   <geometric>                        4.2188+-0.0531            4.2116+-0.0481          might be 1.0017x faster
   <harmonic>                         1.7945+-0.0108            1.7936+-0.0104          might be 1.0005x faster

                                         BaseLine                   96028                                       
Geomean of preferred means:
   <scaled-result>                   13.4420+-0.1556           13.4016+-0.1194          might be 1.0030x faster
Comment 3 Benjamin Poulain 2012-09-06 19:15:20 PDT
Comment on attachment 162606 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=162606&action=review

> Source/WTF/wtf/text/StringHash.h:66
> -            if (b->is8Bit()) {
> -                // We know that a is 8 bit and b is 16 bit.
> +            if (b->is8Bit())
>                  return WTF::equal(a->characters16(), b->characters8(), aLength);

Nice comment indeed :)

> Source/WTF/wtf/text/StringHash.h:135
>              return WTF::Unicode::umemcasecmp(a->characters(), b->characters(), length) == 0;

For clarity, I would also use equalIgnoringCase here.

And characters() could be characters16().
Comment 4 Michael Saboff 2012-09-07 09:01:58 PDT
(In reply to comment #3)
> (From update of attachment 162606 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=162606&action=review
> 
> > Source/WTF/wtf/text/StringHash.h:66
> > -            if (b->is8Bit()) {
> > -                // We know that a is 8 bit and b is 16 bit.
> > +            if (b->is8Bit())
> >                  return WTF::equal(a->characters16(), b->characters8(), aLength);
> 
> Nice comment indeed :)
> 
> > Source/WTF/wtf/text/StringHash.h:135
> >              return WTF::Unicode::umemcasecmp(a->characters(), b->characters(), length) == 0;
> 
> For clarity, I would also use equalIgnoringCase here.
> 
> And characters() could be characters16().

WTF::equalIgnoringCase(UChar*, UChar*) is currently a static inline StringImpl.cpp that calls umemcasecmp().  I moved it to StringImpl.h and changed StringHash.h to call it (with characters16()).  Want a new patch or you fine with this change?
Comment 5 Benjamin Poulain 2012-09-07 10:29:09 PDT
> WTF::equalIgnoringCase(UChar*, UChar*) is currently a static inline StringImpl.cpp that calls umemcasecmp().  I moved it to StringImpl.h and changed StringHash.h to call it (with characters16()).  Want a new patch or you fine with this change?

I did not know equalIgnoringCase() was only in the cpp file.
Your change sounds good, no need for another review.
Comment 6 Michael Saboff 2012-09-07 10:46:41 PDT
Committed r127887: <http://trac.webkit.org/changeset/127887>