Bug 76647 - StringProtoFuncToUpperCase should call StringImpl::upper similar to StringProtoToLowerCase
Summary: StringProtoFuncToUpperCase should call StringImpl::upper similar to StringPro...
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Michael Saboff
Depends on: 77264
  Show dependency treegraph
Reported: 2012-01-19 11:22 PST by Michael Saboff
Modified: 2012-01-31 17:49 PST (History)
1 user (show)

See Also:

Patch (5.28 KB, patch)
2012-01-19 12:21 PST, Michael Saboff
ggaren: review+
Details | Formatted Diff | Diff
Updated patch with fix for s-sharp (6.48 KB, patch)
2012-01-31 15:26 PST, Michael Saboff
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Saboff 2012-01-19 11:22:42 PST
StringProtoFuncToUpperCase has a local implementation similar to what exists in StringImpl::upper().  All of the work should be done in StringImpl, including optimizations for 8 bit strings.
Comment 1 Michael Saboff 2012-01-19 12:21:55 PST
Created attachment 123170 [details]

Neutral on sunspider and v8. Provides minor benefit to Dromaeo JavaScript tests.
Comment 2 Geoffrey Garen 2012-01-19 15:57:34 PST
Comment on attachment 123170 [details]

Comment 3 Michael Saboff 2012-01-27 16:06:48 PST
Committed r106167: <http://trac.webkit.org/changeset/106167>
Comment 4 Michael Saboff 2012-01-27 17:09:14 PST
Rolled out in http://trac.webkit.org/changeset/106170 due to a test failure in LayoutTests/fast/js/string-capitalization.html.

Need to resolve that issue and submit fixed patch.
Comment 5 Michael Saboff 2012-01-31 15:26:11 PST
Created attachment 124834 [details]
Updated patch with fix for s-sharp

This patch is performance neutral on sunspider, v8 and kraken.  It speeds up the dromaeo-object-string toUpperCase subtest from 238runs/sec to 347runs/sec (46%).  Unfortunately, this provides little benefit to the dromaeo javascript overall.
Comment 6 Darin Adler 2012-01-31 16:47:33 PST
Comment on attachment 124834 [details]
Updated patch with fix for s-sharp

View in context: https://bugs.webkit.org/attachment.cgi?id=124834&action=review

> Source/JavaScriptCore/ChangeLog:3
> +2012-01-31  Michael Saboff  <msaboff@apple.com>
> +
> +2012-01-31  Michael Saboff  <msaboff@apple.com>

Double name here.

> Source/JavaScriptCore/ChangeLog:18
> +        (JSC::stringProtoFuncToLowerCase):

Since all you did was tweak whitespace in this function, I suggest removing this line.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1206
> +    StringImpl* ourImpl = s.impl();

I would have called this sImpl for consistency with sSize above.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1208
> +    if (ourImpl == upper.get())

The call to .get() here should not be needed.

> Source/JavaScriptCore/wtf/text/StringImpl.cpp:366
> +    const UChar* source16;

This source16 variable declared so long before it’s used is only necessary because we don’t want to call characters() in the fast 16-bit case. But I think one more check of is8Bit() is probably cheap enough so we can just call characters() and avoid it. Either way is probably OK.

> Source/JavaScriptCore/wtf/text/StringImpl.cpp:388
> +        // There are three special case characters.
> +        //  1. sharp-S (0xdf) converts to "SS" (two characters)
> +        //  2. micro (0xb5) converts to upper case micro (0x39c, a 16 bit character)
> +        //  3. y umlaut (0xff) converts to upper case Y umlaut (0x178, a 16 bit character)

The code doesn’t assume that there are exactly three of these, so I don’t think we need to list them. The code simply assumes that there is exactly one character that converts to two, and that is U+00DF.

I think a better comment would focus on the fact that latinSmallLetterSharpS is the only character in the range U+0000-U+00FF that converts into more than one character, because that’s all the code assumes.

> Source/JavaScriptCore/wtf/text/StringImpl.cpp:391
> +            if (UNLIKELY(c == 0xdf))

Normally we would use a constant named latinSmallLetterSharpS in the header CharacterNames.h rather than coding 0xdf here in line.

> Source/JavaScriptCore/wtf/text/StringImpl.cpp:395
> +                // Have a character that is 16bit when converted to uppercase.

I think you could word this comment better. I would write this:

// Since this upper-cased character does not fit in an 8-bit string, we should convert to 16-bit.

> Source/JavaScriptCore/wtf/text/StringImpl.cpp:412
> +            if (UNLIKELY(c == 0xdf)) {

I’m not sure this UNLIKELY is all that helpful.
Comment 7 Michael Saboff 2012-01-31 17:49:33 PST
Committed r106417: <http://trac.webkit.org/changeset/106417>