Bug 36010

Summary: RenderText::m_text should be a String, not RefPtr<StringImpl>
Product: WebKit Reporter: Shinichiro Hamaji <hamaji>
Component: WebCore Misc.Assignee: Shinichiro Hamaji <hamaji>
Status: RESOLVED FIXED    
Severity: Normal CC: eric, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on:    
Bug Blocks: 24906    
Attachments:
Description Flags
Patch v1
none
Patch v2 darin: review+

Description Shinichiro Hamaji 2010-03-11 04:17:25 PST
This was suggested by Darin in Bug 24906.
Comment 1 Shinichiro Hamaji 2010-03-11 04:18:55 PST
Created attachment 50486 [details]
Patch v1
Comment 2 Darin Adler 2010-03-12 09:58:18 PST
Comment on attachment 50486 [details]
Patch v1

The main thing we need to do here is do performance testing to be sure we didn't slow anything down.

> +bool charactersAreAllASCII(const UChar*, size_t);

I think we should move all the UChar* operations to the top of the file here instead of moving up just this one. If we can come up with a good file name we also might want to move them into a separate header.

> -            char c = (*m_text)[i];
> +            char c = m_text[i];

It's kind of inefficient to go through the String -> StringImpl -> characters thing each time through the loop. It would be better to get a pointer to the characters outside the loop. But only if this has a measurable performance effect.

>              case CAPITALIZE: {
> -                m_text = m_text->capitalize(previousCharacter());
> +                m_text = m_text.impl()->capitalize(previousCharacter());
>                  break;
>              }

No need for the braces here. Do we have a guarantee that m_text.impl() is not 0? We should add this function to PlatformString.h so we don't have to go down to the StringImpl level.

>              case UPPERCASE:
> -                m_text = m_text->upper();
> +                m_text = m_text.impl()->upper();
>                  break;
>              case LOWERCASE:
> -                m_text = m_text->lower();
> +                m_text = m_text.impl()->lower();
>                  break;

>              case TSCIRCLE:
> -                m_text = m_text->secure(whiteBullet);
> +                m_text = m_text.impl()->secure(whiteBullet);
>                  break;
>              case TSDISC:
> -                m_text = m_text->secure(bullet);
> +                m_text = m_text.impl()->secure(bullet);
>                  break;
>              case TSSQUARE:
> -                m_text = m_text->secure(blackSquare);
> +                m_text = m_text.impl()->secure(blackSquare);

Do we have a guarantee that m_text.impl() is not 0? We should add these functions to PlatformString.h so we don't have to go down to the StringImpl level.

> -    StringImpl* text() const { return m_text.get(); }
> +    StringImpl* text() const { return m_text.impl(); }

I think we should change this method to return a const String&, but in a later patch.

> -    const UChar* characters() const { return m_text->characters(); }
> -    unsigned textLength() const { return m_text->length(); } // non virtual implementation of length()
> +    const UChar* characters() const { return m_text.characters(); }
> +    unsigned textLength() const { return m_text.length(); } // non virtual implementation of length()

I think we should consider removing one or both of these since we already have text(), but in a separate patch.

I'm going to say review+ but the "impl non-0" concern and the performance concern are both things that need to be checked somehow before landing
Comment 3 Shinichiro Hamaji 2010-03-15 01:53:15 PDT
Created attachment 50693 [details]
Patch v2
Comment 4 WebKit Review Bot 2010-03-15 01:58:30 PDT
Attachment 50693 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebCore/platform/text/PlatformString.h:157:  More than one command on the same line in if  [whitespace/parens] [4]
WebCore/platform/text/PlatformString.h:158:  More than one command on the same line in if  [whitespace/parens] [4]
WebCore/platform/text/PlatformString.h:159:  More than one command on the same line in if  [whitespace/parens] [4]
WebCore/platform/text/PlatformString.h:160:  More than one command on the same line in if  [whitespace/parens] [4]
WebCore/platform/text/StringImpl.h:191:  More than one command on the same line  [whitespace/newline] [4]
Total errors found: 5 in 6 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Shinichiro Hamaji 2010-03-15 02:05:02 PDT
Thanks for the review! Could you take a look again?

I think style checker will complain about this patch, but I think they would be OK. This is another instance of Bug 33814.

> It's kind of inefficient to go through the String -> StringImpl -> characters
> thing each time through the loop. It would be better to get a pointer to the
> characters outside the loop. But only if this has a measurable performance
> effect.

I see. I found there are a small difference between StringImpl::operator[] and String::operator[]. The former ASSERTs the specified index is smaller than the length of string, and the latter just returns zero. I think the former would be more informative when someone actually meets the overflow. I rewrote all String::operator[] calls in loops with StringImpl::operator[] calls. So, there should be no performance issue anymore (but it's very nice if you told me the good or standard way to measure the performances for this kind of changes in this chance).

> Do we have a guarantee that m_text.impl() is not 0? We should add these
> functions to PlatformString.h so we don't have to go down to the StringImpl
> level.

Yes, there is ASSERT(m_text) in the beginning of this function, but I agree we want these functions in PlatformString.h. I've added make(Lower|Upper|Secure|Capitalized) as lower and upper have already been defined and have different semantics. I'm not sure if they are standard WebKit naming. Please let me know if you know better names.

> I think we should change this method to return a const String&, but in a later
> patch.
>
> I think we should consider removing one or both of these since we already have
> text(), but in a separate patch.

I agree.

I found many functions in StringImpl should be const functions. I'd like to do this refactoring in a later patch.
Comment 6 Darin Adler 2010-03-15 08:16:15 PDT
Comment on attachment 50693 [details]
Patch v2

> +    void makeLower() { if (m_impl) m_impl = m_impl->lower(); }
> +    void makeUpper() { if (m_impl) m_impl = m_impl->upper(); }
> +    void makeSecure(UChar aChar) { if (m_impl) m_impl = m_impl->secure(aChar); }
> +    void makeCapitalized(UChar previousCharacter) { if (m_impl) m_impl = m_impl->capitalize(previousCharacter); }

When it's unclear whether the function name is a verb or some other part of speech, I think it's good to choose something that is clearer. In the case of "capitalize" I don't think "makeCapitalized" is any better.

But lets go with these as-is. Should be easy to rename them later.

> +    bool charactersAreAllASCII() const { return WebCore::charactersAreAllASCII(characters(), length()); }

For boolean members we try to name them so they complete a phrase "string <xxx>". Given that, the function name should be something more like containsOnlyASCIICharacters or containsOnlyASCII.

> -UChar32 StringImpl::characterStartingAt(unsigned i)
> +UChar32 StringImpl::characterStartingAt(unsigned i) const

> -    UChar operator[](unsigned i) { ASSERT(i < m_length); return m_data[i]; }
> -    UChar32 characterStartingAt(unsigned);
> +    UChar operator[](unsigned i) const { ASSERT(i < m_length); return m_data[i]; }
> +    UChar32 characterStartingAt(unsigned) const;

StringImpl member functions should not be marked const. Because all StringImpl are immutable, all functions could be marked const. Any that change the string return a new StringImpl*. Our design choice for StringImpl was to never use const StringImpl* at all and to leave them all non-const. So it's strange that you decided to make these const.

> +        const StringImpl& text = *m_text.impl();

> +    const StringImpl& text = *m_text.impl();

> +    const StringImpl& text = *m_text.impl();

There's no reason for the const here. StringImpl objects are all immutable. If we do want to use const with StringImpl then we need to make all the member functions const, since none of them change the state.

Patch is OK as-is, but it would be better not to tamper with the const on StringImpl.

r=me
Comment 7 Eric Seidel (no email) 2010-03-15 15:58:08 PDT
Attachment 50693 [details] was posted by a committer and has review+, assigning to Shinichiro Hamaji for commit.
Comment 8 Shinichiro Hamaji 2010-03-16 00:39:44 PDT
Committed r56047: <http://trac.webkit.org/changeset/56047>
Comment 9 Shinichiro Hamaji 2010-03-16 00:48:08 PDT
> StringImpl member functions should not be marked const. Because all StringImpl
> are immutable, all functions could be marked const. Any that change the string
> return a new StringImpl*. Our design choice for StringImpl was to never use
> const StringImpl* at all and to leave them all non-const. So it's strange that
> you decided to make these const.

Ah, I didn't notice that we are trying not to mark members of StringImpl as const. Thanks for the info. It seems we may want to remove all consts from StringImpl's member functions. I'd like to fix this in a later patch. It will remove the unnecessary consts and add a few comment about this policy.

I landed the patch with this change and another change for s/charactersAreAllASCII/containsOnlyASCII/ . I'm sorry I did the same mistake (boolean methods should make sense as an English sentence) again.

Thanks again for your review!