Bug 184182

Summary: Make InlineTextBox::createTextRun() take a const lvalue reference String
Product: WebKit Reporter: Said Abou-Hallawa <sabouhallawa>
Component: Layout and RenderingAssignee: Said Abou-Hallawa <sabouhallawa>
Status: RESOLVED FIXED    
Severity: Normal CC: bfulgham, commit-queue, dbates, koivisto, mmaxfield, simon.fraser, webkit-bug-importer, zalan
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 178750    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch none

Description Said Abou-Hallawa 2018-03-30 10:21:20 PDT
InlineTextBox::createTextRun() takes a non-const lvalue reference. It is tempting to change the signature of this method to take a const lvalue reference. But this was done intentionally. TextRun is effectively a StringView: it does not own the passed string. We rely on the compiler to prevent calls like createTextRun("abc").

A possible Solution is to make InlineTextBox own three TextRuns:

class InlineTextBox {

private:
    mutable std::uniqe_ptr<TextRun> m_textRun;
    mutable std::uniqe_ptr<TextRun> m_combinedTextRun;
    mutable std::uniqe_ptr<TextRun> m_hyphenatedTextRun;
};

Then we can replace InlineTextBox::createTextRun(String&) by InlineTextBox::textRun(bool ignoreCombinedText = false, bool ignoreHyphen = false) const.
Comment 1 Said Abou-Hallawa 2018-04-04 17:57:58 PDT
Created attachment 337246 [details]
Patch
Comment 2 Daniel Bates 2018-04-04 22:29:14 PDT
Comment on attachment 337246 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=337246&action=review

> Source/WebCore/platform/graphics/FontCascade.cpp:377
> +float FontCascade::widthForSimpleText(const StringView& text) const

This is not necessary. StringView is a small object and should be passed by value.

> Source/WebCore/platform/graphics/TextRun.h:103
> +        m_string = emptyString();

I do not understand the need for TextRun to hold both a non-owning and owning String. It should either hold a non-owned string, StringView, or hold a String, not both.
Comment 3 Said Abou-Hallawa 2018-04-05 09:47:31 PDT
Created attachment 337271 [details]
Patch
Comment 4 Said Abou-Hallawa 2018-04-05 09:59:10 PDT
Comment on attachment 337246 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=337246&action=review

>> Source/WebCore/platform/graphics/FontCascade.cpp:377
>> +float FontCascade::widthForSimpleText(const StringView& text) const
> 
> This is not necessary. StringView is a small object and should be passed by value.

Ok. Done.

>> Source/WebCore/platform/graphics/TextRun.h:103
>> +        m_string = emptyString();
> 
> I do not understand the need for TextRun to hold both a non-owning and owning String. It should either hold a non-owned string, StringView, or hold a String, not both.

Yes this is correct. TextRun will either (1) own String or (2) hold a non-owned string StringView.

-- In case (1), TextRun owns a String, m_stringView will encapsulate the owned String. This will make the code cleaner and easier to read since all the methods will operate on m_stringView.
-- In case (2), TextRun holds a non-owned string StringView, m_stringView will be a copy of this StringView. TextRun::m_string should be empty and should never be used in functions like TextRun::string().

This allows code like this:

auto text = this->text();
TextRun textRun = createTextRun(text);

To be replaced by this:

TextRun textRun = createTextRun(text());

We moved the ownership of the underlying of the String from the caller side to the TextRun itself. This will only happen when the caller passes a String instead of a StringView.
Comment 5 Daniel Bates 2018-04-05 10:26:13 PDT
(In reply to Said Abou-Hallawa from comment #4)
> Comment on attachment 337246 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=337246&action=review
> 
> >> Source/WebCore/platform/graphics/FontCascade.cpp:377
> >> +float FontCascade::widthForSimpleText(const StringView& text) const
> > 
> > This is not necessary. StringView is a small object and should be passed by value.
> 
> Ok. Done.
> 
> >> Source/WebCore/platform/graphics/TextRun.h:103
> >> +        m_string = emptyString();
> > 
> > I do not understand the need for TextRun to hold both a non-owning and owning String. It should either hold a non-owned string, StringView, or hold a String, not both.
> 
> Yes this is correct. TextRun will either (1) own String or (2) hold a
> non-owned string StringView.
> 
> -- In case (1), TextRun owns a String, m_stringView will encapsulate the
> owned String. This will make the code cleaner and easier to read since all
> the methods will operate on m_stringView.
> -- In case (2), TextRun holds a non-owned string StringView, m_stringView
> will be a copy of this StringView. TextRun::m_string should be empty and
> should never be used in functions like TextRun::string().
> 

It seems sufficient for TextRun to hold a String. In case 1, holding both means we hold two pointers to the same StringImpl. This seems unnecessary and I would be surprised if there is any efficiency gained in case 2 by avoiding a ref/deref.
Comment 6 zalan 2018-04-05 10:27:59 PDT
(In reply to Daniel Bates from comment #5)
> (In reply to Said Abou-Hallawa from comment #4)
> > Comment on attachment 337246 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=337246&action=review
> > 
> > >> Source/WebCore/platform/graphics/FontCascade.cpp:377
> > >> +float FontCascade::widthForSimpleText(const StringView& text) const
> > > 
> > > This is not necessary. StringView is a small object and should be passed by value.
> > 
> > Ok. Done.
> > 
> > >> Source/WebCore/platform/graphics/TextRun.h:103
> > >> +        m_string = emptyString();
> > > 
> > > I do not understand the need for TextRun to hold both a non-owning and owning String. It should either hold a non-owned string, StringView, or hold a String, not both.
> > 
> > Yes this is correct. TextRun will either (1) own String or (2) hold a
> > non-owned string StringView.
> > 
> > -- In case (1), TextRun owns a String, m_stringView will encapsulate the
> > owned String. This will make the code cleaner and easier to read since all
> > the methods will operate on m_stringView.
> > -- In case (2), TextRun holds a non-owned string StringView, m_stringView
> > will be a copy of this StringView. TextRun::m_string should be empty and
> > should never be used in functions like TextRun::string().
> > 
> 
> It seems sufficient for TextRun to hold a String. In case 1, holding both
> means we hold two pointers to the same StringImpl. This seems unnecessary
> and I would be surprised if there is any efficiency gained in case 2 by
> avoiding a ref/deref.
I very much agree with this.
Comment 7 Said Abou-Hallawa 2018-04-05 12:49:06 PDT
Created attachment 337291 [details]
Patch
Comment 8 Said Abou-Hallawa 2018-04-05 13:54:39 PDT
(In reply to Daniel Bates from comment #5)
> (In reply to Said Abou-Hallawa from comment #4)
> > Comment on attachment 337246 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=337246&action=review
> > 
> > >> Source/WebCore/platform/graphics/FontCascade.cpp:377
> > >> +float FontCascade::widthForSimpleText(const StringView& text) const
> > > 
> > > This is not necessary. StringView is a small object and should be passed by value.
> > 
> > Ok. Done.
> > 
> > >> Source/WebCore/platform/graphics/TextRun.h:103
> > >> +        m_string = emptyString();
> > > 
> > > I do not understand the need for TextRun to hold both a non-owning and owning String. It should either hold a non-owned string, StringView, or hold a String, not both.
> > 
> > Yes this is correct. TextRun will either (1) own String or (2) hold a
> > non-owned string StringView.
> > 
> > -- In case (1), TextRun owns a String, m_stringView will encapsulate the
> > owned String. This will make the code cleaner and easier to read since all
> > the methods will operate on m_stringView.
> > -- In case (2), TextRun holds a non-owned string StringView, m_stringView
> > will be a copy of this StringView. TextRun::m_string should be empty and
> > should never be used in functions like TextRun::string().
> > 
> 
> It seems sufficient for TextRun to hold a String. In case 1, holding both
> means we hold two pointers to the same StringImpl. This seems unnecessary
> and I would be surprised if there is any efficiency gained in case 2 by
> avoiding a ref/deref.

My understanding was TextRun was changed such that it holds a StringView to avoid copying the underlying buffer of the String given that the caller guarantees the String outlives the TextRun. For instance GraphicsContext::drawBidiText() slices the passed TextRun into smaller sub-TextRuns without having to allocate and copy the text of each sub-TextRun.

This is why I kept the StringView member after adding a String member to the TextRun. But I've just learnt that StringView::toStringWithoutCopying() can solve this problem by creating a string which shares the same underlying buffer of the StringView and does not make a new copy of the buffer. So the returned String is effectively acts like a StringView in this case.

So I changed TextRun to hold only a String member which will be either a copy of another String or created by StringView::toStringWithoutCopying().
Comment 9 zalan 2018-04-05 14:02:32 PDT
Comment on attachment 337291 [details]
Patch

I'd prefer doing one thing per patch (not just because it's easier to review but also becasue the unrelated changes don't get lost if this gets rolled out).
Comment 10 Said Abou-Hallawa 2018-04-05 14:59:17 PDT
(In reply to zalan from comment #9)
> Comment on attachment 337291 [details]
> Patch
> 
> I'd prefer doing one thing per patch (not just because it's easier to review
> but also becasue the unrelated changes don't get lost if this gets rolled
> out).

What do you mean by "one thing per patch"? What are these things?

This patch is basically rolling back http://trac.webkit.org/changeset/174228 but with a better fix. TextRun was holding a character pointer and length. Using StringView was a nice approach but it introduced the underlying String lifetime issue and the caller has to do tricks like:

    auto text = this->text();
    TextRun textRun = createTextRun(text);

instead of just
    TextRun textRun = createTextRun(text());

The new patch uses String as a replacement instead of StringView giving that no underlying charter buffer will be ever copied which was the purpose of using the StringView in the first time.

I do not think this patch can be split more than that. It replaces a single member of TextRun and fixes the code to compile, link and work as expected without perf regression. The rest of the patch are clean-ups which are very easy to follow.
Comment 11 zalan 2018-04-05 15:25:58 PDT
Comment on attachment 337291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=337291&action=review

> Source/WebCore/ChangeLog:3
> +        Fix the lifetime of the TextRun

This title is misleading. You don't change the lifetime of the TextRun.

> Source/WebCore/platform/graphics/TextRun.h:69
> +        RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(startOffset < m_text.length());

This line has nothing to do with this patch. Please remove it.

> Source/WebCore/platform/graphics/TextRun.h:83
> -    UChar operator[](unsigned i) const { ASSERT_WITH_SECURITY_IMPLICATION(i < m_text.length()); return m_text[i]; }
> -    const LChar* data8(unsigned i) const { ASSERT_WITH_SECURITY_IMPLICATION(i < m_text.length()); ASSERT(is8Bit()); return &m_text.characters8()[i]; }
> -    const UChar* data16(unsigned i) const { ASSERT_WITH_SECURITY_IMPLICATION(i < m_text.length()); ASSERT(!is8Bit()); return &m_text.characters16()[i]; }
> +    UChar operator[](unsigned i) const { RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(i < m_text.length()); return m_text[i]; }
> +    const LChar* data8(unsigned i) const { RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(i < m_text.length()); ASSERT(is8Bit()); return &m_text.characters8()[i]; }
> +    const UChar* data16(unsigned i) const { RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(i < m_text.length()); ASSERT(!is8Bit()); return &m_text.characters16()[i]; }

These lines have nothing to do with this patch. Please remove them.
Comment 12 zalan 2018-04-05 15:34:53 PDT
> without perf regression. The rest of the patch are clean-ups which are very
> easy to follow.
I've wasted too much time looking at unrelated "easy to follow, clean-up" changes at regression points and I'd rather use that time for something else.
Comment 13 Daniel Bates 2018-04-05 19:22:41 PDT
Comment on attachment 337291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=337291&action=review

> Source/WebCore/rendering/InlineTextBox.cpp:1217
> +TextRun InlineTextBox::createTextRun(bool ignoreCombinedText, bool ignoreHypheng) const

ignoreHypheng => ignoreHyphen

> Source/WebCore/rendering/InlineTextBox.cpp:1220
> +    TextRun textRun { text(ignoreCombinedText, ignoreHypheng), textPos(), expansion(), expansionBehavior(), direction(), dirOverride() || style.rtlOrdering() == VisualOrder, !renderer().canUseSimpleFontCodePath() };

Ditto.

> Source/WebCore/rendering/SimpleLineLayout.cpp:184
> +            TextRun run(String(textRenderer.text()));

Is it necessary to explicitly call the String constructor?

> Source/WebCore/rendering/SimpleLineLayoutTextFragmentIterator.cpp:52
> +    , hyphenStringWidth(shouldHyphenate ? (useSimplifiedTextMeasuring ? font.widthForSimpleText(style.hyphenString()) : font.width(TextRun(String(style.hyphenString())))) : 0)

Is it necessary to explicitly call the String constructor?
Comment 14 Daniel Bates 2018-04-05 19:24:39 PDT
Comment on attachment 337291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=337291&action=review

>> Source/WebCore/platform/graphics/TextRun.h:69
>> +        RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(startOffset < m_text.length());
> 
> This line has nothing to do with this patch. Please remove it.

I agree with zalan, please make these changes in a separate bug.

>> Source/WebCore/platform/graphics/TextRun.h:83
>> +    const UChar* data16(unsigned i) const { RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(i < m_text.length()); ASSERT(!is8Bit()); return &m_text.characters16()[i]; }
> 
> These lines have nothing to do with this patch. Please remove them.

Ditto.
Comment 15 Daniel Bates 2018-04-05 19:28:41 PDT
Comment on attachment 337291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=337291&action=review

For your consideration, I suggest using uniform initializer syntax throughout this patch.

> Source/WebCore/platform/graphics/TextRun.h:91
> +    void setText(const LChar* c, unsigned len) { setText(StringView(c, len)); }

c  => text
len  => length

(Since we are touching this line anyway).

> Source/WebCore/platform/graphics/TextRun.h:92
> +    void setText(const UChar* c, unsigned len) { setText(StringView(c, len)); }

Ditto.

> Source/WebCore/platform/graphics/TextRun.h:93
> +    void setText(StringView stringView) { m_text = stringView.toStringWithoutCopying(); }

stringView => text

(Typically we name the argument of the setter to match the name of the setter and underlying instance variable)
Comment 16 Daniel Bates 2018-04-05 19:34:32 PDT
Comment on attachment 337291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=337291&action=review

> Source/WebCore/ChangeLog:19
> +           will make a copy of this string and assign it to m_text.

This is not accurate. The word “string” is usually interpreted as a character buffer. This is not the same as a String object. Copying of a String refs the underlying StringImpl. That is, we do not make a copy of the underlying character data.

> Source/WebCore/ChangeLog:23
> +           will not use  StringView::toString(). Instead We will use

“Instead We”  => “Instead we”

The last sentence does not read well.
Comment 17 Said Abou-Hallawa 2018-04-09 10:53:05 PDT
Comment on attachment 337291 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=337291&action=review

>> Source/WebCore/ChangeLog:3
>> +        Fix the lifetime of the TextRun
> 
> This title is misleading. You don't change the lifetime of the TextRun.

Title was changed to "Make InlineTextBox::createTextRun() take a const lvalue reference String"

>> Source/WebCore/ChangeLog:19
>> +           will make a copy of this string and assign it to m_text.
> 
> This is not accurate. The word “string” is usually interpreted as a character buffer. This is not the same as a String object. Copying of a String refs the underlying StringImpl. That is, we do not make a copy of the underlying character data.

This was changed to "This constructor will addRef the underlying StringImpl when assigning it to m_text".

>> Source/WebCore/ChangeLog:23
>> +           will not use  StringView::toString(). Instead We will use
> 
> “Instead We”  => “Instead we”
> 
> The last sentence does not read well.

Fixed.

>>> Source/WebCore/platform/graphics/TextRun.h:69
>>> +        RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(startOffset < m_text.length());
>> 
>> This line has nothing to do with this patch. Please remove it.
> 
> I agree with zalan, please make these changes in a separate bug.

Removed.

>>> Source/WebCore/platform/graphics/TextRun.h:83
>>> +    const UChar* data16(unsigned i) const { RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(i < m_text.length()); ASSERT(!is8Bit()); return &m_text.characters16()[i]; }
>> 
>> These lines have nothing to do with this patch. Please remove them.
> 
> Ditto.

Removed.

>> Source/WebCore/platform/graphics/TextRun.h:91
>> +    void setText(const LChar* c, unsigned len) { setText(StringView(c, len)); }
> 
> c  => text
> len  => length
> 
> (Since we are touching this line anyway).

Fixed.

>> Source/WebCore/platform/graphics/TextRun.h:92
>> +    void setText(const UChar* c, unsigned len) { setText(StringView(c, len)); }
> 
> Ditto.

Fixed.

>> Source/WebCore/platform/graphics/TextRun.h:93
>> +    void setText(StringView stringView) { m_text = stringView.toStringWithoutCopying(); }
> 
> stringView => text
> 
> (Typically we name the argument of the setter to match the name of the setter and underlying instance variable)

Fixed.

>> Source/WebCore/rendering/InlineTextBox.cpp:1217
>> +TextRun InlineTextBox::createTextRun(bool ignoreCombinedText, bool ignoreHypheng) const
> 
> ignoreHypheng => ignoreHyphen

Fixed.

>> Source/WebCore/rendering/InlineTextBox.cpp:1220
>> +    TextRun textRun { text(ignoreCombinedText, ignoreHypheng), textPos(), expansion(), expansionBehavior(), direction(), dirOverride() || style.rtlOrdering() == VisualOrder, !renderer().canUseSimpleFontCodePath() };
> 
> Ditto.

Fixed.

>> Source/WebCore/rendering/SimpleLineLayout.cpp:184
>> +            TextRun run(String(textRenderer.text()));
> 
> Is it necessary to explicitly call the String constructor?

Yes. Because RenderText::text() returns StringImpl. Since both String and StringView can be implicitly constructed given a StringImpl, the compiler wants us to be more explicit about which constructor of TextRun to be call.

>> Source/WebCore/rendering/SimpleLineLayoutTextFragmentIterator.cpp:52
>> +    , hyphenStringWidth(shouldHyphenate ? (useSimplifiedTextMeasuring ? font.widthForSimpleText(style.hyphenString()) : font.width(TextRun(String(style.hyphenString())))) : 0)
> 
> Is it necessary to explicitly call the String constructor?

Yes. Because RenderStyle::hyphenString() returns an AtomicString.
Comment 18 Said Abou-Hallawa 2018-04-09 10:54:30 PDT
Created attachment 337508 [details]
Patch
Comment 19 WebKit Commit Bot 2018-04-09 16:02:34 PDT
Comment on attachment 337508 [details]
Patch

Clearing flags on attachment: 337508

Committed r230452: <https://trac.webkit.org/changeset/230452>
Comment 20 WebKit Commit Bot 2018-04-09 16:02:36 PDT
All reviewed patches have been landed.  Closing bug.
Comment 21 Radar WebKit Bug Importer 2018-04-09 16:03:48 PDT
<rdar://problem/39296994>