Bug 67079 - Replace usages of Vector<UChar> with existing StringBuilder
Summary: Replace usages of Vector<UChar> with existing StringBuilder
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 66661
  Show dependency treegraph
 
Reported: 2011-08-26 20:42 PDT by Xianzhu Wang
Modified: 2013-08-09 11:19 PDT (History)
17 users (show)

See Also:


Attachments
patch (113.11 KB, patch)
2011-08-29 01:24 PDT, Xianzhu Wang
gustavo.noronha: commit-queue-
Details | Formatted Diff | Diff
patch v2 (104.87 KB, patch)
2011-08-29 02:03 PDT, Xianzhu Wang
barraclough: review-
Details | Formatted Diff | Diff
patch v3 (104.74 KB, patch)
2011-08-29 21:38 PDT, Xianzhu Wang
no flags Details | Formatted Diff | Diff
patch v4 (small change to StringBuilder::characters()) (104.81 KB, patch)
2011-08-31 03:57 PDT, Xianzhu Wang
barraclough: review+
webkit.review.bot: commit-queue-
Details | Formatted Diff | Diff
patch resolved conflicts (106.73 KB, patch)
2011-09-06 21:09 PDT, Xianzhu Wang
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Xianzhu Wang 2011-08-26 20:42:50 PDT
This is the first step for bug 66661.
Comment 1 Xianzhu Wang 2011-08-29 01:24:30 PDT
Created attachment 105473 [details]
patch

The patch is big, but most of the modifications are straight-forward.
StringBuilder is slightly modified:
1) Added StringBuilder::characters() because originally Vector<UChar>::data() was widely used;
2) Added a minimum size for buffer to match Vector (though further performance investigations/optimizations should be done in https://bugs.webkit.org/show_bug.cgi?id=67084).
Comment 2 Collabora GTK+ EWS bot 2011-08-29 01:40:25 PDT
Comment on attachment 105473 [details]
patch

Attachment 105473 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/9558226
Comment 3 WebKit Review Bot 2011-08-29 01:41:58 PDT
Comment on attachment 105473 [details]
patch

Attachment 105473 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9566049
Comment 4 Xianzhu Wang 2011-08-29 01:47:14 PDT
Comment on attachment 105473 [details]
patch

Please ignore this patch. Will upload a new patch soon.
Comment 5 Gyuyoung Kim 2011-08-29 01:48:12 PDT
Comment on attachment 105473 [details]
patch

Attachment 105473 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/9565086
Comment 6 Early Warning System Bot 2011-08-29 01:50:58 PDT
Comment on attachment 105473 [details]
patch

Attachment 105473 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/9561112
Comment 7 Xianzhu Wang 2011-08-29 02:03:09 PDT
Created attachment 105476 [details]
patch v2
Comment 8 Gavin Barraclough 2011-08-29 20:30:34 PDT
Comment on attachment 105476 [details]
patch v2

View in context: https://bugs.webkit.org/attachment.cgi?id=105476&action=review

This patch looks good, would be great to be able to remove the String::adopt method, do you know how far this takes us in that direction?
r- for a couple of small issues, but generally looks fine to me.

> Source/WebCore/platform/text/TextStream.cpp:87
>      size_t stringLength = strlen(string);

I think this function should just be
{ m_text.append(string); return *this; }

(If the StringBuilder needs to grow the buffer it should be checking for overflow itself).

> Source/WebCore/storage/IDBLevelDBCoding.cpp:275
> +        result.append(static_cast<UChar>((hi << 8) | lo));

Errrrk! – this looks like a(n existing) bug!
'hi' is an unsigned char, but is left shifted by 8.
You can probably fix this like this:

result.append((static_cast<UChar>(hi) << 8) | lo);

> Source/WebCore/xml/parser/CharacterReferenceParserInlineMethods.h:47
> +        source.prepend(SegmentedString(String(consumedCharacters.characters(), consumedCharacters.length())));

To get the current contents of a StringBuilder as a String you could call
consumedCharacters.toString()
(or maybe consumedCharacters.toStringPreserveCapacity() )
instead of "String(consumedCharacters.characters(), consumedCharacters.length())".
Comment 9 Darin Adler 2011-08-29 20:35:24 PDT
Comment on attachment 105476 [details]
patch v2

View in context: https://bugs.webkit.org/attachment.cgi?id=105476&action=review

>> Source/WebCore/storage/IDBLevelDBCoding.cpp:275
>> +        result.append(static_cast<UChar>((hi << 8) | lo));
> 
> Errrrk! – this looks like a(n existing) bug!
> 'hi' is an unsigned char, but is left shifted by 8.
> You can probably fix this like this:
> 
> result.append((static_cast<UChar>(hi) << 8) | lo);

This is not a bug. Operators on integer types smaller than int/unsigned are promoted to int/unsigned. The standard calls this integral promotions. So the code is fine.
Comment 10 Xianzhu Wang 2011-08-29 21:38:28 PDT
Created attachment 105574 [details]
patch v3

Thanks Gavin and Darin for review!

> This patch looks good, would be great to be able to remove the String::adopt method, do you know how far this takes us in that direction?
I think this patch has replaced more than 80% Vector<UChar> usages and almost all String::adopt usages. The remaining places depend on other bugs blocking bug 66661 and will be fixed soon.

>> Source/WebCore/platform/text/TextStream.cpp:87
>>      size_t stringLength = strlen(string);
> 
> I think this function should just be
> { m_text.append(string); return *this; }
> 
> (If the StringBuilder needs to grow the buffer it should be checking for overflow itself).

Done.

>> Source/WebCore/xml/parser/CharacterReferenceParserInlineMethods.h:47
>> +        source.prepend(SegmentedString(String(consumedCharacters.characters(), consumedCharacters.length())));
> 
> To get the current contents of a StringBuilder as a String you could call
> consumedCharacters.toString()
> (or maybe consumedCharacters.toStringPreserveCapacity() )
> instead of "String(consumedCharacters.characters(), consumedCharacters.length())".

For now both toString() and toStringPreserveCapacity() alters the internal state and are non-const. They can't be used here because consumedCharacters is const.
My plan is to make toStringPreserveCapacity() const (also make StringBuilder::m_string mutable). This will be done in bug 67081.
Comment 11 Xianzhu Wang 2011-08-31 03:57:17 PDT
Created attachment 105766 [details]
patch v4 (small change to StringBuilder::characters())
Comment 12 Ryosuke Niwa 2011-08-31 21:41:48 PDT
Comment on attachment 105766 [details]
patch v4 (small change to StringBuilder::characters())

r=me on MarkupAccumulator.h/cpp and markup.cpp changes.
Comment 13 Xianzhu Wang 2011-09-06 18:54:47 PDT
Hi Darin, Gavin, does the latest patch look good to you?
Comment 14 Gavin Barraclough 2011-09-06 19:16:25 PDT
Comment on attachment 105766 [details]
patch v4 (small change to StringBuilder::characters())

Looks good to me.
Comment 15 WebKit Review Bot 2011-09-06 19:58:35 PDT
Comment on attachment 105766 [details]
patch v4 (small change to StringBuilder::characters())

Rejecting attachment 105766 [details] from commit-queue.

Failed to run "['/mnt/git/webkit-commit-queue/Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '-..." exit_code: 2

Last 500 characters of output:
ource/WebCore/xml/XPathUtil.cpp
patching file Source/WebCore/xml/XSLTProcessorLibxslt.cpp
patching file Source/WebCore/xml/parser/CharacterReferenceParserInlineMethods.h
patching file Source/WebCore/xml/parser/XMLCharacterReferenceParser.cpp
patching file Source/WebCore/xml/parser/XMLCharacterReferenceParser.h
patching file Source/WebCore/xml/parser/XMLTokenizer.cpp

Failed to run "[u'/mnt/git/webkit-commit-queue/Tools/Scripts/svn-apply', u'--reviewer', u'Gavin Barraclough', u'..." exit_code: 1

Full output: http://queues.webkit.org/results/9608022
Comment 16 Xianzhu Wang 2011-09-06 21:09:49 PDT
Created attachment 106540 [details]
patch resolved conflicts

Note of changes compared with the last patch:

1. StringBuilder.h: Still use StringImpl::characters() because of StringBuilder::characters() (was changed to use [] in bug 67340). As discussed in bug 66286 and bug 66706, eventually we'll use StringConstCharacterIterator as the return type of String::characters() (the function name may be changed).

2. To avoid conflict between WTF::double_conversion::StringBuilder and WTF::StringBuilder, added a typedef in NumberPrototype.cpp.
Comment 17 WebKit Review Bot 2011-09-06 22:42:16 PDT
Comment on attachment 106540 [details]
patch resolved conflicts

Clearing flags on attachment: 106540

Committed r94640: <http://trac.webkit.org/changeset/94640>
Comment 18 WebKit Review Bot 2011-09-06 22:42:24 PDT
All reviewed patches have been landed.  Closing bug.
Comment 19 Adam Barth 2013-08-09 10:21:03 PDT
Comment on attachment 106540 [details]
patch resolved conflicts

View in context: https://bugs.webkit.org/attachment.cgi?id=106540&action=review

> Source/WebCore/rendering/InlineTextBox.h:42
> -typedef Vector<UChar, 256> BufferForAppendingHyphen;
> +
> +class BufferForAppendingHyphen : public StringBuilder {
> +public:
> +    BufferForAppendingHyphen() { reserveCapacity(256); }
> +};

Pre-allocating 256 characters in a StringBuilder is not nearly as efficient as reserving 256 characters of inline capacity in a Vector.  This change caused https://code.google.com/p/chromium/issues/detail?id=270678.
Comment 20 Darin Adler 2013-08-09 11:19:21 PDT
(In reply to comment #19)
> (From update of attachment 106540 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=106540&action=review
> 
> > Source/WebCore/rendering/InlineTextBox.h:42
> > -typedef Vector<UChar, 256> BufferForAppendingHyphen;
> > +
> > +class BufferForAppendingHyphen : public StringBuilder {
> > +public:
> > +    BufferForAppendingHyphen() { reserveCapacity(256); }
> > +};
> 
> Pre-allocating 256 characters in a StringBuilder is not nearly as efficient as reserving 256 characters of inline capacity in a Vector.  This change caused https://code.google.com/p/chromium/issues/detail?id=270678.

Thanks for pointing that out. We should figure out what to do about this in WebKit.