Bug 43987

Summary: Downloading using XHR is much slower than before
Product: WebKit Reporter: Behrooz <behrooz.noorizadeh>
Component: JavaScriptCoreAssignee: Gavin Barraclough <barraclough>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, ademar, ap, barraclough, dave+webkit, dglazkov, emil.zimmermann, eric, ggaren, hausmann, kamaji, kling, nancy.piedra, s.mathur, suresh.voruganti, webkit.review.bot, wellu.makinen
Priority: P1 Keywords: InRadar, Regression
Version: 528+ (Nightly build)   
Hardware: Other   
OS: Linux   
Attachments:
Description Flags
Proposed patch, first stab
barraclough: review-
First cut, switch XHR to use StringBuilder
none
Ooops, ignore the WTF changes from the last patch, probably not how we'd do this!
none
[PART I] Allow fast inspection of intermediate states of a StringBuilder.
oliver: review+
[PART II] Switch XMLHttpRequest, FileReader, and FileReaderSync to use a Stringbuilder oliver: review+

Description Behrooz 2010-08-13 12:06:15 PDT
On a simple test case, I tried to download a 5Mb file using XHR request. The download time on the latest WebKit is much more than the time I tried it a few months ago. Looking into traces, it seems most of the time is spent in string concatenation:

WebCore::ScriptString::operator+=(WebCore::String const&)

I think, this issue is introduced since the introduction of the new JS parser. Specifically, JavaScriptCore/wtf/Vector.h has templates for AlignedBuffer with the size of 1 byte to 64 bytes. 64 bytes is too small for downloading big chunks of data. Basically, what is happening is that we keep expanding the vector by only 64 bytes many times and thats why the ScriptString::operator+ is so slow.
Comment 1 Geoffrey Garen 2010-08-16 10:57:18 PDT
(In reply to comment #0)

Thanks for spotting this.

> I think, this issue is introduced since the introduction of the new JS parser. Specifically, JavaScriptCore/wtf/Vector.h has templates for AlignedBuffer with the size of 1 byte to 64 bytes. 64 bytes is too small for downloading big chunks of data. Basically, what is happening is that we keep expanding the vector by only 64 bytes many times and thats why the ScriptString::operator+ is so slow.

I don't think Vector is the problem here. 64 bytes is the initial, inline capacity of the vector, not its growth policy. Vectors use a 1.25X growth policy. See Vector<T, inlineCapacity>::expandCapacity().

The problem here is that ScriptString creates a new vector for each append operation, instead of using a persistent vector a rope.
Comment 2 Geoffrey Garen 2010-08-16 10:57:52 PDT
<rdar://problem/8314324>
Comment 3 Geoffrey Garen 2010-08-16 11:00:17 PDT
http://trac.webkit.org/changeset/52329

This was caused by string refactoring, which removed UString's built-in rope-ish behavior.
Comment 4 emil zimmermann 2010-09-09 14:16:14 PDT
Can you give us an estimate, when the fix could be coming?
Comment 5 Gavin Barraclough 2010-09-10 14:30:54 PDT
I hope to find time to look at this within the next few weeks.
Comment 6 emil zimmermann 2010-09-16 14:50:53 PDT
Any chance you can do that sooner? this is becoming quite important. thanks
Comment 7 Gavin Barraclough 2010-09-16 14:53:30 PDT
I'm afraid I have a busy schedule right now, but feel free to look into this bug yourself. :-)
Comment 8 emil zimmermann 2010-09-16 14:55:15 PDT
(In reply to comment #7)
> I'm afraid I have a busy schedule right now, but feel free to look into this bug yourself. :-)

It is NOT my job I am afraid. Take your time...
Comment 9 Andreas Kling 2010-10-04 10:25:00 PDT
Created attachment 69648 [details]
Proposed patch, first stab

Note that this patch doesn't address the problem on V8 - I'm mostly interested in whether this approach makes sense, or if we should be putting the rope logic in XMLHttpRequest.{cpp,h} instead.
Comment 10 Gavin Barraclough 2010-10-04 16:02:04 PDT
Let me comment on this patch tomorrow. May be able to do something to support v8 too using overcapacity on StringImpls. Would say more but plane about to take off.
G.
Comment 11 Gavin Barraclough 2010-10-08 13:46:00 PDT
Comment on attachment 69648 [details]
Proposed patch, first stab

Okay, I don't think this is quite how we want to fix this.  Fundamentally this patch is doing the right thing, in that we should be using a smarter string building layer rather that resolving a fresh string every time new data is available, but it's doing it in slightly the wrong way:

(1) I think this patch is fixing the problem in the wrong place.  We don't want to be adding more code to ScriptString - in fact, we don't really want ScriptString to exist at all! (this class only existed because XMLHttpRequest used to hold a UString, which wasn't great from design perspective in the first place, and clearly isn't ).  Strings in WebCore are by design immutable, so XMLHttpRequest holding onto a string and appending to it is not the correct approach.  The place this problem needs to be fixed is in XMLHttpRequest.  (Moreover, all uses of ScriptString are now wrong.  XMLHttpRequest only held a StringString under the assumption this can be used for fast concatenation, which is no longer (necessarily) the case – as demonstrated by this bug.  New code in fileapi is currently using ScriptString to mimic XMLHttpRequest, and as such will also be mimicking this O(N^2) performance too.  Whatever fix is made for this bug should also be applied to FileReader/FileReaderSync - and having done so StringString will be redundant.)

(2) We shouldn't be adding a new type to build Strings – we already have StringBuilder.  XMLHttpRequest should just use a StringBuilder.  If StringBuilder does not have the performance characteristics we require, we should fix it ...

(3) ... and StringBuilder probably doesn't have the performance characteristics we require. :-)  Building the string up as a rope saves resolving the string on every append, but accesses to the responseText string while data is being loaded will still result in a copy to resolve, and as such repeated calls to access the responseText could still result in repeated copying of the character data in the String.  Instead we should probably look at using a allocating a string buffer with overcapacity, such that appends can write into the existing buffer.  We should be able support with using the existing createUninitialized & substring mechanisms supported by WTF StringImpls.  Using a 2x overcapacity should make string construction O(N), with zero cost to resolve.

I have a patch for (1) and (2), will attach, and most of the code written for (3) too.

G.
Comment 12 Gavin Barraclough 2010-10-08 13:49:03 PDT
Created attachment 70283 [details]
First cut, switch XHR to use StringBuilder

Couple of rough edges, bit this patch should be basically good.  Going to finish cleaning up & take a stab at an improved StringBuilder.
Comment 13 Gavin Barraclough 2010-10-08 13:52:27 PDT
Created attachment 70284 [details]
Ooops, ignore the WTF changes from the last patch, probably not how we'd do this!
Comment 14 Gavin Barraclough 2010-10-13 11:10:12 PDT
Created attachment 70632 [details]
[PART I] Allow fast inspection of intermediate states of a StringBuilder.
Comment 15 Gavin Barraclough 2010-10-13 11:27:44 PDT
Created attachment 70635 [details]
[PART II] Switch XMLHttpRequest, FileReader, and FileReaderSync to use a Stringbuilder

No performance impact on SunSpider, Dromaeo DOM Core / CSS Selectors.
Comment 16 Oliver Hunt 2010-10-13 11:37:01 PDT
Comment on attachment 70632 [details]
[PART I] Allow fast inspection of intermediate states of a StringBuilder.

I stab at you and your size vs. length renames :P
Comment 17 Oliver Hunt 2010-10-13 11:39:31 PDT
Comment on attachment 70635 [details]
[PART II] Switch XMLHttpRequest, FileReader, and FileReaderSync to use a Stringbuilder

r=me
Comment 18 WebKit Review Bot 2010-10-13 11:50:43 PDT
Attachment 70632 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/4391009
Comment 19 Gavin Barraclough 2010-10-13 13:11:04 PDT
Fixed in r69683 + r69688.
Comment 20 WebKit Review Bot 2010-10-13 14:06:40 PDT
http://trac.webkit.org/changeset/69688 might have broken Chromium Mac Release
Comment 21 Suresh Voruganti 2010-10-26 10:57:24 PDT
Nancy, Were you able to port the patch to Qtwebkit 2.1 branch?
Comment 22 Nancy Piedra 2010-10-26 15:32:33 PDT
We've been working for over a week to try to port this to webkit 2.1.  The changes are far reaching and I'm not sure it will be possible to port.  Will spend a few more days.
Comment 23 Keith Kyzivat 2010-10-27 10:39:46 PDT
I have been unable to port the changes, and I think I'm going to give up at this point.

I have also been unable to really determine that there is a significant performance impact without this fix.  In theory it sounds like there'd be a significant performance gain, but I'm not able to see that from tests on simple, large xhrs I've tried, when testing trunk before and after these fixes.  Tested using ~5MB XHR content on x86 Linux, relatively new Intel hardware.
Comment 24 David Tapuska 2010-10-27 13:22:24 PDT
(In reply to comment #23)
> I have been unable to port the changes, and I think I'm going to give up at this point.
> 
> I have also been unable to really determine that there is a significant performance impact without this fix.  In theory it sounds like there'd be a significant performance gain, but I'm not able to see that from tests on simple, large xhrs I've tried, when testing trunk before and after these fixes.  Tested using ~5MB XHR content on x86 Linux, relatively new Intel hardware.

This was a page I initially noticed:
http://i.dslr.net/tinyspeedtest.html
Comment 25 Wellu Mäkinen 2010-10-27 22:26:24 PDT
(In reply to comment #23)
> I have been unable to port the changes, and I think I'm going to give up at this point.
> 
> I have also been unable to really determine that there is a significant performance impact without this fix.  In theory it sounds like there'd be a significant performance gain, but I'm not able to see that from tests on simple, large xhrs I've tried, when testing trunk before and after these fixes.  Tested using ~5MB XHR content on x86 Linux, relatively new Intel hardware.

If you use lower end hardware you'll see the impact. I guess this won't happen on desktop environment where you have plenty of processing power and memory.

Anyway, there really is a bug in the implementation and it manifests under certain conditions as very slow downloads.
Comment 26 Siddharth Mathur 2010-10-28 06:15:09 PDT
*** Bug 46746 has been marked as a duplicate of this bug. ***
Comment 27 Keith Kyzivat 2010-11-04 07:08:29 PDT
Ademar is going to apply the original proposed patch Andreas Kling provided for this bug to QtWebkit 2.1 -- it applies much more easily to QtWebkit 2.1.  Stan has verified this fixes XHR crashes on some Symbian devices with XHR requests of 5MB and 10MB -- which had previously failed (See bug 46746).  This also should improve performance.
Comment 28 Ademar Reis 2010-11-08 06:24:35 PST
A backported version of the original patch by Andreas Kling was pushed to QtWebkit-2.1 (http://gitorious.org/+qtwebkit-developers/webkit/qtwebkit/commit/20290f01a54b739d0591979f361b99bf13c2f40a).