Bug 12833 - REGRESSION: Selecting text in 6.6MB txt file is sluggish as of the Feb 19th nightly
Summary: REGRESSION: Selecting text in 6.6MB txt file is sluggish as of the Feb 19th n...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Text (show other bugs)
Version: 420+
Hardware: Mac OS X 10.4
: P1 Normal
Assignee: Antti Koivisto
URL: http://dscoder.com/test.txt
Keywords: InRadar, Regression
Depends on:
Blocks:
 
Reported: 2007-02-20 15:49 PST by David Smith
Modified: 2011-03-07 13:09 PST (History)
4 users (show)

See Also:


Attachments
improve selection performance (20.55 KB, patch)
2007-03-08 04:37 PST, Antti Koivisto
koivisto: review-
Details | Formatted Diff | Diff
updated patch (22.58 KB, patch)
2007-03-09 09:15 PST, Antti Koivisto
no flags Details | Formatted Diff | Diff
Divide large text blocks (>64kB) over multiple text nodes (3.99 KB, patch)
2007-03-12 11:33 PDT, Antti Koivisto
ap: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description David Smith 2007-02-20 15:49:32 PST
The impression I'm getting is that the speed is directly related to how far down the document the selection is. Try scrolling to the bottom to see it more clearly.
Comment 1 mitz 2007-02-20 16:00:53 PST
Wondering if r19734 changed anything in this regard.
Comment 2 mitz 2007-02-20 16:26:43 PST
(In reply to comment #1)
> Wondering if r19734 changed anything in this regard.
> 

Apparently it didn't.

One thing that my be slowing things down is that if the document is loaded in multiple chunks, then it turns into multiple text nodes. The parser could be taught how to append to a text node.
Comment 3 mitz 2007-02-20 23:43:30 PST
A lot of time seems to be spent iterating over inline boxes trying to find the one containing a given offset, e.g. in RenderText::inlineBox().
Comment 4 Maciej Stachowiak 2007-02-27 19:35:20 PST
<rdar://problem/5028159>
Comment 5 Antti Koivisto 2007-03-08 04:37:45 PST
Created attachment 13540 [details]
improve selection performance

This patch improved time it takes to do SelectionController::setSelection() at the end of http://dscoder.com/test.txt from 200ms range to around 20ms. Unfortunately it is still more sluggish than Tiger WebKit but at least it is sort-of usable (selection is quite a bit faster in release build than debug build too).

Basic idea is to do way less line box traversing than before. It is also possible that some of the stuff editing code is doing during selections is actually unnecessary, I can't really tell. This patch doesn't change anything in that respect.
Comment 6 mitz 2007-03-08 05:16:59 PST
Comment on attachment 13540 [details]
improve selection performance

textBoxesCoverAllPositions() doesn't necessarily give the right answer for renderers containing reversed text (it may return false when the text boxes do in fact cover all positions). I didn't check how the answer is used so I don't know whether it's okay to return true in this case or whether the question is really "do the text boxes cover all positions in logical order?"
Comment 7 mitz 2007-03-08 11:25:03 PST
Comment on attachment 13540 [details]
improve selection performance

+                textOffset > 0 && textOffset <= textRenderer->lastTextBox()->start() + textRenderer->lastTextBox()->end())

Are you sure start() + end() is correct?
Comment 8 Antti Koivisto 2007-03-08 12:11:33 PST
(In reply to comment #6)
> (From update of attachment 13540 [details] [edit])
> textBoxesCoverAllPositions() doesn't necessarily give the right answer for
> renderers containing reversed text (it may return false when the text boxes do
> in fact cover all positions). I didn't check how the answer is used so I don't
> know whether it's okay to return true in this case or whether the question is
> really "do the text boxes cover all positions in logical order?"
> 

Correct, it doesn't optimize reversed text case. Doing that would also require also changes in the caller. Maybe adding a FIXME is enough...
Comment 9 Antti Koivisto 2007-03-08 12:12:18 PST
(In reply to comment #7)
> (From update of attachment 13540 [details] [edit])
> +                textOffset > 0 && textOffset <=
> textRenderer->lastTextBox()->start() + textRenderer->lastTextBox()->end())
> 
> Are you sure start() + end() is correct?
> 

Ooops. Good catch.
Comment 10 Antti Koivisto 2007-03-08 13:12:58 PST
Comment on attachment 13540 [details]
improve selection performance

clearing review flag, i'm updating the patch
Comment 11 Antti Koivisto 2007-03-09 09:15:30 PST
Created attachment 13560 [details]
updated patch

- rename textBoxesCoverAllPositions() to textBoxesCoverAllPositionsInOrder()
- use textBoxesCoverAllPositionsInOrder() optimization in more places
- faster VisiblePosition construction
- fix issue pointed out by Mitz
- cleanups
Comment 12 Darin Adler 2007-03-09 09:24:46 PST
Comment on attachment 13560 [details]
updated patch

+                    return currentPos;
+                else
+                    continue;

We normally don't do "else after return".

-    DeprecatedString string;
+    String string;

String has pathologically slow behavior when you resize it that DeprecatedString does not; it does a realloc every time you make it bigger. Instead you should probably use a Vector<UChar>.
Comment 13 Justin Garcia 2007-03-09 16:18:38 PST
Comment on attachment 13560 [details]
updated patch

From VisiblePosition::init():

     // When not at a line wrap, make sure to end up with DOWNSTREAM affinity.
-    if (m_affinity == UPSTREAM && (isNull() || inSameLine(VisiblePosition(position, DOWNSTREAM), *this)))
-        m_affinity = DOWNSTREAM;
+    if (m_affinity == UPSTREAM) {
+        VisiblePosition downstream(*this);
+        downstream.m_affinity = DOWNSTREAM;
+        if (isNull() || inSameLine(downstream, *this))
+            m_affinity = DOWNSTREAM;
+    }

There's a FIXME in canonicalPosition about how it will eventually need to take in the affinity and use it to find the canonical position to fix a bug (9535).  At that time we'll have to remove the above optimization. I'd add a note and mention the bug number.
Comment 14 Antti Koivisto 2007-03-12 11:33:55 PDT
Created attachment 13599 [details]
Divide large text blocks (>64kB) over multiple text nodes

Instead of trying to make editing code efficient (which would be nice, but is bit risky at this point) the problem can be more or less completely hidden by dividing large text blocks over multiple text nodes. Shorter linebox lists are faster to search.

Dividing is done for both text and html documents. Firefox seems to be dividing large text nodes too (in html).
Comment 15 Alexey Proskuryakov 2007-03-12 13:01:01 PDT
Comment on attachment 13599 [details]
Divide large text blocks (>64kB) over multiple text nodes

Discussed with Antti over IRC - according to his testing, this doesn't properly handle decomposed characters that happen to be on the boundary. I believe this needs to be corrected first.

See a test case at <http://nypop.com/~ap/webkit/large-text.html>.
Comment 16 Alexey Proskuryakov 2007-03-12 14:09:24 PDT
r=me with changes discussed in detail over IRC (using CharacterBreakIterator::textBreakPreceding, adding a layout test), assuming all tests pass.
Comment 17 Antti Koivisto 2007-03-12 17:14:15 PDT
Per Alexey's comments commited as r20127, with layout test for composed characters. 
Comment 18 Alexey Proskuryakov 2011-03-07 13:09:23 PST
This has regressed again - I've filed bug 55898 for that.