Bug 29092

Summary: Performance slow when loading a large text html file on Symbian platform
Product: WebKit Reporter: Chang Shu <cshu>
Component: DOMAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, commit-queue, laszlo.gombos, mitz
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: S60 Hardware   
OS: S60 3rd edition   
Attachments:
Description Flags
fix patch
none
fix patch for the bug
ariya.hidayat: review-
patch with modified description
ap: review-
new patch after review
none
patch with comments added none

Description Chang Shu 2009-09-09 10:28:11 PDT
Loading a large text file on Symbian platform is slow at parsing stage. One improvement can be done is to limit the text scanning range in function Text::createWithLengthLimit(...) in Text.cpp.
Comment 1 Chang Shu 2009-09-09 10:34:17 PDT
Created attachment 39278 [details]
fix patch
Comment 2 Ariya Hidayat 2009-09-09 16:55:03 PDT
Comment on attachment 39278 [details]
fix patch


> +        Optimize the code so only the text from start to end is scaned.
> +        https://bugs.webkit.org/show_bug.cgi?id=29092

This text is too brief to provide a context of what kind of optimization is done by the commit.

Can you also confirm that no test is broken by the change?

Also, can you provide a comparison of performance before and after the change?


r- until these issues are addressed.
Comment 3 Chang Shu 2009-09-14 10:02:11 PDT
Created attachment 39550 [details]
fix patch for the bug

Found issues in previous patch during regression test. Regression test went through clean with new patch.
Comment 4 Ariya Hidayat 2009-09-16 06:08:32 PDT
Comment on attachment 39550 [details]
fix patch for the bug

> +        Optimize the code so only the text from start to end is scaned.

Typo: scanned.

> +        On a platform with webkit+Qt+Symbian, the parsing time for a 600K text
> +        file improved 90% and overall loading time improved 20%.

Could you use less ambiguous text? Improved by 90% means what?
Suggested: improved from 100 ms to 75 ms (1.33x faster).
Comment 5 Chang Shu 2009-09-16 06:22:37 PDT
Created attachment 39646 [details]
patch with modified description
Comment 6 mitz 2009-09-16 07:42:00 PDT
Comment on attachment 39646 [details]
patch with modified description

> Index: WebCore/ChangeLog
> ===================================================================
> --- WebCore/ChangeLog	(revision 48354)
> +++ WebCore/ChangeLog	(working copy)
> @@ -1,3 +1,19 @@
> +2009-09-14  Shu Chang  <Chang.Shu@nokia.com>
> +
> +        Reviewed by NOBODY (OOPS!).
> +
> +        Need a short description and bug URL (OOPS!)
> +
> +        Optimize the code so only the text from start to end is scaned.
> +        https://bugs.webkit.org/show_bug.cgi?id=29092
> +
> +        On a platform with webkit+Qt+Symbian, the parsing time for a 600K text
> +        file improved from 400ms to 40ms (10x faster).
> +        Ran WebKitTools/Scripts/run-webkit-tests, and no new problem was found.
> +
> +        * dom/Text.cpp:
> +        (WebCore::Text::createWithLengthLimit):
> +
>  2009-09-14  Yael Aharon  <yael.aharon@nokia.com>
>  
>          Reviewed by Tor Arne Vestbø.
> Index: WebCore/dom/Text.cpp
> ===================================================================
> --- WebCore/dom/Text.cpp	(revision 48354)
> +++ WebCore/dom/Text.cpp	(working copy)
> @@ -315,10 +315,12 @@ PassRefPtr<Text> Text::createWithLengthL
>      unsigned end = start + min(charsLeft, maxChars);
>      
>      // Check we are not on an unbreakable boundary.
> -    TextBreakIterator* it = characterBreakIterator(data.characters(), dataLength);
> -    if (end < dataLength && !isTextBreak(it, end))
> -        end = textBreakPreceding(it, end);
> -        
> +    if(end < dataLength) {

There’s a space missing after “if”.

> +        TextBreakIterator* it = characterBreakIterator(data.characters() + start, end - start + 1);
> +        if (!isTextBreak(it, end - start))

Are you sure one additional character is enough context?
Comment 7 Chang Shu 2009-09-16 08:17:21 PDT
Thanks for comment #6. I would think one extra character is enough for character breaking. However, I am not 100% sure if it is true about all different languages. To make it safe, more extra characters would be good and it won't affect performance. Any number could you suggest?
Comment 8 mitz 2009-09-16 09:19:15 PDT
I conjecture that supplying the entire remainder of the data to the iterator will have no effect on performance. I wonder if trimming the beginning has any significant effect. I would think that the dominant part of this patch is not instantiating the iterator and not calling it when end >= dataLength.
Comment 9 Chang Shu 2009-09-16 10:03:46 PDT
(In reply to comment #8)
> I conjecture that supplying the entire remainder of the data to the iterator
> will have no effect on performance. I wonder if trimming the beginning has
> any significant effect. 

It depends on lower level implementation. On Qt, calling characterBreakIterator() is very heavy with large texts. This is the major improvement with the patch.

> I would think that the dominant part of this patch is not
> instantiating the iterator and not calling it when end >= dataLength.

This saves time of processing the last piece of the text. However, it is not significant based on the test on a platform with Qt+Symbian.
Comment 10 Eric Seidel (no email) 2009-09-18 13:20:01 PDT
Comment on attachment 39278 [details]
fix patch

Clearing r+ on this obsolete patch.
Comment 11 Alexey Proskuryakov 2009-09-18 14:19:25 PDT
Comment on attachment 39646 [details]
patch with modified description

+        Optimize the code so only the text from start to end is scaned.

Typo: should be "scanned".

+    if(end < dataLength) {

Please add a space after "if".

+        TextBreakIterator* it = characterBreakIterator(data.characters() + start, end - start + 1);

How will this behave if a non-BMP character is coming next (one that takes two UTF-16 values)? Depending on the platform implementation of the iterator, it may not work well.

Seems that this needs at least "+ 2". I think that it would work then, but I'm also not 100% sure.

>     // If we have maxChars of unbreakable characters the above could lead to
>     // an infinite loop.

This comment needs to be updated, now that we rely on the code below in more cases.

What exactly gave the performance boost for Qt, trimming from the beginning, or from the end? If the latter is not necessary, I suggest passing the entire remainder of the text to iterator.
Comment 12 Chang Shu 2009-09-18 15:02:28 PDT
(In reply to comment #11)

Thanks for the review. Just a few comments below before next patch.

> Typo: should be "scanned".
> Please add a space after "if".

Will fix the typo and coding style in next patch.

> +        TextBreakIterator* it = characterBreakIterator(data.characters() +
> start, end - start + 1);
> How will this behave if a non-BMP character is coming next (one that takes two
> UTF-16 values)? Depending on the platform implementation of the iterator, it
> may not work well.
> Seems that this needs at least "+ 2". I think that it would work then, but I'm also not 100% sure.

To make it safe, I can use a much larger number. All I want to avoid is a huge length like 100000 or even larger.

> >     // If we have maxChars of unbreakable characters the above could lead to
> >     // an infinite loop.
> This comment needs to be updated, now that we rely on the code below in more
> cases.

Any suggestion on how to update this comment? I don't think my changes above changed the current over-all behavoir. However, if we want to change the behavior, such as search forward for the next break instead of return the whole buffer, my change will cause problem. And in this case, the suggestion below(using the entire remainder) is a better solution.

> What exactly gave the performance boost for Qt, trimming from the beginning, or
> from the end? If the latter is not necessary, I suggest passing the entire
> remainder of the text to iterator.

In Qt implementation, when a text is passed to function characterBreakIterator(), a corresponding flag buffer is allocated (each text char is associated with several flags). Then the text is scanned for several times and the flags are filled. For details, see WebCore/platform/text/qt/TextBreakIteratorQt.cpp in webkit and corelib/tools/qtextboundaryfinder.cpp in qt. Without the improvement, the whole text is passed to this function everytime a piece of 64k of text is chopped off. Say the text length is n*64K and the time in createWithLengthLimit() using entire text is T, then the overall time are roughly:
Original code: nT
My code: T
Using remainder of the text: nT/2.
But apparently, the last option is safe and has advantages in the future.
Comment 13 mitz 2009-09-18 15:05:06 PDT
(In reply to comment #11)
> Seems that this needs at least "+ 2". I think that it would work then, but I'm
> also not 100% sure.

Of course if you use any constant greater than 1 you need to make sure you don’t overshoot.
Comment 14 Alexey Proskuryakov 2009-09-21 10:33:01 PDT
> To make it safe, I can use a much larger number.

I don't think that would be a good thing to do. We know whay we may need to go to 2 (and it would be great if you could make a test case for that), but arbitrarily increasing the number will just make any potential issues more difficult to reproduce and correct.

> > >     // If we have maxChars of unbreakable characters the above could lead to
> > >     // an infinite loop.
> > This comment needs to be updated, now that we rely on the code below in more
> > cases.
> 
> Any suggestion on how to update this comment? I don't think my changes above
> changed the current over-all behavoir. 

You are right, I got confused. There is no need to change this comment.

> Original code: nT
> My code: T
> Using remainder of the text: nT/2.
> But apparently, the last option is safe and has advantages in the future.

It seems that for Qt, halving the time isn't really sufficient, is it?
Comment 15 Chang Shu 2009-09-21 10:45:54 PDT
(In reply to comment #14)
> > To make it safe, I can use a much larger number.
> I don't think that would be a good thing to do. We know whay we may need to go
> to 2 (and it would be great if you could make a test case for that), but
> arbitrarily increasing the number will just make any potential issues more
> difficult to reproduce and correct.

I will try to find out if a number larger than 2 is necesary.

> > 
> > Any suggestion on how to update this comment? I don't think my changes above
> > changed the current over-all behavoir. 
> You are right, I got confused. There is no need to change this comment.
> > Original code: nT
> > My code: T
> > Using remainder of the text: nT/2.
> > But apparently, the last option is safe and has advantages in the future.
> It seems that for Qt, halving the time isn't really sufficient, is it?

:) Would the patch be ok if I can find the right number above?
Comment 16 Chang Shu 2009-09-22 07:50:46 PDT
I did some study on text breaking code in Qt implementation, which uses harfbuzz, and had the following findings.
1. In most languages except the ones mentioned below, one character is required for making character breaking decision except for surrogate handling, where two characters are required.
2. In certain languages, namely Indic, Tibetan, Myanmar and Khmer, which are supported in harfbuzz, syllables are non-breakable. I looked at each of their implementation and found in Myanmar and Khmer, the maximum syllable length is 15 and 12, correspondingly, in theory. However, Indic and Tibetan languages are different. In reality, any syllable has a limit, but I can make up a syllable that can run forever (e.g., in Tibetan, a head consonant followed by unlimited vowels).

So as a result, using the remainder of the text buffer is the best compromise of code robustness and performance. Hopefully, we can improve the Qt performance later. If all the reviewers agree, I will submit the new patch.
Thanks.
Comment 17 Alexey Proskuryakov 2009-09-22 09:29:45 PDT
> 2. In certain languages, namely Indic, Tibetan, Myanmar and Khmer, which are
> supported in harfbuzz, syllables are non-breakable. I looked at each of their
> implementation and found in Myanmar and Khmer, the maximum syllable length is
> 15 and 12, correspondingly, in theory.

It's certainly true that there are many long unbreakable character sequences in Unicode (another example is decomposed Roman text, e.g. U+0065 U+0301 U+0302 for é̂). But are there cases where unlimited look-ahead is necessary? It certainly isn't in my example, because each accent has appropriate Unicode properties that tell us about it being composed with the previous character.

> So as a result, using the remainder of the text buffer is the best compromise
> of code robustness and performance. Hopefully, we can improve the Qt
> performance later. If all the reviewers agree, I will submit the new patch.

Anything that's correct, not too complicated, and doesn't regress performance of other ports is fine with me.
Comment 18 Chang Shu 2009-09-22 10:08:53 PDT
(In reply to comment #17)
> > 2. In certain languages, namely Indic, Tibetan, Myanmar and Khmer, which are
> > supported in harfbuzz, syllables are non-breakable. I looked at each of their
> > implementation and found in Myanmar and Khmer, the maximum syllable length is
> > 15 and 12, correspondingly, in theory.
> It's certainly true that there are many long unbreakable character sequences in
> Unicode (another example is decomposed Roman text, e.g. U+0065 U+0301 U+0302
> for é̂). But are there cases where unlimited look-ahead is necessary? It
> certainly isn't in my example, because each accent has appropriate Unicode
> properties that tell us about it being composed with the previous character.

By looking at the code itself (harfbuzz_tibetan.c), it does not guarantee a syllable has a limited boundary. However, I don't think there is any unlimited syllable in the real life either.

> > So as a result, using the remainder of the text buffer is the best compromise
> > of code robustness and performance. Hopefully, we can improve the Qt
> > performance later. If all the reviewers agree, I will submit the new patch.
> Anything that's correct, not too complicated, and doesn't regress performance
> of other ports is fine with me.

If we don't want to sacrifice the performance too much, is it ok to use +20? As I found out before, the longest syllable length is possibly 15.
Comment 19 Alexey Proskuryakov 2009-09-22 10:17:14 PDT
I still don't understand why +2 is not sufficient.
Comment 20 Chang Shu 2009-09-22 10:48:09 PDT
(In reply to comment #19)
> I still don't understand why +2 is not sufficient.

Let me explain. +2 is sufficient in most languages, but not in the four languages I mentioned. In these four languages, syllables are not breakable. One syllable may contain more than 2 characters. The reason they are not breakable is a syllable may present on display as one glyph. Using English letters as a faked example, "webkit" has two syllables, web and kit. In these languages, w-e-b may be drawn vertically as a whole glyph and k-i-t drawn vertically after w-e-b, horizontally (see below). In this case, +3 is required.
w k
e i
b t

This is what I learned yesterday. Hopefully, my understanding is correct.
Comment 21 Alexey Proskuryakov 2009-09-22 11:00:12 PDT
In the same example, will "we" form a non-breakable glyph, too? If it will, then there is no need to look at "web" to determine that there is no break between "w" and "e", even though we don't have a complete syllable to look for.
Comment 22 Chang Shu 2009-09-22 11:24:39 PDT
(In reply to comment #21)
> In the same example, will "we" form a non-breakable glyph, too? If it will,
> then there is no need to look at "web" to determine that there is no break
> between "w" and "e", even though we don't have a complete syllable to look for.

That makes a good point. We only break the text when it's breakable. I looked at the code again and didn't see any case that "we" is breakable but "web" is not.
Comment 23 Chang Shu 2009-09-23 06:37:42 PDT
Created attachment 39993 [details]
new patch after review
Comment 24 Alexey Proskuryakov 2009-09-23 08:49:02 PDT
Comment on attachment 39993 [details]
new patch after review

> +        Ran WebKitTools/Scripts/run-webkit-tests, and no new problem was found.

This shouldn't be in ChangeLog, as it is not the kind of information that's interesting to anyone after the patch is landed.

> +        TextBreakIterator* it = characterBreakIterator(data.characters() + start, (end + 2 > dataLength) ? dataLength - start : end - start + 2);

It would be nice to comment where "+ 2" comes from. This was a result of a long discussion, we can't assume that anyone looking at this code would guess what it is about.
Comment 25 Chang Shu 2009-09-23 10:04:48 PDT
Created attachment 40002 [details]
patch with comments added
Comment 26 Alexey Proskuryakov 2009-09-23 10:49:28 PDT
Comment on attachment 40002 [details]
patch with comments added

r=me
Comment 27 Eric Seidel (no email) 2009-09-26 01:30:27 PDT
Comment on attachment 40002 [details]
patch with comments added

As far as I can tell Shu Chang is not a committer, setting cq+ so that this gets auto-committed for him/her.
Comment 28 WebKit Commit Bot 2009-09-26 08:00:35 PDT
Comment on attachment 40002 [details]
patch with comments added

Clearing flags on attachment: 40002

Committed r48790: <http://trac.webkit.org/changeset/48790>
Comment 29 WebKit Commit Bot 2009-09-26 08:00:39 PDT
All reviewed patches have been landed.  Closing bug.