RESOLVED FIXED 96935
Add Latin-1 Line Break Iterator to TextBreakIteratorICU.cpp
https://bugs.webkit.org/show_bug.cgi?id=96935
Summary Add Latin-1 Line Break Iterator to TextBreakIteratorICU.cpp
Michael Saboff
Reported 2012-09-17 11:24:12 PDT
Currently there is a 16 bit line break iterator provided by the function acquireLineBreakIterator() provided in TextBreakIteratorICU.cpp. To support 8-bit rendering, a 8 bit line break iterator should be added.
Attachments
Draft patch without build changes for non-Mac platforms (16.55 KB, patch)
2012-09-17 11:54 PDT, Michael Saboff
no flags
Updated Patch (16.48 KB, patch)
2012-09-21 16:01 PDT, Michael Saboff
webkit-ews: commit-queue-
Patch again (16.48 KB, patch)
2012-09-21 16:15 PDT, Michael Saboff
ggaren: review+
Michael Saboff
Comment 1 2012-09-17 11:54:11 PDT
Created attachment 164430 [details] Draft patch without build changes for non-Mac platforms Will post an updated patch with changes to other platform that use TextBreakIteratorICU.cpp. The change is neutral in performance on the Layout tests.
Eric Seidel (no email)
Comment 2 2012-09-17 12:25:05 PDT
I assume by LayoutTests you mean PLT? or PerformanceTests? Do we have a complex-text line-breaking perf test? PerformanceTests/Layout/line-breaking.html only does the ascii case.
Eric Seidel (no email)
Comment 3 2012-09-17 12:25:51 PDT
We probably should add a complex line-breaking microbenchmark to PerformanceTests. I'm sure there is a lot there we could work through yet. :)
Benjamin Poulain
Comment 4 2012-09-17 12:29:51 PDT
Comment on attachment 164430 [details] Draft patch without build changes for non-Mac platforms View in context: https://bugs.webkit.org/attachment.cgi?id=164430&action=review > Source/WebCore/platform/text/TextBreakIterator.h:62 > + : m_string(String()) > + , m_locale(AtomicString()) You can remove those. > Source/WebCore/rendering/break_lines.cpp:151 > +template<typename CharacterType, bool treatNoBreakSpaceAsBreak> > +static inline int nextBreakablePosition(LazyLineBreakIterator& lazyBreakIterator, const CharacterType* str, unsigned length, int pos) Chances are we should fully specialize that function instead. I can do that after your patch if you want.
Michael Saboff
Comment 5 2012-09-17 12:45:27 PDT
(In reply to comment #2) > I assume by LayoutTests you mean PLT? or PerformanceTests? > > Do we have a complex-text line-breaking perf test? PerformanceTests/Layout/line-breaking.html only does the ascii case. I mean PerformanceTests/Layout. I don't know if we have a complex text test for layout. This patch doesn't enable using the 8 bit code path as Text nodes and thus RenderText objects are 16 bit until we enable the 8 bit path in the HTML parser.
Michael Saboff
Comment 6 2012-09-17 12:49:12 PDT
(In reply to comment #4) > (From update of attachment 164430 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=164430&action=review > > > Source/WebCore/platform/text/TextBreakIterator.h:62 > > + : m_string(String()) > > + , m_locale(AtomicString()) > > You can remove those. Done. > > Source/WebCore/rendering/break_lines.cpp:151 > > +template<typename CharacterType, bool treatNoBreakSpaceAsBreak> > > +static inline int nextBreakablePosition(LazyLineBreakIterator& lazyBreakIterator, const CharacterType* str, unsigned length, int pos) > > Chances are we should fully specialize that function instead. > > I can do that after your patch if you want. Please elaborate.
Eric Seidel (no email)
Comment 7 2012-09-17 13:14:45 PDT
(In reply to comment #5) > (In reply to comment #2) > > I assume by LayoutTests you mean PLT? or PerformanceTests? > > > > Do we have a complex-text line-breaking perf test? PerformanceTests/Layout/line-breaking.html only does the ascii case. > > I mean PerformanceTests/Layout. I don't know if we have a complex text test for layout. > > This patch doesn't enable using the 8 bit code path as Text nodes and thus RenderText objects are 16 bit until we enable the 8 bit path in the HTML parser. I see. So this path is as of yet unused. Thanks. We should probably catch mitz or jungshik's ear about adding some complex-text Layout PerformanceTests as thats an area we could micro-benchmark more and probably eek more performance out of.
Benjamin Poulain
Comment 8 2012-09-17 13:53:12 PDT
> > > Source/WebCore/rendering/break_lines.cpp:151 > > > +template<typename CharacterType, bool treatNoBreakSpaceAsBreak> > > > +static inline int nextBreakablePosition(LazyLineBreakIterator& lazyBreakIterator, const CharacterType* str, unsigned length, int pos) > > > > Chances are we should fully specialize that function instead. > > > > I can do that after your patch if you want. > > Please elaborate. If you look at nextBreakablePosition(), there is fast path (first if()), and a slower patch for non-ASCII characters (second if()). Chances are we could generalize the fast path and get rid of the second if() for Latin1. What takes time in that particular loop is the number of branches needed for each character. If we could get rid of the second if(), it would likely be a big win.
Michael Saboff
Comment 9 2012-09-17 14:55:37 PDT
(In reply to comment #8) > > > > Source/WebCore/rendering/break_lines.cpp:151 > > > > +template<typename CharacterType, bool treatNoBreakSpaceAsBreak> > > > > +static inline int nextBreakablePosition(LazyLineBreakIterator& lazyBreakIterator, const CharacterType* str, unsigned length, int pos) > > > > > > Chances are we should fully specialize that function instead. > > > > > > I can do that after your patch if you want. > > > > Please elaborate. > > If you look at nextBreakablePosition(), there is fast path (first if()), and a slower patch for non-ASCII characters (second if()). > > Chances are we could generalize the fast path and get rid of the second if() for Latin1. > > What takes time in that particular loop is the number of branches needed for each character. If we could get rid of the second if(), it would likely be a big win. We could do that in this patch or as a follow on. We would have to generate the new entries for the line break table.
Michael Saboff
Comment 10 2012-09-21 16:01:27 PDT
Created attachment 165218 [details] Updated Patch Removed the unneeded explicit initializations.
Early Warning System Bot
Comment 11 2012-09-21 16:06:46 PDT
Comment on attachment 165218 [details] Updated Patch Attachment 165218 [details] did not pass qt-ews (qt): Output: http://queues.webkit.org/results/13957821
Gyuyoung Kim
Comment 12 2012-09-21 16:07:10 PDT
Comment on attachment 165218 [details] Updated Patch Attachment 165218 [details] did not pass efl-ews (efl): Output: http://queues.webkit.org/results/13954780
Early Warning System Bot
Comment 13 2012-09-21 16:08:51 PDT
Comment on attachment 165218 [details] Updated Patch Attachment 165218 [details] did not pass qt-wk2-ews (qt): Output: http://queues.webkit.org/results/13949796
Michael Saboff
Comment 14 2012-09-21 16:15:11 PDT
Created attachment 165221 [details] Patch again
Geoffrey Garen
Comment 15 2012-09-25 23:35:38 PDT
Comment on attachment 165221 [details] Patch again r=me
Michael Saboff
Comment 16 2012-09-26 09:54:17 PDT
Note You need to log in before you can comment on or make changes to this bug.