WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
96935
Add Latin-1 Line Break Iterator to TextBreakIteratorICU.cpp
https://bugs.webkit.org/show_bug.cgi?id=96935
Summary
Add Latin-1 Line Break Iterator to TextBreakIteratorICU.cpp
Michael Saboff
Reported
2012-09-17 11:24:12 PDT
Currently there is a 16 bit line break iterator provided by the function acquireLineBreakIterator() provided in TextBreakIteratorICU.cpp. To support 8-bit rendering, a 8 bit line break iterator should be added.
Attachments
Draft patch without build changes for non-Mac platforms
(16.55 KB, patch)
2012-09-17 11:54 PDT
,
Michael Saboff
no flags
Details
Formatted Diff
Diff
Updated Patch
(16.48 KB, patch)
2012-09-21 16:01 PDT
,
Michael Saboff
webkit-ews
: commit-queue-
Details
Formatted Diff
Diff
Patch again
(16.48 KB, patch)
2012-09-21 16:15 PDT
,
Michael Saboff
ggaren
: review+
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Michael Saboff
Comment 1
2012-09-17 11:54:11 PDT
Created
attachment 164430
[details]
Draft patch without build changes for non-Mac platforms Will post an updated patch with changes to other platform that use TextBreakIteratorICU.cpp. The change is neutral in performance on the Layout tests.
Eric Seidel (no email)
Comment 2
2012-09-17 12:25:05 PDT
I assume by LayoutTests you mean PLT? or PerformanceTests? Do we have a complex-text line-breaking perf test? PerformanceTests/Layout/line-breaking.html only does the ascii case.
Eric Seidel (no email)
Comment 3
2012-09-17 12:25:51 PDT
We probably should add a complex line-breaking microbenchmark to PerformanceTests. I'm sure there is a lot there we could work through yet. :)
Benjamin Poulain
Comment 4
2012-09-17 12:29:51 PDT
Comment on
attachment 164430
[details]
Draft patch without build changes for non-Mac platforms View in context:
https://bugs.webkit.org/attachment.cgi?id=164430&action=review
> Source/WebCore/platform/text/TextBreakIterator.h:62 > + : m_string(String()) > + , m_locale(AtomicString())
You can remove those.
> Source/WebCore/rendering/break_lines.cpp:151 > +template<typename CharacterType, bool treatNoBreakSpaceAsBreak> > +static inline int nextBreakablePosition(LazyLineBreakIterator& lazyBreakIterator, const CharacterType* str, unsigned length, int pos)
Chances are we should fully specialize that function instead. I can do that after your patch if you want.
Michael Saboff
Comment 5
2012-09-17 12:45:27 PDT
(In reply to
comment #2
)
> I assume by LayoutTests you mean PLT? or PerformanceTests? > > Do we have a complex-text line-breaking perf test? PerformanceTests/Layout/line-breaking.html only does the ascii case.
I mean PerformanceTests/Layout. I don't know if we have a complex text test for layout. This patch doesn't enable using the 8 bit code path as Text nodes and thus RenderText objects are 16 bit until we enable the 8 bit path in the HTML parser.
Michael Saboff
Comment 6
2012-09-17 12:49:12 PDT
(In reply to
comment #4
)
> (From update of
attachment 164430
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=164430&action=review
> > > Source/WebCore/platform/text/TextBreakIterator.h:62 > > + : m_string(String()) > > + , m_locale(AtomicString()) > > You can remove those.
Done.
> > Source/WebCore/rendering/break_lines.cpp:151 > > +template<typename CharacterType, bool treatNoBreakSpaceAsBreak> > > +static inline int nextBreakablePosition(LazyLineBreakIterator& lazyBreakIterator, const CharacterType* str, unsigned length, int pos) > > Chances are we should fully specialize that function instead. > > I can do that after your patch if you want.
Please elaborate.
Eric Seidel (no email)
Comment 7
2012-09-17 13:14:45 PDT
(In reply to
comment #5
)
> (In reply to
comment #2
) > > I assume by LayoutTests you mean PLT? or PerformanceTests? > > > > Do we have a complex-text line-breaking perf test? PerformanceTests/Layout/line-breaking.html only does the ascii case. > > I mean PerformanceTests/Layout. I don't know if we have a complex text test for layout. > > This patch doesn't enable using the 8 bit code path as Text nodes and thus RenderText objects are 16 bit until we enable the 8 bit path in the HTML parser.
I see. So this path is as of yet unused. Thanks. We should probably catch mitz or jungshik's ear about adding some complex-text Layout PerformanceTests as thats an area we could micro-benchmark more and probably eek more performance out of.
Benjamin Poulain
Comment 8
2012-09-17 13:53:12 PDT
> > > Source/WebCore/rendering/break_lines.cpp:151 > > > +template<typename CharacterType, bool treatNoBreakSpaceAsBreak> > > > +static inline int nextBreakablePosition(LazyLineBreakIterator& lazyBreakIterator, const CharacterType* str, unsigned length, int pos) > > > > Chances are we should fully specialize that function instead. > > > > I can do that after your patch if you want. > > Please elaborate.
If you look at nextBreakablePosition(), there is fast path (first if()), and a slower patch for non-ASCII characters (second if()). Chances are we could generalize the fast path and get rid of the second if() for Latin1. What takes time in that particular loop is the number of branches needed for each character. If we could get rid of the second if(), it would likely be a big win.
Michael Saboff
Comment 9
2012-09-17 14:55:37 PDT
(In reply to
comment #8
)
> > > > Source/WebCore/rendering/break_lines.cpp:151 > > > > +template<typename CharacterType, bool treatNoBreakSpaceAsBreak> > > > > +static inline int nextBreakablePosition(LazyLineBreakIterator& lazyBreakIterator, const CharacterType* str, unsigned length, int pos) > > > > > > Chances are we should fully specialize that function instead. > > > > > > I can do that after your patch if you want. > > > > Please elaborate. > > If you look at nextBreakablePosition(), there is fast path (first if()), and a slower patch for non-ASCII characters (second if()). > > Chances are we could generalize the fast path and get rid of the second if() for Latin1. > > What takes time in that particular loop is the number of branches needed for each character. If we could get rid of the second if(), it would likely be a big win.
We could do that in this patch or as a follow on. We would have to generate the new entries for the line break table.
Michael Saboff
Comment 10
2012-09-21 16:01:27 PDT
Created
attachment 165218
[details]
Updated Patch Removed the unneeded explicit initializations.
Early Warning System Bot
Comment 11
2012-09-21 16:06:46 PDT
Comment on
attachment 165218
[details]
Updated Patch
Attachment 165218
[details]
did not pass qt-ews (qt): Output:
http://queues.webkit.org/results/13957821
Gyuyoung Kim
Comment 12
2012-09-21 16:07:10 PDT
Comment on
attachment 165218
[details]
Updated Patch
Attachment 165218
[details]
did not pass efl-ews (efl): Output:
http://queues.webkit.org/results/13954780
Early Warning System Bot
Comment 13
2012-09-21 16:08:51 PDT
Comment on
attachment 165218
[details]
Updated Patch
Attachment 165218
[details]
did not pass qt-wk2-ews (qt): Output:
http://queues.webkit.org/results/13949796
Michael Saboff
Comment 14
2012-09-21 16:15:11 PDT
Created
attachment 165221
[details]
Patch again
Geoffrey Garen
Comment 15
2012-09-25 23:35:38 PDT
Comment on
attachment 165221
[details]
Patch again r=me
Michael Saboff
Comment 16
2012-09-26 09:54:17 PDT
Committed
r129662
: <
http://trac.webkit.org/changeset/129662
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug