WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
63209
Make line breaking obey the -webkit-locale property
https://bugs.webkit.org/show_bug.cgi?id=63209
Summary
Make line breaking obey the -webkit-locale property
mitz
Reported
2011-06-22 17:51:19 PDT
Make line breaking obey the -webkit-locale property
Attachments
Patch
(37.92 KB, patch)
2011-06-22 17:59 PDT
,
mitz
no flags
Details
Formatted Diff
Diff
Patch
(42.35 KB, patch)
2011-06-22 20:13 PDT
,
mitz
no flags
Details
Formatted Diff
Diff
Patch
(42.38 KB, patch)
2011-06-22 20:23 PDT
,
mitz
no flags
Details
Formatted Diff
Diff
Patch
(41.12 KB, patch)
2011-06-23 10:36 PDT
,
mitz
ap
: review+
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
mitz
Comment 1
2011-06-22 17:59:28 PDT
Created
attachment 98275
[details]
Patch
mitz
Comment 2
2011-06-22 18:00:17 PDT
<
rdar://problem/8948904
>
WebKit Review Bot
Comment 3
2011-06-22 18:05:59 PDT
Comment on
attachment 98275
[details]
Patch
Attachment 98275
[details]
did not pass chromium-ews (chromium-xvfb): Output:
http://queues.webkit.org/results/8925806
Early Warning System Bot
Comment 4
2011-06-22 18:13:16 PDT
Comment on
attachment 98275
[details]
Patch
Attachment 98275
[details]
did not pass qt-ews (qt): Output:
http://queues.webkit.org/results/8925808
Martin Robinson
Comment 5
2011-06-22 18:23:47 PDT
Comment on
attachment 98275
[details]
Patch
Attachment 98275
[details]
did not pass gtk-ews (gtk): Output:
http://queues.webkit.org/results/8919875
WebKit Review Bot
Comment 6
2011-06-22 19:37:23 PDT
Comment on
attachment 98275
[details]
Patch
Attachment 98275
[details]
did not pass cr-mac-ews (chromium): Output:
http://queues.webkit.org/results/8924831
mitz
Comment 7
2011-06-22 20:13:42 PDT
Created
attachment 98305
[details]
Patch
WebKit Review Bot
Comment 8
2011-06-22 20:18:57 PDT
Comment on
attachment 98305
[details]
Patch
Attachment 98305
[details]
did not pass chromium-ews (chromium-xvfb): Output:
http://queues.webkit.org/results/8918914
Martin Robinson
Comment 9
2011-06-22 20:22:28 PDT
Comment on
attachment 98305
[details]
Patch
Attachment 98305
[details]
did not pass gtk-ews (gtk): Output:
http://queues.webkit.org/results/8923865
mitz
Comment 10
2011-06-22 20:23:12 PDT
Created
attachment 98309
[details]
Patch
Alexey Proskuryakov
Comment 11
2011-06-23 09:30:08 PDT
Comment on
attachment 98309
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98309&action=review
> Source/WebCore/platform/text/TextBreakIterator.h:48 > + void releaseLineBreakIterator(TextBreakIterator*, const AtomicString& locale = AtomicString());
I can see why this was easier to implement, but passing locale to the release function is quite surprising and misleading. Perhaps it would be better to maintain a reverse map of acquired TextBreakIterators to their locales?
> Source/WebCore/platform/text/TextBreakIteratorICU.cpp:78 > + static LineBreakIteratorPool& sharedPool() > + { > + DEFINE_STATIC_LOCAL(LineBreakIteratorPool, pool, ());
Perhaps ASSERT(isMainThread())?
> Source/WebCore/platform/text/TextBreakIteratorICU.cpp:93 > + return ubrk_open(UBRK_LINE, locale.isEmpty() ? currentTextBreakLocaleID() : locale.string().ascii().data(), 0, 0, &openStatus);
I would at least ASSERT the status.
> Source/WebCore/platform/text/TextBreakIteratorICU.cpp:123 > + if (U_FAILURE(setTextStatus)) > + return 0;
I'm not sure what this error means. Is this a completely unexpected condition that we should ASSERT against?
Darin Adler
Comment 12
2011-06-23 09:35:58 PDT
Comment on
attachment 98309
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98309&action=review
> Source/WebCore/platform/text/TextBreakIteratorICU.cpp:93 > + return ubrk_open(UBRK_LINE, locale.isEmpty() ? currentTextBreakLocaleID() : locale.string().ascii().data(), 0, 0, &openStatus);
Since empty string and null string are distinct, they end up with two separate entries in the pool, but both map to the current locale. If we mapped empty locale strings to the current text break locale before looking for matches in the pool, then it could use an existing iterator with the same locale, explicitly specified. Also, doing it that way would eliminate a dependency on the fact that the current locale ID won't change in this function. I know that's currently how currentTextBreakLocaleID() behaves, but it might be nice to not have the pool machinery depend on it. Is ascii() better than utf8() or latin1() here? I worry slightly that folding non-ASCII characters to "?" could make unwanted collisions or aliasing. Maybe we should assert that the locale string is all ASCII.
> Source/WebCore/platform/text/TextBreakIteratorICU.cpp:101 > + if (m_pool.size() == capacity) { > + ubrk_close(m_pool[0].second); > + m_pool.remove(0); > + }
This is acting like a deque rather than a vector. But I suppose given the small fixed size a vector works. Another way to do it would be to use a fixed size array as a circular buffer.
mitz
Comment 13
2011-06-23 10:36:40 PDT
Created
attachment 98368
[details]
Patch
mitz
Comment 14
2011-06-23 10:48:09 PDT
(In reply to
comment #12
) The version I just posted addresses some of Alexey’s comments, but I didn’t see Darin’s comments before posting it.
> (From update of
attachment 98309
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=98309&action=review
> > > Source/WebCore/platform/text/TextBreakIteratorICU.cpp:93 > > + return ubrk_open(UBRK_LINE, locale.isEmpty() ? currentTextBreakLocaleID() : locale.string().ascii().data(), 0, 0, &openStatus); > > Since empty string and null string are distinct, they end up with two separate entries in the pool, but both map to the current locale. > > If we mapped empty locale strings to the current text break locale before looking for matches in the pool, then it could use an existing iterator with the same locale, explicitly specified. Also, doing it that way would eliminate a dependency on the fact that the current locale ID won't change in this function. I know that's currently how currentTextBreakLocaleID() behaves, but it might be nice to not have the pool machinery depend on it.
This occurred to me. Doing the mapping first would have required a round trip through AtomicString, and while that’s not a big deal, I decided not to bother with this improvement.
> Is ascii() better than utf8() or latin1() here? I worry slightly that folding non-ASCII characters to "?" could make unwanted collisions or aliasing. Maybe we should assert that the locale string is all ASCII.
I think utf8() is better. I am going to change to utf8() (if I remember).
> > > Source/WebCore/platform/text/TextBreakIteratorICU.cpp:101 > > + if (m_pool.size() == capacity) { > > + ubrk_close(m_pool[0].second); > > + m_pool.remove(0); > > + } > > This is acting like a deque rather than a vector. But I suppose given the small fixed size a vector works. Another way to do it would be to use a fixed size array as a circular buffer.
I need to be able to remove entries from anywhere in the vector, so it is not quite a deque. I preferred to keep the code simple (IMO), given the small fixed size.
Alexey Proskuryakov
Comment 15
2011-06-23 10:54:44 PDT
Comment on
attachment 98368
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98368&action=review
I'm going to say r+, although not sharing cached iterators for empty -webkit-locale and for one that happens to match user locale seems unfortunate (as well as empty vs null).
> Source/WebCore/platform/text/TextBreakIteratorICU.cpp:122 > +private: > + LineBreakIteratorPool() { }
Should the class also be noncopyable?
mitz
Comment 16
2011-06-23 11:03:16 PDT
Fixed in <
http://trac.webkit.org/changeset/89592
>.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug