WebKit Bugzilla
RESOLVED DUPLICATE of
bug 17411
15630
After U+3001, U+3002 (ideographic comma/full stop), lines cannot be broken
https://bugs.webkit.org/show_bug.cgi?id=15630
Jungshik Shin
Reported
2007-10-22 15:16:11 PDT
Due to the problem described in the summary line, the layout at
http://usstock.jrj.com.cn/xhmt
is broken. WebKit uses the ICU line-breaking iterator, which in turn is based on UAX #14 (Unicode Line Breaking Algorithm). It has the following rule: CL × (AL|NU), where CL includes U+3001 and U+3002 (Ideographic Comma and Full Stop). With this rule, a line cannot be broken when U+3001 or U+3002 is followed by a Latin letter or a number. As a result, the box at the url given above with the title
Attachments
layout test case
(710 bytes, text/html)
2007-10-22 15:19 PDT
,
Jungshik Shin
no flags
Jungshik Shin
Comment 1
2007-10-22 15:19:09 PDT
Created attachment 16810: layout test case. The two columns should be rendered identically.
Jungshik Shin
Comment 2
2007-10-22 15:24:09 PDT
Hmm, my comment #0 got trimmed... As a result, the box at the url given above with the title
Jungshik Shin
Comment 3
2007-10-22 17:25:23 PDT
Try once more (this time with FF ;-)). The text box whose title is '美通社简介' is much wider than its specified width, which breaks the layout of the page. The fix is very simple: we have to tailor UAX #14's line breaking properties so that U+3001 and U+3002 followed by a Latin letter or number (or, more broadly, any character belonging to the AL/NU classes) are regarded as a line breaking opportunity. One way to do that is to move those characters from the CL class to the NS (non-starter) class in ICU's source/data/brkiter/line.txt. For WinSafari, it would be a simple change, but for Safari on Mac it may be more involved, because it may mean changing the build of ICU shipped with OS X.
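To make the proposed tailoring concrete, here is a minimal Python sketch of the pair rules involved. It is a toy model, not ICU's rule engine and not the contents of line.txt: the function names, the tiny class table, and the reduced rule set are all assumptions made for illustration. It only shows why moving U+3001/U+3002 from CL to NS opens a break opportunity before a following Latin letter or digit while still forbidding a break before the punctuation itself.

```python
# Toy model of the pair rules discussed above. This is NOT ICU's rule engine;
# the class table and the rule set are deliberately tiny and hypothetical.

IDEOGRAPHIC_PUNCT = {"\u3001", "\u3002"}  # ideographic comma, ideographic full stop


def line_break_class(ch: str, tailored: bool = False) -> str:
    """Return a simplified UAX #14 line breaking class for one character."""
    if ch in IDEOGRAPHIC_PUNCT:
        # The proposed tailoring moves these two characters from CL to NS.
        return "NS" if tailored else "CL"
    if ch.isascii() and ch.isalpha():
        return "AL"
    if ch.isascii() and ch.isdigit():
        return "NU"
    if "\u4e00" <= ch <= "\u9fff":
        return "ID"
    return "OTHER"  # not modeled by this sketch


def can_break_between(before: str, after: str, tailored: bool = False) -> bool:
    """True if this toy model allows a line break between two adjacent characters."""
    b = line_break_class(before, tailored)
    a = line_break_class(after, tailored)
    if a in ("CL", "NS"):
        return False  # never break before closing punctuation or a non-starter
    if b == "CL" and a in ("AL", "NU"):
        return False  # the rule quoted in the report: CL x (AL|NU)
    if b in ("AL", "NU") and a in ("AL", "NU"):
        return False  # keep runs of Latin letters and digits together
    return True  # everything else: break allowed (OTHER is not modeled)
```

With the default classes, can_break_between("\u3002", "A") is False, which is the behaviour reported here; with tailored=True it becomes True, while a break before U+3002 remains forbidden in both modes.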
David Kilzer (:ddkilzer)
Comment 4
2007-10-22 22:56:34 PDT
(In reply to comment #3)
> Try once more (this time with FF ;-)).
It sounds like you're hitting Bug 14562 (or something similar) when entering text in a text area (which is truncated when sent to the server). Could you please file a new bug on this, stating the version of Safari/WebKit you're using and the steps to reproduce? Thanks!
Alexey Proskuryakov
Comment 5
2007-10-25 09:47:05 PDT
I can only reproduce this problem on Windows - Mac (Tiger) works as expected for me.
Alexey Proskuryakov
Comment 6
2007-10-26 09:18:33 PDT
Do you know if this has been reported to the Unicode consortium? This rule is new to Unicode 5.0, and doesn't look quite right, as you point out.
Jungshik Shin
Comment 7
2007-10-26 14:44:48 PDT
Yes, I've been in contact with the author of UAX #14 (indirectly). I talked to the author of the ICU break iterator and he agreed with me (actually, we sat together and he suggested changing the class of those two to NS).
Alexey Proskuryakov
Comment 8
2008-02-18 02:51:21 PST
Bug 17411 has a patch for this. I'm still unsure whether the Unicode consortium is aware of this issue. ICU is one thing, but the proposed update to UAX #14 at <http://www.unicode.org/reports/tr14/tr14-21.html> seems to be unchanged.
Satoshi Nakagawa
Comment 9
2008-02-23 14:33:52 PST
(In reply to comment #8)
I agree. They may not be aware of this issue. I wrote a report about this problem and sent it to the Unicode mailing list.
http://limechat.net/report/unicode-line-break-problem.html
Alexey Proskuryakov
Comment 10
2008-02-23 22:08:00 PST
Marking as a duplicate, as bug 17411 has an approved fix for this.
*** This bug has been marked as a duplicate of 17411 ***
Jungshik Shin
Comment 11
2008-02-25 14:28:08 PST
(In reply to comment #8)
> Bug 17411 has a patch for this.
>
> I'm still unsure whether the Unicode consortium is aware of this issue. ICU is
> one thing, but the proposed update to UAX #14 at
> <http://www.unicode.org/reports/tr14/tr14-21.html> seems to be unchanged.
In the meantime, they nuked LB30 instead of changing the class for U+3001/3002. A long-term solution is being worked on, according to my source. Anyway, since ICU on Mac OS X will always lag behind, I agree that we should fix the WebKit code (as in bug 17411).
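The difference between the two fixes mentioned in this thread (moving U+3001/U+3002 to NS versus dropping LB30 altogether) can be sketched in a few lines of Python. This is a toy comparison with hypothetical names, not ICU or WebKit code. It models only the CL x (AL|NU) half of the rule as quoted in this bug, lumps letters and digits into one class, and assumes that ')' belongs to CL alongside the two ideographic marks.

```python
# Toy comparison of the two fixes; hypothetical names, not ICU or WebKit code.

def classify(ch, reclassify):
    """Very rough line breaking class for one character."""
    if ch in "\u3001\u3002":
        return "NS" if reclassify else "CL"
    if ch == ")":
        return "CL"  # assumption: ')' shares class CL with U+3001/U+3002
    if ch.isascii() and ch.isalnum():
        return "AL"  # letters and digits lumped together for brevity
    return "ID"  # everything else is treated as an ideograph in this sketch


def break_offsets(text, keep_lb30=True, reclassify=False):
    """Offsets i where a break is allowed between text[i-1] and text[i]."""
    offsets = []
    for i in range(1, len(text)):
        before = classify(text[i - 1], reclassify)
        after = classify(text[i], reclassify)
        if after in ("CL", "NS"):
            continue  # no break before closing punctuation or non-starters
        if before == "AL" and after == "AL":
            continue  # no break inside a run of Latin letters/digits
        if keep_lb30 and before == "CL" and after == "AL":
            continue  # the CL x (AL|NU) rule quoted in this bug
        offsets.append(i)
    return offsets


sample = "\u4e2d\u3002A)B"
original = break_offsets(sample)                     # rule kept, no tailoring
class_move = break_offsets(sample, reclassify=True)  # U+3001/U+3002 moved to NS
rule_dropped = break_offsets(sample, keep_lb30=False)
```

In this toy model the class move only adds a break opportunity after the ideographic full stop, whereas dropping the rule also allows a break between ')' and the following letter, which is one reason the two fixes are not interchangeable.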