WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
17411
Ideographic comma or full stop, and subsequent ASCII token should be breakable into two lines
https://bugs.webkit.org/show_bug.cgi?id=17411
Summary
Ideographic comma or full stop, and subsequent ASCII token should be breakabl...
Satoshi Nakagawa
Reported
2008-02-17 08:46:15 PST
With the current implementation of WebKit, '、' (U+3001) and '。' (U+3002) are not treated as line-breakable characters. But in Japanese writing, we use them like comma and period in English. I think you can break a line just after comma or period. It's the same situation for '、' and '。' in Japanese. I confirmed IE6/7 and Firefox 2/3 implemented it correctly.
Attachments
Fix and a test case
(7.16 KB, patch)
2008-02-17 08:54 PST
,
Satoshi Nakagawa
no flags
Details
Formatted Diff
Diff
Revised patch
(7.16 KB, patch)
2008-02-17 09:43 PST
,
Satoshi Nakagawa
darin
: review+
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Satoshi Nakagawa
Comment 1
2008-02-17 08:54:53 PST
Created
attachment 19178
[details]
Fix and a test case
Satoshi Nakagawa
Comment 2
2008-02-17 09:39:08 PST
Comment on
attachment 19178
[details]
Fix and a test case Index: WebCore/ChangeLog =================================================================== --- WebCore/ChangeLog (revision 30355) +++ WebCore/ChangeLog (working copy) @@ -1,3 +1,17 @@ +2008-02-17 Satoshi Nakagawa <
artension@gmail.com
> + + Reviewed by NOBODY (OOPS!). + +
Bug 17411
: Ideographic comma and full stop should be treated as line-breakable characters +
http://bugs.webkit.org/show_bug.cgi?id=17411
+ + Test: fast/text/line-breaks-after-ideographic-comma-or-full-stop.html + + * ChangeLog: + * platform/text/CharacterNames.h: + * rendering/break_lines.cpp: + (WebCore::shouldBreakAfter): + 2008-02-17 Bin Chen <
binary.chen@gmail.com
> Reviewed by Alp Toker. Index: WebCore/platform/text/CharacterNames.h =================================================================== --- WebCore/platform/text/CharacterNames.h (revision 30353) +++ WebCore/platform/text/CharacterNames.h (working copy) @@ -39,6 +39,8 @@ namespace WebCore { const UChar bullet = 0x2022; const UChar horizontalEllipsis = 0x2026; const UChar ideographicSpace = 0x3000; + const UChar ideographicComma = 0x3001; + const UChar ideographicFullStop = 0x3002; const UChar leftToRightMark = 0x200E; const UChar leftToRightEmbed = 0x202A; const UChar leftToRightOverride = 0x202D; Index: WebCore/rendering/break_lines.cpp =================================================================== --- WebCore/rendering/break_lines.cpp (revision 30353) +++ WebCore/rendering/break_lines.cpp (working copy) @@ -51,6 +51,8 @@ static inline bool shouldBreakAfter(UCha case '-': case '?': case softHyphen: + case ideographicComma: + case ideographicFullStop: return true; default: return false; Index: LayoutTests/ChangeLog =================================================================== --- LayoutTests/ChangeLog (revision 30355) +++ LayoutTests/ChangeLog (working copy) @@ -1,3 +1,11 @@ +2008-02-17 Satoshi Nakagawa <
artension@gmail.com
> + + Reviewed by NOBODY (OOPS!). + + - test for
http://bugs.webkit.org/show_bug.cgi?id=17411
+ + * fast/text/line-breaks-after-ideographic-comma-or-full-stop.html: Added. + 2008-02-17 Dan Bernstein <
mitz@apple.com
> Reviewed by Darin Adler. Index: LayoutTests/fast/text/line-breaks-after-ideographic-comma-or-full-stop.html =================================================================== --- LayoutTests/fast/text/line-breaks-after-ideographic-comma-or-full-stop.html (revision 0) +++ LayoutTests/fast/text/line-breaks-after-ideographic-comma-or-full-stop.html (revision 0) @@ -0,0 +1,36 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" +"
http://www.w3.org/TR/html4/loose.dtd
"> +<html> + <head> + <title>Line breaks after ideographic comma or full stop</title> + </head> + <body> + These are good: + <div style="font-family:'Lucida Grande'; font-size:16pt; text-decoration:underline; width:5em;"> + <p style="border:solid green 1px;"> + あいう、<br/>abc + </p> + <p style="border:solid green 1px;"> + あいう。<br/>abc + </p> + </div> + The following two should look like “good”: + <div style="font-family:'Lucida Grande'; font-size:16pt; text-decoration:underline; width:5em;"> + <p style="border:solid blue 1px;"> + あいう、abc + </p> + <p style="border:solid blue 1px;"> + あいう。abc + </p> + </div> + These are bad: + <div style="font-family:'Lucida Grande'; font-size:16pt; text-decoration:underline; width:5em;"> + <p style="border:solid red 1px;"> + あい<br/>う、abc + </p> + <p style="border:solid red 1px;"> + あい<br/>う。abc + </p> + </div> + </body> +</html> Index: LayoutTests/platform/mac/fast/text/line-breaks-after-ideographic-comma-or-full-stop-expected.txt =================================================================== --- LayoutTests/platform/mac/fast/text/line-breaks-after-ideographic-comma-or-full-stop-expected.txt (revision 0) +++ LayoutTests/platform/mac/fast/text/line-breaks-after-ideographic-comma-or-full-stop-expected.txt (revision 0) @@ -0,0 +1,49 @@ +layer at (0,0) size 800x600 + RenderView at (0,0) size 800x600 +layer at (0,0) size 800x551 + RenderBlock {HTML} at (0,0) size 800x551 + RenderBody {BODY} at (8,8) size 784x522 + RenderBlock (anonymous) at (0,0) size 784x18 + RenderText {#text} at (0,0) size 101x18 + text run at (0,0) width 101: "These are good:" + RenderBlock {DIV} at (0,39) size 100x121 + RenderBlock {P} at (0,0) size 100x50 [border: (1px solid #008000)] + RenderText {#text} at (1,1) size 84x24 + text run at (1,1) width 84: "\x{3042}\x{3044}\x{3046}\x{3001}" + RenderBR {BR} at (85,21) size 0x0 + RenderText {#text} at (1,25) size 36x24 + text run at (1,25) width 36: "abc" + RenderBlock {P} at (0,71) size 100x50 [border: (1px solid #008000)] + RenderText {#text} at (1,1) size 84x24 + text run at (1,1) width 84: "\x{3042}\x{3044}\x{3046}\x{3002}" + RenderBR {BR} at (85,21) size 0x0 + RenderText {#text} at (1,25) size 36x24 + text run at (1,25) width 36: "abc" + RenderBlock (anonymous) at (0,181) size 784x18 + RenderText {#text} at (0,0) size 277x18 + text run at (0,0) width 277: "The following two should look like \x{201C}good\x{201D}:" + RenderBlock {DIV} at (0,220) size 100x121 + RenderBlock {P} at (0,0) size 100x50 [border: (1px solid #0000FF)] + RenderText {#text} at (1,1) size 84x48 + text run at (1,1) width 84: "\x{3042}\x{3044}\x{3046}\x{3001}" + text run at (1,25) width 36: "abc" + RenderBlock {P} at (0,71) size 100x50 [border: (1px solid #0000FF)] + RenderText {#text} at (1,1) size 84x48 + text run at (1,1) width 84: "\x{3042}\x{3044}\x{3046}\x{3002}" + text run at (1,25) width 36: "abc" + RenderBlock (anonymous) at (0,362) size 784x18 + RenderText {#text} at (0,0) size 92x18 + text run at (0,0) width 92: "These are bad:" + RenderBlock {DIV} at (0,401) size 100x121 + RenderBlock {P} at (0,0) size 100x50 [border: (1px solid #FF0000)] + RenderText {#text} at (1,1) size 42x24 + text run at (1,1) width 42: "\x{3042}\x{3044}" + RenderBR {BR} at (43,21) size 0x0 + RenderText {#text} at (1,25) size 78x24 + text run at (1,25) width 78: "\x{3046}\x{3001}abc" + RenderBlock {P} at (0,71) size 100x50 [border: (1px solid #FF0000)] + RenderText {#text} at (1,1) size 42x24 + text run at (1,1) width 42: "\x{3042}\x{3044}" + RenderBR {BR} at (43,21) size 0x0 + RenderText {#text} at (1,25) size 78x24 + text run at (1,25) width 78: "\x{3046}\x{3002}abc"
Satoshi Nakagawa
Comment 3
2008-02-17 09:43:46 PST
Created
attachment 19179
[details]
Revised patch Updated the test case to make it regardless of the font size.
Alexey Proskuryakov
Comment 4
2008-02-18 02:38:21 PST
Duplicate of
bug 15630
(which includes some discussion of standard-related issues).
Darin Adler
Comment 5
2008-02-20 09:27:52 PST
Comment on
attachment 19179
[details]
Revised patch Should this be fixed in ICU rather than in WebKit? We're using the ICU ubrk.h header for this, using UBRK_LINE. Why doesn't it say these characters should be breaks? The switch statement in break_lines.cpp is *only* to work around bugs in ICU's ubrk.h or to implement rules that don't make sense at the ICU level but are needed for a web browser. I'll tentatively agree that we should land this patch, but I'm not sure why ICU doesn't already handle this properly. r=me
Satoshi Nakagawa
Comment 6
2008-02-23 14:35:59 PST
I updated the report to make it more easier to understand.
http://limechat.net/report/webkit-line-break-problem.html
I realized the original cause of this problem is the Unicode line breaking algorithm standard.
http://limechat.net/report/unicode-line-break-problem.html
I have sent this report to the Unicode ML.
Satoshi Nakagawa
Comment 7
2008-02-23 14:58:57 PST
(In reply to
comment #5
) As I wrote in the report above, it seems a problem of the Unicode line breaking standards (
http://www.unicode.org/reports/tr14/
). ICU just implemented it strictly. So the switch statement in break_lines.cpp is to work around bugs in the Unicode standards. If the Unicode line breaking algorithm would be fixed, ICU's line break iterator would work correctly. Then WebKit will work well without any patches. But if so, we need to wait the fix propagation from the Unicode standards, ICU, WebKit to Safari. It will take too much time. I suggest that the patch should be applied to fix this problem quickly in WebKit.
Satoshi Nakagawa
Comment 8
2008-02-23 15:21:59 PST
Updated summary.
Alexey Proskuryakov
Comment 9
2008-02-23 21:49:26 PST
(In reply to
comment #6
)
>
http://limechat.net/report/unicode-line-break-problem.html
> > I have sent this report to the Unicode ML.
Thank you! Is there an archive of the list available somewhere? I do not see the posting in <
http://www.unicode.org/mail-arch/unicode-ml/y2008-m02/index.html
>, so I assume that you sent it to some other list. (In reply to
comment #7
)
> I suggest that the patch should be applied to fix this problem quickly in > WebKit.
I agree, we shouldn't wait for ICU here. I think we can wait a few days for feedback from the Unicode mailing list though.
Alexey Proskuryakov
Comment 10
2008-02-23 22:08:00 PST
***
Bug 15630
has been marked as a duplicate of this bug. ***
Satoshi Nakagawa
Comment 11
2008-02-23 23:06:13 PST
(In reply to
comment #9
) I sent a mail to the list, but it needs to be checked by a moderator. Because the report is my first post to the list. So please wait it.
> I agree, we shouldn't wait for ICU here. I think we can wait a few days for > feedback from the Unicode mailing list though.
I'm glad to hear that. Thanks a lot!
Satoshi Nakagawa
Comment 12
2008-02-24 18:52:25 PST
(In reply to
comment #9
) My post to the list seemed approved by a moderator, and discussion has been started.
http://www.unicode.org/mail-arch/unicode-ml/y2008-m02/0138.html
I'll start intensive checking.
Alexey Proskuryakov
Comment 13
2008-02-26 02:16:46 PST
Landed in <
http://trac.webkit.org/projects/webkit/changeset/30592
>. I changed the test to be text-only to not depend on available fonts and font fallback mechanism (the included results didn't match mine). Also added a few comments.
Satoshi Nakagawa
Comment 14
2008-02-26 13:25:03 PST
(In reply to
comment #13
) Thanks a lot! I have checked it. It seems good.
Eric Seidel (no email)
Comment 15
2008-04-24 13:24:19 PDT
This does not work for me on Safari 3.1.1. on windows. It seems this bug isn't really fixed (for windows)? The latest nightly also fails this test on windows.
Alexey Proskuryakov
Comment 16
2008-04-24 13:59:24 PDT
The Windows results are different visually, which suggests that it may be an entirely different issue. As this report was against the Mac version, and it is fixed with a test case landed, I'd say that it is fully resolved. Could you please file a new bug?
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug