Bug 20677 - WebKit wraps between hyphen-minus and numeric characters
Summary: WebKit wraps between hyphen-minus and numeric characters
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
: P2 Enhancement
Assignee: Nobody
URL: http://irowan.com/hyphenminuswrapping...
Keywords:
Depends on:
Blocks: 41103
  Show dependency treegraph
 
Reported: 2008-09-05 15:27 PDT by Rowan Beentje
Modified: 2011-09-13 09:43 PDT (History)
13 users (show)

See Also:


Attachments
patch (54.23 KB, patch)
2010-07-31 16:52 PDT, Xianzhu Wang
no flags Details | Formatted Diff | Diff
Updated patch (54.43 KB, patch)
2010-08-30 20:47 PDT, Xianzhu Wang
no flags Details | Formatted Diff | Diff
new patch (8.17 KB, patch)
2011-01-19 05:50 PST, Xianzhu Wang
no flags Details | Formatted Diff | Diff
new patch (6.99 KB, patch)
2011-09-04 21:15 PDT, Xianzhu Wang
darin: review-
Details | Formatted Diff | Diff
patch v5 (8.86 KB, patch)
2011-09-12 21:46 PDT, Xianzhu Wang
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Rowan Beentje 2008-09-05 15:27:18 PDT
Webkit currently treats all hyphen–minus characters as a location which can be wrapped after.  In the case of hyphen-minus followed by numeric class characters, the hyphen-minus almost certainly indicates a negative and this rule should not be applied.

Very basicbackground: Hyphen-minus ("-") is the ambiguous hyphen/minus/dash character used on typewriters and computers for general use; most systems output negative numbers by prepending a hyphen–minus in front of the value.  Separating the hyphen-minus from the number by means of a wrap changes the appearance of a number drastically.

The unicode site/standard/recommendations (I've no idea where on the site this is located!) includes line breaking rules at http://www.unicode.org/unicode/reports/tr14/tr14-14.html#Algorithm .  Scroll down to LB18 to see the recommendation for hyphen characters followed by numeric characters - not recommended.

This would improve the tabular display of numbers greatly.  There are however a few cases where it probably wouldn't be desirable – US–formatted phone numbers (0123-456-789), for example.

See URL for a basic example.
Comment 1 Rowan Beentje 2008-09-05 15:30:55 PDT
I should add – while this is the case in tables where the hyphen-minus is the only obvious wrapping point within the cell, it also occurs in free text.  On the following page, resize the window while noting the lower bound of some of the data types – a hyphen–minus is often left floating on its own despite the preceding space.

http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db29.doc.intro/db2z_numericdatatypes.htm
Comment 2 Dave Hyatt 2008-09-05 15:37:28 PDT
Hyphen is also a good break for long URLs as well (and may be followed by a number in those cases).
Comment 3 Rowan Beentje 2008-09-05 16:09:39 PDT
That's an excellent point.

How complex is the current linebreak code?  Would it be possible to easily add a rule that checks both sides?

If (space, punctuation, or element boundary)(hyphen-minus)(numeric character) could be separated out, that wouldn't affect wrapping of longer strings where alphanumerics and hyphen-minuses are mixed.
Comment 4 Alexey Proskuryakov 2008-09-08 04:49:38 PDT
See also: bug 17411, bug 4808. Note that Unicode line breaking rules change from one version of the standard to another (see e.g. a mailing list discussion related to the first of these bugs, <http://www.unicode.org/mail-arch/unicode-ml/y2008-m02/0138.html>).

Most of line breaking logic for WebKit is handled by ICU.
Comment 5 Rowan Beentje 2008-09-18 11:27:18 PDT
A comment in shouldBreakAfter() in WebKit/WebCore/rendering/break_lines.cpp says:

    // Match WinIE's breaking strategy, which is to always allow breaks after hyphens and question marks.

Is matching WinIE still preferred behaviour?  If so, this bug can be closed.

(IE 6, IE 7 *do* show this behaviour; Firefox matches the unicode recommendation and doesn't wrap.)
Comment 6 Edward Sabol 2009-09-30 12:24:32 PDT
I work in the astronomy/astrophysics field, and this undesirable behavior causes a lot of problems for us. We have large tables of astronomical catalogs which we render in HTML for easy reference. In some cases, WebKit will wrap inside of a table cell even when the negative number is the only text inside the table cell!

Please modify WebKit to adhere to the Unicode recommendation and match Firefox. Matching Internet Explorer's buggy behavior in this regard is not at all desirable.

As for hyphens appearing in phone numbers and URLs, one should be able to differentiate between numbers and those other cases. With numbers, the hyphen is preceded by either whitespace or it's at the beginning of the text node. That's not the case for hyphens in phone numbers or in URLs, typically.
Comment 7 Xianzhu Wang 2010-05-11 22:44:15 PDT
IE8 has changed the way of handling hyphens in standard mode to catch up the standard. See bug 37698 for details. The patch submitted in bug 37698 can also fix this bug.
Comment 8 Xianzhu Wang 2010-07-29 19:54:07 PDT
(In reply to comment #7)
> IE8 has changed the way of handling hyphens in standard mode to catch up the standard. See bug 37698 for details. The patch submitted in bug 37698 can also fix this bug.

The patch of bug 37698 has been landed, but this bug is unaffected.

There might be several methods to fix this bug:

1. Based on the patch of bug 37698, change the line breaking matrix about '-', disallowing line breaking between '-' and digits;
2. Same as above, but for the case of '-' before digits, let the underlying ICU handle it;
3. Implement a stateful line breaking algorithm like the one in Firefox.

I just did tests of method 1 and 2, the results seem the same, that is, ICU just simply disallows line breaking between '-' and digits. UAX#14 LB25 also contains a 'HY x NU' rule. So method 1 seems the simplest solution.

I'd like to hear your opinions.
Comment 9 mitz 2010-07-29 19:59:07 PDT
(In reply to comment #8)
> (In reply to comment #7)
> > IE8 has changed the way of handling hyphens in standard mode to catch up the standard. See bug 37698 for details. The patch submitted in bug 37698 can also fix this bug.
> 
> The patch of bug 37698 has been landed, but this bug is unaffected.
> 
> There might be several methods to fix this bug:
> 
> 1. Based on the patch of bug 37698, change the line breaking matrix about '-', disallowing line breaking between '-' and digits;
> 2. Same as above, but for the case of '-' before digits, let the underlying ICU handle it;
> 3. Implement a stateful line breaking algorithm like the one in Firefox.
> 
> I just did tests of method 1 and 2, the results seem the same, that is, ICU just simply disallows line breaking between '-' and digits. UAX#14 LB25 also contains a 'HY x NU' rule. So method 1 seems the simplest solution.
> 
> I'd like to hear your opinions.

I agree that if this change is desirable, then (1) is the way to go about it.
Comment 10 Xianzhu Wang 2010-07-31 16:52:10 PDT
Created attachment 63161 [details]
patch
Comment 11 Xianzhu Wang 2010-08-23 23:25:02 PDT
Hi, all,

Do you think this issue is still worth resolving? I'd like to hear your opinions about this issue.

Thanks,
Xianzhu
Comment 12 Darin Adler 2010-08-29 12:39:42 PDT
Comment on attachment 63161 [details]
patch

> +        Webkit wraps between hyphenâminus and numeric characters

Need to stick to ASCII in the change log.

> +        Modified the ascii line breaking table to disallow line breaking
> +        after hyphen-minus before numeric characters.

It's ASCII, not ascii.

> +        * rendering/break_lines.cpp:
> +        (WebCore::):

This function name isn't right. Please don't leave garbled things like this in the change log, even if a script generates them.
Comment 13 mitz 2010-08-29 12:50:46 PDT
(In reply to comment #12)
> (From update of attachment 63161 [details])
> > +        Webkit wraps between hyphenâminus and numeric characters
> 
> Need to stick to ASCII in the change log.

Is this really a requirement? The change log files are encoded as UTF-8, and include some contributor names which contain non-ASCII characters.

I don’t think that the fact that the “pretty diff” tool at bugs.webkit.org messes up these characters is a reason not to use them.
Comment 14 Darin Adler 2010-08-29 12:57:34 PDT
(In reply to comment #13)
> (In reply to comment #12)
> > (From update of attachment 63161 [details] [details])
> > > +        Webkit wraps between hyphenâminus and numeric characters
> > 
> > Need to stick to ASCII in the change log.
> 
> Is this really a requirement? The change log files are encoded as UTF-8, and include some contributor names which contain non-ASCII characters.
> 
> I don’t think that the fact that the “pretty diff” tool at bugs.webkit.org messes up these characters is a reason not to use them.

OK, Dan, you’re right, but I still think that we should use an ASCII hyphen here instead of an en dash. Lets stick to ASCII unless there is a good reason. I’m not even sure that the character name “hyphen minus” deserves an en dash.
Comment 15 mitz 2010-08-29 13:01:26 PDT
(In reply to comment #14)
> I’m not even sure that the character name “hyphen minus” deserves an en dash.

The Unicode name of the character is HYPHEN-MINUS, and it’s spelled with a HYPHEN-MINUS :-)
Comment 16 Rowan Beentje 2010-08-29 13:31:48 PDT
Xianzhu - thanks for the patch!  I still think it's a nice behavioural tweak; in large tables of facts and figures I still see this pop up quite a lot.  Smaller screens presumably suffer more, so a nice mobile tweak as well.

Looking forward to seeing this committed once the commit notes are sorted ;)
Comment 17 Xianzhu Wang 2010-08-30 20:47:28 PDT
Created attachment 66008 [details]
Updated patch

Thanks Darin for your review. Here is the updated patch.
Comment 18 Darin Adler 2010-08-30 22:06:11 PDT
Hyatt, do you now agree this is  good change, or do you still think it will prevent breaking URLs?
Comment 19 Xianzhu Wang 2010-09-05 21:51:29 PDT
ping...
Comment 20 Eric Seidel (no email) 2010-09-07 03:19:40 PDT
Comment on attachment 63161 [details]
patch

Cleared Darin Adler's review+ from obsolete attachment 63161 [details] so that this bug does not appear in http://webkit.org/pending-commit.
Comment 21 Xianzhu Wang 2010-09-14 18:43:11 PDT
(In reply to comment #18)
> Hyatt, do you now agree this is  good change, or do you still think it will prevent breaking URLs?

Hi, Hyatt, I think even without this change, long URLs still have few chances to break. I just did a stat on 70418 complex URLs (selected from our crawler library), 62649 (89%) of them don't contain number-after-hyphen or contain other line-breaking opportunities near (distance <= 10 chars) number-after-hyphen.

I'd also like to implement a more complex algorithm like the one mentioned in #3 if most of you agree it. This might require a refactoring of the current line breaking code. However, the refactoring can also make it possible to fix bug 37543.
Comment 22 Xianzhu Wang 2010-10-20 20:20:53 PDT
Hi, I'd like to hear your opinions about this issue. Should I do it step-by-step or in one step?

Step-by-step:
1. Land the current patch that disallows line breaking between a hyphen-minus and a digit;
2. Refactor line-breaking algorithm, introduce stateful line-breaking features.

One step:
Do the above Step 2 directly.
Comment 23 Alexey Proskuryakov 2010-10-20 20:28:14 PDT
Deviations from platform behavior should be a rare exception, not a rule. For example, if line breaking worked differently in Apple Mail and Safari than in other Mac OS X applications, that would be a problem for users.
Comment 24 Xianzhu Wang 2010-10-20 21:15:32 PDT
(In reply to comment #23)
> Deviations from platform behavior should be a rare exception, not a rule. For example, if line breaking worked differently in Apple Mail and Safari than in other Mac OS X applications, that would be a problem for users.

It seems that almost all platforms' WebKit are by default using ICU for line breaking of non-ASCII characters, except qt (using QtCore/QTextBoundaryFinder), wince (not using any library) and brew (not using any library). Gtk has two choices and ICU is the default. Line breaking between ASCII characters is handled in WebCore/platform/break_lines.cpp for all platforms.

If we want to keep platform behavior, do you think we should let all platforms use their own platform specific line-breaking libraries? I think this would degrade performance. I believe this is the main reason that we are not using any library for ASCII character line-breaking.
Comment 25 Rowan Beentje 2010-10-21 02:55:06 PDT
(In reply to comment #23)
> Deviations from platform behavior should be a rare exception, not a rule. For example, if line breaking worked differently in Apple Mail and Safari than in other Mac OS X applications, that would be a problem for users.

It appears the current ICU-style basic wrapping is quite different from Mac OS X.  If I try wrapping text in TextEdit (which I assume is using standard NSTextView behaviour…) hyphen-minus is already behaving in the way this bug - and the unicode spec - suggests.  So that sounds like another reason this change would be a good thing :)

I'm not sure about deviations in platforms across a broader stateful algorithm - I would favour WebKit wrapping things nicely, despite the underlying platform, rather than duplicating behaviour that breaks layout.
Comment 26 Adam Barth 2011-01-12 14:44:11 PST
Comment on attachment 66008 [details]
Updated patch

This patch is a tough call.  Mitz and Darin agree that this is the right approach for implementing this behavior if this is the behavior we want.  Several folks have differed to Hyatt w.r.t. whether this is the behavior we want, especially in connection with URLs.  The new behavior moves us to be more Firefox-like than IE-like and moves us closer to the spec.  In the common case, it seems like the new behavior will be ok for URLs.  The bad case is probably something like the old Amazon URLs with long strings of hyphenated numbers...  I'm marking this R+, but it would be nice to hear from Hyatt before landing.
Comment 27 Rowan Beentje 2011-01-12 14:59:49 PST
Is it easy to achieve different wrapping behaviour based on the preceding character in addition to the following character?

For example, I think most people will agree that "-1,234.56789" shouldn't wrap, under ideal circumstances.  At the moment, it wraps to the following:

-
1,234.56789

…not ideal.

The case is more mixed when there's characters preceding the hyphen-minus as well as following the hyphen-minus.  For example, currently 12345-67890 wraps to:

12345-
67890

This looks fine to me - in fact, here the hyphen-minus makes a natural break point anyway, and the meaning isn't lost.

The spec does suggest that neither case should wrap, but I suspect the mention of HY × NU is intended for the first case, if you'd rather follow the spirit than the letter :)

This would fix numbers in tables but leave URLs to wrap (which is another common table/wrapping bugbear due to long url-encoded strings!)
Comment 28 Rowan Beentje 2011-01-12 15:12:53 PST
Actually, it sounds like the spec does actually cover the behaviour I outlined in comment #27:

"Numbers are of the form PR ? ( OP | HY ) ? NU (NU | IS) * CL ?  PO ?"

The "HY × NU" explicit mention is in the following section which begins: "This is approximated with the following rules".

This does also mean that to be fully compliant, numeric prefixes ($, £, ¥, etc) shouldn't be broken after as part of a number, and enclosing brackets should also be respected… these currently seem to work as specced.
Comment 29 Xianzhu Wang 2011-01-12 17:14:33 PST
Thanks Adam for review.

As it need time to make this patch applicable to the newest tip of tree, during the time, anyone please continue to provide your opinions.
Comment 30 Edward Sabol 2011-01-12 20:39:25 PST
I think Rowan has summarized perfectly what *everyone* wants. Thanks, Rowan. May I suggest that the following tests be added:

"-1234.4567" should NOT wrap.
"$-1234.4567" should NOT wrap. Ditto for £, ¥, etc.
"1234-4567" should wrap as

1234-
4567

With this patch does Webkit satisfy all 3 cases? If not, how difficult would it be to achieve this ideal? It would be great if this long-standing bug could finally be closed!

By the way, "opening" is misspelled in the comment on line 69 of WebCore/rendering/break_lines.cpp. Can that correction be included in this patch? It's one line before Xianzhu's second modification.
Comment 31 Xianzhu Wang 2011-01-19 05:50:54 PST
Created attachment 79411 [details]
new patch

Thanks Rowan and Edward for your clear descriptions.

Adam/Eric, the updated patch added logics for breaking abcd-1234 and 1234-5678. Please take another look. Thanks.
Comment 32 Xianzhu Wang 2011-01-19 06:08:59 PST
FYI, just created another bug to track future improvements to break_lines.cpp:
https://bugs.webkit.org/show_bug.cgi?id=52715
Comment 33 Eric Seidel (no email) 2011-01-31 16:06:30 PST
Comment on attachment 66008 [details]
Updated patch

Cleared Adam Barth's review+ from obsolete attachment 66008 [details] so that this bug does not appear in http://webkit.org/pending-commit.
Comment 34 Xianzhu Wang 2011-02-14 00:09:50 PST
Hi, reviewers,

Would you please take a look at the new patch? (Then I could work on bug 52715.)

Thanks,
Xianzhu
Comment 35 Xianzhu Wang 2011-02-24 01:48:22 PST
Hi, reviewers,

Would you please take a look at the new patch? (Then I could work on bug 52715.)

Thanks,
Xianzhu
Comment 36 Xianzhu Wang 2011-03-28 20:14:28 PDT
ping...
Comment 37 Eric Seidel (no email) 2011-05-23 11:11:44 PDT
Levi, rniwa?  Any thoughts on this one?
Comment 38 Alex Shinn 2011-09-01 23:14:47 PDT
+1.  This issue is crucial for using Google docs spreadsheets, since with the current behavior negative numbers look positive.  There's also no workaround since webkit doesn't currently support "word-break: keep-all" (see bug 43917).
Comment 39 Darin Adler 2011-09-02 10:41:32 PDT
Comment on attachment 79411 [details]
new patch

View in context: https://bugs.webkit.org/attachment.cgi?id=79411&action=review

> Source/WebCore/rendering/break_lines.cpp:3
> + * Copyright (C) 2010, 2011 Google Inc. All rights reserved.

I’m a little confused about the 2010 here. Is there substantial code here that Google published in 2010?

> Source/WebCore/rendering/break_lines.cpp:89
> -    { B(1, 1, 1, 1, 1, 1, 1, 1), B(1, 1, 1, 1, 1, 1, 1, 1), F, B(1, 1, 1, 1, 1, 1, 1, 1), F, F, F, B(1, 1, 1, 1, 1, 1, 1, 1), F, F, F, B(1, 1, 1, 1, 1, 1, 1, 1) }, // -
> +    { B(1, 1, 1, 1, 1, 1, 1, 1), B(1, 1, 1, 1, 1, 0, 1, 0), 0, B(0, 1, 1, 1, 1, 1, 1, 1), F, F, F, B(1, 1, 1, 1, 1, 1, 1, 1), F, F, F, B(1, 1, 1, 1, 1, 1, 1, 1) }, // -

It’s hard for me to review the change in this line because the change log does not state what these 4 changes are or what they are supposed to accomplish.

> Source/WebCore/rendering/break_lines.cpp:135
> +            return isASCIIAlphanumeric(lastCh);

This doesn’t seem right. Why does this handle ASCII alphabetical characters differently from non-ASCII alphabetical characters? Does this code only run if all characters are ASCII? Maybe the ICU code already does this right for non-ASCII, but I can’t tell that from the patch.

I’d like to see test cases covering the non-ASCII characters to show we have the same kinds of behavior in those cases.

> Source/WebCore/rendering/break_lines.cpp:173
> +    UChar lastLastCh = pos > 1 ? str[pos - 2] : 0;

This now requires more context to correctly break lines. Do the callers always provide enough context in cases where text within the line is split across multiple elements?
Comment 40 Ryosuke Niwa 2011-09-02 10:49:37 PDT
Comment on attachment 79411 [details]
new patch

View in context: https://bugs.webkit.org/attachment.cgi?id=79411&action=review

> LayoutTests/fast/text/line-breaks-after-hyphen-before-number-expected.txt:24
> +1111111-2222222

This output is wrong, no?
Comment 41 Darin Adler 2011-09-02 12:27:21 PDT
Comment on attachment 79411 [details]
new patch

View in context: https://bugs.webkit.org/attachment.cgi?id=79411&action=review

>> LayoutTests/fast/text/line-breaks-after-hyphen-before-number-expected.txt:24
>> +1111111-2222222
> 
> This output is wrong, no?

The line breaks based on width don’t show up in the DumpRenderTree output, even though the break does happen on screen. The test checks that the result is correct by comparing the heights of the elements.

That’s consistent with the expected behavior. When you convert to text, line breaks based on the width of the layout aren’t considered.
Comment 42 Ryosuke Niwa 2011-09-02 12:31:17 PDT
Comment on attachment 79411 [details]
new patch

View in context: https://bugs.webkit.org/attachment.cgi?id=79411&action=review

>>> LayoutTests/fast/text/line-breaks-after-hyphen-before-number-expected.txt:24
>>> +1111111-2222222
>> 
>> This output is wrong, no?
> 
> The line breaks based on width don’t show up in the DumpRenderTree output, even though the break does happen on screen. The test checks that the result is correct by comparing the heights of the elements.
> 
> That’s consistent with the expected behavior. When you convert to text, line breaks based on the width of the layout aren’t considered.

I think it's quite misleading to dump the text result like this then. If I saw this result, I'd assume that the test is failing.  We should probably hide these text when ran in DRT/TestRunner.

> LayoutTests/fast/text/line-breaks-after-hyphen-before-number-expected.txt:30
> +PASS

We should probably print PASS/FAIL for each test case so that we can easily diagnose the results when one of these cases fail.
Comment 43 Xianzhu Wang 2011-09-04 21:15:44 PDT
Created attachment 106301 [details]
new patch

View in context: https://bugs.webkit.org/attachment.cgi?id=79411&action=review

>> Source/WebCore/rendering/break_lines.cpp:3
>> + * Copyright (C) 2010, 2011 Google Inc. All rights reserved.
> 
> I’m a little confused about the 2010 here. Is there substantial code here that Google published in 2010?

Yes. I forgot to update the copyright in https://bugs.webkit.org/show_bug.cgi?id=37698. I ever used phnixwxz@gmail.com when I had already been a Google employee.

>> Source/WebCore/rendering/break_lines.cpp:89
>> +    { B(1, 1, 1, 1, 1, 1, 1, 1), B(1, 1, 1, 1, 1, 0, 1, 0), 0, B(0, 1, 1, 1, 1, 1, 1, 1), F, F, F, B(1, 1, 1, 1, 1, 1, 1, 1), F, F, F, B(1, 1, 1, 1, 1, 1, 1, 1) }, // -
> 
> It’s hard for me to review the change in this line because the change log does not state what these 4 changes are or what they are supposed to accomplish.

It's part of a line-breaking matrix. The above change disables line breaking after '-' and before '.' and '0'..'9'. In fact, the above change doesn't affect the result after '-' and before digits because this is hard-coded in shouldBreakAfter().
Added some comment here and ChangeLog.

>> Source/WebCore/rendering/break_lines.cpp:135
>> +            return isASCIIAlphanumeric(lastCh);
> 
> This doesn’t seem right. Why does this handle ASCII alphabetical characters differently from non-ASCII alphabetical characters? Does this code only run if all characters are ASCII? Maybe the ICU code already does this right for non-ASCII, but I can’t tell that from the patch.
> 
> I’d like to see test cases covering the non-ASCII characters to show we have the same kinds of behavior in those cases.

Yes. ICU can handle the case correctly for both ASCII and non-ASCII characters but with lower performance.
This code will be executed if both nextCh and ch are ASCII characters, and is covering the cases of this bug.
I think non-ASCII characters are out of the scope of this bug.

>> Source/WebCore/rendering/break_lines.cpp:173
>> +    UChar lastLastCh = pos > 1 ? str[pos - 2] : 0;
> 
> This now requires more context to correctly break lines. Do the callers always provide enough context in cases where text within the line is split across multiple elements?

I did some test with ICU by reducing the context only from the lastChar. The ICU line-break iterator works differently with the reduced contexts. Many ICU line-breaking rules require even larger contexts.

However, for now WebKit won't pass contexts across element boundaries to line-break iterators. This causes different line-breaking on in-line element boundaries. For example, it may break after the '-' in '1111111-abcdefg', but not in '1111111-<i>abcdefg</i>'. This patch won't change this behavior. Should this be a bug? I searched and found a related bug: https://bugs.webkit.org/show_bug.cgi?id=56269.

>>>> LayoutTests/fast/text/line-breaks-after-hyphen-before-number-expected.txt:24
>>>> +1111111-2222222
>>> 
>>> This output is wrong, no?
>> 
>> The line breaks based on width don’t show up in the DumpRenderTree output, even though the break does happen on screen. The test checks that the result is correct by comparing the heights of the elements.
>> 
>> That’s consistent with the expected behavior. When you convert to text, line breaks based on the width of the layout aren’t considered.
> 
> I think it's quite misleading to dump the text result like this then. If I saw this result, I'd assume that the test is failing.  We should probably hide these text when ran in DRT/TestRunner.

Done.

>> LayoutTests/fast/text/line-breaks-after-hyphen-before-number-expected.txt:30
>> +PASS
> 
> We should probably print PASS/FAIL for each test case so that we can easily diagnose the results when one of these cases fail.

Done.
Comment 44 Darin Adler 2011-09-06 08:52:58 PDT
(In reply to comment #43)
> However, for now WebKit won't pass contexts across element boundaries to line-break iterators. This causes different line-breaking on in-line element boundaries. For example, it may break after the '-' in '1111111-abcdefg', but not in '1111111-<i>abcdefg</i>'. This patch won't change this behavior. Should this be a bug?

Yes.
Comment 45 Xianzhu Wang 2011-09-06 18:43:19 PDT
(In reply to comment #44)
> (In reply to comment #43)
> > However, for now WebKit won't pass contexts across element boundaries to line-break iterators. This causes different line-breaking on in-line element boundaries. For example, it may break after the '-' in '1111111-abcdefg', but not in '1111111-<i>abcdefg</i>'. This patch won't change this behavior. Should this be a bug?
> 
> Yes.

FYI, just found it: https://bugs.webkit.org/show_bug.cgi?id=17427
Comment 46 Xianzhu Wang 2011-09-06 18:52:42 PDT
Hi Darin, Roysuke, does the latest patch look good to you?
Comment 47 Darin Adler 2011-09-12 17:00:05 PDT
Comment on attachment 106301 [details]
new patch

View in context: https://bugs.webkit.org/attachment.cgi?id=106301&action=review

> Source/WebCore/rendering/break_lines.cpp:134
> +            // while allow breaking in 'ABCD-1234' and '1234-5678'. <https://bugs.webkit.org/show_bug.cgi?id=20677>

I don’t think having the bug number here is a good idea. I see other bug URLs in this file, but there’s no need long-term to track the bug number of each bug we fixed.

> LayoutTests/ChangeLog:12
> +        * fast/text/line-breaks-after-hyphen-before-number.html: Added.

I can’t find this test case in the patch.

> LayoutTests/fast/text/line-breaks-after-hyphen-before-number-expected.txt:8
> +0: PASS
> +1: PASS
> +2: PASS
> +3: PASS
> +4: PASS
> +5: PASS
> +Summary: PASS

This output is not clear at all. The test would be better if the output made it clearer what was being tested.
Comment 48 Xianzhu Wang 2011-09-12 21:42:05 PDT
Comment on attachment 106301 [details]
new patch

View in context: https://bugs.webkit.org/attachment.cgi?id=106301&action=review

>> Source/WebCore/rendering/break_lines.cpp:134
>> +            // while allow breaking in 'ABCD-1234' and '1234-5678'. <https://bugs.webkit.org/show_bug.cgi?id=20677>
> 
> I don’t think having the bug number here is a good idea. I see other bug URLs in this file, but there’s no need long-term to track the bug number of each bug we fixed.

Done.

>> LayoutTests/ChangeLog:12
>> +        * fast/text/line-breaks-after-hyphen-before-number.html: Added.
> 
> I can’t find this test case in the patch.

Sorry, my fault. Fixed.

>> LayoutTests/fast/text/line-breaks-after-hyphen-before-number-expected.txt:8
>> +Summary: PASS
> 
> This output is not clear at all. The test would be better if the output made it clearer what was being tested.

Re-designed the input/output.
Comment 49 Xianzhu Wang 2011-09-12 21:46:11 PDT
Created attachment 107141 [details]
patch v5
Comment 50 WebKit Review Bot 2011-09-13 09:43:21 PDT
Comment on attachment 107141 [details]
patch v5

Clearing flags on attachment: 107141

Committed r95030: <http://trac.webkit.org/changeset/95030>
Comment 51 WebKit Review Bot 2011-09-13 09:43:29 PDT
All reviewed patches have been landed.  Closing bug.