Bug 114721 - REGRESSION (r147588): Line breaks occur in the middle of Hebrew words at haaertz.co.il and other websites
Summary: REGRESSION (r147588): Line breaks occur in the middle of Hebrew words at haae...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Text (show other bugs)
Version: 528+ (Nightly build)
Hardware: Macintosh OS X 10.8
: P1 Normal
Assignee: Glenn Adams
URL: http://www.haaretz.co.il/news/world/a...
Keywords: InRadar, Regression
Depends on:
Blocks:
 
Reported: 2013-04-16 21:25 PDT by mitz
Modified: 2013-04-20 21:32 PDT (History)
8 users (show)

See Also:


Attachments
Test case - Hebrew - with empty inline (153 bytes, text/html)
2013-04-16 21:25 PDT, mitz
no flags Details
Patch (5.40 KB, patch)
2013-04-17 16:00 PDT, Glenn Adams
no flags Details | Formatted Diff | Diff
Patch (5.38 KB, patch)
2013-04-17 16:10 PDT, Glenn Adams
no flags Details | Formatted Diff | Diff
Test case - French (695 bytes, text/html)
2013-04-17 16:34 PDT, Glenn Adams
no flags Details
Test case - Latin1 Reduction (553 bytes, text/html)
2013-04-17 19:27 PDT, Glenn Adams
no flags Details
Test case - Latin1 Reduction - Improved (848 bytes, text/html)
2013-04-18 08:34 PDT, Glenn Adams
no flags Details
Patch (12.28 KB, patch)
2013-04-18 20:30 PDT, Glenn Adams
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from webkit-ews-05 for mac-mountainlion (564.93 KB, application/zip)
2013-04-18 21:42 PDT, Build Bot
no flags Details
Archive of layout-test-results from webkit-ews-01 for mac-mountainlion (630.00 KB, application/zip)
2013-04-18 22:15 PDT, Build Bot
no flags Details
Patch (30.40 KB, patch)
2013-04-18 22:23 PDT, Glenn Adams
no flags Details | Formatted Diff | Diff
Performance Results (39.15 KB, text/html)
2013-04-19 18:09 PDT, Glenn Adams
no flags Details
Patch for landing (30.25 KB, patch)
2013-04-20 09:13 PDT, Glenn Adams
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description mitz 2013-04-16 21:25:25 PDT
Created attachment 198464 [details]
Test case - Hebrew - with empty inline

<rdar://problem/13670430>

To reproduce: open the attached test case. The second line breaks in the middle of the word צרור. It should break before that word. This was caused by <http://trac.webkit.org/r147588>, the fix for bug 105692.
Comment 1 Glenn Adams 2013-04-17 12:15:50 PDT
Will work on this later today. Please hold off on a revert.
Comment 2 Glenn Adams 2013-04-17 16:00:17 PDT
Created attachment 198625 [details]
Patch
Comment 3 Philippe Wittenbergh 2013-04-17 16:09:12 PDT
I suspect this might be the same as what I see on some French pages.
Example:
http://www.lemonde.fr/societe/article/2013/04/16/la-cedh-refuse-l-extradition-vers-les-etats-unis-d-un-schizophrene-accuse-de-terrorisme_3160563_3224.html

In the last paragraph, in the sentence 'Il fait partie d'un groupe de six islamistes radicaux dont le célèbre imam londonien Abou Hamza Masri …', the word 'célèbre' is split after 'cé'. The line break should happen before or after the whole word (before based on Safari 6.04).
Comment 4 Glenn Adams 2013-04-17 16:10:28 PDT
Created attachment 198626 [details]
Patch
Comment 5 Glenn Adams 2013-04-17 16:17:26 PDT
(In reply to comment #3)
> I suspect this might be the same as what I see on some French pages.
> Example:
> http://www.lemonde.fr/societe/article/2013/04/16/la-cedh-refuse-l-extradition-vers-les-etats-unis-d-un-schizophrene-accuse-de-terrorisme_3160563_3224.html
> 
> In the last paragraph, in the sentence 'Il fait partie d'un groupe de six islamistes radicaux dont le célèbre imam londonien Abou Hamza Masri …', the word 'célèbre' is split after 'cé'. The line break should happen before or after the whole word (before based on Safari 6.04).

I'll investigate, but at first glance, it appears to be a different case, since there is no empty inline involved (as was the case with the hebrew text). The patch I just posted fixes the hebrew test case, but not the example you cite.
Comment 6 mitz 2013-04-17 16:20:56 PDT
(In reply to comment #5)
> (In reply to comment #3)
> > I suspect this might be the same as what I see on some French pages.
> > Example:
> > http://www.lemonde.fr/societe/article/2013/04/16/la-cedh-refuse-l-extradition-vers-les-etats-unis-d-un-schizophrene-accuse-de-terrorisme_3160563_3224.html
> > 
> > In the last paragraph, in the sentence 'Il fait partie d'un groupe de six islamistes radicaux dont le célèbre imam londonien Abou Hamza Masri …', the word 'célèbre' is split after 'cé'. The line break should happen before or after the whole word (before based on Safari 6.04).
> 
> I'll investigate, but at first glance, it appears to be a different case, since there is no empty inline involved (as was the case with the hebrew text). The patch I just posted fixes the hebrew test case, but not the example you cite.

The attached test case is based on an example where the inline wasn’t empty. If your patch doesn’t fix that, then I suggest that you rework it to completely fix the regression.
Comment 7 mitz 2013-04-17 16:24:09 PDT
Added a URL that shows the bug and doesn’t involve empty inlines.
Comment 8 Philippe Wittenbergh 2013-04-17 16:33:29 PDT
(In reply to comment #5)
 
> I'll investigate, but at first glance, it appears to be a different case, since there is no empty inline involved (as was the case with the hebrew text). The patch I just posted fixes the hebrew test case, but not the example you cite.

It has the same regression range: between r147562 and r147603 (based on publicly available builds)
Comment 10 Glenn Adams 2013-04-17 16:36:05 PDT
(In reply to comment #8)
> (In reply to comment #5)
> 
> > I'll investigate, but at first glance, it appears to be a different case, since there is no empty inline involved (as was the case with the hebrew text). The patch I just posted fixes the hebrew test case, but not the example you cite.
> 
> It has the same regression range: between r147562 and r147603 (based on publicly available builds)

I agree it is a regression from r147588. I was just saying that it didn't seem the same mechanism as the empty inline. In any case, I will do as mitz suggests and create a patch that fixes both cases.
Comment 11 Glenn Adams 2013-04-17 19:27:44 PDT
Created attachment 198669 [details]
Test case - Latin1 Reduction

Reduced case of French example of regression. Here showing alternatives that do not entail regression.
Comment 12 Glenn Adams 2013-04-18 08:34:58 PDT
Created attachment 198740 [details]
Test case - Latin1 Reduction - Improved

A few more variations.
Comment 13 Glenn Adams 2013-04-18 20:30:51 PDT
Created attachment 198791 [details]
Patch
Comment 14 Glenn Adams 2013-04-18 20:39:39 PDT
(In reply to comment #13)
> Created an attachment (id=198791) [details]
> Patch

This patch corrects both the original problem reported by mitt (comment #1) and the additional problem reported by Philippe (comment #3). Before requesting review, I want to run some performance tests.
Comment 15 Glenn Adams 2013-04-18 20:40:15 PDT
(In reply to comment #14)
> (In reply to comment #13)
> > Created an attachment (id=198791) [details] [details]
> > Patch
> 
> This patch corrects both the original problem reported by mitt (comment #1) and the additional problem reported by Philippe (comment #3). Before requesting review, I want to run some performance tests.

s/mitt/mitz/
Comment 16 Build Bot 2013-04-18 21:42:57 PDT
Comment on attachment 198791 [details]
Patch

Attachment 198791 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.appspot.com/results/11201

New failing tests:
fast/writing-mode/Kusa-Makura-background-canvas.html
Comment 17 Build Bot 2013-04-18 21:42:59 PDT
Created attachment 198796 [details]
Archive of layout-test-results from webkit-ews-05 for mac-mountainlion

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: webkit-ews-05  Port: mac-mountainlion  Platform: Mac OS X 10.8.2
Comment 18 Build Bot 2013-04-18 22:15:25 PDT
Comment on attachment 198791 [details]
Patch

Attachment 198791 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.appspot.com/results/170168

New failing tests:
fast/writing-mode/Kusa-Makura-background-canvas.html
Comment 19 Build Bot 2013-04-18 22:15:27 PDT
Created attachment 198800 [details]
Archive of layout-test-results from webkit-ews-01 for mac-mountainlion

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: webkit-ews-01  Port: mac-mountainlion  Platform: Mac OS X 10.8.2
Comment 20 Glenn Adams 2013-04-18 22:23:32 PDT
Created attachment 198801 [details]
Patch
Comment 21 Glenn Adams 2013-04-19 18:09:14 PDT
Created attachment 198922 [details]
Performance Results

No apparent degradation.
Comment 22 Glenn Adams 2013-04-19 18:10:10 PDT
Comment on attachment 198801 [details]
Patch

No performance degradation indicated; ready for review.
Comment 23 Dean Jackson 2013-04-19 19:56:05 PDT
Comment on attachment 198801 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=198801&action=review

> LayoutTests/fast/text/line-break-after-empty-inline-hebrew.html:18
> +    testRunner.dumpAsText();
> +function getLineWidths(paragraphNumber) {

Nit: Needs blank line

> LayoutTests/fast/text/line-break-after-empty-inline-hebrew.html:33
> +}
> +var widths1 = getLineWidths(0);

Here too.

> LayoutTests/fast/text/line-break-after-empty-inline-hebrew.html:38
> +if (widths1.length != widths2.length) {
> +    results += 'FAIL: different number of lines, got ' + widths2.length + ', expected ' + widths1.length;
> +} else {

Single line block -> no braces.

> LayoutTests/fast/text/line-break-after-inline-latin1.html:10
> +<p>With span inline on first word, Latitn1 in first position of second word.</p>

Typo Latin1

> LayoutTests/fast/text/line-break-after-inline-latin1.html:16
> +<p>Without span, Latitn1 in first position of second word.</p>

Typo Latin1

> LayoutTests/fast/text/line-break-after-inline-latin1.html:25
> +    testRunner.dumpAsText();
> +function mergeRect(rects, rect) {

Nit: needs blank line

> LayoutTests/fast/text/line-break-after-inline-latin1.html:29
> +    if (!rects.length) {
> +        newRects.push(rect);
> +    } else {

Single line block -> no braces

> LayoutTests/fast/text/line-break-after-inline-latin1.html:33
> +            if (!rect) {
> +                newRects.push(r);

Ditto

> LayoutTests/fast/text/line-break-after-inline-latin1.html:41
> +                if (rect.left + rect.width > r.left + r.width) {
> +                    r.width = rect.left + rect.width - r.left;
> +                }

Ditto

> LayoutTests/fast/text/line-break-after-inline-latin1.html:46
> +            } else {
> +                newRects.push(r);
> +            }

Ditto

> LayoutTests/fast/text/line-break-after-inline-latin1.html:50
> +        if (rect) {
> +            newRects.push(rect);
> +        }

Ditto

> LayoutTests/fast/text/line-break-after-inline-latin1.html:57
> +    for (var i = 0; i < newRects.length; ++i) {
> +        rects = mergeRect(rects, newRects[i]);
> +    }

Ditto

> LayoutTests/fast/text/line-break-after-inline-latin1.html:85
> +    if (rects1.length != rects2.length) {
> +        results = appendResults(results, 'FAIL: different number of lines, got ' + rects1.length + ', expected ' + rects2.length);
> +    } else {

Ditto

> LayoutTests/fast/text/line-break-after-inline-latin1.html:99
> +for (var i = 0; i < 3; i++ ) {
> +    results = compareParagraphLineRects(i*2 + 1, i*2 + 1 + 6, results);
> +}

Ditto
Comment 24 Glenn Adams 2013-04-19 20:06:32 PDT
Thanks. I'll fix these tomorrow before landing. I didn't realize folks were using the WK C++ style convention for JS. Will keep that in mind in future.

(In reply to comment #23)
> (From update of attachment 198801 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=198801&action=review
> 
> > LayoutTests/fast/text/line-break-after-empty-inline-hebrew.html:18
> > +    testRunner.dumpAsText();
> > +function getLineWidths(paragraphNumber) {
> 
> Nit: Needs blank line
> 
> > LayoutTests/fast/text/line-break-after-empty-inline-hebrew.html:33
> > +}
> > +var widths1 = getLineWidths(0);
> 
> Here too.
> 
> > LayoutTests/fast/text/line-break-after-empty-inline-hebrew.html:38
> > +if (widths1.length != widths2.length) {
> > +    results += 'FAIL: different number of lines, got ' + widths2.length + ', expected ' + widths1.length;
> > +} else {
> 
> Single line block -> no braces.
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:10
> > +<p>With span inline on first word, Latitn1 in first position of second word.</p>
> 
> Typo Latin1
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:16
> > +<p>Without span, Latitn1 in first position of second word.</p>
> 
> Typo Latin1
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:25
> > +    testRunner.dumpAsText();
> > +function mergeRect(rects, rect) {
> 
> Nit: needs blank line
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:29
> > +    if (!rects.length) {
> > +        newRects.push(rect);
> > +    } else {
> 
> Single line block -> no braces
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:33
> > +            if (!rect) {
> > +                newRects.push(r);
> 
> Ditto
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:41
> > +                if (rect.left + rect.width > r.left + r.width) {
> > +                    r.width = rect.left + rect.width - r.left;
> > +                }
> 
> Ditto
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:46
> > +            } else {
> > +                newRects.push(r);
> > +            }
> 
> Ditto
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:50
> > +        if (rect) {
> > +            newRects.push(rect);
> > +        }
> 
> Ditto
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:57
> > +    for (var i = 0; i < newRects.length; ++i) {
> > +        rects = mergeRect(rects, newRects[i]);
> > +    }
> 
> Ditto
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:85
> > +    if (rects1.length != rects2.length) {
> > +        results = appendResults(results, 'FAIL: different number of lines, got ' + rects1.length + ', expected ' + rects2.length);
> > +    } else {
> 
> Ditto
> 
> > LayoutTests/fast/text/line-break-after-inline-latin1.html:99
> > +for (var i = 0; i < 3; i++ ) {
> > +    results = compareParagraphLineRects(i*2 + 1, i*2 + 1 + 6, results);
> > +}
> 
> Ditto
Comment 25 Glenn Adams 2013-04-20 09:13:05 PDT
Created attachment 198945 [details]
Patch for landing
Comment 26 Glenn Adams 2013-04-20 09:13:57 PDT
(In reply to comment #25)
> Created an attachment (id=198945) [details]
> Patch for landing

With JS style fixes.
Comment 27 WebKit Commit Bot 2013-04-20 09:40:02 PDT
Comment on attachment 198945 [details]
Patch for landing

Clearing flags on attachment: 198945

Committed r148791: <http://trac.webkit.org/changeset/148791>
Comment 28 WebKit Commit Bot 2013-04-20 09:40:05 PDT
All reviewed patches have been landed.  Closing bug.
Comment 29 Glenn Adams 2013-04-20 10:46:11 PDT
Note that fast/writing-mode/Kusa-Makura-background-canvas.html will require rebaselining (text and image).
Comment 30 Glenn Adams 2013-04-20 10:47:04 PDT
(In reply to comment #29)
> Note that fast/writing-mode/Kusa-Makura-background-canvas.html will require rebaselining (text and image).

P.S. I'm collecting results now to do this rebaseline very shortly.
Comment 31 Glenn Adams 2013-04-20 21:32:30 PDT
(In reply to comment #30)
> (In reply to comment #29)
> > Note that fast/writing-mode/Kusa-Makura-background-canvas.html will require rebaselining (text and image).
> 
> P.S. I'm collecting results now to do this rebaseline very shortly.

Done.

http://trac.webkit.org/changeset/148792
http://trac.webkit.org/changeset/148797