Bug 27074

Summary: Text should wrap after hyphens, not before
Product: WebKit
Reporter: iain.dalton
Component: Layout and Rendering
Assignee: Nobody <webkit-unassigned>
Severity: Minor
CC: phnixwxz
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Linux   
Bug Depends on:    
Bug Blocks: 41103    
Attachments: test case (flags: none)

Description iain.dalton 2009-07-08 02:20:08 PDT
Calibre (http://calibre.kovidgoyal.net/) uses WebKit (I'm not sure which version), and it wraps hyphens onto the next line, which looks odd and not book-like. The developer says it's a WebKit issue, so I'm reporting it here.
Comment 1 Mark Rowe (bdash) 2009-07-08 10:48:15 PDT
Perhaps you'd like to include a list of steps necessary to reproduce the problem?
Comment 2 iain.dalton 2009-07-08 21:57:24 PDT
1. Download and install Calibre (http://calibre.kovidgoyal.net/download).
2. Open an ebook (http://www.epubbooks.com/book/60/count-of-monte-cristo) with Calibre's viewer.
3. Resize the window until it has to break a hyphenated word (of which this book has several right at the top).

Of the few ebooks I've looked at, only this one puts the hyphen on the new line. I see nothing suspicious in the HTML, so I don't know why.
Comment 3 Alexey Proskuryakov 2009-07-10 21:46:12 PDT
> I see nothing suspicious in the HTML, so I don't know why.

Is this reproducible with WebKit-based Web browsers? Could you make a reduction, or upload the HTML source here?
Comment 4 iain.dalton 2009-07-12 14:48:53 PDT
In looking at the source, I see that the hyphens are actually misused en dashes. I'm no typographer, but the only place I've known an en dash to occur without spaces around it (other than this misuse as a hyphen) is in a range, such as 2–5. Should breaking occur at an en dash at all?
Comment 5 Xianzhu Wang 2010-10-20 20:15:37 PDT
Created attachment 71384 [details]
test case

I couldn't reproduce the bug in Chrome 7 and 8 or in Safari 5, but I could reproduce it in the Qt version of WebKit.

The Qt version of WebKit uses QTextBoundaryFinder to find line-breaking opportunities, so this is likely a Qt bug rather than a bug in WebKit itself.
Comment 6 Pat 2010-11-08 10:53:47 PST
Using QtWebKit trunk (wk45), this is not reproducible.
The test content uses the entity &#x2013; for the en dash, and the breaks occur *after* the dash. I tried the test content saved as UTF-8 with and without a BOM, and also as UTF-16; in each case the content displayed correctly and broke after the en dash. I used QtTestBrowser to verify on Symbian and Linux; both work as expected.

Test content, as provided, used:
abcdefg&#x2013;abcdefg&#x2013;abcdefg&#x2013;abcdefg&#x2013;abcdefg&#x2013;abcdefg&#x2013; ....
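
For anyone wanting to regenerate similar content, here is a minimal Python sketch that confirms which character the entity decodes to and builds a comparable test page. It is not the actual attachment: the function name build_test_page, the repeat count, and the surrounding HTML skeleton are assumptions made for illustration.

```python
import html
import unicodedata

ENTITY = "&#x2013;"            # entity exactly as it appears in the test content
dash = html.unescape(ENTITY)   # decode the character reference to one character


def build_test_page(repeats=40):
    """Return a small HTML page repeating 'abcdefg' followed by the entity.

    The repeat count and the HTML skeleton are hypothetical; the original
    attachment's exact length and markup are not shown in this report.
    """
    body = ("abcdefg" + ENTITY) * repeats
    return "<!DOCTYPE html>\n<html><body><p>" + body + "</p></body></html>\n"


# Show the code point and Unicode name of the decoded character.
print(hex(ord(dash)), unicodedata.name(dash))
```

Saving the returned string to a file and narrowing the browser window should force line breaks inside the run of dashes.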

Note that the test content has no meta tag:
<meta http-equiv="content-type" content="text/html; charset=UTF-16">
I tested with the meta tag added, and it works as expected.

The behavior is the same for Firefox, Safari, and QtTestBrowser.
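
To make the expected "break after the dash" behavior concrete, here is a minimal Python sketch of a greedy wrapper that treats the position after a hyphen or en dash as the only break opportunity. It is an illustration under simplifying assumptions (every character has width 1, the text contains no spaces, and the function names are hypothetical), not the algorithm WebKit or Qt actually uses.

```python
EN_DASH = "\u2013"
BREAK_AFTER = {"-", EN_DASH}  # characters that permit a break right after them


def split_chunks(text):
    """Split text into chunks that each end just after a break-after character."""
    chunks, current = [], ""
    for ch in text:
        current += ch
        if ch in BREAK_AFTER:
            chunks.append(current)
            current = ""
    if current:
        chunks.append(current)
    return chunks


def wrap_after_dashes(text, width):
    """Greedily pack chunks into lines of at most `width` characters.

    A chunk longer than `width` is left on its own line unbroken.
    """
    lines, line = [], ""
    for chunk in split_chunks(text):
        if line and len(line) + len(chunk) > width:
            lines.append(line)
            line = chunk
        else:
            line += chunk
    if line:
        lines.append(line)
    return lines


sample = ("abcdefg" + EN_DASH) * 3
for row in wrap_after_dashes(sample, 10):
    print(row)
```

With this approach the dash always stays at the end of a line and never starts one, which matches the behavior reported above for Firefox, Safari, and QtTestBrowser.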