WebKit Bugzilla
RESOLVED MOVED
242822
Em dash should not be separated from preceding word
https://bugs.webkit.org/show_bug.cgi?id=242822
Brad Andalman
Reported
2022-07-15 15:12:08 PDT
Created attachment 460938: HTML that shows incorrect word wrap for a word followed by an em dash

When an em dash immediately follows a word, and that em dash can't fit on a line, then both the preceding word and the em dash should be moved to the next line. This works for hyphens, en dashes, and figure dashes, but does not work for em dashes. Both Safari and Chrome exhibit this bug. Firefox, however, behaves correctly.
Attachments
HTML that shows incorrect word wrap for a word followed by an em dash (1.33 KB, text/html), 2022-07-15 15:12 PDT, Brad Andalman, no flags
Screenshot of Safari, Chrome, and Firefox (601.96 KB, image/png), 2022-07-15 15:14 PDT, Brad Andalman, no flags
Test case (381 bytes, text/html), 2022-07-15 16:43 PDT, zalan, no flags
Apple Books showing em dash and quotation mark on its own line (653.75 KB, image/png), 2022-07-15 18:53 PDT, Brad Andalman, no flags
Brad Andalman
Comment 1
2022-07-15 15:14:44 PDT
Created attachment 460939: Screenshot of Safari, Chrome, and Firefox

Screenshot of Safari, Chrome, and Firefox rendering the HTML in the first attachment. Safari and Chrome both exhibit the bug. Firefox, on the right, behaves correctly.
zalan
Comment 2
2022-07-15 16:43:50 PDT
Created attachment 460941: Test case

Apparently ubrk_following() returns position 2 for XX[em dash]XX and position 3 for XX[figure dash]XX, so we find a soft wrap opportunity between XX and [em dash]. (Not sure how Firefox resolves this; we rely strictly on ICU here.)
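
The difference between the two positions can be illustrated with a toy model. This is a minimal sketch, not ICU's implementation: it assumes only the UAX #14 class assignments for the two characters involved (EM DASH U+2014 is class B2, with a break opportunity before and after; FIGURE DASH U+2012 is class BA, with a break opportunity after only) and treats every other character as unbreakable. The function name `first_break_after` is hypothetical and only mimics the shape of a ubrk_following() query.

```python
# Toy model of the two dash classes discussed above; NOT ICU's algorithm.
EM_DASH = "\u2014"      # UAX #14 class B2: break opportunity before and after
FIGURE_DASH = "\u2012"  # UAX #14 class BA: break opportunity after only

BREAK_AFTER_ONLY = {FIGURE_DASH}
BREAK_BEFORE_AND_AFTER = {EM_DASH}


def first_break_after(text: str, offset: int = 0) -> int:
    """Return the first break position strictly greater than `offset`.

    Hypothetical helper: position i means "between text[i-1] and text[i]".
    The end of the text always counts as a boundary.
    """
    for i in range(offset + 1, len(text)):
        prev, cur = text[i - 1], text[i]
        if prev in BREAK_AFTER_ONLY or prev in BREAK_BEFORE_AND_AFTER:
            return i  # a break is allowed after either kind of dash
        if cur in BREAK_BEFORE_AND_AFTER:
            return i  # only the B2-style dash also allows a break before it
    return len(text)


print(first_break_after("XX" + EM_DASH + "XX"))      # 2
print(first_break_after("XX" + FIGURE_DASH + "XX"))  # 3
```

In this model the em dash can be stranded at the start of the next line because position 2 sits between the word and the dash, while the figure dash stays attached to the preceding word.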
Alexey Proskuryakov
Comment 3
2022-07-15 18:06:11 PDT
This looks like correct behavior per UAX #14. It also matches TextEdit.
Brad Andalman
Comment 4
2022-07-15 18:52:35 PDT
UAX#14 does assert that "Line breaks can occur before and after an EM DASH." It also claims that the only use for an EM DASH is to "set off parenthetical text." That is only one of the ways that an EM DASH can be used, however. The Chicago Manual of Style, for instance, enumerates EIGHT different, valid uses for an EM DASH.

In entry 6.87 of the 17th edition, the Chicago Manual of Style mentions that an EM DASH should be used for "sudden breaks or interruptions." One of the examples it uses is as follows:

"Well, I don't know," I began tentatively. "I thought I might—"
"Might what?" she demanded.

If that trailing EM DASH followed by a quotation mark were to end on its own line, it would look terrible. This is easy to make happen on a simple web page, as in my original attachment, but it is easily seen in Apple Books as well. (I'll attach a screenshot of The Invisible Man that illustrates this.)

The Chicago Manual of Style also addresses the problem of line breaks directly (in 6.90): "In printed publications, line breaks should generally be made after an em dash but not before, in the manner of hyphens. In the case of a closing quotation mark (or any other mark of punctuation) immediately following the dash, however, the quotation mark and dash MUST NOT BE BROKEN AT THE END OF A LINE" [emphasis mine].
Brad Andalman
Comment 5
2022-07-15 18:53:24 PDT
Created attachment 460950: Apple Books showing em dash and quotation mark on its own line
Alexey Proskuryakov
Comment 6
2022-07-16 13:34:57 PDT
An author can implement the desired behavior with a zero width joiner (e.g. "sir‍—" for the attached test), among other ways. While the CSS spec is not fully prescriptive on exactly following UAX #14, it does reference it as the baseline. So WebKit is not wrong here, and given that Chrome behaves in the same way, keeping our current behavior is best for compatibility.
https://drafts.csswg.org/css-text/#soft-wrap-opportunity
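
The zero width joiner workaround could be applied mechanically by an author. The following is a minimal sketch, assuming the text is available as a plain string before it is written into the page. The helper name `glue_em_dashes` is hypothetical; U+200D is the character named in the comment above.

```python
import re

ZWJ = "\u200d"      # ZERO WIDTH JOINER, the character suggested in the comment
EM_DASH = "\u2014"

# A word character immediately followed by an em dash.
_WORD_THEN_DASH = re.compile(r"(\w)" + EM_DASH)


def glue_em_dashes(text: str, joiner: str = ZWJ) -> str:
    """Insert `joiner` between a word and the em dash that directly follows it."""
    return _WORD_THEN_DASH.sub(lambda m: m.group(1) + joiner + EM_DASH, text)


glued = glue_em_dashes("Yes, sir" + EM_DASH)
print([hex(ord(c)) for c in glued[-3:]])  # ['0x72', '0x200d', '0x2014']
```

Running the helper twice leaves the text unchanged, because the joiner is not a word character and so the pattern no longer matches. Whether the inserted joiner actually suppresses the break depends on the engine's line breaker, which this sketch does not verify.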
Myles C. Maxfield
Comment 7
2022-07-16 21:05:07 PDT
WebKit treats ICU as the source-of-truth for line breaking behavior. If you want this to be fixed, I recommend reporting this to the ICU project instead at
https://unicode-org.atlassian.net/jira/software/c/projects/ICU/issues/?filter=allissues
Myles C. Maxfield
Comment 8
2022-07-16 21:06:00 PDT
> If that trailing EM DASH followed by a quotation mark were to end on its own line, it would look terrible.
I agree, but this needs to be fixed in ICU, not WebKit.
Brad Andalman
Comment 9
2022-07-18 10:20:44 PDT
Filed with ICU here:
https://unicode-org.atlassian.net/browse/ICU-22090
Brad Andalman
Comment 10
2022-07-18 10:27:28 PDT
Thanks for helping me find the right place to report this!
Brent Fulgham
Comment 11
2022-07-18 11:54:36 PDT
Reclassifying as MOVED (as the bug is in the ICU component). The bug is not INVALID.
Myles C. Maxfield
Comment 12
2022-07-18 12:14:53 PDT
Thank you for refiling!
Brad Andalman
Comment 13
2022-07-19 10:56:11 PDT
I was informed that filing with the ICU wasn't correct, so I refiled it as an error against UAX#14. My comments have been added to PRI #446 for feedback:
https://www.unicode.org/review/pri446/
Once again, thanks to everyone for helping me submit this to the right venue. I truly appreciate it!
zalan
Comment 14
2022-07-19 11:00:41 PDT
(In reply to Brad Andalman from comment #13)
> I was informed that filing with the ICU wasn't correct, so I refiled it as
> an error against UAX#14. My comments have been added to PRI #446 for
> feedback:
> https://www.unicode.org/review/pri446/
>
> Once again, thanks to everyone for helping me submit this to the right
> venue. I truly appreciate it!
Thank you for filing it! When the fix comes through, both WebKit and Chrome will progress!
Myles C. Maxfield
Comment 15
2022-09-07 21:59:55 PDT
*** Bug 21677 has been marked as a duplicate of this bug. ***
Karl Dubost
Comment 16
2022-09-08 08:36:48 PDT
See also the opposite behavior in
https://bugzilla.mozilla.org/show_bug.cgi?id=1269147