Bug 241716

Summary: Uighur hyphenation should be supported
Product: WebKit
Reporter: r12a <ishida>
Component: Text
Assignee: Nobody <webkit-unassigned>
Status: NEW
Severity: Normal
CC: ishida, mmaxfield, webkit-bug-importer, zalan
Priority: P2
Keywords: InRadar
Version: Safari 15   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Example of hyphenation in Uighur none

Description r12a 2022-06-17 08:52:37 PDT
Created attachment 460298 [details]
Example of hyphenation in Uighur

Unlike Arabic, which is never hyphenated, words in Uighur text in the Arabic script can be broken at line ends. A short horizontal stroke is added at the end of the line, separated from the previous text by a small space, and joining forms are retained for left-joining letters at line end and line start.

See the attached illustration.

This hyphenation method needs to be supported in browsers.

Specs:
CSS Text Level 3 provides controls for hyphenation and alludes to the requirement to create joining letter forms at line end and line start for Arabic-script text where hyphenation is allowed, but it leaves it to the browser implementation to produce the specific type of hyphenation that is appropriate to a given language.


Tests & results:
The following tests use the second half of the text in the image shown above.

Interactive test: hyphens:auto makes the browser hyphenate Uighur text, using a low stroke at the line end and joining forms at line end and line start.
https://w3c.github.io/i18n-tests/exploratory/hyphenation/int-hyphens.html?lang=ug&fontSize=33&fontFamily=Scheherazade%20New&width=400&height=492&hyphens=manual&textAlign=start&writingMode=horizontal-tb&height=492&text=%D8%A6%D9%89%D9%89%D8%B3%D8%B1%D8%A7%D8%A6%D9%89%D9%89%D9%84%D9%89%D9%8A%DB%95%20%D9%8A%DB%90%D9%85%DB%95%D9%83%D9%84%D9%89%DA%AD%20%D9%82%D8%A7%D8%AA%D8%A7%D8%B1%C2%AD%D9%84%D9%89%D9%82%20%D9%85%D8%A7%D8%AF%D8%AF%D9%89%D9%8A%20%D8%A6%D9%89%DB%95%D8%B4%D9%8A%D8%A7%D9%84%D8%A7%D8%B1%D9%86%D9%89%20%DA%86%DB%90%D8%B1%20%D9%83%D8%A7%DB%8B%D8%BA%D8%A7%20%D8%A6%D9%89%DB%90%D9%84%D9%89%D9%BE%20%D9%83%D9%89%D8%B1%D9%89%D8%B4%D9%83%DB%95%20%D8%B1%DB%87%D8%AE%D8%B3%DB%95%D8%AA%20%D9%82%D9%89%D9%84%C2%AD%D9%85%D9%89%D8%BA%D8%A7%D9%86%D9%84%D9%89%D9%82%D9%89%20%DA%BE%DB%95%D9%85%20%D9%85%DB%87%DA%BE%D8%A7%D8%B3%D9%89%D8%B1%DB%95%20%C2%AB%D8%A6%D9%89%D9%89%C2%AD%DA%86%D9%89%D8%AF%D9%89%D9%83%D9%89%20%D9%BE%DB%95%D9%84%DB%95%D8%B3%D8%AA%D9%89%D9%86%D9%84%D9%89%D9%83%20%D8%A8%D9%89%D8%B1%20%D9%82%D9%88%C2%AD%D8%B1%D8%A7%D9%84%D9%84%D9%89%D9%82%20%D8%AE%D8%A7%D8%AF%D9%89%D9%85%D9%86%D9%89%20%D8%A6%D9%89%DB%90%D8%AA%D9%89%D9%BE%20%D8%A6%D9%89%DB%86%D9%84%C2%AD%D8%AA%DB%88%D8%B1%DA%AF%DB%95%D9%86%D9%84%D9%89%D9%83%D9%89%20%D8%A6%D9%89%DB%88%DA%86%DB%88%D9%86%D8%8C%20%D9%BE%DB%95%D9%84%DB%95%DB%8B%D8%AA%D9%89%D9%86&a=After%20setting%20hyphens%3Amanual%2C%20the%20browser%20hyphenates%20Uighur%20text%20where%20soft%20hyphens%20occur.%20Hyphenation%20is%20shown%20by%20a%20low%20stroke%20at%20the%20line%20end%2C%20slightly%20separated%20from%20the%20foregoing%20text%2C%20and%20joining%20forms%20at%20line%20end%20and%20start.&i=Ajust%20the%20width%20of%20the%20bounding%20boxes%20to%20force%20word%20breaks%20where%20soft%20hyphens%20were%20added.%20Test%20passes%20if%20any%20word%20that%20breaks%20has%20a%20short%20stroke%20on%20the%20baseline%20at%20the%20end%20of%20the%20line%2C%20separated%20from%20the%20foregoing%20text%20with%20a%20small%20space%2C%20and%20the%20character
s%20at%20the%20end%20and%20start%20of%20the%20lines%20are%20shaped%20for%20joining.

Results:

Gecko: ❌ No hyphenation occurs *Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:98.0) Gecko/20100101 Firefox/98.0*
Blink: ❌ No hyphenation occurs *Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.74 Safari/537.36*
WebKit: ❌ No hyphenation occurs *Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15*

Interactive test: after setting hyphens:manual, the browser hyphenates Uighur text where soft hyphens occur. Hyphenation is shown by a low stroke at the line end, slightly separated from the foregoing text, with joining forms at line end and line start.
https://w3c.github.io/i18n-tests/exploratory/hyphenation/int-hyphens.html?lang=ug&fontSize=33&fontFamily=Scheherazade%20New&width=400&height=492&hyphens=manual&textAlign=start&writingMode=horizontal-tb&height=492&text=%D8%A6%D9%89%D9%89%D8%B3%D8%B1%D8%A7%D8%A6%D9%89%D9%89%D9%84%D9%89%D9%8A%DB%95%20%D9%8A%DB%90%D9%85%DB%95%D9%83%D9%84%D9%89%DA%AD%20%D9%82%D8%A7%D8%AA%D8%A7%D8%B1%C2%AD%D9%84%D9%89%D9%82%20%D9%85%D8%A7%D8%AF%D8%AF%D9%89%D9%8A%20%D8%A6%D9%89%DB%95%D8%B4%D9%8A%D8%A7%D9%84%D8%A7%D8%B1%D9%86%D9%89%20%DA%86%DB%90%D8%B1%20%D9%83%D8%A7%DB%8B%D8%BA%D8%A7%20%D8%A6%D9%89%DB%90%D9%84%D9%89%D9%BE%20%D9%83%D9%89%D8%B1%D9%89%D8%B4%D9%83%DB%95%20%D8%B1%DB%87%D8%AE%D8%B3%DB%95%D8%AA%20%D9%82%D9%89%D9%84%C2%AD%D9%85%D9%89%D8%BA%D8%A7%D9%86%D9%84%D9%89%D9%82%D9%89%20%DA%BE%DB%95%D9%85%20%D9%85%DB%87%DA%BE%D8%A7%D8%B3%D9%89%D8%B1%DB%95%20%C2%AB%D8%A6%D9%89%D9%89%C2%AD%DA%86%D9%89%D8%AF%D9%89%D9%83%D9%89%20%D9%BE%DB%95%D9%84%DB%95%D8%B3%D8%AA%D9%89%D9%86%D9%84%D9%89%D9%83%20%D8%A8%D9%89%D8%B1%20%D9%82%D9%88%C2%AD%D8%B1%D8%A7%D9%84%D9%84%D9%89%D9%82%20%D8%AE%D8%A7%D8%AF%D9%89%D9%85%D9%86%D9%89%20%D8%A6%D9%89%DB%90%D8%AA%D9%89%D9%BE%20%D8%A6%D9%89%DB%86%D9%84%C2%AD%D8%AA%DB%88%D8%B1%DA%AF%DB%95%D9%86%D9%84%D9%89%D9%83%D9%89%20%D8%A6%D9%89%DB%88%DA%86%DB%88%D9%86%D8%8C%20%D9%BE%DB%95%D9%84%DB%95%DB%8B%D8%AA%D9%89%D9%86&a=After%20setting%20hyphens%3Amanual%2C%20the%20browser%20hyphenates%20Uighur%20text%20where%20soft%20hyphens%20occur.%20Hyphenation%20is%20shown%20by%20a%20low%20stroke%20at%20the%20line%20end%2C%20slightly%20separated%20from%20the%20foregoing%20text%2C%20and%20joining%20forms%20at%20line%20end%20and%20start.&i=Ajust%20the%20width%20of%20the%20bounding%20boxes%20to%20force%20word%20breaks%20where%20soft%20hyphens%20were%20added.%20Test%20passes%20if%20any%20word%20that%20breaks%20has%20a%20short%20stroke%20on%20the%20baseline%20at%20the%20end%20of%20the%20line%2C%20separated%20from%20the%20foregoing%20text%20with%20a%20small%20space%2C%20and%20the%20character
s%20at%20the%20end%20and%20start%20of%20the%20lines%20are%20shaped%20for%20joining.
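The manual test depends on soft hyphens (U+00AD, which appears as %C2%AD in the percent-encoded test text above). As a rough aid for checking where the breaks are allowed, the sketch below decodes one word taken from that URL and splits it at the soft hyphen. The helper name hyphenation_parts is hypothetical; urllib.parse.unquote is from the Python standard library.

```python
from typing import List
from urllib.parse import unquote

SOFT_HYPHEN = "\u00ad"  # encoded as %C2%AD in the test URL


def hyphenation_parts(encoded_word: str) -> List[str]:
    """Decode a percent-encoded word and split it at its soft hyphens."""
    return unquote(encoded_word).split(SOFT_HYPHEN)


# Third word of the test text, copied from the URL above.
word = "%D9%82%D8%A7%D8%AA%D8%A7%D8%B1%C2%AD%D9%84%D9%89%D9%82"
parts = hyphenation_parts(word)
print([len(p) for p in parts])  # [5, 3]
```

The word has one permitted break, after its fifth letter, which is where a conforming browser would place the marker if the line ends there.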

Results:

Gecko: ✅❌ The lines break and the line-end and line-start letters have joining forms, but the marker used is an ordinary hyphen and not on the baseline. *Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:98.0) Gecko/20100101 Firefox/98.0*
Blink: ✅❌ The lines break and the line-end and line-start letters have joining forms, but the marker used is an ordinary hyphen and not on the baseline. *Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.74 Safari/537.36*
WebKit: ✅❌ The lines break but the line-end and line-start letters don't have joining forms, and the marker used is an ordinary hyphen and not on the baseline. *Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15*


Priority:
Uighur hyphenation is common in printed material, so it should also work on the Web. At the very least, manual hyphenation should use the appropriate characters and placement.
Comment 1 r12a 2022-06-17 09:16:16 PDT
This bug report is being tracked at the W3C at https://www.w3.org/TR/arab-ug-gap/#issue250_hyphenation
Comment 2 Radar WebKit Bug Importer 2022-06-24 08:53:14 PDT
<rdar://problem/95858116>
Comment 3 Myles C. Maxfield 2022-06-24 20:52:02 PDT
This is a good improvement. We should try to do this sooner rather than later.