Bug 234342

Summary: Letter-spacing & first-letter selection must keep yo-phola with preceding independent vowels
Product: WebKit
Reporter: r12a <ishida>
Component: CSS
Assignee: Nobody <webkit-unassigned>
Status: NEW
Severity: Normal
CC: ishida, mmaxfield, webkit-bug-importer, zalan
Priority: P2
Keywords: InRadar
Version: Safari 15
Hardware: Unspecified
OS: Unspecified

Description r12a 2021-12-15 04:55:40 PST
There are two cases in Bengali where hasant (virama) is preceded by an independent vowel, rather than a consonant. These are:

    অ্যা [U+0985 BENGALI LETTER A + U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA + U+09BE BENGALI VOWEL SIGN AA], and
    এ্যা [U+098F BENGALI LETTER E + U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA + U+09BE BENGALI VOWEL SIGN AA]
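
For anyone reproducing this, here is a minimal Python sketch that builds the two sequences from the code points listed above and prints their Unicode names. The dictionary keys and the helper name describe() are illustrative only and are not part of any test suite.

```python
import unicodedata

# The two sequences from this report, written as explicit code points.
# The dictionary keys are illustrative labels only.
SEQUENCES = {
    "A + virama + YA + AA": "\u0985\u09CD\u09AF\u09BE",
    "E + virama + YA + AA": "\u098F\u09CD\u09AF\u09BE",
}

def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

for label, seq in SEQUENCES.items():
    print(label, "->", seq)
    for line in describe(seq):
        print("   ", line)
```

Each sequence is four code points long, and the second code point in both is the virama, which directly follows an independent vowel rather than a consonant.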

Illustrations:
https://user-images.githubusercontent.com/4839211/146041054-f96714a2-b926-4ea4-8e17-ba4e003a671e.png

https://user-images.githubusercontent.com/4839211/146041082-fbdd1cee-1bdc-4c45-b2d8-d427b8027c77.png

https://user-images.githubusercontent.com/4839211/146041100-ab30d934-2a22-4e2e-9c6e-68cb3965f3e9.png

(In both cases this produces the sound æ, used for non-native words, such as 'application', 'administration' etc.)

Like an ordinary conjunct, this combination should not be split, even though it doesn't fit the typical CvC structure of a conjunct (where 'v' is the virama).
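
As a rough illustration of the requested behaviour, the Python sketch below groups code points into units and keeps a letter attached to the preceding unit when that unit ends in the virama, whether the unit started with a consonant or an independent vowel. It is a deliberate simplification and not how any browser engine segments text: the function name clusters() and the join_after_virama flag are hypothetical, and cases such as ZWJ/ZWNJ or non-Bengali text are not handled.

```python
import unicodedata

VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA (hasant)

def clusters(text, join_after_virama=True):
    """Group code points into units (simplified sketch).

    Combining marks (general category M*) always attach to the
    previous unit. If join_after_virama is True, a letter (Lo) also
    attaches when the previous unit ends in the virama, regardless of
    whether that unit began with a consonant or an independent vowel.
    """
    out = []
    for ch in text:
        cat = unicodedata.category(ch)
        attach = False
        if out:
            if cat.startswith("M"):
                attach = True
            elif join_after_virama and cat == "Lo" and out[-1].endswith(VIRAMA):
                attach = True
        if attach:
            out[-1] += ch
        else:
            out.append(ch)
    return out

AE = "\u0985\u09CD\u09AF\u09BE"  # A + virama + YA + AA

print(clusters(AE))                           # one unit: the whole sequence
print(clusters(AE, join_after_virama=False))  # two units: split after the virama
```

With the flag off, only marks attach, so the sequence splits after the virama into two units. With it on, the whole sequence stays together, and the first element of the result is the unit that ::first-letter would need to select.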


Tests & results:
Both tests below were run with these pre-installed fonts:

Windows: Shonar Bangla, Arial Unicode MS, Nirmala UI, Vrinda

Mac: Bangla MN, Bangla Sangam MN, Kohinoor Bangla, Tiro Bangla, Baloo Da

Also tested with Noto Sans Bengali and Noto Serif Bengali on the Mac.

Interactive test, Bengali অ্যা and এ্যা (æ) are selected as a single grapheme by ::first-letter. https://github.com/w3c/line_paragraph_tests/issues/80

    Gecko: ✅❌ Mac: fails for Bangla MN and Bangla Sangam MN, but passes for the other 3 fonts. Windows: works fine for all fonts.
    Blink: ❌ Mac & Windows: fails for all fonts.
    WebKit: ❌ Mac: fails for all fonts. It was not possible to apply the Noto fonts.

Note that Blink and WebKit actually handle the more usual CvC conjunct arrangement (see this test).

Interactive test, Bengali অ্যা and এ্যা (æ) are treated as a single grapheme for letter-spacing.  https://github.com/w3c/line_paragraph_tests/issues/81

    Gecko: ✅❌ Windows: works with all fonts. Mac: fails with Bangla MN, Bangla Sangam MN, and Baloo Da, but works with the others. Works with Noto fonts.
    Blink: ✅❌ Windows: works with all fonts. Mac: same results as for Gecko.
    WebKit: ❌ Mac: failed for all fonts. In fact, letters were all spaced individually, rather than by grapheme cluster. Could not apply Noto fonts.
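
To make the difference between spacing by code point and spacing by grapheme cluster concrete, here is a small Python sketch. It assumes a simplified model in which one letter-spacing increment is added per unit. The sample string, the hard-coded unit lists, and the added_spacing() helper are all illustrative and are not taken from the tests above.

```python
def added_spacing(units, letter_spacing_px):
    """Total extra advance, assuming one increment is added per unit.

    This is a simplified model for illustration, not the layout
    algorithm of any engine.
    """
    return len(units) * letter_spacing_px

# Illustrative sample: the ae sequence (A + virama + YA + AA) followed
# by BENGALI LETTER PA.
SAMPLE = "\u0985\u09CD\u09AF\u09BE\u09AA"

per_code_point = list(SAMPLE)            # every code point spaced on its own
per_cluster = [SAMPLE[:4], SAMPLE[4:]]   # the sequence kept as one unit

print(" | ".join(per_code_point), "->", added_spacing(per_code_point, 5), "px")
print(" | ".join(per_cluster), "->", added_spacing(per_cluster, 5), "px")
```

In this model the per-code-point split yields five units and the per-cluster split yields two, which is the difference between the spacing reported for WebKit and the expected result.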

None of Gecko, Blink, or WebKit consistently treats the sequence as a single grapheme across all fonts, despite the fact that Blink and WebKit actually handle the more usual CvC conjunct arrangement (see this test).
Comment 1 r12a 2021-12-15 04:56:10 PST
This bug deals with both first-letter and letter-spacing issues together, since I thought the solution might be related. It could be split into 2 bugs.
Comment 2 r12a 2021-12-15 08:00:32 PST
Fwiw, this information is taken from https://www.w3.org/TR/2021/DNOTE-beng-gap-20211215/#issue67_spacing
Comment 3 Radar WebKit Bug Importer 2021-12-22 04:56:18 PST
<rdar://problem/86807248>