RESOLVED FIXED 177040
[FreeType] Support emoji modifiers
https://bugs.webkit.org/show_bug.cgi?id=177040
Summary [FreeType] Support emoji modifiers
Michael Catanzaro
Reported 2017-09-16 13:54:32 PDT
We now have support for color emoji, but gender and skin-tone modifiers [1] are not supported. We should change that. Apple ports already have this support (see e.g. r213499). [1] http://www.unicode.org/reports/tr51/
Attachments
Patch (9.44 KB, patch)
2019-01-11 04:57 PST, Carlos Garcia Campos
mmaxfield: review+
Michael Catanzaro
Comment 1 2018-12-13 09:12:18 PST
Test page: https://unicode.org/emoji/charts/full-emoji-list.html If you scroll down you'll notice that none of the emoji combinations work.
Carlos Garcia Campos
Comment 2 2018-12-14 02:03:49 PST
I see that some of them don't work, no none of them.
Michael Catanzaro
Comment 3 2018-12-14 06:04:06 PST
Is there a combination on that page that does work for you?
Carlos Garcia Campos
Comment 4 2018-12-14 06:34:39 PST
Yes, see all the flags for example, see also the test I attached to original bug: https://bug-156579-attachments.webkit.org/attachment.cgi?id=284325
Carlos Garcia Campos
Comment 5 2018-12-14 06:59:05 PST
Ok, so the problem only happens with emojis having the zero with joiner (U+200D) in the sequence. The sequence is broken because createAndFillGlyphPage() in Font.cpp overwrites zero with joiner with zero width space (U+200B). Myles, why are we doing that overwrite? should that depend on the font? Not doing the overwrite fixes all the combinations for me in https://unicode.org/emoji/charts/full-emoji-list.html. If we can't remove the overwrite we can handle that as a special case in harfbuzz set_nominal_glyph function.
Myles C. Maxfield
Comment 6 2018-12-19 13:35:42 PST
(In reply to Carlos Garcia Campos from comment #5) > Ok, so the problem only happens with emojis having the zero with joiner > (U+200D) in the sequence. The sequence is broken because > createAndFillGlyphPage() in Font.cpp overwrites zero with joiner with zero > width space (U+200B). Myles, why are we doing that overwrite? should that > depend on the font? Not doing the overwrite fixes all the combinations for > me in https://unicode.org/emoji/charts/full-emoji-list.html. If we can't > remove the overwrite we can handle that as a special case in harfbuzz > set_nominal_glyph function. This code is trying to make control characters invisible. I haven't done the research, but I expect the replacement code in createAndFillGlyphPage() is older than the WidthIterator::applyFontTransforms() code. (Both the simple text path and the complex text path use createAndFillGlyphPage().) You've hit the reason I filed https://bugs.webkit.org/show_bug.cgi?id=187166. Because shaping operates on the character stream of the input, if we muck around with the character string, we break shaping. This has bitten us multiple times, see https://bugs.webkit.org/show_bug.cgi?id=185976 The right way to do this is to run shaping on the unperturbed character stream, and after shaping is finished, map each glyph in the result back to the character in the stream it represents, and if it represents a control character, remove that glyph from the stream. Unfortunately, "map each glyph in the result back to the character in the stream it represents" is impossible in the Cocoa fast text codepath. That codepath uses CTFontTransformGlyphs() to do the shaping, but doesn't output a mapping back to the original string indices. <rdar://problem/44466695> is tracking this. So, if I understand the Harfbuzz API correctly, the Harfbuzz ports should be able to fix this immediately, but the existing code needs to continue to exist for the Cocoa ports until we get a solution from the platform that fits. Bonus: There used to be interop for many years between the browsers about making control characters invisible. However, the spec changed[1] and now says "Control characters (Unicode category Cc) other than tab (U+0009), line feed (U+000A), and form feed (U+000C), must be rendered as a visible glyph." However, I'm not sure we can actually make that change, because I don't think we've characterized how many sites would break (where "break" means "control character garbage shows up and makes content unreadable where it used to be invisible and readable"). The decision about rendering control characters visibly should be independently from the method of fixing createAndFillGlyphPage(). [1] https://drafts.csswg.org/css-text-3/#white-space-processing
Carlos Garcia Campos
Comment 7 2018-12-31 06:16:59 PST
Harfbuzz already replaces characters with Default_Ignorable Unicode property with space glyph and zeroes the advance width when shaping. So, I think we can probably avoid all the overrides for the GTK port.
Myles C. Maxfield
Comment 8 2019-01-02 14:41:49 PST
(In reply to Carlos Garcia Campos from comment #7) > Harfbuzz already replaces characters with Default_Ignorable Unicode property > with space glyph and zeroes the advance width when shaping. So, I think we > can probably avoid all the overrides for the GTK port. Awesome!
Carlos Garcia Campos
Comment 9 2019-01-11 02:04:32 PST
(In reply to Myles C. Maxfield from comment #8) > (In reply to Carlos Garcia Campos from comment #7) > > Harfbuzz already replaces characters with Default_Ignorable Unicode property > > with space glyph and zeroes the advance width when shaping. So, I think we > > can probably avoid all the overrides for the GTK port. > > Awesome! Unfortunately this is not so easy in the end. We still need to replace \n and \t to space too, and some characters never reach to harfbuzz anyway like isolations, because they are left out when splitting the text runs, and added back to list of glyphs when rendering, causing the missing glyph to be rendered. So, the safest approach that fixes this bug would be to only override the code points when the font doesn't support the code point. That would fix this bug without introducing any regression in tests.
Carlos Garcia Campos
Comment 10 2019-01-11 04:57:17 PST
Carlos Garcia Campos
Comment 11 2019-01-13 23:59:06 PST
Radar WebKit Bug Importer
Comment 12 2019-01-14 00:02:04 PST
Note You need to log in before you can comment on or make changes to this bug.