Bug 191976

Summary: [FreeType] Color emoji not properly supported
Product: WebKit Reporter: Pacho Ramos <pachoramos1>
Component: WebKitGTKAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: akira, aperez, benjamin, bugs-noreply, cdumez, cgarcia, cmarcelo, dbates, ews, Hironori.Fujii, mcatanzaro, mmaxfield, webkit-bug-importer
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugzilla.mozilla.org/show_bug.cgi?id=1509988
https://bugs.chromium.org/p/chromium/issues/detail?id=908541
https://bugs.webkit.org/show_bug.cgi?id=177040
Bug Depends on: 192852    
Bug Blocks:    
Attachments:
Description Flags
example code
none
WIP
none
Patch
mcatanzaro: review+
Patch for landing none

Description Pacho Ramos 2018-11-26 12:21:13 PST
This comes from https://gitlab.freedesktop.org/fontconfig/fontconfig/issues/136

In summary, with Google Chrome, Firefox and Epiphany (using webkit-gtk), sometimes some emojis are shown with Noto Color Emoji while others are shown with the old B&W style.

For example you can see many examples simply looking to the comments of many youtube videos as people seems to like to use a lot of emojis on them and you can see some use Noto Color Emoji while others not: https://www.youtube.com/watch?v=gAirINwjaxE https://www.youtube.com/watch?v=bDb5QMRTCPI https://www.youtube.com/watch?v=bbEoRnaOIbs (and many other random examples)

Looking to "Inspect element" Firefox tool, in "Fonts" section, I see that "Dejavu Sans" is being used to show emojis instead of Noto Color Emoji. I have seen that current default /etc/fonts/conf.avail/60-generic.conf only sets Noto Color Emoji for emoji while, probably, it should also be listed for other "families" to be used over ugly emoji fonts from other font sets.

I suggested that to fontconfig people, but they disagree and think that it's an issue in the application instead:
https://gitlab.freedesktop.org/fontconfig/fontconfig/issues/136#note_82092

Thanks
Comment 1 Michael Catanzaro 2018-11-26 12:52:50 PST
My understanding of the problem is that there is no magic by which to prefer an emoji font over other fonts for drawing emoji. So if you want to be sure that Noto Color Emoji to be preferred for drawing emoji, you have to uninstall other fonts that provide those emoji. After all, Deja Vu supports those emoji just fine, albeit without color....

It doesn't make much sense, of course. That can't be the desired behavior, surely. But that's just how fonts work on Linux. So I would bounce this back to the fontconfig maintainers to find some solution. Unfortunately, font-related bug reports really have to be filed by a font expert, because nobody here understands how fontconfing is supposed to work either, nor do we understand our font selection code. We're totally dependent on the font people to tell us what to do here. So if specific code changes are desired, we can try to implement them; otherwise, it's WONTFIX, sorry.

P.S. If we do fix this somehow, I'll await the bug reports that we no longer match the behavior of Chrome and Firefox. :P
Comment 2 Akira TAGOH 2018-11-27 01:03:53 PST
I'm not totally sure how webkit picks up fonts. as I told on the fontconfig bug reported above, looking up "emoji" would be preferrable way as Pango does and is quite simpler. fontconfig will returns Noto Color Emoji as long as they are installed.
But if the html rendering engine like webkit has own font management thing as they have own character classification, they need to be improved. that's why I said this is an application issue. they need to identify emoji and use the special font family for them instead of sans/serif/mono.

Doing the sort of this thing in the code:
$ fc-list :charset=0x1f600
/usr/share/fonts/dejavu/DejaVuSansCondensed-Oblique.ttf: DejaVu Sans,DejaVu Sans Condensed:style=Condensed Oblique,Oblique
/usr/share/fonts/dejavu/DejaVuSansCondensed-Bold.ttf: DejaVu Sans,DejaVu Sans Condensed:style=Condensed Bold,Bold
/usr/share/fonts/dejavu/DejaVuSans.ttf: DejaVu Sans:style=Book
/usr/share/fonts/dejavu/DejaVuSans-Bold.ttf: DejaVu Sans:style=Bold
/usr/share/fonts/gdouros-symbola/Symbola.ttf: Symbola:style=Regular
/usr/share/fonts/dejavu/DejaVuSansCondensed.ttf: DejaVu Sans,DejaVu Sans Condensed:style=Condensed,Book
/usr/share/fonts/dejavu/DejaVuSansCondensed-BoldOblique.ttf: DejaVu Sans,DejaVu Sans Condensed:style=Condensed Bold Oblique,Bold Oblique
/usr/share/fonts/google-noto-emoji/NotoColorEmoji.ttf: Noto Color Emoji:style=Regular
/usr/share/fonts/dejavu/DejaVuSans-Oblique.ttf: DejaVu Sans:style=Oblique
/usr/share/fonts/dejavu/DejaVuSans-BoldOblique.ttf: DejaVu Sans:style=Bold Oblique

behaves wrong and no surprise even if emoji is renderred as B&W.
Hmm, so I was about to propose a workaround if you do something like the above. but I found a bug in fontconfig that something like this doesn't work:
$ fc-list :charset=0x1f600:color=true

and fixed this in https://gitlab.freedesktop.org/tagoh/fontconfig/commit/67dcfeb4f6e544ec2f380717e04b25e724669856

I don't know if this is really the case for this but maybe worth trying.
Anyway, after this fix, the result of the above command is:
$ fc-list :charset=0x1f600:color=true
/usr/share/fonts/google-noto-emoji/NotoColorEmoji.ttf: Noto Color Emoji:style=Regular

This looks not perfect to me since there are no way to change the behavior if someone prefers B&W rather than colored. or you could make an option to use B&W emoji perhaps.

Anyway, hope that helps.
Comment 3 Michael Catanzaro 2018-11-27 01:56:47 PST
Thanks for joining, Akira!

If I understand you properly, your suggestion is to override the font specified by the website whenever displaying emoji characters, e.g. ignore the fonts requested by the website's CSS:

font-family: Arial, sans;

and instead ask fontconfig specifically for the emoji font (somehow), but only for the specific characters that are emoji. All emoji characters would need to be somehow split out by WebKit into separate font runs (not sure if I'm using the correct terminology there). We could certainly try to do it....

The results would be unexpected if the website uses a custom emoji web font, though, since that font would then be ignored. That's not great. Maybe matching Chrome (so changing nothing) would be best.

I wonder how Apple handles this.
Comment 4 Akira TAGOH 2018-11-27 02:08:23 PST
(In reply to Michael Catanzaro from comment #3)
> Thanks for joining, Akira!
> 
> If I understand you properly, your suggestion is to override the font
> specified by the website whenever displaying emoji characters, e.g. ignore
> the fonts requested by the website's CSS:
> 
> font-family: Arial, sans;
> 
> and instead ask fontconfig specifically for the emoji font (somehow), but
> only for the specific characters that are emoji. All emoji characters would
> need to be somehow split out by WebKit into separate font runs (not sure if
> I'm using the correct terminology there). We could certainly try to do it....
> 
> The results would be unexpected if the website uses a custom emoji web font,
> though, since that font would then be ignored. That's not great. Maybe
> matching Chrome (so changing nothing) would be best.
> 
> I wonder how Apple handles this.

W3C is about to define "emoji" generic family in CSS Fonts 4 (https://www.w3.org/TR/css-fonts-4/). I guess the web browsers may be sonner or later going to support them. I guess after that, the logic of fonts selection in the html rendering engine might becomes to be more complicated. or it could be a css bug in a content perhaps as if the order of fonts is important.
Comment 5 Myles C. Maxfield 2018-11-27 18:19:16 PST
font-family: Arial, sans; should still work with combining emoji.

When performing font fallback, the first step is to chop the text up into clusters (advanceByCombiningCharacterSequence()). Each cluster performs font fallback independently. The first font is chosen which supports every code point in the cluster (FontCascade::fontForCombiningCharacterSequence() and Font::canRenderCombiningCharacterSequence()).

Neither Arial nor sans (Helvetica) support emoji characters, so the cluster should end up causing FontCache::systemFallbackForCharacters() to be called. That's your opportunity to select the emoji font.

When investigating this, I would see which of those functions listed above is returning the wrong thing.
Comment 6 Akira TAGOH 2018-11-27 20:58:35 PST
There are nothing wrong. but it just needs to take care of emoji forthermore.

As the original reporter says and also I demonstrated at comment#2, DejaVu has emoji now. when DejaVu is picked up for a cluster by performing a fallback of "sans", it will be used as is unless they didn't classifiy emoji separately and look up a font for emoji instead of "sans" and so on.

I think the colored emoji should be used by default though, someone might prefers B&W. so browsers may needs to know which one they like:

int
is_color_enabled(void)
{
    FcPattern *pat = FcPatternCreate();
    FcBool b;

    FcPatternAddString (pat, FC_FAMILY, (const FcChar8 *)"emoji");
    FcConfigSubstitute (NULL, pat, FcMatchPattern);
    FcDefaultSubstitute (pat);
    if (FcPatternGetBool (pat, FC_COLOR, 0, &b) == FcResultMatch)
        return 0;
    return b;
}

if none of colored fonts is available on the list of fonts in CSS, then you can perform another fallback with "emoji". does it make sense?
Comment 7 Michael Catanzaro 2018-11-28 12:33:07 PST
(In reply to Myles C. Maxfield from comment #5)
> Neither Arial nor sans (Helvetica) support emoji characters, so the cluster
> should end up causing FontCache::systemFallbackForCharacters() to be called.
> That's your opportunity to select the emoji font.
> 
> When investigating this, I would see which of those functions listed above
> is returning the wrong thing.

I think Akira is right, sans (usually DejaVu on Linux) does support emoji (that's the problem) and I don't think any function is necessarily returning the wrong thing. It's just not designed to work.

(In reply to Akira TAGOH from comment #6)
> if none of colored fonts is available on the list of fonts in CSS, then you
> can perform another fallback with "emoji". does it make sense?

No, I still don't understand. Your example shows how to detect if the emoji font supports color, but I'm not sure what you suggest doing with that.
Comment 8 Akira TAGOH 2018-11-28 21:34:34 PST
(In reply to Michael Catanzaro from comment #7)
> No, I still don't understand. Your example shows how to detect if the emoji
> font supports color, but I'm not sure what you suggest doing with that.

Assuming that you have the sort of fonts list with FcPattern (or FcFontSet) from fontconfig by any way and font-family in CSS is set to Arial, sans. they all should have FC_COLOR property in FcPattern. if the target cluster has emoji, you would filter fonts out according to the result of the above function say. then if no fonts available there, which may means no colored emoji fonts is available in sans/serif/mono, you could get a fallback by querying with "emoji" to fontconfig. if people prefers B&W, then you can do choose a font as before.
Comment 9 Akira TAGOH 2018-11-28 21:37:36 PST
Created attachment 355977 [details]
example code

The attached code demonstrates how application can fallback against charset. although this doesn't work as expected because FcFontSort returns Noto * Emoji fonts in the result but you can replace it with what you have. the point is that how you can detect charset what you want.
Comment 10 Carlos Garcia Campos 2018-12-18 06:57:21 PST
Created attachment 357567 [details]
WIP

This seems to work. Myles, what do you think about this?
Comment 11 Build Bot 2018-12-18 06:58:54 PST Comment hidden (obsolete)
Comment 12 Carlos Garcia Campos 2018-12-19 07:30:15 PST
Created attachment 357679 [details]
Patch
Comment 13 Adrian Perez 2018-12-19 11:29:08 PST
Comment on attachment 357679 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357679&action=review

Informal review: A couple of nits, patch looks good overall but I am not an expert
in fonts so it would be good to have an r+ by someone who knows better, like Myles :)

> Source/WebCore/ChangeLog:18
> +        (WebCore::characterSequenceIsEmoji): Checke whether the character sequence is an emoji.

Typo: Checke → Check

> Source/WebCore/platform/graphics/cairo/FontCairoHarfbuzzNG.cpp:67
> +        if (nextCharacter == 0xFE0F) {

Maybe it would be good to give a name to this constant instead of leaving
magic numbers around in the middle of the function.
Comment 14 Carlos Garcia Campos 2018-12-28 02:15:47 PST
Ping reviewers.
Comment 15 Michael Catanzaro 2018-12-28 09:28:31 PST
Comment on attachment 357679 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357679&action=review

Myles might want to review the cross-platform changes after the holidays, but they look relatively straightforward so I don't think we need to delay.

> Source/WTF/wtf/unicode/CharacterNames.h:103
> +const UChar combiningEnclosingKeycap = 0x20E3;

This should be alphabetized, like the rest of the list.

> Source/WTF/wtf/unicode/CharacterNames.h:170
> +using WTF::Unicode::combiningEnclosingKeycap;

Ditto.

>> Source/WebCore/platform/graphics/cairo/FontCairoHarfbuzzNG.cpp:67
>> +        if (nextCharacter == 0xFE0F) {
> 
> Maybe it would be good to give a name to this constant instead of leaving
> magic numbers around in the middle of the function.

I think it's fine; it's not used anywhere else, and there's a comment to explain it.
Comment 16 Myles C. Maxfield 2019-01-02 14:50:01 PST
Comment on attachment 357679 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=357679&action=review

I think this is the wrong direction.

> Source/WebCore/platform/graphics/Font.cpp:569
> +        return FontCache::singleton().systemFallbackForCharacters(description, this, isForPlatformFont, false, &codeUnit, 1);

We usually use 2-value enums so it's clear at the calling site what the meaning of "false" is.

> Source/WebCore/platform/graphics/cairo/FontCairoHarfbuzzNG.cpp:49
> +static bool characterSequenceIsEmoji(const Vector<UChar, 4>& normalizedCharacters, int32_t normalizedLength)

This logic shouldn't be necessary; WebKit shouldn't be in the business of iterating through strings codepoint-by-codepoint.

> Source/WebCore/platform/graphics/cairo/FontCairoHarfbuzzNG.cpp:136
> +    if (auto systemFallback = FontCache::singleton().systemFallbackForCharacters(m_fontDescription, baseFont, false, isEmoji, characters, length)) {

There are color fonts that don't support emoji, and it's conceptually possible to have an emoji font that isn't color. It isn't right to pass "isEmoji" to "bool coloredFont"
Comment 17 Carlos Garcia Campos 2019-01-02 23:57:36 PST
(In reply to Myles C. Maxfield from comment #16)
> Comment on attachment 357679 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=357679&action=review
> 
> I think this is the wrong direction.
> 
> > Source/WebCore/platform/graphics/Font.cpp:569
> > +        return FontCache::singleton().systemFallbackForCharacters(description, this, isForPlatformFont, false, &codeUnit, 1);
> 
> We usually use 2-value enums so it's clear at the calling site what the
> meaning of "false" is.

I thought about it, but I saw the other bool parameter in the function and decided to keep a bool. I can change both, though. 

> > Source/WebCore/platform/graphics/cairo/FontCairoHarfbuzzNG.cpp:49
> > +static bool characterSequenceIsEmoji(const Vector<UChar, 4>& normalizedCharacters, int32_t normalizedLength)
> 
> This logic shouldn't be necessary; WebKit shouldn't be in the business of
> iterating through strings codepoint-by-codepoint.

But I need to know if the sequence is an emoji, the point is to prefer a colored font when falling back to system fonts only when rendering emojis.

> > Source/WebCore/platform/graphics/cairo/FontCairoHarfbuzzNG.cpp:136
> > +    if (auto systemFallback = FontCache::singleton().systemFallbackForCharacters(m_fontDescription, baseFont, false, isEmoji, characters, length)) {
> 
> There are color fonts that don't support emoji, and it's conceptually
> possible to have an emoji font that isn't color. It isn't right to pass
> "isEmoji" to "bool coloredFont"

I think so, what we want here is to prefer a colored font when rendering emojis. We are also passing the characters, so we will fallback to a font that supports those codepoints and it's colored (if it exists, for course). I know what we want is an emji font, but I wanted to avoid heuristics using the font name, because there might be emojis fonts without the emoji word in the name. What we really want here is a colored version of the emoji, so a colored font supporting the sequence should be enough and it will probably be an emoji font anyway.
Comment 18 Michael Catanzaro 2019-01-03 05:59:50 PST
(In reply to Carlos Garcia Campos from comment #17)
> I thought about it, but I saw the other bool parameter in the function and
> decided to keep a bool. I can change both, though. 

FWIW I noticed that during my review and would have complained if not for the other bool parameter already right next to it.

> > > Source/WebCore/platform/graphics/cairo/FontCairoHarfbuzzNG.cpp:136
> > > +    if (auto systemFallback = FontCache::singleton().systemFallbackForCharacters(m_fontDescription, baseFont, false, isEmoji, characters, length)) {
> > 
> > There are color fonts that don't support emoji, and it's conceptually
> > possible to have an emoji font that isn't color. It isn't right to pass
> > "isEmoji" to "bool coloredFont"
> 
> I think so, what we want here is to prefer a colored font when rendering
> emojis.

It's too confusing: someone looking at this is likely to think it's a bug upon discovering the parameter is named coloredFont and not isEmoji. Converting the code to use enums or a flags parameter would help make this more explicit, since then you'd need to write out something like isEmoji ? Colored : NotColored. So that tips the balance in favor of doing that conversion in this patch IMO.
Comment 19 Carlos Garcia Campos 2019-01-08 03:59:57 PST
Created attachment 358583 [details]
Patch for landing
Comment 20 Carlos Garcia Campos 2019-01-10 02:04:00 PST
Committed r239822: <https://trac.webkit.org/changeset/239822>
Comment 21 Michael Catanzaro 2019-01-27 13:46:33 PST
Good job, it seems we're now the only Linux browser where color emoji actually work properly!
Comment 22 Carlos Garcia Campos 2019-01-28 00:19:36 PST
(In reply to Michael Catanzaro from comment #21)
> Good job, it seems we're now the only Linux browser where color emoji
> actually work properly!

Chromium was also using colored emojis in my tests here.
Comment 23 Michael Catanzaro 2019-01-28 08:10:55 PST
I've verified that Chrome and Firefox are both drawing black and white emojis on the YouTube examples in the first comment, because they don't have any way to prefer color emoji to the black and white ones from DejaVu, like we now do. That's https://bugzilla.mozilla.org/show_bug.cgi?id=1509988 and https://bugs.chromium.org/p/chromium/issues/detail?id=908541.
Comment 24 Michael Catanzaro 2019-03-27 05:44:15 PDT
Test to see if your browser supports color emoji properly: 🐄
Comment 25 Michael Catanzaro 2019-03-27 05:45:58 PDT
😎