Bug 55453 - Pass preferred language information down to font fallback logic
Summary: Pass preferred language information down to font fallback logic
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC Linux
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-03-01 02:57 PST by Takayoshi Kochi
Modified: 2011-03-07 12:43 PST (History)
7 users (show)

See Also:


Attachments
This is a tentative patch trying to implement passing preferred language to font fallback logic. (5.45 KB, patch)
2011-03-01 03:01 PST, Takayoshi Kochi
no flags Details | Formatted Diff | Diff
Reflected comment from agl. (5.63 KB, patch)
2011-03-01 23:47 PST, Takayoshi Kochi
no flags Details | Formatted Diff | Diff
Reflected comment from Jungshik. (5.61 KB, patch)
2011-03-03 03:35 PST, Takayoshi Kochi
no flags Details | Formatted Diff | Diff
Fix indentation pointed out by Tony. (5.62 KB, patch)
2011-03-03 10:39 PST, Takayoshi Kochi
tony: review-
Details | Formatted Diff | Diff
Added ChangeLogs. (7.50 KB, patch)
2011-03-06 21:31 PST, Takayoshi Kochi
no flags Details | Formatted Diff | Diff
Fixed wrong ChangeLog entries and diff paths (now Source/ prefixed). (7.47 KB, patch)
2011-03-06 22:55 PST, Takayoshi Kochi
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Takayoshi Kochi 2011-03-01 02:57:46 PST
For Chromium on Linux, we don't use any locale information for selecting fallback font (no default or
CSS-supplied font has glyph for those characters).  This often happens especially for CJK pages
with font-family being 'Arial' or any other latin-only fonts.

Sometimes we see fontconfig pick up undesired font (CJK-unified font, which has wrong glyph for
unified character for some language, e.g. 誤's glyphs should be different between C & J)
and it ends up in wrong typeface rendered.

As Chromium/Linux uses fontconfig for selecting a font from character, fontconfig
returns the top result that contains glyph for it, regardless of font-family or language.

At least, we have to pass the language, then with some fontconfig configuration tweaks
we can pick a more appropriate font.

Attaching a patch to implement this, but this requires changes in Chromium code too.
See also http://crosbug.com/11699
Comment 1 Takayoshi Kochi 2011-03-01 03:01:33 PST
Created attachment 84205 [details]
This is a tentative patch trying to implement passing preferred language to font fallback logic.
Comment 2 Alexey Proskuryakov 2011-03-01 09:53:40 PST
(In reply to comment #0)
> For Chromium on Linux, we don't use any locale information for selecting fallback font

Do other versions of Chromium support that?

I thought it was a cross-paltform issue, tracked by bug 10874. The main blocker being that node language is slow to compute.
Comment 3 Adam Langley 2011-03-01 10:24:04 PST
Comment on attachment 84205 [details]
This is a tentative patch trying to implement passing preferred language to font fallback logic.

View in context: https://bugs.webkit.org/attachment.cgi?id=84205&action=review

> WebKit/chromium/public/gtk/WebFontInfo.h:48
> +    //   preferredLanguage: preferred language for the characters

The comment should specify the format of the language. There are (at least) two and three letter forms in use.

> WebKit/chromium/public/linux/WebSandboxSupport.h:-53
> -    virtual WebString getFontFamilyForCharacters(const WebUChar* characters, size_t numCharacters) = 0;

the indentation here looks to be wrong.

> WebCore/platform/graphics/chromium/FontCacheLinux.cpp:61
> +    icu::Locale locale = icu::Locale::getDefault();

If this is for Linux then is the best way to get the locale? In zygote_main_linux.cc we preload the ICU timezone information, but I don't see anything about locale. So, if ICU tries to access the disk for this then it'll fail.
Comment 4 Takayoshi Kochi 2011-03-01 22:45:57 PST
(In reply to comment #2)
> (In reply to comment #0)
> > For Chromium on Linux, we don't use any locale information for selecting fallback font
> 
> Do other versions of Chromium support that?
> 
> I thought it was a cross-paltform issue, tracked by bug 10874. The main blocker being that node language is slow to compute.

This change affects only Linux.
As FontCache has per-platform implementation and this is Linux-only issue.

But yes, more generically I think bug 10874 is the right direction to go.

I admit that this is an intermediate solution as it uses UI language, not
HTML language.
Comment 5 Takayoshi Kochi 2011-03-01 23:22:57 PST
Comment on attachment 84205 [details]
This is a tentative patch trying to implement passing preferred language to font fallback logic.

View in context: https://bugs.webkit.org/attachment.cgi?id=84205&action=review

>> WebKit/chromium/public/gtk/WebFontInfo.h:48
>> +    //   preferredLanguage: preferred language for the characters
> 
> The comment should specify the format of the language. There are (at least) two and three letter forms in use.

Done.

>> WebKit/chromium/public/linux/WebSandboxSupport.h:-53
>> -    virtual WebString getFontFamilyForCharacters(const WebUChar* characters, size_t numCharacters) = 0;
> 
> the indentation here looks to be wrong.

Fixed.

>> WebCore/platform/graphics/chromium/FontCacheLinux.cpp:61
>> +    icu::Locale locale = icu::Locale::getDefault();
> 
> If this is for Linux then is the best way to get the locale? In zygote_main_linux.cc we preload the ICU timezone information, but I don't see anything about locale. So, if ICU tries to access the disk for this then it'll fail.

This doesn't hit the disk.

In the chrome side of the patch, I explicitly set for each renderer the default ICU locale from
commandline option passed from browser process (--lang=xx).  getDefault() returns the value set then.

On Chrome OS, zygote and sandbox helper are spawned off of main browser process in very early stage,
where the browser process does not know about UI language at that point.  After user logs in,
the browser changes its UI language according to the user's pref, and then passes '--lang='
commandline option and this is the only knowledge that renderers know.
Comment 6 Takayoshi Kochi 2011-03-01 23:47:50 PST
Created attachment 84376 [details]
Reflected comment from agl.
Comment 7 Jungshik Shin 2011-03-02 17:32:24 PST
Comment on attachment 84376 [details]
Reflected comment from agl.

View in context: https://bugs.webkit.org/attachment.cgi?id=84376&action=review

> WebKit/chromium/public/gtk/WebFontInfo.h:49
> +    //                      or 2-letter with 2-letter region (e.g. "zh-cn")

Language codes can be longer than 2-letter (e.g. fil) .  The same is true of region code (numeric region code is 3-digit like 419). Also, actually, script code is better for the 2nd part (like zh-Hant vs zh-Hans). Anyway, just say 'locale identifier' (along with a couple of examples like zh-CN, ja) would suffice without going to details for this comment
Comment 8 Jungshik Shin 2011-03-02 17:36:34 PST
Thank you, Kochi, for the patch. 

In reply to comment #2 : 
What this patch does is roughly equivalent to what Chrome Windows does (to deal with CJK glyph preferences). As Kochi wrote, this is a kinda interim solution (that works in most cases but not always) and bug 10874 and related bugs have to be fixed eventually. I earnestly plan to those bugs in March. 

In case of Chrome Windows, everything happens in the renderer (Webkit), but in case of Chrome Linux, fontconfig call goes out to the browser process.
Comment 9 Adam Langley 2011-03-02 17:36:45 PST
Comment on attachment 84376 [details]
Reflected comment from agl.

LGTM (I am not a WebKit reviewer. I need an r+ from a real reviewer before landing.)
Comment 10 Takayoshi Kochi 2011-03-03 03:35:09 PST
Created attachment 84543 [details]
Reflected comment from Jungshik.
Comment 11 Takayoshi Kochi 2011-03-03 03:35:32 PST
Comment on attachment 84376 [details]
Reflected comment from agl.

View in context: https://bugs.webkit.org/attachment.cgi?id=84376&action=review

>> WebKit/chromium/public/gtk/WebFontInfo.h:49
>> +    //                      or 2-letter with 2-letter region (e.g. "zh-cn")
> 
> Language codes can be longer than 2-letter (e.g. fil) .  The same is true of region code (numeric region code is 3-digit like 419). Also, actually, script code is better for the 2nd part (like zh-Hant vs zh-Hans). Anyway, just say 'locale identifier' (along with a couple of examples like zh-CN, ja) would suffice without going to details for this comment

Done.
Comment 12 Takayoshi Kochi 2011-03-03 03:48:45 PST
Updated the patch for fixes to Jungshik's comment.

Also I included a couple of minor changes in WebFontInfo from the previous patch:
1. made preferredLocale optional (= 0 default parameter)
2. NULL check for preferredLocale

With 1, we can commit WebKit change side only once.
For Chromium, I'll submit http://codereview.chromium.org/6592065/ first
then I'll make follow-up changelist to really use this interface.

Jungshik, could you take another look?
Comment 13 Tony Chang 2011-03-03 10:22:02 PST
Comment on attachment 84543 [details]
Reflected comment from Jungshik.

View in context: https://bugs.webkit.org/attachment.cgi?id=84543&action=review

> WebKit/chromium/src/gtk/WebFontInfo.cpp:69
> +      FcLangSet* langset = FcLangSetCreate();
> +      FcLangSetAdd(langset, reinterpret_cast<const FcChar8 *>(preferredLocale));
> +      FcPatternAddLangSet(pattern, FC_LANG, langset);
> +      FcLangSetDestroy(langset);

WebKit style uses 4 space indents
Comment 14 Takayoshi Kochi 2011-03-03 10:38:57 PST
Comment on attachment 84543 [details]
Reflected comment from Jungshik.

View in context: https://bugs.webkit.org/attachment.cgi?id=84543&action=review

>> WebKit/chromium/src/gtk/WebFontInfo.cpp:69
>> +      FcLangSetDestroy(langset);
> 
> WebKit style uses 4 space indents

Fixed.

Thanks for the review!
Comment 15 Takayoshi Kochi 2011-03-03 10:39:33 PST
Created attachment 84584 [details]
Fix indentation pointed out by Tony.
Comment 16 Tony Chang 2011-03-03 17:06:33 PST
Comment on attachment 84584 [details]
Fix indentation pointed out by Tony.

View in context: https://bugs.webkit.org/attachment.cgi?id=84584&action=review

r- because this patch is missing a ChangeLog entry.  Also, shouldn't there be a LayoutTest?

> WebKit/chromium/public/gtk/WebFontInfo.h:53
> -    WEBKIT_API static WebCString familyForChars(const WebUChar* characters, size_t numCharacters);
> +    WEBKIT_API static WebCString familyForChars(const WebUChar* characters, size_t numCharacters, const char* preferredLocale = 0);

Please file a bug about making |preferredLocale| not have a default param after the chromium side is landed and rolled in.
Comment 17 Takayoshi Kochi 2011-03-06 21:31:03 PST
Created attachment 84920 [details]
Added ChangeLogs.
Comment 18 Takayoshi Kochi 2011-03-06 21:36:35 PST
Comment on attachment 84584 [details]
Fix indentation pointed out by Tony.

View in context: https://bugs.webkit.org/attachment.cgi?id=84584&action=review

>> WebKit/chromium/public/gtk/WebFontInfo.h:53
>> +    WEBKIT_API static WebCString familyForChars(const WebUChar* characters, size_t numCharacters, const char* preferredLocale = 0);
> 
> Please file a bug about making |preferredLocale| not have a default param after the chromium side is landed and rolled in.

Done.

https://bugs.webkit.org/show_bug.cgi?id=55752
Comment 19 Takayoshi Kochi 2011-03-06 22:55:24 PST
Created attachment 84922 [details]
Fixed wrong ChangeLog entries and diff paths (now Source/ prefixed).
Comment 20 Takayoshi Kochi 2011-03-07 00:57:15 PST
Tony, could you take another look at this?

If this looks good, could you put this into commit queue?

Thanks,
Comment 21 Takayoshi Kochi 2011-03-07 01:06:14 PST
(In reply to comment #16)
> (From update of attachment 84584 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=84584&action=review
> 
> r- because this patch is missing a ChangeLog entry.  Also, shouldn't there be a LayoutTest?

As I also wrote in one of the ChangeLogs, the code depends on locale setting
and cannot be run the test easily in the framework.

So test will be added in the Chromium side.
Comment 22 WebKit Commit Bot 2011-03-07 12:43:50 PST
Comment on attachment 84922 [details]
Fixed wrong ChangeLog entries and diff paths (now Source/ prefixed).

Clearing flags on attachment: 84922

Committed r80489: <http://trac.webkit.org/changeset/80489>
Comment 23 WebKit Commit Bot 2011-03-07 12:43:56 PST
All reviewed patches have been landed.  Closing bug.