Bug 10874 - lang, xml:lang, content-language ignored when choosing fonts
Summary: lang, xml:lang, content-language ignored when choosing fonts
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: 419.x
Hardware: Mac (PowerPC) OS X 10.4
: P2 Normal
Assignee: Alexey Proskuryakov
URL: http://en.wikipedia.org/wiki/Han_unif...
Keywords:
Depends on: 18085 93985 9454 20797 67586 76701 76892 78184 83792 97929
Blocks: 24574 77568
  Show dependency treegraph
 
Reported: 2006-09-15 13:08 PDT by Jakob Stoklund Olesen
Modified: 2012-10-09 01:12 PDT (History)
17 users (show)

See Also:


Attachments
work in progress patch (40.23 KB, patch)
2011-06-10 03:36 PDT, Matt Falkenhagen
no flags Details | Formatted Diff | Diff
improved patch (22.59 KB, patch)
2011-06-24 05:10 PDT, Matt Falkenhagen
no flags Details | Formatted Diff | Diff
use -webkit-locale for font selection (26.57 KB, patch)
2011-07-07 02:14 PDT, Matt Falkenhagen
no flags Details | Formatted Diff | Diff
update patch after ap's review (28.34 KB, patch)
2011-07-21 18:00 PDT, Matt Falkenhagen
no flags Details | Formatted Diff | Diff
revised after second review (28.01 KB, patch)
2011-07-25 15:42 PDT, Matt Falkenhagen
no flags Details | Formatted Diff | Diff
revise after mitz's review (28.77 KB, patch)
2011-08-04 01:25 PDT, Matt Falkenhagen
no flags Details | Formatted Diff | Diff
add tests for language-based font selection (15.72 KB, patch)
2012-02-07 21:37 PST, Matt Falkenhagen
no flags Details | Formatted Diff | Diff
Mac fix (17.85 KB, patch)
2012-04-10 13:16 PDT, Alexey Proskuryakov
buildbot: commit-queue-
Details | Formatted Diff | Diff
Mac fix (18.01 KB, patch)
2012-04-10 14:18 PDT, Alexey Proskuryakov
no flags Details | Formatted Diff | Diff
Mac fix (20.19 KB, patch)
2012-04-11 11:58 PDT, Alexey Proskuryakov
no flags Details | Formatted Diff | Diff
Mac fix (22.36 KB, patch)
2012-04-11 12:49 PDT, Alexey Proskuryakov
mitz: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jakob Stoklund Olesen 2006-09-15 13:08:21 PDT
When rendering Chinese, Japanese and Korean pages, WebKit uses the International preferences for choosing the font. The language attributes are not considered. This means that Chinese pages come out wrong on a Japanese computer.

Example:

<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="zh-CN" lang="zh-CN">
<head>
<meta http-equiv="content-language" content="zh-CN"/>
<title>&#x6E24;&#x9EA6;&#x5802;</title>
</head>
<body>
<h1>&#x9EA6;&#x754C;&#x65B0;&#x95FB; &#x82F9;&#x679C; It's Showtime &#x7279;&#x522B;&#x53D1;&#x5E03;&#x4F1A;</h1>
</body>
</html>

The H1 is rendered in different fonts, depending on the user's language choice. Traditional Chinese, simplified Chinese, and Japanese give three different results, only the simplified Chinese is correct.

WebKit should look at lang/xml:lang attributes, and HTTP's Content-Language header before using user defaults to choose fonts for unihan characters.
Comment 1 Dave Hyatt 2006-09-16 11:34:13 PDT
Do other browsers do this?   What do WinIE and Firefox do?
Comment 2 Jakob Stoklund Olesen 2006-09-16 12:33:15 PDT
(In reply to comment #1)
> Do other browsers do this?   What do WinIE and Firefox do?

Firefox/Mac uses the lang attribute, as you can see by opening the wikipedia URL in FF.
Don't know about WinIE.
Opera 9.01/Mac uses the lang attribute.

Firefox gets the example wrong - it uses a mixture of traditional and simplified Chinese fonts. 
Opera displays the example in a single simplified Chinese font as it should.
Comment 3 Alexey Proskuryakov 2006-09-17 00:49:08 PDT
Although Han unification is by far the most prominent, it is not the only problematic one - the same issues exist with Arabic/Urdu, Greek/Coptic, Russian/Serbian, and modern/old (church) Cyrillic (examples taken from Unicode Demystified by Richard Gillam, chapter 3).
Comment 4 Eric Seidel (no email) 2007-01-31 04:31:05 PST
it seems :lang selection doesn't work at all:
http://www.w3.org/Graphics/SVG/Test/20061213/htmlEmbedHarness/full-styling-css-05-b.html
Comment 5 Andrew Dyson 2007-03-01 11:29:59 PST
(In reply to comment #1)
> Do other browsers do this?   What do WinIE and Firefox do?

See 

test case : 

http://www.w3.org/International/tests/sec-cjk-fonts

test result : note that Safari does not take into account 'lang' when selecting fonts while FF, IE, and Opera do. 

http://www.w3.org/International/tests/results/results-lang-and-cjk-font.php

(In reply to comment #4)
> it seems :lang selection doesn't work at all:
> http://www.w3.org/Graphics/SVG/Test/20061213/htmlEmbedHarness/full-styling-css-05-b.html
> 

I guess this needs to be a separate bug. Nonetheless, here are some links

http://www.w3.org/International/tests/#css-lang
http://www.w3.org/International/tests/results/css-lang
http://www.w3.org/International/questions/qa-css-lang
Comment 6 Robert Burns 2007-05-06 00:01:47 PDT
This is one of those things where WebKit gets it right and most other browsers get it really really wrong. Unfortunately that creates an issue where WebKit has to decide what path to take. Changing fonts due to language attributes is a really bad idea. Its not what Unicode intended for Unihan ideographs. It takes control away from content creators. And the list goes on. Glyphs should change when content creators specify different fonts. 

For fallback, WebKit (and other browsers) should select an appropriate glyph based on language context. Or if an OpenType font includes support for language dependent glyph variants. But the behavior demonstrated in these examples is just plain wrong from the spec. Also, this is not to say that WebKit doesn't need to fix its support for the lang CSS selectors which I believe it did with bug#3233). However, that's handled in other bug reports. Fixing the language selector and adding CJKV language dependent fonts to the default styleshet would be a setp towards addressing this issue. The only other thing that should be done would be to ensure that fallback font selection made use of that default useragent stylesheet before other fallbacks. Unicode suggests its the users enviornment settings that should govern the font selection (for plain text anyway). I think that should also guide fallback algorithms for CSS.
Comment 7 Allan Sandfeld Jensen 2007-08-22 10:01:30 PDT
This is actually a bug in Unicode. You should report it there.  But we need to fix it in webkit too.

The issue is that unicode is completely inconsistent. In the beginning they represented separated symbols. For instance there is different encodings for European numbers and Arabic number even though the meaning are the same, they are separated because the symbols are different. This made Unicode usefull for encoding fonts. But when the same type of issues appears in Asian scripts, Unicode changes and decides on encoding semantics instead of symbols, so suddenly the one unicode-character has several different graphical symbol.

Unfortunately it is unlikely that Unicode will fix their bugs, since it is so fundamentally flawed already.

Btw. Another place language is needed is in correct capitalization (like CSS text-transform). In Turkish a small 'i' needs to be capitalized to a big &#304;(dot over I), and a large 'I' to be lower-cased to '&#305;' (i with no dot).
Comment 8 Robert Burns 2007-08-22 11:53:38 PDT
(In reply to comment #7)
> This is actually a bug in Unicode. You should report it there.  But we need to
> fix it in webkit too.

I'm not so sure it is a bug in Unicode. Unicode has to balance many competing interests. They unify some things and dis-unify others. The rationale may not always be clear, but however they handle unification, it doesn't prevent proper glyph matching and text layout. It simply moves it to the presentation level (without an underlying character/semantic difference).

> > The issue is that unicode is completely inconsistent. In the beginning they
> represented separated symbols. For instance there is different encodings for
> European numbers and Arabic number even though the meaning are the same, they
> are separated because the symbols are different. This made Unicode usefull for
> encoding fonts. But when the same type of issues appears in Asian scripts,
> Unicode changes and decides on encoding semantics instead of symbols, so
> suddenly the one unicode-character has several different graphical symbol.

This is actually one of the *features* of Unicode. Separating the semantics from the presentation or the model from the view.

> Btw. Another place language is needed is in correct capitalization (like CSS
> text-transform). In Turkish a small 'i' needs to be capitalized to a big
> &#304;(dot over I), and a large 'I' to be lower-cased to '&#305;' (i with no
> dot).

This is actually a better example where glyphs should be different depending on the language settings. These CSS text transforms should be language aware for glyph selection.

In the case of Han unification, the webpage author should select a suitable font to display the Han characters with the desired glyphs.  The font the author selects will already be language dependent.

However, WebKit (and the various text systems it uses) should be language aware in it's glyph fallback selection. That is when the fonts specified by the author are not available, the text system should select the appropriate Han glyphs from an appropriate font based on the language settings of that run of text.
Comment 9 Jungshik Shin 2007-11-14 12:57:15 PST
(In reply to comment #1)
> Do other browsers do this?   What do WinIE and Firefox do?

Firefox has separate font preferences for script/langGroup and it does honor lang/xml:lang when selecting fonts. If lang/xml:lang is not specified, it falls back to script/langGroup inferred from charset (that does not always work, but in many cases, it works more or less).  Of course, documents encoded in UTF-{8,16,32} (or with a lot of NCRs to represent characters outside the repertoire of the current charset), this breaks down. For UTF-{8,16,32}, it falls back to the font for the lang/scriptGroup corresponding to the current application locale. (its lang/scriptGroup is not optimal being written almost 10 years ago. It'd be done differently if it was to be done today). 

I have a partial patch in that direction, but it's not anywhere near to be complete.

WebKit currently has only a single set of font preference across all languages/scripts.  |Settings| in webkit is not expressive enough (because it does not have a general 'key-value' with or without hierarchy) to have these font preferences.

font.<script>.serif
font.<script>.sans-serif
font.<script>.monospace

As for Han-disambiguation and Cyrillic-lang dependency, we need to do more than the above, but the above is a start. 

Comment 10 Nicholas Shanks 2008-03-26 02:26:46 PDT
Regarding comment #9, I believe you are touching upon a bug in CSS, whereby serif/sans-serif and monospace/proportional are actually orthogonal properties which have been conflated.
For example, compare Courier and Monaco.

If WebKit implements hierarchical preferences like Gecko has, we should use:

font.<script>.serif
font.<script>.serif.monospace
font.<script>.sans-serif
font.<script>.sans-serif.monospace

Presumably the font used for characters that have no script would be chosen based on the surrounding script, if any.
Comment 11 mitz 2008-08-25 18:23:20 PDT
<rdar://problem/3605214> is a related Apple bug.
Comment 12 Jungshik Shin 2009-03-09 10:53:39 PDT
bug 18085 has a tentative patch (that does not use 'lang/xml:lang' but infers 'dominant script' from the document character encoding). With bug 16801 (for which I just uploaded a tentative patch) fixed and some more changes in Settings, it can be made a patch for this bug. 
Comment 13 Jungshik Shin 2009-03-09 10:56:53 PDT
Sorry for bug spam. I meant to take this adding the previous comment
Comment 14 Nicholas Shanks 2009-03-13 04:33:51 PDT
(In reply to comment #8)
> In the case of Han unification, the webpage author should select a suitable
> font to display the Han characters with the desired glyphs.  The font the
> author selects will already be language dependent.

I whole heartedly disagree with you here. In all my websites I do not specify a typeface for the "body copy" (main bulk text) of my site. I don't even use the 'serif' and 'sans-serif' generic classes. I let the user specify their preferred font. If they want to read the internet in comic sans, then that's their problem, and I will be happy to let them suffer through it.

However, I do expect that Urdu and Farsi content will be presented in a Nasta'liq font, while Arabic would be presented using a Naskh font; and that Chinese will be presented in a Han font while Japanese will be presented in a Kanji font. Similarly, I expect a passage in Old Church Slavonic will use a single typeface for the entire passage that contains the OCS glyphs required, and not use the default Cyrillic font for 99% of the glyphs (such as Lucida Grande) and then find it needs to fall back to another font (like Gentium) to display one letter here and there, making the presentation look like rotten turds. Can you imagine what an English sentence would look like if the letters q and z had to come from Helvetica while the rest of the passage was set in Times?
Comment 15 Matt Falkenhagen 2011-06-10 03:36:39 PDT
Created attachment 96724 [details]
work in progress patch
Comment 16 Matt Falkenhagen 2011-06-10 03:37:40 PDT
Comment on attachment 96724 [details]
work in progress patch

This patch is not yet ready for formal review, but any comments would be appreciated. It is based strongly on Jungshik's patch on bug 18085 and includes a version of the patch on bug 20797 that can be removed when that is committed. This patch:

1. Adds a script member to FontDescription.
2. Sets that script in CSSStyleSelector based on lang/xml:lang/content-language.
3. Modifies fontDataForGenericFamily to get the appropriate font for the script

This patch does not yet:

1. Infer language/script from the charset (Jungshik's patch for bug 18085 has this)
2. Choose a good font when the lang attribute is used but font style is not.
3. Deal with lang changing dynamically.
Comment 17 Alexey Proskuryakov 2011-06-10 15:51:18 PDT
Comment on attachment 96724 [details]
work in progress patch

View in context: https://bugs.webkit.org/attachment.cgi?id=96724&action=review

It's great that you are working in this. This is a huge issue, in my opinion.

> Source/JavaScriptCore/ChangeLog:12
> +        * wtf/unicode/UScriptCode.h: Added.

Is this the same one that's being added in bug 20797? I previously made some comments in that bug.

> Source/WebCore/css/CSSStyleSelector.cpp:4260
> +        AtomicString lang = m_element->computeInheritedLanguage();

This bug is blocked on bug 16801, which says that language computation is likely unacceptably slow to be used for text rendering yet.
Comment 18 Alexey Proskuryakov 2011-06-10 15:52:25 PDT
> Is this the same one that's being added in bug 20797?

I see now that you said that.
Comment 19 Matt Falkenhagen 2011-06-13 02:25:30 PDT
Thanks for the comments! I am aware of bug 16801 and am hoping first to get a functioning implementation to better understand what needs to be done and measure the performance impact. It may be easier to tackle bug 16801 once that is done.
Comment 20 Matt Falkenhagen 2011-06-24 05:10:39 PDT
Created attachment 98483 [details]
improved patch
Comment 21 Matt Falkenhagen 2011-06-24 05:16:22 PDT
Comment on attachment 98483 [details]
improved patch

Sorry, I forgot to pass --no-review to webkit-patch upload. 

This patch is now easier to understand as it no longer contains the patch from bug 20797. It also  implements the 3 things that the previous patch is listed as not implementing.

Again, it is based strongly on Jungshik's earlier patches (bug 16801 and bug 18085).
Comment 22 Alexey Proskuryakov 2011-06-24 08:48:26 PDT
I talked to Dan about efficiently propagating language, and we now have an idea about how to do that.

WebKit already has a -webkit-locale CSS property, which affects hyphenation and line breaking. You could just turn lang into a mapped attribute (like for example font) to translate it into -webkit-locale, and use that for font selection, too. If the difference between locale and language is important here (which I don't think it is), you may also add a new CSS property.

I would suggest doing that in separate steps:
1) Add per-script defaults for languages. That would need some discussion with Safari engineers, because silently overriding preferences set by a user is not so nice. 
2) Implement font selection based on explicitly specified -webkit-locale. At this point, you can already have regression tests.
3) Add mapping from lang to -webkit-locale.
4) Add logic that takes encoding into account.

Does that sound like a reasonable approach to you?
Comment 23 Jungshik Shin 2011-06-24 15:04:11 PDT
>  just turn lang into a mapped attribute (like for example font) to translate it into -webkit-locale, and use that for font selection

We hit upon the same idea almost simultaneously :-)  Yesterday, I was scanning recent changes in ChangeLog and came across Dan's patch  to make LineBreakIterator dependent on webkit-locale. I'm trying it for bug 16801 (to use in bug 21312).
Comment 24 Matt Falkenhagen 2011-06-27 02:08:35 PDT
Alexey, thank you very much for your detailed comment!  It sounds like a reasonable approach.
Comment 25 Jungshik Shin 2011-06-27 10:52:34 PDT
> 1) Add per-script defaults for languages. . 

For ICU platforms, it's almost free because ICU has an API for that : add_likelySubtag  and getScript. For instance, ru will be expanded to ru_Cyrl_RU and getScript run on that expanded that will return Cyrl. 
If we pass sr_Latn, 'Latn' will remain as it is. 

'Cyrl' can be converted to ICU UScriptCode enum with another API. 

There's one problem, though. The ICU API in question was added in ICU 3.8 or 4.0 and Mac OS 10.5 has ICU 3.6....   

For non-ICU platforms, we need to do what Matt is doing (hard-coding and caching). Even on ICU platforms, we can do caching. 

> That would need some discussion with Safari engineers, because silently overriding preferences set by a user is not so nice

Actually, as long as Safari does not have per-script font preferences (but just global font preferences), the behavior will not be changed because in that case, the current patch will fall back to fonts for UScriptCommon (which is the only 

If we want to hard-code some preferences for Safari on Mac (as the latest patch does), what we can do is :

- replace each CSS generic with two fonts : one for ScriptCommon (that corresponds to the font selected by a user in Safari's UI) followed by a hard-coded font for each script + CSS generic. 

That way, what a user selected in the UI is still respected. 

As for the plan outlined, that looks great.
Comment 26 Matt Falkenhagen 2011-07-07 02:14:23 PDT
Created attachment 99959 [details]
use -webkit-locale for font selection
Comment 27 Matt Falkenhagen 2011-07-07 02:22:47 PDT
Comment on attachment 99959 [details]
use -webkit-locale for font selection

This patch links -webkit-locale with the per-script font settings that Jungshik implemented in bug 20797 (i.e., step #2 in Alexey's comment #22).

The default settings have not changed, so there should still be no visible change yet.
Comment 28 Matt Falkenhagen 2011-07-12 18:57:10 PDT
Could someone please review the latest patch? Since there is no visible effect (default fonts haven't changed), I didn't include a regression test. Would it be better to include one now and update the expectations when the font settings change?
Comment 29 Alexey Proskuryakov 2011-07-20 12:00:30 PDT
Comment on attachment 99959 [details]
use -webkit-locale for font selection

View in context: https://bugs.webkit.org/attachment.cgi?id=99959&action=review

Mitz asked me to look over parts other than CSS and font code.

> Source/WebCore/ChangeLog:30
> +        * css/LocaleUtils.cpp: Added.

This file should have a "Default" suffix, matching e.g Collator in JavaScriptCore/wtf/Unicode. Without it, it looks like a file that's used by all platforms.

These files should be in platform/text, not in css.

"Utils" is not so great to have in file name. It doesn't convey any information. I don't have a suggestion at the moment, but at the very least, "Utilities" should not be abbreviated.

> Source/WebCore/WebCore.gypi:2373
> +            'css/LocaleUtils.cpp',
> +            'css/LocaleUtilsICU.cpp',

Why does this this file list both? Wy doesn't that break the build due to duplicate definition of localeToScriptCode()?

> Source/WebCore/css/LocaleUtils.cpp:47
> +    // TODO: Need to add all the languages.

We use FIXME, not TODO. Furthermore, this comment is not actionable. It's not practically possible to add all languages.

I would have said something like "The below list only includes languages that need correct script identification for choosing a default font".

How hard would it be to just take all languages some modern version of ICU supports though?

> Source/WebCore/css/LocaleUtils.cpp:48
> +    const LocaleScript localeScriptList[] = {

This looks like it will be copied every time, no?

> Source/WebCore/css/LocaleUtils.cpp:52
> +        { "de", USCRIPT_LATIN },
> +        { "en", USCRIPT_LATIN },
> +        { "es", USCRIPT_LATIN },
> +        { "fr", USCRIPT_LATIN },

The short list of Roman languages seems particularly evil. Perhaps you should just maps them all to USCRIPT_COMMON, given that no port is going to set different defaults for COMMON and for ROMAN.

> Source/WebCore/css/LocaleUtils.h:34
> +#include <wtf/text/AtomicString.h>

No need to include, a forward declaration would work better.

> Source/WebCore/css/LocaleUtils.h:39
> +UScriptCode localeToScriptCode(const AtomicString& locale);

This should take const String&, since you don't ever use AtomicString features.

> Source/WebCore/css/LocaleUtilsICU.cpp:40
> +UScriptCode canonicalizeScriptCode(UScriptCode scriptCode)

This should be a static. The build should have broken because this doesn't have a forward declaration otherwise.

What does "Canonical" mean here? I'm quite confused.
Comment 30 Matt Falkenhagen 2011-07-21 18:00:51 PDT
Created attachment 101680 [details]
update patch after ap's review
Comment 31 Matt Falkenhagen 2011-07-21 18:09:03 PDT
Comment on attachment 99959 [details]
use -webkit-locale for font selection

View in context: https://bugs.webkit.org/attachment.cgi?id=99959&action=review

Thank you for the review! I've added a new patch.

>> Source/WebCore/WebCore.gypi:2373
>> +            'css/LocaleUtilsICU.cpp',
> 
> Why does this this file list both? Wy doesn't that break the build due to duplicate definition of localeToScriptCode()?

I looked at similar cases and it seems like the .gypi includes all files and then excludes the unwanted ones in the .gyp file (as I did here). For example, the TextIteratorIterator* files in WebCore.

>> Source/WebCore/css/LocaleUtils.cpp:47
>> +    // TODO: Need to add all the languages.
> 
> We use FIXME, not TODO. Furthermore, this comment is not actionable. It's not practically possible to add all languages.
> 
> I would have said something like "The below list only includes languages that need correct script identification for choosing a default font".
> 
> How hard would it be to just take all languages some modern version of ICU supports though?

I think it should be straightforward to do that.

>> Source/WebCore/css/LocaleUtils.cpp:52
>> +        { "fr", USCRIPT_LATIN },
> 
> The short list of Roman languages seems particularly evil. Perhaps you should just maps them all to USCRIPT_COMMON, given that no port is going to set different defaults for COMMON and for ROMAN.

Right, I kind of just meant this list as a preview for before adding "all the languages". Maybe it is better to just remove the Latin languages for now. I can add all the ICU languages in a later patch.

>> Source/WebCore/css/LocaleUtilsICU.cpp:40
>> +UScriptCode canonicalizeScriptCode(UScriptCode scriptCode)
> 
> This should be a static. The build should have broken because this doesn't have a forward declaration otherwise.
> 
> What does "Canonical" mean here? I'm quite confused.

I added a comment in the code.
Comment 32 Alexey Proskuryakov 2011-07-22 17:00:47 PDT
Comment on attachment 101680 [details]
update patch after ap's review

View in context: https://bugs.webkit.org/attachment.cgi?id=101680&action=review

> Source/WebCore/page/Settings.cpp:71
> +    if (script == USCRIPT_COMMON)
> +        return emptyAtom;
> +    return getGenericFontFamilyForScript(fontMap, USCRIPT_COMMON);

I don't understand why an empty font family is returned for USCRIPT_COMMON.

> Source/WebCore/platform/text/LocaleToScriptMapping.h:38
> +namespace WTF {
> +
> +class String;

I gave a bad/incomplete advice. This file should be including wtf/Forward.h.

> Source/WebCore/platform/text/LocaleToScriptMapping.h:44
> +UScriptCode localeToScriptCode(const WTF::String& locale);

Then you won't need this WTF::

> Source/WebCore/platform/text/LocaleToScriptMappingICU.cpp:41
> +// There are certain families of script codes that we want to treat as a single
> +// script for the purpose of assigning a per-script font in Settings. For

I'd call the function scriptCodeForFontSelection() then. The rest of this comment is easy to infer from the intent, which should be conveyed by function name.

> Source/WebCore/platform/text/LocaleToScriptMappingICU.cpp:61
> +UScriptCode localeToScriptCode(const String& locale)

And since this always returns a script code for font selection, this would be localeToScriptCodeForFontSelection(). 

Does that make sense? Do you envision any other uses of localeToScriptCode, with or without such canonicalization?

> Source/WebCore/platform/text/LocaleToScriptMappingICU.cpp:80
> +    // Ignore error that multiple scripts could be returned, since we only want
> +    // one script.

This is a short line, no need to break it.
Comment 33 Early Warning System Bot 2011-07-24 16:33:49 PDT
Comment on attachment 101680 [details]
update patch after ap's review

Attachment 101680 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/9233734
Comment 34 Matt Falkenhagen 2011-07-25 15:42:58 PDT
Created attachment 101929 [details]
revised after second review
Comment 35 Matt Falkenhagen 2011-07-25 15:52:51 PDT
Comment on attachment 101680 [details]
update patch after ap's review

View in context: https://bugs.webkit.org/attachment.cgi?id=101680&action=review

Thanks again for the review. I uploaded a new patch.

>> Source/WebCore/page/Settings.cpp:71
>> +    return getGenericFontFamilyForScript(fontMap, USCRIPT_COMMON);
> 
> I don't understand why an empty font family is returned for USCRIPT_COMMON.

It's just falling back to USCRIPT_COMMON. I feel like we always want to fallback here, but perhaps there is some situation where we must not fallback. In that case, I would have to move the fallback code outside of this function.

>> Source/WebCore/platform/text/LocaleToScriptMappingICU.cpp:61
>> +UScriptCode localeToScriptCode(const String& locale)
> 
> And since this always returns a script code for font selection, this would be localeToScriptCodeForFontSelection(). 
> 
> Does that make sense? Do you envision any other uses of localeToScriptCode, with or without such canonicalization?

Yes, this makes sense. I imagine it will always be used for font selection.
Comment 36 Alexey Proskuryakov 2011-07-25 16:11:02 PDT
Comment on attachment 101929 [details]
revised after second review

Thank you, looks good to me. Dan?
Comment 37 mitz 2011-08-03 21:06:52 PDT
Comment on attachment 101929 [details]
revised after second review

View in context: https://bugs.webkit.org/attachment.cgi?id=101929&action=review

r=me but please see the comment below and address if needed.

> Source/WebCore/css/CSSFontSelector.cpp:500
> +        if (fontDescription.genericFamily() == FontDescription::StandardFamily && !fontDescription.isSpecifiedFont())
> +            return fontDataForGenericFamily(m_document, fontDescription, "-webkit-standard");

I think you need similar handling in the !m_fontFaces.isEmpty() case that follows a few lines below. That case would be hit if the document had @font-face rules but you were still looking at the initial font.
Comment 38 Matt Falkenhagen 2011-08-04 01:25:33 PDT
Created attachment 102883 [details]
revise after mitz's review
Comment 39 Matt Falkenhagen 2011-08-04 01:29:58 PDT
Thanks for the review! I've added a new patch addressing your comment.
Comment 40 WebKit Review Bot 2011-08-04 08:22:18 PDT
Comment on attachment 102883 [details]
revise after mitz's review

Clearing flags on attachment: 102883

Committed r92375: <http://trac.webkit.org/changeset/92375>
Comment 41 WebKit Review Bot 2011-08-04 08:22:26 PDT
All reviewed patches have been landed.  Closing bug.
Comment 42 Matt Falkenhagen 2011-08-04 18:10:35 PDT
This bug should probably not be marked RESOLVED until lang/xml:lang is mapped to -webkit-locale as Jungshik's patch for bug 16801 does.  But it seems I can't change the Status directly.
Comment 43 Kent Tamura 2011-08-04 18:14:57 PDT
(In reply to comment #42)
> This bug should probably not be marked RESOLVED until lang/xml:lang is mapped to -webkit-locale as Jungshik's patch for bug 16801 does.  But it seems I can't change the Status directly.

Ok, I reopen.
Comment 44 Matt Falkenhagen 2012-02-07 21:37:47 PST
Created attachment 125996 [details]
add tests for language-based font selection
Comment 45 Matt Falkenhagen 2012-02-07 21:51:41 PST
Now that lang/xml:lang is mapped to -webkit-locale (bug 67586), lang/xml:lang is used for font selection. I proposed a patch with some tests for that.

Font selection based on http-equiv content-language is also implemented (bug 76701). But the content-language in the HTTP header is not yet used (bug 76892).
Comment 46 Alexey Proskuryakov 2012-02-08 00:55:36 PST
Please note that this bug covers making it work in Safari (it was filed in 2006).
Comment 47 Matt Falkenhagen 2012-02-08 17:08:08 PST
(In reply to comment #46)
> Please note that this bug covers making it work in Safari (it was filed in 2006).

That's true. It works on Safari but the tests rely on per-script font settings support in overridePreference of DumpRenderTree which is only implemented for the Chromium port so far. I've filed bug 78184.
Comment 48 Matt Falkenhagen 2012-03-14 20:45:12 PDT
Comment on attachment 125996 [details]
add tests for language-based font selection

Clear r? on stale patch.
Comment 49 Alexey Proskuryakov 2012-04-10 13:16:32 PDT
Created attachment 136522 [details]
Mac fix
Comment 50 Build Bot 2012-04-10 13:58:17 PDT
Comment on attachment 136522 [details]
Mac fix

Attachment 136522 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/12383407
Comment 51 Alexey Proskuryakov 2012-04-10 14:18:30 PDT
Created attachment 136541 [details]
Mac fix

Oh llvm_gcc...
Comment 52 Alexey Proskuryakov 2012-04-10 18:20:57 PDT
Comment on attachment 136541 [details]
Mac fix

Withdrawing for now, got some offline review comments to address.
Comment 53 Alexey Proskuryakov 2012-04-11 11:58:33 PDT
Created attachment 136717 [details]
Mac fix
Comment 54 Alexey Proskuryakov 2012-04-11 12:49:44 PDT
Created attachment 136728 [details]
Mac fix

Forgot to include tests.
Comment 55 Alexey Proskuryakov 2012-04-11 13:13:54 PDT
Fixed in <http://trac.webkit.org/changeset/113900>.