Bug 16801 - CSS2: :lang has possibly poor performance on large pages
: CSS2: :lang has possibly poor performance on large pages
Status: ASSIGNED
: WebKit
CSS
: 528+ (Nightly build)
: All All
: P2 Normal
Assigned To:
:
: InRadar
: 9454
:
  Show dependency treegraph
 
Reported: 2008-01-09 01:24 PST by
Modified: 2011-09-04 23:54 PST (History)


Attachments
tentative patch (2.60 KB, patch)
2009-03-09 10:48 PST, Jungshik Shin
no flags Review Patch | Details | Formatted Diff | Diff
a one-line patch (that does not work) (772 bytes, patch)
2011-07-13 16:18 PST, Jungshik Shin
no flags Review Patch | Details | Formatted Diff | Diff
wip patch (partly working) (2.87 KB, patch)
2011-07-14 18:28 PST, Jungshik Shin
no flags Review Patch | Details | Formatted Diff | Diff
wip patch (only partly working) (3.05 KB, patch)
2011-07-15 14:10 PST, Jungshik Shin
no flags Review Patch | Details | Formatted Diff | Diff
wip patch (almost works) (3.63 KB, patch)
2011-07-18 12:23 PST, Jungshik Shin
no flags Review Patch | Details | Formatted Diff | Diff
a test (not for review) (1.03 KB, text/html)
2011-08-29 16:54 PST, Jungshik Shin
no flags Details
patch with layout test (6.35 KB, patch)
2011-09-01 18:16 PST, Jungshik Shin
webkit.review.bot: commit‑queue-
Review Patch | Details | Formatted Diff | Diff
patch update (still not passing some layout tests) (7.43 KB, patch)
2011-09-02 17:32 PST, Jungshik Shin
no flags Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2008-01-09 01:24:35 PST
CSS2: :lang has possibly poor performance on large pages

Proper :lang support was added as part of bug 9454.  Hyatt raised concerns about possible performance issues with :lang on large pages.  This bug is to track any such issues.
------- Comment #1 From 2008-01-09 02:50:50 PST -------
MacDome, to address an IRC comment, maybe we could do a survey of bilingual Chinese/Japanese readers and find out how often wrong font selection affects them. I would presume that han unification on bilingual pages is one of the biggest problems that :lang() solves. We should probably have some information oh how often the selector is used on such pages and how deep their dom trees generally are.
------- Comment #2 From 2008-01-21 14:20:06 PST -------
Actually, I realize that our implementation actually needs to change to allow fonts to use this information.  Currently the cascaded lang value is never stored anywhere, which means that other features (like font chosing, or SVG stuff) can't use this value, they'd have to recompute it themselves.
------- Comment #3 From 2008-01-21 15:09:10 PST -------
Yeah, that's why we should have the information in the style so that it is readily available. :)
------- Comment #4 From 2008-08-25 18:24:20 PST -------
<rdar://problem/6175111>
------- Comment #5 From 2009-03-09 10:46:01 PST -------
Not having noticed this bug blocking bug 10874, I uploaded a patch (along the direction discussed here) to bug 21312. I'll move it here. 
------- Comment #6 From 2009-03-09 10:48:32 PST -------
Created an attachment (id=28417) [details]
tentative patch 

I don't think this patch is acceptable as it is (it even has a commented-out block), but I'd like to see whether I'm in the right direction. This patch is copied from bug 21312. 
------- Comment #7 From 2009-03-09 11:30:40 PST -------
(From update of attachment 28417 [details])
Darin made an extensive comment on this patch (bug 21312 comment 5). 
------- Comment #8 From 2011-06-24 10:53:27 PST -------
I found that '-webkit-locale' (CSS property) was introduced. I don't understand why we introduced '-webkit-locale' when we have 'lang' to determine what locale to use for  hyphenation, line breaking, text-transform,  font selection, etc.  

However, in retrospect, introducing '-webkit-locale' seems to be a brilliant move because  Darin's comment on my patch for bug 21312 are mostly addressed by mapping HTML lang attribute to CSS '-webkit-locale' as HTML 'dir' is to CSS 'direction'.
------- Comment #9 From 2011-07-13 16:18:06 PST -------
Created an attachment (id=100726) [details]
a one-line patch (that does not work)

This patch is a 1-liner that tried to map 'lang' attribute to '-webkit-locale'. I also have a layout test but it's not included here because it's not working yet. 

If it works, it'd be great, but it does not. 

addCSSProperty calls CSSParser::parseValue() eventually and it fails because |cssparse| (in CSSGrammar.cpp) sets m_numParsedProperties to 0.

The following part in CSSGrammar.y is reached. The 3rd entry (STRING maybe_space) should be hit, but instead 'IDENT maybe_space' is used. Because lang/locale id (like "he", "en-US", "zh", "fr")  is not a cssValueKeyword, it fails. 

term:
  unary_term { $$ = $1; }
  | unary_operator unary_term { $$ = $2; $$.fValue *= $1; }
  | STRING maybe_space { $$.id = 0; $$.string = $1; $$.unit = CSSPrimitiveValue::CSS_STRING; }
  | IDENT maybe_space {
      $$.id = cssValueKeywordID($1);      $$.unit = CSSPrimitiveValue::CSS_IDENT;
      $$.string = $1;  }


I also considered special-casing -webkit-locale like font-related properties are in C++, but if I can tweak CSSGrammar.y to do what's necessary, I'd rather take that route. 


http://paste.lisp.org/display/123254
------- Comment #10 From 2011-07-13 16:28:53 PST -------
(In reply to comment #9)

> addCSSProperty calls CSSParser::parseValue() eventually and it fails because |cssparse| (in CSSGrammar.cpp) sets m_numParsedProperties to 0.
> 
> The following part in CSSGrammar.y is reached. The 3rd entry (STRING maybe_space) should be hit, but instead 'IDENT maybe_space' is used. Because lang/locale id (like "he", "en-US", "zh", "fr")  is not a cssValueKeyword, it fails. 

Locale identifiers are not CSS keywords. When you specify a -webkit-locale value you need to either specify the auto keyword or a string whose content is the locale identifier. This means that you need to add quotation marks around the attribute value (and escape internal quotation marks, maybe).
------- Comment #11 From 2011-07-13 16:29:35 PST -------
I.e. these are valid:

-webkit-locale: auto;
-webkit-locale: "en_US";

This is not:
-webkit-locale: en_US;
------- Comment #12 From 2011-07-13 17:23:30 PST -------
(In reply to comment #10)
> (In reply to comment #9)
> 
> > addCSSProperty calls CSSParser::parseValue() eventually and it fails because |cssparse| (in CSSGrammar.cpp) sets m_numParsedProperties to 0.
> > 
> > The following part in CSSGrammar.y is reached. The 3rd entry (STRING maybe_space) should be hit, but instead 'IDENT maybe_space' is used. Because lang/locale id (like "he", "en-US", "zh", "fr")  is not a cssValueKeyword, it fails. 
> 
> Locale identifiers are not CSS keywords. 

That much I knew. My question was why it's not picked up as a string. 

> When you specify a -webkit-locale value you need to either specify the auto keyword or a string whose content is the locale identifier. This means that you need to add quotation marks around the attribute value (and escape internal quotation marks, maybe).

Thank you. Now I got it. I should have quote *explicitly* the value of 'lang'.  Otherwise, the value will be treated as a CSS keyword and parsing will fail because it's not.
------- Comment #13 From 2011-07-14 18:28:01 PST -------
Created an attachment (id=100910) [details]
wip patch (partly working)

xml:lang is not taken into account. Content-Language is not, either. 
More importantly, it crashes a bunch of layout tests (when computeInheritedLanguage() is called). I haven't yet looked into this. 

fast/selectors/lang-inheritance2.html  
css1/conformance/forward_compatible_parsing.html       
http/tests/eventsource/workers/eventsource-simple.html 
svg/W3C-SVG-1.1/styling-css-05-b.svg   
fast/dom/css-selectorText.html 
fast/dom/SelectorAPI/resig-SelectorsAPI-test.xhtml     
fast/selectors/lang-vs-xml-lang-xhtml.xhtml    
fast/selectors/lang-inheritance.html   
http/tests/misc/acid3.html     
fast/selectors/lang-vs-xml-lang.html   
fast/css/css3-modsel-22.html   
fast/css/lang-selector-empty-attribute.xhtml   
fast/css/css-selector-text.html        
fast/css/css3-space-in-nth-and-lang.html       
css2.1/t0402-c71-fwd-parsing-02-f.html 
inspector/debugger/script-formatter.html


Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000038
Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   com.apple.WebCore                 0x0000000101472798 WTF::RefPtr<WebCore::StyleRareInheritedData>::get() const + 12 (RefPtr.h:60)
1   com.apple.WebCore                 0x0000000101501305 WebCore::DataRef<WebCore::StyleRareInheritedData>::get() const + 21 (DataRef.h:33)
2   com.apple.WebCore                 0x000000010150131d WebCore::DataRef<WebCore::StyleRareInheritedData>::operator->() const + 21 (DataRef.h:36)
3   com.apple.WebCore                 0x00000001014727dd WebCore::RenderStyle::locale() const + 25 (RenderStyle.h:703)
4   com.apple.WebCore                 0x00000001016e2ef9 WebCore::Element::computeInheritedLanguage() const + 109 (Element.cpp:1778)
5   com.apple.WebCore                 0x0000000101526f08 WebCore::CSSStyleSelector::SelectorChecker::checkOneSelector(WebCore::CSSSelector*,
------- Comment #14 From 2011-07-15 14:10:12 PST -------
Created an attachment (id=101041) [details]
wip patch (only partly working)

No more crash with null check, but still not taking care of  xml:lang vs lang precedence. So it breaks tests like fast/selectors/lang-vs-xml-lang-xhtml.xhtml
------- Comment #15 From 2011-07-18 12:23:47 PST -------
Created an attachment (id=101187) [details]
wip patch (almost works)

This one works except when xml:lang is specified without lang specified.
------- Comment #16 From 2011-07-18 16:50:30 PST -------
(From update of attachment 101187 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=101187&action=review

> Source/WebCore/dom/Element.cpp:1777
> +            RefPtr<RenderStyle> nodeStyle(static_cast<const Element*>(n)->renderStyle());

Instead of the above, I wonder if the following is better:

RefPtr<RenderStyle> nodeStyle(const_cast<Element*>(static_cast<const Element*>(n))->computedStyle(NOPSEUDO));

Note that computedStyle() is not const. So, either I have to const_cast or drop 'const' for this function. If there's no difference, I'd just use what I have in the patch.

> Source/WebCore/html/HTMLElement.cpp:164
> +        // FIXME: this does not work when xml:lang is present without lang.

XHTML spec says that both lang and xml:lang have to be used (for backward compatibility). I guess there will be very few document with only xml:lang. And, none of layout tests test that case. So, maybe it's ok not to support cases when ONLY xml:lang is specified. 

I tried the following, but it's never hit. 

else if (attr->name() == XMLNames::langAttr)
------- Comment #17 From 2011-08-29 16:54:58 PST -------
Created an attachment (id=105545) [details]
a test (not for review)

I'm trying to write a layout test that just checks the value of 'style.webkitLocale' of a node with 'lang' set directly or by inheritance. However, somehow accessing <node>.style.webkitLocale in JavaScript does not work as expected. 

In DOM inspector, all the nodes with lang set directly or by inheritance do have '-webkit-locale' set to the value I used. However, 'lang' node in the attached HTML file has

l1= l2= l3= l4=undefined.

In case of l4 (for 'l4' node), '-webkit-locale' is specified in inline style spec. In JS console, document.getElemetnById("l4".style.webkitLocale is "ja" as expected. 

I've tried HTML 'dir' mapped to CSS 'direction'.  Nodes with HTML dir attribute have 'direction' set to the value of 'dir' in DOM inspector. However, in JavaScript, their values are empty (the same as HTML 'lang' mapped to CSS '-webkit-locale').  However, a node with 'style="dir: rtl"' works as expected. That is, <node>.style.direction has 'rtl' instead of 'undefined'.  


Perhaps, some more plumbing is necessary to expose mapped HTML attributes in JavaScript?  Any hint/help would be appreciated. In the meantime, I'll poke around further.
------- Comment #18 From 2011-08-29 17:05:00 PST -------
Oh. I think I should use computedStyle, instead.
------- Comment #19 From 2011-09-01 18:16:08 PST -------
Created an attachment (id=106075) [details]
patch with layout test
------- Comment #20 From 2011-09-01 18:46:50 PST -------
(From update of attachment 106075 [details])
Attachment 106075 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9580465

New failing tests:
fast/css/css3-modsel-22.html
fast/css/lang-selector-empty-attribute.xhtml
svg/W3C-SVG-1.1/styling-css-05-b.svg
fast/selectors/lang-vs-xml-lang.html
fast/css/css3-space-in-nth-and-lang.html
fast/dom/SelectorAPI/resig-SelectorsAPI-test.xhtml
fast/selectors/lang-inheritance.html
------- Comment #21 From 2011-09-02 10:33:01 PST -------
(From update of attachment 106075 [details])
it can't be landed because a bunch of layout tests involving psuedo-lang selector fails. 

When they're loaded for the 1st time, they're broken. As soon as I turn on DOM inspector, all the nodes affected by pseudo-lang selector get rendered correctly.
------- Comment #22 From 2011-09-02 17:32:57 PST -------
Created an attachment (id=106227) [details]
patch update (still not passing some layout tests)

I also tried using computedStyle in computeInheritedLanguage, but it still does not pass some layout tests. 

What I'll do is to punt this bug (making computeInheritedLanguage more efficient) and file a new bug that focus on mapping 'lang' and 'xml:lang' to -webkit-locale because only the latter is necessary for font fallback.  
That means Content-Langauge is not reflected in font fallback (neither will be document encoding). I have another idea on that issue, but we can punt that, too, for now. 

After filing a new bug on mapping lang/xml:lang to -webkit-locale, I'll upload a patch for html/HTMLElement.cpp there along with the current layout test.
------- Comment #23 From 2011-09-04 23:54:30 PST -------
I filed bug 67586 with a narrower scope than this. To that bug, I attached a patch to map lang/xml:lang to -webkit-locale to be used in font fallback and tex-transform.