Bug 16801 - CSS2: :lang has possibly poor performance on large pages
Summary: CSS2: :lang has possibly poor performance on large pages
Status: ASSIGNED
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on: 9454
Blocks:
  Show dependency treegraph
 
Reported: 2008-01-09 01:24 PST by Eric Seidel (no email)
Modified: 2014-06-14 14:43 PDT (History)
12 users (show)

See Also:


Attachments
tentative patch (2.60 KB, patch)
2009-03-09 10:48 PDT, Jungshik Shin
no flags Details | Formatted Diff | Diff
a one-line patch (that does not work) (772 bytes, patch)
2011-07-13 16:18 PDT, Jungshik Shin
no flags Details | Formatted Diff | Diff
wip patch (partly working) (2.87 KB, patch)
2011-07-14 18:28 PDT, Jungshik Shin
no flags Details | Formatted Diff | Diff
wip patch (only partly working) (3.05 KB, patch)
2011-07-15 14:10 PDT, Jungshik Shin
no flags Details | Formatted Diff | Diff
wip patch (almost works) (3.63 KB, patch)
2011-07-18 12:23 PDT, Jungshik Shin
no flags Details | Formatted Diff | Diff
a test (not for review) (1.03 KB, text/html)
2011-08-29 16:54 PDT, Jungshik Shin
no flags Details
patch with layout test (6.35 KB, patch)
2011-09-01 18:16 PDT, Jungshik Shin
webkit.review.bot: commit-queue-
Details | Formatted Diff | Diff
patch update (still not passing some layout tests) (7.43 KB, patch)
2011-09-02 17:32 PDT, Jungshik Shin
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2008-01-09 01:24:35 PST
CSS2: :lang has possibly poor performance on large pages

Proper :lang support was added as part of bug 9454.  Hyatt raised concerns about possible performance issues with :lang on large pages.  This bug is to track any such issues.
Comment 1 Nicholas Shanks 2008-01-09 02:50:50 PST
MacDome, to address an IRC comment, maybe we could do a survey of bilingual Chinese/Japanese readers and find out how often wrong font selection affects them. I would presume that han unification on bilingual pages is one of the biggest problems that :lang() solves. We should probably have some information oh how often the selector is used on such pages and how deep their dom trees generally are.
Comment 2 Eric Seidel (no email) 2008-01-21 14:20:06 PST
Actually, I realize that our implementation actually needs to change to allow fonts to use this information.  Currently the cascaded lang value is never stored anywhere, which means that other features (like font chosing, or SVG stuff) can't use this value, they'd have to recompute it themselves.
Comment 3 Dave Hyatt 2008-01-21 15:09:10 PST
Yeah, that's why we should have the information in the style so that it is readily available. :)

Comment 4 mitz 2008-08-25 18:24:20 PDT
<rdar://problem/6175111>
Comment 5 Jungshik Shin 2009-03-09 10:46:01 PDT
Not having noticed this bug blocking bug 10874, I uploaded a patch (along the direction discussed here) to bug 21312. I'll move it here. 
Comment 6 Jungshik Shin 2009-03-09 10:48:32 PDT
Created attachment 28417 [details]
tentative patch 

I don't think this patch is acceptable as it is (it even has a commented-out block), but I'd like to see whether I'm in the right direction. This patch is copied from bug 21312.
Comment 7 Jungshik Shin 2009-03-09 11:30:40 PDT
Comment on attachment 28417 [details]
tentative patch 

Darin made an extensive comment on this patch (bug 21312 comment 5).
Comment 8 Jungshik Shin 2011-06-24 10:53:27 PDT
I found that '-webkit-locale' (CSS property) was introduced. I don't understand why we introduced '-webkit-locale' when we have 'lang' to determine what locale to use for  hyphenation, line breaking, text-transform,  font selection, etc.  

However, in retrospect, introducing '-webkit-locale' seems to be a brilliant move because  Darin's comment on my patch for bug 21312 are mostly addressed by mapping HTML lang attribute to CSS '-webkit-locale' as HTML 'dir' is to CSS 'direction'.
Comment 9 Jungshik Shin 2011-07-13 16:18:06 PDT
Created attachment 100726 [details]
a one-line patch (that does not work)

This patch is a 1-liner that tried to map 'lang' attribute to '-webkit-locale'. I also have a layout test but it's not included here because it's not working yet. 

If it works, it'd be great, but it does not. 

addCSSProperty calls CSSParser::parseValue() eventually and it fails because |cssparse| (in CSSGrammar.cpp) sets m_numParsedProperties to 0.

The following part in CSSGrammar.y is reached. The 3rd entry (STRING maybe_space) should be hit, but instead 'IDENT maybe_space' is used. Because lang/locale id (like "he", "en-US", "zh", "fr")  is not a cssValueKeyword, it fails. 

term:
  unary_term { $$ = $1; }
  | unary_operator unary_term { $$ = $2; $$.fValue *= $1; }
  | STRING maybe_space { $$.id = 0; $$.string = $1; $$.unit = CSSPrimitiveValue::CSS_STRING; }
  | IDENT maybe_space {
      $$.id = cssValueKeywordID($1);      $$.unit = CSSPrimitiveValue::CSS_IDENT;
      $$.string = $1;  }


I also considered special-casing -webkit-locale like font-related properties are in C++, but if I can tweak CSSGrammar.y to do what's necessary, I'd rather take that route. 


http://paste.lisp.org/display/123254
Comment 10 mitz 2011-07-13 16:28:53 PDT
(In reply to comment #9)

> addCSSProperty calls CSSParser::parseValue() eventually and it fails because |cssparse| (in CSSGrammar.cpp) sets m_numParsedProperties to 0.
> 
> The following part in CSSGrammar.y is reached. The 3rd entry (STRING maybe_space) should be hit, but instead 'IDENT maybe_space' is used. Because lang/locale id (like "he", "en-US", "zh", "fr")  is not a cssValueKeyword, it fails. 

Locale identifiers are not CSS keywords. When you specify a -webkit-locale value you need to either specify the auto keyword or a string whose content is the locale identifier. This means that you need to add quotation marks around the attribute value (and escape internal quotation marks, maybe).
Comment 11 mitz 2011-07-13 16:29:35 PDT
I.e. these are valid:

-webkit-locale: auto;
-webkit-locale: "en_US";

This is not:
-webkit-locale: en_US;
Comment 12 Jungshik Shin 2011-07-13 17:23:30 PDT
(In reply to comment #10)
> (In reply to comment #9)
> 
> > addCSSProperty calls CSSParser::parseValue() eventually and it fails because |cssparse| (in CSSGrammar.cpp) sets m_numParsedProperties to 0.
> > 
> > The following part in CSSGrammar.y is reached. The 3rd entry (STRING maybe_space) should be hit, but instead 'IDENT maybe_space' is used. Because lang/locale id (like "he", "en-US", "zh", "fr")  is not a cssValueKeyword, it fails. 
> 
> Locale identifiers are not CSS keywords. 

That much I knew. My question was why it's not picked up as a string. 

> When you specify a -webkit-locale value you need to either specify the auto keyword or a string whose content is the locale identifier. This means that you need to add quotation marks around the attribute value (and escape internal quotation marks, maybe).

Thank you. Now I got it. I should have quote *explicitly* the value of 'lang'.  Otherwise, the value will be treated as a CSS keyword and parsing will fail because it's not.
Comment 13 Jungshik Shin 2011-07-14 18:28:01 PDT
Created attachment 100910 [details]
wip patch (partly working)

xml:lang is not taken into account. Content-Language is not, either. 
More importantly, it crashes a bunch of layout tests (when computeInheritedLanguage() is called). I haven't yet looked into this. 

fast/selectors/lang-inheritance2.html  
css1/conformance/forward_compatible_parsing.html       
http/tests/eventsource/workers/eventsource-simple.html 
svg/W3C-SVG-1.1/styling-css-05-b.svg   
fast/dom/css-selectorText.html 
fast/dom/SelectorAPI/resig-SelectorsAPI-test.xhtml     
fast/selectors/lang-vs-xml-lang-xhtml.xhtml    
fast/selectors/lang-inheritance.html   
http/tests/misc/acid3.html     
fast/selectors/lang-vs-xml-lang.html   
fast/css/css3-modsel-22.html   
fast/css/lang-selector-empty-attribute.xhtml   
fast/css/css-selector-text.html        
fast/css/css3-space-in-nth-and-lang.html       
css2.1/t0402-c71-fwd-parsing-02-f.html 
inspector/debugger/script-formatter.html


Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000038
Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   com.apple.WebCore             	0x0000000101472798 WTF::RefPtr<WebCore::StyleRareInheritedData>::get() const + 12 (RefPtr.h:60)
1   com.apple.WebCore             	0x0000000101501305 WebCore::DataRef<WebCore::StyleRareInheritedData>::get() const + 21 (DataRef.h:33)
2   com.apple.WebCore             	0x000000010150131d WebCore::DataRef<WebCore::StyleRareInheritedData>::operator->() const + 21 (DataRef.h:36)
3   com.apple.WebCore             	0x00000001014727dd WebCore::RenderStyle::locale() const + 25 (RenderStyle.h:703)
4   com.apple.WebCore             	0x00000001016e2ef9 WebCore::Element::computeInheritedLanguage() const + 109 (Element.cpp:1778)
5   com.apple.WebCore             	0x0000000101526f08 WebCore::CSSStyleSelector::SelectorChecker::checkOneSelector(WebCore::CSSSelector*,
Comment 14 Jungshik Shin 2011-07-15 14:10:12 PDT
Created attachment 101041 [details]
wip patch (only partly working)

No more crash with null check, but still not taking care of  xml:lang vs lang precedence. So it breaks tests like fast/selectors/lang-vs-xml-lang-xhtml.xhtml
Comment 15 Jungshik Shin 2011-07-18 12:23:47 PDT
Created attachment 101187 [details]
wip patch (almost works)

This one works except when xml:lang is specified without lang specified.
Comment 16 Jungshik Shin 2011-07-18 16:50:30 PDT
Comment on attachment 101187 [details]
wip patch (almost works)

View in context: https://bugs.webkit.org/attachment.cgi?id=101187&action=review

> Source/WebCore/dom/Element.cpp:1777
> +            RefPtr<RenderStyle> nodeStyle(static_cast<const Element*>(n)->renderStyle());

Instead of the above, I wonder if the following is better:

RefPtr<RenderStyle> nodeStyle(const_cast<Element*>(static_cast<const Element*>(n))->computedStyle(NOPSEUDO));

Note that computedStyle() is not const. So, either I have to const_cast or drop 'const' for this function. If there's no difference, I'd just use what I have in the patch.

> Source/WebCore/html/HTMLElement.cpp:164
> +        // FIXME: this does not work when xml:lang is present without lang.

XHTML spec says that both lang and xml:lang have to be used (for backward compatibility). I guess there will be very few document with only xml:lang. And, none of layout tests test that case. So, maybe it's ok not to support cases when ONLY xml:lang is specified. 

I tried the following, but it's never hit. 

else if (attr->name() == XMLNames::langAttr)
Comment 17 Jungshik Shin 2011-08-29 16:54:58 PDT
Created attachment 105545 [details]
a test (not for review)

I'm trying to write a layout test that just checks the value of 'style.webkitLocale' of a node with 'lang' set directly or by inheritance. However, somehow accessing <node>.style.webkitLocale in JavaScript does not work as expected. 

In DOM inspector, all the nodes with lang set directly or by inheritance do have '-webkit-locale' set to the value I used. However, 'lang' node in the attached HTML file has

l1= l2= l3= l4=undefined.

In case of l4 (for 'l4' node), '-webkit-locale' is specified in inline style spec. In JS console, document.getElemetnById("l4".style.webkitLocale is "ja" as expected. 

I've tried HTML 'dir' mapped to CSS 'direction'.  Nodes with HTML dir attribute have 'direction' set to the value of 'dir' in DOM inspector. However, in JavaScript, their values are empty (the same as HTML 'lang' mapped to CSS '-webkit-locale').  However, a node with 'style="dir: rtl"' works as expected. That is, <node>.style.direction has 'rtl' instead of 'undefined'.  


Perhaps, some more plumbing is necessary to expose mapped HTML attributes in JavaScript?  Any hint/help would be appreciated. In the meantime, I'll poke around further.
Comment 18 Jungshik Shin 2011-08-29 17:05:00 PDT
Oh. I think I should use computedStyle, instead.
Comment 19 Jungshik Shin 2011-09-01 18:16:08 PDT
Created attachment 106075 [details]
patch with layout test
Comment 20 WebKit Review Bot 2011-09-01 18:46:50 PDT
Comment on attachment 106075 [details]
patch with layout test

Attachment 106075 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9580465

New failing tests:
fast/css/css3-modsel-22.html
fast/css/lang-selector-empty-attribute.xhtml
svg/W3C-SVG-1.1/styling-css-05-b.svg
fast/selectors/lang-vs-xml-lang.html
fast/css/css3-space-in-nth-and-lang.html
fast/dom/SelectorAPI/resig-SelectorsAPI-test.xhtml
fast/selectors/lang-inheritance.html
Comment 21 Jungshik Shin 2011-09-02 10:33:01 PDT
Comment on attachment 106075 [details]
patch with layout test

it can't be landed because a bunch of layout tests involving psuedo-lang selector fails. 

When they're loaded for the 1st time, they're broken. As soon as I turn on DOM inspector, all the nodes affected by pseudo-lang selector get rendered correctly.
Comment 22 Jungshik Shin 2011-09-02 17:32:57 PDT
Created attachment 106227 [details]
patch update (still not passing some layout tests)

I also tried using computedStyle in computeInheritedLanguage, but it still does not pass some layout tests. 

What I'll do is to punt this bug (making computeInheritedLanguage more efficient) and file a new bug that focus on mapping 'lang' and 'xml:lang' to -webkit-locale because only the latter is necessary for font fallback.  
That means Content-Langauge is not reflected in font fallback (neither will be document encoding). I have another idea on that issue, but we can punt that, too, for now. 

After filing a new bug on mapping lang/xml:lang to -webkit-locale, I'll upload a patch for html/HTMLElement.cpp there along with the current layout test.
Comment 23 Jungshik Shin 2011-09-04 23:54:30 PDT
I filed bug 67586 with a narrower scope than this. To that bug, I attached a patch to map lang/xml:lang to -webkit-locale to be used in font fallback and tex-transform.