Bug 9454 - CSS2: Language not inherited from parent element (Acid3 bug)
Summary: CSS2: Language not inherited from parent element (Acid3 bug)
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS (show other bugs)
Version: 420+
Hardware: Mac OS X 10.4
: P2 Normal
Assignee: Nobody
Keywords: InRadar
Depends on: 3233
Blocks: 16801 3234 10874 11390 13325
  Show dependency treegraph
Reported: 2006-06-15 09:03 PDT by Nicholas Shanks
Modified: 2008-01-09 01:25 PST (History)
12 users (show)

See Also:

Proposed patch (1.03 KB, patch)
2006-12-29 10:34 PST, Allan Sandfeld Jensen
sam: review-
Details | Formatted Diff | Diff
Test of inheriting LANG (199 bytes, text/html)
2006-12-29 11:01 PST, Allan Sandfeld Jensen
no flags Details
Another LANG inheritance test (217 bytes, text/html)
2006-12-29 11:02 PST, Allan Sandfeld Jensen
no flags Details
Patch (2.58 KB, patch)
2006-12-31 07:23 PST, Allan Sandfeld Jensen
rwlbuis: review-
Details | Formatted Diff | Diff
quick and dirty fix for rob's comments :-) (2.46 KB, patch)
2007-01-12 09:49 PST, Nicholas Shanks
rwlbuis: review-
Details | Formatted Diff | Diff
New patch (5.57 KB, patch)
2007-08-11 12:14 PDT, Allan Sandfeld Jensen
no flags Details | Formatted Diff | Diff
Patch (6.05 KB, patch)
2007-08-11 12:34 PDT, Allan Sandfeld Jensen
no flags Details | Formatted Diff | Diff
patch with darin's suggested fixes (4.33 KB, patch)
2007-09-30 12:39 PDT, Eric Seidel (no email)
mitz: review-
Details | Formatted Diff | Diff
even better patch, now including test case results and xml:lang support (90.43 KB, patch)
2007-10-01 07:44 PDT, Eric Seidel (no email)
hyatt: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Nicholas Shanks 2006-06-15 09:03:45 PDT
The :lang() pseudo-selector does not inherit the content language from its parent element or the HTTP Content-Language header. KHTML already has code for both of these inheritance paths.

Comment 1 Allan Sandfeld Jensen 2006-12-29 10:34:58 PST
Created attachment 12107 [details]
Proposed patch

Test implements simple inheritance checking of LANG. 

The check for HTTP content-language has not been added.
Comment 2 Allan Sandfeld Jensen 2006-12-29 11:01:51 PST
Created attachment 12108 [details]
Test of inheriting LANG
Comment 3 Allan Sandfeld Jensen 2006-12-29 11:02:46 PST
Created attachment 12109 [details]
Another LANG inheritance test
Comment 4 Sam Weinig 2006-12-29 12:04:33 PST
Comment on attachment 12107 [details]
Proposed patch

Nice to see this finally getting ported over.  I think it would be appropriate to add a comment/FIXME or two about the missing features (ie, no support for HTTP content-language or xml:lang (though as you explained that has not been defined)).  We also usually put the patch, the tests (and their expected results) and the ChangeLogs together in one patch to be easily landed.  We have a prepare-ChangeLog script to make the ChangeLog and svn-create-patch script to create the patch.  

Only one semi-color required here.

+                Node *n = e->parent();;

I also have a few style nits.

We put the * on the otherside

+                Node *n = e->parent();;

Single line if statements don't get curly-braces.

+                    if (n->isElementNode()) {
+                        value = static_cast<Element*>(n)->getAttribute(langAttr);
+                    }

r- for now.
Comment 5 Allan Sandfeld Jensen 2006-12-31 07:23:53 PST
Created attachment 12133 [details]

Patch with comments and test-case
Comment 6 Dave Hyatt 2006-12-31 11:19:22 PST
Comment on attachment 12133 [details]

I'm a bit concerned about the performance of this approach, since you'll end up crawling up a lot of nodes.

We also need to do something about style sharing.  :lang is going to get confused by it right now.
Comment 7 Rob Buis 2007-01-04 12:25:16 PST
Comment on attachment 12133 [details]

>+<html >

No need for the extra space.

>+  p:lang(x) { background:lime }

Please use green here, it is used throughout the layout tests.
The testcase seems fine, be sure to include expected results(.txt,.checksum and .png). You can get them by running run-webkit-tests and run-webkit-tests -p.

>+                // but the it has not been clarified how lang and xml:lang coexists

You probably didnt mean to write "the" there.

>+                Node *n = e->parent();

* needs to right next to Node (see style guidelines).

>+                    /*
>+                    else
>+                    if (n->isDocumentNode()) {
>+                        value = static_cast<Document*>(n)->contentLanguage();
>+                    */

AFAIK code in comments should not be included in patches. Also the code is not 100% according to style guidelines.

>+2006-12-31  Allan Sandfeld Jensen,,,  <set EMAIL_ADDRESS environment variable>

Make sure this line is correct, you need to give your mail, and I see no need for the ,,,

So coding wise this patch seems fine (except some style problems mentioned). As soon as you fix the points above I am fine with r+ing it.
Comment 8 Anne van Kesteren 2007-01-07 16:42:51 PST
Some more testcases: http://tc.labs.opera.com/selectors/lang/
Comment 9 Nicholas Shanks 2007-01-12 09:49:23 PST
Created attachment 12390 [details]
quick and dirty fix for rob's comments :-)
Comment 10 Rob Buis 2007-01-12 11:17:50 PST
Comment on attachment 12390 [details]
quick and dirty fix for rob's comments :-)

>+                        value = static_cast<Element*>(n)->getAttribute(langAttr);
>+                    // FIXME: Document needs to provide the MIME content-language
>+                    n = n->parent();

You don't want the empty lines there.

This patch is much better. However it needs test results included before it can be okayed.

Comment 11 Anne van Kesteren 2007-01-12 15:14:47 PST
Also, HTML5 says how to work with lang and xml:lang. You always take xml:lang. (Note that xml:lang can occur in an HTML DOM as well.)
Comment 12 Allan Sandfeld Jensen 2007-08-11 12:14:49 PDT
Created attachment 15931 [details]
New patch

New patch. This implements parsing of http-equiv content-language. Still doesn't read HTTP headers though.
Comment 13 Allan Sandfeld Jensen 2007-08-11 12:27:31 PDT
Hyatt: The performance is a really a design decision. I've used the same model when implementing the nth-selectors. The idea is to not to let these rarely-used features bleed code-wise or performance-wise over anything else. If :lang selectors become more common you can always improve the performance for a general memory-hit.

Rob: run-webkit-tests doesn't work here. You have to make the test-results yourself.

Hmm. Seems I missed one thing. I'll make a new patch with green results.. 
Comment 14 Allan Sandfeld Jensen 2007-08-11 12:34:52 PDT
Created attachment 15932 [details]

New patch now with green test-results.
Comment 15 Darin Adler 2007-08-11 15:33:49 PDT
Comment on attachment 15932 [details]

This looks really good.

+        if (isHTMLDocument())

Why restrict this to HTML documents? In the past we've found we need these document-level functions for any documents with HTML content, even ones that aren't HTML documents. Does Gecko have this function for non-HTML documents?

+            this->setContentLanguage(content);

Why an explicit "this->" here?

+    String m_contentLanguage;

New fields ought to be private, rather than protected, unless there's some need a derived class needs access. We need to scale back our use of protected to the minimum.

+                // but the it has not been clarified how lang and xml:lang coexists

Need to take out the word "the" here.

+                Node* n = e->parent();
+                while (n && value.isEmpty()) {
+                    n = n->parent();
+                }

I think this would read even better as a for loop rather than a while loop. I also think we have too much code here in-line. I'd like to see it put in a helper function. There are some features we're planning in the future that will require figuring out what the language is for a Node, so it would be nice to have this loop and logic somewhere in the DOM outside the style system. Maybe even:

    AtomicString Node::language()

I think I'm going to say r=me despite these quibbles. But please consider revising the patch.
Comment 16 Mark Rowe (bdash) 2007-08-21 17:54:22 PDT
Allan, it would be great if you could revise the patch to address Darin's comments so that it can easily be landed.
Comment 17 Alexey Proskuryakov 2007-08-21 23:43:23 PDT
The biggest use for lang is probably in the engine itself, see bug 10874. It looks like future hot code.
Comment 18 Dave Hyatt 2007-08-22 02:22:13 PDT
This should presumably land on the feature branch and not on the main trunk.
Comment 19 Eric Seidel (no email) 2007-09-30 12:39:29 PDT
Created attachment 16473 [details]
patch with darin's suggested fixes

I applied darin's suggested fixes to the patch.  Personally I think this code will need to be re-written to store the language off of RenderStyle.  It's more efficient to cascade the language that way, instead of an ancestor crawl.  I'm not sure that that issue in any way prevents this from landing.  It just kinda prevents using :lang in the engine. ;)
Comment 20 mitz 2007-10-01 00:25:02 PDT
Comment on attachment 16473 [details]
patch with darin's suggested fixes

Changes to this file are missing from the patch:
+        * css/CSSStyleSelector.cpp:
Comment 21 Eric Seidel (no email) 2007-10-01 07:44:05 PDT
Created attachment 16482 [details]
even better patch, now including test case results and xml:lang support
Comment 22 Dave Hyatt 2007-10-01 14:49:51 PDT
Comment on attachment 16482 [details]
even better patch, now including test case results and xml:lang support

I think we should fix the perf issue here and hang the language off RenderStyle.  We have time to fix this.
Comment 23 Eric Seidel (no email) 2007-10-02 15:14:36 PDT
I tried, and had some trouble implementing this as a RenderStyle-based hack.  Need to talk with Dave again (or just land the existing working code).
Comment 24 Dave Hyatt 2007-10-03 10:56:57 PDT
I do not think the ancestor crawl should be landed as is.  The problem is that often the kinds of Web pages that use this feature use generic :lang rules that aren't scoped to anything.  You will then end up crawling up the entire ancestor chain for every single unique style in your document.  This *will* hurt performance.

I also disagree with the idea of doing DOM-based inheritance.  I don't think it's necessary, and I think the RenderStyle can hold the current language in this case. 

Maybe you could elaborate on what problem you were having.
Comment 25 Allan Sandfeld Jensen 2007-10-06 06:08:48 PDT
I don't think it is serious performance problem. It is less expensive than a descent selector where the first rule has matched. On sites like slashdot.org they have 100 rules like ".class li" that matches most elements on the page, and this is obviously not a big problem. 

I made a survey of webpages last year, to figure out if it would be worthwhile to speed-up descent selectors, and it turned out webpages in general are very flat.

The only issue that should be considered is the likelyhood of anyone combining a descent and language selector in which case the runtime goes up to square DOM-tree height. I consider it unlikely.

To move the language to RenderStyle means you will need a system for dirtying and recalculating language subtrees whenever a node changes LANG. I really really don't think it is worth it, unless you have a strong need for a constant-time language result in text-layout.
Comment 26 Eric Seidel (no email) 2007-10-06 23:44:45 PDT
I discussed this with hyatt over IRC, and he identified some issues in trying to implement this based on RenderStyle.  His primary issue was cascade order, as if we hang this off of the elements inline style declaration that will be applied *after* the UA stylesheet (which might at some later point contain rules which depend on :lang working correctly).  Hyatt said he would noodle on the problem and get back to me.
Comment 27 Maciej Stachowiak 2007-10-13 18:15:24 PDT
Comment on attachment 16482 [details]
even better patch, now including test case results and xml:lang support

I'm going to flag this as review? again, because I don't think the performance problem is critical and doing lang inheritence more efficiently but still correctly in the face of dynamic changes seems hard. In particular, what if you have <div><span><i> nested, and the <i> inherits its lang from the div, and you add a lang attribute to the span. Does the span need to update the RenderStyle for all its children?

I think we should land this now and optimize later because DOM trees tend to be shallow and many more common types of selectors already need to look at ancestors.
Comment 28 Maciej Stachowiak 2007-10-13 18:16:07 PDT
Comment on attachment 15932 [details]

clearing review flag so this doesn't show up in the commit queue
Comment 29 Dave Hyatt 2007-11-08 15:42:05 PST
dhyatt: i think a single :lang rule (with no qualifiers) will really hurt perf
[5:39pm] dhyatt: every element will crawl up the whole tree
[5:39pm] dhyatt: i don't think it's good enough as is
[5:39pm] dhyatt: existing pages that use this commonly have orphaned :lang rules
[5:39pm] dhyatt: and it's used as a css detection hack i think also
[5:39pm] MacDome: dhyatt: meaning?
[5:39pm] dhyatt: meaning that existing pages will slow down
[5:39pm] MacDome: (by orphan'd rules)
[5:39pm] dhyatt: if we land this
[5:39pm] dhyatt: global :lang rules
[5:40pm] dhyatt: orphaned is the wrong term to use i guess
[5:40pm] MacDome: like *:lang { }
[5:40pm] dhyatt: right
[5:40pm] MacDome: ick
[5:40pm] dhyatt: *:lang will be lethal
[5:40pm] dhyatt: with this patch
[5:40pm] dhyatt: but safari2/3 ignore it
[5:40pm] dhyatt: so it doesn't matter
[5:40pm] dhyatt: and ffx honors it and has an efficient impl
[5:40pm] dhyatt: when it comes to css selectors you just can't check in a slow impl imo
[5:41pm] dhyatt: they're too easily abused
Comment 30 Allan Sandfeld Jensen 2007-11-08 23:04:12 PST
Lethal is a strong word, ".class *" is a semi-common combination and it doesn't destroy performance. The current patch has the exact same performance as "[lang] *".
Comment 31 Eric Seidel (no email) 2008-01-01 22:12:10 PST
Test 33 in Acid3 hits this bug.
Comment 32 Eric Seidel (no email) 2008-01-06 12:46:51 PST
The test number for Acid3 has changed:
Test 34: FAIL (expected 1, got: 0 - descendants inheriting lang=en-GB should be matched by :lang(en))

This patch is still just waiting for mjs & hyatt to debate it out.  We've not come up with any alternative approach yet.  Certainly hanging it off the RenderStyle does not work, as noted by hyatt himself in comments related to this bug.

What could move us closer to landing this?  Would creating a worst-case perf test case help?
Comment 33 Eric Seidel (no email) 2008-01-06 12:49:55 PST
I guess one possibility for a perf-improvement would be to cache the computed :lang value on the renderer during the reverse-walk up the tree for the :lang computation.  We'd just have to know how to invalidate it (and the whole subtree) every time a lang= value changed.
Comment 34 Allan Sandfeld Jensen 2008-01-06 14:45:23 PST
I am not sure it is worthwhile to do the low-grained invalidation of a selector-cache. 

I have a selector-cache implementation for KHTML which caches Descent selectors, it can easily cache :lang selectors as well. The short-cut I decided to do to make the cache really simple and mem-safe was to only use it on recalcStyle(Forced) runs. These runs happens every time a new style-element is loaded or added to the document, and everytime a top-selector in a Descent selector changes (eg. [attr] div, should create a run when attr changes).
Such a cache will effectively give you and O(1) run-time on these ancestor-crawls.  

For the language though, it doesn't solve the fast-access need if you are going to use them in font-selection.
Comment 35 Dave Hyatt 2008-01-07 12:01:57 PST
"Certainly hanging it off the
RenderStyle does not work, as noted by hyatt himself in comments related to
this bug."

I'm not sure how you reached that conclusion Eric.  As this is exactly what Firefox does, obviously such an approach can work.

Comment 36 Eric Seidel (no email) 2008-01-07 20:05:13 PST
(In reply to comment #35)
My comment was a restatement of:

I bet we can dig up the IRC logs if necessary.  You had expressed doubts towards a correct implementation of :lang via RenderStyle, which I was summarizing in comment #26.

Right now we have a patch which works, and some performance concerns related to that patch brought up by a senior engineer in whom I place a lot of trust (that's you).  In order to move forward with this patch, either need to alleviate your performance concerns (with a test case?), or move to a different model (hopefully with some guidance from said senior engineer).

I'd be interested to hear your suggested course of action.  How would you like to see this resolved hyatt?  You'd suggest a RenderStyle based solution?

Again, I understand this is not a mission critical bug.  I appreciate you spending cycles discussing it.
Comment 37 Nicholas Shanks 2008-01-08 04:35:23 PST
It is mission-critical for some people: WebKit users who need to handle multiple languages.
Comment 38 Dave Hyatt 2008-01-08 11:54:50 PST
Ok, let's just land it and file a followup bug to improve the performance.

Comment 39 Dave Hyatt 2008-01-08 11:55:04 PST
Comment on attachment 16482 [details]
even better patch, now including test case results and xml:lang support

Comment 40 Eric Seidel (no email) 2008-01-09 01:25:16 PST
Landed as r29336, bug 16801 was filed to track any perf concerns regarding :lang.