Bug 83438 - querySelectorAll unable to find SVG camelCase elements imported into HTML
: querySelectorAll unable to find SVG camelCase elements imported into HTML
Status: NEW
: WebKit
HTML DOM
: 528+ (Nightly build)
: All All
: P1 Normal
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2012-04-08 14:37 PST by
Modified: 2013-01-30 09:02 PST (History)


Attachments
test case (564 bytes, text/html)
2012-11-28 15:27 PST, Alexandru Chiculita
no flags Details
Patch V1 (26.45 KB, patch)
2012-12-03 11:52 PST, Alexandru Chiculita
no flags Review Patch | Details | Formatted Diff | Diff
Patch V2 (36.71 KB, patch)
2012-12-05 10:56 PST, Alexandru Chiculita
webkit.review.bot: commit‑queue-
Review Patch | Details | Formatted Diff | Diff
Patch V3 (38.04 KB, patch)
2012-12-05 15:42 PST, Alexandru Chiculita
koivisto: review-
koivisto: commit‑queue-
Review Patch | Details | Formatted Diff | Diff
performance results (4.98 KB, text/plain)
2012-12-05 17:22 PST, Alexandru Chiculita
no flags Details
testcase-svg (2.21 KB, text/html)
2013-01-15 14:27 PST, Alexandru Chiculita
no flags Details
testcase-svg (3.84 KB, text/html)
2013-01-15 15:33 PST, Alexandru Chiculita
no flags Details
Patch V4 (16.66 KB, patch)
2013-01-17 17:13 PST, Alexandru Chiculita
no flags Review Patch | Details | Formatted Diff | Diff
Patch V5 (74.71 KB, patch)
2013-01-29 15:57 PST, Alexandru Chiculita
buildbot: commit‑queue-
Review Patch | Details | Formatted Diff | Diff
Patch V6 (74.69 KB, patch)
2013-01-29 17:21 PST, Alexandru Chiculita
koivisto: review-
koivisto: commit‑queue-
Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2012-04-08 14:37:18 PST
This is related to bug 46000, but covers the querySelector part of finding SVG camelCase elements.
------- Comment #1 From 2012-04-08 14:39:31 PST -------
Correction, related to bug 46800.
------- Comment #2 From 2012-04-10 09:03:43 PST -------
I have no intention of working on this but will try to attach a testcase this week, unless somebody beats me to it.
------- Comment #3 From 2012-04-22 11:06:33 PST -------
*** Bug 74919 has been marked as a duplicate of this bug. ***
------- Comment #4 From 2012-06-28 21:23:37 PST -------
I have a test case here: http://bl.ocks.org/1322814

I would love for this bug to be fixed soon!
------- Comment #5 From 2012-11-28 12:17:30 PST -------
Moving this to P1; this is high priority.
------- Comment #6 From 2012-11-28 15:27:34 PST -------
Created an attachment (id=176588) [details]
test case

Looks like it also affects CSS styling.
------- Comment #7 From 2012-11-28 17:18:23 PST -------
Looks like this is a problem in the CSS Grammar. It always converts the selector tag name to lower case. However, the element tag names are stored in their canonical format, so matching will never succeed. This worked just fine for HTML elements because their canonical format is always lower case.

Also noted that the problem doesn't reproduce when CSS is not parsed as part of an HTML context (ie. SVG documents). The bug might also affect attribute names selectors for attribute names that have both lower case and upper case letters. 

Tested FF and IE10 and they both parse case-insensitive for SVG and case-sensitive for HTML.
Opera parses case-insensitive for HTML too, but getElementsByTagName will still be case-sensitive.
------- Comment #8 From 2012-12-03 11:52:20 PST -------
Created an attachment (id=177299) [details]
Patch V1
------- Comment #9 From 2012-12-03 13:16:45 PST -------
(From update of attachment 177299 [details])
Removing review flags for now. I want to make sure the patch is not adding any performance regressions.
------- Comment #10 From 2012-12-03 13:40:36 PST -------
(From update of attachment 177299 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=177299&action=review

> Source/WebCore/css/SelectorChecker.h:200
> +#if ENABLE(SVG)
> +        if (!element->document()->isHTMLDocument() || element->isSVGElement()) {
> +#else
> +        if (!element->document()->isHTMLDocument()) {
> +#endif

Shouldn't this work for all XML content embeddable in HTML? At least also MathML. Would it be possible to instead apply it to elements that has a non-default namespace?
------- Comment #11 From 2012-12-04 11:12:22 PST -------
(In reply to comment #10)
> (From update of attachment 177299 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=177299&action=review
> 
> > Source/WebCore/css/SelectorChecker.h:200
> > +#if ENABLE(SVG)
> > +        if (!element->document()->isHTMLDocument() || element->isSVGElement()) {
> > +#else
> > +        if (!element->document()->isHTMLDocument()) {
> > +#endif
> 
> Shouldn't this work for all XML content embeddable in HTML? At least also MathML. Would it be possible to instead apply it to elements that has a non-default namespace?

You're right. However, all the elements in an HTML page were matched using case-insensitive comparison. Chancing that for MathML and/or SVG would for sure break content that was tested to work only in WebKit.

For that reason, I think my initial patch is wrong anyway, even though it was aligning the behavior for SVG with other browsers. 

I will post a new patch in which I will just preserve the current behavior: case-insensitive matching for all elements in HTML documents. The patch will only fix the mixed-case tag name that only happens for some SVG elements.
------- Comment #12 From 2012-12-05 10:56:10 PST -------
Created an attachment (id=177795) [details]
Patch V2
------- Comment #13 From 2012-12-05 13:05:05 PST -------
(From update of attachment 177795 [details])
Attachment 177795 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/15138750

New failing tests:
css3/selectors3/xml/css3-modsel-181.xml
css3/selectors3/xhtml/css3-modsel-181.xml
------- Comment #14 From 2012-12-05 15:42:36 PST -------
Created an attachment (id=177850) [details]
Patch V3

Re-added the change in StyleResolver::checkSelector that got lost during a previous rebase. This should fix the two failing tests.
------- Comment #15 From 2012-12-05 16:30:03 PST -------
(From update of attachment 177850 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=177850&action=review

> Source/WebCore/css/SelectorChecker.h:208
> -    const AtomicString& localName = selector->tag().localName();
> -    if (localName != starAtom && localName != element->localName())
> -        return false;
> +    const QualifiedName& selectorTag = selector->tag();
> +    if (selectorTag.localName() != starAtom) {
> +        if (element->document()->isHTMLDocument()) {
> +            // All elements inside HTML documents will use case-insensitive matching.
> +            // That applies even to SVG and MathML elements embedded in HTML. Other browsers
> +            // match SVG using case-sensitive comparison, but WebKit has always 
> +            // supported case-insensitive matching and we need to preserve that behavior
> +            // for backwards compatibility reasons.
> +            if (selectorTag.localNameUpper() != element->tagQName().localNameUpper())
> +                return false;
> +        } else {
> +            // XML based documents (XHTML and SVG) need to use case sensitive matching.
> +            if (selectorTag.localName() != element->tagQName().localName())
> +                return false;
> +        }
> +    }

The code path here is very hot and this seems certain to regress performance. I think you need to find more efficient ways to handle these rare edge cases.
------- Comment #16 From 2012-12-05 16:41:56 PST -------
(In reply to comment #15)
> (From update of attachment 177850 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=177850&action=review
> 
> > Source/WebCore/css/SelectorChecker.h:208
> > -    const AtomicString& localName = selector->tag().localName();
> > -    if (localName != starAtom && localName != element->localName())
> > -        return false;
> > +    const QualifiedName& selectorTag = selector->tag();
> > +    if (selectorTag.localName() != starAtom) {
> > +        if (element->document()->isHTMLDocument()) {
> > +            // All elements inside HTML documents will use case-insensitive matching.
> > +            // That applies even to SVG and MathML elements embedded in HTML. Other browsers
> > +            // match SVG using case-sensitive comparison, but WebKit has always 
> > +            // supported case-insensitive matching and we need to preserve that behavior
> > +            // for backwards compatibility reasons.
> > +            if (selectorTag.localNameUpper() != element->tagQName().localNameUpper())
> > +                return false;
> > +        } else {
> > +            // XML based documents (XHTML and SVG) need to use case sensitive matching.
> > +            if (selectorTag.localName() != element->tagQName().localName())
> > +                return false;
> > +        }
> > +    }
> 
> The code path here is very hot and this seems certain to regress performance. I think you need to find more efficient ways to handle these rare edge cases.

I've included some performance tests in the patch and according to my tests it should have no performance regressions. I will post some results.
------- Comment #17 From 2012-12-05 17:22:48 PST -------
Created an attachment (id=177878) [details]
performance results

Test results of running the performance tests added in this patch.
------- Comment #18 From 2012-12-06 00:36:01 PST -------
(In reply to comment #11)
> (In reply to comment #10)
> > (From update of attachment 177299 [details] [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=177299&action=review
> > 
> > > Source/WebCore/css/SelectorChecker.h:200
> > > +#if ENABLE(SVG)
> > > +        if (!element->document()->isHTMLDocument() || element->isSVGElement()) {
> > > +#else
> > > +        if (!element->document()->isHTMLDocument()) {
> > > +#endif
> > 
> > Shouldn't this work for all XML content embeddable in HTML? At least also MathML. Would it be possible to instead apply it to elements that has a non-default namespace?
> 
> You're right. However, all the elements in an HTML page were matched using case-insensitive comparison. Chancing that for MathML and/or SVG would for sure break content that was tested to work only in WebKit.
> 
Do you have any examples of that, or reason to believe that would happen in common cases? I could imagine all upper-case, but I find it unlikely camel-case tags would be common. Unless this is a problem in important or common cases, shouldn't we still fix it?
------- Comment #19 From 2012-12-06 10:16:44 PST -------
(From update of attachment 177850 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=177850&action=review

> Source/WebCore/css/SelectorChecker.h:200
> +            // All elements inside HTML documents will use case-insensitive matching.
> +            // That applies even to SVG and MathML elements embedded in HTML. Other browsers
> +            // match SVG using case-sensitive comparison, but WebKit has always 
> +            // supported case-insensitive matching and we need to preserve that behavior
> +            // for backwards compatibility reasons.

It seems to me that we should just match what other browsers do. Is this specced somewhere?
------- Comment #20 From 2013-01-11 18:06:57 PST -------
Progress on this bug seems to have stalled but we were so close! @Alexandru, is there any chance of reviving this?
------- Comment #21 From 2013-01-14 09:06:41 PST -------
(In reply to comment #20)
> Progress on this bug seems to have stalled but we were so close! @Alexandru, is there any chance of reviving this?

Sorry about that, I will get back to this starting today.
------- Comment #22 From 2013-01-14 20:19:52 PST -------
(In reply to comment #21)
> (In reply to comment #20)
> > Progress on this bug seems to have stalled but we were so close! @Alexandru, is there any chance of reviving this?
> 
> Sorry about that, I will get back to this starting today.

The results on TypeNameSelectors look like a show-stopper, unfortunately :/. As Antti mentioned, SelectorChecker::tagMatches(...) is going to be piping hot on perf tests. An idea: there are relatively few mixed-case SVG tags. Could you do a single static hash lookup before the matching loop to see if the tag is one of the mixed-case SVG tags?
------- Comment #23 From 2013-01-15 14:27:23 PST -------
Created an attachment (id=182846) [details]
testcase-svg

Added a better test case to show the differences visually.
------- Comment #24 From 2013-01-15 15:33:47 PST -------
Created an attachment (id=182857) [details]
testcase-svg

Used linearGradient instead of foreignObject to make it work on all browsers.
------- Comment #25 From 2013-01-17 17:13:08 PST -------
Created an attachment (id=183324) [details]
Patch V4

This patch makes the testcase-svg pass. It's work in progress, no changelogs, no new tests. Just saving my work so far.
------- Comment #26 From 2013-01-24 09:16:48 PST -------
(In reply to comment #25)
> Created an attachment (id=183324) [details] [details]
> Patch V4
> 
> This patch makes the testcase-svg pass. It's work in progress, no changelogs, no new tests. Just saving my work so far.

My previous approach will not work anymore as there's been some changes in the CSSSelector.

Another approach would be to have a new m_match type in CSSSelector that is only going to be created for mixed-case tag selectors in HTML documents. It will use the RareData structure to store the qualified name with lower case + the mixed-case version. I could reuse the m_attribute and m_argument for that reason.

How does that sound? It should keep common lower case selectors as fast as before and will only impact performance on tags with mixed-case letters.

It seems like fields of the RareData structure could be collected in an union, to save some space. That would be a different patch, and will have to come before the fix for this bug. Let me know if that's a good idea or not and I could work on that too.
------- Comment #27 From 2013-01-29 15:57:06 PST -------
Created an attachment (id=185327) [details]
Patch V5
------- Comment #28 From 2013-01-29 16:46:32 PST -------
(From update of attachment 185327 [details])
Attachment 185327 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://queues.webkit.org/results/16198367

New failing tests:
fast/css/css-selector-text.html
fast/css/css-set-selector-text.html
------- Comment #29 From 2013-01-29 16:57:58 PST -------
(In reply to comment #28)
> (From update of attachment 185327 [details] [details])
> Attachment 185327 [details] [details] did not pass mac-wk2-ews (mac-wk2):
> Output: http://queues.webkit.org/results/16198367
> 
> New failing tests:
> fast/css/css-selector-text.html
> fast/css/css-set-selector-text.html

I'm not sure if I need to update the tests or fix the implementation. The issue is that CSSParser::updateSpecifiersWithElementName was not propagating the tagIsForNamespaceRule on the last "custom element" case.

The two tests fail with similar assets like the following one:

FAIL parseThenSerializeRule('::-webkit-file-upload-button { }') should be *::-webkit-file-upload-button { }. Was ::-webkit-file-upload-button { }.

It seems like the new implementation does the right thing, as in it preserves the selector as it was parsed. However, I don't know if the "*" was forced to get a "canonical form" instead.

I will revert that fix in this patch, as the change is orthogonal to the curent bug.
------- Comment #30 From 2013-01-29 17:21:50 PST -------
Created an attachment (id=185355) [details]
Patch V6

Fixed the two failing tests.
------- Comment #31 From 2013-01-30 03:01:11 PST -------
(From update of attachment 185355 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=185355&action=review

> Source/WebCore/ChangeLog:14
> +        However, non-HTML elements embedded in HTML need to follow the XML rule instead. They need to be matched using case-sensitive comparison.
> +        That's aligned with the behavior of FF and IE10. However, because the CSS parser already lower-cased the tag name, the matching never
> +        works for elements like "foreignObject" or "textPath", as their canonical form has both lower-case and upper-case letters.

It may be aligned with FF and IE10 but is it correct? It is still not clear to me that these complications make sense. Shouldn't everything within a document type match the same?

> Source/WebCore/css/CSSGrammar.y.in:1215
> +            AtomicString mixedCaseTagName;
> +            if (parser->m_context.isHTMLDocument && !$2.isLowerCase()) {
> +                mixedCaseTagName = $2;
> +                $2.lower();
> +            }
> +            parser->updateSpecifiersWithElementName($1, $2, mixedCaseTagName, $$);

So if someone writes selectors in capital letters DIV { foo: bar } he will end up with rare datas. That seems bad. Can this be limited to known SVG names only or something?

> Source/WebCore/css/SelectorChecker.cpp:203
> +        case CSSSelector::MixedCaseTag:
> +            if (!fastCheckSingleSelector<checkMixedCaseTagValue>(selector, element, topChildOrSubselector, topChildOrSubselectorMatchElement))
> +                return false;
> +            break;

Please don't add this to the fast path. These are supposed to be rare (they are rare in data after all!) and making fast path bigger can make it less fast.
------- Comment #32 From 2013-01-30 09:02:38 PST -------
(In reply to comment #31)
> (From update of attachment 185355 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=185355&action=review
> 
> > Source/WebCore/ChangeLog:14
> > +        However, non-HTML elements embedded in HTML need to follow the XML rule instead. They need to be matched using case-sensitive comparison.
> > +        That's aligned with the behavior of FF and IE10. However, because the CSS parser already lower-cased the tag name, the matching never
> > +        works for elements like "foreignObject" or "textPath", as their canonical form has both lower-case and upper-case letters.
> 
> It may be aligned with FF and IE10 but is it correct? It is still not clear to me that these complications make sense. Shouldn't everything within a document type match the same?
> 
The specification is not clear about the right behavior, so I thought we already decided that the right approach is to see what other browsers do.

> > Source/WebCore/css/CSSGrammar.y.in:1215
> > +            AtomicString mixedCaseTagName;
> > +            if (parser->m_context.isHTMLDocument && !$2.isLowerCase()) {
> > +                mixedCaseTagName = $2;
> > +                $2.lower();
> > +            }
> > +            parser->updateSpecifiersWithElementName($1, $2, mixedCaseTagName, $$);
> 
> So if someone writes selectors in capital letters DIV { foo: bar } he will end up with rare datas. That seems bad. Can this be limited to known SVG names only or something?
I can limit it to only the right SVG names.

> > Source/WebCore/css/SelectorChecker.cpp:203
> > +        case CSSSelector::MixedCaseTag:
> > +            if (!fastCheckSingleSelector<checkMixedCaseTagValue>(selector, element, topChildOrSubselector, topChildOrSubselectorMatchElement))
> > +                return false;
> > +            break;
> 
> Please don't add this to the fast path. These are supposed to be rare (they are rare in data after all!) and making fast path bigger can make it less fast.
My tests showed that it didn't have any regression, but if we limit just to SVG elements, it makes sense to be removed from the fast path.