Bug 46800 - getElementsByTagName unable to find SVG camelCase elements imported into HTML
: getElementsByTagName unable to find SVG camelCase elements imported into HTML
Status: RESOLVED FIXED
: WebKit
HTML DOM
: 528+ (Nightly build)
: All All
: P2 Normal
Assigned To:
: http://a.deveria.com/tests/lg-qselect...
:
:
:
  Show dependency treegraph
 
Reported: 2010-09-29 05:31 PST by
Modified: 2012-11-27 11:19 PST (History)


Attachments
test case (523 bytes, text/html)
2010-09-29 07:33 PST, Alexis Deveria
no flags Details
test case in XHTML (This test case passes) (913 bytes, application/xhtml+xml)
2011-09-28 02:34 PST, Dirk Schulze
no flags Details
Patch (10.02 KB, patch)
2012-03-19 19:31 PST, Rob Buis
no flags Review Patch | Details | Formatted Diff | Diff
Patch (23.95 KB, patch)
2012-04-08 11:07 PST, Rob Buis
no flags Review Patch | Details | Formatted Diff | Diff
Patch (24.72 KB, patch)
2012-04-08 15:20 PST, Rob Buis
zimmermann: review+
Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2010-09-29 05:31:58 PST
Neither document.getElementsByTagName('linearGradient') nor document.querySelectorAll('linearGradient') match any linearGradient elements, nor any other elements with camelCase tags. This works as expected in Gecko/Opera.
------- Comment #1 From 2010-09-29 07:33:52 PST -------
Created an attachment (id=69194) [details]
test case

Alerts the amount of linearGradient elements found (should be 1)
------- Comment #2 From 2011-09-27 20:56:42 PST -------
Any update on this? This is an issue with the D3.js visualization framework: you can create camel-cased elements, but it's impossible to select them again.

I assume this bug has something to do with the HTML elements being case-insensitive, while the SVG elements are case-sensitive. My guess is that the selector is internally downcased, but the SVG element names are not, making them impossible to select.

It'd be great if SVG were not case-sensitive when used inside an HTML document. I'm not sure if that's possible, but it would fix this issue and make SVG more accessible besides.
------- Comment #3 From 2011-09-28 01:52:21 PST -------
It doesn't even work on creating the linearGradient with the NS of SVG: 


var a = document.createElementNS('http://www.w3.org/2000/svg', 'linearGradient');
alert(a);  --> [object SVGLinearGradientElement]

document.getElementsByTagName('svg')[0].appendChild(a);
document.getElementsByTagName('linearGradient').length --> 0

I'll check the behavior in XHTML.

Btw. the SVG element was found.
------- Comment #4 From 2011-09-28 02:32:43 PST -------
The example works with a XHTML document. A problem with HTML5?
------- Comment #5 From 2011-09-28 02:34:03 PST -------
Created an attachment (id=108989) [details]
test case in XHTML works

Same test case, but as XHTML document. I get the expected count of linearGradient elements of 1.
------- Comment #6 From 2011-09-28 02:36:50 PST -------
(From update of attachment 108989 [details])
Doesn't work with MIME-type text/html, changing to application/xhtml+xml.
------- Comment #7 From 2011-09-28 02:56:44 PST -------
Looks like it can be reproduced with every SVG element that has a capital letter in it. I tested linearGradient, radialGradient, clipPath, altGlyph and animateColor. All SVG elements that do not have a capital letter in its tag name seem to work. Tested it with pattern, mask and rect without any problems.

A complete list of all SVG element tag names can be found in WebCore/svg/svgtags.in
------- Comment #8 From 2011-09-28 02:58:36 PST -------
Changing component to HTML DOM. Because the example works with XML DOM.
------- Comment #9 From 2011-09-30 01:42:03 PST -------
(In reply to comment #2)
> I assume this bug has something to do with the HTML elements being case-insensitive, while the SVG elements are case-sensitive. My guess is that the selector is internally downcased, but the SVG element names are not, making them impossible to select.
> 
> It'd be great if SVG were not case-sensitive when used inside an HTML document. I'm not sure if that's possible, but it would fix this issue and make SVG more accessible besides.

This assumption seems to be correct. I commented out this line:

http://trac.webkit.org/browser/trunk/Source/WebCore/dom/Node.cpp#L1671

And after that the first example works as well.
------- Comment #10 From 2011-09-30 05:57:32 PST -------
You need to implement:

http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-document-getelementsbytagname

querySelector should already be case-insensitive.
------- Comment #11 From 2011-09-30 12:53:55 PST -------
(In reply to comment #10)
> You need to implement:
> 
> http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-document-getelementsbytagname
> 
> querySelector should already be case-insensitive.
"""
Element nodes in the HTML namespace whose local name is localName converted to ASCII lowercase.
Element nodes, not in the HTML namespace, whose local name is localName.
"""

This is exactly done in WebKit right now. The only problem: the current code assumes that every element in HTML documents belongs to HTML namespace.

SVG elements are not included in the HTML namespace. Therefor getElementsByTagName('lineargradient') shouldn't work in HTML as well.
------- Comment #12 From 2011-10-05 03:19:16 PST -------
(In reply to comment #10)
> You need to implement:
> 
> http://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#dom-document-getelementsbytagname
> 
> querySelector should already be case-insensitive.

I just checked Opera with HTML5 Parser. It has exactly the same problem. Maybe I misunderstand the spec. Would be great if you can explain it Anne. My understanding of:

"""
Element nodes in the HTML namespace whose local name is localName converted to ASCII lowercase.
Element nodes, not in the HTML namespace, whose local name is localName.
"""

is, that just for HTML elements (Element nodes in the HTML namespace) the localName should converted to lowercase.

It might make more sense if the localName gets converted to lowercase if we search in an HTML document, independent of the namespace of the element that we search for? Otherwise we would have to check if other namespace contain the element before converting to lowercase and searching in the HTML namespace?
------- Comment #13 From 2011-10-05 10:51:02 PST -------
Why would that make more sense? Even in an HTML document, the SVG elements still use camelcase.
------- Comment #14 From 2011-10-05 10:59:24 PST -------
(In reply to comment #13)
> Why would that make more sense? Even in an HTML document, the SVG elements still use camelcase.

Why would it make sense to accept 'localName' with big capitals for elements in HTML namespace than?  Why shouldn't engines be as strict for HTML elements as for elements in other namespaces than? I just think that it is confusing that you don't care about capitalization for HTML elements but must care about capitalization on non HTML elements, even if they are all used in the same HTML document.
------- Comment #15 From 2011-11-09 11:20:20 PST -------
We care about it for HTML for backwards compatibility. If this API was never added until now it would do case-sensitive matching for everything. Should we add a note to the specification to point out the HTML behavior is strictly for backwards compatibility?
------- Comment #16 From 2011-12-21 11:22:58 PST -------
My interpretation is that getElementsByTagName and querySelectorAll should be case-sensitive when matching SVG elements in HTML documents, but case-insensitive when matching HTML elements. The problem is that the case-sensitive check currently applies per-document rather per-element. In Node.cpp:

    if (document()->isHTMLDocument())
        name = localName.lower();

So, if you querySelectorAll("linearGradient"), that should match an HTML element with the name <LiNeaRGradIENt> despite the absurd casing, but only match SVG elements of the same exact case: <linearGradient>.
------- Comment #17 From 2012-03-10 20:30:51 PST -------
Any update on this? It continues to be rather unfortunate that there's no way to select camelCased elements within an HTML document; e.g., querySelectorAll("clipPath") returns no results.
------- Comment #18 From 2012-03-14 18:18:44 PST -------
Adding anttik and kling to the bug. The code seems to have changed, the problem stays the same.
------- Comment #19 From 2012-03-19 19:31:55 PST -------
Created an attachment (id=132747) [details]
Patch
------- Comment #20 From 2012-03-19 19:37:49 PST -------
Will that do the right thing in non-HTML documents that contain HTML-namespace elements?
------- Comment #21 From 2012-03-20 04:21:52 PST -------
(From update of attachment 132747 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=132747&action=review

> Source/WebCore/ChangeLog:10
> +        Test: fast/dom/getelementsbyname-localName-matching.html

I think it would be cleanest, if you'd actually test with different document types: foo.xhtml, foo.htm, foo.html, foo.svg as well.
Did you compare your testcase against FF nightlies as well?

> Source/WebCore/dom/TagNodeList.h:49
> +        AtomicString m_loweredLocalName;

Is it worth to cache this? Can't we lower on demand?
------- Comment #22 From 2012-03-20 04:23:29 PST -------
(In reply to comment #20)
> Will that do the right thing in non-HTML documents that contain HTML-namespace elements?
Should be tested, I agree.
------- Comment #23 From 2012-03-20 04:38:29 PST -------
Hi Niko, Boris,

Thanks for the valid remarks. Indeed this could be tested better and Boris usecase is one that will fail with my patch (I think it works with code in ToT). I have a good idea how to fix it tonight, until then feel free to r-.
Cheers,

Rob.

PS: encouraging that the bots remained green.
------- Comment #24 From 2012-03-20 08:35:22 PST -------
One more thing: should a separate bug be spun off for querySelector?
------- Comment #25 From 2012-04-08 11:07:56 PST -------
Created an attachment (id=136149) [details]
Patch
------- Comment #26 From 2012-04-08 11:13:59 PST -------
Hi Niko,

(In reply to comment #21)
> (From update of attachment 132747 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=132747&action=review
> 
> > Source/WebCore/ChangeLog:10
> > +        Test: fast/dom/getelementsbyname-localName-matching.html
> 
> I think it would be cleanest, if you'd actually test with different document types: foo.xhtml, foo.htm, foo.html, foo.svg as well.

I tried to do this with my latest patch.

> Did you compare your testcase against FF nightlies as well?

We match FF 11 on those new test cases.

> > Source/WebCore/dom/TagNodeList.h:49
> > +        AtomicString m_loweredLocalName;
> 
> Is it worth to cache this? Can't we lower on demand?

I don't see how unfortunately.
Cheers,

Rob.
------- Comment #27 From 2012-04-08 13:35:20 PST -------
Can you please be more explicit with the description on your change log? You change a lot more than you actually mention in the ChangeLog. Also, this just fixes getElementsByTagName, right? Can you either rename this bug and open a new bug for query selector, or open a sub bug for getElementsByTagName please?
------- Comment #28 From 2012-04-08 14:38:28 PST -------
Change bug title to reflect we deal with getElementsByTagName only here. Bug 83438 tracks the querySelector part.
------- Comment #29 From 2012-04-08 15:20:56 PST -------
Created an attachment (id=136158) [details]
Patch
------- Comment #30 From 2012-04-08 15:21:35 PST -------
Hi Dirk,

(In reply to comment #27)
> Can you please be more explicit with the description on your change log? You change a lot more than you actually mention in the ChangeLog. Also, this just fixes getElementsByTagName, right? Can you either rename this bug and open a new bug for query selector, or open a sub bug for getElementsByTagName please?

Should all be done now :)
Cheers,

Rob.
------- Comment #31 From 2012-04-09 08:57:28 PST -------
(From update of attachment 136158 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=136158&action=review

Patch looks great. I'm not touching r? yet, as I have a question:

> Source/WebCore/dom/TagNodeList.cpp:59
> +    : TagNodeList(rootNode, namespaceURI, localName)
> +    , m_loweredLocalName(localName.lower())

TagNodeList stores the localName, so why not use m_localName.lower() instead of storing m_loweredLocalName if you need it?

> Source/WebCore/dom/TagNodeList.cpp:67
> +        const AtomicString &localName_ = testNode->isHTMLElement() ? m_loweredLocalName : m_localName;

The & is misplaced, and the variable name looks weird :-)
const AtomicString& localName = testNode->isHTMLElement() ? m_localName.lower() : m_localName.

I think we shouldn't store m_loweredLocalName, unless this shows hot in profiles.
------- Comment #32 From 2012-04-09 11:37:41 PST -------
Hi Niko,

(In reply to comment #31)
> (From update of attachment 136158 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=136158&action=review
> 
> Patch looks great. I'm not touching r? yet, as I have a question:

Sure.

> > Source/WebCore/dom/TagNodeList.cpp:59
> > +    : TagNodeList(rootNode, namespaceURI, localName)
> > +    , m_loweredLocalName(localName.lower())
> 
> TagNodeList stores the localName, so why not use m_localName.lower() instead of storing m_loweredLocalName if you need it?

See below.

> > Source/WebCore/dom/TagNodeList.cpp:67
> > +        const AtomicString &localName_ = testNode->isHTMLElement() ? m_loweredLocalName : m_localName;
> 
> The & is misplaced, and the variable name looks weird :-)

Hmm, should the style checker have caught the &?
I agree that the variable name is not the best :)

> const AtomicString& localName = testNode->isHTMLElement() ? m_localName.lower() : m_localName.

I could add something like that before landing, I should not put up a new patch and get all the bots working again for a one line change I think.

> I think we shouldn't store m_loweredLocalName, unless this shows hot in profiles.

I do think it will be hot. getElementsByTagName will be used a lot in a html context, so it will lowere() for every node it traverses. I think that is why the original code does lower() in Node.cpp and not in the TagNodeList. Note that following the exact matching rules require some comparisons to the lowered one and some to the exact localName, depending on whether the Node is html or not. I considered doing case insensitive comparison instead of lowercase but it seems much slower to me.
------- Comment #33 From 2012-04-09 12:30:26 PST -------
> > I think we shouldn't store m_loweredLocalName, unless this shows hot in profiles.
> 
> I do think it will be hot. getElementsByTagName will be used a lot in a html context, so it will lowere() for every node it traverses. I think that is why the original code does lower() in Node.cpp and not in the TagNodeList. Note that following the exact matching rules require some comparisons to the lowered one and some to the exact localName, depending on whether the Node is html or not. I considered doing case insensitive comparison instead of lowercase but it seems much slower to me.

Sorry I meant it *will be* hot, i.e. requirement for a lowered localName will be high.
------- Comment #34 From 2012-04-10 00:48:15 PST -------
(From update of attachment 136158 [details])
You're right, r=me. Please land with the variable name fixes and the & position fix.
------- Comment #35 From 2012-04-10 09:02:59 PST -------
Fix was landed in r113710.
------- Comment #36 From 2012-05-14 09:45:17 PST -------
Thank you for fixing getElementsByTagName, but will someone please follow up with a fix to querySelector + querySelectorAll (i.e., Bug 83438)? This frequently trips up people trying to use linearGradient, textPath, foreignObject, clipPath, etc. with D3.js.
------- Comment #37 From 2012-11-27 11:19:51 PST -------
Apologies for the ping, but any chance someone could fix the querySelector part of this bug (Bug 83438)?