Bug 71152 - selectors should match attribute name with case sensitivity based on element & document type
Summary: selectors should match attribute name with case sensitivity based on element ...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: CSS (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Rob Buis
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-10-28 15:47 PDT by Rafael Weinstein
Modified: 2013-08-01 18:12 PDT (History)
13 users (show)

See Also:


Attachments
Patch (9.80 KB, patch)
2013-07-30 11:58 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (10.53 KB, patch)
2013-07-30 14:36 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (10.51 KB, patch)
2013-07-30 15:39 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (10.54 KB, patch)
2013-07-31 14:19 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (24.74 KB, patch)
2013-08-01 08:59 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (27.09 KB, patch)
2013-08-01 14:36 PDT, Rob Buis
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Rafael Weinstein 2011-10-28 15:47:03 PDT
E.G. in, the following document:

<svg width="100%" height="100%" viewBox="0 0 100 1"
  xmlns="http://www.w3.org/2000/svg" version="1.1">

	<style type="text/css"><![CDATA[
		path { fill:none; stroke:black; stroke-width:1 }
	]]></style>

	<path d="M0,0 L100,0" pathLength="200" stroke-dasharray="100,100"/>
</svg>

   document.querySelector('*[pathLength]');

should return the path element. This currently only works if the document *isn't* html. If it is html then the above selector matches nothing because the selector parser grammar lower-cases the selectorAttributeName

From reading the spec & talking with Hixie & Ojan, the correct for matching attribute names appears to be:

-if html document & html element: match case-insensitive
-otherwise match case-sensitive

FWIW, This is Gecko's current behavior.

I'm happy to work on a patch for this if a likely reviewer agrees this is the correct behavior.
Comment 1 Ojan Vafai 2011-10-28 19:36:45 PDT
To clarify, the selectors spec says attribute matching is case-sensitive as per the document type and the HTML spec says HTML element attributes are case-insensitive but SVG element attributes are case-sensitive. So, attribute selectors need to be checked case-sensitively for SVG (and MATHML?).
Comment 2 Ojan Vafai 2011-10-30 19:20:15 PDT
The relevant commit that describes almost the behavior Rafael described. With the exception of class names and ids in standards mode documents:
"Everything else (attribute values on HTML elements, IDs and classes in no-quirks mode and limited-quirks mode and element names, attribute names, and attribute values in XML documents) must be treated as case-sensitive for the purposes of selector matching."
Comment 4 Rob Buis 2013-07-30 11:58:31 PDT
Created attachment 207753 [details]
Patch
Comment 5 Darin Adler 2013-07-30 12:19:48 PDT
Comment on attachment 207753 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=207753&action=review

I was close to a review+ on this, but I think the really tricky changes to QualifiedName and Attribute::matches need at least some minor tweaks before this is OK to land.

> Source/WebCore/dom/Attribute.h:54
> +    bool matches(const QualifiedName&, const AtomicString&) const;

This second argument needs a argument name. It’s not at all clear what it is. And I think this function is now not self-explanatory any more, so we probably need a comment, now, too.

> Source/WebCore/dom/Attribute.h:75
> +inline bool Attribute::matches(const QualifiedName& qualifiedName, const AtomicString& lname) const

Please don’t use an abbreviation, “lname”. You could use localName and then write localName != this->localName(). Or some other non-abbreviation.

> Source/WebCore/dom/QualifiedName.cpp:156
>  const AtomicString& QualifiedName::localNameUpper() const
>  {
> -    if (!m_impl->m_localNameUpper)
> -        m_impl->m_localNameUpper = m_impl->m_localName.upper();
> -    return m_impl->m_localNameUpper;
> +    if (!m_impl->m_localNameUpperLower)
> +        m_impl->m_localNameUpperLower = m_impl->m_localName.upper();
> +    return m_impl->m_localNameUpperLower;
> +}
> +
> +const AtomicString& QualifiedName::localNameLower() const
> +{
> +    if (!m_impl->m_localNameUpperLower)
> +        m_impl->m_localNameUpperLower = m_impl->m_localName.lower();
> +    return m_impl->m_localNameUpperLower;
>  }

How does this work? It seems that if you call localNameUpper and then localNameLower, you will get an uppercased copy. That seems like a very dangerous design, even if there is some crazy reason it works!

If we really are going to rely on people calling only one or the other on each QualifiedName, then I think we need a debug-only flag to catch if we ever do that wrong at runtime at least.
Comment 6 Darin Adler 2013-07-30 12:20:22 PDT
Comment on attachment 207753 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=207753&action=review

> LayoutTests/svg/custom/querySelector-attribute-selector-caseSensitive-expected.txt:4
> +[object SVGPathElement]
> +null
> +[object HTMLDivElement]
> +[object HTMLDivElement]

This test output is not at all clear. It’s much better when tests make it clear what their testing is rather than just dumping out some strings. The pass/fail style with expected values is much better.
Comment 7 Rob Buis 2013-07-30 12:24:12 PDT
(In reply to comment #6)
> (From update of attachment 207753 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=207753&action=review
> 
> > LayoutTests/svg/custom/querySelector-attribute-selector-caseSensitive-expected.txt:4
> > +[object SVGPathElement]
> > +null
> > +[object HTMLDivElement]
> > +[object HTMLDivElement]
> 
> This test output is not at all clear. It’s much better when tests make it clear what their testing is rather than just dumping out some strings. The pass/fail style with expected values is much better.

You are right, I'll fix it.
Comment 8 Rob Buis 2013-07-30 12:27:27 PDT
Hi Darin,

Thanks for the quick review!

(In reply to comment #5)
> (From update of attachment 207753 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=207753&action=review
> 
> I was close to a review+ on this, but I think the really tricky changes to QualifiedName and Attribute::matches need at least some minor tweaks before this is OK to land.

Yeah, my main objective was to check whether it causes any regressions and maybe to trigger some discussion.

> > Source/WebCore/dom/Attribute.h:54
> > +    bool matches(const QualifiedName&, const AtomicString&) const;
> 
> This second argument needs a argument name. It’s not at all clear what it is. And I think this function is now not self-explanatory any more, so we probably need a comment, now, too.

Agreed, I'll think about that one.

> > Source/WebCore/dom/Attribute.h:75
> > +inline bool Attribute::matches(const QualifiedName& qualifiedName, const AtomicString& lname) const
> 
> Please don’t use an abbreviation, “lname”. You could use localName and then write localName != this->localName(). Or some other non-abbreviation.

Yeah that was a bit lazy, will fix.

> > Source/WebCore/dom/QualifiedName.cpp:156
> >  const AtomicString& QualifiedName::localNameUpper() const
> >  {
> > -    if (!m_impl->m_localNameUpper)
> > -        m_impl->m_localNameUpper = m_impl->m_localName.upper();
> > -    return m_impl->m_localNameUpper;
> > +    if (!m_impl->m_localNameUpperLower)
> > +        m_impl->m_localNameUpperLower = m_impl->m_localName.upper();
> > +    return m_impl->m_localNameUpperLower;
> > +}
> > +
> > +const AtomicString& QualifiedName::localNameLower() const
> > +{
> > +    if (!m_impl->m_localNameUpperLower)
> > +        m_impl->m_localNameUpperLower = m_impl->m_localName.lower();
> > +    return m_impl->m_localNameUpperLower;
> >  }
> 
> How does this work? It seems that if you call localNameUpper and then localNameLower, you will get an uppercased copy. That seems like a very dangerous design, even if there is some crazy reason it works!
> 
> If we really are going to rely on people calling only one or the other on each QualifiedName, then I think we need a debug-only flag to catch if we ever do that wrong at runtime at least.

Agreed, I verified that it works in theory because localNameUpper is only used for elements, not attributes. But you are right that asserting would be better, will try to fix.
Comment 9 Rob Buis 2013-07-30 14:36:16 PDT
Created attachment 207766 [details]
Patch
Comment 10 Darin Adler 2013-07-30 14:59:56 PDT
Comment on attachment 207766 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=207766&action=review

> Source/WebCore/dom/QualifiedName.cpp:148
> +    ASSERT(m_impl->m_localNameUpperLower.impl() == m_impl->m_localName.upper().impl());

This assertion will fail. You need to check if the strings are equal, not that they have the same StringImpl pointer. I think you should just remove the calls to impl().

> Source/WebCore/dom/QualifiedName.cpp:156
> +    ASSERT(m_impl->m_localNameUpperLower.impl() == m_impl->m_localName.lower().impl());

Same problem.
Comment 11 Rob Buis 2013-07-30 15:39:19 PDT
Created attachment 207776 [details]
Patch
Comment 12 Rob Buis 2013-07-31 14:19:26 PDT
Created attachment 207875 [details]
Patch
Comment 13 Darin Adler 2013-07-31 15:25:33 PDT
Comment on attachment 207875 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=207875&action=review

I’m pretty sure that there is a problem sharing m_localNameUpperLower. The same QualifiedNameImpl can be used in a context where localNameUpper is called on it and another where localNameLower needs to be called on it. Not the same qualified name, but the same QualifiedNameImpl from the global table.

> Source/WebCore/dom/QualifiedName.h:91
> +    // Uppercased/lowercased localName, cached for efficiency

I wish this made it clear that you can only call one or the other, not both, on any given QualifiedName.

I still don’t understand how we can guarantee that. If the two qualified names are identical, won’t we share the same underlying QualifiedNameImpl? So can’t we use the equal qualified name in two separate contexts and make the assertions fire!?!
Comment 14 Darin Adler 2013-07-31 15:26:13 PDT
(In reply to comment #8)
> I verified that it works in theory because localNameUpper is only used for elements, not attributes.

But an element and an attribute could have the same qualified name, right?
Comment 15 Rob Buis 2013-07-31 15:34:25 PDT
(In reply to comment #14)
> (In reply to comment #8)
> > I verified that it works in theory because localNameUpper is only used for elements, not attributes.
> 
> But an element and an attribute could have the same qualified name, right?

I see your point, that can happen :( This part was always hacky, I think I can spot a way to do it differently, will try to come up with a new patch, thanks again for the helpful reviews.
Comment 16 Rob Buis 2013-08-01 08:59:44 PDT
Created attachment 207930 [details]
Patch
Comment 17 Darin Adler 2013-08-01 11:47:40 PDT
Comment on attachment 207930 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=207930&action=review

> Source/WebCore/css/CSSSelector.h:207
> +        const AtomicString& attributeLocalNameLower() const;

This is not the right name for the function, since the name is only lowercase for an HTML document. We need to find a different name. Maybe with the word “canonical” or something like that. Maybe attributeCanonicalLowerName?

> Source/WebCore/css/CSSSelector.h:267
> +            AtomicString m_attributeLocalNameLower;

Same comment as above.

> Source/WebCore/css/SelectorChecker.cpp:363
> +        if (!attributeItem->matches(selectorAttr.prefix(), element->isHTMLElement() ? selector->attributeLocalNameLower() : selectorAttr.localName(), selectorAttr.namespaceURI()))

Is the HTMLElement check here needed? Can’t we always use attributeLocalNameLower here?

> Source/WebCore/css/SelectorChecker.cpp:393
> +        const QualifiedName& attr = selector->attribute();

I think the code would read better without this local variable.

> Source/WebCore/css/SelectorChecker.h:135
> +    const AtomicString& localName = element->isHTMLElement() ? selector->attributeLocalNameLower() : selectorAttributeName.localName();

Can’t we use attributeLocalNameLower unconditionally?
Comment 18 Rob Buis 2013-08-01 13:38:54 PDT
(In reply to comment #17)
> (From update of attachment 207930 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=207930&action=review
> 
> > Source/WebCore/css/CSSSelector.h:207
> > +        const AtomicString& attributeLocalNameLower() const;
> 
> This is not the right name for the function, since the name is only lowercase for an HTML document. We need to find a different name. Maybe with the word “canonical” or something like that. Maybe attributeCanonicalLowerName?

You are right, that would be more clear. Note that it only does this for HTML documents only to avoid doing unnecessary work, that is the only reason.

> > Source/WebCore/css/CSSSelector.h:267
> > +            AtomicString m_attributeLocalNameLower;
> 
> Same comment as above.

Agreed, will fix.

> > Source/WebCore/css/SelectorChecker.cpp:363
> > +        if (!attributeItem->matches(selectorAttr.prefix(), element->isHTMLElement() ? selector->attributeLocalNameLower() : selectorAttr.localName(), selectorAttr.namespaceURI()))
> 
> Is the HTMLElement check here needed? Can’t we always use attributeLocalNameLower here?

I think it is needed, it is reflected in the bug title too. The case where this happens is having a HTML document with a foreign namespace element, i.e. SVG or MathML. It seems this behavior is chosen to match getElementsByTagName, see:

http://w3-org.9356.n7.nabble.com/selectors-Case-sensitivity-of-type-and-attribute-selectors-td240849.html

In particular this link on the thread details it best:

http://www.whatwg.org/specs/web-apps/current-work/multipage/selectors.html#selectors

The fact that FireFox results are the same on the testcases in the patch suggests it is the right approach.

> > Source/WebCore/css/SelectorChecker.cpp:393
> > +        const QualifiedName& attr = selector->attribute();
> 
> I think the code would read better without this local variable.

I'll check.

> > Source/WebCore/css/SelectorChecker.h:135
> > +    const AtomicString& localName = element->isHTMLElement() ? selector->attributeLocalNameLower() : selectorAttributeName.localName();
> 
> Can’t we use attributeLocalNameLower unconditionally?

See earlier explanation.
I would be loved to proven wrong on that since obviously it would be more efficient but I don't see a way around it given foreign elements/attribute names in HTML documents compare using case-sensitivity.
Comment 19 Rob Buis 2013-08-01 14:36:23 PDT
Created attachment 207954 [details]
Patch
Comment 20 Darin Adler 2013-08-01 16:01:45 PDT
Comment on attachment 207954 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=207954&action=review

r=me but I think we have the name wrong

> Source/WebCore/css/CSSSelector.h:207
> +        const AtomicString& attributeCanonicalLowerName() const;

Did you mean attributeCanonicalLocalName?
Comment 21 Rob Buis 2013-08-01 18:12:22 PDT
Committed r153631: <http://trac.webkit.org/changeset/153631>