Bug 133292

Summary: Class name matching should use ASCII case-insensitive matching, not Unicode case folding
Product: WebKit
Component: DOM
Status: RESOLVED FIXED
Severity: Normal
Priority: P2
Version: 528+ (Nightly build)
Hardware: All
OS: All
Reporter: Darin Adler <darin>
Assignee: Darin Adler <darin>
CC: benjamin, cmarcelo, commit-queue, esprehn+autocc, kangil.han, ossy, rniwa
Attachments:
Description Flags
Patch andersca: review+

Description Darin Adler 2014-05-26 12:07:18 PDT
Class name matching should use ASCII case-insensitive matching, not Unicode case folding
Comment 1 Darin Adler 2014-05-26 12:42:36 PDT
Created attachment 232091 [details]
Patch
Comment 2 Darin Adler 2014-05-26 12:46:39 PDT
This same kind of mistake seems common in other parts of the DOM. I’ll have to look at various places that call lower, equalIgnoringCase, and equalPossiblyIgnoringCase to find these mistakes, build more test cases, and then use this new function in more places. I think we’ll also need to add an equalASCIICaseInsensitive function to replace equalIgnoringCase in most cases. I’m taking the terminology from <http://dom.spec.whatwg.org/#ascii-case-insensitive>.
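
For illustration, here is a minimal sketch of what an ASCII case-insensitive comparison could look like. It is not the code from the patch: the name equalASCIICaseInsensitive is taken from the comment above, while the toASCIILower helper and the use of std::u16string_view instead of WTF string types are assumptions made to keep the example self-contained.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: lowercases only U+0041..U+005A (A-Z).
// Every other code unit, including non-ASCII letters, is returned unchanged.
inline char16_t toASCIILower(char16_t c)
{
    return (c >= u'A' && c <= u'Z') ? static_cast<char16_t>(c + (u'a' - u'A')) : c;
}

// Sketch of the proposed function, written against standard library types
// rather than WTF::String (an assumption for this example).
inline bool equalASCIICaseInsensitive(std::u16string_view a, std::u16string_view b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (toASCIILower(a[i]) != toASCIILower(b[i]))
            return false;
    }
    return true;
}
```

With this sketch, u"FooBar" and u"fOObAR" compare equal, but u"k" and U+212A KELVIN SIGN do not, even though Unicode case folding maps U+212A to "k". That difference is the kind of mismatch this bug is about.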
Comment 3 Anders Carlsson 2014-05-26 12:53:35 PDT
Comment on attachment 232091 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=232091&action=review

> Source/WebCore/dom/SpaceSplitString.cpp:197
> +    if (SpaceSplitStringData* data = table.get(keyString))
> +        return data;

Can you .add nullptr here instead and avoid the other hash lookup below?
Comment 4 Darin Adler 2014-05-26 12:55:24 PDT
Comment on attachment 232091 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=232091&action=review

>> Source/WebCore/dom/SpaceSplitString.cpp:197
>> +        return data;
> 
> Can you .add nullptr here instead and avoid the other hash lookup below?

I can’t because the string might have no tokens in it. In that case, we do not add anything to the table.

This is actually a problem. It means that if we run into a string with no tokens repeatedly, we do hash table lookups and tokenization every time a SpaceSplitString is created for it!
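
One way to picture the trade-off discussed above is the sketch below. It is a hypothetical cache built on std::unordered_map, not WebKit's actual table: TokenCache, TokenData, and tokenizeCount are invented names, and whitespace splitting via std::istringstream only approximates HTML space-character rules. It stores a null placeholder for token-less strings, so each key is looked up once per call and tokenized at most once.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for SpaceSplitStringData: just the list of tokens.
struct TokenData {
    std::vector<std::string> tokens;
};

// Hypothetical cache. A null value records "this key contains no tokens".
class TokenCache {
public:
    std::shared_ptr<TokenData> get(const std::string& key)
    {
        // try_emplace performs the lookup and, on a miss, inserts a null
        // placeholder, so no second hash lookup is needed afterwards.
        auto [it, isNew] = m_table.try_emplace(key, nullptr);
        if (!isNew)
            return it->second; // Cache hit, including the cached "no tokens" case.

        ++m_tokenizeCount;
        std::istringstream stream(key);
        std::vector<std::string> tokens;
        for (std::string token; stream >> token;)
            tokens.push_back(token);

        // Leave the placeholder null when the string had no tokens.
        if (!tokens.empty())
            it->second = std::make_shared<TokenData>(TokenData { std::move(tokens) });
        return it->second;
    }

    // Number of times a key actually had to be tokenized.
    int tokenizeCount() const { return m_tokenizeCount; }

private:
    std::unordered_map<std::string, std::shared_ptr<TokenData>> m_table;
    int m_tokenizeCount = 0;
};
```

The cost of this design is that the table now holds entries for strings with no tokens. Whether that fits the real table in SpaceSplitString.cpp, whose entry lifetime is not described in this thread, is untested.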
Comment 5 Darin Adler 2014-05-26 12:58:56 PDT
Committed r169358: <http://trac.webkit.org/changeset/169358>
Comment 6 Benjamin Poulain 2014-05-26 13:25:43 PDT
Comment on attachment 232091 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=232091&action=review

Rename shouldFoldCase -> convertToASCIILowercase as the argument of SpaceSplitString::set()?

> LayoutTests/ChangeLog:11
> +        * fast/dom/getElementsByClassName/ASCII-case-insensitive-expected.txt: Added.
> +        * fast/dom/getElementsByClassName/ASCII-case-insensitive.html: Added.
> +        * fast/dom/getElementsByClassName/case-sensitive-expected.txt: Added.
> +        * fast/dom/getElementsByClassName/case-sensitive.html: Added.

Also test that DOMSettableTokenList is not affected?
Comment 7 Csaba Osztrogonác 2014-05-27 03:07:44 PDT
(In reply to comment #5)
> Committed r169358: <http://trac.webkit.org/changeset/169358>

It broke the Apple Windows build:

     1>WebKit.exp : error LNK2001: unresolved external symbol "private: static class WTF::PassRefPtr<class WTF::StringImpl> __cdecl WTF::AtomicString::addSlowCase(class WTF::StringImpl *)" (?addSlowCase@AtomicString@WTF@@CA?AV?$PassRefPtr@VStringImpl@WTF@@@2@PAVStringImpl@2@@Z)
     1>C:\cygwin\home\buildbot\slave\win-release\build\WebKitBuild\Release\bin32\WebKit.dll : fatal error LNK1120: 1 unresolved externals
     1>Done Building Project "C:\cygwin\home\buildbot\slave\win-release\build\Source\WebKit\WebKit.vcxproj\WebKit\WebKit.vcxproj" (Build target(s)) -- FAILED.
Comment 8 Csaba Osztrogonác 2014-05-27 03:26:09 PDT
Fix landed in http://trac.webkit.org/changeset/169376
Comment 9 Darin Adler 2014-05-27 13:27:33 PDT
(In reply to comment #6)
> Also test that DOMSettableTokenList is not affected?

Sure, I’ll add a test.

I’ve learned that anywhere HTML specifies case-insensitive matching, it’s ASCII case-insensitive matching. So I will be following this patch up with changes to lots of other places in HTML where we don’t need to do general case folding, but rather simply ASCII case folding.

I have started work on some patches for this.

By the way, one place this patch missed was the CSS selector compiler matching of class names. My larger patch will cover that too.
Comment 10 Benjamin Poulain 2014-05-27 17:36:21 PDT
(In reply to comment #9)
> By the way, one place this patch missed was the CSS selector compiler matching of class names. My larger patch will cover that too.

I don't think className matching is ever case insensitive.

All the attribute matching code can be case insensitive and it is probably wrong.
Comment 11 Darin Adler 2014-05-28 10:10:31 PDT
(In reply to comment #10)
> I don't think className matching is ever case insensitive.

It’s amazing how much conflicting information there is about this out on the web. I had real trouble finding anything definitive in either CSS or HTML specifications about it.

> All the attribute matching code can be case insensitive and it is probably wrong.

I guess that’s why we need test cases.