Bug 163894 - Remove Unicode case-insensitive matching for usemap=""
Summary: Remove Unicode case-insensitive matching for usemap=""
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: Safari Technology Preview
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Chris Dumez
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-10-24 08:49 PDT by Domenic Denicola
Modified: 2016-12-15 12:37 PST (History)
6 users (show)

See Also:


Attachments
Patch (45.33 KB, patch)
2016-12-13 18:33 PST, Chris Dumez
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews125 for ios-simulator-wk2 (deleted)
2016-12-13 20:12 PST, Build Bot
no flags Details
Patch (46.27 KB, patch)
2016-12-13 20:23 PST, Chris Dumez
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Domenic Denicola 2016-10-24 08:49:52 PDT
In https://github.com/whatwg/html/commit/6acdb2122298d2bb7bb839c0a61b4e1f9b0f9bc9 we changed the HTML spec to remove all of its instances of "compatibility caseless" Unicode case-insensitive matching. There were three:

- <object usemap> matching <map name>
- <img usemap> matching <map name>
- radio button group names

All of these have been replaced with simple case-sensitive matching. WebKit already uses case-sensitive matching for radio button group names; see https://bugs.webkit.org/show_bug.cgi?id=90617#c21. This bug is about the usemap cases.

We believe this change will be low-risk due to Chromium use counter data from the stable channel, which reveals that <0.0001% of pages have a map attribute that matches using ASCII case-insensitive matching, but not using case-sensitive mapping. Furthermore, zero additional matches are triggered by Chrome's Unicode case-insensitive matching. Another reason to believe this is low-risk is that the particular Unicode case-insensitive matching used was highly inconsistent and unreliable across browsers; the spec's "compatibility caseless" was fiction.

More background at https://github.com/whatwg/html/issues/1666 and links contained therein. Web platform tests available at:

- http://w3c-test.org/html/semantics/embedded-content/the-img-element/usemap-casing.html
- http://w3c-test.org/html/semantics/embedded-content/the-object-element/usemap-casing.html
Comment 1 Chris Dumez 2016-12-13 14:28:37 PST
Looks like Firefox already behaves this way. Chrome does not yet.
Comment 2 Chris Dumez 2016-12-13 18:33:26 PST
Created attachment 297056 [details]
Patch
Comment 3 Chris Dumez 2016-12-13 18:33:49 PST
(In reply to comment #2)
> Created attachment 297056 [details]
> Patch

With this patch, we pass 100% of:
http://w3c-test.org/html/semantics/embedded-content/the-img-element/usemap-casing.html
Comment 4 Build Bot 2016-12-13 20:12:39 PST
Comment on attachment 297056 [details]
Patch

Attachment 297056 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/2717392

New failing tests:
fast/images/image-usemap-parsing.html
Comment 5 Build Bot 2016-12-13 20:12:45 PST
Created attachment 297058 [details]
Archive of layout-test-results from ews125 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews125  Port: ios-simulator-wk2  Platform: Mac OS X 10.11.6
Comment 6 Chris Dumez 2016-12-13 20:23:01 PST
Created attachment 297060 [details]
Patch
Comment 7 Darin Adler 2016-12-14 09:31:03 PST
Surprised this ended up being case sensitive instead of ASCII case insensitive, but super happy that we are done with "compatibility caseless".
Comment 8 WebKit Commit Bot 2016-12-14 09:38:00 PST
Comment on attachment 297060 [details]
Patch

Clearing flags on attachment: 297060

Committed r209810: <http://trac.webkit.org/changeset/209810>
Comment 9 WebKit Commit Bot 2016-12-14 09:38:05 PST
All reviewed patches have been landed.  Closing bug.
Comment 10 Darin Adler 2016-12-14 22:46:05 PST
It’s interesting: After this patch, there are only two call sites left in the entire WebKit project for the String::foldCase function:

1) In the UCONFIG_NO_COLLATION version of WebCore::SearchBuffer; I am not sure we should keep that around. Is anyone using that configuration?

2) In TypeAhead::handleEvent. Not entirely sure the use there is quite correct.
Comment 11 Ryosuke Niwa 2016-12-15 00:10:10 PST
(In reply to comment #10)
> It’s interesting: After this patch, there are only two call sites left in
> the entire WebKit project for the String::foldCase function:
> 
> 1) In the UCONFIG_NO_COLLATION version of WebCore::SearchBuffer; I am not
> sure we should keep that around. Is anyone using that configuration?
> 
> 2) In TypeAhead::handleEvent. Not entirely sure the use there is quite
> correct.

I think this code is about matching what the user typed with an option element's text so that should probably be using real Unicode case folding. Although that function really just wants to have startsWithIgnoringCase or something.
Comment 12 Darin Adler 2016-12-15 12:37:47 PST
(In reply to comment #11)
> (In reply to comment #10)
> > 2) In TypeAhead::handleEvent. Not entirely sure the use there is quite
> > correct.
> 
> I think this code is about matching what the user typed with an option
> element's text so that should probably be using real Unicode case folding.
> Although that function really just wants to have startsWithIgnoringCase or
> something.

Yes, it is about matching what the user typed. But search, for example, does not just use Unicode case folding, it uses other rules for matching that go beyond that, and it seems inadequate to search within the select list using case folded string matching rather than the actual rules we use to match when searching within text. I think the correct ICU API to use might be usearch, not the case folding function.