WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
117353
window.find() case-insensitive search doesn't match diacritical marks
https://bugs.webkit.org/show_bug.cgi?id=117353
Summary
window.find() case-insensitive search doesn't match diacritical marks
Daniel Bates
Reported
2013-06-07 10:33:53 PDT
Consider a web page that contains the spanish word "que". Then window.find("qué", false /* Case sensitive */) returns true. That is, we found a match since we consider the word "qué" to be equivalent to the word "que". Instead we should consider the accented e as being more specific and hence window.find() should return false (i.e. not found).
Attachments
Patch
(31.25 KB, patch)
2013-06-07 11:41 PDT
,
Daniel Bates
no flags
Details
Formatted Diff
Diff
Patch
(31.26 KB, patch)
2013-06-07 13:11 PDT
,
Daniel Bates
darin
: review+
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Daniel Bates
Comment 1
2013-06-07 10:34:18 PDT
<
rdar://problem/13510672
>
Daniel Bates
Comment 2
2013-06-07 11:41:14 PDT
Created
attachment 204058
[details]
Patch
Daniel Bates
Comment 3
2013-06-07 11:50:45 PDT
(In reply to
comment #2
)
> Created an attachment (id=204058) [details] > Patch
As a side effect of this change our case-insensitive search behavior for window.find() will more closely match the same behavior in Firefox.
Daniel Bates
Comment 4
2013-06-07 13:11:25 PDT
Created
attachment 204062
[details]
Patch Remove tab characters that I inadvertently added to file LayoutTests/fast/text/test-find.js. Also, write testFind() in terms of variable CaseInsensitive.
Darin Adler
Comment 5
2013-06-07 14:01:17 PDT
Comment on
attachment 204058
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=204058&action=review
> LayoutTests/fast/text/find-case-folding.html:40 > - if (!canFind("Stra" + smallSharpS + "e", "Strasse")) { > + if (canFind("Stra" + smallSharpS + "e", "Strasse")) { > success = false; > - message += " Cannot find ss when searching for small sharp s."; > + message += " Found ss when searching for small sharp s."; > }
Is this change an improvement? I think it was good that searching for ß found ss.
Alexey Proskuryakov
Comment 6
2013-06-07 14:18:14 PDT
I asked Dan on IRC to review some of older bugs we had about diacritics (in)sensitive search, I think that user expectations may be more nuanced.
Daniel Bates
Comment 7
2013-06-07 15:15:19 PDT
(In reply to
comment #6
)
> I asked Dan on IRC to review some of older bugs we had about diacritics (in)sensitive search, I think that user expectations may be more nuanced.
From looking at the related bugs, the proposed change will make our search behavior match people's expectations when searching for accented characters. Moreover, it turned out that the Radar bug for this issue, <
rdar://problem/13510672
>, is a duplicate of <
rdar://problem/8535227
>.
Daniel Bates
Comment 8
2013-06-07 15:58:40 PDT
(In reply to
comment #5
)
> (From update of
attachment 204058
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=204058&action=review
> > > LayoutTests/fast/text/find-case-folding.html:40 > > - if (!canFind("Stra" + smallSharpS + "e", "Strasse")) { > > + if (canFind("Stra" + smallSharpS + "e", "Strasse")) { > > success = false; > > - message += " Cannot find ss when searching for small sharp s."; > > + message += " Found ss when searching for small sharp s."; > > } > > Is this change an improvement? I think it was good that searching for ß found ss.
Disregarding the case with ß for the moment (I talk about it below), the proposed patch is an improvement because it makes the behavior of window.find() more closely match a person's expectation when searching for special characters based on our in-person discussion this morning (06/07). This behavior change has also been requested by various people as seen in the bug reports related to <
rdar://problem/8535227
>. Should we look to special case our handling of ß? ICU regards ß similar to an accented character (é) when using strength UCOL_SECONDARY.
Daniel Bates
Comment 9
2013-06-11 16:31:18 PDT
(In reply to
comment #8
)
> (In reply to
comment #5
) > > (From update of
attachment 204058
[details]
[details]) > > View in context:
https://bugs.webkit.org/attachment.cgi?id=204058&action=review
> > > > > LayoutTests/fast/text/find-case-folding.html:40 > > > - if (!canFind("Stra" + smallSharpS + "e", "Strasse")) { > > > + if (canFind("Stra" + smallSharpS + "e", "Strasse")) { > > > success = false; > > > - message += " Cannot find ss when searching for small sharp s."; > > > + message += " Found ss when searching for small sharp s."; > > > } > > > [...] > Should we look to special case our handling of ß? ICU regards ß similar to an accented character (é) when using strength UCOL_SECONDARY.
From talking with Darin today (06/11) in person, we should look to map ß to "ss". For now, we should land expected failure results for test LayoutTests/fast/text/find-case-folding.html instead of updating it to pass.
Daniel Bates
Comment 10
2013-06-12 09:04:00 PDT
(In reply to
comment #9
)
> (In reply to
comment #8
) > > (In reply to
comment #5
) > > > (From update of
attachment 204058
[details]
[details] [details]) > > > View in context:
https://bugs.webkit.org/attachment.cgi?id=204058&action=review
> > > > > > > LayoutTests/fast/text/find-case-folding.html:40 > > > > - if (!canFind("Stra" + smallSharpS + "e", "Strasse")) { > > > > + if (canFind("Stra" + smallSharpS + "e", "Strasse")) { > > > > success = false; > > > > - message += " Cannot find ss when searching for small sharp s."; > > > > + message += " Found ss when searching for small sharp s."; > > > > } > > > > > [...] > > Should we look to special case our handling of ß? ICU regards ß similar to an accented character (é) when using strength UCOL_SECONDARY. > > From talking with Darin today (06/11) in person, we should look to map ß to "ss". For now, we should land expected failure results for test LayoutTests/fast/text/find-case-folding.html instead of updating it to pass.
Filed
bug #117548
to map ß to "ss".
Daniel Bates
Comment 11
2013-06-12 09:46:29 PDT
Committed
r151505
: <
http://trac.webkit.org/changeset/151505
>
Allan Sandfeld Jensen
Comment 12
2013-09-13 05:43:44 PDT
This patch raised the minimum required libicu to 4.4, was that intentional?
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug