Summary: | RegExp fails to match non-ASCII characters against [\S\s] | | |
---|---|---|---|
Product: | WebKit | Reporter: | Doug Wright <apple> |
Component: | JavaScriptCore | Assignee: | Alexey Proskuryakov <ap> |
Status: | RESOLVED FIXED | | |
Severity: | Major | CC: | ap, ddkilzer, hartman.wiki, mrowe, nilcolor |
Priority: | P2 | Keywords: | HasReduction |
Version: | 420+ | | |
Hardware: | All | | |
OS: | OS X 10.4 | | |
URL: | http://www.dougweb.org/bugzilla/safari/regexpbug/ | | |

Description
Doug Wright
2006-08-12 09:51:19 PDT
Confirmed with WebKit ToT and 418.8. The character in question is Unicode "RIGHT SINGLE QUOTATION MARK". Reduction forthcoming.

Created attachment 10008 [details]
Reduced test case
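
For readers without access to the attachment, the failure can be sketched roughly as follows. This is an illustrative reduction written for this summary, not the contents of attachment 10008; the helper name `matchesAnyCharClass` is made up. Since `[\S\s]` is the union of "non-whitespace" and "whitespace", it should match any single character, so a conforming engine is expected to print `true` for both calls.

```javascript
// Hypothetical reduction; NOT the contents of attachment 10008.
// U+2019 is RIGHT SINGLE QUOTATION MARK, the character named in the report.
const RIGHT_SINGLE_QUOTATION_MARK = "\u2019";

// [\S\s] combines a class with its complement, so it should accept
// every single character, ASCII or not.
function matchesAnyCharClass(ch) {
  return /^[\S\s]$/.test(ch);
}

console.log(matchesAnyCharClass("a"));
console.log(matchesAnyCharClass(RIGHT_SINGLE_QUOTATION_MARK));
```

On an affected build, the second call would presumably return `false`, which is the behavior this bug describes.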
*** Bug 15224 has been marked as a duplicate of this bug. ***

As seen in bug 15224, this affects all non-ASCII characters, and causes problems in prototype.js. Looks like a very important bug to me.

Created attachment 16333 [details]
a more complete test case
Tests other regex special characters, too. Passes in Firefox, and mostly passes in IE7, which apparently doesn't treat Unicode whitespace characters as such.
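
The Unicode whitespace point can be illustrated with a small check along these lines. The sample characters and the `classifyWhitespace` helper are assumptions chosen for illustration and are not taken from attachment 16333. ECMAScript defines `\s` to include non-ASCII whitespace such as NO-BREAK SPACE (U+00A0), LINE SEPARATOR (U+2028), and IDEOGRAPHIC SPACE (U+3000), so an engine that follows the specification should report them as whitespace both bare and inside a character class.

```javascript
// Illustrative sample only; not taken from attachment 16333.
const unicodeWhitespace = ["\u00A0", "\u2028", "\u3000"];

// Report how the bare escapes and their character-class forms
// classify a single character.
function classifyWhitespace(ch) {
  return {
    bareS: /\s/.test(ch),
    classS: /[\s]/.test(ch),
    bareNotS: /\S/.test(ch),
    classNotS: /[\S]/.test(ch),
  };
}

for (const ch of unicodeWhitespace) {
  console.log("U+" + ch.charCodeAt(0).toString(16).toUpperCase(), classifyWhitespace(ch));
}
```

An engine that does not treat these characters as whitespace, as the comment above reports for IE7, would show `bareS` as `false` for them.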
This issue is also present in the original PCRE 6.1 and 7.4. From comments in the code, I'm not sure what the intended behavior for Perl is, but the fact that \S and [\S] work differently surely looks like a bug.

Created attachment 16349 [details]
a more complete test case
Added a test for a closely related issue from <http://bugs.exim.org/show_bug.cgi?id=580>. That bug was recently fixed, see
svn diff -r218:219 svn://tahini.csx.cam.ac.uk/pcre
I'm going to file the problem with [\S] to PCRE bugzilla soon.

(In reply to comment #7)
> svn diff -r218:219 svn://tahini.csx.cam.ac.uk/pcre
I've just found that there's a ViewVC for PCRE: http://vcs.pcre.org/viewvc?view=rev&revision=219
> I'm going to file the problem with [\S] to PCRE bugzilla soon.
http://bugs.exim.org/show_bug.cgi?id=603

Created attachment 16446 [details]
proposed fix
This is based on an approach suggested by Philip Hazel, and on his fix for the \S{2} vs. \S\S bug.
I think this fix is important enough to go to trunk.
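
The two equivalences discussed in this thread (`\S` vs. `[\S]`, and `\S{2}` vs. `\S\S`) can be checked with a sketch like the one below. It is not the committed patch or its regression test; the `equivalentOn` helper and the sample strings are assumptions made for illustration. On a fixed engine both results are expected to be `true`.

```javascript
// Illustrative consistency check; NOT the patch or test from attachment 16446.
// Returns true when both patterns give the same answer for every input.
function equivalentOn(reA, reB, inputs) {
  return inputs.every((s) => reA.test(s) === reB.test(s));
}

// Mix of ASCII, non-ASCII, and whitespace inputs (U+2019, U+00A0).
const pairSamples = ["ab", "\u2019\u2019", "a\u2019", " \u2019", "\u00A0\u00A0"];
const singleSamples = ["a", "\u2019", " ", "\u00A0"];

// \S{2} should behave exactly like \S\S.
const quantifierAgrees = equivalentOn(/^\S{2}$/, /^\S\S$/, pairSamples);
// \S should behave exactly like [\S].
const classAgrees = equivalentOn(/^\S$/, /^[\S]$/, singleSamples);

console.log(quantifierAgrees, classAgrees);
```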
Comment on attachment 16446 [details]
proposed fix
r=me
Committed revision 25958 (feature branch).

Hi. Feature branch - is it a nightly build of WebKit (http://nightly.webkit.org/)? Or do I have to compile it myself?

Nightly builds of the feature branch are available at http://nightly.webkit.org/builds/overview/feature-branch.

*** Bug 14877 has been marked as a duplicate of this bug. ***