Bug 14931 - JavaScript regular expression non-participating capturing parentheses fail in 3 different ways
Summary: JavaScript regular expression non-participating capturing parentheses fail in...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 523.x (Safari 3)
Hardware: All All
: P2 Normal
Assignee: Darin Adler
URL: http://blog.stevenlevithan.com/archiv...
Keywords: HasReduction, InRadar
Depends on:
Blocks:
 
Reported: 2007-08-10 11:13 PDT by Kris Kowal
Modified: 2007-08-12 19:51 PDT (History)
4 users (show)

See Also:


Attachments
Test case based on Steve Levithan's blog post (2.42 KB, text/html)
2007-08-10 22:09 PDT, Mark Rowe (bdash)
no flags Details
Working test case based on Steve Levithan's blog post (2.46 KB, text/html)
2007-08-10 22:16 PDT, Mark Rowe (bdash)
no flags Details
patch that fixes a couple of the failures (744 bytes, patch)
2007-08-12 16:05 PDT, Darin Adler
no flags Details | Formatted Diff | Diff
patch, fixes 3 problems so all the tests pass now (85.86 KB, patch)
2007-08-12 18:35 PDT, Darin Adler
mjs: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Kris Kowal 2007-08-10 11:13:13 PDT
Javascript Regular Expression Non-participating capturing groups behave incorrectly in edge cases.  Steve Levithan did a case study on which forms of non-capturing groups conformed to the ECMA spec and Safari passes a paltry 1 of his use cases.  Please refer to his blog post for details:

http://blog.stevenlevithan.com/archives/npcg-javascript
Comment 1 Mark Rowe (bdash) 2007-08-10 22:09:47 PDT
Created attachment 15923 [details]
Test case based on Steve Levithan's blog post
Comment 2 Mark Rowe (bdash) 2007-08-10 22:16:58 PDT
Created attachment 15924 [details]
Working test case based on Steve Levithan's blog post
Comment 3 Mark Rowe (bdash) 2007-08-10 22:17:45 PDT
WebKit ToT passes three of the tests.  Opera 9.22 passes them all.
Comment 4 Mark Rowe (bdash) 2007-08-10 22:22:22 PDT
<rdar://problem/5403816>
Comment 5 Steven Levithan 2007-08-10 23:41:56 PDT
MRowe, thank you for the test case. It looks like my results listed for IE and Firefox were correct, but not the results for Opera or Safari. I hate to misrepresent things like that. I used a slightly modified version of the dump method from netgrow.com.au to generate my original results (I've just reproduced them now). It looks like that code had problems. So Safari gets 3 out of 14, instead of 1. :-) I will update my blog post as appropriate.
Comment 6 Mark Rowe (bdash) 2007-08-10 23:56:34 PDT
Steven, I did my tests against newer versions of both Opera and WebKit than the results in your blog post, so that probably accounts for the differences.
Comment 7 David Kilzer (:ddkilzer) 2007-08-11 07:00:28 PDT
I wonder if later versions of PCRE have fixed these issues.

Comment 8 Darin Adler 2007-08-12 15:41:45 PDT
(In reply to comment #7)
> I wonder if later versions of PCRE have fixed these issues.

Some of the most basic issues are differences between Perl and JavaScript, so I wouldn't expect PCRE to fix them. Instead I think it's up to us to patch PCRE so it doesn't have these issues. That's what the JAVASCRIPT macro in our PCRE is for -- so we can make changes like these that are theoretically suitable to merge upstream.
Comment 9 Darin Adler 2007-08-12 16:05:18 PDT
Created attachment 15941 [details]
patch that fixes a couple of the failures

There are many separate issues here, some inside PCRE and some outside. Here's a fix to one of the issues, specific to String.replace.
Comment 10 Darin Adler 2007-08-12 16:08:13 PDT
8 of the tests are failing because the PCRE engine uses the same array for results from the regular expression match that it uses internally to determine what back-references should match. So all the cases that say (x)?\1 fail to match since there is no expression 1. The specification says that \1 should match the empty string in this case. This is all discussed in the article, but I thought it was worth mentioning that the issue is affecting 8 of the test cases.
Comment 11 Darin Adler 2007-08-12 18:35:52 PDT
Created attachment 15945 [details]
patch, fixes 3 problems so all the tests pass now
Comment 12 Darin Adler 2007-08-12 18:41:23 PDT
(In reply to comment #7)
> I wonder if later versions of PCRE have fixed these issues.

FWIW, none of these problems turned out to be PCRE bugs.
Comment 13 Maciej Stachowiak 2007-08-12 18:50:57 PDT
Comment on attachment 15945 [details]
patch, fixes 3 problems so all the tests pass now

r=me
Comment 14 Darin Adler 2007-08-12 19:51:24 PDT
Committed revision 25026.