WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 6257
Throw errors on invalid expressions (KJS merge)
https://bugs.webkit.org/show_bug.cgi?id=6257
Summary
Throw errors on invalid expressions (KJS merge)
Maks Orlovich
Reported
2005-12-27 12:18:06 PST
The attached makes the RegExp constructor throw JS error on syntax errors, and by doing that for the POSIX path also makes it work... It fixes one of the KJS RegExp tests, but makes one of the Mozilla tests throw an exception here where it used to "work" --- I do not think that's a regression. (It also shows the lack of \u support --- but that's a separate change). This is a merge of:
r455153
| montanaro | 2005-08-30 14:00:05 -0400 (Tue, 30 Aug 2005) | 2 lines Backport from head, no need to call regcomp twice ------------------------------------------------------------------------
r454902
| montanaro | 2005-08-30 04:33:46 -0400 (Tue, 30 Aug 2005) | 2 lines Committed fix for
bug #110995
(non-pcre regular expressions do not work) ------------------------------------------------------------------------
r393449
| porten | 2005-02-27 01:24:53 -0500 (Sun, 27 Feb 2005) | 2 lines throw exception when constructing an invalid regexp
Attachments
patch
(2.91 KB, patch)
2005-12-27 12:19 PST
,
Maks Orlovich
no flags
Details
Formatted Diff
Diff
patch -- with proper var names for POSIX path
(2.91 KB, patch)
2005-12-27 12:22 PST
,
Maks Orlovich
no flags
Details
Formatted Diff
Diff
patch, with additional \u support
(4.78 KB, patch)
2005-12-27 13:40 PST
,
Maks Orlovich
ggaren
: review-
Details
Formatted Diff
Diff
run-javascriptcore-tests patch
(593 bytes, patch)
2006-01-09 14:38 PST
,
Geoffrey Garen
no flags
Details
Formatted Diff
Diff
proposed patch
(11.93 KB, patch)
2006-12-12 04:33 PST
,
Alexey Proskuryakov
ggaren
: review+
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
Maks Orlovich
Comment 1
2005-12-27 12:19:08 PST
Created
attachment 5305
[details]
patch Also removes an another TODO which I think was no longer correct.
Maks Orlovich
Comment 2
2005-12-27 12:22:15 PST
Created
attachment 5306
[details]
patch -- with proper var names for POSIX path Oops, posix path may even build this time
Geoffrey Garen
Comment 3
2005-12-27 13:14:12 PST
We should figure out why the tests are newly-failing and fix them or the patch. A quick glance indicates that ecma_3/RegExp/regress-119909.js now reports as a failure because it throws an exception (whereas it used to silently ignore the regexp). I think that's a bug in how we interpret success/failure, because the test expects you to recognize the pattern as illegitimate. ecma_2/RegExp/properties-001.js used to pass fully, and now the exception prevents it from running. So we may need to surround a part of the test in a try/catch block. I'm also curious about why it used to pass -- probably because it barfed on an expression that wasn't expected to match in the first place.
Maks Orlovich
Comment 4
2005-12-27 13:40:34 PST
Created
attachment 5308
[details]
patch, with additional \u support To fix the properties-001 test, and one of our testcases, this also includes the merge of the following (with one tweak -- it searches the string for \u before applying the special path):
r354074
| porten | 2004-10-12 19:38:49 -0400 (Tue, 12 Oct 2004) | 4 lines support \u escape sequences in regular expressions. They are non-standard - at least they are not supported in Perl/PCRE/POSIX so we simply do our own resolving of those. --- This may need to be performance-tested, though.
Maciej Stachowiak
Comment 5
2005-12-29 00:51:55 PST
Comment on
attachment 5308
[details]
patch, with additional \u support Geoff, since you had the earlier comments, could you look at the new version?
Geoffrey Garen
Comment 6
2006-01-09 14:36:41 PST
Comment on
attachment 5308
[details]
patch, with additional \u support Sorry I took so long to review this. 1. ecma_3/RegExp/regress-119909.js. We can't check in code that causes test regressions, so we need to figure out why this test still fails with your latest patch. I suspect it's because (a) throwing an error prevents the rest of the test from running or (b) testkjs interprets all thrown errors as failures. Either way, the simplest solution is probably to surround the relevant part of the test in a try/catch block. 2. Performance. I'm worried about the performance of scanning the expression string an extra time. We've had performance problems in this area before. At a minimum, I would suggest changing the code so that we only scan for \u if pcre returns an error with the errorCode corresponding to "PCRE does not support \u in RegExps." However, I think a better approach would be to have the lexer take care of converting the unicode expression from the get-go. 3. To match flags() / _flags, _valid should be _isValid. 4. When pcre fails, doesn't it set _regex to NULL? If so, you can eliminate _isValid, and have isValid() just return _regex. r- for 1 & 2, but I'd love to see a patch that covers 3 & 4, too. I hope this review doesn't sound too negative. You've been doing great work, and we've had a lot of requests to fix this bug in particular, so it's great that you're working on it.
Geoffrey Garen
Comment 7
2006-01-09 14:38:52 PST
Created
attachment 5575
[details]
run-javascriptcore-tests patch FYI, you'll have to apply this patch to get run-javascriptcore-tests to run in a post-svn world. It looks like the Netscape tests were hard-coded for CVS.
Darin Adler
Comment 8
2006-01-09 14:59:57 PST
Regarding "the lexer take care of converting the unicode expression": We can do that if this is truly just a lexer-time feature. But if a regular expression that includes a "\" and a "u" has to work, even if it's composed out of separate strings, then we can't fix that by changing the lexer. So we should do that if we can, but we need to test to see if that's right.
Maks Orlovich
Comment 9
2006-01-09 15:21:23 PST
I can only respond partly, since I am leaving for a conference soon'ish,and I better finish packing 1. libpcre plain doesn't support as many nested parentheses as mozilla, so I do think the expectation for the test needs to be adjusted --- the point is that it shouldn't crash, anyway 4. This code doesn't only work with PCRE, but also (somewhat) with POSIX regexps, if PCRE is not available. In that case, you can not 0-check. 3. As Darin said, the lexer approach seems wrong for constructed strings. e.g. 'b a'.search(new RegExp("\\u" + "0020")) in firefox returns 1. 2. I can certainly do a 2-pass approach, though it'll likely take some restructuring to get it to not overly complicate the code, considering 2.5 RegExp runtimes supported.
Geoffrey Garen
Comment 10
2006-01-11 09:50:12 PST
RegExp("\\u" + "0020") is a really interesting case -- something I hadn't considered. Both Firefox and MacIE honor it, and a quick reading of the spec suggests it's required, so, yes, that rules out the lexer solution. I'm not sure if we still care about supporting the POSIX library. Darin? Are there any platforms on which we can't use PCRE? Regardless, I think the 2-pass approach is worth it. It would just require factoring the Unicode escape converting code into its own function and calling that function as a fallback when the regular expression compiler returns an error. With your patch applied, ecma_3/RegExp/regress-119909.js doesn't fail because it crashes. Rather, where we used to silently ignore the regular expression, we now throw an error. And our test suite interprets thrown errors as failures. Since we expect this test to throw an error, the simplest solution is to surround the test code in a try/catch block. The expectation for the test can remain the same.
rahul abrol
Comment 11
2006-03-29 07:01:48 PST
***
Bug 8043
has been marked as a duplicate of this bug. ***
Darin Adler
Comment 12
2006-03-29 10:40:19 PST
(In reply to
comment #10
)
> I'm not sure if we still care about supporting the POSIX library. Darin? Are > there any platforms on which we can't use PCRE?
I don't think we do care about that. The KDE folks might, but I can't see why we'd even want to use a different regular expression library, since it's not going to work.
David Kilzer (:ddkilzer)
Comment 13
2006-12-10 07:44:08 PST
A fix for regexps with \u sequences was committed with
Bug 7445
and later fixed with
Bug 11501
using a two-pass approach as noted in
Comment #9
and
Comment #10
above. Can this bug be closed?
Alexey Proskuryakov
Comment 14
2006-12-12 04:33:41 PST
Created
attachment 11818
[details]
proposed patch Tweaked the patch to also report the error message; removed the Unicode escapes part; a few style fixes.
Alexey Proskuryakov
Comment 15
2006-12-12 04:37:03 PST
Comment on
attachment 5575
[details]
run-javascriptcore-tests patch
>FYI, you'll have to apply this patch to get run-javascriptcore-tests to run in >a post-svn world. It looks like the Netscape tests were hard-coded for CVS.
This was fixed in a different way (by putting a correct Mixed.pm version number here).
Maks Orlovich
Comment 16
2006-12-12 08:17:27 PST
Thanks for doing this, Alexey
Geoffrey Garen
Comment 17
2006-12-12 11:59:33 PST
Comment on
attachment 11818
[details]
proposed patch r=me
Alexey Proskuryakov
Comment 18
2006-12-12 12:23:55 PST
Committed revision 18182.
David Kilzer (:ddkilzer)
Comment 19
2006-12-29 18:11:21 PST
(In reply to
comment #9
)
> 1. libpcre plain doesn't support as many nested parentheses as mozilla, so I > do think the expectation for the test needs to be adjusted --- the point is > that it shouldn't crash, anyway
For what it's worth, pcre-7.0 removes the limitation of more than 200 nested parenthesis (in lieu of other trade-offs).
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug