Bug 16091 - JSCRE needs to import the PCRE test suite
Summary: JSCRE needs to import the PCRE test suite
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.4
: P2 Normal
Assignee: Alexey Proskuryakov
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-11-22 01:06 PST by Eric Seidel (no email)
Modified: 2007-12-06 23:03 PST (History)
4 users (show)

See Also:


Attachments
work in progress (30.77 KB, application/zip)
2007-12-05 13:49 PST, Alexey Proskuryakov
no flags Details
proposed patch (203.10 KB, patch)
2007-12-06 02:49 PST, Alexey Proskuryakov
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2007-11-22 01:06:50 PST
JSCRE needs to import the PCRE test suite

We've now forked from PCRE... but we still need tests.  We should import their test suite and write a tool to run JSCRE against the test data.  It's only like 30 files and it should be rather straightforward to work up a tool to deal with them.
Comment 1 Alexey Proskuryakov 2007-12-05 13:49:52 PST
Created attachment 17726 [details]
work in progress

I'm making a JS parser that will work on unmodified PCRE test data files (only converted to UTF-8). I know that it's still buggy and skips some tests, but I wanted to upload this WIP test to give an early warning of RegExp regressions we seem to have in TOT :-(
Comment 2 Darin Adler 2007-12-05 14:25:12 PST
I was separately pursuing a perl script that would convert PCRE test cases into files we could use directly in the LayoutTests directory. I have a half-baked version of that on my computer. Not sure if I should keep working on it.
Comment 3 Alexey Proskuryakov 2007-12-06 02:49:47 PST
Created attachment 17745 [details]
proposed patch

Many of the regressions in TOT are with character classes. E.g., "cthing".match(/^[]cde]/) is now null instead of ["c"].
Comment 4 Darin Adler 2007-12-06 09:16:24 PST
Comment on attachment 17745 [details]
proposed patch

Looks great!

What will we do with this going forward, though?

For the many tests that reflect differences between PCRE and desired JavaScript behavior, will we change the original testinput/output files, or what?

Some of the failures look to me like they might reflect problems in the test harness, like this one:

+/abcd\t\n\r\f\a\e\071\x3b\$\\\?caxyz/
+    abcd\t\n\r\f\a\e9;\$\\?caxyz: FAIL. Actual results: "null"

r=me
Comment 5 Darin Adler 2007-12-06 09:18:15 PST
(In reply to comment #3)
> Many of the regressions in TOT are with character classes. E.g.,
> "cthing".match(/^[]cde]/) is now null instead of ["c"].

That particular one is an intentional behavior change; the JavaScript specification and other web browsers require behavior different from PCRE's.
Comment 6 Alexey Proskuryakov 2007-12-06 13:52:39 PST
(In reply to comment #4)
> What will we do with this going forward, though?
> 
> For the many tests that reflect differences between PCRE and desired JavaScript
> behavior, will we change the original testinput/output files, or what?

I do not have a firm opinion either way. I envisioned this to work as a test for status quo and for crashes, that would be trivially upgradable to latest version from PCRE, but a clean test that says PASS and FAIL in correct places has a lot of value, too!
Comment 7 Alexey Proskuryakov 2007-12-06 23:03:59 PST
Committed revision 28513.