16321 – new RegExp("[\u0097]{4,6}", "gmy") crashes in DEBUG builds

RESOLVED FIXED 16321

new RegExp("[\u0097]{4,6}", "gmy") crashes in DEBUG builds

https://bugs.webkit.org/show_bug.cgi?id=16321

Summary new RegExp("[\u0097]{4,6}", "gmy") crashes in DEBUG builds

Eric Seidel (no email)

Reported 2007-12-06 00:37:37 PST

new RegExp("[\u0097]{4,6}", "gmy") crashes in DEBUG builds (probably release builds too) ASSERTION FAILED: code - codestart <= length (/Stuff/Projects/WebKit/JavaScriptCore/pcre/pcre_compile.cpp:2804 JSRegExp* jsRegExpCompile(const UChar*, int, JSRegExpIgnoreCaseOption, JSRegExpMultilineOption, unsigned int*, const char**))

Attachments
Test case (will ASSERT) (64 bytes, text/html) 2007-12-06 07:39 PST, David Kilzer (:ddkilzer)	no flags	Details
more stressful test case (171 bytes, text/html) 2007-12-06 09:38 PST, Darin Adler	no flags	Details
patch (4.53 KB, patch) 2007-12-06 09:48 PST, Darin Adler	eric: review+	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Eric Seidel (no email)

Comment 1 2007-12-06 00:43:10 PST

This one too: new RegExp("[\Ñ]{2,4}");

Eric Seidel (no email)

Comment 2 2007-12-06 00:48:26 PST

And this one: new RegExp("[\£]{3,4}", "gmy");

Eric Seidel (no email)

Comment 3 2007-12-06 00:50:58 PST

Reduction: new RegExp("[\u0097]{4,6}");

David Kilzer (:ddkilzer)

Comment 4 2007-12-06 07:39:16 PST

Created attachment 17748 [details] Test case (will ASSERT)

David Kilzer (:ddkilzer)

Comment 5 2007-12-06 07:39:56 PST

Does not crash WebKit nightly r28482 (a Release build), so the ASSERT() condition may need to be changed?

Darin Adler

Comment 6 2007-12-06 08:09:01 PST

No, the assertion is definitely correct.

David Kilzer (:ddkilzer)

Comment 7 2007-12-06 08:25:01 PST

<rdar://problem/5632992>

Darin Adler

Comment 8 2007-12-06 08:40:42 PST

These test cases are extended character classes followed by repeat sequences. Should be easy to track down.

Darin Adler

Comment 9 2007-12-06 08:53:12 PST

The preflight is computing a length of 16, the compiled regular expression has a length of 17.

Darin Adler

Comment 10 2007-12-06 09:34:21 PST

If we want to see if this crashed in older builds, we should repeat the expression multiple times, and use a max number much higher than 6.

Darin Adler

Comment 11 2007-12-06 09:35:58 PST

Famous last words, but once again I don't see how this could be a regression. The logic has not changed since Leopard. Last time I was mistaken about this.

Darin Adler

Comment 12 2007-12-06 09:38:36 PST

Created attachment 17750 [details] more stressful test case

Darin Adler

Comment 13 2007-12-06 09:48:24 PST

Created attachment 17752 [details] patch

Eric Seidel (no email)

Comment 14 2007-12-06 10:30:58 PST

Comment on attachment 17752 [details] patch Looks good. r=me

Eric Seidel (no email)

Comment 15 2007-12-06 10:32:59 PST

I should note, the ASSERT is actually not quite in the right place. We ASSERT right after we've walked off the end of the buffer. We could be more aggressive and ASSERT right before we add the final KET, so as to catch cases like this were we're off by one (we'd still be asserting/returning late for cases which were off by more than 1). I actually changed my local tree to do that, and still hit some bad off by > 1 cases. I'll file those additional cases once darin's patch lands.

Darin Adler

Comment 16 2007-12-06 10:36:30 PST

(In reply to comment #15) > I should note, the ASSERT is actually not quite in the right place. We ASSERT > right after we've walked off the end of the buffer. We could be more > aggressive and ASSERT right before we add the final KET, so as to catch cases > like this were we're off by one (we'd still be asserting/returning late for > cases which were off by more than 1). I don't understand what you're saying. The current assertion fires if and only if we're running off the end of the buffer. If the assert doesn't fire, then we're not running off the end of the buffer. The final KET takes 3 bytes and the END opcode takes 1 byte -- I don't see how asserting before writing out those last 4 bytes would be more accurate. Maybe I'll understand when I see your patch.

Eric Seidel (no email)

Comment 17 2007-12-06 10:45:32 PST

I mispoke. OP_END instead of OP_KET. I moved the error return before writing of OP_END and made it check to make sure there was enough space to write OP_END. Left the ASSERT checking if we were already over the code buffer before writing the OP_END. That allowed me to continue fuzz testing while ignoring the off-by-one case caused most often by this bug, and was not a permanent solution. What I was proposing above was moving the return, but still ASSERTing if there was not enough space for the OP_END. I agree, it's probably not worth the hassle to bother avoiding writing the OP_END. it's a programmer error if we ever are writing the OP_END off the end of the buffer, no sense in doing contortions in the runtime to workaround that.

Darin Adler

Comment 18 2007-12-06 11:05:49 PST

(In reply to comment #17) > I mispoke. OP_END instead of OP_KET. > > I moved the error return before writing of OP_END and made it check to make > sure there was enough space to write OP_END. Left the ASSERT checking if we > were already over the code buffer before writing the OP_END. That allowed me > to continue fuzz testing while ignoring the off-by-one case caused most often > by this bug, and was not a permanent solution. What I was proposing above was > moving the return, but still ASSERTing if there was not enough space for the > OP_END. > > I agree, it's probably not worth the hassle to bother avoiding writing the > OP_END. it's a programmer error if we ever are writing the OP_END off the end > of the buffer, no sense in doing contortions in the runtime to workaround that. We could allocate a bigger-than-needed buffer in debug versions. Then we could catch overruns and be able to continue testing, I suppose.

Darin Adler

Comment 19 2007-12-06 12:26:10 PST

Committed revision 28491.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P1

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Mac

OS OS X 10.4

Product WebKit

Component JavaScriptCore

Assignee

Darin Adler

Reported

2007-12-06 00:37 PST

Modified

2007-12-06 12:26 PST History

CC List

1 user Show

URL

Keywords HasReduction, InRadar, Regression

Depends on

Blocks