Bug 16127

Summary: Reproducible crash inside PCRE under guard malloc
Product: WebKit Reporter: Mark Rowe (bdash) <mrowe>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Major CC: ap, eric
Priority: P2 Keywords: HasReduction, InRadar
Version: 528+ (Nightly build)   
Hardware: Macintosh   
OS: OS X 10.5   
Attachments:
Description Flags
Reduced test case
none
Crash log
none
Check against patternEnd to make sure we don't walk of the end of the string
none
Check against patternEnd to make sure we don't walk off the end of the string darin: review+

Description Mark Rowe (bdash) 2007-11-24 22:24:00 PST
Loading a JavaScript snippet containing the following regexp literal will cause a crash under guard malloc:
/\)[;\s]+/

This leads to indexing off the end of an array.  My reading of the PCRE 7.4 sources suggest that they also have this problem.  I recall seeing valgrind warnings in similar places when playing with it on Linux recently.

Steps to reproduce:
1) Save attachment as test.js.
2) DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib DYLD_FRAMEWORK_PATH=WebKitBuild/Debug ./WebitBuild/Debug/testkjs test.js
Comment 1 Mark Rowe (bdash) 2007-11-24 22:24:40 PST
Created attachment 17499 [details]
Reduced test case
Comment 2 Mark Rowe (bdash) 2007-11-24 22:25:23 PST
Created attachment 17500 [details]
Crash log

0   com.apple.JavaScriptCore      	0x0031a0f2 calculateCompiledPatternLengthAndFlags(unsigned short const*, int, JSRegExpIgnoreCaseOption, compile_data&, ErrorCode) + 1268 (pcre_compile.cpp:2550)
1   com.apple.JavaScriptCore      	0x0031ab56 jsRegExpCompile(unsigned short const*, int, JSRegExpIgnoreCaseOption, JSRegExpMultilineOption, unsigned int*, char const**) + 82 (pcre_compile.cpp:3013)
2   com.apple.JavaScriptCore      	0x002883df KJS::RegExp::RegExp(KJS::UString const&, KJS::UString const&) + 417 (regexp.cpp:70)
Comment 3 Mark Rowe (bdash) 2007-11-24 22:26:28 PST
<rdar://problem/5611792>
Comment 4 Alexey Proskuryakov 2007-11-25 00:03:00 PST
I don't think that original PCRE has this problem - it relies on strings being null-terminated.

*** This bug has been marked as a duplicate of 15839 ***
Comment 5 Eric Seidel (no email) 2007-11-30 05:18:00 PST
bug 15839 no longer crashes on TOT, however this still does.  Reopening.
Comment 6 Eric Seidel (no email) 2007-11-30 05:39:01 PST
Created attachment 17603 [details]
Check against patternEnd to make sure we don't walk of the end of the string

 JavaScriptCore/ChangeLog                           |   12 +++++
 JavaScriptCore/pcre/pcre_compile.cpp               |   47 ++++++++++----------
 LayoutTests/ChangeLog                              |   10 ++++
 .../fast/js/regexp-compile-crash-expected.txt      |    5 ++
 LayoutTests/fast/js/regexp-compile-crash.html      |   22 +++++++++
 5 files changed, 72 insertions(+), 24 deletions(-)
Comment 7 Eric Seidel (no email) 2007-11-30 05:55:42 PST
Created attachment 17604 [details]
Check against patternEnd to make sure we don't walk off the end of the string

 JavaScriptCore/ChangeLog                           |   12 +++++
 JavaScriptCore/pcre/pcre_compile.cpp               |   47 ++++++++++----------
 LayoutTests/ChangeLog                              |   10 ++++
 .../fast/js/regexp-compile-crash-expected.txt      |   11 +++++
 LayoutTests/fast/js/regexp-compile-crash.html      |   22 +++++++++
 5 files changed, 78 insertions(+), 24 deletions(-)
Comment 8 Darin Adler 2007-11-30 09:30:02 PST
Comment on attachment 17604 [details]
Check against patternEnd to make sure we don't walk off the end of the string

This is not the correct way to write tests in fast/js -- the actual test should be in a resources file, and the wrapper generated by the make-js-test-wrappers script. With that structure we can possibly run them standalone in the future.

+                while ((++ptr < patternEnd) && ((c = *ptr) != ']')) {
                     if (c > 127)
                         c = getCharAndAdvanceIfSurrogate(ptr);

This still has a "run off the end" problem, because getCharAndAdvanceIfSurrogate doesn't check the end. But that problem will go away if we merge my change to remove support for UTF-16 surrogate pairs, currently up for review.

I'm a little bit opposed to this code using parentheses around the patternEnd conditions, when all the other similar code in the same file omits parens in cases like these.

r=me, but better to improve the tests.
Comment 9 Eric Seidel (no email) 2007-11-30 15:44:19 PST
r28260