|Summary:||REGRESSION (r18182): Google Calendar is broken (a regular expression containing a null character is not parsed correctly)|
|Severity:||Major||CC:||abob, ap, ddkilzer, ggaren, jhmart1|
|Priority:||P1||Keywords:||GoogleBug, HasReduction, Regression|
|OS:||OS X 10.4|
Description mitz 2006-12-16 05:18:16 PST
Google Calendar does not display most of the interface in TOT. It fails due to an "Invalid regular expression" error during its initialization. Prior to r18182 (fix for bug 6257), JSC did not throw errors on invalid regular expression. The offending regular expression is valid, but it fails to be parsed correctly since it contains a null character, which PCRE considers as a terminator. A reduced test case is RegExp("[\0]") which throws an error ("missing terminating ] for character class").
Comment 1 Alexey Proskuryakov 2006-12-26 12:39:42 PST
*** Bug 11982 has been marked as a duplicate of this bug. ***
Comment 2 mitz 2006-12-28 07:24:38 PST
Comment 3 mitz 2006-12-28 15:06:43 PST
Comment 4 mitz 2006-12-28 17:24:59 PST
Created attachment 12092 [details] pcre changes, refined Most of the time compile_branch() doesn't need to check for end of string.
Comment 5 David Kilzer (:ddkilzer) 2006-12-28 17:52:47 PST
Comment 6 Geoffrey Garen 2006-12-28 18:08:30 PST
This patch would increase the time required to scan a regular expression by O(n) --an unfortunate performance hit. An alternative, which would fix the Google Calendar bug, would be to escape NULL characters as a part of sanitizePattern -- in other words, only if the expression failed to parse. However, that would leave open a subtle bug in the case where the subexpressionup until the first NULL character was still valid. I think we need to change PCRE to take a length, instead. We do have a substantial number of regular expression tests that can catch subtle bugs along the way.
Comment 7 mitz 2006-12-29 01:53:39 PST
(In reply to comment #6) > This patch would increase the time required to scan a regular expression by > O(n) --an unfortunate performance hit. You mean the first patch, right? > I think we need to change PCRE to take a length, instead. That's where the second (and third) patch are headed, isn't it?
Comment 8 mitz 2006-12-29 08:16:55 PST
Comment 9 mitz 2006-12-29 08:37:22 PST
Created attachment 12106 [details] Change pcre_compile to take the pattern's length and work with patterns containing null characters No layout test regressions, no JSC test regressions, two JSC tests fixed, no PCRE test regressions (when applying an equivalent patch to pcre-6.1). I would really appreciate it if someone who knows how to do it measured the performance impact of this patch and of the alternative approach (attachment 12081 [details]).
Comment 10 Geoffrey Garen 2006-12-29 13:54:58 PST
(In reply to comment #9) I ran a simple test: tot: 231ms (baseline) attachment 12106 [details]: 224ms (3% faster) attachment 12081 [details]: 240ms (3.9% slower) Test attached as 'regexp-parsing-performance.js'. Usage: "run-testkjs regexp-parsing-performance.js" As a side note, I guess my concern was a little academic, since this issue only applies to parsing regular expressions, while executing them is by far the greater cost in common usage.
Comment 11 Geoffrey Garen 2006-12-29 13:55:52 PST
Created attachment 12112 [details] regexp-parsing-performance.js
Comment 12 Matt Lilek 2006-12-29 19:00:54 PST
*** Bug 12032 has been marked as a duplicate of this bug. ***
Comment 13 mitz 2007-01-01 08:55:07 PST
Created attachment 12144 [details] Change pcre_compile to take the pattern's length and work with patterns containing null characters Updated for PCRE 6.4 (following bug 12031 and bug 12036).
Comment 14 Darin Adler 2007-01-01 19:42:40 PST
Comment on attachment 12144 [details] Change pcre_compile to take the pattern's length and work with patterns containing null characters /* The space before the ; is to avoid a warning on a silly compiler on the Macintosh. */ The above comment no longer makes sense. This is by far the biggest change we've ever made to PCRE! But it's worth it. Merging with future versions of PCRE is going to be rather difficult. r=me
Comment 15 David Kilzer (:ddkilzer) 2007-01-01 21:00:56 PST
Comment 16 David Kilzer (:ddkilzer) 2007-01-01 21:28:40 PST
Committed revision 18517.
Comment 17 David Kilzer (:ddkilzer) 2007-01-02 04:58:50 PST
Patch in Attachment 12144 [details] causes new compile errors in pcre_compile.c on Windows now: http://build.webkit.org/post-commit-win32/builds/4045
Comment 18 David Kilzer (:ddkilzer) 2007-01-02 09:43:58 PST
Created attachment 12164 [details] Fixes Windows build (In reply to comment #17) > Patch in Attachment 12144 [details]  causes new compile errors in pcre_compile.c on > Windows now: > > http://build.webkit.org/post-commit-win32/builds/4045 Fixed by this patch. Committed revision 18525.