WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED INVALID
Bug 12638
Some regular expressions could potentially be matched much faster
https://bugs.webkit.org/show_bug.cgi?id=12638
Summary
Some regular expressions could potentially be matched much faster
David Smith
Reported
2007-02-06 13:12:46 PST
The regular expression at the URL (32 repetitions of a? followed by 32 repetitions of a, and attempting to match on a string of 32 repetitions of a) takes a long time to run in Safari, but runs extremely quickly in the attached C program. The C program was obtained from
http://swtch.com/~rsc/regexp/regexp1.html
, which also has an explanation of why it's faster. The downside is that backreferences and other similar features are not supported, but it might be possible to decide which method to use based on checking whether the regex uses features requiring the slower method.
Attachments
regex matching program
(7.67 KB, application/octet-stream)
2007-02-06 13:14 PST
,
David Smith
no flags
Details
Patch to fix size limitation.
(1.85 KB, patch)
2008-09-11 02:10 PDT
,
Erik Corry
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
David Smith
Comment 1
2007-02-06 13:14:17 PST
Created
attachment 12982
[details]
regex matching program
David Kilzer (:ddkilzer)
Comment 2
2007-02-06 14:49:30 PST
JavaScriptCore uses a modified version of the PCRE library to do its regular expression matching. The embedded version is at version 6.4. Versions 6.5, 6.6, 6.7 and 7.0 have been released since 6.4. If you have time, please test this expression with the newer versions of pcre to find out if upgrading will benefit WebKit. (It's on my mental to-do list to upgrade the embedded version, but I've been distracted by other shiny objects like webarchives recently.) Apple's UTF-16 changes (and Mitz's recent non-null-terminated API changes) to PCRE really need to be contributed back to the PCRE project so that (some day) PCRE could be linked to WebKit instead of being embedded in it.
David Smith
Comment 3
2007-02-06 15:12:55 PST
It's much harder to cause a hang after
r19434
, so that's a definite improvement. I don't know if it's avoiding the exponential behavior mentioned in the linked article, or just fast enough to run even that fairly quickly. I will try to investigate later.
Cameron Zwarich (cpst)
Comment 4
2007-07-10 22:04:45 PDT
I tried it with the latest version of PCRE (7.2 at the time of this post) with the included program pcredemo, and as in the Russ Cox article, it stops after 22 repetitions because it reaches its default backtracking limit. Prior to that, it is still slow.
Erik Corry
Comment 5
2008-09-11 02:10:25 PDT
Created
attachment 23342
[details]
Patch to fix size limitation. See comment above.
David Kilzer (:ddkilzer)
Comment 6
2008-09-11 05:53:45 PDT
(In reply to
comment #5
)
> Created an attachment (id=23342) [edit] > Patch to fix size limitation. > > See comment above.
I think this should have been attached to
Bug 14873
instead.
David Kilzer (:ddkilzer)
Comment 7
2008-09-11 05:56:57 PDT
Comment on
attachment 23342
[details]
Patch to fix size limitation. Please reattach to
Bug 14873
. How does this patch affect SunSpider?
David Smith
Comment 8
2009-01-02 23:22:30 PST
Retesting in
r39572
, this hangs. Presumably due to avoiding the 22 repetition limit that PCRE imposed.
Gavin Barraclough
Comment 9
2012-09-06 17:18:42 PDT
The patch attached to this bug is no longer valid - we no longer use PCRE. I don't think we're particularly interested in optimizing for the artificial stress tests. An automaton based matcher may have a benefit for real web pages, but we should find benchmarks representative of those cases. I don't think there is actionable work here.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug