Bug 16458 - REGRESSION (r28164): regular expressions can now hang due to lack of a match limit
Summary: REGRESSION (r28164): regular expressions can now hang due to lack of a match ...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
: P1 Major
Assignee: Darin Adler
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2007-12-15 19:32 PST by Darin Adler
Modified: 2007-12-18 11:33 PST (History)
3 users (show)

See Also:


Attachments
patch (11.88 KB, patch)
2007-12-17 01:08 PST, Darin Adler
no flags Details | Formatted Diff | Diff
patch (9.51 KB, patch)
2007-12-17 01:09 PST, Darin Adler
no flags Details | Formatted Diff | Diff
patch (9.70 KB, patch)
2007-12-17 01:13 PST, Darin Adler
ggaren: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Darin Adler 2007-12-15 19:32:47 PST
When we removed the match limit in http://trac.webkit.org/projects/webkit/changeset/28164 we made it possible to create regular expressions that hang because of exponential complexity of matching. Here's an example:

    /(?:[^(?!)]||){23}\z/

<rdar://problem/5636067>
Comment 1 Darin Adler 2007-12-15 21:04:09 PST
I'd like to get bug 16438 reviewed and landed before fixing this one.
Comment 2 Darin Adler 2007-12-17 01:08:50 PST
Created attachment 17953 [details]
patch
Comment 3 Darin Adler 2007-12-17 01:09:44 PST
Created attachment 17954 [details]
patch
Comment 4 Darin Adler 2007-12-17 01:13:21 PST
Created attachment 17955 [details]
patch
Comment 5 Eric Seidel (no email) 2007-12-17 01:14:35 PST
Comment on attachment 17953 [details]
patch

I'm curious as to what sunspider says.
Comment 6 Darin Adler 2007-12-17 01:17:01 PST
Unfortunately this does slow down the regexp-dna test 1.13x times; a big slowdown! Overall effect on SunSpider is 1.01x as slow.

If we can find a faster way to do it, I'm all for it!
Comment 7 Darin Adler 2007-12-17 10:00:02 PST
By the way, I made a loop to run this expression with various numbers and you can see the problem:

    /(?:[^(?!)]||){1}z/ took 0 milliseconds.
    /(?:[^(?!)]||){2}z/ took 0 milliseconds.
    /(?:[^(?!)]||){3}z/ took 0 milliseconds.
    /(?:[^(?!)]||){4}z/ took 0 milliseconds.
    /(?:[^(?!)]||){5}z/ took 0 milliseconds.
    /(?:[^(?!)]||){6}z/ took 1 milliseconds.
    /(?:[^(?!)]||){7}z/ took 2 milliseconds.
    /(?:[^(?!)]||){8}z/ took 4 milliseconds.
    /(?:[^(?!)]||){9}z/ took 11 milliseconds.
    /(?:[^(?!)]||){10}z/ took 25 milliseconds.
    /(?:[^(?!)]||){11}z/ took 60 milliseconds.
    /(?:[^(?!)]||){12}z/ took 144 milliseconds.
    /(?:[^(?!)]||){13}z/ took 341 milliseconds.
    /(?:[^(?!)]||){14}z/ took 805 milliseconds.
    /(?:[^(?!)]||){15}z/ took 1885 milliseconds.
    /(?:[^(?!)]||){16}z/ took 10120 milliseconds.
    /(?:[^(?!)]||){17}z/ took 29980 milliseconds.

This is an exponential curve. To handle expressions like this one we need some sort of limit.
Comment 8 Maciej Stachowiak 2007-12-18 03:16:52 PST
the performance hit hurts. I wonder if there is a way to do the check that would be less costly.
Comment 9 Darin Adler 2007-12-18 10:02:40 PST
(In reply to comment #8)
> the performance hit hurts. I wonder if there is a way to do the check that
> would be less costly.

It'd great! I'd love it if someone could figure out a new optimization other than just removing the match limit. Knowing that it's a 13% speed-up to remove the match limit may point us at something.

But I think the right thing to do is to restore the feature, and remove the incorrect optimization. Then we can look for effective, correct optimizations.

I don't think it makes sense to leave this incorrect optimization in just because it was a good speed-up!

Could someone review the patch, please?
Comment 10 Geoffrey Garen 2007-12-18 10:27:56 PST
Comment on attachment 17955 [details]
patch

r=me

Might be good to file a bug recording the potential speed gain from optimizing this code path.
Comment 11 Darin Adler 2007-12-18 10:55:02 PST
(In reply to comment #10)
> Might be good to file a bug recording the potential speed gain from optimizing
> this code path.

Done, bug 16503.
Comment 12 Darin Adler 2007-12-18 11:33:03 PST
Committed revision 28833.