Bug 16458 - REGRESSION (r28164): regular expressions can now hang due to lack of a match limit
: REGRESSION (r28164): regular expressions can now hang due to lack of a match ...
Status: RESOLVED FIXED
: WebKit
JavaScriptCore
: 528+ (Nightly build)
: Macintosh Mac OS X 10.5
: P1 Major
Assigned To:
:
: InRadar, ReviewedForRadar
:
:
  Show dependency treegraph
 
Reported: 2007-12-15 19:32 PST by
Modified: 2007-12-18 11:33 PST (History)


Attachments
patch (11.88 KB, patch)
2007-12-17 01:08 PST, Darin Adler
no flags Review Patch | Details | Formatted Diff | Diff
patch (9.51 KB, patch)
2007-12-17 01:09 PST, Darin Adler
no flags Review Patch | Details | Formatted Diff | Diff
patch (9.70 KB, patch)
2007-12-17 01:13 PST, Darin Adler
ggaren: review+
Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2007-12-15 19:32:47 PST
When we removed the match limit in http://trac.webkit.org/projects/webkit/changeset/28164 we made it possible to create regular expressions that hang because of exponential complexity of matching. Here's an example:

    /(?:[^(?!)]||){23}\z/

<rdar://problem/5636067>
------- Comment #1 From 2007-12-15 21:04:09 PST -------
I'd like to get bug 16438 reviewed and landed before fixing this one.
------- Comment #2 From 2007-12-17 01:08:50 PST -------
Created an attachment (id=17953) [details]
patch
------- Comment #3 From 2007-12-17 01:09:44 PST -------
Created an attachment (id=17954) [details]
patch
------- Comment #4 From 2007-12-17 01:13:21 PST -------
Created an attachment (id=17955) [details]
patch
------- Comment #5 From 2007-12-17 01:14:35 PST -------
(From update of attachment 17953 [details])
I'm curious as to what sunspider says.
------- Comment #6 From 2007-12-17 01:17:01 PST -------
Unfortunately this does slow down the regexp-dna test 1.13x times; a big slowdown! Overall effect on SunSpider is 1.01x as slow.

If we can find a faster way to do it, I'm all for it!
------- Comment #7 From 2007-12-17 10:00:02 PST -------
By the way, I made a loop to run this expression with various numbers and you can see the problem:

    /(?:[^(?!)]||){1}z/ took 0 milliseconds.
    /(?:[^(?!)]||){2}z/ took 0 milliseconds.
    /(?:[^(?!)]||){3}z/ took 0 milliseconds.
    /(?:[^(?!)]||){4}z/ took 0 milliseconds.
    /(?:[^(?!)]||){5}z/ took 0 milliseconds.
    /(?:[^(?!)]||){6}z/ took 1 milliseconds.
    /(?:[^(?!)]||){7}z/ took 2 milliseconds.
    /(?:[^(?!)]||){8}z/ took 4 milliseconds.
    /(?:[^(?!)]||){9}z/ took 11 milliseconds.
    /(?:[^(?!)]||){10}z/ took 25 milliseconds.
    /(?:[^(?!)]||){11}z/ took 60 milliseconds.
    /(?:[^(?!)]||){12}z/ took 144 milliseconds.
    /(?:[^(?!)]||){13}z/ took 341 milliseconds.
    /(?:[^(?!)]||){14}z/ took 805 milliseconds.
    /(?:[^(?!)]||){15}z/ took 1885 milliseconds.
    /(?:[^(?!)]||){16}z/ took 10120 milliseconds.
    /(?:[^(?!)]||){17}z/ took 29980 milliseconds.

This is an exponential curve. To handle expressions like this one we need some sort of limit.
------- Comment #8 From 2007-12-18 03:16:52 PST -------
the performance hit hurts. I wonder if there is a way to do the check that would be less costly.
------- Comment #9 From 2007-12-18 10:02:40 PST -------
(In reply to comment #8)
> the performance hit hurts. I wonder if there is a way to do the check that
> would be less costly.

It'd great! I'd love it if someone could figure out a new optimization other than just removing the match limit. Knowing that it's a 13% speed-up to remove the match limit may point us at something.

But I think the right thing to do is to restore the feature, and remove the incorrect optimization. Then we can look for effective, correct optimizations.

I don't think it makes sense to leave this incorrect optimization in just because it was a good speed-up!

Could someone review the patch, please?
------- Comment #10 From 2007-12-18 10:27:56 PST -------
(From update of attachment 17955 [details])
r=me

Might be good to file a bug recording the potential speed gain from optimizing this code path.
------- Comment #11 From 2007-12-18 10:55:02 PST -------
(In reply to comment #10)
> Might be good to file a bug recording the potential speed gain from optimizing
> this code path.

Done, bug 16503.
------- Comment #12 From 2007-12-18 11:33:03 PST -------
Committed revision 28833.