Bug 50579 - Regular expression methods crashing browser (buffer overflow?)
Summary: Regular expression methods crashing browser (buffer overflow?)
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac (Intel) OS X 10.6
: P1 Major
Assignee: Michael Saboff
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2010-12-06 12:18 PST by Alexei
Modified: 2010-12-09 10:53 PST (History)
9 users (show)

See Also:


Attachments
Patch to protect from prior backtrack label don't get overwritten (3.21 KB, patch)
2010-12-08 15:39 PST, Michael Saboff
darin: review+
commit-queue: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alexei 2010-12-06 12:18:28 PST
Calling test() on regular expressions over some character length crashes WebKit. Here is an example regex that crashes every time:

/indextools\.js|static\.scribefire\.com\/ads\.js|(static\.getclicky\.com\/|clicky\.js)|statisfy\.net\/javascripts\/stats\.js|gmodules.com\/|rate\.thummit\.com\/js\/|twitter\.com\/(javascripts\/[0-9a-z]+\.js|statuses\/user_timeline\/)|analytics\.live\.com\/|(pub\.lookery\.com\/js\/|lookery\.com\/look\.js|\/j\/pub\/look\.js)|google-analytics\.com\/(urchin\.js|ga\.js|__utm\.gif)|\.mybloglog\.com\/|(\.quantserve\.com\/|\/quant\.js)|sitemeter\.com\/(js\/counter\.js|meter\.asp)|www\.lijit\.com\/informers\/wijits|(\.1[12]2\.2o7\.net\/|\/hbx\.js|\/s_code[0-9a-zA-Z_-]*(\.[0-9a-zA-Z_-]*)?\.js|\.omtrdc\.net\/|omniunih\.js|\/(omniture|mbox)(.*)?\.js|common\.onset\.freedom\.com\/fi\/analytics\/cms\/)|cetrk\.com\//i

I am using WebKit r73340.
Comment 1 Alexei 2010-12-06 12:22:31 PST
An example call that will crash WebKit:

/indextools\.js|static\.scribefire\.com\/ads\.js|(static\.getclicky\.com\/|clicky\.js)|statisfy\.net\/javascripts\/stats\.js|gmodules.com\/|rate\.thummit\.com\/js\/|twitter\.com\/(javascripts\/[0-9a-z]+\.js|statuses\/user_timeline\/)|analytics\.live\.com\/|(pub\.lookery\.com\/js\/|lookery\.com\/look\.js|\/j\/pub\/look\.js)|google-analytics\.com\/(urchin\.js|ga\.js|__utm\.gif)|\.mybloglog\.com\/|(\.quantserve\.com\/|\/quant\.js)|sitemeter\.com\/(js\/counter\.js|meter\.asp)|www\.lijit\.com\/informers\/wijits|(\.1[12]2\.2o7\.net\/|\/hbx\.js|\/s_code[0-9a-zA-Z_-]*(\.[0-9a-zA-Z_-]*)?\.js|\.omtrdc\.net\/|omniunih\.js|\/(omniture|mbox)(.*)?\.js|common\.onset\.freedom\.com\/fi\/analytics\/cms\/)|cetrk\.com\//i.test('http://google.com');
Comment 2 Alexey Proskuryakov 2010-12-06 17:01:57 PST
Could you please attach a crash log <http://webkit.org/quality/crashlogs.html>?
Comment 3 Peter Varga 2010-12-07 02:34:01 PST
I have analyzed the mentioned regex pattern. It didn't cause a crash on jsc for me, but the matching seems to run into an infinite loop.
The pattern isn't a fallback case and the YARR Interpreter works well with this pattern so it seems there is a bug in the YARR JIT.

The problem exists since http://trac.webkit.org/changeset/73307 (https://bugs.webkit.org/show_bug.cgi?id=50295).

The problem surely isn't the length of pattern.
I have created a more simple test case to reproduce this regression:
/(a(b|c)(.*))|xxx/.test('aaa');

I haven't checked the generated JIT code yet.
Comment 4 Peter Varga 2010-12-07 07:47:16 PST
I checked the YARR JIT. I don't have a complete solution but I summarize the partial results of my investigation:

Here is a more simple test case which I'm using for debugging: /a(b)(a*)|aaa/.test('aaa')

The problem is if the matching of term 'b' fails then it resets the result of subpattern matching and 
it starts the matching from the beginning, but the index of position (edx) is never increased. 
Thus the JIT does the same character check again and again in an infinite loop.

Wrong backtrack code block is executed since the backtrack logic extension was introduced.
Here is a simple asm example from the generated code:

match_b:
  cmpw   $0x62,-0x2(%eax,%edx,2)
  jne    parentheses_tail
...
expected_backtrack:
  add    $0x1,%edx
  jmp    available_input
...
current_backtrack:
  mov    %edx,%ebx
  sub    $0x2,%ebx
  mov    %ebx,(%edi)
  jmp    match_b 
available_input:
  mov    %edx,%ebx
  sub    $0x2,%ebx
  mov    %ebx,(%edi)
  add    $0x0,%edx
  cmp    %ecx,%edx
  jbe    match_b
...
parentheses_tail:
  movl   $0xffffffff,0x8(%edi)
  jmp    current_backtrack;

It should jump to the "expected_backtrack" instead of "current_backtrack" label in the "parentheses_tail" code block.

The state.linkAlternativeBacktracks(this, true) links to the "current_backtrack" at RegexJIT.cpp:1958.
I guess the desired place of this link is at RegexJIT.cpp:1902 where the notEnoughInputForPreviousAlternative label
is linked now.

I hope this information is useful.
Comment 5 Geoffrey Garen 2010-12-07 12:34:51 PST
<rdar://problem/8739597>
Comment 6 Michael Saboff 2010-12-07 15:51:55 PST
Testing fix.
Comment 7 Michael Saboff 2010-12-08 15:39:13 PST
Created attachment 75981 [details]
Patch to protect from prior backtrack label don't get overwritten
Comment 8 WebKit Commit Bot 2010-12-08 20:28:31 PST
Comment on attachment 75981 [details]
Patch to protect from prior backtrack label don't get overwritten

Rejecting patch 75981 from commit-queue.

Failed to run "['./WebKitTools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=eseidel-cq-sl', 'build', '--no-clean', '--no-update', '--build-style=both']" exit_code: 1
ERROR: Working directory has local commits, pass --force-clean to continue.

Full output: http://queues.webkit.org/results/6951002
Comment 9 Michael Saboff 2010-12-09 09:20:01 PST
Committed r73617: <http://trac.webkit.org/changeset/73617>
Comment 10 Michael Saboff 2010-12-09 09:40:58 PST
Some problem with the checkin.  The fix was first in <http://trac.webkit.org/changeset/73615> and then the change log was fixed in <http://trac.webkit.org/changeset/73617>.
Comment 11 WebKit Review Bot 2010-12-09 10:53:06 PST
http://trac.webkit.org/changeset/73617 might have broken Leopard Intel Release (Tests)
The following tests are not passing:
inspector/styles-source-offsets.html