RESOLVED FIXED 50579
Regular expression methods crashing browser (buffer overflow?)
https://bugs.webkit.org/show_bug.cgi?id=50579
Summary Regular expression methods crashing browser (buffer overflow?)
Alexei
Reported 2010-12-06 12:18:28 PST
Calling test() on regular expressions over some character length crashes WebKit. Here is an example regex that crashes every time: /indextools\.js|static\.scribefire\.com\/ads\.js|(static\.getclicky\.com\/|clicky\.js)|statisfy\.net\/javascripts\/stats\.js|gmodules.com\/|rate\.thummit\.com\/js\/|twitter\.com\/(javascripts\/[0-9a-z]+\.js|statuses\/user_timeline\/)|analytics\.live\.com\/|(pub\.lookery\.com\/js\/|lookery\.com\/look\.js|\/j\/pub\/look\.js)|google-analytics\.com\/(urchin\.js|ga\.js|__utm\.gif)|\.mybloglog\.com\/|(\.quantserve\.com\/|\/quant\.js)|sitemeter\.com\/(js\/counter\.js|meter\.asp)|www\.lijit\.com\/informers\/wijits|(\.1[12]2\.2o7\.net\/|\/hbx\.js|\/s_code[0-9a-zA-Z_-]*(\.[0-9a-zA-Z_-]*)?\.js|\.omtrdc\.net\/|omniunih\.js|\/(omniture|mbox)(.*)?\.js|common\.onset\.freedom\.com\/fi\/analytics\/cms\/)|cetrk\.com\//i I am using WebKit r73340.
Attachments
Patch to protect from prior backtrack label don't get overwritten (3.21 KB, patch)
2010-12-08 15:39 PST, Michael Saboff
darin: review+
commit-queue: commit-queue-
Alexei
Comment 1 2010-12-06 12:22:31 PST
An example call that will crash WebKit: /indextools\.js|static\.scribefire\.com\/ads\.js|(static\.getclicky\.com\/|clicky\.js)|statisfy\.net\/javascripts\/stats\.js|gmodules.com\/|rate\.thummit\.com\/js\/|twitter\.com\/(javascripts\/[0-9a-z]+\.js|statuses\/user_timeline\/)|analytics\.live\.com\/|(pub\.lookery\.com\/js\/|lookery\.com\/look\.js|\/j\/pub\/look\.js)|google-analytics\.com\/(urchin\.js|ga\.js|__utm\.gif)|\.mybloglog\.com\/|(\.quantserve\.com\/|\/quant\.js)|sitemeter\.com\/(js\/counter\.js|meter\.asp)|www\.lijit\.com\/informers\/wijits|(\.1[12]2\.2o7\.net\/|\/hbx\.js|\/s_code[0-9a-zA-Z_-]*(\.[0-9a-zA-Z_-]*)?\.js|\.omtrdc\.net\/|omniunih\.js|\/(omniture|mbox)(.*)?\.js|common\.onset\.freedom\.com\/fi\/analytics\/cms\/)|cetrk\.com\//i.test('http://google.com');
Alexey Proskuryakov
Comment 2 2010-12-06 17:01:57 PST
Could you please attach a crash log <http://webkit.org/quality/crashlogs.html>?
Peter Varga
Comment 3 2010-12-07 02:34:01 PST
I have analyzed the mentioned regex pattern. It didn't cause a crash on jsc for me, but the matching seems to run into an infinite loop. The pattern isn't a fallback case and the YARR Interpreter works well with this pattern so it seems there is a bug in the YARR JIT. The problem exists since http://trac.webkit.org/changeset/73307 (https://bugs.webkit.org/show_bug.cgi?id=50295). The problem surely isn't the length of pattern. I have created a more simple test case to reproduce this regression: /(a(b|c)(.*))|xxx/.test('aaa'); I haven't checked the generated JIT code yet.
Peter Varga
Comment 4 2010-12-07 07:47:16 PST
I checked the YARR JIT. I don't have a complete solution but I summarize the partial results of my investigation: Here is a more simple test case which I'm using for debugging: /a(b)(a*)|aaa/.test('aaa') The problem is if the matching of term 'b' fails then it resets the result of subpattern matching and it starts the matching from the beginning, but the index of position (edx) is never increased. Thus the JIT does the same character check again and again in an infinite loop. Wrong backtrack code block is executed since the backtrack logic extension was introduced. Here is a simple asm example from the generated code: match_b: cmpw $0x62,-0x2(%eax,%edx,2) jne parentheses_tail ... expected_backtrack: add $0x1,%edx jmp available_input ... current_backtrack: mov %edx,%ebx sub $0x2,%ebx mov %ebx,(%edi) jmp match_b available_input: mov %edx,%ebx sub $0x2,%ebx mov %ebx,(%edi) add $0x0,%edx cmp %ecx,%edx jbe match_b ... parentheses_tail: movl $0xffffffff,0x8(%edi) jmp current_backtrack; It should jump to the "expected_backtrack" instead of "current_backtrack" label in the "parentheses_tail" code block. The state.linkAlternativeBacktracks(this, true) links to the "current_backtrack" at RegexJIT.cpp:1958. I guess the desired place of this link is at RegexJIT.cpp:1902 where the notEnoughInputForPreviousAlternative label is linked now. I hope this information is useful.
Geoffrey Garen
Comment 5 2010-12-07 12:34:51 PST
Michael Saboff
Comment 6 2010-12-07 15:51:55 PST
Testing fix.
Michael Saboff
Comment 7 2010-12-08 15:39:13 PST
Created attachment 75981 [details] Patch to protect from prior backtrack label don't get overwritten
WebKit Commit Bot
Comment 8 2010-12-08 20:28:31 PST
Comment on attachment 75981 [details] Patch to protect from prior backtrack label don't get overwritten Rejecting patch 75981 from commit-queue. Failed to run "['./WebKitTools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=eseidel-cq-sl', 'build', '--no-clean', '--no-update', '--build-style=both']" exit_code: 1 ERROR: Working directory has local commits, pass --force-clean to continue. Full output: http://queues.webkit.org/results/6951002
Michael Saboff
Comment 9 2010-12-09 09:20:01 PST
Michael Saboff
Comment 10 2010-12-09 09:40:58 PST
Some problem with the checkin. The fix was first in <http://trac.webkit.org/changeset/73615> and then the change log was fixed in <http://trac.webkit.org/changeset/73617>.
WebKit Review Bot
Comment 11 2010-12-09 10:53:06 PST
http://trac.webkit.org/changeset/73617 might have broken Leopard Intel Release (Tests) The following tests are not passing: inspector/styles-source-offsets.html
Note You need to log in before you can comment on or make changes to this bug.