16076 – DOMParser().parseFromString() freezes Safari when parsing large nodes with XML entities

https://bugs.webkit.org/show_bug.cgi?id=16076

Brian Kaull

Reported 2007-11-20 16:31:41 PST

Calling DOMParser.parseFromString() can freeze Safari for a long time when the XML being parsed contains at least one node that is extremely large and has many XML-safe characters (i.e. the escaped forms of "&", "<", ">", '"', and "'"). By large, I mean at least 10,000 repetitions of text. For example:

1. This is a long & boring message.
2. This is a long & boring message.
...
10000. This is a long & boring message.

(new DOMParser()).parseFromString( xml, "text/xml" );

The larger the number of repetitions, the longer it takes to parse. Right now I'm seeing numbers such as:

5,000 reps = ~1 sec
10,000 reps = ~7 sec
20,000 reps = ~32 sec

If the text of the node doesn't have any XML-safe characters, the parsing runs very quickly, generally in less than a second.

I'm going to attach an example page that has several tests that parse large sample XML text. I have run this test on Firefox 2 (PC & Mac) and IE7, and the times there are never over 3 seconds, even for the largest test, 100,000 reps.

I've been testing with Safari 3.0.4 (523.12) and WebKit r27930.

Attachments
Slow XML Parsing Test (2.32 KB, text/html) 2007-11-20 16:37 PST, Brian Kaull	no flags	Details
Patch (5.34 KB, patch) 2007-11-20 22:47 PST, Mark Rowe (bdash)	mjs: review+	Details Formatted Diff Diff
Brian Kaull

Comment 1 2007-11-20 16:37:54 PST

Created attachment 17422 [details]
Slow XML Parsing Test

These tests create large amounts of xml and parse that xml using:

(new DOMParser()).parseFromString( xml, "text/xml" );

Note: If the text of the node is wrapped in CDATA the parsing times are extremely fast.
Note: If the text of the node is large and there are no xml-safe characters the parsing times are very fast.
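The attached test page is not reproduced in this thread, so the following is only a minimal sketch of what such a test could look like. The element names `root` and `message` and the helpers `buildTestXml` and `timeParse` are hypothetical, and the sketch assumes the "&" in each line is written as `&amp;` in the XML source (or left raw inside CDATA), since that is what well-formed XML requires.

```javascript
// Hypothetical helper: builds a document with one large text node made of
// `reps` numbered lines. With useCdata = false the "&" is entity-escaped
// (the slow case in this report); with useCdata = true it is left raw
// inside a CDATA section (reported as fast).
function buildTestXml(reps, useCdata = false) {
  const lines = [];
  for (let i = 1; i <= reps; i++) {
    lines.push(`${i}. This is a long & boring message.`);
  }
  const raw = lines.join("\n");
  const body = useCdata
    ? `<![CDATA[${raw}]]>`
    : raw.replace(/&/g, "&amp;");
  return `<root><message>${body}</message></root>`;
}

// Hypothetical helper: returns the parse time in milliseconds, or null when
// DOMParser is unavailable (for example outside a browser).
function timeParse(xml) {
  if (typeof DOMParser === "undefined") {
    return null;
  }
  const start = Date.now();
  new DOMParser().parseFromString(xml, "text/xml");
  return Date.now() - start;
}

for (const reps of [5000, 10000, 20000]) {
  const ms = timeParse(buildTestXml(reps));
  console.log(`${reps} reps: ${ms === null ? "skipped (no DOMParser)" : ms + " ms"}`);
}
```

Passing `true` as the second argument to `buildTestXml` gives the CDATA variant, which makes it easy to compare the two cases described in the notes above.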

Maciej Stachowiak

Comment 2 2007-11-20 17:43:23 PST

This test case seems to show O(N^2) behavior:

5,000 - 0.682sec
10,000 - 6.547sec
20,000 - 33.04sec

Maciej Stachowiak

Comment 3 2007-11-20 17:44:34 PST

<rdar://problem/5609579>

Mark Rowe (bdash)

Comment 4 2007-11-20 18:56:43 PST

A Shark profile shows that > 90% of the time is spent in memcpy. This appears to be due to the naive implementation of StringImpl::append(UChar*, unsigned) that allocates and copies on every append rather than growing more gracefully.
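To illustrate why allocating and copying on every append produces the quadratic behavior noted in comment 2, here is a rough cost model. It is not WebKit code: `copiesWithNaiveAppend` is a hypothetical function, and the model assumes the text of the node reaches the string in many separate pieces (for instance one piece per entity), each of which triggers a full reallocation and copy.

```javascript
// Counts how many characters are copied in total when a string is built from
// `chunks` pieces of `chunkLen` characters each, and every append allocates
// a fresh buffer and copies the old contents plus the new piece into it.
function copiesWithNaiveAppend(chunks, chunkLen) {
  let length = 0; // characters accumulated so far
  let copied = 0; // total characters copied across all appends
  for (let i = 0; i < chunks; i++) {
    copied += length + chunkLen; // old contents + new piece go into the new buffer
    length += chunkLen;
  }
  return copied;
}

// Doubling the number of pieces roughly quadruples the copying work.
const small = copiesWithNaiveAppend(1000, 40);
const large = copiesWithNaiveAppend(2000, 40);
console.log(`copy ratio when input doubles: ${(large / small).toFixed(2)}`);
```

Under this model the total is chunkLen * chunks * (chunks + 1) / 2, so the copying cost grows with the square of the number of appends even though the final string only grows linearly.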

Adam Roben (:aroben)

Comment 5 2007-11-20 20:14:50 PST

(In reply to comment #4)
> A Shark profile shows that > 90% of the time is spent in memcpy. This appears
> to be due to the naive implementation of StringImpl::append(UChar*, unsigned)
> that allocates and copies on every append rather than growing more gracefully.

While fixing StringImpl::append would be nice, perhaps in this specific case we could use a Vector<UChar> and String::adopt(), as we do in other places in WebCore.
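The idea of accumulating into a growable buffer and converting to a string once at the end can be sketched as follows. This is a JavaScript analogy only, not the WebCore Vector<UChar> or String::adopt() API and not the patch that landed: the `GrowableBuffer` class, its initial capacity of 16, and its doubling policy are all assumptions made for illustration.

```javascript
// Hypothetical growable buffer of UTF-16 code units. Capacity doubles when
// it runs out, so reallocation copies stay proportional to the final length.
class GrowableBuffer {
  constructor() {
    this.data = new Uint16Array(16); // initial capacity is an arbitrary choice
    this.length = 0;                 // code units in use
    this.copied = 0;                 // total code units written or moved
  }

  append(text) {
    const needed = this.length + text.length;
    if (needed > this.data.length) {
      let capacity = this.data.length;
      while (capacity < needed) {
        capacity *= 2;
      }
      const grown = new Uint16Array(capacity);
      grown.set(this.data.subarray(0, this.length)); // move existing contents once
      this.copied += this.length;
      this.data = grown;
    }
    for (let i = 0; i < text.length; i++) {
      this.data[this.length + i] = text.charCodeAt(i);
    }
    this.copied += text.length;
    this.length = needed;
  }

  // Converts to a string once, in slices to keep argument lists small.
  toString() {
    const SLICE = 8192;
    let out = "";
    for (let i = 0; i < this.length; i += SLICE) {
      const end = Math.min(i + SLICE, this.length);
      out += String.fromCharCode.apply(null, this.data.subarray(i, end));
    }
    return out;
  }
}

const buffer = new GrowableBuffer();
for (let i = 0; i < 1000; i++) {
  buffer.append("0123456789");
}
console.log(`final length ${buffer.length}, characters copied ${buffer.copied}`);
```

With doubling, the total copying stays within a small constant multiple of the final length, in contrast to reallocating on every append.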

Mark Rowe (bdash)

Comment 6 2007-11-20 22:47:08 PST

Created attachment 17425 [details]
Patch

This patch causes the largest case in the test document to take around 0.18 seconds on my MacBook Pro, down from several minutes.

Mark Rowe (bdash)

Comment 7 2007-11-20 22:48:03 PST

I did not create a layout test because I'm unsure how this could be tested without being timing dependent. I'd rather not introduce another layout test that fails on slower machines or when run under guard malloc.

Maciej Stachowiak

Comment 8 2007-11-20 22:48:43 PST

Comment on attachment 17425 [details]
Patch

r=me

Alexey Proskuryakov

Comment 9 2007-11-20 23:11:17 PST

(In reply to comment #5)
> While fixing StringImpl::append would be nice,

I actually think that we should eventually get rid of it - no need to make String identical to Vector in implementation.

Mark Rowe (bdash)

Comment 10 2007-11-20 23:30:45 PST

Landed in r27936.

Lucas Forschler

Comment 11 2019-02-06 09:04:15 PST

Mass moving XML DOM bugs to the "DOM" Component.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Mac (Intel)

OS OS X 10.4

Product WebKit

Component DOM

Assignee

Mark Rowe (bdash)

Reported

2007-11-20 16:31 PST

Modified

2019-02-06 09:04 PST

CC List

3 users Show

Keywords HasReduction, InRadar, YahooBug
