WebKit Bugzilla
RESOLVED FIXED
16076
DOMParser().parseFromString() freezes Safari when parsing large nodes with XML entities
https://bugs.webkit.org/show_bug.cgi?id=16076
Brian Kaull
Reported
2007-11-20 16:31:41 PST
Calling DOMParser.parseFromString can freeze Safari for a long time when the XML being parsed contains at least one node that is extremely large and has many XML-safe characters (i.e. the entity references "&amp;", "&lt;", "&gt;", "&quot;", "&apos;"). By large, I mean at least 10,000 repetitions of text. For example:

1. This is a long &amp; boring message.
2. This is a long &amp; boring message.
...
10000. This is a long &amp; boring message.

(new DOMParser()).parseFromString( xml, "text/xml" );

The larger the number of repetitions, the longer it takes to parse. Right now I'm seeing numbers such as:

5,000 reps = ~1 sec
10,000 reps = ~7 sec
20,000 reps = ~32 sec

If the text of the node doesn't have any XML-safe characters, the parsing runs very quickly, generally in less than a second.

I'm going to attach an example page that has several tests that parse large sample XML text. I have run this test on Firefox 2 (PC & Mac) and IE7, and the numbers for the tests there are never over 3 seconds, even for the largest test, 100,000 reps.

I've been testing with Safari 3.0.4 (523.12) and WebKit r27930.
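The attached test page is not reproduced in this thread, so the following is only a rough sketch of how input like the one described above might be generated and timed. The helper names `buildTestXml` and `timeParse`, and the `<root><message>` wrapper elements, are assumptions made for illustration and are not taken from the attachment.

```javascript
// Hypothetical reconstruction: the attached test page is not reproduced here.
// Builds one large text node containing `reps` numbered lines, each with an
// "&amp;" entity reference, as in the example in the report.
function buildTestXml(reps) {
  const lines = [];
  for (let i = 1; i <= reps; i++) {
    lines.push(i + ". This is a long &amp; boring message.");
  }
  // The wrapper element names are assumptions, chosen only to make the XML well-formed.
  return "<root><message>" + lines.join(" ") + "</message></root>";
}

// Times a single parse; `parse` is any function that accepts the XML text.
function timeParse(xml, parse) {
  const start = Date.now();
  parse(xml);
  return (Date.now() - start) / 1000; // seconds
}

// In a browser, the call from the report can be plugged in like this.
if (typeof DOMParser !== "undefined") {
  for (const reps of [5000, 10000, 20000]) {
    const xml = buildTestXml(reps);
    const secs = timeParse(xml, (text) =>
      new DOMParser().parseFromString(text, "text/xml")
    );
    console.log(reps + " reps: " + secs.toFixed(3) + " sec");
  }
}
```

Outside a browser the final loop is skipped, so the two helpers can still be exercised with any other parse function.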
Attachments
Slow XML Parsing Test (2.32 KB, text/html), 2007-11-20 16:37 PST, Brian Kaull, no flags
Patch (5.34 KB, patch), 2007-11-20 22:47 PST, Mark Rowe (bdash), mjs: review+
Brian Kaull
Comment 1
2007-11-20 16:37:54 PST
Created attachment 17422: Slow XML Parsing Test

These tests create large amounts of XML and parse that XML using:

(new DOMParser()).parseFromString( xml, "text/xml" );

Note: If the text of the node is wrapped in CDATA, the parsing times are extremely fast.
Note: If the text of the node is large and there are no XML-safe characters, the parsing times are very fast.
Maciej Stachowiak
Comment 2
2007-11-20 17:43:23 PST
This test case seems to show O(N^2) behavior:

5,000 - 0.682sec
10,000 - 6.547sec
20,000 - 33.04sec
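One way to sanity-check the scaling claim is to estimate a growth exponent from each pair of consecutive measurements. The sketch below assumes a simple power-law model (time proportional to reps^k), which is a modelling assumption and not something stated in the comment. The function name `growthExponents` is made up for this example. Linear scaling would give k close to 1, while the timings quoted above give exponents above 2.

```javascript
// Timings quoted in comment 2 (repetitions -> seconds).
const timings = [
  { reps: 5000, secs: 0.682 },
  { reps: 10000, secs: 6.547 },
  { reps: 20000, secs: 33.04 },
];

// Assuming time ~ reps^k, two measurements give
//   k = log(t2 / t1) / log(n2 / n1).
// Returns one estimate of k per consecutive pair of samples.
function growthExponents(samples) {
  const exponents = [];
  for (let i = 1; i < samples.length; i++) {
    const a = samples[i - 1];
    const b = samples[i];
    exponents.push(Math.log(b.secs / a.secs) / Math.log(b.reps / a.reps));
  }
  return exponents;
}

console.log(growthExponents(timings).map((k) => k.toFixed(2)));
```

With only three data points this is a rough indication of superlinear growth, not a precise fit.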
Maciej Stachowiak
Comment 3
2007-11-20 17:44:34 PST
<rdar://problem/5609579>
Mark Rowe (bdash)
Comment 4
2007-11-20 18:56:43 PST
A Shark profile shows that > 90% of the time is spent in memcpy. This appears to be due to the naive implementation of StringImpl::append(UChar*, unsigned) that allocates and copies on every append rather than growing more gracefully.
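To illustrate why copying on every append becomes so expensive, here is a small simulation that counts how many 16-bit code units get copied under each strategy. It is not WebKit code: `appendNaive`, `appendGeometric` and `makeChunks` are hypothetical names, and the typed arrays merely stand in for UChar buffers. Doubling the capacity is one common way of "growing more gracefully" and is assumed here, since the thread does not say which growth policy the fix uses.

```javascript
// Simulation only: models two buffer-growth strategies, not WebKit's actual code.
// Each function returns the final contents plus the number of 16-bit units copied.

// Reallocates to the exact new size and copies everything on every append.
function appendNaive(chunks) {
  let data = new Uint16Array(0);
  let copied = 0;
  for (const chunk of chunks) {
    const next = new Uint16Array(data.length + chunk.length);
    next.set(data, 0);            // copy all existing contents
    next.set(chunk, data.length); // copy the new chunk
    copied += data.length + chunk.length;
    data = next;
  }
  return { data, copied };
}

// Keeps spare capacity and doubles it when full, so old contents are rarely recopied.
function appendGeometric(chunks) {
  let buffer = new Uint16Array(16);
  let length = 0;
  let copied = 0;
  for (const chunk of chunks) {
    if (length + chunk.length > buffer.length) {
      let capacity = buffer.length;
      while (capacity < length + chunk.length) capacity *= 2;
      const next = new Uint16Array(capacity);
      next.set(buffer.subarray(0, length), 0); // copy existing contents once per regrowth
      copied += length;
      buffer = next;
    }
    buffer.set(chunk, length); // copy only the new chunk
    copied += chunk.length;
    length += chunk.length;
  }
  return { data: buffer.slice(0, length), copied };
}

// Builds `count` chunks of `size` code units each (letters A-Z, repeated).
function makeChunks(count, size) {
  const chunks = [];
  for (let i = 0; i < count; i++) {
    chunks.push(new Uint16Array(size).fill(65 + (i % 26)));
  }
  return chunks;
}

const chunks = makeChunks(1000, 4);
console.log("naive copies:", appendNaive(chunks).copied);
console.log("geometric copies:", appendGeometric(chunks).copied);
```

For n appends of k units each, the naive strategy copies k * n * (n + 1) / 2 units in total, which grows quadratically with n, while the doubling strategy stays within a small constant multiple of n * k.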
Adam Roben (:aroben)
Comment 5
2007-11-20 20:14:50 PST
(In reply to comment #4)
> A Shark profile shows that > 90% of the time is spent in memcpy. This appears
> to be due to the naive implementation of StringImpl::append(UChar*, unsigned)
> that allocates and copies on every append rather than growing more gracefully.

While fixing StringImpl::append would be nice, perhaps in this specific case we could use a Vector<UChar> and String::adopt(), as we do in other places in WebCore.
Mark Rowe (bdash)
Comment 6
2007-11-20 22:47:08 PST
Created attachment 17425: Patch

This patch causes the largest case in the test document to take around 0.18 seconds on my MacBook Pro, down from several minutes.
Mark Rowe (bdash)
Comment 7
2007-11-20 22:48:03 PST
I did not create a layout test because I'm unsure how this could be tested without being timing dependent. I'd rather not introduce another layout test that fails on slower machines or when run under guard malloc.
Maciej Stachowiak
Comment 8
2007-11-20 22:48:43 PST
Comment on attachment 17425: Patch

r=me
Alexey Proskuryakov
Comment 9
2007-11-20 23:11:17 PST
(In reply to comment #5)
> While fixing StringImpl::append would be nice,
I actually think that we should eventually get rid of it - no need to make String identical to Vector in implementation.
Mark Rowe (bdash)
Comment 10
2007-11-20 23:30:45 PST
Landed in r27936.
Lucas Forschler
Comment 11
2019-02-06 09:04:15 PST
Mass moving XML DOM bugs to the "DOM" Component.