Calling DOMParser.parseFromString can freeze safari for a long time when the xml that is trying to be parsed contains at least one node that is extremely large and has many xml-safe characters (ie. "&", "<", ">", """, "'"). By large, I mean at least 10,000 repetitions of text: For example 1. This is a long & boring message. 2. This is a long & boring message. ... 10000. This is a long & boring message. (new DOMParser()).parseFromString( xml, "text/xml" ); The larger the number of repitions, the longer it takes to parse. Right now I'm seeing numbers such as: 5,000 reps = ~1 sec 10,000 reps = ~7 sec 20,000 reps = ~32 sec If the text of the node doesn't have any xml safe characters, the parsing runs very quickly, generally less than a second. I'm going to attach an example page that has several tests that parse large sample xml text. I have run this test on Firefox 2(PC & Mac) and IE7 and the numbers for the tests there are never over 3 seconds for even the largest test, 100,000 reps. I've been testing with Safari 3.0.4 (523.12) and WebKit r27930
Created attachment 17422 [details] Slow XML Parsing Test These tests create large amounts of xml and parse that xml using: (new DOMParser()).parseFromString( xml, "text/xml" ); Note: If the text of the node is wrapped in CDATA the parsing times are extremely fast. Note: If the text of the node is large and there are no xml-safe characters the parsing times are very fast.
This test case seems to show O(N^2) behavior: 5,000 - 0.682sec 10,000 - 6.547sec 20,000 - 33.04sec
<rdar://problem/5609579>
A Shark profile shows that > 90% of the time is spent in memcpy. This appears to be due to the naive implementation of StringImpl::append(UChar*, unsigned) that allocates and copies on every append rather than growing more gracefully.
(In reply to comment #4) > A Shark profile shows that > 90% of the time is spent in memcpy. This appears > to be due to the naive implementation of StringImpl::append(UChar*, unsigned) > that allocates and copies on every append rather than growing more gracefully. While fixing StringImpl::append would be nice, perhaps in this specific case we could use a Vector<UChar> and String::adopt(), as we do in other places in WebCore.
Created attachment 17425 [details] Patch This patch causes the largest case in the test document to take around 0.18 seconds on my MacBook Pro, down from several minutes.
I did not create a layout test because I'm unsure how this could be tested without being timing dependent. I'd rather not introduce another layout test that fails on slower machines or when run under guard malloc.
Comment on attachment 17425 [details] Patch r=me
(In reply to comment #5) > While fixing StringImpl::append would be nice, I actually think that we should eventually get rid of it - no need to make String identical to Vector in implementation.
Landed in r27936.
Mass moving XML DOM bugs to the "DOM" Component.