Bug 18282

Summary:

WebKit is very slow in loading deeply nested divs

Product:

WebKit

Reporter:

Brett Wilson (Google) <brettw>

Component:

DOM

Assignee:

Nobody <webkit-unassigned>

Status:

NEW

Severity:

Normal

CC:

ahmad.saleem792, ap, bfulgham, cdumez, darin, ddkilzer, eric, ggaren, gustavo, hyatt, kamaji, koivisto, laszlo.gombos, mal, mitz, rniwa, tkent, tonikitoo, vidavera, webkit.review.bot, xan.lopez, zalan

Priority:

Keywords:

HasReduction, InRadar

Version:

525.x (Safari 3.1)

Hardware:

All

OS:

All

Bug Depends on:

37247

Bug Blocks:

Attachments:

Description	Flags
Test case	none
Test case (crashes both mac and windows)	none
Parameterized version of Eric Seidel's divs.html	none
backtrace of qtwebkit failure on divs.html?l=3500 with 256K max stack	none
Patch providing an optional maximum depth cap to the DOM tree, enabled using --enable-domtree-maxdepth	none
Proposed patch by tkent	none
Proposed patch by tkent (rev.2)	none
Proposed patch by tkent (rev.3)	none
Performance checker	none
Proposed patch by tkent (rev.4)	none
Parser-only patch	none
Parser-only patch (rev.2)	none
Parser-only patch (rev.3)	none
Parser-only patch (rev.4)	none
Parser-only patch (rev.5)	none
Parser-only patch (rev.6)	none

Brett Wilson (Google)

Reported 2008-04-02 09:58:18 PDT

I only tested on Safari for Windows. It could be blowing out the stack.

Attachments
Test case (136.74 KB, text/html) 2008-04-02 09:58 PDT, Brett Wilson (Google)	no flags	Details
Test case (crashes both mac and windows) (273 bytes, text/html) 2008-04-03 13:05 PDT, Eric Seidel (no email)	no flags	Details
Parameterized version of Eric Seidel's divs.html (731 bytes, text/html) 2009-10-16 14:09 PDT, Keith Kyzivat	no flags	Details
backtrace of qtwebkit failure on divs.html?l=3500 with 256K max stack (86.86 KB, application/x-bzip2) 2009-10-16 15:51 PDT, Keith Kyzivat	no flags	Details
Patch providing an optional maximum depth cap to the DOM tree, enabled using --enable-domtree-maxdepth (12.98 KB, patch) 2009-11-24 13:17 PST, Keith Kyzivat	no flags	Details Formatted Diff Diff
Proposed patch by tkent (24.17 KB, patch) 2010-03-15 00:52 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
Proposed patch by tkent (rev.2) (24.18 KB, patch) 2010-03-15 19:52 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
Proposed patch by tkent (rev.3) (31.67 KB, patch) 2010-03-17 23:29 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
Performance checker (890 bytes, text/html) 2010-03-31 04:14 PDT, Kent Tamura	no flags	Details
Proposed patch by tkent (rev.4) (11.81 KB, patch) 2010-03-31 23:33 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
Parser-only patch (16.23 KB, patch) 2010-04-07 23:20 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
Parser-only patch (rev.2) (15.88 KB, patch) 2010-04-07 23:50 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
Parser-only patch (rev.3) (15.89 KB, patch) 2010-04-08 01:04 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
Parser-only patch (rev.4) (10.82 KB, patch) 2010-04-08 03:22 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
Parser-only patch (rev.5) (11.19 KB, patch) 2010-04-24 10:50 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
Parser-only patch (rev.6) (18.15 KB, patch) 2010-04-24 11:15 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
Show Obsolete (10) View All Add attachment proposed patch, testcase, etc.

Brett Wilson (Google)

Comment 1 2008-04-02 09:58:35 PDT

Created attachment 20293 [details] Test case

Eric Seidel (no email)

Comment 2 2008-04-02 10:51:49 PDT

Reproducible crashers are P1. http://webkit.org/quality/bugpriorities.html

Matt Lilek

Comment 3 2008-04-02 11:12:05 PDT

This only happens on Windows and VS tells me that it does indeed blow out the stack: > WebKit.dll!WebCore::WidthIterator::advance(int offset=3, WebCore::GlyphBuffer * glyphBuffer=0x00000000) Line 111 + 0x6 bytes C++ WebKit.dll!WebCore::Font::floatWidthForSimpleText(const WebCore::TextRun & run={...}, WebCore::GlyphBuffer * glyphBuffer=0x00000000) Line 729 C++ WebKit.dll!WebCore::Font::floatWidth(const WebCore::TextRun & run={...}) Line 707 + 0xe bytes C++ WebKit.dll!WebCore::Font::width(const WebCore::TextRun & run={...}) Line 512 + 0xc bytes C++ WebKit.dll!WebCore::RenderText::widthFromCache(const WebCore::Font & f={...}, int start=0, int len=3, int xPos=0) Line 448 C++ WebKit.dll!WebCore::RenderText::calcPrefWidths(int leadWidth=0) Line 648 + 0x1b bytes C++ WebKit.dll!WebCore::RenderText::maxPrefWidth() Line 546 + 0x14 bytes C++ WebKit.dll!WebCore::RenderText::width(unsigned int from=0, unsigned int len=3, const WebCore::Font & f={...}, int xPos=0) Line 1062 + 0x12 bytes C++ WebKit.dll!WebCore::RenderBlock::findNextLineBreak(WebCore::BidiResolver<WebCore::BidiIterator,WebCore::BidiRun> & start={...}, WebCore::EClear * clear=0x00033560) Line 1891 + 0x31 bytes C++ WebKit.dll!WebCore::RenderBlock::layoutInlineChildren(bool relayoutChildren=true, int & repaintTop=0, int & repaintBottom=0) Line 890 + 0x1d bytes C++ WebKit.dll!WebCore::RenderBlock::layoutBlock(bool relayoutChildren=true) Line 581 C++ WebKit.dll!WebCore::RenderBlock::layout() Line 491 + 0x14 bytes C++ WebKit.dll!WebCore::RenderBlock::layoutBlockChildren(bool relayoutChildren=true, int & maxFloatBottom=0) Line 1228 + 0x12 bytes C++ WebKit.dll!WebCore::RenderBlock::layoutBlock(bool relayoutChildren=true) Line 585 C++ [snip]

Brett Wilson (Google)

Comment 4 2008-04-02 12:48:01 PDT

It seems like if this is a stack problem, it would have to happen on OSX as well. Does it just have a higher maximum stack or something?

Eric Seidel (no email)

Comment 5 2008-04-03 13:05:25 PDT

Created attachment 20319 [details] Test case (crashes both mac and windows)

Mark Rowe (bdash)

Comment 6 2008-04-03 14:53:32 PDT

<rdar://problem/5841541>

Eric Seidel (no email)

Comment 7 2008-07-03 22:48:18 PDT

*** Bug 18350 has been marked as a duplicate of this bug. ***

Alexey Proskuryakov

Comment 8 2009-08-13 15:49:09 PDT

See also: bug 28245.

Keith Kyzivat

Comment 9 2009-10-16 14:09:33 PDT

Created attachment 41325 [details] Parameterized version of Eric Seidel's divs.html Parameterizes Eric Seidel's divs.html

Keith Kyzivat

Comment 10 2009-10-16 15:51:20 PDT

Created attachment 41336 [details] backtrace of qtwebkit failure on divs.html?l=3500 with 256K max stack

Keith Kyzivat

Comment 11 2009-10-16 15:53:03 PDT

Not surprisingly, this problem also occurs with other nested elements. I uncovered this in a page with deeply nested lists, and again reproduced it with Eric Seidel's divs.html attachment. This bug particularly affects mobile devices, as these devices typically are low-memory, and have a maximum stack size set to orders of magnitude smaller than desktop webkit uses. The maximum element depth before crash on these devices is significantly smaller - typically hundreds of levels deep instead of thousands. Reproduced this with latest r49687 webkit sources under Linux using QtWebkit. Attached is a backtrace from a run of QtWebkit rendering nested divs 3500 deep, with maximum stack set to 256KBytes (ulimit -s 256). Also attached is a modified version of Eric Seidel's divs.html, adding parameterization of the number of levels, which I used in this testing.

Keith Kyzivat

Comment 12 2009-11-24 10:32:30 PST

(repeated from bug 30898) I have a proposed patch that I've had done for a while now, was just trying to think if there would be any better way to do it. It caps the tree in the parser when parser adds a node, and also in Javascript when a node is appended. There is no memory consumption increase, but there is a hit in performance, since depth of node to be added/appended to is calculated, for each add/append. Since there is a performance hit, I added a build flag to turn it on, and specify the maximum depth. Under standard desktop browsers with large max stack size, this can be omitted, since the maximum depth is generally in the hundreds of thousands. For low-memory devices, this is a bigger deal, and thus they may configure webkit with --enable-domtree-maxdepth=<value>. Patch coming shortly.

Keith Kyzivat

Comment 13 2009-11-24 13:17:22 PST

Created attachment 43803 [details] Patch providing an optional maximum depth cap to the DOM tree, enabled using --enable-domtree-maxdepth Attached is the patch I have created to cap the DOM tree when a node is added (from HTML parse), or appended to (Javascript). This functionality does not add any memory overhead, however it does add performance overhead. Because of this, I have made this an optional feature that must be turned on using --enable-domtree-maxdepth. To configure the maximum depth, one changes the MAX_DOM_TREE_DEPTH define in WebCore/config.h To more quickly test this on desktop browsers, one should spawn a subshell, and run ulimit -s 256 (under Linux, under Mac it may be different) before testing one of the above HTML attachments that reproduce the bug.

Kent Tamura

Comment 14 2009-11-24 18:14:22 PST

*** Bug 30898 has been marked as a duplicate of this bug. ***

Eric Seidel (no email)

Comment 15 2009-11-25 08:15:59 PST

Comment on attachment 43803 [details] Patch providing an optional maximum depth cap to the DOM tree, enabled using --enable-domtree-maxdepth Can't commit with "OOPS" in a ChangeLog. The "Reviewed by" line will be fixed by our landing scripts, but: No new tests. (OOPS!) will cause the commit to fail. The parsing case could be made quick by caching the current level of the parse tree. I'm not sure I'm a big fan of this change, but I don't really know what to suggest. r- because the ChangeLog would cause the commit to fail. Testing for this is easy. There should be a test which validates that we can't add below 400 items. Obviously that test only makes sense if this feature is enabled, and would probably be skipped on all platforms except whichever plans to use this. Speakign of which, what platform would use this?

Keith Kyzivat

Comment 16 2009-11-25 09:35:58 PST

> No new tests. (OOPS!) will cause the commit to fail. I left it in there on purpose, as I figured you would come back wanting me to add tests for this commit -- just wanted some input on how I should reasonably add tests for this. > The parsing case could be made quick by caching the current level of the parse tree. How exactly can that be accomplished? I'm not familiar with this. (a pointer to some reading material is fine) > I'm not sure I'm a big fan of this change, but I don't really know what to suggest. I'm all ears -- if someone else can think of a reasonable way to prevent stack overflow, that'd be excellent. Could be done at render-time too, but I couldn't think of a way to figure out when we're getting close to exceeding the maximum stack size. > Speakign of which, what platform would use this? Embedded devices with small maximum stack size -- specifically, in my case, Nokia S40 phones. Currently, since S40 phones don't have process separation, if the stack overflows, it causes the phone to reset. I want to prevent that from happening. Under S60 phones the maximum stack is also relatively small, but when stack is exceeded it just takes down the browser, not the OS. Under Android, iPhone/iPod touch, and also desktop webkit-based browsers, this bug can exhibit itself, but, since the maximum stack size is quite large, the element depth to reproduce the bug is ludicrously large -- on the order of 100,000 elements deep, therefore it isn't really a big deal on those browsers -- users would have long since hit the stop button before the problem would manifest itself.

Eric Seidel (no email)

Comment 17 2009-11-30 22:42:43 PST

(In reply to comment #16) > > No new tests. (OOPS!) will cause the commit to fail. > I left it in there on purpose, as I figured you would come back wanting me to > add tests for this commit -- just wanted some input on how I should reasonably > add tests for this. > > > The parsing case could be made quick by caching the current level of the parse tree. > How exactly can that be accomplished? I'm not familiar with this. (a pointer > to some reading material is fine) http://trac.webkit.org/browser/trunk/WebCore/html/HTMLParser.h#L162 the "block stack" might also be able to cache this information, I'm not sure. > > Speakign of which, what platform would use this? > Embedded devices with small maximum stack size -- specifically, in my case, > Nokia S40 phones. Currently, since S40 phones don't have process separation, > if the stack overflows, it causes the phone to reset. I want to prevent that > from happening. Under S60 phones the maximum stack is also relatively small, > but when stack is exceeded it just takes down the browser, not the OS. > Under Android, iPhone/iPod touch, and also desktop webkit-based browsers, this > bug can exhibit itself, but, since the maximum stack size is quite large, the > element depth to reproduce the bug is ludicrously large -- on the order of > 100,000 elements deep, therefore it isn't really a big deal on those browsers > -- users would have long since hit the stop button before the problem would > manifest itself. OK, that makes a lot more sense. I can see how devices could be more sensitive to this and might want to make the speed tradeoff to prevent bringing down the whole OS. I think you still want to optimize the HTML parsing case if possible, but I think your approach of using a stack crawl instead of caching the value on Node is probably the right one.

Kent Tamura

Comment 18 2010-03-04 18:06:39 PST

Hi Keith, Are you working on updating the patch? If not, I (or someone else) will take it over.

Kent Tamura

Comment 19 2010-03-11 01:36:33 PST

We had better add another flag to enable the check by parsers. ENABLE_DOMTREE_MAXDEPTH seems to have performance regression and performance-sensitive browsers won't enable it. However the check by parsers won't have significant performance regression and performance-sensitive browsers will enable it.

Kent Tamura

Comment 20 2010-03-15 00:52:58 PDT

Created attachment 50688 [details] Proposed patch by tkent

Eric Seidel (no email)

Comment 21 2010-03-15 12:57:12 PDT

Comment on attachment 50688 [details] Proposed patch by tkent Does Chromium want this? Aren't ports going to want to control MAX_DOM_TREE_DEPTH? I guess they can already by setting both ENABLE and MAX_DOM_TREE_DEPTH. The implementation looks OK. Are you sure this is the order you want? +#if ENABLE(DOMTREE_PARSING_MAXDEPTH) + if (m_nodeDepth > MAX_DOM_TREE_DEPTH) + return; +#endif exitText(); + handleError(nonFatal, "Too deep tree.", lineNumber(), columnNumber()); "DOM tree is too deep" would be more clear to my eyes. What sort of message does FireFox provide here? I'm pretty sure they have a limit.

Kent Tamura

Comment 22 2010-03-15 19:23:37 PDT

(In reply to comment #21) > (From update of attachment 50688 [details]) > Does Chromium want this? Yes. I'll enable DOMTREE_PARSING_MAXDEPTH. > > Aren't ports going to want to control MAX_DOM_TREE_DEPTH? I guess they can > already by setting both ENABLE and MAX_DOM_TREE_DEPTH. Do you mean no configure.ac change for MAX_DOM_TREE_DEPTH? > Are you sure this is the order you want? > +#if ENABLE(DOMTREE_PARSING_MAXDEPTH) > + if (m_nodeDepth > MAX_DOM_TREE_DEPTH) > + return; > +#endif > exitText(); Yes. In parseStartElement(). the check should be put after exitText(), and the check should be put before exitText(). It's because exitText() in parseStartElement() is for text *before* the start tag. > + handleError(nonFatal, "Too deep tree.", lineNumber(), columnNumber()); > > "DOM tree is too deep" would be more clear to my eyes. What sort of message > does FireFox provide here? I'm pretty sure they have a limit. Ok, I'll update the message. Firefox has a limit for HTML DOM: https://hg.mozilla.org/mozilla-central/file/050887c64183/parser/htmlparser/src/nsHTMLTokenizer.cpp#l382 but it doesn't show any message like WebKit doesn't for HTML. I couldn't find a limit for XML parsing in Firefox.

Kent Tamura

Comment 23 2010-03-15 19:24:59 PDT

> Yes. In parseStartElement(). the check should be put after exitText(), and the > check should be put before exitText(). Correction: .. the check should be put before exitText() in other functions.

Kent Tamura

Comment 24 2010-03-15 19:52:21 PDT

Created attachment 50755 [details] Proposed patch by tkent (rev.2)

Eric Seidel (no email)

Comment 25 2010-03-15 20:53:16 PDT

OK, so if Chromium wants this, why don't all ports want this?

Kent Tamura

Comment 26 2010-03-15 21:14:40 PDT

(In reply to comment #25) > OK, so if Chromium wants this, why don't all ports want this? Ah, I see. We don't need to make ENABLE_DOMTREE_PARSING_MAXDEPTH optional. I'll remove ENABLE_DOMTREE_PARSING_MAXDEPTH and make it work by default later if no one objects.

Kent Tamura

Comment 27 2010-03-17 23:29:48 PDT

Created attachment 51008 [details] Proposed patch by tkent (rev.3)

Eric Seidel (no email)

Comment 28 2010-03-31 00:25:01 PDT

Comment on attachment 51008 [details] Proposed patch by tkent (rev.3) This looks really expensive. I'm certain this must cause a performance regression on the PLT. Have you run any performance testing?

Eric Seidel (no email)

Comment 29 2010-03-31 00:25:55 PDT

(In reply to comment #19) > We had better add another flag to enable the check by parsers. > ENABLE_DOMTREE_MAXDEPTH seems to have performance regression and > performance-sensitive browsers won't enable it. > However the check by parsers won't have significant performance regression and > performance-sensitive browsers will enable it. What browser is not performance sensitive? Certainly both Chrome and Safari are...

Maciej Stachowiak

Comment 30 2010-03-31 01:09:18 PDT

Comment on attachment 51008 [details] Proposed patch by tkent (rev.3) 1) I don't think this fix should be optional. Ports should not have to choose between a crash fix and a possible performance hit. We should have one code path that both has adequate performance and does not crash. 2) A number of the DOM methods now do if (atMaxTreeDepth()) return 0;. It seems to me that they should set the exception code too (I am not sure offhand which DOM exception code is most appropriate). I suggest trying some DOM benchmarks, such as <http://dromaeo.com/?dom> and <http://nontroppo.org/timer/Hixie_DOM.html>. 3) Because node insertion has become O(N), it's possible that building up a deep tree using DOM APIs alone will trigger O(N^2) behavior. Almost any time we've had N^2 algorithms, we have run into some site that hangs as a result. 4) MAX_DOM_TREE_DEPTH should be defined as a const size_t rather than a preprocessor macro. 5) 4096 seems like a fairly low limit. Is that the best we can do? 6) cMaxBlockDepth shouldn't be defined as MAX_DOM_TREE_DEPTH; the fact that it's the same constant is a coincidence, not fundamental. 7) In the parser, it looks to me like you repurposed m_blocksInStack to be a count of total nesting level, rather than block nesting level. That seems dubious to me for a couple of reasons: (a) The name is now wrong. If this change is otherwise right, the variable should be renamed (or just use the dom tree depth constant directly). (b) Checks relating to block depth just go off of tag depth, but that seems wrong to me; I suspect this breaks the intent of the block checks. For example, non-blocks could now cause blocks to be popped. (c) I think it would be good to test a case of blocks and a case of mixed blocks and inlines. It seems like your test case only covers spans, which are an inline element 8) Like Eric, I would like to see testing of page load speed. r- to address above comments.

Kent Tamura

Comment 31 2010-03-31 02:20:04 PDT

I have had Dromaeo DOM results. But the results make almost no sense. The appendChild and insertBefore tests of Dromaeo check times to insert elements to document.body. So the results have very little impact of ENABLE_DOMTREE_RUNTIME_MAXDEPTH. Dromaeo DOM test results: Safari with WebKit r56828: http://dromaeo.com/?id=98801 createElement: 531.19runs/s ±0.79% createTextNode: 264.63runs/s ±0.78% innerHTML: 79.55runs/s ±0.96% cloneNode: 148.10runs/s ±1.73% appendChild: 1046.00runs/s ±0.15% insertBefore: 981.40runs/s ±0.46% Safari with WebKit r56828 + ENABLE_DOMTREE_RUNTIME_MAXDEPTH: http://dromaeo.com/?id=98804 createElement: 513.69runs/s ±0.87% createTextNode: 261.68runs/s ±1.03% innerHTML: 78.74runs/s ±0.58% cloneNode: 143.97runs/s ±2.30% appendChild: 999.20runs/s ±0.16% insertBefore: 988.80runs/s ±0.22%

Kent Tamura

Comment 32 2010-03-31 02:28:58 PDT

(In reply to comment #31) > I have had Dromaeo DOM results. But the results make almost no sense. > The appendChild and insertBefore tests of Dromaeo check times to insert > elements to document.body. So the results have very little impact of > ENABLE_DOMTREE_RUNTIME_MAXDEPTH. Hixie_DOM is not suitable too. Hixie_DOM tests insertions to a child of the body. Hmm, we should make new tests.

Kent Tamura

Comment 33 2010-03-31 02:33:00 PDT

(In reply to comment #28) > (From update of attachment 51008 [details]) > This looks really expensive. I'm certain this must cause a performance > regression on the PLT. Have you run any performance testing? What's 'PLT'?

Kent Tamura

Comment 34 2010-03-31 02:35:16 PDT

(In reply to comment #30) > 5) 4096 seems like a fairly low limit. Is that the best we can do? I don't think it's low. It's very higher than Firefox's depth limit. Firefox's depth limit is 200. https://hg.mozilla.org/mozilla-central/file/050887c64183/parser/htmlparser/src/nsHTMLTokenizer.cpp#l382

Kent Tamura

Comment 35 2010-03-31 04:14:00 PDT

Created attachment 52149 [details] Performance checker With this test code: Iteration=4000 depth=200 No patch: 4,633 DOMTREE_RUNTIME_MAXDEPTH: 4,878 (4.6% time gain) Iteration=500 depth=1000 No patch: 7,353 DOMTREE_RUNTIME_MAXDEPTH: 8,023 (9.1% time gain) Iteration=30 depth=4000 No patch: 5,747 DOMTREE_RUNTIME_MAXDEPTH: 6,662 (15.9% time gain) Hmm, DOMTREE_RUNTIME_MAXDEPTH is not acceptable. I have an idea to improve performance. How about introducing just one cache of a node depth? Suppose that: Document has m_lastDepth and m_lastDepthNode. They have information about a parent node which was used in the last node insertion. When another child node is inserted to a parent node P, - If P is m_lastDepthNode, m_lastDepth is the depth - If P's parent is m_lastDepthNode, m_lastDepth++ and m_lastDepthNode = P - If m_lastDepthNode's parent is P, m_lastDepth-- and m_lastDepthNode = P - If P's parent is m_lastDepthNode's parent, m_lastDepthNode = P This reduces a lot of depth calculation.

Kent Tamura

Comment 36 2010-03-31 23:33:44 PDT

Created attachment 52263 [details] Proposed patch by tkent (rev.4)

Kent Tamura

Comment 37 2010-03-31 23:46:58 PDT

(In reply to comment #36) > Created an attachment (id=52263) [details] > Proposed patch by tkent (rev.4) This patch contains a change for cap by DOM operation, and removed a change for cap by parsers because the DOM operation got much faster than the past patches. fast/dom/tree-depth-cap.html takes 2.93sec on my machine, and fast/parser/tree-depth-cap.html takes 0.47sec. I think fast/dom/tree-depth-cap.html is not acceptable to check in. * Performance ** Doromaeo DOM - No patch http://dromaeo.com/?id=98854 createElement: 545.00runs/s ±0.70% createTextNode: 272.29runs/s ±1.08% innerHTML: 78.90runs/s ±0.47% cloneNode: 146.59runs/s ±2.33% appendChild: 1023.00runs/s ±0.49% insertBefore: 958.60runs/s ±0.54% - With this patch http://dromaeo.com/?id=98855 createElement: 532.88runs/s ±0.89% * This should not have performance difference. createTextNode: 263.83runs/s ±0.95% * ditto. innerHTML: 78.56runs/s ±0.32% cloneNode: 143.58runs/s ±5.47% appendChild: 1059.60runs/s ±0.13% insertBefore: 998.40runs/s ±0.64% ** Performance checker in this bug Iteration=4,000 depth=200 No patch: 4,808 This patch: 4.982 (3.6% time gain) Iteration=500 depth=1,000 No patch: 7,160 This patch: 7,315 (2.1% time gain) Iteration=30 depth=4,000 No patch: 5,623 This patch: 5,706 (1.4% time gain)

Dave Hyatt

Comment 38 2010-04-01 11:43:04 PDT

Comment on attachment 52263 [details] Proposed patch by tkent (rev.4) I'm concerned about the performance impact of this change. It seems like in the typical case of just doing normal DOM insertions, you've made an O(1) operation into an O(depth) operation. What about the insertion/appending of document fragments? Do they fool your depth counting?

Kent Tamura

Comment 39 2010-04-02 02:41:51 PDT

(In reply to comment #38) > (From update of attachment 52263 [details]) > I'm concerned about the performance impact of this change. It seems like in > the typical case of just doing normal DOM insertions, you've made an O(1) > operation into an O(depth) operation. It would be O(1) in typical cases and O(depth) at most with my latest patch. But ... > What about the insertion/appending of document fragments? Do they fool your > depth counting? Ah, it can avoid the limitation. I have no good idea to limit the depth in a case of inserting a tree into another tree. Does anyone have a good idea?

Kent Tamura

Comment 40 2010-04-07 23:20:26 PDT

Created attachment 52832 [details] Parser-only patch

Kent Tamura

Comment 41 2010-04-07 23:50:06 PDT

Created attachment 52840 [details] Parser-only patch (rev.2)

Kent Tamura

Comment 42 2010-04-07 23:55:17 PDT

I have no idea of a complete and efficient algorithm to solve this problem in DOM operations. So I focus on the parser solution though it can't solve this problem completely. (In reply to comment #30) > (From update of attachment 51008 [details]) > 4) MAX_DOM_TREE_DEPTH should be defined as a const size_t rather than a > preprocessor macro. I added "size_t maxDomTreeDepth" in config.h. I kept MAX_DOM_TREE_DEPTH to change this value by a compiler flag, and I didn't wrap it with "namespace WebCobre" because config.h is used for .c too. > 5) 4096 seems like a fairly low limit. Is that the best we can do? As I wrote in another comment, it's very high. I counted the maximum depth of Gmail message view. It's about 35 + <tree depth in an HTML message>. However, I changed the default to 5000 because the existing block depth cap doesn't work if MAX_DOM_TREE_DEPTH is <= 4096. > 6) cMaxBlockDepth shouldn't be defined as MAX_DOM_TREE_DEPTH; the fact that > it's the same constant is a coincidence, not fundamental. Fixed. > 7) In the parser, it looks to me like you repurposed m_blocksInStack to be a > count of total nesting level, rather than block nesting level. That seems > dubious to me for a couple of reasons: I added another depth counter. So this issue was resolved.

WebKit Review Bot

Comment 43 2010-04-08 00:49:22 PDT

Attachment 52840 [details] did not build on gtk: Build output: http://webkit-commit-queue.appspot.com/results/1653244

Kent Tamura

Comment 44 2010-04-08 01:04:32 PDT

Created attachment 52847 [details] Parser-only patch (rev.3)

Kent Tamura

Comment 45 2010-04-08 01:53:45 PDT

Comment on attachment 52847 [details] Parser-only patch (rev.3) r57263 (Bug#37247) fixed the XML parser part. I'll remove the XML parser change from the patch.

Kent Tamura

Comment 46 2010-04-08 03:22:14 PDT

Created attachment 52852 [details] Parser-only patch (rev.4)

Darin Adler

Comment 47 2010-04-09 09:14:19 PDT

Comment on attachment 52852 [details] Parser-only patch (rev.4) > +#ifndef config_h > +#define config_h The config.h file is not a normal header file, and should not have a header guard in it. We can discuss this further elsewhere, but please don't add this in this patch without further discussion. > +static const unsigned maxDOMTreeDepth = MAX_DOM_TREE_DEPTH; It's not appropriate to have a "static const" in the config.h file. I didn't review the rest of the patch, but these are both no-nos. I don't think this value should go in config.h. It should go in a header file included only by XMLTokenizer.cpp and HTMLParser.cpp. We don't want this to have any effect on the rest of the source code. We could add a new source file called TreeDepthLimit.h in the dom directory. I think that m_nodeDepth is a strange name; is it really the depth of a node? The patch is risky. If we find that in some cases the node depth gets off, then we could either underflow to 0 or slowly creep up to the maximum depth and break parsing. So we need to make sure we are testing all the parsing code paths. It would help to have an assertion that the depth is back to zero once parsing is done. Running the regression tests with such an assert in place could help us catch any mismatches that otherwise would be silent. Besides that, I think we need to do some performance testing, but otherwise, this patch looks good to me.

Darin Adler

Comment 48 2010-04-09 09:14:52 PDT

(In reply to comment #47) > I didn't review the rest of the patch, but these are both no-nos. Heh, I did review the rest of the patch. Sorry.

Kent Tamura

Comment 49 2010-04-24 10:50:05 PDT

Created attachment 54220 [details] Parser-only patch (rev.5)

Kent Tamura

Comment 50 2010-04-24 11:15:38 PDT

Created attachment 54221 [details] Parser-only patch (rev.6)

Kent Tamura

Comment 51 2010-04-24 11:19:39 PDT

(In reply to comment #47) > I don't think this value should go in config.h. It should go in a header file > included only by XMLTokenizer.cpp and HTMLParser.cpp. We don't want this to > have any effect on the rest of the source code. We could add a new source file > called TreeDepthLimit.h in the dom directory. ok, I introduced dom/TreeDepthLimit.h. > I think that m_nodeDepth is a strange name; is it really the depth of a node? Changed it to m_treeDepth. > The patch is risky. If we find that in some cases the node depth gets off, then > we could either underflow to 0 or slowly creep up to the maximum depth and > break parsing. So we need to make sure we are testing all the parsing code > paths. It would help to have an assertion that the depth is back to zero once > parsing is done. Running the regression tests with such an assert in place > could help us catch any mismatches that otherwise would be silent. I added an ASSERT(). > Besides that, I think we need to do some performance testing, but otherwise, > this patch looks good to me. Dromaeo innerHTMl test results: Without this patch: innerHTML: 108.85runs/s ±0.36% With this patch: innerHTML: 110.76runs/s ±0.39% It seems no performance regression in this test.

Kent Tamura

Comment 52 2010-04-26 00:37:46 PDT

I committed "Parser-only patch (rev.6)" as r58236. But I don't close this bug because we can still make deep trees by DOM operations. I don't have good ideas to fix it.

Keith Kyzivat

Comment 53 2012-03-19 03:18:13 PDT

(In reply to comment #42) > (In reply to comment #30) > > 5) 4096 seems like a fairly low limit. Is that the best we can do? > > As I wrote in another comment, it's very high. > I counted the maximum depth of Gmail message view. It's about 35 + <tree depth much more than 5000 will probably be too high for very low end devices with little memory, S40 phones, or maybe RaspberryPi for example. Stack size of S40 browser is 256K.. pretty tiny, given most Linux desktops stack size is 8M. Wow, much work has gone on in this bug since I last looked. Thank you for continuing work on it Kent.

Ahmad Saleem

Comment 54 2022-08-29 15:20:23 PDT

I took "Performance Checker" test case and got following with default configuration: Iteration: 30 Depth: 4000 >> STP - Elapsed time: 6014 [msec] >> Firefox Nightly 107 - Elapsed time: 505 [msec] >> Chrome Canary 107 - Elapsed time: 3332 [msec] ______ Safari is the slowest among all and I think improving this would be good. Just wanted to share updated test results and I don't get any crash. Thanks!

Note You need to log in before you can comment on or make changes to this bug.