Bug 18282

Summary: WebKit is very slow in loading deeply nested divs
Product: WebKit Reporter: Brett Wilson (Google) <brettw>
Component: DOMAssignee: Nobody <webkit-unassigned>
Status: NEW ---    
Severity: Normal CC: ahmad.saleem792, ap, bfulgham, cdumez, darin, ddkilzer, eric, ggaren, gustavo, hyatt, kamaji, koivisto, laszlo.gombos, mal, mitz, rniwa, tkent, tonikitoo, vidavera, webkit.review.bot, xan.lopez, zalan
Priority: P1 Keywords: HasReduction, InRadar
Version: 525.x (Safari 3.1)   
Hardware: All   
OS: All   
Bug Depends on: 37247    
Bug Blocks:    
Attachments:
Description Flags
Test case
none
Test case (crashes both mac and windows)
none
Parameterized version of Eric Seidel's divs.html
none
backtrace of qtwebkit failure on divs.html?l=3500 with 256K max stack
none
Patch providing an optional maximum depth cap to the DOM tree, enabled using --enable-domtree-maxdepth
none
Proposed patch by tkent
none
Proposed patch by tkent (rev.2)
none
Proposed patch by tkent (rev.3)
none
Performance checker
none
Proposed patch by tkent (rev.4)
none
Parser-only patch
none
Parser-only patch (rev.2)
none
Parser-only patch (rev.3)
none
Parser-only patch (rev.4)
none
Parser-only patch (rev.5)
none
Parser-only patch (rev.6) none

Description Brett Wilson (Google) 2008-04-02 09:58:18 PDT
I only tested on Safari for Windows. It could be blowing out the stack.
Comment 1 Brett Wilson (Google) 2008-04-02 09:58:35 PDT
Created attachment 20293 [details]
Test case
Comment 2 Eric Seidel (no email) 2008-04-02 10:51:49 PDT
Reproducible crashers are P1.
http://webkit.org/quality/bugpriorities.html
Comment 3 Matt Lilek 2008-04-02 11:12:05 PDT
This only happens on Windows and VS tells me that it does indeed blow out the stack:

>	WebKit.dll!WebCore::WidthIterator::advance(int offset=3, WebCore::GlyphBuffer * glyphBuffer=0x00000000)  Line 111 + 0x6 bytes	C++
 	WebKit.dll!WebCore::Font::floatWidthForSimpleText(const WebCore::TextRun & run={...}, WebCore::GlyphBuffer * glyphBuffer=0x00000000)  Line 729	C++
 	WebKit.dll!WebCore::Font::floatWidth(const WebCore::TextRun & run={...})  Line 707 + 0xe bytes	C++
 	WebKit.dll!WebCore::Font::width(const WebCore::TextRun & run={...})  Line 512 + 0xc bytes	C++
 	WebKit.dll!WebCore::RenderText::widthFromCache(const WebCore::Font & f={...}, int start=0, int len=3, int xPos=0)  Line 448	C++
 	WebKit.dll!WebCore::RenderText::calcPrefWidths(int leadWidth=0)  Line 648 + 0x1b bytes	C++
 	WebKit.dll!WebCore::RenderText::maxPrefWidth()  Line 546 + 0x14 bytes	C++
 	WebKit.dll!WebCore::RenderText::width(unsigned int from=0, unsigned int len=3, const WebCore::Font & f={...}, int xPos=0)  Line 1062 + 0x12 bytes	C++
 	WebKit.dll!WebCore::RenderBlock::findNextLineBreak(WebCore::BidiResolver<WebCore::BidiIterator,WebCore::BidiRun> & start={...}, WebCore::EClear * clear=0x00033560)  Line 1891 + 0x31 bytes	C++
 	WebKit.dll!WebCore::RenderBlock::layoutInlineChildren(bool relayoutChildren=true, int & repaintTop=0, int & repaintBottom=0)  Line 890 + 0x1d bytes	C++
 	WebKit.dll!WebCore::RenderBlock::layoutBlock(bool relayoutChildren=true)  Line 581	C++
 	WebKit.dll!WebCore::RenderBlock::layout()  Line 491 + 0x14 bytes	C++
 	WebKit.dll!WebCore::RenderBlock::layoutBlockChildren(bool relayoutChildren=true, int & maxFloatBottom=0)  Line 1228 + 0x12 bytes	C++
 	WebKit.dll!WebCore::RenderBlock::layoutBlock(bool relayoutChildren=true)  Line 585	C++
        [snip]
Comment 4 Brett Wilson (Google) 2008-04-02 12:48:01 PDT
It seems like if this is a stack problem, it would have to happen on OSX as well. Does it just have a higher maximum stack or something?
Comment 5 Eric Seidel (no email) 2008-04-03 13:05:25 PDT
Created attachment 20319 [details]
Test case (crashes both mac and windows)
Comment 6 Mark Rowe (bdash) 2008-04-03 14:53:32 PDT
<rdar://problem/5841541>
Comment 7 Eric Seidel (no email) 2008-07-03 22:48:18 PDT
*** Bug 18350 has been marked as a duplicate of this bug. ***
Comment 8 Alexey Proskuryakov 2009-08-13 15:49:09 PDT
See also: bug 28245.
Comment 9 Keith Kyzivat 2009-10-16 14:09:33 PDT
Created attachment 41325 [details]
Parameterized version of Eric Seidel's divs.html

Parameterizes Eric Seidel's divs.html
Comment 10 Keith Kyzivat 2009-10-16 15:51:20 PDT
Created attachment 41336 [details]
backtrace of qtwebkit failure on divs.html?l=3500 with 256K max stack
Comment 11 Keith Kyzivat 2009-10-16 15:53:03 PDT
Not surprisingly, this problem also occurs with other nested elements.
I uncovered this in a page with deeply nested lists, and again reproduced it with Eric Seidel's divs.html attachment.

This bug particularly affects mobile devices, as these devices typically are low-memory, and have a maximum stack size set to orders of magnitude smaller than desktop webkit uses.  The maximum element depth before crash on these devices is significantly smaller - typically hundreds of levels deep instead of thousands.

Reproduced this with latest r49687 webkit sources under Linux using QtWebkit.

Attached is a backtrace from a run of QtWebkit rendering nested divs 3500 deep, with maximum stack set to 256KBytes (ulimit -s 256).

Also attached is a modified version of Eric Seidel's divs.html, adding parameterization of the number of levels, which I used in this testing.
Comment 12 Keith Kyzivat 2009-11-24 10:32:30 PST
(repeated from bug 30898)

I have a proposed patch that I've had done for a while now, was just trying to
think if there would be any better way to do it.

It caps the tree in the parser when parser adds a node, and also in Javascript
when a node is appended.
There is no memory consumption increase, but there is a hit in performance,
since depth of node to be added/appended to is calculated, for each add/append.

Since there is a performance hit, I added a build flag to turn it on, and
specify the maximum depth.  Under standard desktop browsers with large max
stack size, this can be omitted, since the maximum depth is generally in the
hundreds of thousands.  For low-memory devices, this is a bigger deal, and thus
they may configure webkit with --enable-domtree-maxdepth=<value>.

Patch coming shortly.
Comment 13 Keith Kyzivat 2009-11-24 13:17:22 PST
Created attachment 43803 [details]
Patch providing an optional maximum depth cap to the DOM tree, enabled using --enable-domtree-maxdepth

Attached is the patch I have created to cap the DOM tree when a node is added (from HTML parse), or appended to (Javascript).
This functionality does not add any memory overhead, however it does add performance overhead.
Because of this, I have made this an optional feature that must be turned on using --enable-domtree-maxdepth.
To configure the maximum depth, one changes the MAX_DOM_TREE_DEPTH define in WebCore/config.h

To more quickly test this on desktop browsers, one should spawn a subshell, and run ulimit -s 256 (under Linux, under Mac it may be different) before testing one of the above HTML attachments that reproduce the bug.
Comment 14 Kent Tamura 2009-11-24 18:14:22 PST
*** Bug 30898 has been marked as a duplicate of this bug. ***
Comment 15 Eric Seidel (no email) 2009-11-25 08:15:59 PST
Comment on attachment 43803 [details]
Patch providing an optional maximum depth cap to the DOM tree, enabled using --enable-domtree-maxdepth

Can't commit with "OOPS" in a ChangeLog.  The "Reviewed by" line will be fixed by our landing scripts, but:
No new tests. (OOPS!)
will cause the commit to fail.

The parsing case could be made quick by caching the current level of the parse tree.

I'm not sure I'm a big fan of this change, but I don't really know what to suggest.

r- because the ChangeLog would cause the commit to fail.

Testing for this is easy.  There should be a test which validates that we can't add below 400 items.  Obviously that test only makes sense if this feature is enabled, and would probably be skipped on all platforms except whichever plans to use this.

Speakign of which, what platform would use this?
Comment 16 Keith Kyzivat 2009-11-25 09:35:58 PST
> No new tests. (OOPS!) will cause the commit to fail.
I left it in there on purpose, as I figured you would come back wanting me to add tests for this commit -- just wanted some input on how I should reasonably add tests for this.

> The parsing case could be made quick by caching the current level of the parse tree.
How exactly can that be accomplished? I'm not familiar with this.  (a pointer to some reading material is fine)

> I'm not sure I'm a big fan of this change, but I don't really know what to suggest.
I'm all ears -- if someone else can think of a reasonable way to prevent stack overflow, that'd be excellent.  Could be done at render-time too, but I couldn't think of a way to figure out when we're getting close to exceeding the maximum stack size.

> Speakign of which, what platform would use this?
Embedded devices with small maximum stack size -- specifically, in my case, Nokia S40 phones.  Currently, since S40 phones don't have process separation, if the stack overflows, it causes the phone to reset.  I want to prevent that from happening.  Under S60 phones the maximum stack is also relatively small, but when stack is exceeded it just takes down the browser, not the OS.
Under Android, iPhone/iPod touch, and also desktop webkit-based browsers, this bug can exhibit itself, but, since the maximum stack size is quite large, the element depth to reproduce the bug is ludicrously large -- on the order of 100,000 elements deep, therefore it isn't really a big deal on those browsers -- users would have long since hit the stop button before the problem would manifest itself.
Comment 17 Eric Seidel (no email) 2009-11-30 22:42:43 PST
(In reply to comment #16)
> > No new tests. (OOPS!) will cause the commit to fail.
> I left it in there on purpose, as I figured you would come back wanting me to
> add tests for this commit -- just wanted some input on how I should reasonably
> add tests for this.
> 
> > The parsing case could be made quick by caching the current level of the parse tree.
> How exactly can that be accomplished? I'm not familiar with this.  (a pointer
> to some reading material is fine)

http://trac.webkit.org/browser/trunk/WebCore/html/HTMLParser.h#L162

the "block stack" might also be able to cache this information, I'm not sure.

> > Speakign of which, what platform would use this?
> Embedded devices with small maximum stack size -- specifically, in my case,
> Nokia S40 phones.  Currently, since S40 phones don't have process separation,
> if the stack overflows, it causes the phone to reset.  I want to prevent that
> from happening.  Under S60 phones the maximum stack is also relatively small,
> but when stack is exceeded it just takes down the browser, not the OS.
> Under Android, iPhone/iPod touch, and also desktop webkit-based browsers, this
> bug can exhibit itself, but, since the maximum stack size is quite large, the
> element depth to reproduce the bug is ludicrously large -- on the order of
> 100,000 elements deep, therefore it isn't really a big deal on those browsers
> -- users would have long since hit the stop button before the problem would
> manifest itself.

OK, that makes a lot more sense.  I can see how devices could be more sensitive to this and might want to make the speed tradeoff to prevent bringing down the whole OS.

I think you still want to optimize the HTML parsing case if possible, but I think your approach of using a stack crawl instead of caching the value on Node is probably the right one.
Comment 18 Kent Tamura 2010-03-04 18:06:39 PST
Hi Keith,
Are you working on updating the patch?
If not, I (or someone else) will take it over.
Comment 19 Kent Tamura 2010-03-11 01:36:33 PST
We had better add another flag to enable the check by parsers.
ENABLE_DOMTREE_MAXDEPTH seems to have performance regression and performance-sensitive browsers won't enable it.
However the check by parsers won't have significant performance regression and performance-sensitive browsers will enable it.
Comment 20 Kent Tamura 2010-03-15 00:52:58 PDT
Created attachment 50688 [details]
Proposed patch by tkent
Comment 21 Eric Seidel (no email) 2010-03-15 12:57:12 PDT
Comment on attachment 50688 [details]
Proposed patch by tkent

Does Chromium want this?

Aren't ports going to want to control MAX_DOM_TREE_DEPTH?  I guess they can already by setting both ENABLE and MAX_DOM_TREE_DEPTH.

The implementation looks OK.

Are you sure this is the order you want?
+#if ENABLE(DOMTREE_PARSING_MAXDEPTH)
+    if (m_nodeDepth > MAX_DOM_TREE_DEPTH)
+        return;
+#endif
     exitText();

+        handleError(nonFatal, "Too deep tree.", lineNumber(), columnNumber());

"DOM tree is too deep" would be more clear to my eyes.  What sort of message does FireFox provide here?  I'm pretty sure they have a limit.
Comment 22 Kent Tamura 2010-03-15 19:23:37 PDT
(In reply to comment #21)
> (From update of attachment 50688 [details])
> Does Chromium want this?

Yes.  I'll enable DOMTREE_PARSING_MAXDEPTH.

> 
> Aren't ports going to want to control MAX_DOM_TREE_DEPTH?  I guess they can
> already by setting both ENABLE and MAX_DOM_TREE_DEPTH.

Do you mean no configure.ac change for MAX_DOM_TREE_DEPTH?

> Are you sure this is the order you want?
> +#if ENABLE(DOMTREE_PARSING_MAXDEPTH)
> +    if (m_nodeDepth > MAX_DOM_TREE_DEPTH)
> +        return;
> +#endif
>      exitText();

Yes. In parseStartElement(). the check should be put after exitText(), and the check should be put before exitText().  It's because exitText() in parseStartElement() is for text *before* the start tag.

> +        handleError(nonFatal, "Too deep tree.", lineNumber(), columnNumber());
> 
> "DOM tree is too deep" would be more clear to my eyes.  What sort of message
> does FireFox provide here?  I'm pretty sure they have a limit.

Ok, I'll update the message.
Firefox has a limit for HTML DOM: https://hg.mozilla.org/mozilla-central/file/050887c64183/parser/htmlparser/src/nsHTMLTokenizer.cpp#l382
but it doesn't show any message like WebKit doesn't for HTML.
I couldn't find a limit for XML parsing in Firefox.
Comment 23 Kent Tamura 2010-03-15 19:24:59 PDT
> Yes. In parseStartElement(). the check should be put after exitText(), and the
> check should be put before exitText().
Correction:
.. the check should be put before exitText() in other functions.
Comment 24 Kent Tamura 2010-03-15 19:52:21 PDT
Created attachment 50755 [details]
Proposed patch by tkent (rev.2)
Comment 25 Eric Seidel (no email) 2010-03-15 20:53:16 PDT
OK, so if Chromium wants this, why don't all ports want this?
Comment 26 Kent Tamura 2010-03-15 21:14:40 PDT
(In reply to comment #25)
> OK, so if Chromium wants this, why don't all ports want this?

Ah, I see.  We don't need to make ENABLE_DOMTREE_PARSING_MAXDEPTH optional.
I'll remove ENABLE_DOMTREE_PARSING_MAXDEPTH and make it work by default later if no one objects.
Comment 27 Kent Tamura 2010-03-17 23:29:48 PDT
Created attachment 51008 [details]
Proposed patch by tkent (rev.3)
Comment 28 Eric Seidel (no email) 2010-03-31 00:25:01 PDT
Comment on attachment 51008 [details]
Proposed patch by tkent (rev.3)

This looks really expensive.  I'm certain this must cause a performance regression on the PLT.  Have you run any performance testing?
Comment 29 Eric Seidel (no email) 2010-03-31 00:25:55 PDT
(In reply to comment #19)
> We had better add another flag to enable the check by parsers.
> ENABLE_DOMTREE_MAXDEPTH seems to have performance regression and
> performance-sensitive browsers won't enable it.
> However the check by parsers won't have significant performance regression and
> performance-sensitive browsers will enable it.

What browser is not performance sensitive?  Certainly both Chrome and Safari are...
Comment 30 Maciej Stachowiak 2010-03-31 01:09:18 PDT
Comment on attachment 51008 [details]
Proposed patch by tkent (rev.3)

1) I don't think this fix should be optional. Ports should not have to choose between a crash fix and a possible performance hit. We should have one code path that both has adequate performance and does not crash.

2) A number of the DOM methods now do if (atMaxTreeDepth()) return 0;. It seems to me that they should set the exception code too (I am not sure offhand which DOM exception code is most appropriate). I suggest trying some DOM benchmarks, such as <http://dromaeo.com/?dom> and <http://nontroppo.org/timer/Hixie_DOM.html>.

3) Because node insertion has become O(N), it's possible that building up a deep tree using DOM APIs alone will trigger O(N^2) behavior. Almost any time we've had N^2 algorithms, we have run into some site that hangs as a result.

4) MAX_DOM_TREE_DEPTH should be defined as a const size_t rather than a preprocessor macro.

5) 4096 seems like a fairly low limit. Is that the best we can do?

6) cMaxBlockDepth shouldn't be defined as MAX_DOM_TREE_DEPTH; the fact that it's the same constant is a coincidence, not fundamental.

7) In the parser, it looks to me like you repurposed m_blocksInStack to be a count of total nesting level, rather than block nesting level. That seems dubious to me for a couple of reasons: 
   (a) The name is now wrong. If this change is otherwise right, the variable should be renamed (or just use the dom tree depth constant directly).
   (b) Checks relating to block depth just go off of tag depth, but that seems wrong to me; I suspect this breaks the intent of the block checks. For example, non-blocks could now cause blocks to be popped.
   (c) I think it would be good to test a case of blocks and a case of mixed blocks and inlines. It seems like your test case only covers spans, which are an inline element

8) Like Eric, I would like to see testing of page load speed.

r- to address above comments.
Comment 31 Kent Tamura 2010-03-31 02:20:04 PDT
I have had Dromaeo DOM results.  But the results make almost no sense.
The appendChild and insertBefore tests of Dromaeo check times to insert elements to document.body.  So the results have very little impact of ENABLE_DOMTREE_RUNTIME_MAXDEPTH.

Dromaeo DOM test results:
Safari with WebKit r56828: http://dromaeo.com/?id=98801
  createElement: 531.19runs/s ±0.79%
  createTextNode: 264.63runs/s ±0.78%
  innerHTML: 79.55runs/s ±0.96%
  cloneNode: 148.10runs/s ±1.73%
  appendChild: 1046.00runs/s ±0.15%
  insertBefore: 981.40runs/s ±0.46%

Safari with WebKit r56828 + ENABLE_DOMTREE_RUNTIME_MAXDEPTH: http://dromaeo.com/?id=98804
  createElement: 513.69runs/s ±0.87%
  createTextNode: 261.68runs/s ±1.03%
  innerHTML: 78.74runs/s ±0.58%
  cloneNode: 143.97runs/s ±2.30%
  appendChild: 999.20runs/s ±0.16%
  insertBefore: 988.80runs/s ±0.22%
Comment 32 Kent Tamura 2010-03-31 02:28:58 PDT
(In reply to comment #31)
> I have had Dromaeo DOM results.  But the results make almost no sense.
> The appendChild and insertBefore tests of Dromaeo check times to insert
> elements to document.body.  So the results have very little impact of
> ENABLE_DOMTREE_RUNTIME_MAXDEPTH.

Hixie_DOM is not suitable too.  Hixie_DOM tests insertions to a child of the body.

Hmm, we should make new tests.
Comment 33 Kent Tamura 2010-03-31 02:33:00 PDT
(In reply to comment #28)
> (From update of attachment 51008 [details])
> This looks really expensive.  I'm certain this must cause a performance
> regression on the PLT.  Have you run any performance testing?

What's 'PLT'?
Comment 34 Kent Tamura 2010-03-31 02:35:16 PDT
(In reply to comment #30)
> 5) 4096 seems like a fairly low limit. Is that the best we can do?

I don't think it's low.  It's very higher than Firefox's depth limit.
Firefox's depth limit is 200.
https://hg.mozilla.org/mozilla-central/file/050887c64183/parser/htmlparser/src/nsHTMLTokenizer.cpp#l382
Comment 35 Kent Tamura 2010-03-31 04:14:00 PDT
Created attachment 52149 [details]
Performance checker

With this test code:

Iteration=4000 depth=200
No patch: 4,633
DOMTREE_RUNTIME_MAXDEPTH: 4,878 (4.6% time gain)

Iteration=500 depth=1000
No patch: 7,353
DOMTREE_RUNTIME_MAXDEPTH: 8,023 (9.1% time gain)

Iteration=30 depth=4000
No patch: 5,747
DOMTREE_RUNTIME_MAXDEPTH: 6,662 (15.9% time gain)

Hmm, DOMTREE_RUNTIME_MAXDEPTH is not acceptable.


I have an idea to improve performance. How about introducing just one cache of a node depth?
Suppose that:
 Document has m_lastDepth and m_lastDepthNode. They have information about a parent node which was used in the last node insertion.
 When another child node is inserted to a parent node P,
  - If P is m_lastDepthNode, m_lastDepth is the depth
  - If P's parent is m_lastDepthNode, m_lastDepth++ and m_lastDepthNode = P
  - If m_lastDepthNode's parent is P, m_lastDepth-- and m_lastDepthNode = P
  - If P's parent is m_lastDepthNode's parent, m_lastDepthNode = P
This reduces a lot of depth calculation.
Comment 36 Kent Tamura 2010-03-31 23:33:44 PDT
Created attachment 52263 [details]
Proposed patch by tkent (rev.4)
Comment 37 Kent Tamura 2010-03-31 23:46:58 PDT
(In reply to comment #36)
> Created an attachment (id=52263) [details]
> Proposed patch by tkent (rev.4)

This patch contains a change for cap by DOM operation, and removed a change for cap by parsers because the DOM operation got much faster than the past patches.

fast/dom/tree-depth-cap.html takes 2.93sec on my machine, and fast/parser/tree-depth-cap.html takes 0.47sec.  I think fast/dom/tree-depth-cap.html is not acceptable to check in.

* Performance
** Doromaeo DOM
- No patch http://dromaeo.com/?id=98854
  createElement: 545.00runs/s ±0.70%
  createTextNode: 272.29runs/s ±1.08%
  innerHTML: 78.90runs/s ±0.47%
  cloneNode: 146.59runs/s ±2.33%
  appendChild: 1023.00runs/s ±0.49%
  insertBefore: 958.60runs/s ±0.54%

- With this patch http://dromaeo.com/?id=98855
  createElement: 532.88runs/s ±0.89%    * This should not have performance difference.
  createTextNode: 263.83runs/s ±0.95%  * ditto.
  innerHTML: 78.56runs/s ±0.32%
  cloneNode: 143.58runs/s ±5.47%
  appendChild: 1059.60runs/s ±0.13%
  insertBefore: 998.40runs/s ±0.64%


** Performance checker in this bug
Iteration=4,000 depth=200
No patch: 4,808
This patch: 4.982 (3.6% time gain)

Iteration=500 depth=1,000
No patch: 7,160
This patch: 7,315 (2.1% time gain)

Iteration=30 depth=4,000
No patch: 5,623
This patch: 5,706 (1.4% time gain)
Comment 38 Dave Hyatt 2010-04-01 11:43:04 PDT
Comment on attachment 52263 [details]
Proposed patch by tkent (rev.4)

I'm concerned about the performance impact of this change.  It seems like in the typical case of just doing normal DOM insertions, you've made an O(1) operation into an O(depth) operation.

What about the insertion/appending of document fragments?  Do they fool your depth counting?
Comment 39 Kent Tamura 2010-04-02 02:41:51 PDT
(In reply to comment #38)
> (From update of attachment 52263 [details])
> I'm concerned about the performance impact of this change.  It seems like in
> the typical case of just doing normal DOM insertions, you've made an O(1)
> operation into an O(depth) operation.

It would be O(1) in typical cases and O(depth) at most with my latest patch. But ...

> What about the insertion/appending of document fragments?  Do they fool your
> depth counting?

Ah, it can avoid the limitation.
I have no good idea to limit the depth in a case of inserting a tree into another tree.
Does anyone have a good idea?
Comment 40 Kent Tamura 2010-04-07 23:20:26 PDT
Created attachment 52832 [details]
Parser-only patch
Comment 41 Kent Tamura 2010-04-07 23:50:06 PDT
Created attachment 52840 [details]
Parser-only patch (rev.2)
Comment 42 Kent Tamura 2010-04-07 23:55:17 PDT
I have no idea of a complete and efficient algorithm to solve this problem in DOM operations.
So I focus on the parser solution though it can't solve this problem completely.

(In reply to comment #30)
> (From update of attachment 51008 [details])
> 4) MAX_DOM_TREE_DEPTH should be defined as a const size_t rather than a
> preprocessor macro.

I added "size_t maxDomTreeDepth" in config.h.
I kept MAX_DOM_TREE_DEPTH to change this value by a compiler
flag, and I didn't wrap it with "namespace WebCobre" because
config.h is used for .c too.

> 5) 4096 seems like a fairly low limit. Is that the best we can do?

As I wrote in another comment, it's very high.
I counted the maximum depth of Gmail message view.  It's about 35 + <tree depth in an HTML message>.

However, I changed the default to 5000 because the existing block
depth cap doesn't work if MAX_DOM_TREE_DEPTH is <= 4096.

> 6) cMaxBlockDepth shouldn't be defined as MAX_DOM_TREE_DEPTH; the fact that
> it's the same constant is a coincidence, not fundamental.

Fixed.

> 7) In the parser, it looks to me like you repurposed m_blocksInStack to be a
> count of total nesting level, rather than block nesting level. That seems
> dubious to me for a couple of reasons: 

I added another depth counter. So this issue was resolved.
Comment 43 WebKit Review Bot 2010-04-08 00:49:22 PDT
Attachment 52840 [details] did not build on gtk:
Build output: http://webkit-commit-queue.appspot.com/results/1653244
Comment 44 Kent Tamura 2010-04-08 01:04:32 PDT
Created attachment 52847 [details]
Parser-only patch (rev.3)
Comment 45 Kent Tamura 2010-04-08 01:53:45 PDT
Comment on attachment 52847 [details]
Parser-only patch (rev.3)

r57263 (Bug#37247) fixed the XML parser part. I'll remove the XML parser change from the patch.
Comment 46 Kent Tamura 2010-04-08 03:22:14 PDT
Created attachment 52852 [details]
Parser-only patch (rev.4)
Comment 47 Darin Adler 2010-04-09 09:14:19 PDT
Comment on attachment 52852 [details]
Parser-only patch (rev.4)

> +#ifndef config_h
> +#define config_h

The config.h file is not a normal header file, and should not have a header guard in it. We can discuss this further elsewhere, but please don't add this in this patch without further discussion.

> +static const unsigned maxDOMTreeDepth = MAX_DOM_TREE_DEPTH;

It's not appropriate to have a "static const" in the config.h file.

I didn't review the rest of the patch, but these are both no-nos.

I don't think this value should go in config.h. It should go in a header file included only by XMLTokenizer.cpp and HTMLParser.cpp. We don't want this to have any effect on the rest of the source code. We could add a new source file called TreeDepthLimit.h in the dom directory.

I think that m_nodeDepth is a strange name; is it really the depth of a node?

The patch is risky. If we find that in some cases the node depth gets off, then we could either underflow to 0 or slowly creep up to the maximum depth and break parsing. So we need to make sure we are testing all the parsing code paths. It would help to have an assertion that the depth is back to zero once parsing is done. Running the regression tests with such an assert in place could help us catch any mismatches that otherwise would be silent.

Besides that, I think we need to do some performance testing, but otherwise, this patch looks good to me.
Comment 48 Darin Adler 2010-04-09 09:14:52 PDT
(In reply to comment #47)
> I didn't review the rest of the patch, but these are both no-nos.

Heh, I did review the rest of the patch. Sorry.
Comment 49 Kent Tamura 2010-04-24 10:50:05 PDT
Created attachment 54220 [details]
Parser-only patch (rev.5)
Comment 50 Kent Tamura 2010-04-24 11:15:38 PDT
Created attachment 54221 [details]
Parser-only patch (rev.6)
Comment 51 Kent Tamura 2010-04-24 11:19:39 PDT
(In reply to comment #47)
> I don't think this value should go in config.h. It should go in a header file
> included only by XMLTokenizer.cpp and HTMLParser.cpp. We don't want this to
> have any effect on the rest of the source code. We could add a new source file
> called TreeDepthLimit.h in the dom directory.

ok, I introduced dom/TreeDepthLimit.h.

> I think that m_nodeDepth is a strange name; is it really the depth of a node?

Changed it to m_treeDepth.

> The patch is risky. If we find that in some cases the node depth gets off, then
> we could either underflow to 0 or slowly creep up to the maximum depth and
> break parsing. So we need to make sure we are testing all the parsing code
> paths. It would help to have an assertion that the depth is back to zero once
> parsing is done. Running the regression tests with such an assert in place
> could help us catch any mismatches that otherwise would be silent.

I added an ASSERT().

> Besides that, I think we need to do some performance testing, but otherwise,
> this patch looks good to me.

Dromaeo innerHTMl test results:
Without this patch:
  innerHTML: 108.85runs/s ±0.36%
With this patch:
  innerHTML: 110.76runs/s ±0.39%

It seems no performance regression in this test.
Comment 52 Kent Tamura 2010-04-26 00:37:46 PDT
I committed "Parser-only patch (rev.6)" as r58236.
But I don't close this bug because we can still make deep trees by DOM operations.

I don't have good ideas to fix it.
Comment 53 Keith Kyzivat 2012-03-19 03:18:13 PDT
(In reply to comment #42)
> (In reply to comment #30)
> > 5) 4096 seems like a fairly low limit. Is that the best we can do?
> 
> As I wrote in another comment, it's very high.
> I counted the maximum depth of Gmail message view.  It's about 35 + <tree depth 

much more than 5000 will probably be too high for very low end devices with little memory, S40 phones, or maybe RaspberryPi for example.  Stack size of S40 browser is 256K.. pretty tiny, given most Linux desktops stack size is 8M.


Wow, much work has gone on in this bug since I last looked.  Thank you for continuing work on it Kent.
Comment 54 Ahmad Saleem 2022-08-29 15:20:23 PDT
I took "Performance Checker" test case and got following with default configuration:

Iteration: 30
Depth: 4000

>> STP - Elapsed time: 6014 [msec]
>> Firefox Nightly 107 - Elapsed time: 505 [msec]
>> Chrome Canary 107 - Elapsed time: 3332 [msec]


______

Safari is the slowest among all and I think improving this would be good. Just wanted to share updated test results and I don't get any crash. Thanks!