63082 – HTML parser should limit element depth of produced tree

RESOLVED FIXED 63082

HTML parser should limit element depth of produced tree

https://bugs.webkit.org/show_bug.cgi?id=63082

Summary HTML parser should limit element depth of produced tree

Sam Weinig

Reported 2011-06-21 11:26:36 PDT

The old HTML parser (and the current XML parser) constrained the depth of the tree it produced, in part to avoid running into stack overflow when laying out (due to the recursive nature of WebCore rendering). It would be great to get this feature back.

Attachments
Patch (15.88 KB, patch) 2011-06-21 18:53 PDT, Sam Weinig	no flags	Details Formatted Diff Diff
Patch (15.88 KB, patch) 2011-06-21 18:55 PDT, Sam Weinig	no flags	Details Formatted Diff Diff
Archive of layout-test-results from ec2-cr-linux-02 (1.39 MB, application/zip) 2011-06-21 19:18 PDT, WebKit Review Bot	no flags	Details
Patch (17.91 KB, patch) 2011-06-21 20:14 PDT, Sam Weinig	no flags	Details Formatted Diff Diff
Patch (18.21 KB, patch) 2011-06-21 22:34 PDT, Sam Weinig	abarth: review+ abarth: commit-queue-	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Sam Weinig

Comment 1 2011-06-21 18:53:53 PDT

Created attachment 98097 [details] Patch

Sam Weinig

Comment 2 2011-06-21 18:55:03 PDT

Created attachment 98098 [details] Patch

WebKit Review Bot

Comment 3 2011-06-21 18:57:01 PDT

Attachment 98097 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 11 files If any of these errors are false positives, please file a bug against check-webkit-style.

WebKit Review Bot

Comment 4 2011-06-21 18:58:08 PDT

Attachment 98098 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 11 files If any of these errors are false positives, please file a bug against check-webkit-style.

Joseph Pecoraro

Comment 5 2011-06-21 19:00:30 PDT

No test? We already lost this feature once, maybe because it didn't have a test. =)

Sam Weinig

Comment 6 2011-06-21 19:08:31 PDT

(In reply to comment #5) > No test? We already lost this feature once, maybe because > it didn't have a test. =) The best way to test this would be to be able to set the limit quite low, parse a bunch of nested <div>s and check the depth of the deepest one. I will try and whip that up.

WebKit Review Bot

Comment 7 2011-06-21 19:18:41 PDT

Comment on attachment 98098 [details] Patch Attachment 98098 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8919190 New failing tests: fast/parser/block-nesting-cap.html fast/parser/element-nesting-cap.html

WebKit Review Bot

Comment 8 2011-06-21 19:18:46 PDT

Created attachment 98101 [details] Archive of layout-test-results from ec2-cr-linux-02 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-02 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick

Sam Weinig

Comment 9 2011-06-21 19:30:05 PDT

(In reply to comment #7) > (From update of attachment 98098 [details]) > Attachment 98098 [details] did not pass chromium-ews (chromium-xvfb): > Output: http://queues.webkit.org/results/8919190 > > New failing tests: > fast/parser/block-nesting-cap.html > fast/parser/element-nesting-cap.html Look at that, there already were tests, they were just failing :)

Sam Weinig

Comment 10 2011-06-21 19:42:49 PDT

There is some wrongness in this patch. I will have a new one shortly.

Sam Weinig

Comment 11 2011-06-21 20:14:42 PDT

Created attachment 98109 [details] Patch

Sam Weinig

Comment 12 2011-06-21 20:16:13 PDT

(In reply to comment #11) > Created an attachment (id=98109) [details] > Patch This new patch also updates the tests results. The failures in fast/parser/element-nesting-cap.html come from the fact that the old parser had a different limit for different types of elements.

WebKit Review Bot

Comment 13 2011-06-21 20:17:21 PDT

Attachment 98109 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 14 files If any of these errors are false positives, please file a bug against check-webkit-style.

mitz

Comment 14 2011-06-21 20:45:37 PDT

Comment on attachment 98109 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98109&action=review > Source/WebCore/ChangeLog:16 > + Limit the DOM tree depth by turning attempts to add a child passed a certain Typo: passed > Source/WebCore/html/parser/HTMLDocumentParser.cpp:575 > + ASSERT(document); > + if (!document->settings()) Since you are immediately dereferencing document, the assertion has very limited usefulness. > Source/WebCore/html/parser/HTMLElementStack.h:78 > + size_t size() const { return m_size; } Why are you using size_t for the size of the stack but unsigned for the depth of the DOM tree? > Source/WebCore/page/Settings.cpp:103 > + , m_maximumDOMTreeDepth(4096) Please turn this into a named constant, and expose it via a static member in Settings so that HTMLDocumentParser::maximumDOMTreeDepth() can refer to it. > Source/WebCore/page/Settings.h:392 > + void setMaximumDOMTreeDepth(unsigned maximumDOMTreeDepth) { m_maximumDOMTreeDepth = maximumDOMTreeDepth; } > + unsigned maximumDOMTreeDepth() const { return m_maximumDOMTreeDepth; } This name is misleading. The name should indicate that this limit only applies to the parser, not to the DOM API.

Alexey Proskuryakov

Comment 15 2011-06-21 21:22:40 PDT

-PASS s2.parentNode === s1 is true -FAIL s3.previousSibling === s2 should be true. Was false. +FAIL s2.parentNode === s1 should be true. Was false. +PASS s3.previousSibling === s2 is true Is this test wrong?

Sam Weinig

Comment 16 2011-06-21 21:45:06 PDT

(In reply to comment #15) > -PASS s2.parentNode === s1 is true > -FAIL s3.previousSibling === s2 should be true. Was false. > +FAIL s2.parentNode === s1 should be true. Was false. > +PASS s3.previousSibling === s2 is true > > Is this test wrong? Define wrong. We used to have a different/larger limit for inlines, something about a test of hyatt's. Mitz might know more.

Alexey Proskuryakov

Comment 17 2011-06-21 22:00:33 PDT

> Define wrong. Is it something that needs to be fixed, or should the test be fixed to say PASS?

Sam Weinig

Comment 18 2011-06-21 22:07:35 PDT

(In reply to comment #17) > > Define wrong. > > Is it something that needs to be fixed, or should the test be fixed to say PASS? Ah, I don't know the answer to that.

Adam Barth

Comment 19 2011-06-21 22:29:24 PDT

Comment on attachment 98109 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98109&action=review I agree with mitz's comments. Eric should verify that we're keeping track of m_size properly. > Source/WebCore/html/parser/HTMLConstructionSite.cpp:103 > + parent->parserAddChild(child); > + } else > + parent->parserAddChild(child); Should these lines go outside the if-block?

Sam Weinig

Comment 20 2011-06-21 22:34:20 PDT

Created attachment 98119 [details] Patch

WebKit Review Bot

Comment 21 2011-06-21 22:36:03 PDT

Attachment 98119 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 14 files If any of these errors are false positives, please file a bug against check-webkit-style.

Adam Barth

Comment 22 2011-06-22 10:22:32 PDT

Comment on attachment 98119 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98119&action=review This looks good. I've now checked the HTMLElementStack accounting in detail. I think the only remaining issue is whether the foster-parenting code path correct enforces the depth limit. I'm marking r+/cq-, so you should feel free to either land this patch with the nits fixed and deal with foster parenting in a followup patch or post an updated version of the patch with the foster parenting issues addressed/tested. > Source/WebCore/html/parser/HTMLConstructionSite.cpp:103 > + parent->parserAddChild(child); > + } else > + parent->parserAddChild(child); Looks like you didn't move this code outside the if-block. Do we need a similar check in HTMLConstructionSite::attachAtSite (or in the code that chooses the site for attachAtSite? >> Source/WebCore/html/parser/HTMLTreeBuilder.h:119 >> + HTMLTreeBuilder(HTMLDocumentParser* parser, HTMLDocument*, bool reportErrors, bool usePreHTML5ParserQuirks, unsigned maximumDOMTreeDepth); > > The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] You could fix these style nits if you were so inclined.

Eric Seidel (no email)

Comment 23 2011-06-22 11:02:46 PDT

Do we know why we need this? Does this crash iPhone safari? Does this need to be a security bug?

Eric Seidel (no email)

Comment 24 2011-06-22 11:09:47 PDT

Comment on attachment 98119 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98119&action=review > LayoutTests/fast/parser/element-nesting-cap-expected.txt:7 > +FAIL s2.parentNode === s1 should be true. Was false. We should either adjust the constant in this test from 5100 to 4096 (or whatever it was decided on), or remove the test (since I"m not sure how much it adds over the block-nesting-cap.html

Eric Seidel (no email)

Comment 25 2011-06-22 11:11:27 PDT

You may have already done so, but just in case: you should run the HTML parser perf tests: http://trac.webkit.org/browser/trunk/PerformanceTests/Parser They're quick and easy, and will make sure that we don't regress other public benchmarks that rely on html parsing. This is unlikely to be a hit, as we've found that per-token branches don't really affect perf (but per char ones definitely do!)

Eric Seidel (no email)

Comment 26 2011-06-22 11:12:00 PDT

It would also be nice to add some commentary in the tests or code as to why this is needed. :)

Sam Weinig

Comment 27 2011-06-22 11:26:09 PDT

Committed r89453: <http://trac.webkit.org/changeset/89453>

Sam Weinig

Comment 28 2011-06-22 11:37:54 PDT

Eric, sorry I missed your comments before landing. I don't think this needs to be a security bug as the result of the crashes that can result have not been exploitable. In general, this code is needed because rendering can become pathological for largely nested code. It was also present in the old parser, and I find it objectionable that is was not added to the new one, even though there were tests that had to be turned into failures. In general, we should not be removing these type of safe guards without understanding why it was added in the first place. Regarding the test that still fails. I want to talk to hyatt about why he wanted that behavior in the first place before changing it, but alas, he is on vacation.

Sam Weinig

Comment 29 2011-06-22 11:38:30 PDT

(In reply to comment #25) > You may have already done so, but just in case: you should run the HTML parser perf tests: > http://trac.webkit.org/browser/trunk/PerformanceTests/Parser > > They're quick and easy, and will make sure that we don't regress other public benchmarks that rely on html parsing. > > This is unlikely to be a hit, as we've found that per-token branches don't really affect perf (but per char ones definitely do!) I ran the perf tests and saw no change.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component DOM

Assignee

Sam Weinig

Reported

2011-06-21 11:26 PDT

Modified

2011-06-29 23:26 PDT History

CC List

7 users Show

URL

Keywords

Depends on

Blocks

63686

Dependencies

tree graph