WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
63082
HTML parser should limit element depth of produced tree
https://bugs.webkit.org/show_bug.cgi?id=63082
Summary
HTML parser should limit element depth of produced tree
Sam Weinig
Reported
2011-06-21 11:26:36 PDT
The old HTML parser (and the current XML parser) constrained the depth of the tree it produced, in part to avoid running into stack overflow when laying out (due to the recursive nature of WebCore rendering). It would be great to get this feature back.
Attachments
Patch
(15.88 KB, patch)
2011-06-21 18:53 PDT
,
Sam Weinig
no flags
Details
Formatted Diff
Diff
Patch
(15.88 KB, patch)
2011-06-21 18:55 PDT
,
Sam Weinig
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ec2-cr-linux-02
(1.39 MB, application/zip)
2011-06-21 19:18 PDT
,
WebKit Review Bot
no flags
Details
Patch
(17.91 KB, patch)
2011-06-21 20:14 PDT
,
Sam Weinig
no flags
Details
Formatted Diff
Diff
Patch
(18.21 KB, patch)
2011-06-21 22:34 PDT
,
Sam Weinig
abarth
: review+
abarth
: commit-queue-
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Sam Weinig
Comment 1
2011-06-21 18:53:53 PDT
Created
attachment 98097
[details]
Patch
Sam Weinig
Comment 2
2011-06-21 18:55:03 PDT
Created
attachment 98098
[details]
Patch
WebKit Review Bot
Comment 3
2011-06-21 18:57:01 PDT
Attachment 98097
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 11 files If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 4
2011-06-21 18:58:08 PDT
Attachment 98098
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 11 files If any of these errors are false positives, please file a bug against check-webkit-style.
Joseph Pecoraro
Comment 5
2011-06-21 19:00:30 PDT
No test? We already lost this feature once, maybe because it didn't have a test. =)
Sam Weinig
Comment 6
2011-06-21 19:08:31 PDT
(In reply to
comment #5
)
> No test? We already lost this feature once, maybe because > it didn't have a test. =)
The best way to test this would be to be able to set the limit quite low, parse a bunch of nested <div>s and check the depth of the deepest one. I will try and whip that up.
WebKit Review Bot
Comment 7
2011-06-21 19:18:41 PDT
Comment on
attachment 98098
[details]
Patch
Attachment 98098
[details]
did not pass chromium-ews (chromium-xvfb): Output:
http://queues.webkit.org/results/8919190
New failing tests: fast/parser/block-nesting-cap.html fast/parser/element-nesting-cap.html
WebKit Review Bot
Comment 8
2011-06-21 19:18:46 PDT
Created
attachment 98101
[details]
Archive of layout-test-results from ec2-cr-linux-02 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-02 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Sam Weinig
Comment 9
2011-06-21 19:30:05 PDT
(In reply to
comment #7
)
> (From update of
attachment 98098
[details]
) >
Attachment 98098
[details]
did not pass chromium-ews (chromium-xvfb): > Output:
http://queues.webkit.org/results/8919190
> > New failing tests: > fast/parser/block-nesting-cap.html > fast/parser/element-nesting-cap.html
Look at that, there already were tests, they were just failing :)
Sam Weinig
Comment 10
2011-06-21 19:42:49 PDT
There is some wrongness in this patch. I will have a new one shortly.
Sam Weinig
Comment 11
2011-06-21 20:14:42 PDT
Created
attachment 98109
[details]
Patch
Sam Weinig
Comment 12
2011-06-21 20:16:13 PDT
(In reply to
comment #11
)
> Created an attachment (id=98109) [details] > Patch
This new patch also updates the tests results. The failures in fast/parser/element-nesting-cap.html come from the fact that the old parser had a different limit for different types of elements.
WebKit Review Bot
Comment 13
2011-06-21 20:17:21 PDT
Attachment 98109
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 14 files If any of these errors are false positives, please file a bug against check-webkit-style.
mitz
Comment 14
2011-06-21 20:45:37 PDT
Comment on
attachment 98109
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98109&action=review
> Source/WebCore/ChangeLog:16 > + Limit the DOM tree depth by turning attempts to add a child passed a certain
Typo: passed
> Source/WebCore/html/parser/HTMLDocumentParser.cpp:575 > + ASSERT(document); > + if (!document->settings())
Since you are immediately dereferencing document, the assertion has very limited usefulness.
> Source/WebCore/html/parser/HTMLElementStack.h:78 > + size_t size() const { return m_size; }
Why are you using size_t for the size of the stack but unsigned for the depth of the DOM tree?
> Source/WebCore/page/Settings.cpp:103 > + , m_maximumDOMTreeDepth(4096)
Please turn this into a named constant, and expose it via a static member in Settings so that HTMLDocumentParser::maximumDOMTreeDepth() can refer to it.
> Source/WebCore/page/Settings.h:392 > + void setMaximumDOMTreeDepth(unsigned maximumDOMTreeDepth) { m_maximumDOMTreeDepth = maximumDOMTreeDepth; } > + unsigned maximumDOMTreeDepth() const { return m_maximumDOMTreeDepth; }
This name is misleading. The name should indicate that this limit only applies to the parser, not to the DOM API.
Alexey Proskuryakov
Comment 15
2011-06-21 21:22:40 PDT
-PASS s2.parentNode === s1 is true -FAIL s3.previousSibling === s2 should be true. Was false. +FAIL s2.parentNode === s1 should be true. Was false. +PASS s3.previousSibling === s2 is true Is this test wrong?
Sam Weinig
Comment 16
2011-06-21 21:45:06 PDT
(In reply to
comment #15
)
> -PASS s2.parentNode === s1 is true > -FAIL s3.previousSibling === s2 should be true. Was false. > +FAIL s2.parentNode === s1 should be true. Was false. > +PASS s3.previousSibling === s2 is true > > Is this test wrong?
Define wrong. We used to have a different/larger limit for inlines, something about a test of hyatt's. Mitz might know more.
Alexey Proskuryakov
Comment 17
2011-06-21 22:00:33 PDT
> Define wrong.
Is it something that needs to be fixed, or should the test be fixed to say PASS?
Sam Weinig
Comment 18
2011-06-21 22:07:35 PDT
(In reply to
comment #17
)
> > Define wrong. > > Is it something that needs to be fixed, or should the test be fixed to say PASS?
Ah, I don't know the answer to that.
Adam Barth
Comment 19
2011-06-21 22:29:24 PDT
Comment on
attachment 98109
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98109&action=review
I agree with mitz's comments. Eric should verify that we're keeping track of m_size properly.
> Source/WebCore/html/parser/HTMLConstructionSite.cpp:103 > + parent->parserAddChild(child); > + } else > + parent->parserAddChild(child);
Should these lines go outside the if-block?
Sam Weinig
Comment 20
2011-06-21 22:34:20 PDT
Created
attachment 98119
[details]
Patch
WebKit Review Bot
Comment 21
2011-06-21 22:36:03 PDT
Attachment 98119
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 14 files If any of these errors are false positives, please file a bug against check-webkit-style.
Adam Barth
Comment 22
2011-06-22 10:22:32 PDT
Comment on
attachment 98119
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98119&action=review
This looks good. I've now checked the HTMLElementStack accounting in detail. I think the only remaining issue is whether the foster-parenting code path correct enforces the depth limit. I'm marking r+/cq-, so you should feel free to either land this patch with the nits fixed and deal with foster parenting in a followup patch or post an updated version of the patch with the foster parenting issues addressed/tested.
> Source/WebCore/html/parser/HTMLConstructionSite.cpp:103 > + parent->parserAddChild(child); > + } else > + parent->parserAddChild(child);
Looks like you didn't move this code outside the if-block. Do we need a similar check in HTMLConstructionSite::attachAtSite (or in the code that chooses the site for attachAtSite?
>> Source/WebCore/html/parser/HTMLTreeBuilder.h:119 >> + HTMLTreeBuilder(HTMLDocumentParser* parser, HTMLDocument*, bool reportErrors, bool usePreHTML5ParserQuirks, unsigned maximumDOMTreeDepth); > > The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5]
You could fix these style nits if you were so inclined.
Eric Seidel (no email)
Comment 23
2011-06-22 11:02:46 PDT
Do we know why we need this? Does this crash iPhone safari? Does this need to be a security bug?
Eric Seidel (no email)
Comment 24
2011-06-22 11:09:47 PDT
Comment on
attachment 98119
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=98119&action=review
> LayoutTests/fast/parser/element-nesting-cap-expected.txt:7 > +FAIL s2.parentNode === s1 should be true. Was false.
We should either adjust the constant in this test from 5100 to 4096 (or whatever it was decided on), or remove the test (since I"m not sure how much it adds over the block-nesting-cap.html
Eric Seidel (no email)
Comment 25
2011-06-22 11:11:27 PDT
You may have already done so, but just in case: you should run the HTML parser perf tests:
http://trac.webkit.org/browser/trunk/PerformanceTests/Parser
They're quick and easy, and will make sure that we don't regress other public benchmarks that rely on html parsing. This is unlikely to be a hit, as we've found that per-token branches don't really affect perf (but per char ones definitely do!)
Eric Seidel (no email)
Comment 26
2011-06-22 11:12:00 PDT
It would also be nice to add some commentary in the tests or code as to why this is needed. :)
Sam Weinig
Comment 27
2011-06-22 11:26:09 PDT
Committed
r89453
: <
http://trac.webkit.org/changeset/89453
>
Sam Weinig
Comment 28
2011-06-22 11:37:54 PDT
Eric, sorry I missed your comments before landing. I don't think this needs to be a security bug as the result of the crashes that can result have not been exploitable. In general, this code is needed because rendering can become pathological for largely nested code. It was also present in the old parser, and I find it objectionable that is was not added to the new one, even though there were tests that had to be turned into failures. In general, we should not be removing these type of safe guards without understanding why it was added in the first place. Regarding the test that still fails. I want to talk to hyatt about why he wanted that behavior in the first place before changing it, but alas, he is on vacation.
Sam Weinig
Comment 29
2011-06-22 11:38:30 PDT
(In reply to
comment #25
)
> You may have already done so, but just in case: you should run the HTML parser perf tests: >
http://trac.webkit.org/browser/trunk/PerformanceTests/Parser
> > They're quick and easy, and will make sure that we don't regress other public benchmarks that rely on html parsing. > > This is unlikely to be a hit, as we've found that per-token branches don't really affect perf (but per char ones definitely do!)
I ran the perf tests and saw no change.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug