The old HTML parser (and the current XML parser) constrained the depth of the tree it produced, in part to avoid running into stack overflow when laying out (due to the recursive nature of WebCore rendering). It would be great to get this feature back.
Created attachment 98097 [details] Patch
Created attachment 98098 [details] Patch
Attachment 98097 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 11 files If any of these errors are false positives, please file a bug against check-webkit-style.
Attachment 98098 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 11 files If any of these errors are false positives, please file a bug against check-webkit-style.
No test? We already lost this feature once, maybe because it didn't have a test. =)
(In reply to comment #5) > No test? We already lost this feature once, maybe because > it didn't have a test. =) The best way to test this would be to be able to set the limit quite low, parse a bunch of nested <div>s and check the depth of the deepest one. I will try and whip that up.
Comment on attachment 98098 [details] Patch Attachment 98098 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/8919190 New failing tests: fast/parser/block-nesting-cap.html fast/parser/element-nesting-cap.html
Created attachment 98101 [details] Archive of layout-test-results from ec2-cr-linux-02 The attached test failures were seen while running run-webkit-tests on the chromium-ews. Bot: ec2-cr-linux-02 Port: Chromium Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
(In reply to comment #7) > (From update of attachment 98098 [details]) > Attachment 98098 [details] did not pass chromium-ews (chromium-xvfb): > Output: http://queues.webkit.org/results/8919190 > > New failing tests: > fast/parser/block-nesting-cap.html > fast/parser/element-nesting-cap.html Look at that, there already were tests, they were just failing :)
There is some wrongness in this patch. I will have a new one shortly.
Created attachment 98109 [details] Patch
(In reply to comment #11) > Created an attachment (id=98109) [details] > Patch This new patch also updates the tests results. The failures in fast/parser/element-nesting-cap.html come from the fact that the old parser had a different limit for different types of elements.
Attachment 98109 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 14 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 98109 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98109&action=review > Source/WebCore/ChangeLog:16 > + Limit the DOM tree depth by turning attempts to add a child passed a certain Typo: passed > Source/WebCore/html/parser/HTMLDocumentParser.cpp:575 > + ASSERT(document); > + if (!document->settings()) Since you are immediately dereferencing document, the assertion has very limited usefulness. > Source/WebCore/html/parser/HTMLElementStack.h:78 > + size_t size() const { return m_size; } Why are you using size_t for the size of the stack but unsigned for the depth of the DOM tree? > Source/WebCore/page/Settings.cpp:103 > + , m_maximumDOMTreeDepth(4096) Please turn this into a named constant, and expose it via a static member in Settings so that HTMLDocumentParser::maximumDOMTreeDepth() can refer to it. > Source/WebCore/page/Settings.h:392 > + void setMaximumDOMTreeDepth(unsigned maximumDOMTreeDepth) { m_maximumDOMTreeDepth = maximumDOMTreeDepth; } > + unsigned maximumDOMTreeDepth() const { return m_maximumDOMTreeDepth; } This name is misleading. The name should indicate that this limit only applies to the parser, not to the DOM API.
-PASS s2.parentNode === s1 is true -FAIL s3.previousSibling === s2 should be true. Was false. +FAIL s2.parentNode === s1 should be true. Was false. +PASS s3.previousSibling === s2 is true Is this test wrong?
(In reply to comment #15) > -PASS s2.parentNode === s1 is true > -FAIL s3.previousSibling === s2 should be true. Was false. > +FAIL s2.parentNode === s1 should be true. Was false. > +PASS s3.previousSibling === s2 is true > > Is this test wrong? Define wrong. We used to have a different/larger limit for inlines, something about a test of hyatt's. Mitz might know more.
> Define wrong. Is it something that needs to be fixed, or should the test be fixed to say PASS?
(In reply to comment #17) > > Define wrong. > > Is it something that needs to be fixed, or should the test be fixed to say PASS? Ah, I don't know the answer to that.
Comment on attachment 98109 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98109&action=review I agree with mitz's comments. Eric should verify that we're keeping track of m_size properly. > Source/WebCore/html/parser/HTMLConstructionSite.cpp:103 > + parent->parserAddChild(child); > + } else > + parent->parserAddChild(child); Should these lines go outside the if-block?
Created attachment 98119 [details] Patch
Attachment 98119 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1 Source/WebCore/html/parser/HTMLTreeBuilder.h:119: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Source/WebCore/html/parser/HTMLTreeBuilder.h:120: The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 2 in 14 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 98119 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98119&action=review This looks good. I've now checked the HTMLElementStack accounting in detail. I think the only remaining issue is whether the foster-parenting code path correct enforces the depth limit. I'm marking r+/cq-, so you should feel free to either land this patch with the nits fixed and deal with foster parenting in a followup patch or post an updated version of the patch with the foster parenting issues addressed/tested. > Source/WebCore/html/parser/HTMLConstructionSite.cpp:103 > + parent->parserAddChild(child); > + } else > + parent->parserAddChild(child); Looks like you didn't move this code outside the if-block. Do we need a similar check in HTMLConstructionSite::attachAtSite (or in the code that chooses the site for attachAtSite? >> Source/WebCore/html/parser/HTMLTreeBuilder.h:119 >> + HTMLTreeBuilder(HTMLDocumentParser* parser, HTMLDocument*, bool reportErrors, bool usePreHTML5ParserQuirks, unsigned maximumDOMTreeDepth); > > The parameter name "parser" adds no information, so it should be removed. [readability/parameter_name] [5] You could fix these style nits if you were so inclined.
Do we know why we need this? Does this crash iPhone safari? Does this need to be a security bug?
Comment on attachment 98119 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=98119&action=review > LayoutTests/fast/parser/element-nesting-cap-expected.txt:7 > +FAIL s2.parentNode === s1 should be true. Was false. We should either adjust the constant in this test from 5100 to 4096 (or whatever it was decided on), or remove the test (since I"m not sure how much it adds over the block-nesting-cap.html
You may have already done so, but just in case: you should run the HTML parser perf tests: http://trac.webkit.org/browser/trunk/PerformanceTests/Parser They're quick and easy, and will make sure that we don't regress other public benchmarks that rely on html parsing. This is unlikely to be a hit, as we've found that per-token branches don't really affect perf (but per char ones definitely do!)
It would also be nice to add some commentary in the tests or code as to why this is needed. :)
Committed r89453: <http://trac.webkit.org/changeset/89453>
Eric, sorry I missed your comments before landing. I don't think this needs to be a security bug as the result of the crashes that can result have not been exploitable. In general, this code is needed because rendering can become pathological for largely nested code. It was also present in the old parser, and I find it objectionable that is was not added to the new one, even though there were tests that had to be turned into failures. In general, we should not be removing these type of safe guards without understanding why it was added in the first place. Regarding the test that still fails. I want to talk to hyatt about why he wanted that behavior in the first place before changing it, but alas, he is on vacation.
(In reply to comment #25) > You may have already done so, but just in case: you should run the HTML parser perf tests: > http://trac.webkit.org/browser/trunk/PerformanceTests/Parser > > They're quick and easy, and will make sure that we don't regress other public benchmarks that rely on html parsing. > > This is unlikely to be a hit, as we've found that per-token branches don't really affect perf (but per char ones definitely do!) I ran the perf tests and saw no change.