WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
43586
Use the HTML5 TreeBuilder for Fragment Parsing
https://bugs.webkit.org/show_bug.cgi?id=43586
Summary
Use the HTML5 TreeBuilder for Fragment Parsing
Eric Seidel (no email)
Reported
2010-08-05 16:13:35 PDT
Use the HTML5 TreeBuilder for Fragment Parsing
Attachments
Patch
(25.96 KB, patch)
2010-08-05 16:13 PDT
,
Eric Seidel (no email)
no flags
Details
Formatted Diff
Diff
crashes due to DocumentParser use after free?
(42.22 KB, patch)
2010-08-06 17:31 PDT
,
Eric Seidel (no email)
no flags
Details
Formatted Diff
Diff
works
(30.86 KB, patch)
2010-08-07 21:27 PDT
,
Eric Seidel (no email)
no flags
Details
Formatted Diff
Diff
Patch
(39.72 KB, patch)
2010-08-09 09:28 PDT
,
Eric Seidel (no email)
no flags
Details
Formatted Diff
Diff
Patch
(41.40 KB, patch)
2010-08-09 12:19 PDT
,
Eric Seidel (no email)
no flags
Details
Formatted Diff
Diff
Patch for landing
(41.38 KB, patch)
2010-08-09 12:40 PDT
,
Eric Seidel (no email)
eric
: commit-queue-
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
Eric Seidel (no email)
Comment 1
2010-08-05 16:13:56 PDT
Created
attachment 63656
[details]
Patch
Eric Seidel (no email)
Comment 2
2010-08-06 17:31:17 PDT
Created
attachment 63788
[details]
crashes due to DocumentParser use after free?
Eric Seidel (no email)
Comment 3
2010-08-07 21:27:25 PDT
Created
attachment 63837
[details]
works
Eric Seidel (no email)
Comment 4
2010-08-09 09:28:27 PDT
Created
attachment 63903
[details]
Patch
Darin Adler
Comment 5
2010-08-09 09:33:17 PDT
Comment on
attachment 63903
[details]
Patch
> // FIXME: parse error > + // fragment case handled by callers. > mergeAttributesFromTokenIntoElement(token, m_openElements.bodyElement());
This is a confusing comment. Exactly what is handled by the callers? Is there some way to assert or check that? It's also strange that the comment has a period but does not start with a capital letter.
> +// This is a direct transcription of step 4 from: > +//
http://www.whatwg.org/specs/web-apps/current-work/multipage/the-end.html#fragment-case
> +HTMLTokenizer::State tokenizerStateForContextElement(Element* contextElement, bool reportErrors) > +{ > + if (!contextElement) > + return HTMLTokenizer::DataState; > + > + const QualifiedName& contextTag = contextElement->tagQName(); > + > + if (contextTag.matches(titleTag) || contextTag.matches(textareaTag)) > + return HTMLTokenizer::RCDATAState; > + if (contextTag.matches(styleTag) > + || contextTag.matches(xmpTag) > + || contextTag.matches(iframeTag) > + || (contextTag.matches(noembedTag) && HTMLTreeBuilder::pluginsEnabled(contextElement->document()->frame())) > + || (contextTag.matches(noscriptTag) && HTMLTreeBuilder::scriptEnabled(contextElement->document()->frame())) > + || contextTag.matches(noframesTag)) > + return reportErrors ? HTMLTokenizer::RAWTEXTState : HTMLTokenizer::PLAINTEXTState; > + if (contextTag.matches(scriptTag)) > + return reportErrors ? HTMLTokenizer::ScriptDataState : HTMLTokenizer::PLAINTEXTState; > + if (contextTag.matches(plaintextTag)) > + return HTMLTokenizer::PLAINTEXTState; > + return HTMLTokenizer::DataState; > +}
It is better to write this code out like this rather than using functions in the element classes? Have to run, no time to read the rest of the patch right now.
Eric Seidel (no email)
Comment 6
2010-08-09 09:40:38 PDT
(In reply to
comment #5
)
> (From update of
attachment 63903
[details]
) > > // FIXME: parse error > > + // fragment case handled by callers. > > mergeAttributesFromTokenIntoElement(token, m_openElements.bodyElement()); > > This is a confusing comment. Exactly what is handled by the callers? Is there some way to assert or check that?
Agreed. Basically that function should be killed or renamed. I should kill it or rename it. It was a victim of our first effort to split the original HTMLTreeBuilder.cpp into multiple data structures. That function ended up on the wrong side of the split IMO.
> It's also strange that the comment has a period but does not start with a capital letter.
I'm just going to remove the comment, it's not helpful.
> It is better to write this code out like this rather than using functions in the element classes?
Interesting thought. I think in this case, the if-cascade using tag names is better, since it more closely matches the spec. But I agree with you, many times it's better to shove this stuff into the DOM implementations.
> Have to run, no time to read the rest of the patch right now.
Thanks for the look!
Adam Barth
Comment 7
2010-08-09 10:16:32 PDT
Comment on
attachment 63903
[details]
Patch WebCore/html/HTMLTreeBuilder.cpp:824 + bool isFragmentCaseWithoutBody(const HTMLElementStack* elementStack) This is wrong. You probably want to cache this information. Basically, you're not covering the case when there are more than three elements on the stack. WebCore/html/HTMLTreeBuilder.cpp:2902 + m_document->finishedParsing(); I would use a return here instead of an else. The else is uglies. WebCore/html/HTMLTreeBuilder.cpp:2900 + // Warning, this may delete the parser, so don't try to do anything else after this. Please move this comment closer to m_document->finishedParsing(). WebCore/html/HTMLTreeBuilder.cpp:2914 + RefPtr<Node> newChild = fragment->document()->adoptNode(child, ec); This is dangerous. We need a version of adoptNode that doesn't run JavaScript. Also, you should grab references to all the children first and then move them to their new home. WebCore/html/HTMLTreeBuilder.h:212 + RefPtr<Document> m_dummyDocumentForFragmentParse; m_dummyDocumentForFragmentParse -> m_dummyDocumentForFragmentParsing ? WebCore/html/HTMLTreeBuilder.h:221 + FragmentParsingContext m_fragmentContext; Insert blank line above.
Eric Seidel (no email)
Comment 8
2010-08-09 12:19:08 PDT
Created
attachment 63915
[details]
Patch
Adam Barth
Comment 9
2010-08-09 12:26:55 PDT
Comment on
attachment 63915
[details]
Patch WebCore/html/HTMLElementStack.cpp:155 + return m_bodyElement; !!m_bodyElement ?
Eric Seidel (no email)
Comment 10
2010-08-09 12:40:24 PDT
Created
attachment 63916
[details]
Patch for landing
Eric Seidel (no email)
Comment 11
2010-08-09 13:59:36 PDT
Comment on
attachment 63916
[details]
Patch for landing Failed runner.html on the commit-bot, will need one more tweak.
Eric Seidel (no email)
Comment 12
2010-08-09 14:52:24 PDT
Committed
r65006
: <
http://trac.webkit.org/changeset/65006
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug