Bug 84646 - Implement Document.parse
: Implement Document.parse
Status: RESOLVED WONTFIX
: WebKit
HTML DOM
: 528+ (Nightly build)
: Unspecified Unspecified
: P2 Normal
Assigned To:
:
: WebExposed
:
:
  Show dependency treegraph
 
Reported: 2012-04-23 15:58 PST by
Modified: 2012-11-28 12:16 PST (History)


Attachments
Patch (12.90 KB, patch)
2012-04-23 16:00 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff
Patch (26.87 KB, patch)
2012-04-30 16:43 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff
Patch (27.98 KB, patch)
2012-05-03 16:47 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff
Patch (29.07 KB, patch)
2012-05-07 11:28 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff
Patch (28.66 KB, patch)
2012-05-09 15:25 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff
Patch (28.66 KB, patch)
2012-05-10 10:23 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff
Patch (29.88 KB, patch)
2012-05-16 16:21 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff
Patch (30.43 KB, patch)
2012-05-17 11:01 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff
Patch (31.10 KB, patch)
2012-05-24 13:22 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff
Patch (33.60 KB, patch)
2012-05-31 11:46 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff
Patch (33.52 KB, patch)
2012-06-04 15:30 PST, Rafael Weinstein
no flags Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2012-04-23 15:58:31 PST
Should be willing to parse without a context element. I.e.

df.innerHTML = '<tr><td>Foo</td></tr>'

creates a single child for the fragment which is a <tr>, which has a single <td> child, which has a single 'Foo' textNode child.
------- Comment #1 From 2012-04-23 16:00:40 PST -------
Created an attachment (id=138447) [details]
Patch
------- Comment #2 From 2012-04-23 16:02:59 PST -------
Attachment 138447 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1
Source/WebCore/ChangeLog:8:  You should remove the 'No new tests' and either add and list tests, or explain why no new tests were possible.  [changelog/nonewtests] [5]
Total errors found: 1 in 10 files


If any of these errors are false positives, please file a bug against check-webkit-style.
------- Comment #3 From 2012-04-23 16:03:17 PST -------
The discussion on public-webapps about the template element has gotten very focused on the semantics of the template elements' children. What doesn't seem to be controversial is the changes to allow context-free parsing.

Given that this has come up before and been requested as an independently useful thing, I thought a good way to make progress would be to separate the concerns.

This patch just adds innerHTML to DocumentFragment and makes the parser changes.

I'd looking for opinions on if this makes sense and if this is good way to make forward progress.
------- Comment #4 From 2012-04-23 16:11:50 PST -------
(From update of attachment 138447 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=138447&action=review

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:390
> +    m_tree.openElements()->pushRootNode(fragment);
> +    resetInsertionModeAppropriately();

I remember this being a kinda big deal.  Having a non-element on the element stack causes a lot of complexity in the parser.  I feel like there was a dream of removing this contextElement stuff at some point.  Or maybe it was the other way... of always requiring a context element.  This part of the change scares me, and I need to page this all back in to be sure it's OK.

> Source/WebCore/html/parser/HTMLTreeBuilder.h:96
>      enum InsertionMode {
> +        ContextFreeMode,

I take it this is a new section fo the spec?
------- Comment #5 From 2012-04-23 16:18:51 PST -------
(From update of attachment 138447 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=138447&action=review

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:390
>> +    resetInsertionModeAppropriately();
> 
> I remember this being a kinda big deal.  Having a non-element on the element stack causes a lot of complexity in the parser.  I feel like there was a dream of removing this contextElement stuff at some point.  Or maybe it was the other way... of always requiring a context element.  This part of the change scares me, and I need to page this all back in to be sure it's OK.

Any and all guidance is apppreciated

>> Source/WebCore/html/parser/HTMLTreeBuilder.h:96
>> +        ContextFreeMode,
> 
> I take it this is a new section fo the spec?

Yes
------- Comment #6 From 2012-04-23 22:42:53 PST -------
(From update of attachment 138447 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=138447&action=review

Clearing r? since I don't think you're looking for a formal review.

> Source/WebCore/dom/DocumentFragment.cpp:104
> +    if (!document()->isHTMLDocument()) {
> +        ec = INVALID_STATE_ERR;
> +        return;
> +    }

The last time I checked, folks on public-webapp wanted to be able to parse both HTML and SVG properly.
If we're going without a proper SVG support, then I'd like to know why.

Also we should do a feature-announcement for this since this is a pretty significant change to the DOM API.

> Source/WebCore/dom/DocumentFragment.cpp:107
> +    RefPtr<DocumentFragment> fragment = createFragmentFromSource(html, document(), 0 /* contextElement */, ec);
> +    if (fragment)

WebKit style is to declare fragment inside the if's parenthesis or exit early when !fragment.
Also, we don't normally add inline comments like /* contextElement */.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1470
> +        if (isCaptionColOrColgroupTag(token.name()) || isTableBodyContextTag(token.name()))
> +            setInsertionMode(InTableMode);
> +        else if (token.name() == trTag)
> +            setInsertionMode(InTableBodyMode);
> +        else if (isTableCellContextTag(token.name()))
> +            setInsertionMode(InRowMode);
> +        else
> +            setInsertionMode(InBodyMode);

It seems like a bad idea to modify insertion mode like this. Can't we by-pass the code where the insertion mode is checked?
------- Comment #7 From 2012-04-24 10:21:41 PST -------
See also bug 60316 (saying in particular that support for document.innerHTML was removed from spec, making WebKit happier).
------- Comment #8 From 2012-04-25 21:02:02 PST -------
> 
> > Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1470
> > +        if (isCaptionColOrColgroupTag(token.name()) || isTableBodyContextTag(token.name()))
> > +            setInsertionMode(InTableMode);
> > +        else if (token.name() == trTag)
> > +            setInsertionMode(InTableBodyMode);
> > +        else if (isTableCellContextTag(token.name()))
> > +            setInsertionMode(InRowMode);
> > +        else
> > +            setInsertionMode(InBodyMode);
> 
> It seems like a bad idea to modify insertion mode like this. Can't we by-pass the code where the insertion mode is checked?

Actually, it's exactly the opposite -- bypassing insertion mode checks would bring the entire state machine to its knees.
------- Comment #9 From 2012-04-30 16:43:40 PST -------
Created an attachment (id=139544) [details]
Patch
------- Comment #10 From 2012-04-30 16:45:13 PST -------
Attachment 139544 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1
Source/WebCore/ChangeLog:11:  You should remove the 'No new tests' and either add and list tests, or explain why no new tests were possible.  [changelog/nonewtests] [5]
Total errors found: 1 in 13 files


If any of these errors are false positives, please file a bug against check-webkit-style.
------- Comment #11 From 2012-04-30 16:50:02 PST -------
Ok, this patch implements the current proposal: http://lists.w3.org/Archives/Public/public-webapps/2012AprJun/0334.html.

This no longer requires any parser changes (the only code changes here are to change the contextElement to a contextTag within the parser, so as to avoid having to create elements for the implied case).

Full review not necessary yet as I still need to add parser tests.
------- Comment #12 From 2012-04-30 17:02:03 PST -------
(In reply to comment #10)
> If any of these errors are false positives, please file a bug against check-webkit-style.

You can use webkit-patch --no-review to avoid setting r? flag.
------- Comment #13 From 2012-05-03 16:47:00 PST -------
Created an attachment (id=140125) [details]
Patch
------- Comment #14 From 2012-05-03 17:00:05 PST -------
(From update of attachment 140125 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=140125&action=review

> Source/WebCore/dom/QualifiedName.h:109
> +inline const QualifiedName& nullQName() { return anyName; }

Typo? This should be return nullName, I assume. Any idea why you need nullQName() in addition to nullName?

> Source/WebCore/dom/ShadowRoot.cpp:-147
> -    RefPtr<DocumentFragment> fragment = createFragmentFromSource(markup, host(), ec);

You're changing behavior of ShadowRoot, since you're now using the new parsing mode instead of using the host's context. Is this intentional?

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1685
> +        if (tagName.matches(selectTag)) {

Why matches() instead of operator==? matches() seems not quite as strict (it allows non-matching prefixes, not clear if this matters here).
------- Comment #15 From 2012-05-03 17:13:20 PST -------
(From update of attachment 140125 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=140125&action=review

Looks very reasonable.  aklein asks some good questions.

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:284
> +        if (m_queuedAtomicTokens.size()) {

There should be an isEmpty() function on m_queuedAtomicTokens that's more idiomatic than size().

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:286
> +                m_treeBuilder->constructTreeFromAtomicToken(*(it->get()));

What happens if this causes us to re-enter this loop?  It's better to make this code re-entrant, just in case.  For example, you can move the queue from m_queuedAtomicTokens to a local variable and then iterate over the local variable.  That when we won't get confused if we get re-entered.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1634
> +static const QualifiedName& getImpliedContextTag(const AtomicString& tagName)

getImpliedContextTag => toImpliedContextTag ?

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1640
> +    if (tagName == tbodyTag
> +        || tagName == tfootTag
> +        || tagName == theadTag
> +        || tagName == captionTag
> +        || tagName == colgroupTag) {

Isn't there a inline helper function for this set?  Something like isTableContext()?  Maybe that was just in dglazkov's patch.
------- Comment #16 From 2012-05-04 10:46:55 PST -------
(From update of attachment 140125 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=140125&action=review

>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:284
>> +        if (m_queuedAtomicTokens.size()) {
> 
> There should be an isEmpty() function on m_queuedAtomicTokens that's more idiomatic than size().

No reason for this if statement. Iterating should be fast on an empty vector, and clear should also be fast. Just leave the if statement out. Unless Adam’s suggestions make the code expensive enough that you need it.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1634
>> +static const QualifiedName& getImpliedContextTag(const AtomicString& tagName)
> 
> getImpliedContextTag => toImpliedContextTag ?

I’d suggest just impliedContextTag. I see no reason for a “get” or a “to” prefix on the function name.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1640
>> +        || tagName == colgroupTag) {
> 
> Isn't there a inline helper function for this set?  Something like isTableContext()?  Maybe that was just in dglazkov's patch.

It’s good to have a named helper even if there isn’t already one, especially if there is a name in the HTML standard document we can use.

> Source/WebCore/html/parser/HTMLTreeBuilder.h:90
> +    inline bool canContinue(HTMLToken& token) { return !m_fragmentContext.fragment() || m_fragmentContext.contextTag() != nullQName() || tryToSetImpliedContextTag(token); }

The “inline” here is unneeded and has no effect.

This member function should be const.

> Source/WebCore/xml/parser/MarkupTokenBase.h:525
> +    WTF::Vector<UChar> m_internalCharacters;

No WTF:: prefix is needed here.

Is it the right tradeoff to always pay the overhead of a vector for internal characters? A Vector is a reasonable data structure if the buffer will be resized. If not, then an OwnArrayPtr plus a separate size would be more efficient.
------- Comment #17 From 2012-05-07 11:13:25 PST -------
(From update of attachment 140125 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=140125&action=review

>> Source/WebCore/dom/QualifiedName.h:109
>> +inline const QualifiedName& nullQName() { return anyName; }
> 
> Typo? This should be return nullName, I assume. Any idea why you need nullQName() in addition to nullName?

Copy & paste error, yes. Thanks for the catch. This appears to be a GCC static initialization issue, but to be honest, I'm not clear on exactly what the issue is.

>> Source/WebCore/dom/ShadowRoot.cpp:-147
>> -    RefPtr<DocumentFragment> fragment = createFragmentFromSource(markup, host(), ec);
> 
> You're changing behavior of ShadowRoot, since you're now using the new parsing mode instead of using the host's context. Is this intentional?

Yes. Dimitri says this is what ShadowRoot wants.

>>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:284
>>> +        if (m_queuedAtomicTokens.size()) {
>> 
>> There should be an isEmpty() function on m_queuedAtomicTokens that's more idiomatic than size().
> 
> No reason for this if statement. Iterating should be fast on an empty vector, and clear should also be fast. Just leave the if statement out. Unless Adam’s suggestions make the code expensive enough that you need it.

Yeah, I'm going to take adam's suggestion, so I'll swap to a local Vector if not empty, and leave this check in place.

>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:286
>> +                m_treeBuilder->constructTreeFromAtomicToken(*(it->get()));
> 
> What happens if this causes us to re-enter this loop?  It's better to make this code re-entrant, just in case.  For example, you can move the queue from m_queuedAtomicTokens to a local variable and then iterate over the local variable.  That when we won't get confused if we get re-entered.

This seems like a good safety measure so that the failure case is bad output and not a crash. However, I think it's strictly an error to end up in the case. Adding ASSERTs to this effect.

>>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1634
>>> +static const QualifiedName& getImpliedContextTag(const AtomicString& tagName)
>> 
>> getImpliedContextTag => toImpliedContextTag ?
> 
> I’d suggest just impliedContextTag. I see no reason for a “get” or a “to” prefix on the function name.

done

>>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1640
>>> +        || tagName == colgroupTag) {
>> 
>> Isn't there a inline helper function for this set?  Something like isTableContext()?  Maybe that was just in dglazkov's patch.
> 
> It’s good to have a named helper even if there isn’t already one, especially if there is a name in the HTML standard document we can use.

Yup. done.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1685
>> +        if (tagName.matches(selectTag)) {
> 
> Why matches() instead of operator==? matches() seems not quite as strict (it allows non-matching prefixes, not clear if this matters here).

Honestly, I don't know. That's what node->hasTagName does for it's implementation, and I didn't want to risk changing the semantics here. Abarth? Is it safe you make these ==?

>> Source/WebCore/html/parser/HTMLTreeBuilder.h:90
>> +    inline bool canContinue(HTMLToken& token) { return !m_fragmentContext.fragment() || m_fragmentContext.contextTag() != nullQName() || tryToSetImpliedContextTag(token); }
> 
> The “inline” here is unneeded and has no effect.
> 
> This member function should be const.

inline removed. It can't be const because it calls non-const member functions transitively

>> Source/WebCore/xml/parser/MarkupTokenBase.h:525
>> +    WTF::Vector<UChar> m_internalCharacters;
> 
> No WTF:: prefix is needed here.
> 
> Is it the right tradeoff to always pay the overhead of a vector for internal characters? A Vector is a reasonable data structure if the buffer will be resized. If not, then an OwnArrayPtr plus a separate size would be more efficient.

Now using an OwnArrayPtr.
------- Comment #18 From 2012-05-07 11:15:00 PST -------
Thanks, everyone for the review comments. This patch addresses all comments. Tests coming.
------- Comment #19 From 2012-05-07 11:28:06 PST -------
Created an attachment (id=140560) [details]
Patch
------- Comment #20 From 2012-05-09 15:25:18 PST -------
Created an attachment (id=141027) [details]
Patch
------- Comment #21 From 2012-05-09 15:26:51 PST -------
removed incorrect ASSERT(). pumpTokenizer needs to be called at any point so that new network IO can be processed. This isn't a problem is practice for this patch, but it's a conceptual problem since the HTMLDocumentParser is stateful WRT network io.
------- Comment #22 From 2012-05-10 10:23:00 PST -------
Created an attachment (id=141194) [details]
Patch
------- Comment #23 From 2012-05-10 11:36:00 PST -------
(From update of attachment 141194 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=141194&action=review

> Source/WebCore/dom/DocumentFragment.h:42
> +    String innerHTML() const;
> +    void setInnerHTML(const String&, ExceptionCode&);

Should we be adding this without a define? I mean, I trust that you're awesome, but if we have to adjust our approach,  I'd hate to have a shipping implementation out there somewhere with the old bits.
------- Comment #24 From 2012-05-16 16:21:48 PST -------
Created an attachment (id=142363) [details]
Patch
------- Comment #25 From 2012-05-16 18:33:49 PST -------
This last patch now represents the consensus with Mozilla about the details, which includes moving the API call from DocumentFragment.innerHTML to Document.parse().

This patch also now includes tests.

The main thing left to do here is to support the inference of SVG and MathML elements.
------- Comment #26 From 2012-05-16 20:40:53 PST -------
(From update of attachment 142363 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=142363&action=review

This looks great.  Thanks for talking this through with folks on the standards lists.  I think this is a great result.

> Source/WebCore/dom/Document.idl:359
> +#if defined(ENABLE_IMPLIED_CONTEXT_PARSING) && ENABLE_IMPLIED_CONTEXT_PARSING

You can just use the [Conditional] IDL attribute as on line 357 of this file.

> Source/WebCore/dom/ShadowRoot.cpp:151
> -    RefPtr<DocumentFragment> fragment = createFragmentFromSource(markup, host(), ec);
> +#if ENABLE(IMPLIED_CONTEXT_PARSING)
> +    RefPtr<DocumentFragment> fragment = createFragmentFromSource(markup, document(), 0 /* contextElement */, ec);
> +#else
> +    RefPtr<DocumentFragment> fragment = createFragmentFromSource(markup, document(), host(), ec);
> +#endif

Is this part of the change still correct?

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:247
> +// DocumentFragment.innerHTML is based on the identity of the tagName of the first startTag token encountered

DocumentFragment.innerHTML -> document.parse?  Or maybe we're doing both.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:257
> +        || tagName == colgroupTag) {
> +        return tableTag;
> +    }

No { } in WebKit style, I believe.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1482
> +    case ImpliedContextMode:

Ah, so we're using a mode.  Interesting.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1491
> +

nit: I think we've been skipping this blank line in this file.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1500
> +        ASSERT(false);

ASSERT_NOT_REACHED()

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:2285
> +        ASSERT(false);

ASSERT_NOT_REACHED

> LayoutTests/html5lib/resources/runner.js:160
> +                element = document.parse(input);

That is pretty hot.
------- Comment #27 From 2012-05-17 11:01:29 PST -------
Created an attachment (id=142504) [details]
Patch
------- Comment #28 From 2012-05-17 11:01:47 PST -------
(From update of attachment 142363 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=142363&action=review

>> Source/WebCore/dom/ShadowRoot.cpp:151
>> +#endif
> 
> Is this part of the change still correct?

I asked Dimitri about this and he said that they want ShadowRoot.innerHTML to parse context-free, but to be honest I can't remember the use case. It seems naively that it makes more sense to use the host() as the context element. I'm going to remove this and leave it up the ShadowRoot people to re-enable implied-context parsing if that's what they really want.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:247
>> +// DocumentFragment.innerHTML is based on the identity of the tagName of the first startTag token encountered
> 
> DocumentFragment.innerHTML -> document.parse?  Or maybe we're doing both.

Nope. The comment just needed an update.

>> LayoutTests/html5lib/resources/runner.js:160
>> +                element = document.parse(input);
> 
> That is pretty hot.

;-)
------- Comment #29 From 2012-05-23 17:35:36 PST -------
(From update of attachment 142504 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=142504&action=review

> Source/WebCore/dom/Document.cpp:834
> +    return createFragmentFromSource(html, document(), 0 /* contextElement */, ec);

Since we want parsed script elements to run when it's inserted to somewhere in the document, we can't use createFragmentFromSource.
I'll give these functions more descriptive names. We've accumulated so many functions of obnoxious names in this area.
------- Comment #30 From 2012-05-23 18:11:27 PST -------
(From update of attachment 142504 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=142504&action=review

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:254
> +    if (isTableBodyContextTag(tagName)
> +        || tagName == captionTag
> +        || tagName == colgroupTag)

Would read better all on one line.
------- Comment #31 From 2012-05-24 13:22:34 PST -------
Created an attachment (id=143878) [details]
Patch
------- Comment #32 From 2012-05-24 13:24:18 PST -------
(From update of attachment 142504 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=142504&action=review

>> Source/WebCore/dom/Document.cpp:834
>> +    return createFragmentFromSource(html, document(), 0 /* contextElement */, ec);
> 
> Since we want parsed script elements to run when it's inserted to somewhere in the document, we can't use createFragmentFromSource.
> I'll give these functions more descriptive names. We've accumulated so many functions of obnoxious names in this area.

createFragmentFromSource now takes an optional FragmentScriptingPermission argument which is set to AllowScriptingContentAndDoNotMarkAlreadyStarted in this case.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:254
>> +        || tagName == colgroupTag)
> 
> Would read better all on one line.

done
------- Comment #33 From 2012-05-24 13:33:14 PST -------
(In reply to comment #32)
> (From update of attachment 142504 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=142504&action=review
> 
> >> Source/WebCore/dom/Document.cpp:834
> >> +    return createFragmentFromSource(html, document(), 0 /* contextElement */, ec);
> > 
> > Since we want parsed script elements to run when it's inserted to somewhere in the document, we can't use createFragmentFromSource.
> > I'll give these functions more descriptive names. We've accumulated so many functions of obnoxious names in this area.
> 
> createFragmentFromSource now takes an optional FragmentScriptingPermission argument which is set to AllowScriptingContentAndDoNotMarkAlreadyStarted in this case.

Sorry, I have a patch on https://bugs.webkit.org/show_bug.cgi?id=87339 that's going to conflict with your patch. The function is renamed to createFragmentForInnerOuterHTML. However, I'm intending to do another around of cleanups once that patch is landed.
------- Comment #34 From 2012-05-24 13:49:36 PST -------
(From update of attachment 143878 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=143878&action=review

Just some drive-by nits:

> Source/WebCore/ChangeLog:3
> +        Implement HTMLTemplateElement

IIRC, you've changed the title of the bug.

> Source/WebCore/dom/Document.idl:359
> +        [Conditional=IMPLIED_CONTEXT_PARSING] DocumentFragment parse(in [TreatNullAs=NullString,Optional=DefaultIsUndefined] DOMString tagname)

Is our style to not have spaces between attribute options?

> Source/WebCore/dom/QualifiedName.cpp:117
> +        new ((void*)&nullName) QualifiedName(nullAtom, nullAtom, nullAtom);

It makes me very suspicious that we've survived this long w/o this and that we'd now need it...

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:250
> +const QualifiedName& getImpliedContextTag(const AtomicString& tagName)

I think a more webkity-name would be something like impliedContextForTag().  The style-guide is kinda anti "get" on "getters" (not that this is an explicit getter per-say).

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1486
> +        if (token.name() == styleTag
> +            || token.name() == scriptTag
> +            || token.name() == metaTag
> +            || token.name() == linkTag) {

Do we have a name for this class of tag names?

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1698
> -        if (node->hasTagName(tbodyTag) || node->hasTagName(theadTag) || node->hasTagName(tfootTag))
> +        if (tagName.matches(tbodyTag) || tagName.matches(theadTag) || tagName.matches(tfootTag))

I believe we have helper functions for some classes of tag names, like possibly this one.  Something like isTableSection(tagName) might already exist for you.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1725
> +            ASSERT(tagName != nullQName());

QualifiedName doesn't have an isNull()?
------- Comment #35 From 2012-05-24 14:25:29 PST -------
(From update of attachment 143878 [details])
Attachment 143878 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/12803249
------- Comment #36 From 2012-05-24 14:27:18 PST -------
(From update of attachment 143878 [details])
Attachment 143878 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/12801244
------- Comment #37 From 2012-05-25 00:30:54 PST -------
Do ensure to test that:

* document.parse() throws a TypeError
* document.parse(null) returns a fragment with a single text node "null"
* document.parse(undefined) returns a fragment with a single text node "undefined"
------- Comment #38 From 2012-05-25 15:05:48 PST -------
@Ms2ger: So it would seem the most natural thing would be to have the same behavior as range.createContextualFragment.

In Webkit this is:

empty & undefined => "undefined"
null => "null"

This doesn't appear to match Gecko, which is

empty => throws
undefined => "undefined"
null => empty fragment (no children)

=-(.

It seems like this is an IDL issue. Anyone else have opinions here?

(For now, I'll update the patch such that this is consistent with WebKit's createContextualFragment, and adds tests for these cases).
------- Comment #39 From 2012-05-25 15:49:25 PST -------
(From update of attachment 143878 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=143878&action=review

>> Source/WebCore/ChangeLog:3
>> +        Implement HTMLTemplateElement
> 
> IIRC, you've changed the title of the bug.

Yes. Sorry. This is actually the wrong changelog. Git fail. Fixing...

>> Source/WebCore/dom/Document.idl:359
>> +        [Conditional=IMPLIED_CONTEXT_PARSING] DocumentFragment parse(in [TreatNullAs=NullString,Optional=DefaultIsUndefined] DOMString tagname)
> 
> Is our style to not have spaces between attribute options?

It appears to be mixed, but I've changed this to match Range.createContextualFragment, so now there's only one k/v

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:250
>> +const QualifiedName& getImpliedContextTag(const AtomicString& tagName)
> 
> I think a more webkity-name would be something like impliedContextForTag().  The style-guide is kinda anti "get" on "getters" (not that this is an explicit getter per-say).

done

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1486
>> +            || token.name() == linkTag) {
> 
> Do we have a name for this class of tag names?

unfortunately, no.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1698
>> +        if (tagName.matches(tbodyTag) || tagName.matches(theadTag) || tagName.matches(tfootTag))
> 
> I believe we have helper functions for some classes of tag names, like possibly this one.  Something like isTableSection(tagName) might already exist for you.

This is matching qualified names. the helper functions you're describing only match local name. There may be no impact, but I'd be nervous about changing the semantics here.
------- Comment #40 From 2012-05-25 16:13:43 PST -------
> It seems like this is an IDL issue. Anyone else have opinions here?

The behavior for empty in range.createContextualFragment is likely a result of WebKit's previous lax treatment of missing parameters.  We've kept the legacy behavior in many places to avoid breaking existing content.  For new APIs like this one, we're trying to follow the specs and throw when we have too few arguments, like Gecko does.
------- Comment #41 From 2012-05-25 16:25:53 PST -------
fair enough. doing as Ms2ger & abarth suggest...
------- Comment #42 From 2012-05-26 17:15:01 PST -------
(In reply to comment #40)
> > It seems like this is an IDL issue. Anyone else have opinions here?
> 
> The behavior for empty in range.createContextualFragment is likely a result of WebKit's previous lax treatment of missing parameters.  We've kept the legacy behavior in many places to avoid breaking existing content.  For new APIs like this one, we're trying to follow the specs and throw when we have too few arguments, like Gecko does.

That covers the no-argument case, but there’s also the null case. The way that normally works is that we set the proper flags in the IDL (or perhaps it’s the default behavior) so that null turns into a null string at the binding layer. That’s distinct from an empty string; the function in the DOM can check with isNull and handle it properly, or if the handling for null and an empty string are identical, it can just omit an explicit isNull check and typically get the correct result without any special code, since null strings return true for isEmpty and 0 for length.
------- Comment #43 From 2012-05-31 11:46:02 PST -------
Created an attachment (id=145123) [details]
Patch
------- Comment #44 From 2012-05-31 11:48:56 PST -------
Recent comments addressed. Tests added for empty/null/undefined argument.

@darin: Simply removing the TreatNullAs=NullString from the IDL brought the behavior inline what Ms2ger & Abarth are asking for. I'm honestly unsure if this is what you had wanted. Please let me know.
------- Comment #45 From 2012-06-01 10:43:56 PST -------
(In reply to comment #44)
> Recent comments addressed. Tests added for empty/null/undefined argument.
> 
> @darin: Simply removing the TreatNullAs=NullString from the IDL brought the behavior inline what Ms2ger & Abarth are asking for.

Good.

> I'm honestly unsure if this is what you had wanted. Please let me know.

I’m sorry you had trouble understanding.

I wanted you to both set the flag in the IDL and change the C++ function if needed, to get the desired result. Sounds like you did. I assume that the test results provide the proof that it’s right.
------- Comment #46 From 2012-06-04 14:20:53 PST -------
(From update of attachment 145123 [details])
Attachment 145123 [details] did not pass qt-wk2-ews (qt):
Output: http://queues.webkit.org/results/12902299
------- Comment #47 From 2012-06-04 14:24:17 PST -------
(From update of attachment 145123 [details])
Attachment 145123 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/12906082
------- Comment #48 From 2012-06-04 14:27:59 PST -------
(From update of attachment 145123 [details])
Attachment 145123 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/12902300
------- Comment #49 From 2012-06-04 15:04:12 PST -------
(From update of attachment 145123 [details])
Attachment 145123 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/12900295
------- Comment #50 From 2012-06-04 15:08:59 PST -------
(From update of attachment 145123 [details])
Attachment 145123 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/12895398
------- Comment #51 From 2012-06-04 15:30:21 PST -------
Created an attachment (id=145640) [details]
Patch
------- Comment #52 From 2012-06-04 15:31:31 PST -------
(lastest patch: missing semi-colon broke llvm builds)
------- Comment #53 From 2012-08-22 15:24:44 PST -------
Unclear what the next step here is.  Is this just missing a technical review, or is this caught up in standards work still?
------- Comment #54 From 2012-08-22 15:39:12 PST -------
(In reply to comment #53)
> Unclear what the next step here is.  Is this just missing a technical review, or is this caught up in standards work still?

Standards work.
------- Comment #55 From 2012-09-13 15:03:52 PST -------
(From update of attachment 145640 [details])
Clearing r? flag until the standards discussion settles.
------- Comment #56 From 2012-10-17 11:17:35 PST -------
Document.parse() is now superceeded by the HTMLTemplateElement:

http://dvcs.w3.org/hg/webcomponents/raw-file/tip/spec/templates/index.html

https://bugs.webkit.org/show_bug.cgi?id=86031