Bug 84646 - Implement Document.parse
Summary: Implement Document.parse
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Rafael Weinstein
URL:
Keywords: WebExposed
Depends on:
Blocks:
 
Reported: 2012-04-23 15:58 PDT by Rafael Weinstein
Modified: 2012-11-28 12:16 PST (History)
24 users (show)

See Also:


Attachments
Patch (12.90 KB, patch)
2012-04-23 16:00 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff
Patch (26.87 KB, patch)
2012-04-30 16:43 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff
Patch (27.98 KB, patch)
2012-05-03 16:47 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff
Patch (29.07 KB, patch)
2012-05-07 11:28 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff
Patch (28.66 KB, patch)
2012-05-09 15:25 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff
Patch (28.66 KB, patch)
2012-05-10 10:23 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff
Patch (29.88 KB, patch)
2012-05-16 16:21 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff
Patch (30.43 KB, patch)
2012-05-17 11:01 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff
Patch (31.10 KB, patch)
2012-05-24 13:22 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff
Patch (33.60 KB, patch)
2012-05-31 11:46 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff
Patch (33.52 KB, patch)
2012-06-04 15:30 PDT, Rafael Weinstein
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Rafael Weinstein 2012-04-23 15:58:31 PDT
Should be willing to parse without a context element. I.e.

df.innerHTML = '<tr><td>Foo</td></tr>'

creates a single child for the fragment which is a <tr>, which has a single <td> child, which has a single 'Foo' textNode child.
Comment 1 Rafael Weinstein 2012-04-23 16:00:40 PDT
Created attachment 138447 [details]
Patch
Comment 2 WebKit Review Bot 2012-04-23 16:02:59 PDT
Attachment 138447 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1
Source/WebCore/ChangeLog:8:  You should remove the 'No new tests' and either add and list tests, or explain why no new tests were possible.  [changelog/nonewtests] [5]
Total errors found: 1 in 10 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Rafael Weinstein 2012-04-23 16:03:17 PDT
The discussion on public-webapps about the template element has gotten very focused on the semantics of the template elements' children. What doesn't seem to be controversial is the changes to allow context-free parsing.

Given that this has come up before and been requested as an independently useful thing, I thought a good way to make progress would be to separate the concerns.

This patch just adds innerHTML to DocumentFragment and makes the parser changes.

I'd looking for opinions on if this makes sense and if this is good way to make forward progress.
Comment 4 Eric Seidel (no email) 2012-04-23 16:11:50 PDT
Comment on attachment 138447 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=138447&action=review

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:390
> +    m_tree.openElements()->pushRootNode(fragment);
> +    resetInsertionModeAppropriately();

I remember this being a kinda big deal.  Having a non-element on the element stack causes a lot of complexity in the parser.  I feel like there was a dream of removing this contextElement stuff at some point.  Or maybe it was the other way... of always requiring a context element.  This part of the change scares me, and I need to page this all back in to be sure it's OK.

> Source/WebCore/html/parser/HTMLTreeBuilder.h:96
>      enum InsertionMode {
> +        ContextFreeMode,

I take it this is a new section fo the spec?
Comment 5 Rafael Weinstein 2012-04-23 16:18:51 PDT
Comment on attachment 138447 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=138447&action=review

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:390
>> +    resetInsertionModeAppropriately();
> 
> I remember this being a kinda big deal.  Having a non-element on the element stack causes a lot of complexity in the parser.  I feel like there was a dream of removing this contextElement stuff at some point.  Or maybe it was the other way... of always requiring a context element.  This part of the change scares me, and I need to page this all back in to be sure it's OK.

Any and all guidance is apppreciated

>> Source/WebCore/html/parser/HTMLTreeBuilder.h:96
>> +        ContextFreeMode,
> 
> I take it this is a new section fo the spec?

Yes
Comment 6 Ryosuke Niwa 2012-04-23 22:42:53 PDT
Comment on attachment 138447 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=138447&action=review

Clearing r? since I don't think you're looking for a formal review.

> Source/WebCore/dom/DocumentFragment.cpp:104
> +    if (!document()->isHTMLDocument()) {
> +        ec = INVALID_STATE_ERR;
> +        return;
> +    }

The last time I checked, folks on public-webapp wanted to be able to parse both HTML and SVG properly.
If we're going without a proper SVG support, then I'd like to know why.

Also we should do a feature-announcement for this since this is a pretty significant change to the DOM API.

> Source/WebCore/dom/DocumentFragment.cpp:107
> +    RefPtr<DocumentFragment> fragment = createFragmentFromSource(html, document(), 0 /* contextElement */, ec);
> +    if (fragment)

WebKit style is to declare fragment inside the if's parenthesis or exit early when !fragment.
Also, we don't normally add inline comments like /* contextElement */.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1470
> +        if (isCaptionColOrColgroupTag(token.name()) || isTableBodyContextTag(token.name()))
> +            setInsertionMode(InTableMode);
> +        else if (token.name() == trTag)
> +            setInsertionMode(InTableBodyMode);
> +        else if (isTableCellContextTag(token.name()))
> +            setInsertionMode(InRowMode);
> +        else
> +            setInsertionMode(InBodyMode);

It seems like a bad idea to modify insertion mode like this. Can't we by-pass the code where the insertion mode is checked?
Comment 7 Alexey Proskuryakov 2012-04-24 10:21:41 PDT
See also bug 60316 (saying in particular that support for document.innerHTML was removed from spec, making WebKit happier).
Comment 8 Dimitri Glazkov (Google) 2012-04-25 21:02:02 PDT
> 
> > Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1470
> > +        if (isCaptionColOrColgroupTag(token.name()) || isTableBodyContextTag(token.name()))
> > +            setInsertionMode(InTableMode);
> > +        else if (token.name() == trTag)
> > +            setInsertionMode(InTableBodyMode);
> > +        else if (isTableCellContextTag(token.name()))
> > +            setInsertionMode(InRowMode);
> > +        else
> > +            setInsertionMode(InBodyMode);
> 
> It seems like a bad idea to modify insertion mode like this. Can't we by-pass the code where the insertion mode is checked?

Actually, it's exactly the opposite -- bypassing insertion mode checks would bring the entire state machine to its knees.
Comment 9 Rafael Weinstein 2012-04-30 16:43:40 PDT
Created attachment 139544 [details]
Patch
Comment 10 WebKit Review Bot 2012-04-30 16:45:13 PDT
Attachment 139544 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1
Source/WebCore/ChangeLog:11:  You should remove the 'No new tests' and either add and list tests, or explain why no new tests were possible.  [changelog/nonewtests] [5]
Total errors found: 1 in 13 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 11 Rafael Weinstein 2012-04-30 16:50:02 PDT
Ok, this patch implements the current proposal: http://lists.w3.org/Archives/Public/public-webapps/2012AprJun/0334.html.

This no longer requires any parser changes (the only code changes here are to change the contextElement to a contextTag within the parser, so as to avoid having to create elements for the implied case).

Full review not necessary yet as I still need to add parser tests.
Comment 12 Ryosuke Niwa 2012-04-30 17:02:03 PDT
(In reply to comment #10)
> If any of these errors are false positives, please file a bug against check-webkit-style.

You can use webkit-patch --no-review to avoid setting r? flag.
Comment 13 Rafael Weinstein 2012-05-03 16:47:00 PDT
Created attachment 140125 [details]
Patch
Comment 14 Adam Klein 2012-05-03 17:00:05 PDT
Comment on attachment 140125 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=140125&action=review

> Source/WebCore/dom/QualifiedName.h:109
> +inline const QualifiedName& nullQName() { return anyName; }

Typo? This should be return nullName, I assume. Any idea why you need nullQName() in addition to nullName?

> Source/WebCore/dom/ShadowRoot.cpp:-147
> -    RefPtr<DocumentFragment> fragment = createFragmentFromSource(markup, host(), ec);

You're changing behavior of ShadowRoot, since you're now using the new parsing mode instead of using the host's context. Is this intentional?

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1685
> +        if (tagName.matches(selectTag)) {

Why matches() instead of operator==? matches() seems not quite as strict (it allows non-matching prefixes, not clear if this matters here).
Comment 15 Adam Barth 2012-05-03 17:13:20 PDT
Comment on attachment 140125 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=140125&action=review

Looks very reasonable.  aklein asks some good questions.

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:284
> +        if (m_queuedAtomicTokens.size()) {

There should be an isEmpty() function on m_queuedAtomicTokens that's more idiomatic than size().

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:286
> +                m_treeBuilder->constructTreeFromAtomicToken(*(it->get()));

What happens if this causes us to re-enter this loop?  It's better to make this code re-entrant, just in case.  For example, you can move the queue from m_queuedAtomicTokens to a local variable and then iterate over the local variable.  That when we won't get confused if we get re-entered.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1634
> +static const QualifiedName& getImpliedContextTag(const AtomicString& tagName)

getImpliedContextTag => toImpliedContextTag ?

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1640
> +    if (tagName == tbodyTag
> +        || tagName == tfootTag
> +        || tagName == theadTag
> +        || tagName == captionTag
> +        || tagName == colgroupTag) {

Isn't there a inline helper function for this set?  Something like isTableContext()?  Maybe that was just in dglazkov's patch.
Comment 16 Darin Adler 2012-05-04 10:46:55 PDT
Comment on attachment 140125 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=140125&action=review

>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:284
>> +        if (m_queuedAtomicTokens.size()) {
> 
> There should be an isEmpty() function on m_queuedAtomicTokens that's more idiomatic than size().

No reason for this if statement. Iterating should be fast on an empty vector, and clear should also be fast. Just leave the if statement out. Unless Adam’s suggestions make the code expensive enough that you need it.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1634
>> +static const QualifiedName& getImpliedContextTag(const AtomicString& tagName)
> 
> getImpliedContextTag => toImpliedContextTag ?

I’d suggest just impliedContextTag. I see no reason for a “get” or a “to” prefix on the function name.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1640
>> +        || tagName == colgroupTag) {
> 
> Isn't there a inline helper function for this set?  Something like isTableContext()?  Maybe that was just in dglazkov's patch.

It’s good to have a named helper even if there isn’t already one, especially if there is a name in the HTML standard document we can use.

> Source/WebCore/html/parser/HTMLTreeBuilder.h:90
> +    inline bool canContinue(HTMLToken& token) { return !m_fragmentContext.fragment() || m_fragmentContext.contextTag() != nullQName() || tryToSetImpliedContextTag(token); }

The “inline” here is unneeded and has no effect.

This member function should be const.

> Source/WebCore/xml/parser/MarkupTokenBase.h:525
> +    WTF::Vector<UChar> m_internalCharacters;

No WTF:: prefix is needed here.

Is it the right tradeoff to always pay the overhead of a vector for internal characters? A Vector is a reasonable data structure if the buffer will be resized. If not, then an OwnArrayPtr plus a separate size would be more efficient.
Comment 17 Rafael Weinstein 2012-05-07 11:13:25 PDT
Comment on attachment 140125 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=140125&action=review

>> Source/WebCore/dom/QualifiedName.h:109
>> +inline const QualifiedName& nullQName() { return anyName; }
> 
> Typo? This should be return nullName, I assume. Any idea why you need nullQName() in addition to nullName?

Copy & paste error, yes. Thanks for the catch. This appears to be a GCC static initialization issue, but to be honest, I'm not clear on exactly what the issue is.

>> Source/WebCore/dom/ShadowRoot.cpp:-147
>> -    RefPtr<DocumentFragment> fragment = createFragmentFromSource(markup, host(), ec);
> 
> You're changing behavior of ShadowRoot, since you're now using the new parsing mode instead of using the host's context. Is this intentional?

Yes. Dimitri says this is what ShadowRoot wants.

>>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:284
>>> +        if (m_queuedAtomicTokens.size()) {
>> 
>> There should be an isEmpty() function on m_queuedAtomicTokens that's more idiomatic than size().
> 
> No reason for this if statement. Iterating should be fast on an empty vector, and clear should also be fast. Just leave the if statement out. Unless Adam’s suggestions make the code expensive enough that you need it.

Yeah, I'm going to take adam's suggestion, so I'll swap to a local Vector if not empty, and leave this check in place.

>> Source/WebCore/html/parser/HTMLDocumentParser.cpp:286
>> +                m_treeBuilder->constructTreeFromAtomicToken(*(it->get()));
> 
> What happens if this causes us to re-enter this loop?  It's better to make this code re-entrant, just in case.  For example, you can move the queue from m_queuedAtomicTokens to a local variable and then iterate over the local variable.  That when we won't get confused if we get re-entered.

This seems like a good safety measure so that the failure case is bad output and not a crash. However, I think it's strictly an error to end up in the case. Adding ASSERTs to this effect.

>>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1634
>>> +static const QualifiedName& getImpliedContextTag(const AtomicString& tagName)
>> 
>> getImpliedContextTag => toImpliedContextTag ?
> 
> I’d suggest just impliedContextTag. I see no reason for a “get” or a “to” prefix on the function name.

done

>>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1640
>>> +        || tagName == colgroupTag) {
>> 
>> Isn't there a inline helper function for this set?  Something like isTableContext()?  Maybe that was just in dglazkov's patch.
> 
> It’s good to have a named helper even if there isn’t already one, especially if there is a name in the HTML standard document we can use.

Yup. done.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1685
>> +        if (tagName.matches(selectTag)) {
> 
> Why matches() instead of operator==? matches() seems not quite as strict (it allows non-matching prefixes, not clear if this matters here).

Honestly, I don't know. That's what node->hasTagName does for it's implementation, and I didn't want to risk changing the semantics here. Abarth? Is it safe you make these ==?

>> Source/WebCore/html/parser/HTMLTreeBuilder.h:90
>> +    inline bool canContinue(HTMLToken& token) { return !m_fragmentContext.fragment() || m_fragmentContext.contextTag() != nullQName() || tryToSetImpliedContextTag(token); }
> 
> The “inline” here is unneeded and has no effect.
> 
> This member function should be const.

inline removed. It can't be const because it calls non-const member functions transitively

>> Source/WebCore/xml/parser/MarkupTokenBase.h:525
>> +    WTF::Vector<UChar> m_internalCharacters;
> 
> No WTF:: prefix is needed here.
> 
> Is it the right tradeoff to always pay the overhead of a vector for internal characters? A Vector is a reasonable data structure if the buffer will be resized. If not, then an OwnArrayPtr plus a separate size would be more efficient.

Now using an OwnArrayPtr.
Comment 18 Rafael Weinstein 2012-05-07 11:15:00 PDT
Thanks, everyone for the review comments. This patch addresses all comments. Tests coming.
Comment 19 Rafael Weinstein 2012-05-07 11:28:06 PDT
Created attachment 140560 [details]
Patch
Comment 20 Rafael Weinstein 2012-05-09 15:25:18 PDT
Created attachment 141027 [details]
Patch
Comment 21 Rafael Weinstein 2012-05-09 15:26:51 PDT
removed incorrect ASSERT(). pumpTokenizer needs to be called at any point so that new network IO can be processed. This isn't a problem is practice for this patch, but it's a conceptual problem since the HTMLDocumentParser is stateful WRT network io.
Comment 22 Rafael Weinstein 2012-05-10 10:23:00 PDT
Created attachment 141194 [details]
Patch
Comment 23 Dimitri Glazkov (Google) 2012-05-10 11:36:00 PDT
Comment on attachment 141194 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=141194&action=review

> Source/WebCore/dom/DocumentFragment.h:42
> +    String innerHTML() const;
> +    void setInnerHTML(const String&, ExceptionCode&);

Should we be adding this without a define? I mean, I trust that you're awesome, but if we have to adjust our approach,  I'd hate to have a shipping implementation out there somewhere with the old bits.
Comment 24 Rafael Weinstein 2012-05-16 16:21:48 PDT
Created attachment 142363 [details]
Patch
Comment 25 Rafael Weinstein 2012-05-16 18:33:49 PDT
This last patch now represents the consensus with Mozilla about the details, which includes moving the API call from DocumentFragment.innerHTML to Document.parse().

This patch also now includes tests.

The main thing left to do here is to support the inference of SVG and MathML elements.
Comment 26 Adam Barth 2012-05-16 20:40:53 PDT
Comment on attachment 142363 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=142363&action=review

This looks great.  Thanks for talking this through with folks on the standards lists.  I think this is a great result.

> Source/WebCore/dom/Document.idl:359
> +#if defined(ENABLE_IMPLIED_CONTEXT_PARSING) && ENABLE_IMPLIED_CONTEXT_PARSING

You can just use the [Conditional] IDL attribute as on line 357 of this file.

> Source/WebCore/dom/ShadowRoot.cpp:151
> -    RefPtr<DocumentFragment> fragment = createFragmentFromSource(markup, host(), ec);
> +#if ENABLE(IMPLIED_CONTEXT_PARSING)
> +    RefPtr<DocumentFragment> fragment = createFragmentFromSource(markup, document(), 0 /* contextElement */, ec);
> +#else
> +    RefPtr<DocumentFragment> fragment = createFragmentFromSource(markup, document(), host(), ec);
> +#endif

Is this part of the change still correct?

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:247
> +// DocumentFragment.innerHTML is based on the identity of the tagName of the first startTag token encountered

DocumentFragment.innerHTML -> document.parse?  Or maybe we're doing both.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:257
> +        || tagName == colgroupTag) {
> +        return tableTag;
> +    }

No { } in WebKit style, I believe.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1482
> +    case ImpliedContextMode:

Ah, so we're using a mode.  Interesting.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1491
> +

nit: I think we've been skipping this blank line in this file.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1500
> +        ASSERT(false);

ASSERT_NOT_REACHED()

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:2285
> +        ASSERT(false);

ASSERT_NOT_REACHED

> LayoutTests/html5lib/resources/runner.js:160
> +                element = document.parse(input);

That is pretty hot.
Comment 27 Rafael Weinstein 2012-05-17 11:01:29 PDT
Created attachment 142504 [details]
Patch
Comment 28 Rafael Weinstein 2012-05-17 11:01:47 PDT
Comment on attachment 142363 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=142363&action=review

>> Source/WebCore/dom/ShadowRoot.cpp:151
>> +#endif
> 
> Is this part of the change still correct?

I asked Dimitri about this and he said that they want ShadowRoot.innerHTML to parse context-free, but to be honest I can't remember the use case. It seems naively that it makes more sense to use the host() as the context element. I'm going to remove this and leave it up the ShadowRoot people to re-enable implied-context parsing if that's what they really want.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:247
>> +// DocumentFragment.innerHTML is based on the identity of the tagName of the first startTag token encountered
> 
> DocumentFragment.innerHTML -> document.parse?  Or maybe we're doing both.

Nope. The comment just needed an update.

>> LayoutTests/html5lib/resources/runner.js:160
>> +                element = document.parse(input);
> 
> That is pretty hot.

;-)
Comment 29 Ryosuke Niwa 2012-05-23 17:35:36 PDT
Comment on attachment 142504 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=142504&action=review

> Source/WebCore/dom/Document.cpp:834
> +    return createFragmentFromSource(html, document(), 0 /* contextElement */, ec);

Since we want parsed script elements to run when it's inserted to somewhere in the document, we can't use createFragmentFromSource.
I'll give these functions more descriptive names. We've accumulated so many functions of obnoxious names in this area.
Comment 30 Darin Adler 2012-05-23 18:11:27 PDT
Comment on attachment 142504 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=142504&action=review

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:254
> +    if (isTableBodyContextTag(tagName)
> +        || tagName == captionTag
> +        || tagName == colgroupTag)

Would read better all on one line.
Comment 31 Rafael Weinstein 2012-05-24 13:22:34 PDT
Created attachment 143878 [details]
Patch
Comment 32 Rafael Weinstein 2012-05-24 13:24:18 PDT
Comment on attachment 142504 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=142504&action=review

>> Source/WebCore/dom/Document.cpp:834
>> +    return createFragmentFromSource(html, document(), 0 /* contextElement */, ec);
> 
> Since we want parsed script elements to run when it's inserted to somewhere in the document, we can't use createFragmentFromSource.
> I'll give these functions more descriptive names. We've accumulated so many functions of obnoxious names in this area.

createFragmentFromSource now takes an optional FragmentScriptingPermission argument which is set to AllowScriptingContentAndDoNotMarkAlreadyStarted in this case.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:254
>> +        || tagName == colgroupTag)
> 
> Would read better all on one line.

done
Comment 33 Ryosuke Niwa 2012-05-24 13:33:14 PDT
(In reply to comment #32)
> (From update of attachment 142504 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=142504&action=review
> 
> >> Source/WebCore/dom/Document.cpp:834
> >> +    return createFragmentFromSource(html, document(), 0 /* contextElement */, ec);
> > 
> > Since we want parsed script elements to run when it's inserted to somewhere in the document, we can't use createFragmentFromSource.
> > I'll give these functions more descriptive names. We've accumulated so many functions of obnoxious names in this area.
> 
> createFragmentFromSource now takes an optional FragmentScriptingPermission argument which is set to AllowScriptingContentAndDoNotMarkAlreadyStarted in this case.

Sorry, I have a patch on https://bugs.webkit.org/show_bug.cgi?id=87339 that's going to conflict with your patch. The function is renamed to createFragmentForInnerOuterHTML. However, I'm intending to do another around of cleanups once that patch is landed.
Comment 34 Eric Seidel (no email) 2012-05-24 13:49:36 PDT
Comment on attachment 143878 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=143878&action=review

Just some drive-by nits:

> Source/WebCore/ChangeLog:3
> +        Implement HTMLTemplateElement

IIRC, you've changed the title of the bug.

> Source/WebCore/dom/Document.idl:359
> +        [Conditional=IMPLIED_CONTEXT_PARSING] DocumentFragment parse(in [TreatNullAs=NullString,Optional=DefaultIsUndefined] DOMString tagname)

Is our style to not have spaces between attribute options?

> Source/WebCore/dom/QualifiedName.cpp:117
> +        new ((void*)&nullName) QualifiedName(nullAtom, nullAtom, nullAtom);

It makes me very suspicious that we've survived this long w/o this and that we'd now need it...

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:250
> +const QualifiedName& getImpliedContextTag(const AtomicString& tagName)

I think a more webkity-name would be something like impliedContextForTag().  The style-guide is kinda anti "get" on "getters" (not that this is an explicit getter per-say).

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1486
> +        if (token.name() == styleTag
> +            || token.name() == scriptTag
> +            || token.name() == metaTag
> +            || token.name() == linkTag) {

Do we have a name for this class of tag names?

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1698
> -        if (node->hasTagName(tbodyTag) || node->hasTagName(theadTag) || node->hasTagName(tfootTag))
> +        if (tagName.matches(tbodyTag) || tagName.matches(theadTag) || tagName.matches(tfootTag))

I believe we have helper functions for some classes of tag names, like possibly this one.  Something like isTableSection(tagName) might already exist for you.

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1725
> +            ASSERT(tagName != nullQName());

QualifiedName doesn't have an isNull()?
Comment 35 Gustavo Noronha (kov) 2012-05-24 14:25:29 PDT
Comment on attachment 143878 [details]
Patch

Attachment 143878 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/12803249
Comment 36 Build Bot 2012-05-24 14:27:18 PDT
Comment on attachment 143878 [details]
Patch

Attachment 143878 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/12801244
Comment 37 Ms2ger (he/him; ⌚ UTC+1/+2) 2012-05-25 00:30:54 PDT
Do ensure to test that:

* document.parse() throws a TypeError
* document.parse(null) returns a fragment with a single text node "null"
* document.parse(undefined) returns a fragment with a single text node "undefined"
Comment 38 Rafael Weinstein 2012-05-25 15:05:48 PDT
@Ms2ger: So it would seem the most natural thing would be to have the same behavior as range.createContextualFragment.

In Webkit this is:

empty & undefined => "undefined"
null => "null"

This doesn't appear to match Gecko, which is

empty => throws
undefined => "undefined"
null => empty fragment (no children)

=-(.

It seems like this is an IDL issue. Anyone else have opinions here?

(For now, I'll update the patch such that this is consistent with WebKit's createContextualFragment, and adds tests for these cases).
Comment 39 Rafael Weinstein 2012-05-25 15:49:25 PDT
Comment on attachment 143878 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=143878&action=review

>> Source/WebCore/ChangeLog:3
>> +        Implement HTMLTemplateElement
> 
> IIRC, you've changed the title of the bug.

Yes. Sorry. This is actually the wrong changelog. Git fail. Fixing...

>> Source/WebCore/dom/Document.idl:359
>> +        [Conditional=IMPLIED_CONTEXT_PARSING] DocumentFragment parse(in [TreatNullAs=NullString,Optional=DefaultIsUndefined] DOMString tagname)
> 
> Is our style to not have spaces between attribute options?

It appears to be mixed, but I've changed this to match Range.createContextualFragment, so now there's only one k/v

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:250
>> +const QualifiedName& getImpliedContextTag(const AtomicString& tagName)
> 
> I think a more webkity-name would be something like impliedContextForTag().  The style-guide is kinda anti "get" on "getters" (not that this is an explicit getter per-say).

done

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1486
>> +            || token.name() == linkTag) {
> 
> Do we have a name for this class of tag names?

unfortunately, no.

>> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:1698
>> +        if (tagName.matches(tbodyTag) || tagName.matches(theadTag) || tagName.matches(tfootTag))
> 
> I believe we have helper functions for some classes of tag names, like possibly this one.  Something like isTableSection(tagName) might already exist for you.

This is matching qualified names. the helper functions you're describing only match local name. There may be no impact, but I'd be nervous about changing the semantics here.
Comment 40 Adam Barth 2012-05-25 16:13:43 PDT
> It seems like this is an IDL issue. Anyone else have opinions here?

The behavior for empty in range.createContextualFragment is likely a result of WebKit's previous lax treatment of missing parameters.  We've kept the legacy behavior in many places to avoid breaking existing content.  For new APIs like this one, we're trying to follow the specs and throw when we have too few arguments, like Gecko does.
Comment 41 Rafael Weinstein 2012-05-25 16:25:53 PDT
fair enough. doing as Ms2ger & abarth suggest...
Comment 42 Darin Adler 2012-05-26 17:15:01 PDT
(In reply to comment #40)
> > It seems like this is an IDL issue. Anyone else have opinions here?
> 
> The behavior for empty in range.createContextualFragment is likely a result of WebKit's previous lax treatment of missing parameters.  We've kept the legacy behavior in many places to avoid breaking existing content.  For new APIs like this one, we're trying to follow the specs and throw when we have too few arguments, like Gecko does.

That covers the no-argument case, but there’s also the null case. The way that normally works is that we set the proper flags in the IDL (or perhaps it’s the default behavior) so that null turns into a null string at the binding layer. That’s distinct from an empty string; the function in the DOM can check with isNull and handle it properly, or if the handling for null and an empty string are identical, it can just omit an explicit isNull check and typically get the correct result without any special code, since null strings return true for isEmpty and 0 for length.
Comment 43 Rafael Weinstein 2012-05-31 11:46:02 PDT
Created attachment 145123 [details]
Patch
Comment 44 Rafael Weinstein 2012-05-31 11:48:56 PDT
Recent comments addressed. Tests added for empty/null/undefined argument.

@darin: Simply removing the TreatNullAs=NullString from the IDL brought the behavior inline what Ms2ger & Abarth are asking for. I'm honestly unsure if this is what you had wanted. Please let me know.
Comment 45 Darin Adler 2012-06-01 10:43:56 PDT
(In reply to comment #44)
> Recent comments addressed. Tests added for empty/null/undefined argument.
> 
> @darin: Simply removing the TreatNullAs=NullString from the IDL brought the behavior inline what Ms2ger & Abarth are asking for.

Good.

> I'm honestly unsure if this is what you had wanted. Please let me know.

I’m sorry you had trouble understanding.

I wanted you to both set the flag in the IDL and change the C++ function if needed, to get the desired result. Sounds like you did. I assume that the test results provide the proof that it’s right.
Comment 46 Early Warning System Bot 2012-06-04 14:20:53 PDT
Comment on attachment 145123 [details]
Patch

Attachment 145123 [details] did not pass qt-wk2-ews (qt):
Output: http://queues.webkit.org/results/12902299
Comment 47 Early Warning System Bot 2012-06-04 14:24:17 PDT
Comment on attachment 145123 [details]
Patch

Attachment 145123 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/12906082
Comment 48 Build Bot 2012-06-04 14:27:59 PDT
Comment on attachment 145123 [details]
Patch

Attachment 145123 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/12902300
Comment 49 Build Bot 2012-06-04 15:04:12 PDT
Comment on attachment 145123 [details]
Patch

Attachment 145123 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/12900295
Comment 50 Gyuyoung Kim 2012-06-04 15:08:59 PDT
Comment on attachment 145123 [details]
Patch

Attachment 145123 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/12895398
Comment 51 Rafael Weinstein 2012-06-04 15:30:21 PDT
Created attachment 145640 [details]
Patch
Comment 52 Rafael Weinstein 2012-06-04 15:31:31 PDT
(lastest patch: missing semi-colon broke llvm builds)
Comment 53 Eric Seidel (no email) 2012-08-22 15:24:44 PDT
Unclear what the next step here is.  Is this just missing a technical review, or is this caught up in standards work still?
Comment 54 Ryosuke Niwa 2012-08-22 15:39:12 PDT
(In reply to comment #53)
> Unclear what the next step here is.  Is this just missing a technical review, or is this caught up in standards work still?

Standards work.
Comment 55 Ryosuke Niwa 2012-09-13 15:03:52 PDT
Comment on attachment 145640 [details]
Patch

Clearing r? flag until the standards discussion settles.
Comment 56 Rafael Weinstein 2012-10-17 11:17:35 PDT
Document.parse() is now superceeded by the HTMLTemplateElement:

http://dvcs.w3.org/hg/webcomponents/raw-file/tip/spec/templates/index.html

https://bugs.webkit.org/show_bug.cgi?id=86031