Bug 44390

Summary: HTML5 TreeBuilder builds wrong DOM for <a><svg><tr><input></a>
Product: WebKit Reporter: Eric Seidel (no email) <eric>
Component: WebCore Misc.Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Major CC: abarth, commit-queue, ddkilzer, eric, inferno, sam, tonyg
Priority: P1    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 44170    
Bug Blocks:    
Attachments:
Description Flags
Patch none

Description Eric Seidel (no email) 2010-08-21 20:07:15 PDT
HTML5 TreeBuilder builds wrong DOM for <a><svg><tr><input></a>

The problem is that when we wrote "in foreign content" mode, the spec was broken.  I believe Ian has since fixed it, but we need to update our loop to pop each non-html element off the stack before we end up processing the </a> tag.

Should be an easy fix assuming the spec is sane now. :)
Comment 1 Eric Seidel (no email) 2010-08-22 10:10:01 PDT
Actually, I think this might be an adoption agency bug caused by this FIXME:

bool isNotFormattingAndNotPhrasing(const Element* element)
{
    // The spec often says "node is not in the formatting category, and is not
    // in the phrasing category". !phrasing && !formatting == scoping || special
    // scoping || special is easier to compute.
    // FIXME: localName() is wrong for non-html content.
    const AtomicString& tagName = element->localName();
    return isScopingTag(tagName) || isSpecialTag(tagName);
Comment 2 Eric Seidel (no email) 2010-08-22 10:11:04 PDT
We also need to update our lists to match  the current spec.  There is no such thing as a "phrasing" element anymore:

Special
The following HTML elements have varying levels of special parsing rules: address, applet, area, article, aside, base, basefont, bgsound, blockquote, body, br, button, caption, center, col, colgroup, command, dd, details, dir, div, dl, dt, embed, fieldset, figcaption, figure, footer, form, frame, frameset, h1, h2, h3, h4, h5, h6, head, header, hgroup, hr, html, iframe, img, input, isindex, li, link, listing, marquee, menu, meta, nav, noembed, noframes, noscript, object, ol, p, param, plaintext, pre, script, section, select, style, summary, table, tbody, td, textarea, tfoot, th, thead, title, tr, ul, wbr, xmp, and SVG's foreignObject.

Formatting
The following HTML elements are those that end up in the list of active formatting elements: a, b, big, code, em, font, i, nobr, s, small, strike, strong, tt, and u.

Ordinary
All other elements found while parsing an HTML document.
Comment 3 Eric Seidel (no email) 2010-08-22 10:38:20 PDT
Created attachment 65063 [details]
Patch
Comment 4 Eric Seidel (no email) 2010-08-22 10:39:32 PDT
The only problem with this change is that this is sorta a hodge-podge update of a few cases in the spec.  Really we should look at the total diff in the HTML5 spec from when we transcribed the TreeBuilder until today.  However, test-driven development like this is much easier. :)
Comment 5 Adam Barth 2010-08-22 11:20:28 PDT
Comment on attachment 65063 [details]
Patch

Yeah, we should look at the diff.  We can also sync with HTML5lib to get more test cases.
Comment 6 WebKit Commit Bot 2010-08-22 11:39:05 PDT
Comment on attachment 65063 [details]
Patch

Clearing flags on attachment: 65063

Committed r65785: <http://trac.webkit.org/changeset/65785>
Comment 7 WebKit Commit Bot 2010-08-22 11:39:10 PDT
All reviewed patches have been landed.  Closing bug.