Bug 44390 - HTML5 TreeBuilder builds wrong DOM for <a><svg><tr><input></a>
Summary: HTML5 TreeBuilder builds wrong DOM for <a><svg><tr><input></a>
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P1 Major
Assignee: Nobody
Depends on: 44170
  Show dependency treegraph
Reported: 2010-08-21 20:07 PDT by Eric Seidel (no email)
Modified: 2010-08-22 11:39 PDT (History)
7 users (show)

See Also:

Patch (7.04 KB, patch)
2010-08-22 10:38 PDT, Eric Seidel (no email)
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2010-08-21 20:07:15 PDT
HTML5 TreeBuilder builds wrong DOM for <a><svg><tr><input></a>

The problem is that when we wrote "in foreign content" mode, the spec was broken.  I believe Ian has since fixed it, but we need to update our loop to pop each non-html element off the stack before we end up processing the </a> tag.

Should be an easy fix assuming the spec is sane now. :)
Comment 1 Eric Seidel (no email) 2010-08-22 10:10:01 PDT
Actually, I think this might be an adoption agency bug caused by this FIXME:

bool isNotFormattingAndNotPhrasing(const Element* element)
    // The spec often says "node is not in the formatting category, and is not
    // in the phrasing category". !phrasing && !formatting == scoping || special
    // scoping || special is easier to compute.
    // FIXME: localName() is wrong for non-html content.
    const AtomicString& tagName = element->localName();
    return isScopingTag(tagName) || isSpecialTag(tagName);
Comment 2 Eric Seidel (no email) 2010-08-22 10:11:04 PDT
We also need to update our lists to match  the current spec.  There is no such thing as a "phrasing" element anymore:

The following HTML elements have varying levels of special parsing rules: address, applet, area, article, aside, base, basefont, bgsound, blockquote, body, br, button, caption, center, col, colgroup, command, dd, details, dir, div, dl, dt, embed, fieldset, figcaption, figure, footer, form, frame, frameset, h1, h2, h3, h4, h5, h6, head, header, hgroup, hr, html, iframe, img, input, isindex, li, link, listing, marquee, menu, meta, nav, noembed, noframes, noscript, object, ol, p, param, plaintext, pre, script, section, select, style, summary, table, tbody, td, textarea, tfoot, th, thead, title, tr, ul, wbr, xmp, and SVG's foreignObject.

The following HTML elements are those that end up in the list of active formatting elements: a, b, big, code, em, font, i, nobr, s, small, strike, strong, tt, and u.

All other elements found while parsing an HTML document.
Comment 3 Eric Seidel (no email) 2010-08-22 10:38:20 PDT
Created attachment 65063 [details]
Comment 4 Eric Seidel (no email) 2010-08-22 10:39:32 PDT
The only problem with this change is that this is sorta a hodge-podge update of a few cases in the spec.  Really we should look at the total diff in the HTML5 spec from when we transcribed the TreeBuilder until today.  However, test-driven development like this is much easier. :)
Comment 5 Adam Barth 2010-08-22 11:20:28 PDT
Comment on attachment 65063 [details]

Yeah, we should look at the diff.  We can also sync with HTML5lib to get more test cases.
Comment 6 WebKit Commit Bot 2010-08-22 11:39:05 PDT
Comment on attachment 65063 [details]

Clearing flags on attachment: 65063

Committed r65785: <http://trac.webkit.org/changeset/65785>
Comment 7 WebKit Commit Bot 2010-08-22 11:39:10 PDT
All reviewed patches have been landed.  Closing bug.