Bug 7321

Summary: REGRESSION: style tag in body causes two head elements to appear in the DOM
Product: WebKit Reporter: David Kilzer (:ddkilzer) <ddkilzer@webkit.org>
Component: HTML DOMAssignee: David Kilzer (:ddkilzer) <ddkilzer@webkit.org>
Status: VERIFIED FIXED    
Severity: Normal Keywords: EasyFix, Regression
Priority: P1    
Version: 420+   
Hardware: Macintosh   
OS: Mac OS X 10.4   
URL: http://software.hixie.ch/utilities/js/live-dom-viewer/?%3Chtml%3E%3Chead%3E%3C/head%3E%3Cbody%3E%3Cstyle%3E%3C/style%3E%3C/body%3E%3C/html%3E%0A
Attachments:
Description Flags
Patch v1
mjs: review-
Patch v2 darin: review+

Description From 2006-02-16 21:48:05 PST
Summary:

A <style> tag that appears in the <body> of an HTML document causes two <head> elements to appear in the DOM.

Steps to reproduce:

1. Load an HTML document with a <style> tag in the <body> and a <head> element.
2. View the HTML document's DOM.  Ian Hixie's Live DOM Inspector works well, too.  http://software.hixie.ch/utilities/js/live-dom-viewer/

Expected results:

The DOM should only have one HEAD element.

Actual results:

The DOM has two HEAD elements.

Regression:

This is a regression from Safari 2.0.3 (417.8) on Mac OS X 10.4.5, which only displays one HEAD element in the DOM.

Tested on WebKit nightly build r12864.
------- Comment #1 From 2006-02-18 16:27:39 PST -------
Checking the nightly builds, this bug is as old as the first nightly, WebKit-CVS-2005-10-01 03:27:01 GMT.dmg, so it's been around a while.
------- Comment #2 From 2006-02-18 17:40:02 PST -------
The bug is in HTMLParser::headCreateErrorCheck. That function should always return false, and when it decides to create a head element it needs to store a pointer to it in "head".

See formCreateErrorCheck and mapCreateErrorCheck for similar logic. The only part I'm not entirely sure about is the "current->localName() == htmlTag" clause. That means that you're allowed to create a second head if you're inside an <html> element; would be good to test that to see if it's the behavior we want.
------- Comment #3 From 2006-02-18 21:01:58 PST -------
(In reply to comment #2)
> The bug is in HTMLParser::headCreateErrorCheck. That function should always
> return false, and when it decides to create a head element it needs to store a
> pointer to it in "head".

What about HTMLParser::createHead() and the duplicate head-creation code under "Step 2" in HTMLParser::handleError()?
------- Comment #4 From 2006-02-19 17:50:11 PST -------
With Darin's proposed fix:

bool HTMLParser::headCreateErrorCheck(Token* t, RefPtr<NodeImpl>& result)
{
    if (!head || current->localName() == htmlTag) {
        head = new HTMLHeadElementImpl(document);
        result = head;
    }
    return false;
}

We must make sure that no more than one <title> and one <base> tag appear in the <head>.  (This is not the case if only the above change is made.)  Firefox 1.5 currently prevents multiple <title> tags, but it doesn't prevent multiple <base> tags.  I haven't surveyed MSIE 6 yet.

Regarding Comment #3, there seems like there might be a lot of unused code in the handleError() method, but I don't grok the HTML parser well enough (yet) to understand when all that code may be used.
------- Comment #5 From 2006-02-19 19:52:46 PST -------
(In reply to comment #4)
> bool HTMLParser::headCreateErrorCheck(Token* t, RefPtr<NodeImpl>& result)
> {
>     if (!head || current->localName() == htmlTag) {
>         head = new HTMLHeadElementImpl(document);
>         result = head;
>     }
>     return false;
> }

This change also causes an HTML document like this to crash WebKit:

<<head><style></style></head>

I don't think this change will be quite as easy as Darin first expected.
------- Comment #6 From 2006-02-20 07:49:25 PST -------
This fix won't crash with the test case from Comment #5, but it fails about 375 LayoutTests.  More investigation will be needed.  (I didn't try running LayoutTests after applying the change from Comment #5, yet.)

bool HTMLParser::headCreateErrorCheck(Token* t, RefPtr<NodeImpl>& result)
{
    createHead();
    return false;
}

NOTE: The multiple <title> and <base> tags in <head> issue also occurs on Safari 2.0.3 on 10.4.5, so this isn't a regression.  I will file a new bug (or bugs) for this issue.
------- Comment #7 From 2006-02-20 08:00:06 PST -------
Darin's fix causes a crash (see Comment #5) but only fails one LayoutTest:  editing/inserting/editing-empty-divs.
------- Comment #8 From 2006-02-22 22:25:25 PST -------
I believe I have a fix for this.  Also had to modify HTMLParser::parseToken() so that 'head' behaved the same way as 'map' and 'form'.

Need a test case, though.
------- Comment #9 From 2006-02-23 05:58:49 PST -------
Created an attachment (id=6676) [details]
Patch v1

Patch v1.  Maciej commented on IRC that he wanted to see my next patch, so I'm giving him first dibs on the review.  :)  I assume I'll get a review- because there is no layout test, but I'm looking for feedback on the patch and on how to implement a layout test.

Regarding a layout test, it seems to me that a layoutTestController.dumpDOMTree() method would be really handy for a test like this.  Is this worth implementing, or is there a preferred method (such as JavaScript DOM inspection) to test that the DOM has but a single HEAD node?
------- Comment #10 From 2006-02-23 06:08:54 PST -------
FWIW, the crashes I found from previous attempts at patches (Comment #5 and Comment #7) were because HTMLParser::parseToken() was not modified to set "head = 0".  In those cases, the new head node was never inserted into the tree and subsequently caused a double-free error later in the parsing when an attempt was made to add a child node to it.
------- Comment #11 From 2006-02-23 07:19:01 PST -------
(In reply to comment #9)
> Created an attachment (id=6676) [edit] [details]
> Patch v1

I should also note that this patch passes all current layout tests as of r12944 (except for the ones that have known issues like some of the new http tests--failures are nearly identical to buildbot--and the timezone issue with fast/js/date-constructor).
------- Comment #12 From 2006-02-23 07:35:52 PST -------
(In reply to comment #9)
> Maciej commented on IRC that he wanted to see my next patch, so I'm
> giving him first dibs on the review.  :)  I assume I'll get a review- because
> there is no layout test, but I'm looking for feedback on the patch and on how
> to implement a layout test.

The patch looks perfect to me.

And I also think it's straightforward enough that it doesn't need to be Maciej who reviews it.

> Regarding a layout test, it seems to me that a
> layoutTestController.dumpDOMTree() method would be really handy for a test like
> this.  Is this worth implementing, or is there a preferred method (such as
> JavaScript DOM inspection) to test that the DOM has but a single HEAD node?

It's very easy to find out how many <head> elements there are.

    document.getElementsByTagName("head").length()

So I don't think you need a new layoutTestController feature.
------- Comment #13 From 2006-02-23 08:42:58 PST -------
Given the other code to special-case <head>, I'd like to see layout tests that check all the different code paths. A great way to do that is to set a breakpoint on each one and make sure every one gets tickled.
------- Comment #14 From 2006-02-23 10:49:34 PST -------
I didn't mean I wanted to be the one to review necessarily, I just wanted to encourage you to send more patches. :-) The patch looks good to me too, marking r- for lack of test case.
------- Comment #15 From 2006-02-24 22:42:48 PST -------
Created an attachment (id=6721) [details]
Patch v2

Changes since Patch v1:

- Added a test that uses an iframe to load various HTML documents that test each code path in htmlparser.cpp.
------- Comment #16 From 2006-02-25 20:44:17 PST -------
(From update of attachment 6721 [details])
r=me
------- Comment #17 From 2006-03-07 20:29:59 PST -------
Verified fixed as of r13202 (locally built).