Bug 27780 - Whitespace after the <head> but before the <body> lost during parsing
Summary: Whitespace after the <head> but before the <body> lost during parsing
Status: RESOLVED INVALID
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc.
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-07-28 14:30 PDT by Justin Garcia
Modified: 2023-04-01 00:24 PDT
CC List: 4 users

See Also:


Attachments
test case (297 bytes, text/html)
2009-07-28 14:31 PDT, Justin Garcia

Description Justin Garcia 2009-07-28 14:30:49 PDT
If I have something like:

<html>
<head><script>
function foo() {
   alert(document.documentElement.innerHTML);
}
</script></head>

<body>World</body>
</html>

I see that the whitespace after </head> and before <body> is lost.

See the attached test case.
Comment 1 Justin Garcia 2009-07-28 14:31:09 PDT
Created attachment 33671
test case
Comment 2 Dave Hyatt 2009-07-28 15:01:36 PDT
Are you sure it's lost? I think the significant whitespace following the </head> causes the parser to open a <body> implicitly, so the whitespace is probably just inside the <body> instead.
Comment 3 Andy Matuschak 2009-07-28 15:29:22 PDT
That's a sensible explanation, but from the viewpoint of roundtripping HTML, it would be really nice if that *didn't* happen... :)
Comment 4 Andy Matuschak 2009-07-28 20:35:41 PDT
Upon investigation, it seems the whitespace is not moved into the <body>. It is actually lost.
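
For anyone who wants to repeat the check, the three possible outcomes (whitespace kept between the elements, moved into the <body>, or dropped) can be told apart by walking the children of the root element. The following is only a rough sketch, not the code used for the investigation above. The helper names isWhitespaceText and locateBoundaryWhitespace are made up for illustration, and the function only assumes that its argument exposes the standard DOM properties childNodes, nodeType, nodeName and data.

```javascript
// DOM node type value for text nodes (Node.TEXT_NODE).
const TEXT_NODE = 3;

// True if the node is a text node made up only of HTML whitespace characters.
function isWhitespaceText(node) {
  return node.nodeType === TEXT_NODE && /^[ \t\n\f\r]+$/.test(node.data);
}

// Hypothetical helper: reports where whitespace around the </head>/<body>
// boundary ended up. `root` is expected to look like document.documentElement.
function locateBoundaryWhitespace(root) {
  const children = Array.from(root.childNodes);
  const headIndex = children.findIndex((n) => n.nodeName === "HEAD");
  const bodyIndex = children.findIndex((n) => n.nodeName === "BODY");
  if (headIndex === -1 || bodyIndex === -1) {
    return "no head or body";
  }

  // Case 1: a whitespace-only text node sits between <head> and <body>.
  const between = children.slice(headIndex + 1, bodyIndex);
  if (between.some(isWhitespaceText)) {
    return "between head and body";
  }

  // Case 2: the whitespace was moved to the start of the <body> content.
  const first = children[bodyIndex].childNodes[0];
  if (first && first.nodeType === TEXT_NODE && /^[ \t\n\f\r]/.test(first.data)) {
    return "inside body";
  }

  // Case 3: no whitespace found in either place.
  return "lost";
}
```

In a browser this could be called as locateBoundaryWhitespace(document.documentElement) from an onload handler, in place of the alert() in the test case.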
Comment 5 Alexey Proskuryakov 2009-07-29 17:55:16 PDT
Is this significant whitespace in your test? I'm not absolutely sure, but I think that you'd need an XML document with xml:space="preserve" to make this whitespace significant.
Comment 6 Andy Matuschak 2009-07-29 17:59:08 PDT
Regardless of whether it's significant, this behavior is undesirable for tools that work with HTML the user expects to round-trip.
Comment 7 Anne van Kesteren 2023-04-01 00:24:49 PDT
This behavior is covered by the HTML Standard.