Summary: | REGRESSION (r64712): Microsoft Outlook 2011: original message contents not included when replying to an email. | ||||||
---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Andy Estes <aestes> | ||||
Component: | DOM | Assignee: | Andy Estes <aestes> | ||||
Status: | RESOLVED FIXED | ||||||
Severity: | Normal | CC: | abarth, aestes, eric, hsivonen | ||||
Priority: | P1 | Keywords: | InRadar, NeedsReduction, Regression | ||||
Version: | 528+ (Nightly build) | ||||||
Hardware: | Mac | ||||||
OS: | OS X 10.6 | ||||||
Bug Depends on: | |||||||
Bug Blocks: | 41115 | ||||||
Attachments: |
|
Description
Andy Estes
2011-04-04 14:56:13 PDT
Caused by http://trac.webkit.org/changeset/64712 This is fixed by flipping the parser quirks bit for Outlook. How does Firefox 4 behave? (In reply to comment #4) > How does Firefox 4 behave? This is a application that embeds WebKit.framework. Oh, I assumed you meant OWA. (In reply to comment #3) > This is fixed by flipping the parser quirks bit for Outlook. s/is fixed/is *not* fixed/g :( I'm not quite sure what to make of this yet. I sent myself an HTML email with this content: This text is <b>bold</b> When I select this message and click reply, I see the roughly the following sequence of ObjC DOM API calls: [DOMHTMLElement setInnerHTML:@"This text is <b>bold</b>"] ... [DOMHTMLElement innerHMTL] ... [WebFrame loadHTMLString:@"<stringThatIncludesMessageHeadersAndContent>"] <stringThatIncludesMessageHeadersAndContent> contains "This text is <b>bold</b>", prefixed with message headers (From, Date, To, Subject). Basically, it's the content you'd expect to see below your cursor when replying to an email. When the legacy parser is used, [DOMHTMLElement innerHTML] returns the string that was set in the previous call to [DOMHTMLElement setInnerHTML]. When the new parser is enabled, [DOMHTMLElement innerHTML] returns an empty string. Presumably Outlook does the setInnerHTML/innerHTML combination so that it can do some DOM manipulation on the message contents, and I'm guessing it expects a certain quirk of the old parser to do this manipulation correctly. Maybe it expects a particular DOM layout that is no longer guaranteed due to creation of implicit tags like HEAD. I'll set some more breakpoints in HTMLElement. Maybe I can catch it removing a node or something. I should clarify that <stringThatIncludesMessageHeadersAndContent> is also only valid when the legacy parser is used. With the new parser, it contains the message headers but not the body. It appears to be constructed based in part on the previous call to [DOMHTMLElement innerHTML]. > [DOMHTMLElement setInnerHTML:@"This text is <b>bold</b>"]
> ...
> [DOMHTMLElement innerHMTL]
Are these called on the same element? What is the DOM below that element?
(In reply to comment #10) > > [DOMHTMLElement setInnerHTML:@"This text is <b>bold</b>"] > > ... > > [DOMHTMLElement innerHMTL] > > Are these called on the same element? What is the DOM below that element? They are the same element. If I walk up to the document element and call innerHTML, here is what I see: <head></head><body></body><body>this text is <b>bold</b></body> No doubt it's that extra BODY that is causing the problem. They probably later ask for document.body, which has no children, when they really want the second body element. Interesting! I don't think you're supposed to be able to end with with two body elements... (In reply to comment #11) > (In reply to comment #10) > > > [DOMHTMLElement setInnerHTML:@"This text is <b>bold</b>"] > > > ... > > > [DOMHTMLElement innerHMTL] > > > > Are these called on the same element? What is the DOM below that element? > > They are the same element. If I walk up to the document element and call innerHTML, here is what I see: > > <head></head><body></body><body>this text is <b>bold</b></body> > The legacy parser gives us: <body>this text is <b>bold</b></body> The problem could possibly be that they assume document.firstChild.firstChild == document.body, in which case the implicit HEAD is a problem. It seems more likely to be related to the duplicate BODY element though. (In reply to comment #12) > Interesting! I don't think you're supposed to be able to end with with two body elements... It's pretty simple to end up with two BODY elements: <script> document.firstChild.innerHTML = ""; </script> will give you: <head></head><body></body><body></body> This is because the fragment will start in BeforeHeadMode so will get implicit HEAD and BODY tags which get moved to the main document. Then when regular parsing resumes a second implicit BODY is added. Firefox 4 does this too. > It's pretty simple to end up with two BODY elements:
Crazy! I'm not sure if that's an intentional behavior.
I agree, it doesn't seem right to behave like this. I'll file a spec bug. (In reply to comment #15) > > It's pretty simple to end up with two BODY elements: > > Crazy! I'm not sure if that's an intentional behavior. The behavior is entirely logical and makes sense, because 1) DOM operations (including the innerHTML setter) don't care about the parser that's parsing the network stream. AND 2) The parser doesn't read back from the DOM. If you don't like the outcome, don't write code that sets innerHTML of the document element before the network stream parser has finished. (In reply to comment #15) > > It's pretty simple to end up with two BODY elements: > > Crazy! I'm not sure if that's an intentional behavior. It also turns out that this behavior is unrelated to the Outlook bug. I found that Outlook is explicitly creating a BODY node that parents the message fragment, so this bug is all about the new parser creating a duplicate implicit BODY. The parser needs a new quirk to accommodate Outlook that doesn't create these implicit nodes for empty documents. Apologies for the red herring. Even better, we might be able to do this with an injected script a la <https://bugs.webkit.org/show_bug.cgi?id=45693>. I was going to suggest that, but we got a mid-air collision. :) <http://trac.webkit.org/changeset/71339> made it such that we can't inject scripts into the initial document, which prevents me from removing the implicit BODY from the initial empty document before Outlook adds its own BODY. I suspect we could double-quirk it for mail. :) i.e. turn back on the always-inject feature only for outlook and then write such an early-inject script for outlook. :Donno. The inject-based quirks are just so clean to write! Created attachment 88560 [details]
Patch
Comment on attachment 88560 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=88560&action=review > Source/WebCore/loader/FrameLoader.cpp:-796 > + m_frame->injectUserScripts(InjectAtDocumentEnd); > + > if (m_stateMachine.creatingInitialEmptyDocument()) > return; > > - m_frame->injectUserScripts(InjectAtDocumentEnd); No explanation of this change. This is why per-file/per-function comments in ChangeLog are better. You could have said why you changed this. > Source/WebCore/page/Frame.cpp:522 > + if (loader()->stateMachine()->creatingInitialEmptyDocument() > + && !settings()->injectUserScriptsInInitialEmptyDocument()) I suggest combining into one line. > Source/WebCore/page/Settings.cpp:175 > + , m_injectUserScriptsInInitialEmptyDocument(false) SInce this is a verb phrase, I think you should add the word “should” to its name. > Source/WebCore/page/Settings.h:390 > + void setInjectUserScriptsInInitialEmptyDocument(bool flag) { m_injectUserScriptsInInitialEmptyDocument = flag; } > + bool injectUserScriptsInInitialEmptyDocument() { return m_injectUserScriptsInInitialEmptyDocument; } Ditto. > Source/WebCore/page/Settings.h:493 > + bool m_injectUserScriptsInInitialEmptyDocument : 1; Ditto. Very slick. Comment on attachment 88560 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=88560&action=review >> Source/WebCore/page/Settings.cpp:175 >> + , m_injectUserScriptsInInitialEmptyDocument(false) > > SInce this is a verb phrase, I think you should add the word “should” to its name. Yea I should have known better. In my defense, I consciously rejected doing this in preference of being consistent with the other boolean members of this class. Committed r83201: <http://trac.webkit.org/changeset/83201> |