57794 – REGRESSION (r64712): Microsoft Outlook 2011: original message contents not included when replying to an email.

RESOLVED FIXED 57794

REGRESSION (r64712): Microsoft Outlook 2011: original message contents not included when replying to an email.

https://bugs.webkit.org/show_bug.cgi?id=57794

Summary REGRESSION (r64712): Microsoft Outlook 2011: original message contents not in...

Andy Estes

Reported 2011-04-04 14:56:13 PDT

When Microsoft Outlook 2011 is running with a version of WebKit >= r64712, reply messages do not include the original message content. I have a copy of Outlook 2011 installed and am working on creating a reduction.

Attachments
Patch (20.24 KB, patch) 2011-04-06 20:11 PDT, Andy Estes	darin: review+	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Andy Estes

Comment 1 2011-04-04 14:56:24 PDT

<rdar://problem/9065834>

Andy Estes

Comment 2 2011-04-04 15:05:24 PDT

Caused by http://trac.webkit.org/changeset/64712

Andy Estes

Comment 3 2011-04-04 16:00:22 PDT

This is fixed by flipping the parser quirks bit for Outlook.

Adam Barth

Comment 4 2011-04-04 16:14:54 PDT

How does Firefox 4 behave?

Andy Estes

Comment 5 2011-04-04 16:15:44 PDT

(In reply to comment #4) > How does Firefox 4 behave? This is a application that embeds WebKit.framework.

Adam Barth

Comment 6 2011-04-04 16:18:46 PDT

Oh, I assumed you meant OWA.

Andy Estes

Comment 7 2011-04-04 20:08:00 PDT

(In reply to comment #3) > This is fixed by flipping the parser quirks bit for Outlook. s/is fixed/is *not* fixed/g :(

Andy Estes

Comment 8 2011-04-05 02:14:37 PDT

I'm not quite sure what to make of this yet. I sent myself an HTML email with this content: This text is bold When I select this message and click reply, I see the roughly the following sequence of ObjC DOM API calls: [DOMHTMLElement setInnerHTML:@"This text is bold"] ... [DOMHTMLElement innerHMTL] ... [WebFrame loadHTMLString:@"<stringThatIncludesMessageHeadersAndContent>"] <stringThatIncludesMessageHeadersAndContent> contains "This text is bold", prefixed with message headers (From, Date, To, Subject). Basically, it's the content you'd expect to see below your cursor when replying to an email. When the legacy parser is used, [DOMHTMLElement innerHTML] returns the string that was set in the previous call to [DOMHTMLElement setInnerHTML]. When the new parser is enabled, [DOMHTMLElement innerHTML] returns an empty string. Presumably Outlook does the setInnerHTML/innerHTML combination so that it can do some DOM manipulation on the message contents, and I'm guessing it expects a certain quirk of the old parser to do this manipulation correctly. Maybe it expects a particular DOM layout that is no longer guaranteed due to creation of implicit tags like HEAD. I'll set some more breakpoints in HTMLElement. Maybe I can catch it removing a node or something.

Andy Estes

Comment 9 2011-04-05 02:16:45 PDT

I should clarify that <stringThatIncludesMessageHeadersAndContent> is also only valid when the legacy parser is used. With the new parser, it contains the message headers but not the body. It appears to be constructed based in part on the previous call to [DOMHTMLElement innerHTML].

Adam Barth

Comment 10 2011-04-05 10:03:49 PDT

> [DOMHTMLElement setInnerHTML:@"This text is bold"] > ... > [DOMHTMLElement innerHMTL] Are these called on the same element? What is the DOM below that element?

Andy Estes

Comment 11 2011-04-05 17:36:33 PDT

(In reply to comment #10) > > [DOMHTMLElement setInnerHTML:@"This text is bold"] > > ... > > [DOMHTMLElement innerHMTL] > > Are these called on the same element? What is the DOM below that element? They are the same element. If I walk up to the document element and call innerHTML, here is what I see: <head></head><body></body><body>this text is bold</body> No doubt it's that extra BODY that is causing the problem. They probably later ask for document.body, which has no children, when they really want the second body element.

Adam Barth

Comment 12 2011-04-05 17:39:51 PDT

Interesting! I don't think you're supposed to be able to end with with two body elements...

Andy Estes

Comment 13 2011-04-05 17:42:29 PDT

(In reply to comment #11) > (In reply to comment #10) > > > [DOMHTMLElement setInnerHTML:@"This text is bold"] > > > ... > > > [DOMHTMLElement innerHMTL] > > > > Are these called on the same element? What is the DOM below that element? > > They are the same element. If I walk up to the document element and call innerHTML, here is what I see: > > <head></head><body></body><body>this text is bold</body> > The legacy parser gives us: <body>this text is bold</body> The problem could possibly be that they assume document.firstChild.firstChild == document.body, in which case the implicit HEAD is a problem. It seems more likely to be related to the duplicate BODY element though.

Andy Estes

Comment 14 2011-04-05 19:02:12 PDT

(In reply to comment #12) > Interesting! I don't think you're supposed to be able to end with with two body elements... It's pretty simple to end up with two BODY elements: <script> document.firstChild.innerHTML = ""; </script> will give you: <head></head><body></body><body></body> This is because the fragment will start in BeforeHeadMode so will get implicit HEAD and BODY tags which get moved to the main document. Then when regular parsing resumes a second implicit BODY is added. Firefox 4 does this too.

Adam Barth

Comment 15 2011-04-05 22:02:55 PDT

> It's pretty simple to end up with two BODY elements: Crazy! I'm not sure if that's an intentional behavior.

Andy Estes

Comment 16 2011-04-05 23:21:28 PDT

I agree, it doesn't seem right to behave like this. I'll file a spec bug.

Henri Sivonen

Comment 17 2011-04-06 00:51:01 PDT

(In reply to comment #15) > > It's pretty simple to end up with two BODY elements: > > Crazy! I'm not sure if that's an intentional behavior. The behavior is entirely logical and makes sense, because 1) DOM operations (including the innerHTML setter) don't care about the parser that's parsing the network stream. AND 2) The parser doesn't read back from the DOM. If you don't like the outcome, don't write code that sets innerHTML of the document element before the network stream parser has finished.

Andy Estes

Comment 18 2011-04-06 01:03:33 PDT

(In reply to comment #15) > > It's pretty simple to end up with two BODY elements: > > Crazy! I'm not sure if that's an intentional behavior. It also turns out that this behavior is unrelated to the Outlook bug. I found that Outlook is explicitly creating a BODY node that parents the message fragment, so this bug is all about the new parser creating a duplicate implicit BODY. The parser needs a new quirk to accommodate Outlook that doesn't create these implicit nodes for empty documents. Apologies for the red herring.

Andy Estes

Comment 19 2011-04-06 01:06:58 PDT

Even better, we might be able to do this with an injected script a la <https://bugs.webkit.org/show_bug.cgi?id=45693>.

Adam Barth

Comment 20 2011-04-06 01:07:41 PDT

I was going to suggest that, but we got a mid-air collision. :)

Andy Estes

Comment 21 2011-04-06 04:21:35 PDT

<http://trac.webkit.org/changeset/71339> made it such that we can't inject scripts into the initial document, which prevents me from removing the implicit BODY from the initial empty document before Outlook adds its own BODY.

Eric Seidel (no email)

Comment 22 2011-04-06 05:24:13 PDT

I suspect we could double-quirk it for mail. :) i.e. turn back on the always-inject feature only for outlook and then write such an early-inject script for outlook. :Donno. The inject-based quirks are just so clean to write!

Andy Estes

Comment 23 2011-04-06 20:11:57 PDT

Created attachment 88560 [details] Patch

Darin Adler

Comment 24 2011-04-06 20:24:08 PDT

Comment on attachment 88560 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=88560&action=review > Source/WebCore/loader/FrameLoader.cpp:-796 > + m_frame->injectUserScripts(InjectAtDocumentEnd); > + > if (m_stateMachine.creatingInitialEmptyDocument()) > return; > > - m_frame->injectUserScripts(InjectAtDocumentEnd); No explanation of this change. This is why per-file/per-function comments in ChangeLog are better. You could have said why you changed this. > Source/WebCore/page/Frame.cpp:522 > + if (loader()->stateMachine()->creatingInitialEmptyDocument() > + && !settings()->injectUserScriptsInInitialEmptyDocument()) I suggest combining into one line. > Source/WebCore/page/Settings.cpp:175 > + , m_injectUserScriptsInInitialEmptyDocument(false) SInce this is a verb phrase, I think you should add the word “should” to its name. > Source/WebCore/page/Settings.h:390 > + void setInjectUserScriptsInInitialEmptyDocument(bool flag) { m_injectUserScriptsInInitialEmptyDocument = flag; } > + bool injectUserScriptsInInitialEmptyDocument() { return m_injectUserScriptsInInitialEmptyDocument; } Ditto. > Source/WebCore/page/Settings.h:493 > + bool m_injectUserScriptsInInitialEmptyDocument : 1; Ditto.

Eric Seidel (no email)

Comment 25 2011-04-06 20:27:46 PDT

Very slick.

Andy Estes

Comment 26 2011-04-07 12:56:11 PDT

Comment on attachment 88560 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=88560&action=review >> Source/WebCore/page/Settings.cpp:175 >> + , m_injectUserScriptsInInitialEmptyDocument(false) > > SInce this is a verb phrase, I think you should add the word “should” to its name. Yea I should have known better. In my defense, I consciously rejected doing this in preference of being consistent with the other boolean members of this class.

Andy Estes

Comment 27 2011-04-07 13:23:52 PDT

Committed r83201: <http://trac.webkit.org/changeset/83201>

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P1

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Mac

OS OS X 10.6

Product WebKit

Component DOM

Assignee

Andy Estes

Reported

2011-04-04 14:56 PDT

Modified

2011-04-07 13:23 PDT History

CC List

4 users Show

URL

Keywords InRadar, NeedsReduction, Regression

Depends on

Blocks

41115

Dependencies

tree graph