Summary: | HTML5 Parser: document.write in a asynchronous script which is specified to load before page finish blows away document | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Johnny(Jianning) Ding <jnd> | ||||||||||
Component: | WebCore Misc. | Assignee: | Adam Barth <abarth> | ||||||||||
Status: | RESOLVED FIXED | ||||||||||||
Severity: | Normal | CC: | abarth, ap, commit-queue, eric, tonyg | ||||||||||
Priority: | P2 | ||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||
Hardware: | All | ||||||||||||
OS: | All | ||||||||||||
URL: | http://www.tianya.cn/publicforum/articleslist/0/funinfo.shtml | ||||||||||||
Bug Depends on: | 40745 | ||||||||||||
Bug Blocks: | 41115 | ||||||||||||
Attachments: |
|
Description
Johnny(Jianning) Ding
2010-07-15 06:39:58 PDT
Created attachment 61647 [details]
html file of the test case
Created attachment 61648 [details]
js file of the test case
When talking with Adam for this issue, he told me to read a related W3C bug. (http://www.w3.org/Bugs/Public/show_bug.cgi?id=9767) In the comment#10 of that bug, Ian provided a solution: "It would need a flag on the <script> element set when the element is added to the document, based on whether the parser is on the stack, and then for the script execution to set a similar flag". According to Ian's solution, if script is added when document's parser is on the stack, which means the author of the webpage wants to execute the script in current document, and the document.write in that script should not blow way the document. This solution looks reasonable for me. My thought to fix this bug is 1.Add a flag in ScriptElementData to indicate whether the script is added when document's parser is on the stack. (the ScriptElementData::m_createByParser is right flag to use?) 2. in Document.write, if a script is running (check both HTMLDocumentParser::m_scriptRunner and Document::m_scriptsToExecuteSoon), and if the script was added when document's parser was on the stack, ignore this document.write. Otherwise do write. I think this is a duplicate of bug 40745, no? (In reply to comment #4) > I think this is a duplicate of bug 40745, no? Unclear. There are a bunch of different cases here. I fixed the cases in Bug 40745 to the extent that the referenced sites work properly. In the end, we're going to them a the same time by updating to what the spec says to do. Adam's r63165: <http://trac.webkit.org/changeset/63165> already fixed the issue on http://www.tianya.cn/publicforum/articleslist/0/funinfo.shtml. But there is a race condition. if "show_ads.js" is the last resource of the main doc, then parser will be closed before executing the "show_ads.js". Then when executing "show_ads.js", document.write will blow away the document. (Like the test case I gave in this issue) (In reply to comment #6) > Adam's r63165: <http://trac.webkit.org/changeset/63165> already fixed the issue on http://www.tianya.cn/publicforum/articleslist/0/funinfo.shtml. > > But there is a race condition. if "show_ads.js" is the last resource of the main doc, then parser will be closed before executing the "show_ads.js". Then when executing "show_ads.js", document.write will blow away the document. (Like the test case I gave in this issue) Here's another site that repros: https://www.delta.com/booking/searchFlights.do?displayTripType=multicity This appears to be the trigger that adds the async script: <script type="text/javascript" language="JavaScript"> function embedChat(){ var chatScript = document.createElement('script'); chatScript.type = 'text/javascript'; chatScript.src = "https://kanachat.delta.com/tracker/vtc.php?orgid=1044339&ichannelid=Y91zET1044345"; get("online_chat").appendChild(chatScript); } addEvent(window, 'load', embedChat, false); </script> Created attachment 62338 [details]
Patch
Comment on attachment 62338 [details] Patch I'm having trouble understanding this patch. LayoutTests/http/tests/misc/write-from-dom-script.html:9 + script.src = "data:text/javascript,document.write('FAIL');document.close();"; It's sad that this script would "PASS" if your script had an error. Would be better if the script set some global that was also checked and could cause FAIL. LayoutTests/http/tests/misc/write-while-waiting.html:8 + setTimeout("document.write('PASS');document.close();", 100); Would be nice if you could explain in the ChangeLog why this is right? Maybe you did and I didn't understand. WebCore/dom/Document.cpp:366 + , m_isWriteNeutralised(false) It's sad that this is on Document and not DocumentWriter or something else. Document is such a dumping ground. WebCore/dom/Document.cpp:2020 + bool insertionPointIsUndefined = !m_parser || !m_parser->hasInsertionPoint(); This would have been more clear (to me) as: bool hasInsertionPoint = m_parser && m_parser->hasInsertionPoint(); WebCore/dom/Document.cpp:2023 + if (insertionPointIsUndefined && isWriteNeutralised()) Then this becomes !hasInsertionPoint WebCore/dom/Document.h:1005 + bool isWriteNeutralised() const { return m_isWriteNeutralised; } I wish the spec called this "writeDisabled", Neutralized is a poor name IMO. Hard to spell, hard to type, hard for our numerous non-english speakers to understand. WebCore/dom/DocumentParser.h:44 + virtual bool hasInsertionPoint() { return true; } So what does this mean when true? Who should override it? If it's true, you'll always get an insert call? WebCore/dom/ScriptElement.cpp:59 + // If the element's Document has an active parser, and the parser's script This comment makes it seem like this should not go inside ScriptElement. WebCore/dom/ScriptElement.cpp:61 + // "parser-inserted" flag set, the user agent must set the element's Or maybe that the Document should be notified that the script was inserted and it should make a decision to nueter itself or not. WebCore/dom/ScriptElement.cpp: + static inline bool useHTML5Parser(Document* document) This was previous dead code? WebCore/dom/ScriptElement.cpp:203 + // flag was set as being itself "write-neutralised". Let neutralised doc God this is an awful name. I need to complain to Hixie. Done: http://www.w3.org/Bugs/Public/show_bug.cgi?id=10228 Btw, my spelling corrector seems to think it's "neutralized" not "neutralised". Yet another reason why that name needs to go. WebCore/html/HTMLDocumentParser.cpp:218 + if (m_scriptRunner && m_scriptRunner->inScriptExecution()) This needs a comment to explain why. It's not clear why "inScirptExecution" means there is an insertion point. Why would that even be needed? Can't we check m_input? WebCore/html/HTMLDocumentParser.cpp:220 + // FIXME: Checking whether we've seen end of file isn't quite right when I don't undersand this FIXME. Can you explain better? I think I can guess why, but can we document somewhere why the LegacyHTMLDocumentParser doesn't want a isWriteNeuteredOrWhateverTheBrokenNameInTheSpecIs() function? Comment on attachment 62338 [details]
Patch
WebCore/dom/ScriptElement.cpp:65
+ data.setIsWriteNeutralised(true);
Once write neutered, always neutered. Is this the idea? Or should this flag get reset if the script element is moved to a new document before it's fully loaded? Not sure, sounds like a complicated edge case.
> I'm having trouble understanding this patch. Eric and I discussed the patch verbally. > LayoutTests/http/tests/misc/write-from-dom-script.html:9 > + script.src = "data:text/javascript,document.write('FAIL');document.close();"; > It's sad that this script would "PASS" if your script had an error. Would be better if the script set some global that was also checked and could cause FAIL. Fixed. > LayoutTests/http/tests/misc/write-while-waiting.html:8 > + setTimeout("document.write('PASS');document.close();", 100); > Would be nice if you could explain in the ChangeLog why this is right? Maybe you did and I didn't understand. I added more details to the ChangeLog. Ian's trying to narrowly address the compatibility problem. > WebCore/dom/Document.cpp:366 > + , m_isWriteNeutralised(false) > It's sad that this is on Document and not DocumentWriter or something else. Document is such a dumping ground. Of all the state on document, this one is very appropriate. The spec stores the state on Document and it's read by Document::write, which seems to really belong on document. > WebCore/dom/Document.cpp:2020 > + bool insertionPointIsUndefined = !m_parser || !m_parser->hasInsertionPoint(); > This would have been more clear (to me) as: > bool hasInsertionPoint = m_parser && m_parser->hasInsertionPoint(); Done. > WebCore/dom/Document.cpp:2023 > + if (insertionPointIsUndefined && isWriteNeutralised()) > Then this becomes !hasInsertionPoint Yep. > WebCore/dom/Document.h:1005 > + bool isWriteNeutralised() const { return m_isWriteNeutralised; } > I wish the spec called this "writeDisabled", Neutralized is a poor name IMO. > Hard to spell, hard to type, hard for our numerous non-english speakers to understand. Done. I chose the name to match the spec. It still has a hyperlink to the spec, so folks should be able to figure it out. > WebCore/dom/DocumentParser.h:44 > + virtual bool hasInsertionPoint() { return true; } > So what does this mean when true? Who should override it? If it's true, you'll always get an insert call? We discussed this over the phone. The default should be false, but that messes up some SVG font code. I'll make another patch that changes the default to false and fixes the SVG font code to use the right APIs. > WebCore/dom/ScriptElement.cpp:59 > + // If the element's Document has an active parser, and the parser's script > This comment makes it seem like this should not go inside ScriptElement. I didn't understand this comment. It's in the "script element" section of the spec and it's manipulating some part of the script element's state... Putting the code here seems to make sense. > WebCore/dom/ScriptElement.cpp:61 > + // "parser-inserted" flag set, the user agent must set the element's > Or maybe that the Document should be notified that the script was inserted and it should make a decision to nueter itself or not. I think this comment stems from your confusing about how the dataflow works. The document does get notified, but when the script runs. Here we're remembering what life was like when we were inserted into the document. > WebCore/dom/ScriptElement.cpp: > + static inline bool useHTML5Parser(Document* document) > This was previous dead code? Yes. Technically this change is unrelated, but I just saw it and removed it. > WebCore/dom/ScriptElement.cpp:203 > + // flag was set as being itself "write-neutralised". Let neutralised doc > God this is an awful name. I need to complain to Hixie. > Done: http://www.w3.org/Bugs/Public/show_bug.cgi?id=10228 > > Btw, my spelling corrector seems to think it's "neutralized" not "neutralised". Yet another reason why that name needs to go. The name is gone, from the code at least. > WebCore/html/HTMLDocumentParser.cpp:218 > + if (m_scriptRunner && m_scriptRunner->inScriptExecution()) > This needs a comment to explain why. It's not clear why "inScirptExecution" means there is an insertion point. Why would that even be needed? Can't we check m_input? I've changed this to get the information from m_input. I've also renamed HTMLScriptRunner::inScriptExecution to something more informative. > WebCore/html/HTMLDocumentParser.cpp:220 > + // FIXME: Checking whether we've seen end of file isn't quite right when > I don't undersand this FIXME. Can you explain better? I've removed it. Created attachment 62376 [details]
Patch
Comment on attachment 62376 [details]
Patch
WebCore/dom/Document.cpp:2021
+ // "write-neutralised" flag set, then abort these steps.
Doesn't match the code, but matches the spec, so OK.
WebCore/dom/Document.cpp:2026
+ if (!hasInsertionPoint)
I thought you said you had to change this back to being if (m_parser)?
WebCore/dom/Document.cpp:366
+ , m_writeDisabled(false)
I hope there is just the one constructor...
WebCore/dom/ScriptElement.cpp:65
+ data.setWriteDisabled(true);
We'll never set this back to "false" if we move into a different document at a later time. Should we?
WebCore/dom/ScriptElement.cpp:207
+ m_element->document()->setWriteDisabled(true);
This probably would have been cleaner as a stack-object (like you're so fond of). :)
WebCore/html/HTMLInputStream.h:72
+ // FIXME: Somehow we need to understand the difference between
I thought you said this FIXME could go away if we moved the check back to m_parser instead? What are the concequences of this FIXME? What are we doing wrong now? How bad is it?
WebCore/dom/DocumentParser.h:44
+ virtual bool hasInsertionPoint() { return true; }
You mentioned having this return false?
> + // "write-neutralised" flag set, then abort these steps. > Doesn't match the code, but matches the spec, so OK. These are just quotes from the spec. > WebCore/dom/Document.cpp:2026 > + if (!hasInsertionPoint) > I thought you said you had to change this back to being if (m_parser)? That turned out not to work because of documents created by document.open(). At a basic level, we need to distinguish between the two cases in order to get things 100% correct. > I hope there is just the one constructor... I believe so. > We'll never set this back to "false" if we move into a different document at a later time. Should we? I'm not sure how to write a test case to observe the difference since scripts only run once. Maybe if it gets moved between the first attack and when it runs? I think our behavior is correct w.r.t. the spec in that case anyway since the spec never changes this flag. > + m_element->document()->setWriteDisabled(true); > This probably would have been cleaner as a stack-object (like you're so fond of). :) Maybe. I can change it if you like, but I tried to keep my zest for RAIIs contained. > + // FIXME: Somehow we need to understand the difference between > I thought you said this FIXME could go away if we moved the check back to m_parser instead? What are the concequences of this FIXME? What are we doing wrong now? How bad is it? That turned out not to work. The consequence is if the async script runs before we've gotten EOF from the network, we'll document.write into the network byte stream at the point where we're waiting for bytes from the network. This matches our original behavior. We can polish this further in a followup patch since that case is much more obscure than the bug we're trying to fix (and the behavior matches our old behavior). > WebCore/dom/DocumentParser.h:44 > + virtual bool hasInsertionPoint() { return true; } > You mentioned having this return false? I'll do that in a followup patch. It involves donking around with the SVG font code, so I didn't want to do it in this patch. Comment on attachment 62376 [details]
Patch
I think your zest for RAIIs is totally justified in this case, but it's not a big deal. Just easier to get this "wrong" in any second usage were there to be one.
Thank you for the replies. Comment on attachment 62376 [details]
Patch
Thanks for the detailed review. :)
Comment on attachment 62376 [details] Patch Clearing flags on attachment: 62376 Committed r63998: <http://trac.webkit.org/changeset/63998> All reviewed patches have been landed. Closing bug. |