Bug 67214

Summary: PreloadScanner shouldn't load images inside noscript via doc.write
Product: WebKit Reporter: Tony Gentilcore <tonyg>
Component: New BugsAssignee: Tony Gentilcore <tonyg>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, eric, koivisto, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch none

Description Tony Gentilcore 2011-08-30 11:50:08 PDT
PreloadScanner shouldn't load images inside noscript via doc.write
Comment 1 Tony Gentilcore 2011-08-30 12:01:08 PDT
Created attachment 105662 [details]
Patch
Comment 2 Darin Adler 2011-08-30 12:02:49 PDT
Comment on attachment 105662 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=105662&action=review

> Source/WebCore/html/parser/HTMLDocumentParser.h:154
> +    // Check the document.write() output with a separate preload scanner as
> +    // the main scanner can't deal with insertions.

I liked this comment better in its old location.
Comment 3 Tony Gentilcore 2011-08-30 12:03:58 PDT
Apparently this bug hit a lot of DoubleClick users:
http://code.google.com/p/chromium/issues/detail?id=86911

I'm not thrilled about this approach, but it seems workable. The problem mentioned in the ChangeLog could be mitigated by clearing the m_insertionPreloadScanner after each script block is executed. But I'm not sure whether that is even worthwhile. PTAL and let me know if you have any other ideas.
Comment 4 Tony Gentilcore 2011-08-30 12:05:56 PDT
(In reply to comment #2)
> (From update of attachment 105662 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=105662&action=review
> 
> > Source/WebCore/html/parser/HTMLDocumentParser.h:154
> > +    // Check the document.write() output with a separate preload scanner as
> > +    // the main scanner can't deal with insertions.
> 
> I liked this comment better in its old location.

OK, I'll move the comment back before landing.

But I'll wait for feedback from Antti/Adam on the overall approach before uploading a new version of the patch.
Comment 5 Adam Barth 2011-08-30 12:18:49 PDT
Comment on attachment 105662 [details]
Patch

Interesting.  Maybe we should clear the m_insertionPreloadScanner when we pop the current insertion point?  That would keep the tokenizer in a consistent state throughout a single insertion point but get a fresh state for each new insertion point.
Comment 6 Adam Barth 2011-08-30 12:25:13 PDT
Comment on attachment 105662 [details]
Patch

This approach is also fine if there are some complexities in figuring out when it is safe to reset the state.
Comment 7 Tony Gentilcore 2011-08-30 13:05:41 PDT
Created attachment 105679 [details]
Patch
Comment 8 Tony Gentilcore 2011-08-30 13:08:06 PDT
Okay, now I'm clearing the insertion scanner in resumeParsingAfterScriptExecution() and I updated the test to make sure we don't get stuck in the noscript state.

I've also moved the comment back to the cpp file like Darin suggested.

Please take another look.
Comment 9 WebKit Review Bot 2011-08-30 14:26:10 PDT
Comment on attachment 105679 [details]
Patch

Clearing flags on attachment: 105679

Committed r94111: <http://trac.webkit.org/changeset/94111>
Comment 10 WebKit Review Bot 2011-08-30 14:26:14 PDT
All reviewed patches have been landed.  Closing bug.