Bug 40919 - beforeload event not sent to document for <img src> added to document using .innerHTML (affects images.google.com)
Summary: beforeload event not sent to document for <img src> added to document using ....
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Andy Estes
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2010-06-21 08:19 PDT by Adam Roben (:aroben)
Modified: 2010-08-05 15:35 PDT (History)
5 users (show)

See Also:


Attachments
test case (581 bytes, text/html)
2010-06-21 08:19 PDT, Adam Roben (:aroben)
no flags Details
Patch (1.71 KB, patch)
2010-06-23 11:20 PDT, Dave Hyatt
no flags Details | Formatted Diff | Diff
Patch (4.90 KB, patch)
2010-06-30 16:17 PDT, Andy Estes
no flags Details | Formatted Diff | Diff
Patch (3.94 KB, patch)
2010-06-30 23:59 PDT, Andy Estes
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Roben (:aroben) 2010-06-21 08:19:18 PDT
Created attachment 59252 [details]
test case

To reproduce:

1. Load attached test case

The document's beforeload event listener never gets called for the image. This is indicated by the URL of the image not being printed into the page.
Comment 1 Adam Roben (:aroben) 2010-06-21 08:19:58 PDT
images.google.com uses this technique, meaning that beforeload doesn't work to block many images on images.google.com.
Comment 2 Adam Roben (:aroben) 2010-06-21 08:20:56 PDT
The problem is that the <img> element's inDocument() function returns false when the beforeload event is being dispatched. This causes Node::eventAncestors to return 0 ancestors, so the event never gets up to the document.
Comment 3 Adam Roben (:aroben) 2010-06-21 08:21:20 PDT
<rdar://problem/8113003>
Comment 4 Adam Roben (:aroben) 2010-06-21 08:28:47 PDT
The beforeload event is sent via a call to ImageLoader::dispatchPendingBeforeLoadEvents() at the end of HTMLDocumentParser::write(). Apparently this is before the <img> has actually been added to the document.
Comment 5 Dave Hyatt 2010-06-21 12:40:59 PDT
Technically this is correct behavior.  We need to figure out what we're going to do with the loads that happen on DOM elements that aren't in the document yet.  I suspect we may need to do something funky like allow bubbling/capturing from the root of the unattached tree up to the document.

No other event behaves like this, but I can't think of a better way.  All loads need to be easily observable from a single listener, even ones that occur on elements that aren't in the document yet.
Comment 6 Adam Roben (:aroben) 2010-06-21 12:59:06 PDT
(In reply to comment #5)
> Technically this is correct behavior.  We need to figure out what we're going to do with the loads that happen on DOM elements that aren't in the document yet.  I suspect we may need to do something funky like allow bubbling/capturing from the root of the unattached tree up to the document.

Another option is to delay the loading until the element is inserted into the document. But I assume that would cause compatibility problems.
Comment 7 Adam Roben (:aroben) 2010-06-22 12:06:30 PDT
The same bug does not affect the following code:

var img = new Image();
img.src = foo;
document.body.appendChild(img);

Setting @src schedules the beforeload event to be sent after a timer fires (via ImageEventSender::dispatchEventSoon). By the time the timer fires, the image is already in the document.
Comment 8 Dave Hyatt 2010-06-23 11:20:07 PDT
Created attachment 59536 [details]
Patch

This patch does two things:

(1) It makes beforeload flow up to the document even for nodes that aren't reachable from the document element.  This lets the document observe orphaned image loads.

(2) It doesn't bother doing the synchronous beforeload dispatch after parsing document fragments.  This ensures that fragments that then get assigned into the document are in the document at the time beforeload fires.
Comment 9 Andy Estes 2010-06-30 16:17:29 PDT
Created attachment 60166 [details]
Patch
Comment 10 Andy Estes 2010-06-30 16:18:49 PDT
(In reply to comment #9)
> Created an attachment (id=60166) [details]
> Patch

This is hyatt's patch with a layout test.
Comment 11 Darin Adler 2010-06-30 17:06:21 PDT
Comment on attachment 60166 [details]
Patch

> +    // beforeload is always observable by the document, even for elements that are not contained in
> +    // the document.
> +    if (!inDocument() && event->type() == eventNames().beforeloadEvent)
> +        ancestors.append(document());

Is this really the design we want for beforeload? It's very strange! Why would the document need to get the event for a node that's not in it? What’s the reason for this event to be different from all others in this respect?

(I would say "The beforeload event" so you can capitalize the comment sentence.)
Comment 12 Andy Estes 2010-06-30 23:20:29 PDT
(In reply to comment #11)
> (From update of attachment 60166 [details])
> > +    // beforeload is always observable by the document, even for elements that are not contained in
> > +    // the document.
> > +    if (!inDocument() && event->type() == eventNames().beforeloadEvent)
> > +        ancestors.append(document());
> 
> Is this really the design we want for beforeload? It's very strange! Why would the document need to get the event for a node that's not in it? What’s the reason for this event to be different from all others in this respect?

I think the problem is that a clever page could bypass document-level beforeload handlers by doing something like this:

var img = new Image();
img.src = ...;
img.onload = function(event) {
    targetNode.appendChild(img);
}

For injected scripts to be able to cancel loads, it sounds like beforeload events from fragments will need to be propagated to somewhere in the DOM.  Sam mentioned in person that perhaps parenting fragments to the window would make more sense.

Since this change isn't strictly necessary to fix the .innerHTML bug, I think I should file a new bug to track this issue and resubmit the patch with the Node.cpp change removed.
Comment 13 Andy Estes 2010-06-30 23:59:37 PDT
Created attachment 60206 [details]
Patch
Comment 14 Andy Estes 2010-07-01 15:17:20 PDT
Committed r62302: <http://trac.webkit.org/changeset/62302>
Comment 15 Andy Estes 2010-07-06 19:29:45 PDT
(In reply to comment #12)
> Since this change isn't strictly necessary to fix the .innerHTML bug, I think I should file a new bug to track this issue and resubmit the patch with the Node.cpp change removed.

I filed https://bugs.webkit.org/show_bug.cgi?id=41730 to track this issue.
Comment 16 Tony Gentilcore 2010-08-05 13:48:49 PDT
FYI, this caused crashes:

    if (m_noMoreData && !m_inWrite && !state.loadingExtScript() && !m_executingScript && !m_timer.isActive())
        end(); // this actually causes us to be deleted

    // After parsing, go ahead and dispatch image beforeload events, but only if we're doing
    // document parsing.  For document fragments we wait, since they'll likely end up in the document by the time
    // the beforeload events fire.
    if (!m_fragment)
        ImageLoader::dispatchPendingBeforeLoadEvents();

Notice end() might cause this to be deleted and m_fragment is a member. This is a blocker on the Chrome release branch so I'm working on a test and fix.
Comment 17 Tony Gentilcore 2010-08-05 14:54:28 PDT
> This is a blocker on the Chrome release branch so I'm working on a test and fix.

Scratch that. I've just rolled this out on the branch.
Comment 18 Andy Estes 2010-08-05 15:04:51 PDT
(In reply to comment #17)
> > This is a blocker on the Chrome release branch so I'm working on a test and fix.
> 
> Scratch that. I've just rolled this out on the branch.

Tony, can you attach a test case that reproduces the crash?  This is obviously an issue we'd like to fix in ToT.
Comment 19 Tony Gentilcore 2010-08-05 15:35:09 PDT
(In reply to comment #18)
> (In reply to comment #17)
> > > This is a blocker on the Chrome release branch so I'm working on a test and fix.
> > 
> > Scratch that. I've just rolled this out on the branch.
> 
> Tony, can you attach a test case that reproduces the crash?  This is obviously an issue we'd like to fix in ToT.

I don't have a reduced case. The trick is to reach the branch that triggers end().

FWIW, this reproduces pretty reliably (but not 100%) for me with http://www.mashable.com/

I wasn't planning on spending time reducing since this is in the legacy parser. We just switched back to the legacy parser for Chrome 6 because of what's left in https://bugs.webkit.org/show_bug.cgi?id=41115.