Bug 74533

Summary: REGRESSION(r102619): Reproducible crash closing window with video + poster image inside an object element
Product: WebKit Reporter: Brady Eidson <beidson>
Component: Page LoadingAssignee: Brady Eidson <beidson>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, darin, kbr, mrowe, ossy, tkent, webkit.review.bot, zherczeg
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 74607    
Bug Blocks: 74555    
Attachments:
Description Flags
Patch v1 - Fix + layout test
beidson: review-
Patch v2 - Style fix and better layout test
darin: review+
Patch v3 - Rename "document activation callbacks" to "page cache suspension" callbacks and use them darin: review+

Brady Eidson
Reported 2011-12-14 13:11:59 PST
REGRESSION(r102619): Reproducible crash closing window with video + poster image inside an object element 0 com.apple.WebCore 0x000000010ca60dac WebCore::FrameLoader::loadedResourceFromMemoryCache(WebCore::CachedResource*) + 76 1 com.apple.WebCore 0x000000010c92fb15 WebCore::CachedResourceLoader::requestResource(WebCore::CachedResource::Type, WebCore::ResourceRequest&, WTF::String const&, WebCore::ResourceLoaderOptions const&, WebCore::ResourceLoadPriority, bool) + 821 2 com.apple.WebCore 0x000000010c92f702 WebCore::CachedResourceLoader::requestImage(WebCore::ResourceRequest&) + 402 3 com.apple.WebCore 0x000000010c42d1d1 WebCore::ImageLoader::updateFromElement() + 769 4 com.apple.WebCore 0x000000010c470389 WebCore::HTMLVideoElement::attach() + 121 5 com.apple.WebCore 0x000000010c30ec5c WebCore::ContainerNode::attach() + 28 6 com.apple.WebCore 0x000000010c31d9d9 WebCore::Element::attach() + 105 7 com.apple.WebCore 0x000000010c30ec5c WebCore::ContainerNode::attach() + 28 8 com.apple.WebCore 0x000000010c31d9d9 WebCore::Element::attach() + 105 9 com.apple.WebCore 0x000000010c34e520 WebCore::Element::recalcStyle(WebCore::Node::StyleChange) + 592 10 com.apple.WebCore 0x000000010ca9ea2b WebCore::HTMLPlugInImageElement::documentWillBecomeInactive() + 203 11 com.apple.WebCore 0x000000010c375479 WebCore::Document::detach() + 409 12 com.apple.WebCore 0x000000010c300c48 WebCore::Frame::setView(WTF::PassRefPtr<WebCore::FrameView>) + 72 13 com.apple.WebCore 0x000000010c373a9a WebCore::FrameLoader::detachFromParent() + 314 14 com.apple.WebKit2 0x000000010bd7f59a WebKit::WebPage::close() + 306 This is due to running what was intended to be "suspend" code at teardown time. Fix is forthcoming. In radar as <rdar://problem/10576732>
Attachments
Patch v1 - Fix + layout test (9.08 KB, patch)
2011-12-14 13:54 PST, Brady Eidson
beidson: review-
Patch v2 - Style fix and better layout test (9.27 KB, patch)
2011-12-14 14:27 PST, Brady Eidson
darin: review+
Patch v3 - Rename "document activation callbacks" to "page cache suspension" callbacks and use them (27.49 KB, patch)
2011-12-16 10:54 PST, Brady Eidson
darin: review+
Brady Eidson
Comment 1 2011-12-14 13:54:58 PST
Created attachment 119290 [details] Patch v1 - Fix + layout test
WebKit Review Bot
Comment 2 2011-12-14 13:58:57 PST
Attachment 119290 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/medi..." exit_code: 1 Source/WebCore/html/HTMLPlugInImageElement.h:26: Alphabetical sorting problem. [build/include_order] [4] Total errors found: 1 in 7 files If any of these errors are false positives, please file a bug against check-webkit-style.
Brady Eidson
Comment 3 2011-12-14 13:59:58 PST
(In reply to comment #2) > Attachment 119290 [details] did not pass style-queue: > > Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/medi..." exit_code: 1 > > Source/WebCore/html/HTMLPlugInImageElement.h:26: Alphabetical sorting problem. [build/include_order] [4] > Total errors found: 1 in 7 files > > > If any of these errors are false positives, please file a bug against check-webkit-style. Will fix this, and clean up the layout test some more - Just realized it had unnecessary remnants from a past version.
Brady Eidson
Comment 4 2011-12-14 14:27:57 PST
Created attachment 119297 [details] Patch v2 - Style fix and better layout test
Darin Adler
Comment 5 2011-12-14 14:32:15 PST
Comment on attachment 119297 [details] Patch v2 - Style fix and better layout test View in context: https://bugs.webkit.org/attachment.cgi?id=119297&action=review > Source/WebCore/html/HTMLPlugInImageElement.h:59 > + virtual bool canSuspend() const OVERRIDE; > + virtual void suspend(ReasonForSuspension) OVERRIDE; > + virtual void resume() OVERRIDE; Do these need to be public? Why can’t they be protected?
Brady Eidson
Comment 6 2011-12-14 14:47:15 PST
(In reply to comment #5) > (From update of attachment 119297 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=119297&action=review > > > Source/WebCore/html/HTMLPlugInImageElement.h:59 > > + virtual bool canSuspend() const OVERRIDE; > > + virtual void suspend(ReasonForSuspension) OVERRIDE; > > + virtual void resume() OVERRIDE; > > Do these need to be public? Why can’t they be protected? Good point, fixing before landing. Thanks for the review!
Brady Eidson
Comment 7 2011-12-14 14:52:33 PST
Alexey Proskuryakov
Comment 8 2011-12-14 15:32:00 PST
I don't think that using ActiveDOMObject here is right. In particular, it looks like the suspend() function can fire events, breaking iteration over the objects in ScriptExecutionContext. ActiveDOMObject was meant as a means to keep non-node objects alive, and using it with nodes might still be causing issues along the lines of bug 45309. Also, as you mentioned on IRC, the semantics of moving ActiveDOMObject nodes between documents are unclear.
Brady Eidson
Comment 9 2011-12-14 15:40:50 PST
(In reply to comment #8) > I don't think that using ActiveDOMObject here is right. In particular, it looks like the suspend() function can fire events, breaking iteration over the objects in ScriptExecutionContext. Breaking how? Can't the ScriptExecutionContext iterate in a manner that won't be broken? > ActiveDOMObject was meant as a means to keep non-node objects alive, and using it with nodes might still be causing issues along the lines of bug 45309. That might have been one of it's purposes, but doesn't it also do much more than fits the Document activation callback use case quite well? I think that's why over time more elements have found themselves moving to ActiveDOMObject. >Also, as you mentioned on IRC, the semantics of moving ActiveDOMObject nodes between documents are unclear. Indeed. Such semantics are only clear for elements that override willMoveToNewOwnerDocument and didMoveToNewOwnerDocument, which is dreadfully few. In general our support for moving objects between Documents is horrible. Making ActiveDOMObject aware of this would be useful.
Alexey Proskuryakov
Comment 10 2011-12-14 15:51:36 PST
> Breaking how? Can't the ScriptExecutionContext iterate in a manner that won't be broken? An event handler can add or remove ActiveDOMObjects. This was previously discussed at length in the context of IndexedDB, and no, it doesn't seem easy to change the code in a way that would be sufficiently robust. No matter how we change iteration in ScriptExecutionContext, there will be cases that don't work right in general case. Since this patch almost certainly introduces a security bug, it should probably be rolled out.
Brady Eidson
Comment 11 2011-12-14 16:03:20 PST
(In reply to comment #10) > > Breaking how? Can't the ScriptExecutionContext iterate in a manner that won't be broken? > > An event handler can add or remove ActiveDOMObjects. > > This was previously discussed at length in the context of IndexedDB, and no, it doesn't seem easy to change the code in a way that would be sufficiently robust. No matter how we change iteration in ScriptExecutionContext, there will be cases that don't work right in general case. > > Since this patch almost certainly introduces a security bug, it should probably be rolled out. Can you explain why this patch almost certainly introduces a security bug whereas the Media and Marquee element stuff do not, and should not be rolled out? I'm still unclear on what the actual security bug is. Why can't we copy the set of objects to a vector before iterating them...? Going forward we should have precisely *one* mechanism for handling suspend/resume instead of the current two mechanisms. I think the right course is to fix the shortcomings in the obviously superior mechanism so we can get rid of the obviously inferior one (which I already filed as https://bugs.webkit.org/show_bug.cgi?id=74545)
Kenneth Russell
Comment 12 2011-12-14 16:05:02 PST
Another point: this patch caused the layout test fast/images/move-image-to-new-document.html to start timing out, which would be another point in favor of rolling out and trying another approach.
Alexey Proskuryakov
Comment 13 2011-12-14 16:12:48 PST
> Can you explain why this patch almost certainly introduces a security bug whereas the Media and Marquee element stuff do not, and should not be rolled out? HTMLPlugInImageElement::suspend does a style recalculation, which can do things like instantiating plug-ins or firing events. I haven't looked at the other active nodes now, perhaps they also do something terrible like that. Someone who wants to redesign ActiveDOMObject can follow discussions in 52719, bug 52366 and related bugs. My take from those discussions is that ActiveDOMObject should be avoided. Given that it often causes such strong desire to use it, it should probably be made less prominent or removed altogether.
Kenneth Russell
Comment 14 2011-12-14 16:17:04 PST
(In reply to comment #12) > Another point: this patch caused the layout test fast/images/move-image-to-new-document.html to start timing out, which would be another point in favor of rolling out and trying another approach. To be clear, the regression in that test case looks pretty serious -- load events aren't firing for an object element when it's moved from one page to another.
Zoltan Herczeg
Comment 15 2011-12-15 01:50:08 PST
Test fix seems broke fast/images/move-image-to-new-document.html on several bots (at least on Mac, GTK and QT). Any idea how to fix it?
Csaba Osztrogonác
Comment 16 2011-12-15 07:08:26 PST
New bug report for the new regression: https://bugs.webkit.org/show_bug.cgi?id=74607
Brady Eidson
Comment 17 2011-12-15 08:54:06 PST
(In reply to comment #12) > Another point: this patch caused the layout test fast/images/move-image-to-new-document.html to start timing out, which would be another point in favor of rolling out and trying another approach. Covered by https://bugs.webkit.org/show_bug.cgi?id=74555 I'll take action on this today. (In reply to comment #14) > (In reply to comment #12) > To be clear, the regression in that test case looks pretty serious -- load events aren't firing for an object element when it's moved from one page to another. Actually our support for moving nodes between documents is spotty at best. I'm surprised I happened to touch a cross section of the code where it *was* handled properly. Reproducible crashers are also serious, but obviously it's not okay to cause a known regression either.
Brady Eidson
Comment 18 2011-12-15 10:51:04 PST
Brady Eidson
Comment 19 2011-12-15 15:37:15 PST
I have a patch for a better fix, but I can't test it easily right now as http://trac.webkit.org/changeset/102866 is causing a crash. Covered by https://bugs.webkit.org/show_bug.cgi?id=74655
Brady Eidson
Comment 20 2011-12-16 10:54:04 PST
Created attachment 119634 [details] Patch v3 - Rename "document activation callbacks" to "page cache suspension" callbacks and use them This changes the "document will become inactive" callbacks from being called both on page cache suspension and document detachment in to "page cache suspension" callbacks that are absolutely only called when the document goes in to the page cache.
Darin Adler
Comment 21 2011-12-16 16:06:56 PST
Comment on attachment 119634 [details] Patch v3 - Rename "document activation callbacks" to "page cache suspension" callbacks and use them View in context: https://bugs.webkit.org/attachment.cgi?id=119634&action=review > Source/WebCore/ChangeLog:11 > + Elements that used to register for "document activation callbacks" now register for "page cache suspension" callbacks. > + The Document only dispatches these callbacks specifically when it suspended in to the page cache or resumed from it. I think this description is unnecessarily oblique. The real issue is that we don’t need to do all this work when tearing down documents; only when suspending documents we plan to resume later. Maybe there’s a more straightforward way of saying that. > Source/WebCore/dom/Document.h:982 > + enum ReasonForInactivation { > + Suspension, > + Detach > + }; I’d rather tree this all on one line. I don’t think that “suspension” and “detach” are parallel. Maybe call it “render tree destruction” instead of “detach”? I wonder what the name is for a process when a document goes into a state where it’s no longer showed in a frame, but nodes in it may still exist.
Darin Adler
Comment 22 2011-12-16 16:08:19 PST
Comment on attachment 119634 [details] Patch v3 - Rename "document activation callbacks" to "page cache suspension" callbacks and use them View in context: https://bugs.webkit.org/attachment.cgi?id=119634&action=review > Source/WebCore/dom/Document.cpp:3922 > -void Document::documentWillBecomeInactive() > +void Document::documentWillBecomeInactive(ReasonForInactivation reason) I think this should just be two functions. One can call the other. There’s no reason to use an enum here instead of a second function.
Brady Eidson
Comment 23 2011-12-16 16:33:15 PST
Note You need to log in before you can comment on or make changes to this bug.