We've hit a number of flaky tests due to the ownerElement member of the frame (and document) being deallocated without the frame or document code being aware of this. This seems like a good case to use WeakPtr, so we can make sure the "owned" party is aware of the state of the owning element. This is most often a problem when event handling code causes the containing element to be destroyed while the contained element is either still being used, or is in the process of being moved to another containing element.
Created attachment 277447 [details] Patch
Comment on attachment 277447 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277447&action=review > Source/WebCore/page/Frame.cpp:159 > + , m_ownerElement(ownerElement ? ownerElement->createWeakPtr() : WeakPtr<HTMLFrameOwnerElement>()) will nullptr or { } work here?
(In reply to comment #2) > Comment on attachment 277447 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=277447&action=review > > > Source/WebCore/page/Frame.cpp:159 > > + , m_ownerElement(ownerElement ? ownerElement->createWeakPtr() : WeakPtr<HTMLFrameOwnerElement>()) > > will nullptr or { } work here? I got a compile error when I tried it. :-(
Comment on attachment 277447 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277447&action=review > Source/WebCore/rendering/RenderLayer.cpp:2536 > + // Event handling may have removed the owner element: > + HTMLFrameOwnerElement* currentOwnerElement = renderer().document().ownerElement(); > + if (currentOwnerElement && currentOwnerElement->renderer()) > + parentLayer = currentOwnerElement->renderer()->enclosingLayer(); > + else > + parentLayer = nullptr; You say there's no behavior change, but this looks like one.
Comment on attachment 277447 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277447&action=review >>> Source/WebCore/page/Frame.cpp:159 >>> + , m_ownerElement(ownerElement ? ownerElement->createWeakPtr() : WeakPtr<HTMLFrameOwnerElement>()) >> >> will nullptr or { } work here? > > I got a compile error when I tried it. :-( You tried both?
(In reply to comment #5) > Comment on attachment 277447 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=277447&action=review > > >>> Source/WebCore/page/Frame.cpp:159 > >>> + , m_ownerElement(ownerElement ? ownerElement->createWeakPtr() : WeakPtr<HTMLFrameOwnerElement>()) > >> > >> will nullptr or { } work here? > > > > I got a compile error when I tried it. :-( > > You tried both? Yes :-)
Created attachment 277530 [details] Patch (added another test case)
(In reply to comment #4) > Comment on attachment 277447 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=277447&action=review > > > Source/WebCore/rendering/RenderLayer.cpp:2536 > > + // Event handling may have removed the owner element: > > + HTMLFrameOwnerElement* currentOwnerElement = renderer().document().ownerElement(); > > + if (currentOwnerElement && currentOwnerElement->renderer()) > > + parentLayer = currentOwnerElement->renderer()->enclosingLayer(); > > + else > > + parentLayer = nullptr; > > You say there's no behavior change, but this looks like one. Maybe this isn't necessary. I've tried to construct a test case that would cause the frame to be removed while handling a scrollIntoView, but it doesn't seem like we execute JS during this processing. Do you know if 'delegatesScrolling' or any other settings could be used to cause something to fire in this case? If not, I could remove this extra check.
What I don't get is that the ~HTMLFrameOwnerElement() destructor already calls disconnectOwnerElement() on its contentFrame. Frame::disconnectOwnerElement() takes care of nullifying Frame::m_ownerElement already. I am not sure moving to WeakPtr is the best way to go here. Chances are we fails to disassociate a Frame with its owner at some point.
Comment on attachment 277530 [details] Patch (added another test case) View in context: https://bugs.webkit.org/attachment.cgi?id=277530&action=review > Source/WebCore/rendering/RenderLayer.cpp:2533 > + if (currentOwnerElement && currentOwnerElement->renderer()) We already do this check at line 2515. So it is possible for currentOwnerElement to die between line 2515 and 2533?
Comment on attachment 277530 [details] Patch (added another test case) View in context: https://bugs.webkit.org/attachment.cgi?id=277530&action=review > LayoutTests/ChangeLog:12 > + * fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash.html: Added. I tried to reproduce the crash with a debug build (without and without guardmalloc) using this test but wasn't able to.
Comment on attachment 277530 [details] Patch (added another test case) I don't think we should land this given that HTMLFrameOwnerElement should already nullify its contentFrame's m_ownerElement member whenever its gets destroyed or detached. Adding a WeakPtr seems redundant and may be papering over a real bug. I was hoping to investigate but sadly I could not reproduce the crash with the test provided.
(In reply to comment #12) > Comment on attachment 277530 [details] > Patch (added another test case) > > I don't think we should land this given that HTMLFrameOwnerElement should > already nullify its contentFrame's m_ownerElement member whenever its gets > destroyed or detached. Adding a WeakPtr seems redundant and may be papering > over a real bug. > I was hoping to investigate but sadly I could not reproduce the crash with > the test provided. Fair enough. I think a better test to catch this happening is imported/blink/fast/dom/HTMLBodyElement/body-inserting-iframe-crash.html, which is what made me think about this particular category of issues.
Created attachment 277559 [details] Patch
I audited all of the sites where ownerElement is used, and I think there are only a handful where anything might be wrong. I was mistaken about the m_ownerElement in Frame being left as a dangling pointer -- it does look like the pointer is properly nulled when I step through the code. Consequently, the only things remaining to worry about is the possibility of us holding onto a pointer to something that was deleted. I've added a couple of assertions that I would like to use to catch if this ever happens.
Comment on attachment 277559 [details] Patch Attachment 277559 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/1231450 New failing tests: fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash.html
Created attachment 277568 [details] Archive of layout-test-results from ews102 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews102 Port: mac-yosemite Platform: Mac OS X 10.10.5
Comment on attachment 277559 [details] Patch Attachment 277559 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/1231456 New failing tests: fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash.html
Created attachment 277570 [details] Archive of layout-test-results from ews105 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews105 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5
Comment on attachment 277559 [details] Patch Attachment 277559 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/1231510 New failing tests: fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash.html
Created attachment 277571 [details] Archive of layout-test-results from ews114 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-yosemite Platform: Mac OS X 10.10.5
Comment on attachment 277559 [details] Patch Attachment 277559 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/1231502 New failing tests: fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash.html
Created attachment 277572 [details] Archive of layout-test-results from ews122 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews122 Port: ios-simulator-wk2 Platform: Mac OS X 10.11.4
New test is failing: --- /Volumes/Data/EWS/WebKit/WebKitBuild/Release/layout-test-results/retries/fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash-expected.txt +++ /Volumes/Data/EWS/WebKit/WebKitBuild/Release/layout-test-results/retries/fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash-actual.txt @@ -1,8 +1,9 @@ +CONSOLE MESSAGE: line 43: TypeError: undefined is not an object (evaluating 'window.frames['target'].document.getElementById') This tests whether clicking on an anchor in an iframe with scrolling="no" will scroll to anchor. If clicking on the link below triggers a scroll, the test passes. -PASS This tests whether clicking on an anchor in an iframe with scrolling="no" will scroll to anchor. If clicking on the link below triggers a scroll, the test passes. + PASS
Created attachment 277578 [details] Patch
Comment on attachment 277578 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277578&action=review > Source/WebCore/ChangeLog:11 > + No new tests. There should be no change in behavior, but this will reveal if we need Now "No new tests" is incorrect :)
(In reply to comment #15) > Consequently, the only things remaining to worry about is the possibility of > us holding onto a pointer to something that was deleted. I've added a couple > of assertions that I would like to use to catch if this ever happens. It’s OK with me to add the assertions. While they do nothing to solve a problem if it happens, they may make it easier to detect a problem in debug builds including on the bots. But there are two other ways to address this type of problem if there is a risk of a real problem: 1) We could re-fetch the ownerElement and check if for null if there is any risk that it could actually become null. 2) We could keep the ownerElement in a Ref or RefPtr local variable, which means that the object won’t be destroyed even if it’s no longer the ownerElement.
Created attachment 277624 [details] Patch
> (In reply to comment #15) > 2) We could keep the ownerElement in a Ref or RefPtr local variable, which > means that the object won’t be destroyed even if it’s no longer the > ownerElement. That seems like a better option. I'll take that approach.
Comment on attachment 277624 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277624&action=review > Source/WebCore/inspector/InspectorDOMAgent.cpp:1881 > + Ref<Element> protector(*frameOwner); Protector variables are ugly. Even more so when they are speculative. Could you make frameOwner a RefPtr<> at line 1872 instead? > Source/WebCore/rendering/RenderLayer.cpp:2525 > + Ref<HTMLFrameOwnerElement> protector(*ownerElement); Why isn't this ref'd at line 2513 instead? i.e. RefPtr<HTMLFrameOwnerElement> ownerElement = renderer().document().ownerElement(); Also frameElementBase should probably be declared at line 2519 instead of 2516.
Comment on attachment 277624 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277624&action=review >> Source/WebCore/inspector/InspectorDOMAgent.cpp:1881 >> + Ref<Element> protector(*frameOwner); > > Protector variables are ugly. Even more so when they are speculative. > Could you make frameOwner a RefPtr<> at line 1872 instead? Sure. >> Source/WebCore/rendering/RenderLayer.cpp:2525 >> + Ref<HTMLFrameOwnerElement> protector(*ownerElement); > > Why isn't this ref'd at line 2513 instead? i.e. > RefPtr<HTMLFrameOwnerElement> ownerElement = renderer().document().ownerElement(); > > Also frameElementBase should probably be declared at line 2519 instead of 2516. I figured there was no point in bothering to ref the pointer if we weren't going to use it.
Comment on attachment 277624 [details] Patch Also note that while ref'ing the ownerElement does extend its lifetime, it does not necessarily mean that it is still the Frame's ownerElement by the time it is used.
Comment on attachment 277624 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277624&action=review >>> Source/WebCore/rendering/RenderLayer.cpp:2525 >>> + Ref<HTMLFrameOwnerElement> protector(*ownerElement); >> >> Why isn't this ref'd at line 2513 instead? i.e. >> RefPtr<HTMLFrameOwnerElement> ownerElement = renderer().document().ownerElement(); >> >> Also frameElementBase should probably be declared at line 2519 instead of 2516. > > I figured there was no point in bothering to ref the pointer if we weren't going to use it. Is there really any JS execution between this line and line 2533? Do we need to ref at all?
Comment on attachment 277624 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277624&action=review >>>> Source/WebCore/rendering/RenderLayer.cpp:2525 >>>> + Ref<HTMLFrameOwnerElement> protector(*ownerElement); >>> >>> Why isn't this ref'd at line 2513 instead? i.e. >>> RefPtr<HTMLFrameOwnerElement> ownerElement = renderer().document().ownerElement(); >>> >>> Also frameElementBase should probably be declared at line 2519 instead of 2516. >> >> I figured there was no point in bothering to ref the pointer if we weren't going to use it. > > Is there really any JS execution between this line and line 2533? Do we need to ref at all? Note that if we don't believe there is any synchronous JS execution in this range, we could assert so using: NoEventDispatchAssertion assertNoEventDispatch;
Comment on attachment 277624 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277624&action=review >>>>> Source/WebCore/rendering/RenderLayer.cpp:2525 >>>>> + Ref<HTMLFrameOwnerElement> protector(*ownerElement); >>>> >>>> Why isn't this ref'd at line 2513 instead? i.e. >>>> RefPtr<HTMLFrameOwnerElement> ownerElement = renderer().document().ownerElement(); >>>> >>>> Also frameElementBase should probably be declared at line 2519 instead of 2516. >>> >>> I figured there was no point in bothering to ref the pointer if we weren't going to use it. >> >> Is there really any JS execution between this line and line 2533? Do we need to ref at all? > > Note that if we don't believe there is any synchronous JS execution in this range, we could assert so using: > NoEventDispatchAssertion assertNoEventDispatch; Oh! I didn't know about that! That is an even better option.
Created attachment 277629 [details] Patch
Comment on attachment 277629 [details] Patch ok
Committed r200216: <http://trac.webkit.org/changeset/200216>
Comment on attachment 277629 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277629&action=review > Source/WebCore/inspector/InspectorDOMAgent.cpp:1885 > + Ref<Inspector::Protocol::DOM::Node> value = buildObjectForNode(frameOwner.get(), 0, &m_documentNodeToIdMap); So wordy. Why to use auto on a line like this?
Comment on attachment 277629 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=277629&action=review >> Source/WebCore/inspector/InspectorDOMAgent.cpp:1885 >> + Ref<Inspector::Protocol::DOM::Node> value = buildObjectForNode(frameOwner.get(), 0, &m_documentNodeToIdMap); > > So wordy. Why to use auto on a line like this? Oh, I should have. I was just changing "frameOwner" to "frameOwner.get()" since it was changing to a RefPtr.