Bug 157040 - Make sure we don't mishandle HTMLFrameOwnerElement lifecycle
Summary: Make sure we don't mishandle HTMLFrameOwnerElement lifecycle
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Brent Fulgham
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-04-26 12:24 PDT by Brent Fulgham
Modified: 2016-04-28 17:40 PDT (History)
14 users (show)

See Also:


Attachments
Patch (5.15 KB, patch)
2016-04-26 23:07 PDT, Brent Fulgham
no flags Details | Formatted Diff | Diff
Patch (added another test case) (9.00 KB, patch)
2016-04-27 13:28 PDT, Brent Fulgham
no flags Details | Formatted Diff | Diff
Patch (8.21 KB, patch)
2016-04-27 17:48 PDT, Brent Fulgham
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews102 for mac-yosemite (783.17 KB, application/zip)
2016-04-27 18:32 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews105 for mac-yosemite-wk2 (789.91 KB, application/zip)
2016-04-27 18:36 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews114 for mac-yosemite (853.91 KB, application/zip)
2016-04-27 18:48 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews122 for ios-simulator-wk2 (606.54 KB, application/zip)
2016-04-27 18:53 PDT, Build Bot
no flags Details
Patch (6.46 KB, patch)
2016-04-27 19:42 PDT, Brent Fulgham
no flags Details | Formatted Diff | Diff
Patch (6.36 KB, patch)
2016-04-28 09:04 PDT, Brent Fulgham
no flags Details | Formatted Diff | Diff
Patch (7.40 KB, patch)
2016-04-28 09:47 PDT, Brent Fulgham
cdumez: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Brent Fulgham 2016-04-26 12:24:50 PDT
We've hit a number of flaky tests due to the ownerElement member of the frame (and document) being deallocated without the frame or document code being aware of this. This seems like a good case to use WeakPtr, so we can make sure the "owned" party is aware of the state of the owning element.

This is most often a problem when event handling code causes the containing element to be destroyed while the contained element is either still being used, or is in the process of being moved to another containing element.
Comment 1 Brent Fulgham 2016-04-26 23:07:10 PDT
Created attachment 277447 [details]
Patch
Comment 2 Alex Christensen 2016-04-26 23:46:14 PDT
Comment on attachment 277447 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277447&action=review

> Source/WebCore/page/Frame.cpp:159
> +    , m_ownerElement(ownerElement ? ownerElement->createWeakPtr() : WeakPtr<HTMLFrameOwnerElement>())

will nullptr or { } work here?
Comment 3 Brent Fulgham 2016-04-26 23:57:40 PDT
(In reply to comment #2)
> Comment on attachment 277447 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=277447&action=review
> 
> > Source/WebCore/page/Frame.cpp:159
> > +    , m_ownerElement(ownerElement ? ownerElement->createWeakPtr() : WeakPtr<HTMLFrameOwnerElement>())
> 
> will nullptr or { } work here?

I got a compile error when I tried it. :-(
Comment 4 Simon Fraser (smfr) 2016-04-27 08:31:40 PDT
Comment on attachment 277447 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277447&action=review

> Source/WebCore/rendering/RenderLayer.cpp:2536
> +                    // Event handling may have removed the owner element:
> +                    HTMLFrameOwnerElement* currentOwnerElement = renderer().document().ownerElement();
> +                    if (currentOwnerElement && currentOwnerElement->renderer())
> +                        parentLayer = currentOwnerElement->renderer()->enclosingLayer();
> +                    else
> +                        parentLayer = nullptr;

You say there's no behavior change, but this looks like one.
Comment 5 Darin Adler 2016-04-27 11:54:54 PDT
Comment on attachment 277447 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277447&action=review

>>> Source/WebCore/page/Frame.cpp:159
>>> +    , m_ownerElement(ownerElement ? ownerElement->createWeakPtr() : WeakPtr<HTMLFrameOwnerElement>())
>> 
>> will nullptr or { } work here?
> 
> I got a compile error when I tried it. :-(

You tried both?
Comment 6 Brent Fulgham 2016-04-27 12:02:53 PDT
(In reply to comment #5)
> Comment on attachment 277447 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=277447&action=review
> 
> >>> Source/WebCore/page/Frame.cpp:159
> >>> +    , m_ownerElement(ownerElement ? ownerElement->createWeakPtr() : WeakPtr<HTMLFrameOwnerElement>())
> >> 
> >> will nullptr or { } work here?
> > 
> > I got a compile error when I tried it. :-(
> 
> You tried both?

Yes :-)
Comment 7 Brent Fulgham 2016-04-27 13:28:58 PDT
Created attachment 277530 [details]
Patch (added another test case)
Comment 8 Brent Fulgham 2016-04-27 13:33:54 PDT
(In reply to comment #4)
> Comment on attachment 277447 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=277447&action=review
> 
> > Source/WebCore/rendering/RenderLayer.cpp:2536
> > +                    // Event handling may have removed the owner element:
> > +                    HTMLFrameOwnerElement* currentOwnerElement = renderer().document().ownerElement();
> > +                    if (currentOwnerElement && currentOwnerElement->renderer())
> > +                        parentLayer = currentOwnerElement->renderer()->enclosingLayer();
> > +                    else
> > +                        parentLayer = nullptr;
> 
> You say there's no behavior change, but this looks like one.

Maybe this isn't necessary. I've tried to construct a test case that would cause the frame to be removed while handling a scrollIntoView, but it doesn't seem like we execute JS during this processing.

Do you know if 'delegatesScrolling' or any other settings could be used to cause something to fire in this case?

If not, I could remove this extra check.
Comment 9 Chris Dumez 2016-04-27 15:04:25 PDT
What I don't get is that the ~HTMLFrameOwnerElement() destructor already calls disconnectOwnerElement() on its contentFrame. Frame::disconnectOwnerElement() takes care of nullifying Frame::m_ownerElement already.

I am not sure moving to WeakPtr is the best way to go here. Chances are we fails to disassociate a Frame with its owner at some point.
Comment 10 Chris Dumez 2016-04-27 15:14:40 PDT
Comment on attachment 277530 [details]
Patch (added another test case)

View in context: https://bugs.webkit.org/attachment.cgi?id=277530&action=review

> Source/WebCore/rendering/RenderLayer.cpp:2533
> +                    if (currentOwnerElement && currentOwnerElement->renderer())

We already do this check at line 2515. So it is possible for currentOwnerElement to die between line 2515 and 2533?
Comment 11 Chris Dumez 2016-04-27 15:44:35 PDT
Comment on attachment 277530 [details]
Patch (added another test case)

View in context: https://bugs.webkit.org/attachment.cgi?id=277530&action=review

> LayoutTests/ChangeLog:12
> +        * fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash.html: Added.

I tried to reproduce the crash with a debug build (without and without guardmalloc) using this test but wasn't able to.
Comment 12 Chris Dumez 2016-04-27 16:05:40 PDT
Comment on attachment 277530 [details]
Patch (added another test case)

I don't think we should land this given that HTMLFrameOwnerElement should already nullify its contentFrame's m_ownerElement member whenever its gets destroyed or detached. Adding a WeakPtr seems redundant and may be papering over a real bug.
I was hoping to investigate but sadly I could not reproduce the crash with the test provided.
Comment 13 Brent Fulgham 2016-04-27 16:10:02 PDT
(In reply to comment #12)
> Comment on attachment 277530 [details]
> Patch (added another test case)
> 
> I don't think we should land this given that HTMLFrameOwnerElement should
> already nullify its contentFrame's m_ownerElement member whenever its gets
> destroyed or detached. Adding a WeakPtr seems redundant and may be papering
> over a real bug.
> I was hoping to investigate but sadly I could not reproduce the crash with
> the test provided.

Fair enough.

I think a better test to catch this happening is imported/blink/fast/dom/HTMLBodyElement/body-inserting-iframe-crash.html, which is what made me think about this particular category of issues.
Comment 14 Brent Fulgham 2016-04-27 17:48:44 PDT
Created attachment 277559 [details]
Patch
Comment 15 Brent Fulgham 2016-04-27 17:51:42 PDT
I audited all of the sites where ownerElement is used, and I think there are only a handful where anything might be wrong.

I was mistaken about the m_ownerElement in Frame being left as a dangling pointer -- it does look like the pointer is properly nulled when I step through the code.

Consequently, the only things remaining to worry about is the possibility of us holding onto a pointer to something that was deleted. I've added a couple of assertions that I would like to use to catch if this ever happens.
Comment 16 Build Bot 2016-04-27 18:32:36 PDT
Comment on attachment 277559 [details]
Patch

Attachment 277559 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/1231450

New failing tests:
fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash.html
Comment 17 Build Bot 2016-04-27 18:32:42 PDT
Created attachment 277568 [details]
Archive of layout-test-results from ews102 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews102  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 18 Build Bot 2016-04-27 18:36:31 PDT
Comment on attachment 277559 [details]
Patch

Attachment 277559 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/1231456

New failing tests:
fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash.html
Comment 19 Build Bot 2016-04-27 18:36:35 PDT
Created attachment 277570 [details]
Archive of layout-test-results from ews105 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews105  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 20 Build Bot 2016-04-27 18:48:45 PDT
Comment on attachment 277559 [details]
Patch

Attachment 277559 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1231510

New failing tests:
fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash.html
Comment 21 Build Bot 2016-04-27 18:48:52 PDT
Created attachment 277571 [details]
Archive of layout-test-results from ews114 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews114  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 22 Build Bot 2016-04-27 18:53:46 PDT
Comment on attachment 277559 [details]
Patch

Attachment 277559 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/1231502

New failing tests:
fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash.html
Comment 23 Build Bot 2016-04-27 18:53:53 PDT
Created attachment 277572 [details]
Archive of layout-test-results from ews122 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews122  Port: ios-simulator-wk2  Platform: Mac OS X 10.11.4
Comment 24 Chris Dumez 2016-04-27 18:59:08 PDT
New test is failing:
--- /Volumes/Data/EWS/WebKit/WebKitBuild/Release/layout-test-results/retries/fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash-expected.txt
+++ /Volumes/Data/EWS/WebKit/WebKitBuild/Release/layout-test-results/retries/fast/dom/HTMLAnchorElement/anchor-in-noscroll-iframe-crash-actual.txt
@@ -1,8 +1,9 @@
+CONSOLE MESSAGE: line 43: TypeError: undefined is not an object (evaluating 'window.frames['target'].document.getElementById')
 This tests whether clicking on an anchor in an iframe with scrolling="no" will scroll to anchor. If clicking on the link below triggers a scroll, the test passes.
 
 
-PASS
 This tests whether clicking on an anchor in an iframe with scrolling="no" will scroll to anchor. If clicking on the link below triggers a scroll, the test passes.
 
 
+
 PASS
Comment 25 Brent Fulgham 2016-04-27 19:42:58 PDT
Created attachment 277578 [details]
Patch
Comment 26 Alex Christensen 2016-04-27 22:34:18 PDT
Comment on attachment 277578 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277578&action=review

> Source/WebCore/ChangeLog:11
> +        No new tests. There should be no change in behavior, but this will reveal if we need

Now "No new tests" is incorrect :)
Comment 27 Darin Adler 2016-04-28 08:31:24 PDT
(In reply to comment #15)
> Consequently, the only things remaining to worry about is the possibility of
> us holding onto a pointer to something that was deleted. I've added a couple
> of assertions that I would like to use to catch if this ever happens.

It’s OK with me to add the assertions. While they do nothing to solve a problem if it happens, they may make it easier to detect a problem in debug builds including on the bots.

But there are two other ways to address this type of problem if there is a risk of a real problem:

1) We could re-fetch the ownerElement and check if for null if there is any risk that it could actually become null.

2) We could keep the ownerElement in a Ref or RefPtr local variable, which means that the object won’t be destroyed even if it’s no longer the ownerElement.
Comment 28 Brent Fulgham 2016-04-28 09:04:41 PDT
Created attachment 277624 [details]
Patch
Comment 29 Brent Fulgham 2016-04-28 09:04:57 PDT
> (In reply to comment #15)
> 2) We could keep the ownerElement in a Ref or RefPtr local variable, which
> means that the object won’t be destroyed even if it’s no longer the
> ownerElement.

That seems like a better option. I'll take that approach.
Comment 30 Chris Dumez 2016-04-28 09:23:03 PDT
Comment on attachment 277624 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277624&action=review

> Source/WebCore/inspector/InspectorDOMAgent.cpp:1881
> +    Ref<Element> protector(*frameOwner);

Protector variables are ugly. Even more so when they are speculative.
Could you make frameOwner a RefPtr<> at line 1872 instead?

> Source/WebCore/rendering/RenderLayer.cpp:2525
> +                Ref<HTMLFrameOwnerElement> protector(*ownerElement);

Why isn't this ref'd at line 2513 instead? i.e.
RefPtr<HTMLFrameOwnerElement> ownerElement = renderer().document().ownerElement();

Also frameElementBase should probably be declared at line 2519 instead of 2516.
Comment 31 Brent Fulgham 2016-04-28 09:26:51 PDT
Comment on attachment 277624 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277624&action=review

>> Source/WebCore/inspector/InspectorDOMAgent.cpp:1881
>> +    Ref<Element> protector(*frameOwner);
> 
> Protector variables are ugly. Even more so when they are speculative.
> Could you make frameOwner a RefPtr<> at line 1872 instead?

Sure.

>> Source/WebCore/rendering/RenderLayer.cpp:2525
>> +                Ref<HTMLFrameOwnerElement> protector(*ownerElement);
> 
> Why isn't this ref'd at line 2513 instead? i.e.
> RefPtr<HTMLFrameOwnerElement> ownerElement = renderer().document().ownerElement();
> 
> Also frameElementBase should probably be declared at line 2519 instead of 2516.

I figured there was no point in bothering to ref the pointer if we weren't going to use it.
Comment 32 Chris Dumez 2016-04-28 09:27:20 PDT
Comment on attachment 277624 [details]
Patch

Also note that while ref'ing the ownerElement does extend its lifetime, it does not necessarily mean that it is still the Frame's ownerElement by the time it is used.
Comment 33 Chris Dumez 2016-04-28 09:28:59 PDT
Comment on attachment 277624 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277624&action=review

>>> Source/WebCore/rendering/RenderLayer.cpp:2525
>>> +                Ref<HTMLFrameOwnerElement> protector(*ownerElement);
>> 
>> Why isn't this ref'd at line 2513 instead? i.e.
>> RefPtr<HTMLFrameOwnerElement> ownerElement = renderer().document().ownerElement();
>> 
>> Also frameElementBase should probably be declared at line 2519 instead of 2516.
> 
> I figured there was no point in bothering to ref the pointer if we weren't going to use it.

Is there really any JS execution between this line and line 2533? Do we need to ref at all?
Comment 34 Chris Dumez 2016-04-28 09:30:53 PDT
Comment on attachment 277624 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277624&action=review

>>>> Source/WebCore/rendering/RenderLayer.cpp:2525
>>>> +                Ref<HTMLFrameOwnerElement> protector(*ownerElement);
>>> 
>>> Why isn't this ref'd at line 2513 instead? i.e.
>>> RefPtr<HTMLFrameOwnerElement> ownerElement = renderer().document().ownerElement();
>>> 
>>> Also frameElementBase should probably be declared at line 2519 instead of 2516.
>> 
>> I figured there was no point in bothering to ref the pointer if we weren't going to use it.
> 
> Is there really any JS execution between this line and line 2533? Do we need to ref at all?

Note that if we don't believe there is any synchronous JS execution in this range, we could assert so using:
NoEventDispatchAssertion assertNoEventDispatch;
Comment 35 Brent Fulgham 2016-04-28 09:32:29 PDT
Comment on attachment 277624 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277624&action=review

>>>>> Source/WebCore/rendering/RenderLayer.cpp:2525
>>>>> +                Ref<HTMLFrameOwnerElement> protector(*ownerElement);
>>>> 
>>>> Why isn't this ref'd at line 2513 instead? i.e.
>>>> RefPtr<HTMLFrameOwnerElement> ownerElement = renderer().document().ownerElement();
>>>> 
>>>> Also frameElementBase should probably be declared at line 2519 instead of 2516.
>>> 
>>> I figured there was no point in bothering to ref the pointer if we weren't going to use it.
>> 
>> Is there really any JS execution between this line and line 2533? Do we need to ref at all?
> 
> Note that if we don't believe there is any synchronous JS execution in this range, we could assert so using:
> NoEventDispatchAssertion assertNoEventDispatch;

Oh! I didn't know about that! That is an even better option.
Comment 36 Brent Fulgham 2016-04-28 09:47:25 PDT
Created attachment 277629 [details]
Patch
Comment 37 Chris Dumez 2016-04-28 16:20:41 PDT
Comment on attachment 277629 [details]
Patch

ok
Comment 38 Brent Fulgham 2016-04-28 16:37:38 PDT
Committed r200216: <http://trac.webkit.org/changeset/200216>
Comment 39 Darin Adler 2016-04-28 17:34:40 PDT
Comment on attachment 277629 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277629&action=review

> Source/WebCore/inspector/InspectorDOMAgent.cpp:1885
> +    Ref<Inspector::Protocol::DOM::Node> value = buildObjectForNode(frameOwner.get(), 0, &m_documentNodeToIdMap);

So wordy. Why to use auto on a line like this?
Comment 40 Brent Fulgham 2016-04-28 17:40:53 PDT
Comment on attachment 277629 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=277629&action=review

>> Source/WebCore/inspector/InspectorDOMAgent.cpp:1885
>> +    Ref<Inspector::Protocol::DOM::Node> value = buildObjectForNode(frameOwner.get(), 0, &m_documentNodeToIdMap);
> 
> So wordy. Why to use auto on a line like this?

Oh, I should have. I was just changing "frameOwner" to "frameOwner.get()" since it was changing to a RefPtr.