Bug 186081 - REGRESSION(r216119): DocumentLoader::detachFromFrame still encounters nullptr frame
Summary: REGRESSION(r216119): DocumentLoader::detachFromFrame still encounters nullptr...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Brent Fulgham
URL:
Keywords: InRadar, Regression
Depends on: 171604
Blocks:
  Show dependency treegraph
 
Reported: 2018-05-29 21:09 PDT by Brent Fulgham
Modified: 2018-05-31 14:20 PDT (History)
7 users (show)

See Also:


Attachments
Patch (1.69 KB, patch)
2018-05-29 21:25 PDT, Brent Fulgham
no flags Details | Formatted Diff | Diff
Patch (1.99 KB, patch)
2018-05-30 15:19 PDT, Brent Fulgham
no flags Details | Formatted Diff | Diff
Patch (1.92 KB, patch)
2018-05-31 12:55 PDT, Brent Fulgham
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Brent Fulgham 2018-05-29 21:09:31 PDT
In Bug 171604, we removed a nullptr check that we deemed unnecessary due to reentrancy checks and assertions that confirmed that the current frame is valid.

However, long term monitoring of crash data indicates that we are still encountering nullptr frames in this call stack.

It's unclear if PolicyChecker::stopCheck() can cause m_frame to be nulled out somehow. Since m_frame is Ref’d in this method, it doesn’t seem like FrameDestructionObserver could be causing this.

Could one of these be happening?
(1) DocumentLoader is being told to observe a different frame?
(2) DocumentLoader is being destroyed, which would set m_frame to nullptr?

We should put the nullptr check back, and add additional assertions to help catch this case in the wild.
Comment 1 Brent Fulgham 2018-05-29 21:25:59 PDT
Created attachment 341557 [details]
Patch
Comment 2 Darin Adler 2018-05-29 21:59:48 PDT
Comment on attachment 341557 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=341557&action=review

> Source/WebCore/loader/DocumentLoader.cpp:1187
> +    ASSERT(m_frame && m_frame->refCount() >= 1);

No need to include "m_frame &&" in this second assertion since the one above will already have terminated the process if m_frame is nullptr.

I don’t think this refCount assertion is valuable, because RefCountedBase never sets a reference count to 0, so that assertion will only fail if there is some kind of memory smashing bug. Instead we want to write something more like:

    ASSERT(!m_frame->deletionHasBegun());

> Source/WebCore/loader/DocumentLoader.cpp:1192
> +    ASSERT(m_frame && m_frame->refCount() >= 1);

Ditto.
Comment 3 Brent Fulgham 2018-05-30 08:27:19 PDT
Comment on attachment 341557 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=341557&action=review

>> Source/WebCore/loader/DocumentLoader.cpp:1187
>> +    ASSERT(m_frame && m_frame->refCount() >= 1);
> 
> No need to include "m_frame &&" in this second assertion since the one above will already have terminated the process if m_frame is nullptr.
> 
> I don’t think this refCount assertion is valuable, because RefCountedBase never sets a reference count to 0, so that assertion will only fail if there is some kind of memory smashing bug. Instead we want to write something more like:
> 
>     ASSERT(!m_frame->deletionHasBegun());

I originally did try that, only to realize that m_frame is a ThreadSafeRefCounted, not RefCounted, and does not have a 'deletionHasBegun' member. Instead, if deletion has begun refCount() will return 0.
Comment 4 Brent Fulgham 2018-05-30 13:45:12 PDT
<rdar://problem/34918109>
Comment 5 Brent Fulgham 2018-05-30 14:02:39 PDT
I sat down wit Dan Bates and discussed the issue, and learned the following:

1. If the DocumentLoader is in 'm_waitingForNavigationPolicy', and we call 'policyChecker().stopCheck()', we will clear the frame as part of the cleanup phase.

2. If we are in 'm_waitingForContentPolicy', and we call stopCheck, we will also null out the m_frame.

So the two most common crash traces are fully explained, since 'cancelPolicyCheckIfneeded()' can cause m_frame to become nullptr.

Instead, we should pass 'protectedFrame' to the InspectorInstrumentation::loaderDetachedFromFrame, since that is guaranteed by the release assert to be non-null at the start of the method, and should stay non-null even while m_frame might be cleared out.
Comment 6 Brent Fulgham 2018-05-30 15:19:32 PDT
Created attachment 341610 [details]
Patch
Comment 7 Daniel Bates 2018-05-30 16:03:19 PDT
(In reply to Brent Fulgham from comment #5)
> I sat down wit Dan Bates and discussed the issue, and learned the following:
> 
> 1. If the DocumentLoader is in 'm_waitingForNavigationPolicy', and we call
> 'policyChecker().stopCheck()', we will clear the frame as part of the
> cleanup phase.
> 
> 2. If we are in 'm_waitingForContentPolicy', and we call stopCheck, we will
> also null out the m_frame.
> 
Notice that DocumentLoader::detachFromFrame() is the only function that nullifies DocumentLoader::m_frame (via observeFrame(nullptr)). m_frame is being cleared because cancelPolicyCheckIfNeeded() is causing us to re-enter this function. (Note that it cannot be code above cancelPolicyCheckIfNeeded() that re-enters this function because we would see a RELEASE_ASSERT() crash in cancelPolicyCheckIfNeeded() at <https://trac.webkit.org/browser/trunk/Source/WebCore/loader/DocumentLoader.cpp?rev=232147#L1810> and I have not seen reports for such crashes).

> So the two most common crash traces are fully explained, since
> 'cancelPolicyCheckIfneeded()' can cause m_frame to become nullptr.
> 
> Instead, we should pass 'protectedFrame' to the
> InspectorInstrumentation::loaderDetachedFromFrame, since that is guaranteed
> by the release assert to be non-null at the start of the method, and should
> stay non-null even while m_frame might be cleared out.

I inadvertently suggested using protectedFrame. Instead we should just bail out early after cancelPolicyCheckIfneeded() is called if m_frame is null.

Any chance we can write a test for this change?
Comment 8 Brent Fulgham 2018-05-31 12:55:35 PDT
Created attachment 341686 [details]
Patch
Comment 9 David Kilzer (:ddkilzer) 2018-05-31 13:33:45 PDT
Comment on attachment 341686 [details]
Patch

rs=me, but please file a bugs.webkit.org bug to track creation of a test.
Comment 10 David Kilzer (:ddkilzer) 2018-05-31 13:36:07 PDT
(In reply to David Kilzer (:ddkilzer) from comment #9)
> Comment on attachment 341686 [details]
> Patch
> 
> rs=me, but please file a bugs.webkit.org bug to track creation of a test.

And/or to implement a better fix, then add a FIXME comment to that code when you file the bug.
Comment 11 WebKit Commit Bot 2018-05-31 13:54:31 PDT
Comment on attachment 341686 [details]
Patch

Clearing flags on attachment: 341686

Committed r232370: <https://trac.webkit.org/changeset/232370>
Comment 12 WebKit Commit Bot 2018-05-31 13:54:32 PDT
All reviewed patches have been landed.  Closing bug.