Bug 32839 - REGRESSION (r52446) - Crash starting an OutlivePage load
Summary: REGRESSION (r52446) - Crash starting an OutlivePage load
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P1 Major
Assignee: Nate Chapin
URL:
Keywords: Regression
Depends on:
Blocks:
 
Reported: 2009-12-21 13:30 PST by Brady Eidson
Modified: 2010-01-25 09:26 PST (History)
4 users (show)

See Also:


Attachments
Revert r52446 (22.66 KB, patch)
2009-12-21 14:19 PST, Nate Chapin
no flags Details | Formatted Diff | Diff
Patch minus the crashing (27.43 KB, patch)
2010-01-08 13:56 PST, Nate Chapin
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Brady Eidson 2009-12-21 13:30:27 PST
After updating past http://trac.webkit.org/changeset/52446 (the fix for https://bugs.webkit.org/show_bug.cgi?id=30457), I've seen a crash twice with one of these outlive page loads.  The crash is a null-deref.

I do not know exactly steps to reproduce yet (haven't tried) but it's easy to see the cause.

#0	0x10258fa9a in WebCore::ResourceRequestBase::updateResourceRequest at ResourceRequestBase.cpp:386
#1	0x10258fbb3 in WebCore::ResourceRequestBase::cachePolicy at ResourceRequestBase.cpp:132
#2	0x102629d58 in WebCore::SubresourceLoader::create at SubresourceLoader.cpp:97
#3	0x1023cd027 in WebCore::Loader::Host::servePendingRequests at loader.cpp:359
#4	0x1023cd217 in WebCore::Loader::Host::servePendingRequests at loader.cpp:313
#5	0x1023ce2b8 in WebCore::Loader::servePendingRequests at loader.cpp:191
#6	0x1023ce36c in WebCore::Loader::requestTimerFired at loader.cpp:170
#7	0x1023cfc05 in WebCore::Timer<WebCore::Loader>::fired at Timer.h:98
#8	0x102757744 in WebCore::ThreadTimers::sharedTimerFiredInternal at ThreadTimers.cpp:112
#9	0x102757889 in WebCore::ThreadTimers::sharedTimerFired at ThreadTimers.cpp:90
#10	0x1025ea6d2 in WebCore::timerFired at SharedTimerMac.mm:86
#11	0x7fff86677a58 in __CFRunLoopRun
#12	0x7fff86675c2f in CFRunLoopRunSpecific
#13	0x7fff83370a4e in RunCurrentEventLoopInMode
#14	0x7fff83370853 in ReceiveNextEventCommon
#15	0x7fff8337070c in BlockUntilNextEventMatchingListInMode
...


The interesting point of the crash is in frame #2 in SubresourceLoader::create at SubresourceLoader.cpp:97
        newRequest.setCachePolicy(fl->originalRequest().cachePolicy());

The Frame is valid, kept alive past the lifetime of the Document.
The FrameLoader is valid, attached to the Frame.
But inside FrameLoader::originalRequest() is a call to FrameLoader::activeDocumentLoader(), and the Frame has been detached so the active DocumentLoader is null.

Therefore the ResourceRequest returned is bogus.

This crash is probably a specific manifestation of a more general problem with r52446 - I think there are loads of callsites to FrameLoader::activeDocumentLoader() that don't null check the result.  That is because until now there has been an underlying assumption that FrameLoaders *ALWAYS* have an active DocumentLoader, and this change changes that very fundamental fact.  I think we'll be seeing more of these...
Comment 1 Brady Eidson 2009-12-21 13:39:15 PST
Should clarify:

Multiple callers to both FrameLoader::activeDocumentLoader() and FrameLoader::documentLoader() that don't null check, because neither were ever null before now.
Comment 2 Nate Chapin 2009-12-21 13:45:27 PST
(In reply to comment #1)
> Should clarify:
> 
> Multiple callers to both FrameLoader::activeDocumentLoader() and
> FrameLoader::documentLoader() that don't null check, because neither were ever
> null before now.

Should I revert r52446 while a solution gets worked out?
Comment 3 Darin Adler 2009-12-21 13:49:49 PST
(In reply to comment #2)
> Should I revert r52446 while a solution gets worked out?

Probably, unless we can come up with a quick solution in the next few hours.
Comment 4 Nate Chapin 2009-12-21 13:58:23 PST
(In reply to comment #3)
> (In reply to comment #2)
> > Should I revert r52446 while a solution gets worked out?
> 
> Probably, unless we can come up with a quick solution in the next few hours.

I'm assuming just keeping the document loader alive the same way r52446 keeps the frame alive doesn't count as a solution?
Comment 5 Darin Adler 2009-12-21 13:59:52 PST
(In reply to comment #4)
> I'm assuming just keeping the document loader alive the same way r52446 keeps
> the frame alive doesn't count as a solution?

It counts if it works.

Normally document loaders are closely tied to a document, so having a document loader around after the document is gone seems like it’s probably not a good thing. It might be better to have some sort of dummy document and a loader that goes along with that.
Comment 6 Nate Chapin 2009-12-21 14:03:53 PST
(In reply to comment #5)
> (In reply to comment #4)
> > I'm assuming just keeping the document loader alive the same way r52446 keeps
> > the frame alive doesn't count as a solution?
> 
> It counts if it works.
> 
> Normally document loaders are closely tied to a document, so having a document
> loader around after the document is gone seems like it’s probably not a good
> thing. It might be better to have some sort of dummy document and a loader that
> goes along with that.

That was my assumption, but I'm new enough to the loader that I wasn't sure.

I guess I'll go ahead and revert for now, since I'm off on vacation starting tomorrow and I'm not confident I can get a (good) fix in by end of day today.
Comment 7 Brady Eidson 2009-12-21 14:13:16 PST
(In reply to comment #5)
> (In reply to comment #4)
> > I'm assuming just keeping the document loader alive the same way r52446 keeps
> > the frame alive doesn't count as a solution?
> 
> It counts if it works.
> 
> Normally document loaders are closely tied to a document, so having a document
> loader around after the document is gone seems like it’s probably not a good
> thing. It might be better to have some sort of dummy document and a loader that
> goes along with that.

I think the Document->DocumentLoader relationship is pretty much as set-in-stone as the Frame->FrameLoader relationship, and we should keep it that way.

It seems like we'll be opening the door for more of these "ping" type loads in the future, and it would be best to keep them as lightweight as possible.  So I like this dummy Document/DocumentLoader idea going forward, as it would contain the ping loads inside one lightweight context instead of spreading them out over many heavyweight contexts (like keeping each full-blown DocumentLoader alive)
Comment 8 Nate Chapin 2009-12-21 14:19:11 PST
Created attachment 45349 [details]
Revert r52446
Comment 9 WebKit Review Bot 2009-12-21 14:20:04 PST
style-queue ran check-webkit-style on attachment 45349 [details] without any errors.
Comment 10 Darin Adler 2009-12-21 14:28:39 PST
Comment on attachment 45349 [details]
Revert r52446

rs=me
Comment 11 Nate Chapin 2009-12-21 14:44:15 PST
(In reply to comment #10)
> (From update of attachment 45349 [details])
> rs=me

Reverted: http://trac.webkit.org/changeset/52456.

I'm going to leave this bug opened and assigned to me for fixing the crash and re-landing.
Comment 12 Eric Seidel (no email) 2009-12-28 23:44:47 PST
Comment on attachment 45349 [details]
Revert r52446

Clearing Darin Adler's r+ on this obsolete patch.
Comment 13 Nate Chapin 2010-01-06 16:12:01 PST
Did you ever find a consistent repro case for this crash?  I've got something that might be a fix, but I've yet to find a good way to verify it.

(In reply to comment #0)
> After updating past http://trac.webkit.org/changeset/52446 (the fix for
> https://bugs.webkit.org/show_bug.cgi?id=30457), I've seen a crash twice with
> one of these outlive page loads.  The crash is a null-deref.
> 
> I do not know exactly steps to reproduce yet (haven't tried) but it's easy to
> see the cause.
> 
> #0    0x10258fa9a in WebCore::ResourceRequestBase::updateResourceRequest at
> ResourceRequestBase.cpp:386
> #1    0x10258fbb3 in WebCore::ResourceRequestBase::cachePolicy at
> ResourceRequestBase.cpp:132
> #2    0x102629d58 in WebCore::SubresourceLoader::create at
> SubresourceLoader.cpp:97
> #3    0x1023cd027 in WebCore::Loader::Host::servePendingRequests at
> loader.cpp:359
> #4    0x1023cd217 in WebCore::Loader::Host::servePendingRequests at
> loader.cpp:313
> #5    0x1023ce2b8 in WebCore::Loader::servePendingRequests at loader.cpp:191
> #6    0x1023ce36c in WebCore::Loader::requestTimerFired at loader.cpp:170
> #7    0x1023cfc05 in WebCore::Timer<WebCore::Loader>::fired at Timer.h:98
> #8    0x102757744 in WebCore::ThreadTimers::sharedTimerFiredInternal at
> ThreadTimers.cpp:112
> #9    0x102757889 in WebCore::ThreadTimers::sharedTimerFired at
> ThreadTimers.cpp:90
> #10    0x1025ea6d2 in WebCore::timerFired at SharedTimerMac.mm:86
> #11    0x7fff86677a58 in __CFRunLoopRun
> #12    0x7fff86675c2f in CFRunLoopRunSpecific
> #13    0x7fff83370a4e in RunCurrentEventLoopInMode
> #14    0x7fff83370853 in ReceiveNextEventCommon
> #15    0x7fff8337070c in BlockUntilNextEventMatchingListInMode
> ...
> 
> 
> The interesting point of the crash is in frame #2 in SubresourceLoader::create
> at SubresourceLoader.cpp:97
>         newRequest.setCachePolicy(fl->originalRequest().cachePolicy());
> 
> The Frame is valid, kept alive past the lifetime of the Document.
> The FrameLoader is valid, attached to the Frame.
> But inside FrameLoader::originalRequest() is a call to
> FrameLoader::activeDocumentLoader(), and the Frame has been detached so the
> active DocumentLoader is null.
> 
> Therefore the ResourceRequest returned is bogus.
> 
> This crash is probably a specific manifestation of a more general problem with
> r52446 - I think there are loads of callsites to
> FrameLoader::activeDocumentLoader() that don't null check the result.  That is
> because until now there has been an underlying assumption that FrameLoaders
> *ALWAYS* have an active DocumentLoader, and this change changes that very
> fundamental fact.  I think we'll be seeing more of these...
Comment 14 Nate Chapin 2010-01-07 16:07:04 PST
(In reply to comment #13)
> Did you ever find a consistent repro case for this crash?  I've got something
> that might be a fix, but I've yet to find a good way to verify it.

For future reference, I eventually found http://www.travelsupermarket.com/flights.aspx consistently crashed with my initial checkin.
Comment 15 Nate Chapin 2010-01-08 13:56:25 PST
Created attachment 46158 [details]
Patch minus the crashing

The additions to http://trac.webkit.org/changeset/52446 can be summarize as such:
Request.h/cpp : In addition to the Frame, maintain RefPtrs to the Document and DocumentLoader in the case of a request that can outlive the page.
DocumentLoader.cpp / ResourceLoader.cpp : Allow for the possibility that Frame::page() and Frame::settings() may be null (as can now be the case if the unload event is triggered by closing rather than navigating).
FrameLoader.h/cpp : Keep a count of the number of requests that may outlive the page that refer to this FrameLoader, and don't allow m_documentLoader to be nulled if that count is non-zero.

I'm not sure that this is a good solution, but it does appear to stop the crashing.  It seems to me, though, that I've allowed the possibility that, when an OutlivePage Request is executed, FrameLoader::m_documentLoader will not be the DocumentLoader that originally associated with the Request.  Is that a Bad Thing?
Comment 16 WebKit Review Bot 2010-01-08 14:03:53 PST
style-queue ran check-webkit-style on attachment 46158 [details] without any errors.
Comment 17 Nate Chapin 2010-01-25 09:24:06 PST
Comment on attachment 46158 [details]
Patch minus the crashing

I think this should've been posted on https://bugs.webkit.org/show_bug.cgi?id=30457
Comment 18 Nate Chapin 2010-01-25 09:26:52 PST
(In reply to comment #17)
> (From update of attachment 46158 [details])
> I think this should've been posted on
> https://bugs.webkit.org/show_bug.cgi?id=30457

Closing this issue since the regression was reverted and the new patch is up on the original bug.