Bug 30839

Summary: REGRESSION: crashes in WebCore::RedirectScheduler::timerFired(WebCore::Timer<WebCore::RedirectScheduler>*)
Product: WebKit Reporter: Simon Fraser (smfr) <simon.fraser>
Component: Page LoadingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, beidson, commit-queue, darin, hybrid.one
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Mac   
OS: OS X 10.5   
URL: http://www.linkinpark.it/forum/index.php
Attachments:
Description Flags
Patch v1 none

Simon Fraser (smfr)
Reported 2009-10-27 15:51:13 PDT
Nameless on #webkit is reporting crashes here: Exception Type: EXC_BAD_ACCESS (SIGSEGV) Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000092 Crashed Thread: 0 Dispatch queue: com.apple.main-thread Thread 0 Crashed: Dispatch queue: com.apple.main-thread 0 com.apple.WebCore 0x0000000100dc289a WebCore::RedirectScheduler::timerFired(WebCore::Timer<WebCore::RedirectScheduler>*) + 42 1 com.apple.WebCore 0x0000000100fc6767 WebCore::ThreadTimers::sharedTimerFiredInternal() + 151 2 com.apple.WebCore 0x0000000100ed9e05 WebCore::timerFired(__CFRunLoopTimer*, void*) + 53 3 com.apple.CoreFoundation 0x00007fff85003a78 __CFRunLoopRun + 5480 4 com.apple.CoreFoundation 0x00007fff8500203f CFRunLoopRunSpecific + 575 5 com.apple.HIToolbox 0x00007fff837ffc4e RunCurrentEventLoopInMode + 333 6 com.apple.HIToolbox 0x00007fff837ffa53 ReceiveNextEventCommon + 310 7 com.apple.HIToolbox 0x00007fff837ff90c BlockUntilNextEventMatchingListInMode + 59 8 com.apple.AppKit 0x00007fff82804520 _DPSNextEvent + 718 9 com.apple.AppKit 0x00007fff82803e89 -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] + 155 10 com.apple.Safari 0x000000010000bcf0 0x100000000 + 48368 11 com.apple.AppKit 0x00007fff827c9a7d -[NSApplication run] + 395 12 com.apple.AppKit 0x00007fff827c2798 NSApplicationMain + 364 13 com.apple.Safari 0x0000000100001d0c 0x100000000 + 7436 (s)he says it happens at http://www.linkinpark.it/forum/index.php, in various places in those fora. Regression from bug 30424?
Attachments
Patch v1 (1.13 KB, patch)
2009-10-28 01:08 PDT, Adam Barth
no flags
Simon Fraser (smfr)
Comment 1 2009-10-27 15:52:49 PDT
It's not clear that bug 30424 affected this.
Brady Eidson
Comment 2 2009-10-27 15:53:41 PDT
Probably not a regression from https://bugs.webkit.org/show_bug.cgi?id=30424 - we've had this crash in radar pretty much since Adam added the redirect scheduler. (see <rdar://problem/7313414>)
Nameless
Comment 3 2009-10-27 16:04:04 PDT
I am running Version 4.0.3 (6531.9, r50124) I've updated the browser 3 times, but the problem persists. The Segmentation Fault happens when browsing the phpbb3 forum, but it's general. Sometimes happens when i'm only browsing, others when i'm replying to a post. I don't really know
Nameless
Comment 4 2009-10-27 16:08:54 PDT
(In reply to comment #3) > I am running Version 4.0.3 (6531.9, r50124) > I've updated the browser 3 times, but the problem persists. > The Segmentation Fault happens when browsing the phpbb3 forum, but it's > general. > Sometimes happens when i'm only browsing, others when i'm replying to a post. > I don't really know Oh i forgot. i'm running Snow Leopard
Brady Eidson
Comment 5 2009-10-27 16:51:14 PDT
I'm on a debug build of r50166 and I simply can't reproduce this. I haven't tried replying to a comment yet, but according to the reporter that doesn't actually reliably reproduce it for him, either.
Darin Adler
Comment 6 2009-10-27 17:28:11 PDT
The invalid address at 0x92 indicates we’re dereferencing a null pointer with an offset of 0x92. And defersLoading is at an offset of 0x92. So I’m pretty sure this crash means that RedirectScheduler::timerFired is being called when m_frame->page() is 0. One thing we can do about this right away is to add a return statement for cases where page is 0 after the assertion. Then we can still look for this bug in debug builds, but possible render it harmless in release builds. Another thing we can do is do the thought experiment about how m_frame->page() could become zero without telling the RedirectScheduler to stop its timer.
Darin Adler
Comment 7 2009-10-27 17:41:14 PDT
I see nothing that would guarantee that RedirectScheduleer::clear() is called on the frames in a page when a page is destroyed. The Frame::pageDestroyed() function calls loader()->checkLoadComplete() on the parent frame but otherwise does nothing to stop the loader or clear the redirect scheduler. There’s also CachedFrame::destroy(). My guess is that case already guarantees the loading has stopped somehow, including clearing the redirect scheduler, but I'm not sure about that case. The lowest level way to fix this would be to just check for 0 in the timerFired function. Moving up the food chain, we could add a call to RedirectScheduler::clear() in Frame::detachFromPage(). Or we could add one to Frame::pageDestroyed() if we satisfy ourselves it’s not needed in CachedFrame::destroy(). If we could figure out a clean existing or new higher level concept of when we should stop loading due to the page going away, including clearing the scheduled redirect, then we could just fix that to work properly. Given we can’t reproduce the crash reliably, we may not be able to make a test case, but this is trivial to fix!
Adam Barth
Comment 8 2009-10-28 00:08:14 PDT
Huh... I thought I commented on this bug already. Anyway, I'm looking into it.
Adam Barth
Comment 9 2009-10-28 01:08:31 PDT
Created attachment 42004 [details] Patch v1
Adam Barth
Comment 10 2009-10-28 01:12:02 PDT
Adding the null check seems more robust than making sure some elaborate set of cancelations happens correctly. How can I test this behavior? Is there a good way to hold on to a Frame after it's page has died? Nothing jumped out at me when I grepped the code base. The most promising thing I saw was Database callbacks...
Darin Adler
Comment 11 2009-10-28 07:39:39 PDT
Comment on attachment 42004 [details] Patch v1 I would have put the Page* into a local variable. But Maciej has chided me about that and told me to just trust the compiler to optimize the common subexpression in a simple, inlined case like this.
Darin Adler
Comment 12 2009-10-28 07:41:47 PDT
(In reply to comment #10) > How can I test this behavior? Is there a good way to hold on to a Frame after > it's page has died? Nothing jumped out at me when I grepped the code base. > The most promising thing I saw was Database callbacks... I really have no idea. We could try to read some of the code at <http://www.linkinpark.it/forum/index.php> for clues. For the most part, I know about these "frame afterlife" issues from crash traces.
Adam Barth
Comment 13 2009-10-28 12:45:29 PDT
Comment on attachment 42004 [details] Patch v1 Long term, we might want to consider adding some testing specific APIs to WebKit like "please hold a reference to this Frame and keep it alive." Issues like this are easier to test with unit tests. Maybe we should consider unit tests in addition to LayoutTests? I like the LayoutTest model, but there are a lot of branches for null pages. It's sad if we don't have any way of testing them.
Darin Adler
Comment 14 2009-10-28 16:25:47 PDT
(In reply to comment #13) > I like the LayoutTest model, but there > are a lot of branches for null pages. It's sad if we don't have any way of > testing them. We can probably crack the code and figure out how to test them. I’ll think about it. One of the easiest ways to find a way to trigger a condition is to stick in an assertion and then run the browser under the debugger for a while.
WebKit Commit Bot
Comment 15 2009-10-28 17:22:23 PDT
Comment on attachment 42004 [details] Patch v1 Clearing flags on attachment: 42004 Committed r50251: <http://trac.webkit.org/changeset/50251>
WebKit Commit Bot
Comment 16 2009-10-28 17:22:27 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.