Bug 23551

Summary: Crash on page load with profiler enabled and running
Product: WebKit Reporter: zextra
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Major CC: eric, ggaren, mrowe, oliver, thsutton, yves.vangoethem, zwarich
Priority: P1 Keywords: InRadar, NeedsReduction
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Windows XP   
URL: http://www.askgamblers.com
Attachments:
Description Flags
Crash log generated by dr. watson 32
none
dump file generated by dr. watson 32
none
Proposed patch
zwarich: review-
Real proposed patch
none
With a better ChangeLog oliver: review+

zextra
Reported 2009-01-26 14:21:05 PST
When I try opening this page (or any other page on this website), with profiler enabled and running, safari just crashes. It seems to me that it's javascript-related (judging by crash log made using drwtsn32). It has been confirmed by other person on different platform.
Attachments
Crash log generated by dr. watson 32 (51.98 KB, text/plain)
2009-01-26 14:22 PST, zextra
no flags
dump file generated by dr. watson 32 (43.57 KB, application/octet-stream)
2009-01-26 14:22 PST, zextra
no flags
Proposed patch (8.66 KB, patch)
2009-01-29 13:10 PST, Cameron Zwarich (cpst)
zwarich: review-
Real proposed patch (1.72 KB, patch)
2009-01-29 13:12 PST, Cameron Zwarich (cpst)
no flags
With a better ChangeLog (1.90 KB, patch)
2009-01-29 13:15 PST, Cameron Zwarich (cpst)
oliver: review+
zextra
Comment 1 2009-01-26 14:22:01 PST
Created attachment 27046 [details] Crash log generated by dr. watson 32
zextra
Comment 2 2009-01-26 14:22:45 PST
Created attachment 27048 [details] dump file generated by dr. watson 32
Mark Rowe (bdash)
Comment 3 2009-01-26 14:25:10 PST
Cameron Zwarich (cpst)
Comment 4 2009-01-28 23:40:03 PST
This can be triggered by loading about:blank, starting a profile, and reloading about:blank again and again (with the inspector window open) until it crashes. I don't think it has much to do with the page. All of these "profiler crashes during navigation" bugs seem to be the same bug. I'll mark them all duplicates of this one, because this one has an associated Radar.
Cameron Zwarich (cpst)
Comment 5 2009-01-28 23:40:22 PST
*** Bug 23204 has been marked as a duplicate of this bug. ***
Cameron Zwarich (cpst)
Comment 6 2009-01-28 23:41:18 PST
*** Bug 23447 has been marked as a duplicate of this bug. ***
Cameron Zwarich (cpst)
Comment 7 2009-01-28 23:50:45 PST
The problematic profiler call is in Interpreter::execute(FunctionBodyNode*, ...): if (*profiler) (*profiler)->didExecute(newCallFrame, function); Profiler::didExecute() calls a method on newCallFrame->lexicalGlobalObject(), which is garbage. The cause seems to be that part of the ScopeChain is deleted. Enabling COLLECT_ON_EVERY_ALLOCATION doesn't seem to make it more reproducible.
Cameron Zwarich (cpst)
Comment 8 2009-01-29 01:02:12 PST
I am having a lot of trouble tracking this one down. I'll unassign it to myself so someone else who has a bit more luck can fix it.
Cameron Zwarich (cpst)
Comment 9 2009-01-29 02:06:14 PST
So, this appears to be a difference in ScopeChain handling between Debug and Release. It's not a random compiler bug, because I still get it with NDEBUG and -O0. Printing out the ScopeChain right before the offending profiler hook along with ref counts shows this difference. I have no clue what is causing this. Oliver suspects it is a difference in conservative GC marking that is saving the day. Since ScopeChainNodes aren't allocated on the JS heap, it would have to be some other object, like a closure, keeping the ScopeChainNode alive. Geoff, you seem to be the only one who really understands the ScopeChainNode management scheme. Any thoughts?
Geoffrey Garen
Comment 10 2009-01-29 10:39:14 PST
I don't understand how you could trigger this problem just by loading about:blank, since you shouldn't execute any JavaScript, let alone Interpreter::execute(FunctionBodyNode*,...), which only gets called for special direct invocations of functions, like event handlers. That said, a function derefs its scope chain before returning, so calling "didExecute(newCallFrame...)" might provide a garbage scope chain to the profiler. That would explain why lexicalGlobalObject() could crash. The best solution is probably to pass "callFrame" rather than "newCallFrame" to the profiler.
Geoffrey Garen
Comment 11 2009-01-29 10:39:55 PST
Afterwards, you should probably check whether the other crashes are truly dups.
Cameron Zwarich (cpst)
Comment 12 2009-01-29 11:23:30 PST
> I don't understand how you could trigger this problem just by loading > about:blank, since you shouldn't execute any JavaScript, let alone > Interpreter::execute(FunctionBodyNode*,...), which only gets called for special > direct invocations of functions, like event handlers. I assumed it was due to the code that the profiler executes when a page loads. > The best solution is probably to pass "callFrame" rather than "newCallFrame" to > the profiler. Won't that give the profiler the wrong lexical global object in some cases, giving the profiler the wrong profile group? Is that a problem? All fast/profiler tests pass and the crashes goes away. I still don't understand the difference between Debug and Release.
Cameron Zwarich (cpst)
Comment 13 2009-01-29 11:36:29 PST
I added an assertion that the profile groups are the same, and just from browsing the web they seem to be. Is this always going to be the case, Geoff?
Geoffrey Garen
Comment 14 2009-01-29 12:57:48 PST
> > The best solution is probably to pass "callFrame" rather than "newCallFrame" to > > the profiler. > > Won't that give the profiler the wrong lexical global object in some cases, > giving the profiler the wrong profile group? Is that a problem? All > fast/profiler tests pass and the crashes goes away. I would pass "callFrame" both to willExecute and to didExecute. That will give the profiler the lexical global object of the caller, which is what we want. (We don't want to profile the function call if the caller is not being profiled.) It *would* be a slight error to pass newCallFrame to willExecute and callFrame to didExecute. I believe Interpreter::execute for EvalNode and ProgramNode make this mistake, and should probably be fixed.
Cameron Zwarich (cpst)
Comment 15 2009-01-29 13:00:37 PST
Okay. I'll put a patch up for review.
Cameron Zwarich (cpst)
Comment 16 2009-01-29 13:10:55 PST
Created attachment 27156 [details] Proposed patch Here's the fix. Is there a way to test the difference in behaviour with the didExecute() call in DRT, even if I can't get the crash?
Cameron Zwarich (cpst)
Comment 17 2009-01-29 13:12:12 PST
Comment on attachment 27156 [details] Proposed patch Oops. Wrong profiler.diff on my machine. This already landed. ;-)
Cameron Zwarich (cpst)
Comment 18 2009-01-29 13:12:57 PST
Created attachment 27157 [details] Real proposed patch Here's the fix. Is there a way to test the difference in behaviour with the didExecute() call in DRT, even if I can't get the crash?
Cameron Zwarich (cpst)
Comment 19 2009-01-29 13:15:51 PST
Created attachment 27158 [details] With a better ChangeLog I didn't hit command-S on the ChangeLog. ;-)
Cameron Zwarich (cpst)
Comment 20 2009-01-29 19:10:10 PST
Landed in r40396.
Note You need to log in before you can comment on or make changes to this bug.