Bug 16782 - REGRESSION(r29266): Reproducible crash in fast/replaced/image-map.html
Summary: REGRESSION(r29266): Reproducible crash in fast/replaced/image-map.html
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
: P1 Normal
Assignee: Nobody
URL:
Keywords: InRadar, Regression
Depends on:
Blocks:
 
Reported: 2008-01-07 21:39 PST by Mark Rowe (bdash)
Modified: 2008-01-12 11:46 PST (History)
5 users (show)

See Also:


Attachments
Backtrace (6.85 KB, text/plain)
2008-01-07 21:39 PST, Mark Rowe (bdash)
no flags Details
Fixes crashing test (3.24 KB, patch)
2008-01-08 16:33 PST, Adam Barth
no flags Details | Formatted Diff | Diff
Protect native objects from deletion by self-invalidation (8.06 KB, patch)
2008-01-10 01:53 PST, mitz
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Rowe (bdash) 2008-01-07 21:39:01 PST
On Leopard I am seeing this test reproducibly crashing.  In a debug build, it occasionally generates an assertion failure:
ASSERTION FAILED: _beginCount >= 0
(/Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/JavaScriptCore/bindings/objc/objc_instance.mm:72 virtual void KJS::Bindings::ObjcInstance::end())

With guard malloc enabled, it crashes 100% of the time with the following trace:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0xd7c47fe4
0x00008384 in -[EventSendingController mouseUp] (self=0xd7c47fe0, _cmd=0x19094) at /Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/WebKitTools/DumpRenderTree/mac/EventSendingController.mm:249
249	    down = NO;
(gdb) bt
#0  0x00008384 in -[EventSendingController mouseUp] (self=0xd7c47fe0, _cmd=0x19094) at /Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/WebKitTools/DumpRenderTree/mac/EventSendingController.mm:249
#1  0x937bfb7d in __invoking___ ()
#2  0x937bf568 in -[NSInvocation invoke] ()
#3  0x00008c78 in +[EventSendingController replaySavedEvents] (self=0x2a8c0, _cmd=0x1c404) at /Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/WebKitTools/DumpRenderTree/mac/EventSendingController.mm:342
#4  0x00008159 in -[EventSendingController mouseUp] (self=0xd7c47fe0, _cmd=0x19094) at /Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/WebKitTools/DumpRenderTree/mac/EventSendingController.mm:226
#5  0x937bfb7d in __invoking___ ()
#6  0x937bf568 in -[NSInvocation invoke] ()
#7  0x0033254b in KJS::Bindings::ObjcInstance::invokeMethod (this=0xd7c53fe0, exec=0xbfffe48c, methodList=@0xdc089ff0, args=@0xbfffe250) at /Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/JavaScriptCore/bindings/objc/objc_instance.mm:186

Full crash trace will be attached in a more readable format.
Comment 1 Mark Rowe (bdash) 2008-01-07 21:39:48 PST
Created attachment 18325 [details]
Backtrace
Comment 2 Mark Rowe (bdash) 2008-01-08 05:29:10 PST
<rdar://problem/5675331>
Comment 3 Maciej Stachowiak 2008-01-08 05:30:33 PST
This is currently crashing for me, and on the buildbot.
Comment 4 Mark Rowe (bdash) 2008-01-08 08:12:53 PST
Caused by <http://trac.webkit.org/projects/webkit/changeset/29266>.
Comment 5 Adam Barth 2008-01-08 14:30:05 PST
Haven't been able to reproduce the crash on Linux.  I'll try on Mac shortly.

The test has a bug:
... href="javascript:document.getElementById('result').innerHTML='area clicked'"

This javascript: URI evaluates to a non-undefined value and so replaces the current document after it sets the innerHTML property as of r29051 (matching Firefox and others).

It looks like someone is pointing to the old Document after it gets deallocated.
Comment 6 Adam Barth 2008-01-08 16:08:46 PST
We think we understand what is happening now.  When the test calls eventSender.mouseUp(), the mouse up event is sent synchronously instead of unwinding the stack and sending the event from the main event loop.  HTMLAnchorElement then also loads the javascript: URL synchronously, which replaces the current document.  At this point, we're screwed because there is a bunch of code on the stack that's not expecting the document to be ripped out from under it.

I don't know if this bug can be triggered outside of a LayoutTest.  I don't know of a programmatic way to click an anchor element in WebKit (Opera provides a click() method).  document.location and window.open load asynchronously.  The remaining possibility is the submit() method on forms, which does run synchronously, but I couldn't get it to crash (need to find code somewhere that holds a stale pointer to a document).

Experiments on Firefox indicate that javascript: URLs run asynchronously.

It seems like the best route is to back out the part of the change that fixes javascript URLs that return a non-undefined value and leave in the security part of the change.  There are a bunch of other correctness bugs with javascript URLs, so it might make sense to fix them together.
Comment 7 Adam Barth 2008-01-08 16:33:12 PST
Created attachment 18335 [details]
Fixes crashing test

Here is a patch that backs out the portion of r29266 that replaces the document for javascript hyperlinks and forms.  This seems to fix the crashing test.
Comment 8 Adam Roben (:aroben) 2008-01-08 16:35:50 PST
I've disabled this test for now in r29326

Comment 9 Darin Adler 2008-01-08 21:49:39 PST
Comment on attachment 18335 [details]
Fixes crashing test

I'm confused. This doesn't seem right. Either replacing the document synchronously is OK or it's not. If it's OK, we should figure out the real reason the crash is happening. If it's not, then there's no reason to have a FIXME comment.
Comment 10 Adam Barth 2008-01-08 22:14:44 PST
We can continue investigating to see what exactly is causing the crash.  There are a couple of factors to consider:

1) You'll probably want to ship r29266 as a security update.  The attached patch attempts to revert r29266 to the minimal change necessary to fix the security bug.  Hopefully this will minimize the chance of shipping a bad security update.

2) WebKit's handling of javascript: URLs has a number of other correctness bugs, see for example <http://bugs.webkit.org/show_bug.cgi?id=16523#c13>.  It might make sense to address all these issues together in another bug/patch.
Comment 11 mitz 2008-01-09 23:22:08 PST
(In reply to comment #6)
> I don't know if this bug can be triggered outside of a LayoutTest.  I don't
> know of a programmatic way to click an anchor element in WebKit (Opera provides
> a click() method).

You can do it in WebKit by dispatching a mouse click DOM event to the element. When the document is replaced, scriptable objects are invalidated, which destroys the underlying runtime objects (in the case of the bug, the event sender object). So I do not know if the bug can be triggered without a native object, but I think it can happen with plug-ins as well, if they trigger the replacement.
Comment 12 mitz 2008-01-10 01:53:13 PST
Created attachment 18361 [details]
Protect native objects from deletion by self-invalidation
Comment 13 Adam Barth 2008-01-10 08:09:58 PST
Comment on attachment 18335 [details]
Fixes crashing test

Thanks for taking a look at this.  Yours is a better fix than mine.
Comment 14 Darin Adler 2008-01-10 09:12:35 PST
Comment on attachment 18361 [details]
Protect native objects from deletion by self-invalidation

r=me
Comment 16 mitz 2008-01-11 23:08:01 PST
The test is crashing again, with a different backtrace. <http://build.webkit.org/results/trunk-mac-ppc-release/10257/results.html>. Investigating.
Comment 17 mitz 2008-01-11 23:37:54 PST
I was wronf(In reply to comment #9)
> Either replacing the document
> synchronously is OK or it's not.

According to Maciej and Geoff on irc, it is not OK.
Comment 18 mitz 2008-01-11 23:40:38 PST
Given the above, the fix should be along the lines of Adam Barth's patch.
Comment 19 Darin Adler 2008-01-12 09:53:08 PST
(In reply to comment #17)
> I was wronf(In reply to comment #9)
> > Either replacing the document
> > synchronously is OK or it's not.
> 
> According to Maciej and Geoff on irc, it is not OK.

The comment needs to be corrected then. It says "FIXME: We should do this synchronously." But we should not!
Comment 20 mitz 2008-01-12 10:30:57 PST
Fixed in <http://trac.webkit.org/projects/webkit/changeset/29432> (but tests results not rolled back to reflect the old behavior).
Comment 21 Adam Barth 2008-01-12 10:40:46 PST
I'll open a new bug to fix the correctness problems with javascript URLs.
Comment 22 mitz 2008-01-12 11:46:56 PST
(In reply to comment #21)
> I'll open a new bug to fix the correctness problems with javascript URLs.
> 

Thanks. That's bug 16855.