Bug 16782 - REGRESSION(r29266): Reproducible crash in fast/replaced/image-map.html
: REGRESSION(r29266): Reproducible crash in fast/replaced/image-map.html
Status: RESOLVED FIXED
: WebKit
New Bugs
: 528+ (Nightly build)
: Macintosh Mac OS X 10.5
: P1 Normal
Assigned To:
:
: InRadar, Regression, ReviewedForRadar
:
:
  Show dependency treegraph
 
Reported: 2008-01-07 21:39 PST by
Modified: 2008-01-12 11:46 PST (History)


Attachments
Backtrace (6.85 KB, text/plain)
2008-01-07 21:39 PST, Mark Rowe (bdash)
no flags Details
Fixes crashing test (3.24 KB, patch)
2008-01-08 16:33 PST, Adam Barth
no flags Review Patch | Details | Formatted Diff | Diff
Protect native objects from deletion by self-invalidation (8.06 KB, patch)
2008-01-10 01:53 PST, mitz@webkit.org
no flags Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2008-01-07 21:39:01 PST
On Leopard I am seeing this test reproducibly crashing.  In a debug build, it occasionally generates an assertion failure:
ASSERTION FAILED: _beginCount >= 0
(/Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/JavaScriptCore/bindings/objc/objc_instance.mm:72 virtual void KJS::Bindings::ObjcInstance::end())

With guard malloc enabled, it crashes 100% of the time with the following trace:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0xd7c47fe4
0x00008384 in -[EventSendingController mouseUp] (self=0xd7c47fe0, _cmd=0x19094) at /Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/WebKitTools/DumpRenderTree/mac/EventSendingController.mm:249
249        down = NO;
(gdb) bt
#0  0x00008384 in -[EventSendingController mouseUp] (self=0xd7c47fe0, _cmd=0x19094) at /Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/WebKitTools/DumpRenderTree/mac/EventSendingController.mm:249
#1  0x937bfb7d in __invoking___ ()
#2  0x937bf568 in -[NSInvocation invoke] ()
#3  0x00008c78 in +[EventSendingController replaySavedEvents] (self=0x2a8c0, _cmd=0x1c404) at /Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/WebKitTools/DumpRenderTree/mac/EventSendingController.mm:342
#4  0x00008159 in -[EventSendingController mouseUp] (self=0xd7c47fe0, _cmd=0x19094) at /Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/WebKitTools/DumpRenderTree/mac/EventSendingController.mm:226
#5  0x937bfb7d in __invoking___ ()
#6  0x937bf568 in -[NSInvocation invoke] ()
#7  0x0033254b in KJS::Bindings::ObjcInstance::invokeMethod (this=0xd7c53fe0, exec=0xbfffe48c, methodList=@0xdc089ff0, args=@0xbfffe250) at /Volumes/Data/Home/Documents/Work/WebKit-git/OpenSource/JavaScriptCore/bindings/objc/objc_instance.mm:186

Full crash trace will be attached in a more readable format.
------- Comment #1 From 2008-01-07 21:39:48 PST -------
Created an attachment (id=18325) [details]
Backtrace
------- Comment #2 From 2008-01-08 05:29:10 PST -------
<rdar://problem/5675331>
------- Comment #3 From 2008-01-08 05:30:33 PST -------
This is currently crashing for me, and on the buildbot.
------- Comment #4 From 2008-01-08 08:12:53 PST -------
Caused by <http://trac.webkit.org/projects/webkit/changeset/29266>.
------- Comment #5 From 2008-01-08 14:30:05 PST -------
Haven't been able to reproduce the crash on Linux.  I'll try on Mac shortly.

The test has a bug:
... href="javascript:document.getElementById('result').innerHTML='area clicked'"

This javascript: URI evaluates to a non-undefined value and so replaces the current document after it sets the innerHTML property as of r29051 (matching Firefox and others).

It looks like someone is pointing to the old Document after it gets deallocated.
------- Comment #6 From 2008-01-08 16:08:46 PST -------
We think we understand what is happening now.  When the test calls eventSender.mouseUp(), the mouse up event is sent synchronously instead of unwinding the stack and sending the event from the main event loop.  HTMLAnchorElement then also loads the javascript: URL synchronously, which replaces the current document.  At this point, we're screwed because there is a bunch of code on the stack that's not expecting the document to be ripped out from under it.

I don't know if this bug can be triggered outside of a LayoutTest.  I don't know of a programmatic way to click an anchor element in WebKit (Opera provides a click() method).  document.location and window.open load asynchronously.  The remaining possibility is the submit() method on forms, which does run synchronously, but I couldn't get it to crash (need to find code somewhere that holds a stale pointer to a document).

Experiments on Firefox indicate that javascript: URLs run asynchronously.

It seems like the best route is to back out the part of the change that fixes javascript URLs that return a non-undefined value and leave in the security part of the change.  There are a bunch of other correctness bugs with javascript URLs, so it might make sense to fix them together.
------- Comment #7 From 2008-01-08 16:33:12 PST -------
Created an attachment (id=18335) [details]
Fixes crashing test

Here is a patch that backs out the portion of r29266 that replaces the document for javascript hyperlinks and forms.  This seems to fix the crashing test.
------- Comment #8 From 2008-01-08 16:35:50 PST -------
I've disabled this test for now in r29326
------- Comment #9 From 2008-01-08 21:49:39 PST -------
(From update of attachment 18335 [details])
I'm confused. This doesn't seem right. Either replacing the document synchronously is OK or it's not. If it's OK, we should figure out the real reason the crash is happening. If it's not, then there's no reason to have a FIXME comment.
------- Comment #10 From 2008-01-08 22:14:44 PST -------
We can continue investigating to see what exactly is causing the crash.  There are a couple of factors to consider:

1) You'll probably want to ship r29266 as a security update.  The attached patch attempts to revert r29266 to the minimal change necessary to fix the security bug.  Hopefully this will minimize the chance of shipping a bad security update.

2) WebKit's handling of javascript: URLs has a number of other correctness bugs, see for example <http://bugs.webkit.org/show_bug.cgi?id=16523#c13>.  It might make sense to address all these issues together in another bug/patch.
------- Comment #11 From 2008-01-09 23:22:08 PST -------
(In reply to comment #6)
> I don't know if this bug can be triggered outside of a LayoutTest.  I don't
> know of a programmatic way to click an anchor element in WebKit (Opera provides
> a click() method).

You can do it in WebKit by dispatching a mouse click DOM event to the element. When the document is replaced, scriptable objects are invalidated, which destroys the underlying runtime objects (in the case of the bug, the event sender object). So I do not know if the bug can be triggered without a native object, but I think it can happen with plug-ins as well, if they trigger the replacement.
------- Comment #12 From 2008-01-10 01:53:13 PST -------
Created an attachment (id=18361) [details]
Protect native objects from deletion by self-invalidation
------- Comment #13 From 2008-01-10 08:09:58 PST -------
(From update of attachment 18335 [details])
Thanks for taking a look at this.  Yours is a better fix than mine.
------- Comment #14 From 2008-01-10 09:12:35 PST -------
(From update of attachment 18361 [details])
r=me
------- Comment #15 From 2008-01-10 09:40:20 PST -------
Landed <http://trac.webkit.org/projects/webkit/changeset/29362>.
------- Comment #16 From 2008-01-11 23:08:01 PST -------
The test is crashing again, with a different backtrace. <http://build.webkit.org/results/trunk-mac-ppc-release/10257/results.html>. Investigating.
------- Comment #17 From 2008-01-11 23:37:54 PST -------
I was wronf(In reply to comment #9)
> Either replacing the document
> synchronously is OK or it's not.

According to Maciej and Geoff on irc, it is not OK.
------- Comment #18 From 2008-01-11 23:40:38 PST -------
Given the above, the fix should be along the lines of Adam Barth's patch.
------- Comment #19 From 2008-01-12 09:53:08 PST -------
(In reply to comment #17)
> I was wronf(In reply to comment #9)
> > Either replacing the document
> > synchronously is OK or it's not.
> 
> According to Maciej and Geoff on irc, it is not OK.

The comment needs to be corrected then. It says "FIXME: We should do this synchronously." But we should not!
------- Comment #20 From 2008-01-12 10:30:57 PST -------
Fixed in <http://trac.webkit.org/projects/webkit/changeset/29432> (but tests results not rolled back to reflect the old behavior).
------- Comment #21 From 2008-01-12 10:40:46 PST -------
I'll open a new bug to fix the correctness problems with javascript URLs.
------- Comment #22 From 2008-01-12 11:46:56 PST -------
(In reply to comment #21)
> I'll open a new bug to fix the correctness problems with javascript URLs.
> 

Thanks. That's bug 16855.