I don't think any WebKit developers have MediaTemple accounts which makes tracking down the issue a little bit trickier. The more information we can get about the crash and when it started occurring, the easier it will be to fix. Would it be possible for you to try a few older nightly builds to try and narrow down roughly when the crash started happening?
Does this happen with all non-Adobe/Macromedia third-party extensions removed? (e.g., com.stclairsoft.DefaultFolderX, com.yousoftware.youhelper, com.unmarked.textsoap.osax, URIEscapeOSAX, com.satimage.Numerics et al, SmartWrap, Cocoa Patcher, com.ioxperts.vdig.webcam)
I can try removing all the third-party extensions listed. However, these have been in use for some time without apparent problems, and Safari Version 3.1.2 (5525.20.1) does not crash under identical circumstances with these extensions installed. I'll try removing them regardless.
(In reply to comment #3)
> Does this happen with all non-Adobe/Macromedia third-party extensions removed?
> (e.g., com.stclairsoft.DefaultFolderX, com.yousoftware.youhelper,
> com.unmarked.textsoap.osax, URIEscapeOSAX, com.satimage.Numerics et al,
> SmartWrap, Cocoa Patcher, com.ioxperts.vdig.webcam)
>
r34944 crashes, but r34798 does not crash - I'll move forward from r34798 to find the first version that starts crashing...
(In reply to comment #1)
> I don't think any WebKit developers have MediaTemple accounts which makes
> tracking down the issue a little bit trickier. The more information we can get
> about the crash and when it started occurring, the easier it will be to fix.
> Would it be possible for you to try a few older nightly builds to try and
> narrow down roughly when the crash started happening?
>
I tested several earlier nightly builds, and the problem appears to have started with the July 2nd r34941 build. Builds r34798, r34822, and r34824 did not crash when I tested them per your suggestion.
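For anyone narrowing down a regression the same way, the search over nightly builds can be done as a binary search rather than a linear walk. The following is only a sketch: `firstCrashingBuild` and the `crashes` predicate are hypothetical names, and it assumes the builds are sorted by revision and that every build after the first bad one also crashes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper, not part of any WebKit tooling.
// Returns the index of the first build for which crashes(build) is true,
// assuming all earlier builds are good and all later builds also crash.
// Returns builds.size() if no build crashes.
std::size_t firstCrashingBuild(const std::vector<int>& builds,
                               const std::function<bool(int)>& crashes)
{
    assert(crashes && "a crash predicate must be supplied");
    std::size_t lo = 0;
    std::size_t hi = builds.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (crashes(builds[mid]))
            hi = mid;      // first bad build is at mid or earlier
        else
            lo = mid + 1;  // first bad build is after mid
    }
    return lo;
}
```

In practice `crashes` would mean "launch that nightly and try the reproduction steps by hand"; the point is only that each test halves the remaining range.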
(In reply to comment #1)
> I don't think any WebKit developers have MediaTemple accounts which makes
> tracking down the issue a little bit trickier. The more information we can get
> about the crash and when it started occurring, the easier it will be to fix.
> Would it be possible for you to try a few older nightly builds to try and
> narrow down roughly when the crash started happening?
>
Deirdre, I tried r35024 again without the third-party add-ins you listed on a "secondary" drive (the "primary" drive which produced the previously reported problems crashed on re-boot and I had to do an "archive and install" to restore functionality). Since the "secondary" drive is not set up identically to the "primary" drive, I'll remove the add-ins from the "primary" and try again to reduce any possible other variables from influencing the results. I'll post those results shortly.
(In reply to comment #11)
> Craig, please remove all third-party add-ins, then see if the crash reproduces.
>
If it will help, I will set up a login for one of my MediaTemple hosting accounts and make it available to WebKit developers, provided the user ID and password are kept confidential and not posted publicly. Please let me know if you'd like me to provide this.
(In reply to comment #1)
> I don't think any WebKit developers have MediaTemple accounts which makes
> tracking down the issue a little bit trickier. The more information we can get
> about the crash and when it started occurring, the easier it will be to fix.
> Would it be possible for you to try a few older nightly builds to try and
> narrow down roughly when the crash started happening?
>
Craig, thanks for the crash logs. I don't think we'll be needing any more of those at this point :-) I would appreciate it if you could set up an account like you described so that we can reproduce the problem directly. Feel free to email the details of it to me.
Based on the backtrace and disassembly, it looks like convertValueToNPVariant has been called with a null "value" argument. This implies that the line "JSValue* resultV = call(exec, function, callType, callData, obj->imp, argList);" inside _NPN_Invoke is returning null.
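To make the failure mode concrete, here is a simplified, self-contained model of it. None of these names are WebKit APIs: `Value`, `Variant`, `Callee`, and `invokeGuarded` are hypothetical stand-ins for the JavaScript value, the NPVariant, the script call, and the invoke path. The sketch only shows what a defensive null check in front of the conversion step would look like; it is not the approach taken by the patch later in this thread.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins, not the real JavaScriptCore or NPAPI types.
struct Value { double number; };
struct Variant {
    bool isVoid = true;
    double number = 0;
};

// Models the script call: may return null when the call did not run.
using Callee = std::function<const Value*()>;

// Calls the script and converts the result, refusing to dereference null.
bool invokeGuarded(const Callee& callee, Variant* result)
{
    assert(result);
    const Value* value = callee();
    if (!value) {
        // The unguarded path would dereference null here and crash.
        *result = Variant{};
        return false;
    }
    result->isVoid = false;
    result->number = value->number;
    return true;
}
```

A guard like this would only hide the symptom; it says nothing about why the call returned null in the first place.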
Mark, I set up a login for you and have emailed the details to mrowe@apple.com. I hope this helps.
(In reply to comment #19)
> Craig, thanks for the crash logs. I don't think we'll be needing any more of
> those at this point :-) I would appreciate if you could set up account like
> you described so that we can reproduce the problem directly. Feel free to
> email the details of it to me.
>
Cameron and all: sorry about pasting the full stack trace in the bug report and "over-attaching" crash reports to this bug. I won't do this again (I wasn't thinking, need more sleep, and hadn't read all the "good reporting practices" posts). Is there any way I can edit this down? :-(
(In reply to comment #22)
> If I can reproduce this, then I can likely fix it. This also seems very similar
> to bug 19926, but they don't occur in the exact same place.
>
Craig, it's just something to keep in mind for any future reports. There's not much that can be done about existing ones, and it's not really a big problem.
Created attachment 22141
Proposed patch
Here's a patch that fixes the problem. We made NPN_SetException a no-op in order to fix bug 19853, but an exception could also be set from JavaScript code itself. Clearing exceptions after calling out to JavaScript code seems to be the only fix besides properly implementing exceptions in the Netscape plugin API. If we checked for an exception in Machine::execute() and returned jsNull() if one was set, then we would probably be breaking a lot of legitimate calls to JavaScript. The only case this will change is where JavaScript calls out to the Netscape plugin API, and the Netscape plugin executes a single reentrant call to JavaScript and then returns (a second call would cause a crash, just like in this bug). In that case, there may currently be an exception returned that is now missed with this patch.
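The idea of clearing the exception after each call out to JavaScript can be illustrated with a toy model. This is a sketch under loud assumptions: `ExecState`, `Script`, `run`, and `invokeAndClear` are hypothetical names, not the real JavaScriptCore or NPAPI code, and the model assumes that a pending exception makes the next script call return null without running. Returning false on failure is a choice of this sketch, not a statement about what the patch does.

```cpp
#include <cassert>

// Hypothetical model of the interpreter's per-execution state.
struct ExecState {
    bool hadException = false;
    void clearException() { hadException = false; }
};

// Hypothetical script: either yields a result or throws.
struct Script {
    int result;
    bool throws;
};

// Returns null if an exception is already pending (the script is
// short-circuited) or if this script throws. The returned pointer is
// only valid while `script` is alive.
const int* run(ExecState& exec, const Script& script)
{
    if (exec.hadException)
        return nullptr;
    if (script.throws) {
        exec.hadException = true;
        return nullptr;
    }
    return &script.result;
}

// Plugin-facing entry point: run the script, then always clear the
// exception so it cannot leak into the next reentrant call.
bool invokeAndClear(ExecState& exec, const Script& script, int* out)
{
    assert(out);
    const int* value = run(exec, script);
    bool ok = value != nullptr;
    if (ok)
        *out = *value;
    exec.clearException();
    return ok;
}
```

With plain `run`, a throwing script followed by a healthy one leaves the second call returning null, which is the shape of the crash in this bug; with `invokeAndClear`, the second call runs normally, at the cost of dropping the exception information.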
Mark pointed out that the NPAPI documentation suggests that the return value of any of these functions should be false if the call does not succeed, instead of true. I will try to make a TestPlugin test case to determine what Mozilla actually does. Either way, we likely want to clear it before returning.
You also need to patch NPN_SetProperty, NPN_RemoveProperty, NPN_HasProperty, NPN_HasMethod, and NPN_Enumerate.
(I believe that enumerate and has* can't throw, but I'm not sure, so let's be safe rather than sorry.)
Created attachment 22142
Revised proposed patch
This patch incorporates Geoff's comments. I thought it also fixed bug 19926, but it just makes it harder to reproduce. There are two different stack traces there, and one of them seems to be this bug, but the other is different.
Comment on attachment 22142
Revised proposed patch
Cameron mentioned on IRC that he would file a separate bug to address Mark's comment.
I think the ChangeLog could be a little clearer. I would say something like, "Clear the exception after invoking an NPAPI callback, to prevent it from short-circuiting the next script that executes. FIXME: Find a way to return this exception information through the NPAPI. See http..."
2008-07-06 02:37 PDT, Craig W. Cadwallader
2008-07-06 02:38 PDT, Craig W. Cadwallader
2008-07-06 02:55 PDT, Craig W. Cadwallader
2008-07-06 09:29 PDT, Craig W. Cadwallader
2008-07-06 11:00 PDT, Craig W. Cadwallader
2008-07-06 11:49 PDT, Craig W. Cadwallader
2008-07-06 13:23 PDT, Craig W. Cadwallader
2008-07-06 13:24 PDT, Craig W. Cadwallader
2008-07-06 13:25 PDT, Craig W. Cadwallader
2008-07-07 15:25 PDT, Cameron Zwarich (cpst)
2008-07-07 16:21 PDT, Cameron Zwarich (cpst)