Bug 177577 - REGRESSION(2.18.0): [GTK] Crash in WebKitWebContext disposal (Eclipse)
Summary: REGRESSION(2.18.0): [GTK] Crash in WebKitWebContext disposal (Eclipse)
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: WebKit Local Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-27 15:51 PDT by Leo Ufimtsev
Modified: 2019-05-02 16:24 PDT (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Leo Ufimtsev 2017-09-27 15:51:04 PDT
Hello WebkitGtk developers, 

I'm having a hard time figuring out why webkit crashes as of 2.18 with us, I was wondering if the below stack trace looks familiar to anyone?

Context:
- We use WebkiGtk for Eclipse's/SWT's Browser functionality on Linux. (Java native Interface to Webkitgtk)
- We have a custom container for widget layout.
- We tend to dispose widgets from parent down to child.

In 2.16, all was fine. As of 2.18.0, when we exit Eclipse, Webkit now crashes with the following backtrace:

(Stack trace(s) was shortened/reduced for clarity)

1    /lib64/libjavascriptcoregtk-4.0.so.18 WTFCrash
2    /lib64/libwebkit2gtk-4.0.so.37
..6    /lib64/libgobject-2.0.so.0(g_object_unref..)
..9    /lib64/libc.so.6(__libc_start_main..)
..10   /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-7.b12.fc26.x86_64/bin/java

If I download/install debug information and attach with gdb, I get a stacktrace like:
(gdb) bt
#0  WTFCrash() () at Source/WTF/wtf/Assertions.cpp:278
#1  WebKit::CallbackMap::invalidate(WebKit::CallbackBase::Error) 
          (error=WebKit::CallbackBase::Error::OwnerWasInvalidated) /WebKit/UIProcess/GenericCallback.h:225
#2  WebKit::WebCookieManagerProxy::processPoolDestroyed() /WebKit/UIProcess/WebCookieManagerProxy.cpp:76
#3  WebKit::WebProcessPool::~WebProcessPool()  __in_chrg=.. /Source/WebKit/UIProcess/WebProcessPool.cpp:298
#4  WebKit::WebProcessPool::~WebProcessPool()  WebKit/UIProcess/WebProcessPool.cpp:317
#5  WTF::ThreadSafeRefCounted<API::Object>::deref() const .. WTF/wtf/ThreadSafeRefCounted.h:71
#6  WTF::derefIfNotNull<WebKit::WebProcessPool>(WebKit::WebProcessPool*)(ptr=../WTF/wtf/RefPtr.h:45
#7  WTF::RefPtr<WebKit::WebProcessPool>::~RefPtr() __in_chrg=.. /WTF/wtf/RefPtr.h:69
#8  _WebKitWebContextPrivate::~_WebKitWebContextPrivate()  __in_chrg=. /WebKit/UIProcess/API/glib/WebKitWebContext.cpp:163
#9  webkit_web_context_finalize(GObject*) (object=... /WebKit/UIProcess/API/glib/WebKitWebContext.cpp:245
#10 g_object_unref (_object=0x7fde296e7110) at gobject.c:3185
#11 __run_exit_handlers (status=0, listp=0x7.... <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:83
#12 __GI_exit (status=<optimized out>) at exit.c:105
#13 __libc_start_main (main=0x151b63e670 <main>, argc=5, argv=0x7ffc1db781f8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc1db781e8)
    at ../csu/libc-start.c:329
#14in _start ()
...

^ Observe "g_object_unref()" on #10. It seems gblib tries to run an exit handler, which tries to run a g_object_unref() on webview, which leads to crash?

The crash occurs in Webkit/Javascript core after Java/Jvm code has finished, in java native code.

I noticed that if I do an g_object_ref(webview) somewhere near disposal, then the crash doesn't occur. 

Does the above ring a bell with anyone?

Was there a mechanism added to webkit/webkitgtk to somehow auto-cleanup somewhere that didn't exist before?

Btw, what's the correct way to destroy/dispose webkit/webview?

System:
Linux Fedora 25
Gtk3.22
WebkitGtk 2.18 (New from upstream/Rawhide)

Please advise, thank you.
Comment 1 Tomas Popela 2017-09-29 01:34:03 PDT
(In reply to Leo Ufimtsev from comment #0)
> I noticed that if I do an g_object_ref(webview) somewhere near disposal,
> then the crash doesn't occur. 
 
Just to be sure, are you holding a reference on the web view in your Eclipse's/SWT's Browser when you create it?

> Does the above ring a bell with anyone?

What I would try is to debug the reference count of the web view (set a break point on glibc's g_object_ref()/g_object_unref() one you will know the address of the newly created web view).

> Was there a mechanism added to webkit/webkitgtk to somehow auto-cleanup
> somewhere that didn't exist before?

No, at least I'm not aware of anything..

> Btw, what's the correct way to destroy/dispose webkit/webview?

Just o_object_unref() it.
 
> System:
> Linux Fedora 25
> Gtk3.22
> WebkitGtk 2.18 (New from upstream/Rawhide)

Try to please use https://bodhi.fedoraproject.org/updates/FEDORA-2017-368b385c04 and don't use the rawhide.

> Please advise, thank you.
Comment 2 Leo Ufimtsev 2017-10-03 08:29:08 PDT
(In reply to Tomas Popela from comment #1)
> (In reply to Leo Ufimtsev from comment #0)
> > I noticed that if I do an g_object_ref(webview) somewhere near disposal,
> > then the crash doesn't occur. 
>  
> Just to be sure, are you holding a reference on the web view in your
> Eclipse's/SWT's Browser when you create it?
> 
> > Does the above ring a bell with anyone?
> 
> What I would try is to debug the reference count of the web view (set a
> break point on glibc's g_object_ref()/g_object_unref() one you will know the
> address of the newly created web view).
> 
> > Was there a mechanism added to webkit/webkitgtk to somehow auto-cleanup
> > somewhere that didn't exist before?
> 
> No, at least I'm not aware of anything..
> 
> > Btw, what's the correct way to destroy/dispose webkit/webview?
> 
> Just o_object_unref() it.

Ah, thank you.

> > System:
> > Linux Fedora 25
typo, I'm on F26 btw.

> > Gtk3.22
> > WebkitGtk 2.18 (New from upstream/Rawhide)
> 
> Try to please use
> https://bodhi.fedoraproject.org/updates/FEDORA-2017-368b385c04 and don't use
> the rawhide.
> 
> > Please advise, thank you.

Hello Tomas,

Thank you for your suggestions. They were very helpful.

I've removed rawhide packages and installed 2.18 from regular Fedora repos. The crash still occurs (But I guess that was expected).

I found that I only have to g_object_ref(webview) the first instance that is created, then the crash no longer occurs.
This first instance is then cleaned up when JVM shuts down.

I haven't gotten around doing a deeper debug session on what g_object_(ref|unrefs) webview. For the time being we'll use the workaround mentioned above [1]. Later, once we've completed the Webkit2 port[2], we will come back to revisit this crash issue. (may take some months).

I can't reproduce the issue in a snippet, it only occurs in the context of a JVM.

Since I won't get around to doing deeper research for some time to come, should we close the issue for the time being ?(works for me?) and then once I'll get back to this crash issue and have more news re-open it?


[1]
Bug 522733 – Crash on close with webkitgtk 2.18 
https://bugs.eclipse.org/bugs/show_bug.cgi?id=522733

[2] 
Bug 516838 – [GTK3][webkit] Port SWT Browser to webkit2gtk (4.8) 
https://bugs.eclipse.org/bugs/show_bug.cgi?id=516838
Comment 3 Michael Catanzaro 2017-10-25 21:09:55 PDT
Just for future reference: the secret to getting a timely reply is to select the 'WebKit Gtk' component, as otherwise it's just luck if we ever see the bug. I know that's not obvious; sorry about that. :(

There's no new auto-cleanup mechanism.

As for "what's the correct way to destroy/dispose webkit/webview?"... normally GTK+ will do that for you: WebKitWebView comes with a floating ref that will be sunk when you add it to the widget hierarchy, so you can forget about destroying it manually. I imagine Eclipse is probably not using a real GTK+ widget hierarchy, though, so you can just call g_object_unref() directly to destroy it. You also first have to call g_object_ref_sink() after creating the WebKitWebView, if you are not going to be adding it to a widget hierarchy, to sink the floating ref.

Now, on to this bug....

I'm not sure if this is a bug in WebKit or in Eclipse. Adding an extra ref to your WebKitWebView, ensuring it gets leaked, fixed the issue for you because WebKitWebView refs its WebKitWebContext, so the WebKitWebContext never gets destroyed. The fact that it's being destroyed in an exit handler in your backtrace indicates that you're using the default WebKitWebContext, so another workaround you could use would be to not do that: create your own WebKitWebContext, use it when creating your WebKitWebViews, and unref it when you want to, rather than letting WebKit do it badly.

Now, the default WebKitWebContext gets freed in an exit handler because it's a C++ static object (declared in createDefaultWebContext() in WebKitWebContext.cpp), and exit handlers are where static objects get freed. It's good practice to always leak such objects, because running C++ destructors in exit handlers leads to lots and lots of crashes in practice. This has been a major source of bugs for us in the past (and present, e.g. bug #176490). If objects just free memory in its dispose/finalize or its C++ destructor, then it's no big deal, because that's not needed when the program is terminating anyway, but sometimes often do important things in dispose/finalize that need to happen. (See bug #166029 for a recent example of this.)

One option is that, instead of trying to paper over this problem by trying to fix this particular crash, we should probably always leak the default WebKitWebContext. This means we'll consequently to remove WebProcessPool from our leak tracking pool (see also: bug #163504).

On the other hand, this crash looks like it might not even be related to being run in an exit handler. We'd need a better backtrace ('bt full') to be sure, but it looks like the WebKitWebContext's WebCookieManagerProxy has somehow been destroyed before its WebProcessPool. It's supposed to happen in reverse order, because the WebProcessPool is declared first in the priv struct, meaning it is destroyed last. Does Eclipse use WebCookieManagerProxy directly, and if so, is it possible that Eclipse has some refcounting mistake with it? That seems most likely, as otherwise I'm surprised that no other developers have reported this issue yet.
Comment 4 Michael Catanzaro 2017-10-25 21:18:56 PDT
Actually, I see from https://bugs.eclipse.org/bugs/show_bug.cgi?id=522733#c7 that you really are creating a normal GTK+ widget hierarchy, so you should not need to dispose the WebKitWebView manually or do anything other than add it to the GtkContainer. After you create the WebKitWebView and add it to the GtkContainer, the GtkContainer has sunk the floating ref and the WebKitWebView is now owned by the GtkContainer.

Note: it's expected that dispose may be called multiple times, as that's required to break reference cycles. What's not OK is if *finalize* gets called multiple times. That should never happen, though, because g_object_unref() should spit obvious criticals out on stderr if it gets called when the refcount is already zero. So I don't think it's a WebKitWebView refcounting error unless you are seeing criticals like these.

Also, my question "does Eclipse use WebCookieManagerProxy directly" was pretty dumb, because that's a C++ object, not something exposed in our API. So forget about that. (I'm working a bit too late at night again. :)

(Sorry I haven't actually found the issue, but hopefully the above gives you something to work with. First thing I would do to investigate further would be 'bt full' in gdb.