Today I've received the webkit2gtk3-2.36.5-1.fc36.x86_64 update in Fedora 36 here, and now Eclipse (using SWT) is crashing - on purpose it seems - when calling webkit_web_context_get_default(), because WebKit::allDataStores() fails to assert being in the UI Thread.
The Eclipse versions I've seen crashing are 2021-06, 2022-03, 2022-06 - not sure about earlier ones.
Downgrading to webkit2gtk3-2.36.0 as performed by 'dnf downgrade webkit2gtk3' does work around the issue, but it was working last week with 2.36.4 as well.
Any idea so far about whether this is a regression of webkit-gtk, or what needs to be adapted in Eclipse or SWT now eventually?
If necessary, I may be able to bisect which change between 2.36.4 and 2.36.5 is causing this crash.
Thanks a lot!
A backtrace using Eclipse 2022-06 (with OpenJDK 11.0.14+9) is:
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007f2f1b68ecb3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007f2f1b63e9c6 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007f2f1b6287f4 in __GI_abort () at abort.c:79
#4 0x00007f2d4146d5cb in WTFCrashWithInfo(int, char const*, char const*, int) () at /usr/src/debug/webkit2gtk3-2.36.5-1.fc36.x86_64/redhat-linux-build/WTF/Headers/wtf/Assertions.h:741
#5 WebKit::allDataStores() () at /usr/src/debug/webkit2gtk3-2.36.5-1.fc36.x86_64/Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.cpp:101
#6 0x00007f2d417ee7c0 in WebKit::WebsiteDataStore::forEachWebsiteDataStore(WTF::Function<void (WebKit::WebsiteDataStore&)>&&) (function=...)
#7 0x00007f2d417728a7 in WebKit::WebProcessPool::registerURLSchemeAsSecure(WTF::String const&) (urlScheme=<optimized out>, this=<optimized out>) at /usr/include/c++/12/bits/unique_ptr.h:189
#8 registerSecurityPolicyForURIScheme(WebKitSecurityManager*, char const*, SecurityPolicy) (manager=<optimized out>, scheme=<optimized out>, policy=<optimized out>)
#9 0x00007f2d417a0bd2 in WebKit::WebKitProtocolHandler::WebKitProtocolHandler(_WebKitWebContext*) (context=<optimized out>, this=0x7f2d3f2ff028)
#10 std::make_unique<WebKit::WebKitProtocolHandler, _WebKitWebContext*&>(_WebKitWebContext*&) () at /usr/include/c++/12/bits/unique_ptr.h:1065
#11 WTF::makeUnique<WebKit::WebKitProtocolHandler, _WebKitWebContext*&>(_WebKitWebContext*&) () at /usr/src/debug/webkit2gtk3-2.36.5-1.fc36.x86_64/redhat-linux-build/WTF/Headers/wtf/StdLibExtras.h:540
#12 webkitWebContextConstructed(GObject*) (object=<optimized out>) at /usr/src/debug/webkit2gtk3-2.36.5-1.fc36.x86_64/Source/WebKit/UIProcess/API/glib/WebKitWebContext.cpp:444
#13 0x00007f2f196c3f4c in g_object_new_internal (class=0x7f2f140f7510, params=params@entry=0x0, n_params=n_params@entry=0) at ../gobject/gobject.c:2053
#14 0x00007f2f196c5081 in g_object_new_with_properties (object_type=0x7f2f14623420 [None], n_properties=0, names=names@entry=0x0, values=values@entry=0x0) at ../gobject/gobject.c:2181
#15 0x00007f2f196c5b21 in g_object_new (object_type=<optimized out>, first_property_name=<optimized out>) at ../gobject/gobject.c:1821
#16 0x00007f2d41780669 in createDefaultWebContext(gpointer) () at /usr/src/debug/webkit2gtk3-2.36.5-1.fc36.x86_64/Source/WebKit/UIProcess/API/glib/WebKitWebContext.cpp:689
#17 0x00007f2ee633c6e5 in g_once_impl (once=0x7f2d43b4ef00 <webkit_web_context_get_default::onceInit>, func=0x7f2d41780600 <createDefaultWebContext(gpointer)>, arg=0x0) at ../glib/gthread.c:640
#18 0x00007f2d417806f5 in webkit_web_context_get_default() () at /usr/src/debug/webkit2gtk3-2.36.5-1.fc36.x86_64/Source/WebKit/UIProcess/API/glib/WebKitWebContext.cpp:703
#19 0x00007f2d43c05e8c in Java_org_eclipse_swt_internal_webkit_WebKitGTK_webkit_1web_1context_1get_1default ()
#20 0x00007f2efc80fa10 in ()
#21 0x00007f2f1b9fd0e0 in ()
#22 0x00007f2d4c2b5fc8 in ()
#23 0x00007f2f1b9fd140 in ()
#24 0x00007f2d4c2b7f70 in ()
#25 0x0000000000000000 in ()
(gdb) frame 5
#5 WebKit::allDataStores () at /usr/src/debug/webkit2gtk3-2.36.5-1.fc36.x86_64/Source/WebKit/UIProcess/WebsiteData/WebsiteDataStore.cpp:101
96 allowsWebsiteDataRecordsForAllOrigins = true;
99 static HashMap<PAL::SessionID, WebsiteDataStore*>& allDataStores()
102 static NeverDestroyed<HashMap<PAL::SessionID, WebsiteDataStore*>> map;
103 return map;
(gdb) info threads
Id Target Id Frame
1 Thread 0x7f2f1bd72480 (LWP 541855) "java" __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=541856, futex_word=0x7f2f1b9ff910) at futex-internal.c:57
* 2 Thread 0x7f2f1b9ff640 (LWP 541856) "java" __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
3 Thread 0x7f2f19f14640 (LWP 541857) "GC Thread#0" 0x00007f2f1a7f195c in G1ParScanThreadState::copy_to_survivor_space(InCSetState, oopDesc*, markOopDesc*) ()
Presumably the main thread is thread 1, and it looks like you are using WebKitGTK on thread 2? I'm afraid that's your bug: the WebKitGTK API must be used exclusively on the main thread. Same for GTK itself.
(OK, technically I think GTK can be used on a secondary thread as long as you initialize GTK on one thread and never use it on any other thread... but in practice, nobody does that, and I would be impressed if it works.)
Somebody else noticed this changed in bug #242690. Before, isMainThread() only checked to ensure the current thread is the same as the thread that first called initializeMainThread(); i.e. it was possible for WebKit's main thread to not be the actual main thread. Now, isMainThread() checks to ensure it is actually the main thread.
In theory, using a secondary thread as the "main" thread *might* have previously been safe to do as long as you restrict all use of WebKitGTK and GTK and <who knows how many other libraries> to only that thread. In practice, probably not a good idea. I'd say Eclipse should not do this.
Maybe we should have not backported this change to the stable branch, though. It's nicer when things break in major version updates instead of minor updates.
Agreed, such a change should not be shipped in a minor version.
Further thought: To allow for applications to recognize as early as possible, does initializeMainThread() assert for Thread 1 as well?
But then: Right now, Eclipse calls initializeMainThread() in Thread 2 here, and I would really be surprised if Eclipse ever has done that in Thread 1.
So while I understand/agree to use GTK/WebKit/SWT within one single thread only, I don't get the reason for why this has to be Thread 1.
Also, we do have some SWT/GTK application here in production for years, and the gdb-breakpoint in gtk_main_iteration_do() is fired in Thread 51 this time.
So yes, I would say GTK in no way does require the UI thread to be Thread 1.
(gdb) info threads
* 51 Thread 0x7fdb13bff640 (LWP 549155) "Thread-15" gtk_main_iteration_do (blocking=0) at ../gtk/gtkmain.c:1455
#0 gtk_main_iteration_do (blocking=0) at ../gtk/gtkmain.c:1455
#0 gtk_main_iteration_do (blocking=0) at ../gtk/gtkmain.c:1455
#1 0x00007fdb13256dbc in Java_org_eclipse_swt_internal_gtk3_GTK3_gtk_1main_1iteration_1do ()
#2 0x00007fdc23a154e3 in ()
#3 0x00007fdb13bfe908 in ()
#4 0x00007fdc23a15064 in ()
#5 0x00007fdb13bfe850 in ()
#6 0x00007fdb12b58700 in ()
#7 0x00007fdb13bfe8b8 in ()
#8 0x00007fdb12b604e0 in ()
#9 0x0000000000000000 in ()
(In reply to Michael Haubenwallner from comment #3)
> So yes, I would say GTK in no way does require the UI thread to be Thread 1.
In theory, you can use GTK on any thread you want. The thread that you initialize GTK on is the GTK main thread (or UI thread). You need to make sure that (a) you use GTK only from this thread, and (b) you iterate the default main context only from this thread, never from any other thread. At least, I *think* it's *theoretically* safe if you follow those rules.
In practice, is it actually safe? Programmers are likely going to make assumptions that the GTK thread is the application's main thread, or that WebKit's main thread is the application's main thread (as we see here). Certainly it's hard to imagine any good reason to do this. So why is Eclipse doing this?
Anyway, I don't see any compelling reason to break apps that previously worked with 2.36, so we should revert this there at least. I'm not sure about what to do with main. Opinions welcome.
> Also, we do have some SWT/GTK application here in production for years, and the gdb-breakpoint in gtk_main_iteration_do() is fired in Thread 51 this time.
Now that's really suspicious. Are you careful to ensure you never use GTK anywhere except Thread 51? Do you ensure that you never iterate the default main context except on Thread 51? That's such a random thread ID that it seems very unlikely your code is correct.
At least Eclipse seems to be using WebKitGTK on thread 2, which could *plausibly* be correct if Java reserves thread 1 for itself and uses thread 2 as the main thread for Java code.
It looks like Eclipse uses the "main" JVM thread (where the Java main function is called) as the UI thread. However, this thread is not the main OS thread of the Eclipse process (i.e., the one running the C main function of the JVM). For example, when I ran Eclipse to test this, the Eclipse instance got the PID 65892, but the JVM thread dump (from VisualVM) indicated the LWP 65898 as the "main" JVM thread. Indeed, LWP 65898 is the second thread spawned by PID 65892, while LWP 65892 is of course the main OS thread. Therefore, it might be impossible for Eclipse to use the main OS thread as the UI thread, since that thread is taken for other purposes by the underlying JVM.
Eclipse calls native code (including GTK and WebKit) via JNI on the JVM "main" thread. When I pause Eclipse with a JVM debugger, its "main" JVM thread is sitting in, e.g., a native gtk_main_iteration_do call, happily chugging away on a non-main OS thread.
OK, let's revert this then. What Eclipse is doing may not be ideal, but it sounds theoretically OK.
(I think your app that's using thread 51 is very likely not OK, though. You'll want to investigate that.)
Pull request: https://github.com/WebKit/WebKit/pull/2904
Hey James, were your commits intended to fix anything in particular, or was this just intended as an easy robustness improvement?
It sounds like it should work this way on all platforms using the GTK backend, then?
Can Eclipse (or any other user of WebKitGTK on JVM) not run on FreeBSD or does pthread_main_np not work like I think it does?
(In reply to Jan Alexander Steffens (heftig) from comment #10)
> It sounds like it should work this way on all platforms using the GTK
> backend, then?
I'd say so, yes.
> Can Eclipse (or any other user of WebKitGTK on JVM) not run on FreeBSD or does pthread_main_np not work like I think it does?
Guess: nobody has ever attempted this?
Committed 253010@main (a6277d4834cc): <https://commits.webkit.org/253010@main>
Reviewed commits have been landed. Closing PR #2904 and removing active labels.
(In reply to Michael Catanzaro from comment #7)
> OK, let's revert this then. What Eclipse is doing may not be ideal, but it
> sounds theoretically OK.
> (I think your app that's using thread 51 is very likely not OK, though.
> You'll want to investigate that.)
Well, the thread number is actually irrelevant, and in no way is constant:
A Java application eventually opening an SWT GUI may run lots of threads before it actually creates the SWT UI thread. The SWT framework, in turn, creates a dedicated thread, dynamically loads the GTK/WebKit/etc. shared libraries there, and performs any JNI call in that same thread - whatever number it does have.
Agreed, the application code is required to run SWT calls in the _same_ thread, but that does not impose any particular thread number.
For applications creating the SWT UI thread: Eclipse seems to do that quite early during startup, but our SWT application seems to perform lots of different actions (read: create threads) before.
(In reply to Michael Catanzaro from comment #9)
> Hey James, were your commits intended to fix anything in particular, or was
> this just intended as an easy robustness improvement?
Well I made these changes since I was hitting weird threading issues that somehow were getting missed by most everyone else and the isMainThread() implementation looked suspect(guess I was right about it not working as expected):