Bug 183883 - Crash with WebKitConsoleMessageLevel
Summary: Crash with WebKitConsoleMessageLevel
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-03-21 18:27 PDT by Michael Gratton
Modified: 2018-04-13 08:55 PDT (History)
2 users (show)

See Also:


Attachments
Backtrace from WebProcess core (24.72 KB, text/plain)
2018-03-21 18:27 PDT, Michael Gratton
no flags Details
Generated web extension code 2.20 (48.28 KB, text/x-csrc)
2018-03-26 16:20 PDT, Michael Gratton
no flags Details
Generated web extension code 2.18 (51.80 KB, text/x-csrc)
2018-03-26 16:24 PDT, Michael Gratton
no flags Details
Generated web extension header 2.20 (1.67 KB, text/x-chdr)
2018-03-26 16:28 PDT, Michael Gratton
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Gratton 2018-03-21 18:27:01 PDT
Created attachment 336253 [details]
Backtrace from WebProcess core

I just upgraded to Ubuntu bionic which currently has WebKitGTK 2.20.0, and WebProcess is now crashing when Geary's WebExtension is attempting to print a console message. Geary is using the standard vala to_string enum magic to get a string version of the WebKitConsoleMessageLevel returned by webkit_console_message_get_level(): https://git.gnome.org/browse/geary/tree/src/client/web-process/web-process-extension.vala#n57

This translates into the following C code:

> g_enum_get_value (g_type_class_ref (webkit_console_message_level_get_type ()), webkit_console_message_get_level (message));

The call to g_type_class_ref() is the one causing the segfault. Despite that it looks like WebKitGTK bug since Geary does this in a lot of different places and it isn't crashing anywhere else. Maybe webkit_console_message_level_get_type() is returning a bogus value?

Back trace from a resulting core is attached, couldn't get a full trace since gdb is crashing doing a "bt full".
Comment 1 Michael Gratton 2018-03-21 18:28:29 PDT
NB this is a moderately severe issue, since it stops any email that causes a console message to be printed to not be displays.
Comment 2 Michael Gratton 2018-03-21 19:55:34 PDT
Turns out this also happens with the WebKitConsoleMessageSource enum as well as the level.
Comment 3 Michael Catanzaro 2018-03-22 10:49:11 PDT
I suspect https://trac.webkit.org/changeset/226366/webkit in combination with a Vala bug. Note my changelog entry:

"Since one of the signal arguments is an enum, we now have to run glib-mkenums for the web process API. And that has resulted in this patch also adding GType goo for WebKitConsoleMessageLevel and WebKitConsoleMessageSource that was previously missing. Any applications that for some unlikely reason want to use these enums in properties or signals will be happy."

So webkit_console_message_level_get_type() is brand new in 2.20. Did not exist at all in 2.18.

So:

(In reply to Michael Gratton from comment #0)
> g_enum_get_value (g_type_class_ref (webkit_console_message_level_get_type ()), webkit_console_message_get_level (message));

Vala cannot possibly have generated this code in 2.18, because it would not have compiled. And if it did generate this code, then it has its own conflicting definition for webkit_console_message_level_get_type() with its own generated GType goo, infringing on WebKit's namespace, which would be a Vala bug.

The first thing I would do to check is build Geary twice with the same version of Vala, once with 2.18 and once with 2.20, and diff the generated C code to see what it's doing with respect to webkit_console_message_level_get_type().
Comment 4 Michael Gratton 2018-03-23 22:52:42 PDT
Okay, so I just found out the WebProcess crashes here when I run the locally built binary in the source tree, but does not crash when I build a debian package from the same source tree, with the same valac and libraries, install that and run it.

When using the same meson build options for both, the only thing that is different in the generated C source is that the deb build's source doesn't contain vala line number annotations. The generated C line is identical to that below.

The only difference with the binaries seems to be that is that the installed deb's binary has debug symbol stripped.

So I'm not sure where to go with this. It's definitely only happening with enums from WebkitGTK+ though.

Will try to track down some source generated against 2.18 and see what it looks like, need to reboot to try to get libvirt working for that though. o_O
Comment 5 Michael Gratton 2018-03-23 23:28:24 PDT
Okay, so the difference in the source code when compiling against 2.18 vs against 2.20 is that for the former, valac will generate a helper function which uses a C-switch on the value returned from the message obejct, rather than trying to use g_enum_get_value and getting the name from that.

E.g. for the level:

> static const char*
> _webkit_console_message_level_to_string (WebKitConsoleMessageLevel value)
> {
>         switch (value) {
>                 case WEBKIT_CONSOLE_MESSAGE_LEVEL_INFO:
>                 return "WEBKIT_CONSOLE_MESSAGE_LEVEL_INFO";
>                 ...
>         }
>         return NULL;
> }

The g_enum_get_value() method does get used in quite a lot of other places however, it's only these newly blessed enums that are causing a crash with my local build.

I'm sort of out of ideas about what could be causing the issue with the plain local build vs the deb local build though.
Comment 6 Michael Catanzaro 2018-03-24 13:50:18 PDT
OK that's more complicated than what I was expecting, I'll take a look...
Comment 7 Michael Catanzaro 2018-03-25 18:12:05 PDT
> g_enum_get_value (g_type_class_ref (webkit_console_message_level_get_type ()), webkit_console_message_get_level (message));

I added this debug code to ephy-web-extension-main.c:

  GType type;
  GEnumClass *class;
  GEnumValue *value;

  type = webkit_console_message_level_get_type ();
  class = g_type_class_ref (type);
  g_warning ("class=%p", class);
  value = g_enum_get_value (class, WEBKIT_CONSOLE_MESSAGE_LEVEL_DEBUG);
  g_warning ("value=%d name=%s nick=%s", value->value, value->value_name, value->value_nick);
  g_type_class_unref (class);

It works properly:

** (WebKitWebProcess:18): WARNING **: 00:57:43.265: class=0x5635884a8820

** (WebKitWebProcess:18): WARNING **: 00:57:43.265: value=4 name=WEBKIT_CONSOLE_MESSAGE_LEVEL_DEBUG nick=debug

But I had to manually add this to the top:

#define WEBKIT2_COMPILATION 1
#include <webkit2/WebKitWebProcessEnumTypes.h>

due to bug #183998.

I guess there's some (very slim) possibility that the missing #include is somehow confusing Vala... could you grep through the generated code and make sure Vala isn't trying to define its own webkit_console_message_level_get_type() anywhere?

Could you post a copy of the generated web-process-extension.c?
Comment 8 Michael Catanzaro 2018-03-26 08:19:17 PDT
BTW I tried adding Geary to world in gnome-build-meta so that I can build it against the latest WebKit, but I noticed it depends on libcanberra and I didn't really want to add that as well, so I gave up. :) Would be interesting to see what the generated source is doing.
Comment 9 Michael Gratton 2018-03-26 16:20:59 PDT
Created attachment 336549 [details]
Generated web extension code 2.20

This is the code valac is generating against WebKitGTK 2.20.

Strangely it declares webkit_console_message_level_to_string(), but neither uses nor defines the function body.
Comment 10 Michael Gratton 2018-03-26 16:24:27 PDT
Created attachment 336550 [details]
Generated web extension code 2.18

This is the code valac is generating against WebKitGTK 2.18.

This also webkit_console_message_level_to_string(), and again neither uses nor defines the function body, but also declares, defines and uses 
_webkit_console_message_level_to_string(). They differ by return type.
Comment 11 Michael Gratton 2018-03-26 16:28:45 PDT
Created attachment 336551 [details]
Generated web extension header 2.20

And for completeness, the header generated against 2.20.
Comment 12 Michael Gratton 2018-03-26 16:36:23 PDT
Oh, it's a runtime issue.

Running the installed deb build from /usr doesn't exhibit the problem, but running the same binary from ~/.../geary/debian (i.e the deb build directory) does.

Is the WebProcess making assumptions about where extension libs are loaded/installed/executed from?
Comment 13 Michael Catanzaro 2018-03-26 16:51:04 PDT
(In reply to Michael Gratton from comment #12)
> Oh, it's a runtime issue.
> 
> Running the installed deb build from /usr doesn't exhibit the problem, but
> running the same binary from ~/.../geary/debian (i.e the deb build
> directory) does.
> 
> Is the WebProcess making assumptions about where extension libs are
> loaded/installed/executed from?

It loads web extensions from the directory the application sets with webkit_web_context_set_web_extensions_directory().
Comment 14 Michael Catanzaro 2018-03-26 16:57:50 PDT
I don't see anything obviously wrong with the generated code....

(In reply to Michael Gratton from comment #12)
> Oh, it's a runtime issue.
> 
> Running the installed deb build from /usr doesn't exhibit the problem, but
> running the same binary from ~/.../geary/debian (i.e the deb build
> directory) does.
> 
> Is the WebProcess making assumptions about where extension libs are
> loaded/installed/executed from?

Are you sure that it's really linked against 2.20 when you're running it from the build directory? Check with ldd.
Comment 15 Michael Catanzaro 2018-03-29 07:52:45 PDT
(In reply to Michael Gratton from comment #12)
> Oh, it's a runtime issue.
> 
> Running the installed deb build from /usr doesn't exhibit the problem, but
> running the same binary from ~/.../geary/debian (i.e the deb build
> directory) does.

NB: https://bugzilla.redhat.com/show_bug.cgi?id=1560266
Comment 16 Michael Gratton 2018-04-12 22:40:52 PDT
> Are you sure that it's really linked against 2.20 when you're running it from
> the build directory? Check with ldd.

Yeah, looks like it:

> mjg@payens:~/Projects/GNOME/geary$ ldd build/src/libgeary-web-process.so | grep webkit
>	libwebkit2gtk-4.0.so.37 => /usr/lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37 (0x00007f1cbed5c000)
> mjg@payens:~/Projects/GNOME/geary$ dpkg -S /usr/lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
> libwebkit2gtk-4.0-37:amd64: /usr/lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
> mjg@payens:~/Projects/GNOME/geary$ dpkg -l | grep libwebkit2gtk-4.0-37
> ii  libwebkit2gtk-4.0-37:amd64            2.20.1-1                             amd64        Web content engine library for GTK+

Although it also looks like upgrading to 2.20.1 has fixed the issue, so maybe the missing header was somehow causing it?
Comment 17 Michael Catanzaro 2018-04-13 08:55:46 PDT
Surprising, but OK, it was surely caused by the missing header, then. Somehow. I guess to figure out how, you could look at how the code generated by Vala changed between 2.20.0 and 2.20.1.