Bug 219456

Summary: [Flatpak][GTK][X11] WebProcess assertion failure with Nvidia drivers
Product: WebKit
Reporter: Sam Sneddon [:gsnedders] <gsnedders>
Component: WebKitGTK
Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED
Severity: Normal
CC: bugs-noreply, clopez, mcatanzaro, pnormand, twilco.o
Priority: P2
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Linux   
See Also: https://bugs.webkit.org/show_bug.cgi?id=199666
https://bugs.webkit.org/show_bug.cgi?id=217323

Description Sam Sneddon [:gsnedders] 2020-12-02 14:11:22 PST
libEGL warning: DRI2: failed to authenticate
ASSERTION FAILED: windowConfig
../../Source/WebCore/platform/graphics/glx/GLContextGLX.cpp(150) : static std::unique_ptr<WebCore::GLContextGLX> WebCore::GLContextGLX::createWindowContext(GLNativeWindowType, WebCore::PlatformDisplay&, GLXContext)
1   0x7fffdce3315b WTFCrash
2   0x7fffeb93e517 /app/webkit/WebKitBuild/Debug/lib/libwebkit2gtk-4.0.so.37(+0xcf86517) [0x7fffeb93e517]
3   0x7fffef7bec8a WebCore::GLContextGLX::createWindowContext(unsigned long, WebCore::PlatformDisplay&, __GLXcontextRec*)
4   0x7fffef7bf614 WebCore::GLContextGLX::createContext(unsigned long, WebCore::PlatformDisplay&)
5   0x7fffef74753f WebCore::GLContext::createContextForWindow(unsigned long, WebCore::PlatformDisplay*)
6   0x7fffec47beba WebKit::ThreadedCompositor::createGLContext()
...


I'm surprised it has no idea about symbols in libwebkit2gtk-4.0.so.37, and I can't manage to get the crash happening within gdb.

It's easy enough to trigger (https://webkit.org triggers it, but not about:blank or https://example.com/).
Comment 1 Sam Sneddon [:gsnedders] 2020-12-02 14:31:22 PST
And then just after I give up I try using gdbserver again and hey now I have it in a debugger:

In WebCore::GLContextGLX::createWindowContext there, glXGetFBConfigs returns no (zero) configs, hence why windowConfig remains nullptr.
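
To illustrate why an empty config list ends in the assertion, here is a minimal Python model of the selection step. It is a sketch, not the WebKit code: the function name, the dict layout, and matching on a visual ID are assumptions made for illustration.

```python
from typing import Optional, Sequence


def pick_window_config(configs: Sequence[dict], window_visual_id: int) -> Optional[dict]:
    """Return the first config whose visual ID matches the window's, else None.

    Hypothetical stand-in for the loop over the glXGetFBConfigs results.
    """
    for config in configs:
        if config.get("visual_id") == window_visual_id:
            return config
    return None


# With zero configs the loop body never runs, so the result stays None.
# That is the state ASSERT(windowConfig) trips on in a debug build.
print(pick_window_config([], 0x21))
print(pick_window_config([{"visual_id": 0x21}], 0x21))
```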

I wonder if this is something odd with how things are running within flatpak and then talking to the wider system? `Reading /usr/lib/x86_64-linux-gnu/libEGL_mesa.so.0 from remote target...` got logged by gdb just before it all kicked off.

I'll be around more here/Slack during the European work day tomorrow if any of the European Igalians want to help!
Comment 2 Carlos Alberto Lopez Perez 2020-12-02 15:26:54 PST
Nvidia proprietary drivers are problematic.

I have an nvidia card and I ended up using the free drivers (Nouveau) due to the continuous issues I was having with WebKit and other software.

You can work around this issue by exporting the environment variable WEBKIT_DISABLE_COMPOSITING_MODE=1, which will disable AC mode in WebKit.
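
A minimal sketch of applying that workaround from a launcher script, assuming the variable only needs to be present in the environment of the process being started. The helper names are hypothetical, and whether the variable reaches a process inside the Flatpak sandbox has not been checked here.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence


def env_without_compositing(base: Optional[Mapping[str, str]] = None) -> dict:
    """Copy an environment and set WEBKIT_DISABLE_COMPOSITING_MODE=1 in the copy."""
    env = dict(os.environ if base is None else base)
    env["WEBKIT_DISABLE_COMPOSITING_MODE"] = "1"
    return env


def run_without_compositing(command: Sequence[str]) -> int:
    """Run a command with the workaround applied and return its exit code."""
    return subprocess.run(list(command), env=env_without_compositing()).returncode
```

For example, `run_without_compositing(["run-minibrowser", "--gtk", "--debug", "https://www.webkit.org"])` would start the MiniBrowser with accelerated compositing disabled.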

I wonder if your issue is with flatpak or with webkitgtk itself.

Does the issue happen if you build WebKit without flatpak? (for that simply wipe the WebKitBuild directory and start a new build without calling the update-webkitgtk-libs script before the build. You may need to pass the flag '--no-experimental-features' to build-webkit to disable some features that perhaps don't build on your distro)
Comment 3 Tyler Wilcock 2020-12-02 17:12:57 PST
I just tested it -- it does seem to be the combination of the Nvidia proprietary drivers and Flatpak.

```
rm -rf WebKitBuild
update-webkitgtk-libs --debug
Tools/Scripts/build-webkit --gtk --debug --no-experimental-features --cmakeargs="-DUSE_WPE_RENDERER=OFF -DENABLE_GAMEPAD=OFF"
run-minibrowser --gtk --debug https://www.webkit.org
```

Reproduces the crash, while:

```
rm -rf WebKitBuild
Tools/Scripts/build-webkit --gtk --debug --no-experimental-features --cmakeargs="-DUSE_WPE_RENDERER=OFF -DENABLE_GAMEPAD=OFF"
run-minibrowser --gtk --debug https://www.webkit.org
```

does not.
Comment 4 Carlos Alberto Lopez Perez 2020-12-02 18:16:10 PST
Perhaps it works if you install the nvidia drivers inside the flatpak runtime of WebKit.

Try this:

1. Check what flatpak runtimes you have installed in the WebKit directory. It should print something like this.

$ FLATPAK_USER_DIR=/home/igalia/clopez/webkit/WebKitBuild/UserFlatpak flatpak --user list
Name                                             Application ID                                       Version            Branch           Origin
Rust stable Sdk extension                        org.freedesktop.Sdk.Extension.rust-stable            1.48.0             20.08            flathub
WebKit Platform (0.3)                            org.webkit.Platform                                  r270199            0.3              webkit-sdk
WebKit Software Development Kit (0.3)            org.webkit.Sdk                                       r270199            0.3              webkit-sdk

2. Get the nvidia version that most closely matches what your host has.

$ FLATPAK_USER_DIR=/home/igalia/clopez/webkit/WebKitBuild/UserFlatpak flatpak remote-ls|grep nvidia

3. Install it

$ FLATPAK_USER_DIR=/home/igalia/clopez/webkit/WebKitBuild/UserFlatpak flatpak --user install org.freedesktop.Platform.GL.nvidia-455-38
Looking for matches…
Found similar ref(s) for ‘org.freedesktop.Platform.GL.nvidia-455-38’ in remote ‘flathub’ (user).
Use this remote? [Y/n]: y


        ID                                                    Branch            Op            Remote             Download
 1. [✓] org.freedesktop.Platform.GL.nvidia-455-38             1.4               i             flathub            356.3 kB / 127.0 MB

Installation complete.


4. Check that it is installed. The command from point 1 should now return:


$ FLATPAK_USER_DIR=/home/igalia/clopez/webkit/WebKitBuild/UserFlatpak flatpak --user list
Name                                             Application ID                                       Version            Branch           Origin
nvidia-455-38                                    org.freedesktop.Platform.GL.nvidia-455-38                               1.4              flathub
Rust stable Sdk extension                        org.freedesktop.Sdk.Extension.rust-stable            1.48.0             20.08            flathub
WebKit Platform (0.3)                            org.webkit.Platform                                  r270199            0.3              webkit-sdk
WebKit Software Development Kit (0.3)            org.webkit.Sdk                                       r270199            0.3              webkit-sdk



5. Finally try to run webkit now to see if there is better luck


(of course, change the path on the environment variable FLATPAK_USER_DIR on all the commands above to point to your $webkit/WebKitBuild/UserFlatpak directory)
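
The steps above can be sketched as a small helper that builds the extension name and the install command for a given checkout. This is an illustration under assumptions: the dots-to-dashes naming is inferred only from the extension names that appear in this bug (nvidia-455-38, nvidia-450-80-02), the function names are hypothetical, and the host driver version has to be supplied by the caller.

```python
import os
from typing import Dict, List, Tuple


def nvidia_extension_ref(driver_version: str) -> str:
    """Map a host driver version such as '455.38' to a GL extension name.

    Assumes the extension name is the version with dots replaced by dashes.
    """
    suffix = driver_version.strip().replace(".", "-")
    return "org.freedesktop.Platform.GL.nvidia-" + suffix


def install_command(webkit_checkout: str, driver_version: str) -> Tuple[Dict[str, str], List[str]]:
    """Return the extra environment and argv for the install in step 3."""
    user_dir = os.path.join(webkit_checkout, "WebKitBuild", "UserFlatpak")
    env = {"FLATPAK_USER_DIR": user_dir}
    argv = ["flatpak", "--user", "install", nvidia_extension_ref(driver_version)]
    return env, argv
```

The returned pair can be passed to `subprocess.run(argv, env={**os.environ, **env})`. Nothing is executed by the helper itself.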
Comment 5 Carlos Alberto Lopez Perez 2020-12-02 18:28:38 PST
BTW, I'm not sure if the above is going to work. I have not tested it since I don't use the nvidia proprietary drivers. But when entering a shell with "Tools/Scripts/webkit-flatpak -c bash" I can't see the nvidia drivers mounted anywhere inside the runtime.

Perhaps the WebKit SDK runtime needs to declare a dependency on the nvidia extensions and mount it somewhere like we do for other extensions on Tools/buildstream/elements/flatpak/sdk.bst

Here is some info on how this is supposed to work https://lists.freedesktop.org/archives/flatpak/2017-February/000534.html
Comment 6 Tyler Wilcock 2020-12-02 19:11:06 PST
OK, well in any case I'll try it now and report back.

---

FWIW, I started debugging through and here is what I've found so far:

Starting from this line of the stacktrace:

4   0x7fffef7bf614 WebCore::GLContextGLX::createContext(unsigned long, WebCore::PlatformDisplay&)

GLContextGLX::createContext calls into PlatformDisplay::sharingGLContext

https://github.com/WebKit/webkit/blob/817c46e152af795d735678386db68805d0aa505e/Source/WebCore/platform/graphics/glx/GLContextGLX.cpp#L279

https://github.com/WebKit/webkit/blob/817c46e152af795d735678386db68805d0aa505e/Source/WebCore/platform/graphics/PlatformDisplay.cpp#L179

Which then calls into GLContext::createSharingContext:

https://github.com/WebKit/webkit/blob/817c46e152af795d735678386db68805d0aa505e/Source/WebCore/platform/graphics/GLContext.cpp#L115

The highlighted line fails to create a glxContext (`auto glxContext = GLContextGLX::createSharingContext(display)`). I need to do more digging as to why this fails... I briefly stepped through, but not enough to understand.

Execution then continues into this portion of the function:

#if USE(EGL) || PLATFORM(WAYLAND) || PLATFORM(WPE)
    if (auto eglContext = GLContextEGL::createSharingContext(display))
        return eglContext;
#endif

which is where the `libEGL warning: DRI2: failed to authenticate` log is spawned.
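
The control flow described in this comment (try GLX first, then fall through to EGL) can be modelled as a simple fallback chain. The Python below is only a sketch of that pattern. The factory functions are hypothetical stand-ins, not WebKit APIs.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_available(factories: Iterable[Callable[[], Optional[T]]]) -> Optional[T]:
    """Call each factory in order and return the first non-None result."""
    for create in factories:
        context = create()
        if context is not None:
            return context
    return None


def failing_glx() -> Optional[str]:
    # Stand-in for the GLX sharing context that could not be created.
    return None


def working_egl() -> Optional[str]:
    # Stand-in for an EGL sharing context that succeeds.
    return "egl-context"


print(first_available([failing_glx, working_egl]))
```

If every factory returns None, the caller receives None and has to handle the missing context itself.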
Comment 7 Tyler Wilcock 2020-12-02 19:42:57 PST
Hmm, that was a good idea, but it didn't seem to work.  Installing the recommended driver for my hardware, org.freedesktop.Platform.GL.nvidia-450-80-02, and re-running the Minibrowser via Flatpak results in the same crash.
Comment 8 Philippe Normand 2020-12-03 07:51:26 PST
(In reply to Tyler Wilcock from comment #7)
> Hmm, that was a good idea, but it didn't seem to work.  Installing the
> recommended driver for my hardware,
> org.freedesktop.Platform.GL.nvidia-450-80-02, and re-running the Minibrowser
> via Flatpak results in the same crash.

Can you check if the nvidia extension was mounted correctly in the sandbox?

webkit-flatpak -c cat /.flatpak-info

It should be listed somewhere in that file.
Comment 9 Philippe Normand 2020-12-03 07:54:48 PST
If the nvidia extension is not listed in `runtime-extensions` then it's not been mounted in the sandbox.
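
That check can be automated with a few lines of Python that scan the output of `webkit-flatpak -c cat /.flatpak-info`. This is a rough sketch: it assumes `runtime-extensions` is a semicolon-separated list of `EXTENSION_ID=COMMIT` items, and the function names and the sample text in the usage note are made up for illustration.

```python
from typing import List


def runtime_extensions(flatpak_info_text: str) -> List[str]:
    """Return the extension IDs listed under the runtime-extensions key."""
    for line in flatpak_info_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "runtime-extensions":
            items = [item for item in value.strip().split(";") if item]
            # Drop the "=COMMIT" suffix, if any, and keep only the ID.
            return [item.split("=", 1)[0] for item in items]
    return []


def has_nvidia_extension(flatpak_info_text: str) -> bool:
    """True if any mounted runtime extension looks like an nvidia GL extension."""
    return any(".GL.nvidia-" in ext for ext in runtime_extensions(flatpak_info_text))
```

For a file containing the hypothetical line `runtime-extensions=org.freedesktop.Platform.GL.nvidia-455-38=abc123`, `has_nvidia_extension` returns True. A file without the key yields an empty list and False.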
Comment 10 Carlos Alberto Lopez Perez 2020-12-03 08:13:01 PST
(In reply to Philippe Normand from comment #9)
> If the nvidia extension is not listed in `runtime-extensions` then it's not
> been mounted in the sandbox.

how can you tell flatpak to mount it? is it needed to rebuild the webkit-sdk runtime?
Comment 11 Philippe Normand 2020-12-03 08:30:54 PST
(In reply to Carlos Alberto Lopez Perez from comment #10)
> (In reply to Philippe Normand from comment #9)
> > If the nvidia extension is not listed in `runtime-extensions` then it's not
> > been mounted in the sandbox.
> 
> how can you tell flatpak to mount it? is it needed to rebuild the webkit-sdk
> runtime?

Yeah, undo the change I did in Tools/buildstream/elements/flatpak/sdk.bst in https://bugs.webkit.org/show_bug.cgi?id=215763
Comment 12 Philippe Normand 2020-12-03 08:34:42 PST
The problem is that by bringing Mesa in our SDK, I had to get rid of the FDO GL extension which is able to enable the Nvidia extension too... So this won't be easy to revert...

We'd need to find a way to enable the nvidia extension without bringing back the dependency on the GL extension.
Comment 13 Carlos Alberto Lopez Perez 2020-12-03 08:50:15 PST
(In reply to Philippe Normand from comment #12)
> The problem is that by bringing Mesa in our SDK, I had to get rid of the FDO
> GL extension which is able to enable the Nvidia extension too... So this
> won't be easy to revert...
> 
> We'd need to find a way to enable the nvidia extension without bringing back
> the dependency on the GL extension.

Maybe we can have that mesa version installed in a special path like /lib/softGL and, before starting the tests, add that path to LD_LIBRARY_PATH and LIBGL_DRIVERS_PATH. So we only use the mesa we build for layout tests when '--display-server=xvfb' (the default) or '--display-server=weston' is passed. We did (do?) something like this for JHBuild.
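
A minimal sketch of the selection logic proposed here, assuming the bundled Mesa lives in the example path /lib/softGL. It only illustrates the idea in this comment and is not what the WebKit scripts do. The function and constant names are hypothetical.

```python
from typing import Mapping

SOFT_GL_DIR = "/lib/softGL"  # example path from the comment, not an existing SDK path
HEADLESS_SERVERS = {"xvfb", "weston"}


def prepend_path(env: dict, name: str, directory: str) -> None:
    """Put directory at the front of a colon-separated path variable."""
    existing = env.get(name)
    env[name] = directory + (":" + existing if existing else "")


def gl_environment(display_server: str, base_env: Mapping[str, str]) -> dict:
    """Return a copy of base_env, pointing at the bundled Mesa for headless runs."""
    env = dict(base_env)
    if display_server in HEADLESS_SERVERS:
        prepend_path(env, "LD_LIBRARY_PATH", SOFT_GL_DIR)
        prepend_path(env, "LIBGL_DRIVERS_PATH", SOFT_GL_DIR)
    return env
```

With 'wayland' or 'xorg' the environment is returned unchanged, so whatever GL stack the sandbox already provides is used.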

But when running the MiniBrowser or tests with '--display-server=wayland' or '--display-server=xorg' then use the host GL (FDO GL extensions)

WDYT?
Comment 14 Philippe Normand 2020-12-03 08:56:43 PST
That won't work... We can't assume libs from the host will be usable in the sandbox.
Comment 15 Philippe Normand 2020-12-03 08:58:41 PST
Ah sorry I misread your comment... Yeah maybe it could work...
Comment 16 Philippe Normand 2021-09-29 03:55:45 PDT
This was fixed by the SDK update to version 21.08.