Bug 201507 - [GTK] Crash in Nicosia::GC3DLayer::makeContextCurrent due to failure in EGL display creation
Status: REOPENED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK
Version: WebKit Nightly Build
Hardware: PC Linux
Importance: P2 Normal
Assignee: Adrian Perez
URL:
Keywords:
Depends on:
Blocks: 192523
Reported: 2019-09-05 08:14 PDT by Michael Catanzaro
Modified: 2020-08-05 08:55 PDT

See Also:


Attachments
Patch (7.33 KB, patch)
2020-07-22 07:26 PDT, Adrian Perez
Patch v2 (13.01 KB, patch)
2020-07-28 07:59 PDT, Adrian Perez

Description Michael Catanzaro 2019-09-05 08:14:07 PDT
Visit https://www.washingtonpost.com/technology/2019/08/26/spy-your-wallet-credit-cards-have-privacy-problem/?noredirect=on in Tech Preview (2.25.4) and wait about 10-15 seconds. The page will crash:

Core was generated by `/usr/libexec/webkit2gtk-4.0/WebKitWebProcess 17 31'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f127fa8fa58 in Nicosia::GC3DLayer::makeContextCurrent (
    this=<optimized out>) at /usr/include/c++/9.2.0/bits/unique_ptr.h:352
352	      get() const noexcept

(gdb) bt
#0  0x00007f127fa8fa58 in Nicosia::GC3DLayer::makeContextCurrent() (this=<optimized out>)
    at /usr/include/c++/9.2.0/bits/unique_ptr.h:352
#1  0x00007f127fa84b80 in WebCore::GraphicsContext3D::makeContextCurrent() (this=this@entry=0x7f11e79dc600)
    at /usr/include/c++/9.2.0/bits/unique_ptr.h:352
#2  0x00007f127fa84de7 in WebCore::GraphicsContext3D::GraphicsContext3D(WebCore::GraphicsContext3DAttributes, WebCore::HostWindow*, WebCore::GraphicsContext3D::RenderStyle, WebCore::GraphicsContext3D*)
    (this=0x7f11e79dc600, attributes=..., renderStyle=WebCore::GraphicsContext3D::RenderOffscreen, sharedContext=<optimized out>) at ../Source/WebCore/platform/graphics/texmap/GraphicsContext3DTextureMapper.cpp:114
#3  0x00007f127fa859de in WebCore::GraphicsContext3D::create(WebCore::GraphicsContext3DAttributes, WebCore::HostWindow*, WebCore::GraphicsContext3D::RenderStyle) (attributes=..., hostWindow=hostWindow@entry=
    0x7f1275590060, renderStyle=renderStyle@entry=WebCore::GraphicsContext3D::RenderOffscreen)
    at DerivedSources/ForwardingHeaders/wtf/RefCounted.h:140
#4  0x00007f127f092c1f in WebCore::WebGLRenderingContextBase::create(WebCore::CanvasBase&, WebCore::GraphicsContext3DAttributes&, WTF::String const&) (canvas=..., attributes=..., type=...)
    at ../Source/WebCore/html/canvas/WebGLRenderingContextBase.cpp:601
#5  0x00007f127ef78603 in WebCore::HTMLCanvasElement::createContextWebGL(WTF::String const&, WebCore::GraphicsContext3DAttributes&&) (this=0x7f122426b610, type=..., attrs=...) at ../Source/WebCore/html/HTMLCanvasElement.cpp:408
#6  0x00007f127ef7c7d7 in WebCore::HTMLCanvasElement::getContext(JSC::ExecState&, WTF::String const&, WTF::Vector<JSC::Strong<JSC::Unknown>, 0ul, WTF::CrashOnOverflow, 16ul>&&)
    (this=this@entry=0x7f122426b610, state=..., contextId=..., arguments=...)
    at ../Source/WebCore/html/HTMLCanvasElement.cpp:276
#7  0x00007f127e4aea1d in WebCore::jsHTMLCanvasElementPrototypeFunctionGetContextBody
    (throwScope=..., castedThis=0x7f12013c6380, state=0x7fff32f9c750)
    at DerivedSources/WebCore/JSHTMLCanvasElement.cpp:291
#8  0x00007f127e4aea1d in WebCore::IDLOperation<WebCore::JSHTMLCanvasElement>::call<WebCore::jsHTMLCanvasElementPrototypeFunctionGetContextBody> (operationName=0x7f127fcc3fa6 "getContext", state=...)
    at ../Source/WebCore/bindings/js/JSDOMOperation.h:53
#9  0x00007f127e4aea1d in WebCore::jsHTMLCanvasElementPrototypeFunctionGetContext(JSC::ExecState*)
    (state=0x7fff32f9c750) at DerivedSources/WebCore/JSHTMLCanvasElement.cpp:296
#10 0x00007f1227fff16b in  ()
#11 0x00007fff32f9c860 in  ()
#12 0x00007f127b9df421 in llint_op_call () at /usr/lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18
#13 0x0000000000000000 in  ()


(gdb) bt full
#0  0x00007f127fa8fa58 in Nicosia::GC3DLayer::makeContextCurrent() (this=<optimized out>)
    at /usr/include/c++/9.2.0/bits/unique_ptr.h:352
#1  0x00007f127fa84b80 in WebCore::GraphicsContext3D::makeContextCurrent() (this=this@entry=0x7f11e79dc600)
    at /usr/include/c++/9.2.0/bits/unique_ptr.h:352
#2  0x00007f127fa84de7 in WebCore::GraphicsContext3D::GraphicsContext3D(WebCore::GraphicsContext3DAttributes, WebCore::HostWindow*, WebCore::GraphicsContext3D::RenderStyle, WebCore::GraphicsContext3D*)
    (this=0x7f11e79dc600, attributes=..., renderStyle=WebCore::GraphicsContext3D::RenderOffscreen, sharedContext=<optimized out>) at ../Source/WebCore/platform/graphics/texmap/GraphicsContext3DTextureMapper.cpp:114
        ANGLEResources = 
          {MaxVertexAttribs = 913974467, MaxVertexUniformVectors = 2051993642, MaxVaryingVectors = 1025385667, MaxVertexTextureImageUnits = 2073325422, MaxCombinedTextureImageUnits = 0, MaxTextureImageUnits = 32767, MaxFragmentUniformVectors = 2102440001, MaxDrawBuffers = 32530, OES_standard_derivatives = 2103732800, OES_EGL_image_external = 32530, OES_EGL_image_external_essl3 = 0, NV_EGL_stream_consumer_external = 0, ARB_texture_rectangle = 7, EXT_blend_func_extended = 32530, EXT_draw_buffers = 0, EXT_frag_depth = 0, EXT_shader_texture_lod = 7, WEBGL_debug_shader_precision = 0, EXT_shader_framebuffer_fetch = 320393242, NV_shader_framebuffer_fetch = 21954, ARM_shader_framebuffer_fetch = 112, OVR_multiview2 = 0, EXT_YUV_target = 94, EXT_geometry_shader = 0, OES_texture_storage_multisample_2d_array = 0, ANGLE_texture_multisample = 0, ANGLE_multi_draw = 320393272, NV_draw_buffers = 21954, FragmentPrecisionHigh = 144, MaxVertexOutputVectors = 0, MaxFragmentInputVectors = 1, MinProgramTexelOffset = 0, MaxProgramTexelOffset = 7, MaxDualSourceDrawBuffers = 49, MaxViewsOVR = 0, HashFunction = 0x0, ArrayIndexClampingStrategy = (unknown: 0), MaxExpressionComplexity = 0, MaxCallStackDepth = 124, MaxFunctionParameters = 119, MinProgramTextureGatherOffset = 110, MaxProgramTextureGatherOffset = 91, MaxImageUnits = 2040611200, MaxVertexImageUniforms = 32530, MaxFragmentImageUniforms = 5, MaxComputeImageUniforms = 0, MaxCombinedImageUniforms = 94, MaxUniformLocations = 0, MaxCombinedShaderOutputResources = 2103732704, MaxComputeWorkGroupCount = {_M_elems = {32530, 21, 0}}, MaxComputeWorkGroupSize = {_M_elems = {2040219680, 32530, -16}}, MaxComputeUniformComponents = -1, MaxComputeTextureImageUnits = 2102445017, MaxComputeAtomicCounters = 32530, MaxComputeAtomicCounterBuffers = 2040779328, MaxVertexAtomicCounters = 32530, MaxFragmentAtomicCounters = 2040779328, MaxCombinedAtomicCounters = 32530, MaxAtomicCounterBindings = 329073248, MaxVertexAtomicCounterBuffers = 21954, 
MaxFragmentAtomicCounterBuffers = 2040728757, MaxCombinedAtomicCounterBuffers = 32530, MaxAtomicCounterBufferSize = 0, MaxUniformBufferBindings = 0, MaxShaderStorageBufferBindings = 327478688, MaxPointSize = 3.07641065e-41, MaxGeometryUniformComponents = 2117502338, MaxGeometryUniformBlocks = 21, MaxGeometryInputComponents = 2101557456, MaxGeometryOutputComponents = 32530, MaxGeometryOutputVertices = 0, MaxGeometryTotalOutputComponents = 0, MaxGeometryTextureImageUnits = -2139654388, MaxGeometryAtomicCounterBuffers = 32530, MaxGeometryAtomicCounters = 2144069542, MaxGeometryShaderStorageBlocks = 32530, MaxGeometryShaderInvocations = 855229712, MaxGeometryImageUniforms = 32767}
        range = {1968767072, 32530}
        precision = 32529
#3  0x00007f127fa859de in WebCore::GraphicsContext3D::create(WebCore::GraphicsContext3DAttributes, WebCore::HostWindow*, WebCore::GraphicsContext3D::RenderStyle)
    (attributes=..., hostWindow=hostWindow@entry=0x7f1275590060, renderStyle=renderStyle@entry=WebCore::GraphicsContext3D::RenderOffscreen) at DerivedSources/ForwardingHeaders/wtf/RefCounted.h:140
        initialized = true
        success = true
        contexts = 
                @0x7f12807b9c00: {m_start = 0, m_end = 0, m_buffer = {<WTF::VectorBufferBase<WebCore::GraphicsContext3D*>> = {m_buffer = 0x7f12807b9c20 <WebCore::activeContexts()::s_activeContexts+32>, m_capacity = 16, m_size = 0}, m_inlineBuffer = {{__data = "\000\000\000\000\000\000\000", __align = {<No data fields>}} <repeats 16 times>}}}
#4  0x00007f127f092c1f in WebCore::WebGLRenderingContextBase::create(WebCore::CanvasBase&, WebCore::GraphicsContext3DAttributes&, WTF::String const&) (canvas=..., attributes=..., type=...)
    at ../Source/WebCore/html/canvas/WebGLRenderingContextBase.cpp:601
        isPendingPolicyResolution = false
        hostWindow = 0x7f1275590060
        canvasElement = <optimized out>
        context = 
          {static isRefPtr = <error reading variable: Missing ELF symbol "WTF::RefPtr<WebCore::GraphicsContext3D, WTF::DumbPtrTraits<WebCore::GraphicsContext3D> >::isRefPtr".>, m_ptr = 0x0}
        extensions = <optimized out>
        renderingContext = <optimized out>
#5  0x00007f127ef78603 in WebCore::HTMLCanvasElement::createContextWebGL(WTF::String const&, WebCore::GraphicsContext3DAttributes&&) (this=0x7f122426b610, type=..., attrs=...) at ../Source/WebCore/html/HTMLCanvasElement.cpp:408
#6  0x00007f127ef7c7d7 in WebCore::HTMLCanvasElement::getContext(JSC::ExecState&, WTF::String const&, WTF::Vector<JSC::Strong<JSC::Unknown>, 0ul, WTF::CrashOnOverflow, 16ul>&&) (this=this@entry=0x7f122426b610, state=..., contextId=..., arguments=...) at ../Source/WebCore/html/HTMLCanvasElement.cpp:276
        scope = {<JSC::ExceptionScope> = {m_vm = @0x7f1226b00000}, <No data fields>}
        attributes = {alpha = true, depth = true, stencil = false, antialias = true, premultipliedAlpha = true, preserveDrawingBuffer = false, failIfMajorPerformanceCaveat = false, powerPreference = WebCore::GraphicsContext3DPowerPreference::Default, shareResources = false, isWebGL2 = false, noExtensions = true, devicePixelRatio = 1, initialPowerPreference = WebCore::GraphicsContext3DPowerPreference::Default}
        context = <optimized out>
#7  0x00007f127e4aea1d in WebCore::jsHTMLCanvasElementPrototypeFunctionGetContextBody (throwScope=..., castedThis=0x7f12013c6380, state=0x7fff32f9c750) at DerivedSources/WebCore/JSHTMLCanvasElement.cpp:291
        impl = @0x7f122426b610: {<WebCore::HTMLElement> = {<WebCore::StyledElement> = {<WebCore::Element> = {<WebCore::ContainerNode> = {<WTF::CanMakeWeakPtr<WebCore::ContainerNode>> = {m_weakPtrFactory = {m_impl = {static isRefPtr = <error reading variable: Missing ELF symbol "WTF::RefPtr<WTF::WeakPtrImpl, WTF::DumbPtrTraits<WTF::WeakPtrImpl> >::isRefPtr".>, m_ptr = 0x0}}}, <WebCore::Node> = {<WebCore::EventTarget> = {<WebCore::ScriptWrappable> = {m_wrapper = {m_impl = 0x7f11e7933990}}, _vptr.EventTarget = 0x7f12806873d8 <vtable for WebCore::HTMLCanvasElement+16>}, static s_refCountIncrement = 2, static s_refCountMask = 4294967294, m_refCountAndParentBit = 2, m_nodeFlags = 524302, m_parentNode = 0x0, m_treeScope = 0x7f1224208c30, m_previous = 0x0, m_next = 0x0, m_data = {m_renderer = 0x0, m_rareData = 0x0}}, m_firstChild = 0x0, m_lastChild = 0x0}, m_tagName = {m_impl = {static isRefPtr = <error reading variable: Missing ELF symbol "WTF::RefPtr<WebCore::QualifiedName::QualifiedNameImpl, WTF::DumbPtrTraits<WebCore::QualifiedName::QualifiedNameImpl> >::isRefPtr".>, m_ptr = 0x7f12755880c8}}, m_elementData = {static isRefPtr = <error reading variable: Missing ELF symbol "WTF::RefPtr<WebCore::ElementData, WTF::DumbPtrTraits<WebCore::ElementData> >::isRefPtr".>, m_ptr = 0x0}}, <No data fields>}, <No data fields>}, <WebCore::CanvasBase> = {_vptr.CanvasBase = 0x7f12806878f8 <vtable for WebCore::HTMLCanvasElement+1328>, m_context = {_M_t = {_M_t = {<std::_Tuple_impl<0, WebCore::CanvasRenderingContext*, std::default_delete<WebCore::CanvasRenderingContext> >> = {<std::_Tuple_impl<1, std::default_delete<WebCore::CanvasRenderingContext> >> = {<std::_Head_base<1, std::default_delete<WebCore::CanvasRenderingContext>, true>> = {<std::default_delete<WebCore::CanvasRenderingContext>> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::_Head_base<0, WebCore::CanvasRenderingContext*, false>> = {_M_head_impl = 0x0}, <No data fields>}, <No data fields>}}}, m_originClean = 
true, m_observers = {m_impl = {static m_maxLoad = 2, static m_minLoad = 6, m_table = 0x0, m_tableSize = 0, m_tableSizeMask = 0, m_keyCount = 0, m_deletedCount = 0}}}, m_dirtyRect = {m_location = {m_x = 0, m_y = 0}, m_size = {m_width = 0, m_height = 0}}, m_size = {m_width = 300, m_height = 150}, m_ignoreReset = false, m_usesDisplayListDrawing = false, m_tracksDisplayListReplay = false, m_imageBufferAssignmentLock = {static isHeldBit = 1 '\001', static hasParkedBit = 2 '\002', m_byte = {value = {<std::__atomic_base<unsigned char>> = {static _S_alignment = 1, _M_i = 0 '\000'}, static is_always_lock_free = true}}}, m_hasCreatedImageBuffer = false, m_didClearImageBuffer = false, m_imageBuffer = {_M_t = {_M_t = {<std::_Tuple_impl<0, WebCore::ImageBuffer*, std::default_delete<WebCore::ImageBuffer> >> = {<std::_Tuple_impl<1, std::default_delete<WebCore::ImageBuffer> >> = {<std::_Head_base<1, std::default_delete<WebCore::ImageBuffer>, true>> = {<std::default_delete<WebCore::ImageBuffer>> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::_Head_base<0, WebCore::ImageBuffer*, false>> = {_M_head_impl = 0x0}, <No data fields>}, <No data fields>}}}, m_contextStateSaver = {_M_t = {_M_t = {<std::_Tuple_impl<0, WebCore::GraphicsContextStateSaver*, std::default_delete<WebCore::GraphicsContextStateSaver> >> = {<std::_Tuple_impl<1, std::default_delete<WebCore::GraphicsContextStateSaver> >> = {<std::_Head_base<1, std::default_delete<WebCore::GraphicsContextStateSaver>, true>> = {<std::default_delete<WebCore::GraphicsContextStateSaver>> = {<No data fields>}, <No data fields>}, <No data fields>}, <std::_Head_base<0, WebCore::GraphicsContextStateSaver*, false>> = {_M_head_impl = 0x0}, <No data fields>}, <No data fields>}}}, m_presentedImage = {static isRefPtr = <error reading variable: Missing ELF symbol "WTF::RefPtr<WebCore::Image, WTF::DumbPtrTraits<WebCore::Image> >::isRefPtr".>, m_ptr = 0x0}, m_copiedImage = {static isRefPtr = <error reading variable: Missing ELF symbol 
"WTF::RefPtr<WebCore::Image, WTF::DumbPtrTraits<WebCore::Image> >::isRefPtr".>, m_ptr = 0x0}}
        contextId = {static MaxLength = 2147483647, m_impl = {static isRefPtr = <error reading variable: Missing ELF symbol "WTF::RefPtr<WTF::StringImpl, WTF::DumbPtrTraits<WTF::StringImpl> >::isRefPtr".>, m_ptr = 0x7f11e79221e0}}
        arguments = {<WTF::VectorBuffer<JSC::Strong<JSC::Unknown>, 0>> = {<WTF::VectorBufferBase<JSC::Strong<JSC::Unknown> >> = {m_buffer = 0x0, m_capacity = 0, m_size = 0}, <No data fields>}, <No data fields>}
        throwScope = {<JSC::ExceptionScope> = {m_vm = @0x7f1226b00000}, <No data fields>}
        thisObject = 0x7f12013c6380
#8  0x00007f127e4aea1d in WebCore::IDLOperation<WebCore::JSHTMLCanvasElement>::call<WebCore::jsHTMLCanvasElementPrototypeFunctionGetContextBody> (operationName=0x7f127fcc3fa6 "getContext", state=...) at ../Source/WebCore/bindings/js/JSDOMOperation.h:53
        throwScope = {<JSC::ExceptionScope> = {m_vm = @0x7f1226b00000}, <No data fields>}
        thisObject = 0x7f12013c6380
#9  0x00007f127e4aea1d in WebCore::jsHTMLCanvasElementPrototypeFunctionGetContext(JSC::ExecState*) (state=0x7fff32f9c750) at DerivedSources/WebCore/JSHTMLCanvasElement.cpp:296
#10 0x00007f1227fff16b in  ()
#11 0x00007fff32f9c860 in  ()
#12 0x00007f127b9df421 in llint_op_call () at /usr/lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18
#13 0x0000000000000000 in  ()
Comment 1 Michael Catanzaro 2019-09-20 07:47:03 PDT
Today I noticed every attempt to view Google Maps -- even in new web views -- resulted in this crash. I had to restart the entire browser before viewing Google Maps worked again.
Comment 2 Michael Catanzaro 2019-09-21 10:01:36 PDT
(In reply to Michael Catanzaro from comment #1)
> Today I noticed every attempt to view Google Maps -- even in new web views
> -- resulted in this crash. I had to restart the entire browser before
> viewing Google Maps worked again.

And again today.
Comment 3 Michael Catanzaro 2019-10-04 06:28:41 PDT
OK I have a better reproducer! Visit https://q13fox.com/ and the page will crash immediately.
Comment 4 Michael Catanzaro 2019-10-04 06:46:20 PDT
OK wow, it's related to bug #202362. I think this bug only occurs when the browser gets into the "bad state" where bug #202362 occurs. Seems that some pages display white while others crash with this trace. In my journal, I see the same warnings from bug #202362:

Oct 04 08:44:16 chargestone-cave org.gnome.Epiphany.Devel.desktop[1896]: Cannot create EGL context: invalid display (last error: EGL_SUCCESS)

<web process crashes here>

Oct 04 08:44:16 chargestone-cave org.gnome.Epiphany.Devel.desktop[1896]: Cannot get default EGL display: EGL_BAD_PARAMETER
Comment 5 Michael Catanzaro 2019-10-07 07:14:05 PDT
I hit this crash 15-20 times this weekend.

It would be nice if a relevant developer would acknowledge this issue.
Comment 6 Carlos Garcia Campos 2019-10-08 02:43:17 PDT
This seems to be crashing because m_glContext is nullptr in GC3DLayer::makeContextCurrent(), which is called right after Nicosia::GC3DLayer is created. The only way that can happen is if GLContext::createOffscreenContext() failed in the constructor, so we need to know why creating an offscreen context fails on your system. The about:gpu info from your system would help here.
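
As a rough illustration of the failure mode described in this comment, here is a minimal, self-contained sketch. FakeGLContext, FakeGC3DLayer and the displayIsValid flag are hypothetical stand-ins invented for this example; they are not the real WebKit classes, and the guard shown is only one possible way to avoid the null dereference, not necessarily what the eventual patch does.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a GL context; not the real WebCore::GLContext.
struct FakeGLContext {
    bool makeContextCurrent() { return true; }
};

// Simulates an offscreen-context factory that returns nullptr when no
// usable EGL display is available (the situation described above).
std::unique_ptr<FakeGLContext> createOffscreenContext(bool displayIsValid)
{
    if (!displayIsValid)
        return nullptr;
    return std::make_unique<FakeGLContext>();
}

// Hypothetical stand-in for the layer that owns the context.
class FakeGC3DLayer {
public:
    explicit FakeGC3DLayer(bool displayIsValid)
        : m_glContext(createOffscreenContext(displayIsValid))
    {
    }

    // Guarded variant: report failure instead of dereferencing a null
    // unique_ptr, which is what produces the SIGSEGV in the backtrace.
    bool makeContextCurrent()
    {
        if (!m_glContext)
            return false;
        return m_glContext->makeContextCurrent();
    }

private:
    std::unique_ptr<FakeGLContext> m_glContext;
};
```

With a guard like this, a caller can treat a false return as "context creation failed" and bail out of WebGL context creation rather than crashing the web process.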
Comment 7 Carlos Garcia Campos 2019-10-08 02:44:59 PDT
Ah! I had forgotten the previous comments, so

Cannot create EGL context: invalid display (last error: EGL_SUCCESS)

that helps a bit
Comment 8 Carlos Garcia Campos 2019-10-08 02:50:00 PDT
about:gpu info is still useful in any case
Comment 9 Carlos Garcia Campos 2019-10-08 02:57:52 PDT
The origin is 

PlatformDisplayLibWPE: could not create the EGL display: EGL_SUCCESS.

which happens when 

m_eglDisplay = eglGetDisplay(wpe_renderer_backend_egl_get_native_display(m_backend));

fails. In that case we don't initialize egl display, so it's initialized on demand when PlatformDisplay::eglDisplay is called, but eglGetDisplay(EGL_DEFAULT_DISPLAY) fails too (fortunately, because we don't really want to use the default display in this case).

So, the thing is why eglGetDisplay(wpe_renderer_backend_egl_get_native_display(m_backend)) is failing.
Comment 10 Miguel Gomez 2019-10-08 03:16:02 PDT
(In reply to Carlos Garcia Campos from comment #9)
> The origin is 
> 
> PlatformDisplayLibWPE: could not create the EGL display: EGL_SUCCESS.
> 
> which happens when 
> 
> m_eglDisplay =
> eglGetDisplay(wpe_renderer_backend_egl_get_native_display(m_backend));
> 
> fails. In that case we don't initialize egl display, so it's initialized on
> demand when PlatformDisplay::eglDisplay is called, but
> eglGetDisplay(EGL_DEFAULT_DISPLAY) fails too (fortunately, because we don't
> really want to use the default display in this case).
> 
> So, the thing is why
> eglGetDisplay(wpe_renderer_backend_egl_get_native_display(m_backend)) is
> failing.

Were you able to reproduce this, Carlos? Using which version exactly? I've tried ToT and I'm not able to see it. I'm now about to test 2.26.

Anyway, it's true that the problem is that we can't create a gl context because we can't get a valid display. What bothers me is this line:

Cannot get default EGL display: EGL_BAD_PARAMETER

because, at least according to the documentation, eglGetDisplay should not generate an EGL_BAD_PARAMETER error, which could mean that the error is coming from a previous EGL call.
Comment 11 Miguel Gomez 2019-10-08 03:16:47 PDT
(In reply to Michael Catanzaro from comment #3)
> OK I have a better reproducer! Visit https://q13fox.com/ and the page will
> crash immediately.

Sadly this content is geoblocked :(
Comment 12 Carlos Garcia Campos 2019-10-08 03:23:49 PDT
No, I can't reproduce it.
Comment 13 Carlos Garcia Campos 2019-10-08 03:38:46 PDT
This is indeed a duplicate of bug #202362, it shows white pages for websites with normal AC content, and crashes when using WebGL. I've attached a patch to bug #202362 to handle the EGL display initialization failure and disable AC. That patch fixes both the white pages and this crash, so let's use this bug to figure out why EGL display creation fails.
Comment 14 Carlos Garcia Campos 2019-10-08 03:52:32 PDT
wpe_renderer_backend_egl_create() does the wayland display connection for the given fd (wl_display_connect_to_fd) and the returned WaylandDisplay is what is passed to eglGetDisplay(). So, it might be failing to connect to the wayland nested compositor and we are passing nullptr to eglGetDisplay(). Michael, it would help if you could add printfs to PlatformDisplayLibWPE::initialize() to show the hostFd and the native display returned by wpe_renderer_backend_egl_get_native_display() as a pointer.
Comment 15 Carlos Garcia Campos 2019-10-08 05:00:53 PDT
I'm checking logs reported in bug #202362.

Oct 04 08:14:31 chargestone-cave org.gnome.Epiphany.Devel.desktop[1896]: Cannot get default EGL display: EGL_BAD_PARAMETER
Oct 04 08:14:31 chargestone-cave org.gnome.Epiphany.Devel.desktop[1896]: Cannot create EGL context: invalid display (last error: EGL_SUCCESS)

This is the initialization of the default display, which is also failing, so the problem is not specific to the wpe renderer. EGL_BAD_PARAMETER must be a previous error, as Miguel suggested, and the only egl call that should happen before the egl display initialization is eglGetPlatformDisplay. It returns EGL_BAD_PARAMETER when the given platform is not supported, but if that were the case it would always fail. I see we are checking for EGL_KHR_platform_wayland and always passing EGL_PLATFORM_WAYLAND_KHR even when using eglGetPlatformDisplayEXT, but that shouldn't be a problem because EGL_PLATFORM_WAYLAND_KHR and EGL_PLATFORM_WAYLAND_EXT are both defined as 0x31D8.

Oct 04 08:14:32 chargestone-cave org.gnome.Epiphany.Devel.desktop[1896]: Cannot get default EGL display: EGL_BAD_PARAMETER
Oct 04 08:14:32 chargestone-cave org.gnome.Epiphany.Devel.desktop[1896]: PlatformDisplayLibWPE: could not create the EGL display: EGL_SUCCESS.

And this is creating the shared display for compositing. In this case I don't know where the EGL_BAD_PARAMETER comes from.
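
The point above about the two Wayland platform enums can be checked at compile time. This is only a sketch: the constants are redefined locally under made-up names (kPlatformWaylandKHR, kPlatformWaylandEXT) with the value quoted in this comment, so that the snippet builds without the EGL headers. In real code the macros from the EGL extension headers would be compared directly.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of the enum values quoted in the comment above.
// The names are illustrative; the real macros live in the EGL extension headers.
constexpr std::uint32_t kPlatformWaylandKHR = 0x31D8; // EGL_PLATFORM_WAYLAND_KHR
constexpr std::uint32_t kPlatformWaylandEXT = 0x31D8; // EGL_PLATFORM_WAYLAND_EXT

// Fails the build if the two values ever diverge, which would make passing
// the KHR enum to the EXT entry point incorrect.
static_assert(kPlatformWaylandKHR == kPlatformWaylandEXT,
    "KHR and EXT Wayland platform enums are expected to be identical");

constexpr bool platformEnumsMatch()
{
    return kPlatformWaylandKHR == kPlatformWaylandEXT;
}
```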
Comment 16 Michael Catanzaro 2019-10-08 09:12:10 PDT
(In reply to Michael Catanzaro from comment #3)
> OK I have a better reproducer! Visit https://q13fox.com/ and the page will
> crash immediately.

What I didn't realize when I added that comment is the bug is only reproducible once Epiphany gets into the "bad state" such that bug #202362 also occurs. Then it will crash 100% until Epiphany is restarted. But if you're not yet in this bad state, it works fine.

(In reply to Carlos Garcia Campos from comment #8)
> about:gpu info is still useful in any case

I can add a patch to our runtime to add about:gpu, if you want to provide a patch that builds against 2.26.1.

I won't update the runtime to 2.27.1 due to (a) the GitLab CI regression, it will become impossible to use Epiphany to develop Epiphany, and (b) the MSE regressions, I like to watch YouTube. So I want to keep Tech Preview on 2.26 for now.

(In reply to Carlos Garcia Campos from comment #14)
> Michael, it would help if you could add printfs to
> PlatformDisplayLibWPE::initialize() to show the hostFd and the native
> display returned by wpe_renderer_backend_egl_get_native_display() as a
> pointer.

This is more work than just adding a patch locally though. A local build isn't good enough because this bug is not reproducible; we need the patch in the real Tech Preview build for the debug to be available next time I notice the issue.

So I will add a debug patch to the runtime tomorrow. Please confirm that you still need it to print (a) hostFd, and (b) native display as a pointer, and (c) nothing else. If more would be helpful, now is the best time to add it because it would be nice to update the patch as few times as possible.
Comment 17 Michael Catanzaro 2019-10-08 09:19:02 PDT
(In reply to Michael Catanzaro from comment #16)
> I won't update the runtime to 2.27.1 due to (a) the GitLab CI regression, it
> will become impossible to use Epiphany to develop Epiphany, and (b) the MSE
> regressions, I like to watch YouTube. So I want to keep Tech Preview on 2.26
> for now.

That's bug #202594, bug #201726, bug #202078, and bug #202079.

(We also really need bug #202321, though that one is already broken in 2.26.)
Comment 18 Carlos Garcia Campos 2019-10-09 00:14:25 PDT
(In reply to Michael Catanzaro from comment #16)
> (In reply to Michael Catanzaro from comment #3)
> > OK I have a better reproducer! Visit https://q13fox.com/ and the page will
> > crash immediately.
> 
> What I didn't realize when I added that comment is the bug is only
> reproducible once Epiphany gets into the "bad state" such that bug #202362
> also occurs. Then it will crash 100% until Epiphany is restarted. But if
> you're not yet in this bad state, it works fine.

The fact that it doesn't always happen, and it starts happening after a while, makes me think it's not a problem with the EGL config, because it's always the same. It could be something like OOM or that you run out of file descriptors or something like that.

> (In reply to Carlos Garcia Campos from comment #8)
> > about:gpu info is still useful in any case
> 
> I can add a patch to our runtime to add about:gpu, if you want to provide a
> patch that builds against 2.26.1.
> 
> I won't update the runtime to 2.27.1 due to (a) the GitLab CI regression, it
> will become impossible to use Epiphany to develop Epiphany, and (b) the MSE
> regressions, I like to watch YouTube. So I want to keep Tech Preview on 2.26
> for now.
> 
> (In reply to Carlos Garcia Campos from comment #14)
> > Michael, it would help if you could add printfs to
> > PlatformDisplayLibWPE::initialize() to show the hostFd and the native
> > display returned by wpe_renderer_backend_egl_get_native_display() as a
> > pointer.
> 
> This is more work than just adding a patch locally though. A local build
> isn't good enough because this bug is not reproducible; we need the patch in
> the real Tech Preview build for the debug to be available next time I notice
> the issue.
> 
> So I will add a debug patch to the runtime tomorrow. Please confirm that you
> still need it to print (a) hostFd, and (b) native display as a pointer, and
> (c) nothing else. If more would be helpful, now is the best time to add it
> because it would be nice to update the patch as few times as possible.

No, I don't really need that, because it also happens when initializing the main display, so it's not related to wpe renderer.
Comment 19 Michael Catanzaro 2019-10-09 08:01:27 PDT
(In reply to Carlos Garcia Campos from comment #18)
> The fact that it doesn't always happen, and it starts happening after a while,
> makes me think it's not a problem with the EGL config, because it's always
> the same. It could be something like OOM or that you run out of file
> descriptors or something like that.

It's definitely not OOM.

I can check to see if an fd leak might be a problem next time this happens. (Sadly I just had it in a bad state a couple minutes ago and restarted so that I could use some website.)
Comment 20 Carlos Alberto Lopez Perez 2019-10-09 08:13:56 PDT
(In reply to Michael Catanzaro from comment #19)
> (In reply to Carlos Garcia Campos from comment #18)
> > The fact that it doesn't always happen, and it starts happening after a while,
> > makes me think it's not a problem with the EGL config, because it's always
> > the same. It could be something like OOM or that you run out of file
> > descriptors or something like that.
> 
> It's definitely not OOM.
> 
> I can check to see if an fd leak might be a problem next time this happens.
> (Sadly I just had it in a bad state a couple minutes ago and restarted so
> that I could use some website.)

It could also be related to flatpak or some resource limit inside the container.

Can you reproduce it outside of flatpak?
Comment 21 Michael Catanzaro 2019-10-09 08:23:08 PDT
(In reply to Carlos Alberto Lopez Perez from comment #20)
> It could also be related to flatpak or some resource limit inside the
> container.
> 
> Can you reproduce it outside of flatpak?

I am not going to try. The only way to notice bugs like this is to use the browser all day, every day, for several days. That's too long for me to test something.
Comment 22 Michael Catanzaro 2019-10-15 02:33:26 PDT
I have it in the error state again right now and the problem is not fd exhaustion... the UI process has <100 open fds.

I will try to resist the urge to restart my browser to fix this for about half an hour, in case someone sees this immediately and wants me to try something else for debugging.
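
For anyone wanting to repeat the fd check mentioned in this comment, one possible approach on Linux is to count the entries under /proc/<pid>/fd. The helper below is a minimal sketch using C++17 std::filesystem; the function name countOpenFileDescriptors is made up for this example, and the thread does not say how the count was actually obtained.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

// Counts the entries in /proc/<pid>/fd (Linux-specific).
// Pass "self" to inspect the calling process, or a numeric pid as a string.
// Returns -1 if the directory cannot be read (no such process, no
// permission, or /proc not mounted).
long countOpenFileDescriptors(const std::string& pid = "self")
{
    namespace fs = std::filesystem;
    std::error_code ec;
    fs::directory_iterator it("/proc/" + pid + "/fd", ec);
    const fs::directory_iterator end;
    if (ec)
        return -1;

    long count = 0;
    while (it != end) {
        ++count;
        it.increment(ec);
        if (ec)
            return -1;
    }
    return count;
}
```

Note that when a process inspects itself, the count includes the descriptor used to read the directory. Comparing the result against the limit reported by `ulimit -n` gives a quick indication of whether descriptor exhaustion is plausible.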
Comment 23 Michael Catanzaro 2019-10-29 07:05:06 PDT
Well I don't know what we can do about this. I'm hitting the bug now for the first time in about a week, but can't think of anything to do other than restart my browser so I can browse the web again and thereby throw away the opportunity to debug it for another week.

It seems nobody else has any idea how we can even attempt to debug it, either.
Comment 24 Michael Catanzaro 2019-12-03 07:16:27 PST
I have Epiphany in the bad state again right now. I discovered that, while the fallback to disable AC mode in this state that was implemented in bug #202362 *usually* works properly, it currently fails on https://www.linuxjournal.com/content/job-control-bash-feature-you-only-think-you-dont-need and we still get the same crash in comment #0. I think there is something about the web content on this page that triggers the crash.
Comment 25 Michael Catanzaro 2019-12-04 05:10:00 PST
(In reply to Michael Catanzaro from comment #24)
> I think there is something about the web content on this page that triggers the crash.

I have it in the bad state again today. https://riot.igalia.com/ is also a guaranteed crash. I noticed this warning:

** (WebKitWebProcess:16376): CRITICAL **: 07:08:38.759: gst_gl_display_egl_new_with_egl_display: assertion 'display != NULL' failed
Comment 26 Michael Catanzaro 2019-12-04 05:34:20 PST
Split into bug #204848
Comment 27 Philippe Normand 2019-12-05 05:15:53 PST
(In reply to Michael Catanzaro from comment #26)
> Split into bug #204848

I don't really see the point of filing another bug?
Shouldn't there at least be an ASSERT in PlatformDisplay::sharedDisplayForCompositing()?
Comment 28 Michael Catanzaro 2019-12-05 07:25:46 PST
I'd do a RELEASE_ASSERT, since we keep hitting it.

Bug #204848: ensure WebKit disables compositing instead of crashing when in this bad state

This bug: ensure WebKit does not enter this broken state in the first place
Comment 29 Michael Catanzaro 2019-12-08 08:38:27 PST
Here is the output from my about:gpu. I think we could use some work on the WebKit side to make this easier to copy/paste into bug reports; it's not very readable by default. I've added newlines to the output to make it easier to read.

Version Information

WebKit version
WebKitGTK 2.27.3 (tarball)

Operating system
Linux 5.3.13-300.fc31.x86_64 #1 SMP Mon Nov 25 17:25:25 UTC 2019 x86_64

Desktop
GNOME

Cairo version
1.16.0 (build) 1.16.0 (runtime)

GTK version
3.24.13 (build) 3.24.13 (runtime)

WPE version
1.4.0 (using fdo backend 1.4.0)


Display Information

Type
Wayland

Screen geometry
0,0 1920x1080

Screen work area
0,0 1920x1080

Depth
32

Bits per color component
8

DPI
96.00


Hardware Acceleration Information

Policy
on demand

WebGL enabled
Yes

API
OpenGL

Native interface
EGL

GL_RENDERER
Radeon RX 570 Series (POLARIS10, DRM 3.33.0, 5.3.13-300.fc31.x86_64, LLVM 9.0.0)

GL_VENDOR
X.Org

GL_VERSION
4.5 (Core Profile) Mesa 19.2.1

GL_SHADING_LANGUAGE_VERSION
4.50

GL_EXTENSIONS
GL_AMD_conservative_depth GL_AMD_depth_clamp_separate GL_AMD_draw_buffers_blend GL_AMD_framebuffer_multisample_advanced GL_AMD_gpu_shader_int64 GL_AMD_multi_draw_indirect GL_AMD_performance_monitor GL_AMD_pinned_memory GL_AMD_query_buffer_object GL_AMD_seamless_cubemap_per_texture GL_AMD_shader_stencil_export GL_AMD_shader_trinary_minmax GL_AMD_texture_texture4 GL_AMD_vertex_shader_layer GL_AMD_vertex_shader_viewport_index GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ARB_ES2_compatibility GL_ARB_ES3_1_compatibility GL_ARB_ES3_2_compatibility GL_ARB_ES3_compatibility GL_ARB_arrays_of_arrays GL_ARB_base_instance GL_ARB_bindless_texture GL_ARB_blend_func_extended GL_ARB_buffer_storage GL_ARB_clear_buffer_object GL_ARB_clear_texture GL_ARB_clip_control GL_ARB_color_buffer_float GL_ARB_compressed_texture_pixel_storage GL_ARB_compute_shader GL_ARB_compute_variable_group_size GL_ARB_conditional_render_inverted GL_ARB_conservative_depth GL_ARB_copy_buffer GL_ARB_copy_image GL_ARB_cull_distance GL_ARB_debug_output GL_ARB_depth_buffer_float GL_ARB_depth_clamp GL_ARB_derivative_control GL_ARB_direct_state_access GL_ARB_draw_buffers GL_ARB_draw_buffers_blend GL_ARB_draw_elements_base_vertex GL_ARB_draw_indirect GL_ARB_draw_instanced GL_ARB_enhanced_layouts GL_ARB_explicit_attrib_location GL_ARB_explicit_uniform_location GL_ARB_fragment_coord_conventions GL_ARB_fragment_layer_viewport GL_ARB_fragment_shader GL_ARB_framebuffer_no_attachments GL_ARB_framebuffer_object GL_ARB_framebuffer_sRGB GL_ARB_get_program_binary GL_ARB_get_texture_sub_image GL_ARB_gpu_shader5 GL_ARB_gpu_shader_fp64 GL_ARB_gpu_shader_int64 GL_ARB_half_float_pixel GL_ARB_half_float_vertex GL_ARB_indirect_parameters GL_ARB_instanced_arrays GL_ARB_internalformat_query GL_ARB_internalformat_query2 GL_ARB_invalidate_subdata GL_ARB_map_buffer_alignment GL_ARB_map_buffer_range GL_ARB_multi_bind GL_ARB_multi_draw_indirect GL_ARB_occlusion_query2 GL_ARB_parallel_shader_compile 
GL_ARB_pipeline_statistics_query GL_ARB_pixel_buffer_object GL_ARB_point_sprite GL_ARB_polygon_offset_clamp GL_ARB_program_interface_query GL_ARB_provoking_vertex GL_ARB_query_buffer_object GL_ARB_robust_buffer_access_behavior GL_ARB_robustness GL_ARB_sample_shading GL_ARB_sampler_objects GL_ARB_seamless_cube_map GL_ARB_seamless_cubemap_per_texture GL_ARB_separate_shader_objects GL_ARB_shader_atomic_counter_ops GL_ARB_shader_atomic_counters GL_ARB_shader_ballot GL_ARB_shader_bit_encoding GL_ARB_shader_clock GL_ARB_shader_draw_parameters GL_ARB_shader_group_vote GL_ARB_shader_image_load_store GL_ARB_shader_image_size GL_ARB_shader_objects GL_ARB_shader_precision GL_ARB_shader_stencil_export GL_ARB_shader_storage_buffer_object GL_ARB_shader_subroutine GL_ARB_shader_texture_image_samples GL_ARB_shader_texture_lod GL_ARB_shader_viewport_layer_array GL_ARB_shading_language_420pack GL_ARB_shading_language_packing GL_ARB_sparse_buffer GL_ARB_stencil_texturing GL_ARB_sync GL_ARB_tessellation_shader GL_ARB_texture_barrier GL_ARB_texture_buffer_object GL_ARB_texture_buffer_object_rgb32 GL_ARB_texture_buffer_range GL_ARB_texture_compression_bptc GL_ARB_texture_compression_rgtc GL_ARB_texture_cube_map_array GL_ARB_texture_filter_anisotropic GL_ARB_texture_float GL_ARB_texture_gather GL_ARB_texture_mirror_clamp_to_edge GL_ARB_texture_multisample GL_ARB_texture_non_power_of_two GL_ARB_texture_query_levels GL_ARB_texture_query_lod GL_ARB_texture_rectangle GL_ARB_texture_rg GL_ARB_texture_rgb10_a2ui GL_ARB_texture_stencil8 GL_ARB_texture_storage GL_ARB_texture_storage_multisample GL_ARB_texture_swizzle GL_ARB_texture_view GL_ARB_timer_query GL_ARB_transform_feedback2 GL_ARB_transform_feedback3 GL_ARB_transform_feedback_instanced GL_ARB_transform_feedback_overflow_query GL_ARB_uniform_buffer_object GL_ARB_vertex_array_bgra GL_ARB_vertex_array_object GL_ARB_vertex_attrib_64bit GL_ARB_vertex_attrib_binding GL_ARB_vertex_buffer_object GL_ARB_vertex_shader 
GL_ARB_vertex_type_10f_11f_11f_rev GL_ARB_vertex_type_2_10_10_10_rev GL_ARB_viewport_array GL_ATI_blend_equation_separate GL_ATI_meminfo GL_ATI_texture_float GL_ATI_texture_mirror_once GL_EXT_abgr GL_EXT_blend_equation_separate GL_EXT_depth_bounds_test GL_EXT_draw_buffers2 GL_EXT_draw_instanced GL_EXT_framebuffer_blit GL_EXT_framebuffer_multisample GL_EXT_framebuffer_multisample_blit_scaled GL_EXT_framebuffer_object GL_EXT_framebuffer_sRGB GL_EXT_memory_object GL_EXT_memory_object_fd GL_EXT_packed_depth_stencil GL_EXT_packed_float GL_EXT_pixel_buffer_object GL_EXT_polygon_offset_clamp GL_EXT_provoking_vertex GL_EXT_semaphore GL_EXT_semaphore_fd GL_EXT_shader_image_load_formatted GL_EXT_shader_integer_mix GL_EXT_texture_array GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_rgtc GL_EXT_texture_compression_s3tc GL_EXT_texture_filter_anisotropic GL_EXT_texture_integer GL_EXT_texture_mirror_clamp GL_EXT_texture_sRGB GL_EXT_texture_sRGB_R8 GL_EXT_texture_sRGB_decode GL_EXT_texture_shared_exponent GL_EXT_texture_snorm GL_EXT_texture_swizzle GL_EXT_timer_query GL_EXT_transform_feedback GL_EXT_vertex_array_bgra GL_EXT_vertex_attrib_64bit GL_EXT_window_rectangles GL_IBM_multimode_draw_arrays GL_KHR_blend_equation_advanced GL_KHR_context_flush_control GL_KHR_debug GL_KHR_no_error GL_KHR_parallel_shader_compile GL_KHR_robust_buffer_access_behavior GL_KHR_robustness GL_KHR_texture_compression_astc_ldr GL_KHR_texture_compression_astc_sliced_3d GL_MESA_pack_invert GL_MESA_shader_integer_functions GL_MESA_texture_signed_rgba GL_NVX_gpu_memory_info GL_NV_conditional_render GL_NV_depth_clamp GL_NV_packed_depth_stencil GL_NV_texture_barrier GL_NV_vdpau_interop GL_OES_EGL_image GL_S3_s3tc

EGL_VERSION
1.5

EGL_VENDOR
Mesa Project

EGL_EXTENSIONS
EGL_EXT_device_base EGL_EXT_device_enumeration EGL_EXT_device_query EGL_EXT_platform_base EGL_KHR_client_get_all_proc_addresses EGL_EXT_client_extensions EGL_KHR_debug EGL_EXT_platform_wayland EGL_EXT_platform_x11 EGL_MESA_platform_gbm EGL_MESA_platform_surfaceless EGL_EXT_platform_device EGL_ANDROID_blob_cache EGL_ANDROID_native_fence_sync EGL_EXT_buffer_age EGL_EXT_create_context_robustness EGL_EXT_image_dma_buf_import EGL_EXT_swap_buffers_with_damage EGL_KHR_cl_event2 EGL_KHR_config_attribs EGL_KHR_create_context EGL_KHR_create_context_no_error EGL_KHR_fence_sync EGL_KHR_get_all_proc_addresses EGL_KHR_gl_colorspace EGL_KHR_gl_renderbuffer_image EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_3D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_image_base EGL_KHR_no_config_context EGL_KHR_reusable_sync EGL_KHR_surfaceless_context EGL_KHR_swap_buffers_with_damage EGL_EXT_pixel_format_float EGL_KHR_wait_sync EGL_MESA_configless_context EGL_MESA_drm_image EGL_MESA_image_dma_buf_export EGL_MESA_query_driver EGL_WL_bind_wayland_display EGL_WL_create_wayland_buffer_from_image
Comment 30 Michael Catanzaro 2019-12-16 14:29:05 PST
Hi, it seems this issue is stalled.

As far as I know, this is a regression from the switch to WPE renderer? (I'm not certain of this, but I definitely never noticed the issue before this September, so the timing seems right.)

For Epiphany, I am aiming to reenable AC mode in upcoming Epiphany 3.34.3 and 3.32.6, because we've discovered various sites that require 3D transforms, and all known AC mode bugs other than this one have been fixed. This bug is my remaining hesitation. I wonder if we should switch WebKitGTK's build default from WPE renderer back to the WaylandCompositor until we have time to track this down and debug it? I'm a bit concerned because Epiphany does not have any control over whether WPE renderer or WaylandCompositor gets used.
Comment 31 Carlos Garcia Campos 2019-12-17 00:43:24 PST
(In reply to Michael Catanzaro from comment #30)
> Hi, it seems this issue is stalled.

I don't think I can do more without a way to reproduce it.

> As far as I know, this is a regression from the switch to WPE renderer? (I'm
> not certain of this, but I definitely never noticed the issue before this
> September, so the timing seems right.)

This is just a guess, we would need to confirm it. Since you seem to be the only one affected, we would need you to build WebKit without WPE renderer and check if the problem is gone.

> For Epiphany, I am aiming to reenable AC mode in upcoming Epiphany 3.34.3
> and 3.32.6, because we've discovered various sites that require 3D
> transforms, and all known AC mode bugs other than this one have been fixed.
> This bug is my remaining hesitation. I wonder if we should switch
> WebKitGTK's build default from WPE renderer back to the WaylandCompositor
> until we have time to track this down and debug it? I'm a bit concerned
> because Epiphany does not have any control over whether WPE renderer or
> WaylandCompositor gets used.

Let's confirm it's a regression from the WPE renderer, because I don't think it is.
Comment 32 Michael Catanzaro 2019-12-17 07:02:41 PST
(In reply to Carlos Garcia Campos from comment #31)
> This is just a guess, we would need to confirm it. Since you seem to be the
> only one affected, we would need you to build WebKit without WPE renderer
> and check if the problem is gone.

Problem is, without a reproducer, it's impossible to ever know for sure if it's fixed. I can only guess based on whether I see pages that include video crashing (or, in the unlikely event I'm running in a terminal, if I see the error messages appearing there). I haven't noticed the issue for about two weeks now, when it happened two or three days in a row, and then there were several calm weeks before that.

Since multiple weeks without hitting the bug isn't evidence that it's fixed, might I suggest we build some debugging into WPEBackend-fdo to try to figure out what's going wrong next time this happens? Something must be going wrong in Instance::initialize in ws.cpp. Instead of returning false when initialization fails, how about we make this a fatal error and crash the web process instead? If we change each return false to g_error() then the backtrace from the crash will tell us exactly where it's failing inside this function. I don't see any other way to proceed, because the only way we have to indicate failure other than crashing is the bool return value, which doesn't convey enough information about the failure:

bool Instance::initialize(EGLDisplay eglDisplay)
{
    if (m_eglDisplay == eglDisplay)
        return true;

    if (m_eglDisplay != EGL_NO_DISPLAY) {
        g_warning("Multiple EGL displays are not supported.\n");
        return false;
    }

    const char* extensions = eglQueryString(eglDisplay, EGL_EXTENSIONS);
    if (isEGLExtensionSupported(extensions, "EGL_WL_bind_wayland_display")) {
        s_eglBindWaylandDisplayWL = reinterpret_cast<PFNEGLBINDWAYLANDDISPLAYWL>(eglGetProcAddress("eglBindWaylandDisplayWL"));
        assert(s_eglBindWaylandDisplayWL);
        s_eglQueryWaylandBufferWL = reinterpret_cast<PFNEGLQUERYWAYLANDBUFFERWL>(eglGetProcAddress("eglQueryWaylandBufferWL"));
        assert(s_eglQueryWaylandBufferWL);
    }
    if (!s_eglBindWaylandDisplayWL || !s_eglQueryWaylandBufferWL)
        return false;

    if (isEGLExtensionSupported(extensions, "EGL_KHR_image_base")) {
        s_eglCreateImageKHR = reinterpret_cast<PFNEGLCREATEIMAGEKHRPROC>(eglGetProcAddress("eglCreateImageKHR"));
        assert(s_eglCreateImageKHR);
        s_eglDestroyImageKHR = reinterpret_cast<PFNEGLDESTROYIMAGEKHRPROC>(eglGetProcAddress("eglDestroyImageKHR"));
        assert(s_eglDestroyImageKHR);
    }
    if (!s_eglCreateImageKHR || !s_eglDestroyImageKHR)
        return false;

    if (!s_eglBindWaylandDisplayWL(eglDisplay, m_display))
        return false;

    m_eglDisplay = eglDisplay;

    /* Initialize Linux dmabuf subsystem. */
    if (isEGLExtensionSupported(extensions, "EGL_EXT_image_dma_buf_import")
        && isEGLExtensionSupported(extensions, "EGL_EXT_image_dma_buf_import_modifiers")) {
        s_eglQueryDmaBufFormatsEXT = reinterpret_cast<PFNEGLQUERYDMABUFFORMATSEXTPROC>(eglGetProcAddress("eglQueryDmaBufFormatsEXT"));
        assert(s_eglQueryDmaBufFormatsEXT);
        s_eglQueryDmaBufModifiersEXT = reinterpret_cast<PFNEGLQUERYDMABUFMODIFIERSEXTPROC>(eglGetProcAddress("eglQueryDmaBufModifiersEXT"));
        assert(s_eglQueryDmaBufModifiersEXT);
    }

    if (s_eglQueryDmaBufFormatsEXT && s_eglQueryDmaBufModifiersEXT) {
        if (m_linuxDmabuf)
            assert(!"Linux-dmabuf has already been initialized");
        m_linuxDmabuf = linux_dmabuf_setup(m_display);
    }

    return true;
}
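
A third option besides crashing or returning a bare bool would be to return a reason code. The sketch below is only an illustration of that idea, assuming a simplified model of the function quoted above: InitFailure, InitInputs, initializeWithReason and describe are hypothetical names, and the boolean inputs stand in for the real EGL queries.

```cpp
#include <cassert>
#include <string>

// Hypothetical failure reasons, one per early "return false" in the quoted function.
enum class InitFailure {
    None,
    MultipleDisplays,
    MissingBindWaylandDisplay,
    MissingImageBase,
    BindFailed,
};

// Hypothetical inputs standing in for the EGL extension and bind results.
struct InitInputs {
    bool otherDisplayAlreadySet;
    bool hasBindWaylandDisplay;
    bool hasImageBase;
    bool bindSucceeds;
};

// Same order of checks as the quoted code, but each exit names its cause.
InitFailure initializeWithReason(const InitInputs& in)
{
    if (in.otherDisplayAlreadySet)
        return InitFailure::MultipleDisplays;
    if (!in.hasBindWaylandDisplay)
        return InitFailure::MissingBindWaylandDisplay;
    if (!in.hasImageBase)
        return InitFailure::MissingImageBase;
    if (!in.bindSucceeds)
        return InitFailure::BindFailed;
    return InitFailure::None;
}

// Human-readable text suitable for a warning or a fatal error message.
std::string describe(InitFailure failure)
{
    switch (failure) {
    case InitFailure::None:
        return "initialized";
    case InitFailure::MultipleDisplays:
        return "multiple EGL displays are not supported";
    case InitFailure::MissingBindWaylandDisplay:
        return "EGL_WL_bind_wayland_display entry points unavailable";
    case InitFailure::MissingImageBase:
        return "EGL_KHR_image_base entry points unavailable";
    case InitFailure::BindFailed:
        return "eglBindWaylandDisplayWL returned false";
    }
    assert(false && "unhandled InitFailure value");
    return "unknown";
}
```

The caller can then decide whether to log describe(...) as a warning or treat it as fatal, without the function itself having to choose.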
Comment 33 Michael Catanzaro 2019-12-17 07:05:36 PST
I notice my EGL doesn't support EGL_EXT_image_dma_buf_import_modifiers, but it looks like that shouldn't be causing any failure here.
Comment 34 Carlos Garcia Campos 2019-12-17 07:12:09 PST
Or we can add g_warning() before every "return false" there
Comment 35 Michael Catanzaro 2019-12-17 07:15:50 PST
That would be good too.

I might start with errors (crashes), so we don't fail to notice the warnings (it's impossible to notice warnings except when running from a terminal, which I rarely do), and then we can change them to warnings once we have solved this bug?

Or: we could do warnings upstream, and I can patch the GNOME runtime to change them into crashes.
Comment 36 Carlos Alberto Lopez Perez 2019-12-17 07:32:12 PST
(In reply to Michael Catanzaro from comment #33)
> I notice my EGL doesn't support EGL_EXT_image_dma_buf_import_modifiers, but
> it looks like that shouldn't be causing any failure here.

Mmmm....

If your EGL doesn't support that, then s_eglQueryDmaBufModifiersEXT() is NULL.
It seems also s_eglQueryDmaBufFormatsEXT() would be NULL in that case (looking at the if-code block guarding it, it only enters into it if both extensions are supported).

The code block you pasted here checks for that, but below that code, in Instance::foreachDmaBufModifier() it calls both s_eglQueryDmaBufModifiersEXT() and s_eglQueryDmaBufFormatsEXT() and doesn't seem to check the function pointers are valid.

https://github.com/Igalia/WPEBackend-fdo/blob/bee4104/src/ws.cpp#L534

Might this explain your issue?
Comment 37 Michael Catanzaro 2019-12-17 08:29:57 PST
I don't think so. I'd expect crashes way more often if that was happening. Instance::foreachDmaBufModifier() is only called from bind_linux_dmabuf() in linux-dmabuf.cpp, and that's only called from linux_dmabuf_setup(), and that's only called from Instance::initialize in an if (s_eglQueryDmaBufFormatsEXT && s_eglQueryDmaBufModifiersEXT) block. So that shouldn't happen.

Note: I have EGL_EXT_image_dma_buf_import, just not EGL_EXT_image_dma_buf_import_modifiers. I wonder why it's not available?
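
Even if the call chain makes a null call unreachable today, a local guard is cheap. The following is a minimal sketch of such a guard, not the WPEBackend-fdo code: QueryFormatsFn, s_queryFormats, fakeQueryFormats and forEachFormat are hypothetical stand-ins for the static EGL function pointers and the iteration helper discussed above.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical signature standing in for an EGL query entry point.
using QueryFormatsFn = bool (*)(std::vector<int>& formats);

// Null until resolved, mirroring the static function pointers in the quoted code.
static QueryFormatsFn s_queryFormats = nullptr;

// Fake entry point used only for illustration.
bool fakeQueryFormats(std::vector<int>& formats)
{
    formats = { 10, 20, 30 };
    return true;
}

// Calls the callback once per format and returns how many were visited.
// Returns 0 without calling through when the pointer was never resolved.
int forEachFormat(const std::function<void(int)>& callback)
{
    assert(callback != nullptr);
    if (!s_queryFormats)
        return 0;

    std::vector<int> formats;
    if (!s_queryFormats(formats))
        return 0;

    for (int format : formats)
        callback(format);
    return static_cast<int>(formats.size());
}
```

With the guard in place, forEachFormat() is safe to call regardless of which caller reaches it or in what order initialization ran.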
Comment 38 Michael Catanzaro 2019-12-30 07:43:41 PST
Still crashing when creating WebGL context with 2.27.3. I thought we had downgraded this from a crash to disabling AC mode? I'm in the bad state again and every attempt to load a page that uses WebGL results in a crash. Here is a backtrace with 2.27.3:

#0  0x00007fd504da5c18 in Nicosia::GC3DLayer::makeContextCurrent() (this=<optimized out>)
    at /usr/include/c++/9.2.0/bits/unique_ptr.h:352
#1  0x00007fd504d9abc7 in WebCore::GraphicsContext3D::GraphicsContext3D(WebCore::GraphicsContext3DAttributes, WebCore::HostWindow*, WebCore::GraphicsContext3D::RenderStyle, WebCore::GraphicsContext3D*)
    (this=0x7fd493eb7000, attributes=..., renderStyle=WebCore::GraphicsContext3D::RenderOffscreen, sharedContext=<optimized out>) at ../Source/WebCore/platform/graphics/texmap/GraphicsContext3DTextureMapper.cpp:216
#2  0x00007fd504d9b7ce in WebCore::GraphicsContext3D::create(WebCore::GraphicsContext3DAttributes, WebCore::HostWindow*, WebCore::GraphicsContext3D::RenderStyle) (attributes=..., hostWindow=hostWindow@entry=
    0x7fd4fba5ba80, renderStyle=renderStyle@entry=WebCore::GraphicsContext3D::RenderOffscreen)
    at DerivedSources/ForwardingHeaders/wtf/RefCounted.h:185
#3  0x00007fd50432ce0f in WebCore::WebGLRenderingContextBase::create(WebCore::CanvasBase&, WebCore::GraphicsContext3DAttributes&, WTF::String const&) (canvas=..., attributes=..., type=...)
    at ../Source/WebCore/html/canvas/WebGLRenderingContextBase.cpp:606
#4  0x00007fd504213633 in WebCore::HTMLCanvasElement::createContextWebGL(WTF::String const&, WebCore::GraphicsContext3DAttributes&&) (this=0x7fd4abf4eaa0, type=..., attrs=...) at ../Source/WebCore/html/HTMLCanvasElement.cpp:411
#5  0x00007fd5042187a2 in WebCore::HTMLCanvasElement::getContext(JSC::JSGlobalObject&, WTF::String const&, WTF::Vector<JSC::Strong<JSC::Unknown, (JSC::ShouldStrongDestructorGrabLock)0>, 0ul, WTF::CrashOnOverflow, 16ul>&&)
    (this=this@entry=0x7fd4abf4eaa0, state=..., contextId=..., arguments=...)
    at ../Source/WebCore/html/HTMLCanvasElement.cpp:279
#6  0x00007fd50375056d in WebCore::jsHTMLCanvasElementPrototypeFunctionGetContextBody
    (throwScope=..., castedThis=0x7fd4abca9b80, callFrame=<optimized out>, lexicalGlobalObject=0x7fd4abcddf60)
    at DerivedSources/WebCore/JSHTMLCanvasElement.cpp:297
#7  0x00007fd50375056d in WebCore::IDLOperation<WebCore::JSHTMLCanvasElement>::call<WebCore::jsHTMLCanvasElementPrototypeFunctionGetContextBody> (operationName=0x7fd50500c689 "getContext", callFrame=..., lexicalGlobalObject=...)
    at ../Source/WebCore/bindings/js/JSDOMOperation.h:53
#8  0x00007fd50375056d in WebCore::jsHTMLCanvasElementPrototypeFunctionGetContext(JSC::JSGlobalObject*, JSC::CallFrame*) (lexicalGlobalObject=0x7fd4abcddf60, callFrame=<optimized out>)
    at DerivedSources/WebCore/JSHTMLCanvasElement.cpp:302
#9  0x00007fd4abfff16b in  ()
#10 0x00007ffc85385b50 in  ()
#11 0x00007fd500a54977 in llint_op_call () at /usr/lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18
#12 0x0000000000000000 in  ()

Sadly I'm still unable to debug because I'm using flatpak 1.4.3, which has a broken 'flatpak enter', so there is no way to enter the sandbox environment.
Comment 39 Michael Catanzaro 2020-01-04 08:55:36 PST
(In reply to Carlos Garcia Campos from comment #15)
> I'm checking logs reported in bug #202362.
> 
> Oct 04 08:14:31 chargestone-cave org.gnome.Epiphany.Devel.desktop[1896]:
> Cannot get default EGL display: EGL_BAD_PARAMETER
> Oct 04 08:14:31 chargestone-cave org.gnome.Epiphany.Devel.desktop[1896]:
> Cannot create EGL context: invalid display (last error: EGL_SUCCESS)
> 
> This is the initialization of the default display, which is also failing, so
> the problem is not specific to the WPE renderer. EGL_BAD_PARAMETER must be a
> previous error, as Miguel suggested, and the only EGL call that should
> happen before the EGL display initialization is eglGetPlatformDisplay. It
> returns EGL_BAD_PARAMETER when the given platform is not supported, but if
> that were the case it would always fail. I see we are checking for
> EGL_KHR_platform_wayland and always passing EGL_PLATFORM_WAYLAND_KHR even
> when using eglGetPlatformDisplayEXT, but that shouldn't be a problem because
> EGL_PLATFORM_WAYLAND_KHR and EGL_PLATFORM_WAYLAND_EXT are both defined as
> 0x31D8. 
> 
> Oct 04 08:14:32 chargestone-cave org.gnome.Epiphany.Devel.desktop[1896]:
> Cannot get default EGL display: EGL_BAD_PARAMETER
> Oct 04 08:14:32 chargestone-cave org.gnome.Epiphany.Devel.desktop[1896]:
> PlatformDisplayLibWPE: could not create the EGL display: EGL_SUCCESS.
> 
> And this is creating the shared display for compositing. In this case I don't
> know where the EGL_BAD_PARAMETER comes from.

In desperation, I'm looking at eglapi.c in mesa:

static EGLBoolean EGLAPIENTRY
eglBindWaylandDisplayWL(EGLDisplay dpy, struct wl_display *display)
{
   _EGLDisplay *disp = _eglLockDisplay(dpy);
   _EGLDriver *drv;
   EGLBoolean ret;

   _EGL_FUNC_START(disp, EGL_OBJECT_DISPLAY_KHR, NULL, EGL_FALSE);

   _EGL_CHECK_DISPLAY(disp, EGL_FALSE, drv);
   assert(disp->Extensions.WL_bind_wayland_display);

   if (!display)
      RETURN_EGL_ERROR(disp, EGL_BAD_PARAMETER, EGL_FALSE);

   ret = drv->API.BindWaylandDisplayWL(drv, disp, display);

   RETURN_EGL_EVAL(disp, ret);
}

Could wl_display be NULL? It's created in the WS::Instance constructor in ws.cpp, in WPEBackend-fdo, using wl_display_create(). It's documented to return NULL on failure and it looks like a WPEBackend-fdo bug that it's not checking for possible failure there.

I know this isn't likely. We need better debugging to figure out what is going on. I've opened https://github.com/Igalia/WPEBackend-fdo/pull/89 to add debug crashes, which I recommend we use in production until we figure out what's going on here. (Otherwise, at the rate we're going, we might never find the bug.)
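
For reference, the missing check could take roughly the following shape. This is a sketch under assumptions, not the actual WPEBackend-fdo code: FakeDisplay and createFakeDisplay are hypothetical stand-ins for the Wayland display and its creation call, and the static create() factory is just one way to surface the failure to callers.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the display object.
struct FakeDisplay { };

// Hypothetical stand-in for a creation function that yields null on failure.
std::unique_ptr<FakeDisplay> createFakeDisplay(bool simulateFailure)
{
    if (simulateFailure)
        return nullptr;
    return std::unique_ptr<FakeDisplay>(new FakeDisplay);
}

class Instance {
public:
    // Returns null when the display could not be created, so callers must check.
    static std::unique_ptr<Instance> create(bool simulateFailure)
    {
        auto display = createFakeDisplay(simulateFailure);
        if (!display)
            return nullptr;
        return std::unique_ptr<Instance>(new Instance(std::move(display)));
    }

    bool hasDisplay() const { return m_display != nullptr; }

private:
    // Private so that an Instance can never exist without a valid display.
    explicit Instance(std::unique_ptr<FakeDisplay> display)
        : m_display(std::move(display))
    {
        assert(m_display != nullptr);
    }

    std::unique_ptr<FakeDisplay> m_display;
};
```

The point of the factory is that a failed creation is visible at the call site, instead of leaving a null display to be passed on to a later EGL call.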
Comment 40 Michael Catanzaro 2020-03-19 18:54:33 PDT
BTW this is still crashing with WebKitGTK 2.28.0, libwpe 1.6.0, wpebackend-fdo-1.6.0. It's still impossible to reproduce except when it randomly happens. The backtrace has changed a bit, it now looks like this:

#0  0x00007f50d5a5c128 in Nicosia::GC3DLayer::makeContextCurrent() (this=<optimized out>)
    at /usr/include/c++/9.2.0/bits/unique_ptr.h:352
#1  0x00007f50d5a51133 in WebCore::GraphicsContextGLOpenGL::GraphicsContextGLOpenGL(WebCore::GraphicsContextGLAttributes, WebCore::HostWindow*, WebCore::GraphicsContextGL::Destination, WebCore::GraphicsContextGLOpenGL*)
    (this=0x7f50c4ee1b80, attributes=..., destination=<optimized out>, sharedContext=<optimized out>)
    at ../Source/WebCore/platform/graphics/texmap/GraphicsContextGLTextureMapper.cpp:215
#2  0x00007f50d5a516ed in WebCore::GraphicsContextGLOpenGL::create(WebCore::GraphicsContextGLAttributes, WebCore::HostWindow*, WebCore::GraphicsContextGL::Destination) (attributes=..., hostWindow=hostWindow@entry=
    0x7f50cc194ae0, destination=destination@entry=WebCore::GraphicsContextGL::Destination::Offscreen)
    at DerivedSources/ForwardingHeaders/wtf/RefCounted.h:185
#3  0x00007f50d4fa8b77 in WebCore::WebGLRenderingContextBase::create(WebCore::CanvasBase&, WebCore::GraphicsContextGLAttributes&, WTF::String const&) (canvas=..., attributes=..., type=...)
    at ../Source/WebCore/html/canvas/WebGLRenderingContextBase.cpp:580
#4  0x00007f50d4e83bd3 in WebCore::HTMLCanvasElement::createContextWebGL(WTF::String const&, WebCore::GraphicsContextGLAttributes&&) (this=0x7f507ca94580, type=..., attrs=...) at ../Source/WebCore/html/HTMLCanvasElement.cpp:415
#5  0x00007f50d4e87552 in WebCore::HTMLCanvasElement::getContext(JSC::JSGlobalObject&, WTF::String const&, WTF::Vector<JSC::Strong<JSC::Unknown, (JSC::ShouldStrongDestructorGrabLock)0>, 0ul, WTF::CrashOnOverflow, 16ul, WTF::FastMalloc>&&) (this=this@entry=0x7f507ca94580, state=..., contextId=..., arguments=...)
    at ../Source/WebCore/html/HTMLCanvasElement.cpp:283
#6  0x00007f50d439e35d in WebCore::jsHTMLCanvasElementPrototypeFunctionGetContextBody
    (throwScope=..., castedThis=0x7f5065925320, callFrame=<optimized out>, lexicalGlobalObject=0x7f50c4cea068)
    at DerivedSources/WebCore/JSHTMLCanvasElement.cpp:298
#7  0x00007f50d439e35d in WebCore::IDLOperation<WebCore::JSHTMLCanvasElement>::call<WebCore::jsHTMLCanvasElementPrototypeFunctionGetContextBody> (operationName=0x7f50d5cb7d1e "getContext", callFrame=..., lexicalGlobalObject=...)
    at ../Source/WebCore/bindings/js/JSDOMOperation.h:53
#8  0x00007f50d439e35d in WebCore::jsHTMLCanvasElementPrototypeFunctionGetContext(JSC::JSGlobalObject*, JSC::CallFrame*) (lexicalGlobalObject=0x7f50c4cea068, callFrame=<optimized out>)
    at DerivedSources/WebCore/JSHTMLCanvasElement.cpp:303
#9  0x00007f507ffff178 in  ()
#10 0x00007ffc8f222500 in  ()
#11 0x00007f50d164428f in llint_op_call () at /usr/lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18
#12 0x0000000000000000 in  ()
Comment 41 Michael Catanzaro 2020-06-03 07:39:04 PDT
OK it's been over half a year, I'm out of ideas and am considering switching GNOME back to WaylandCompositor to avoid these crashes. Please, if there's any debugging we can add to the code to help with this, let's add it.
Comment 42 Michael Catanzaro 2020-06-16 07:27:35 PDT
(In reply to Michael Catanzaro from comment #41)
> OK it's been over half a year, I'm out of ideas and am considering switching
> GNOME back to WaylandCompositor to avoid these crashes. Please, if there's
> any debugging we can add to the code to help with this, let's add it.

Since we are not making any progress on this issue, I'm going to disable WPE renderer again, for both GNOME and Fedora. For Fedora, we'll keep the WPE dependencies around indefinitely, but I won't update them anymore since WebKit will no longer use them. For GNOME, I will remove the deps from the SDK until WebKit is ready to use them again.
Comment 43 Carlos Garcia Campos 2020-06-16 08:12:48 PDT
Are you really getting crash reports about this in fedora? We don't have more reports upstream.
Comment 44 Michael Catanzaro 2020-06-16 08:30:24 PDT
(In reply to Carlos Garcia Campos from comment #43)
> Are you really getting crash reports about this in fedora? We don't have
> more reports upstream.

We're not, but I think that's just because our crash reporting infrastructure is broken. We've hardly received any crash reports from WebKit in the past year or two. It's possible that we've fixed all the bugs and WebKit has become nearly perfect... but I don't think so. ;)
Comment 45 Michael Catanzaro 2020-06-16 08:32:18 PDT
BTW, I'm OK with keeping WPE renderer as long as we add some sort of debugging to make it possible to solve this bug when crashes occur. My attempt in https://github.com/Igalia/WPEBackend-fdo/pull/89 was not successful.
Comment 46 Carlos Garcia Campos 2020-06-17 00:21:55 PDT
(In reply to Michael Catanzaro from comment #45)
> BTW, I'm OK with keeping WPE renderer as long as we add some sort of
> debugging to make it possible to solve this bug when crashes occur. My
> attempt in https://github.com/Igalia/WPEBackend-fdo/pull/89 was not
> successful.

Let's add it then, Adrian?
Comment 47 Michael Catanzaro 2020-07-17 08:11:51 PDT
Help? :)
Comment 48 Adrian Perez 2020-07-22 07:26:35 PDT
Created attachment 404920 [details]
Patch
Comment 49 Michael Catanzaro 2020-07-22 07:34:20 PDT
Comment on attachment 404920 [details]
Patch

Thanks!
Comment 50 Carlos Garcia Campos 2020-07-23 00:46:07 PDT
Comment on attachment 404920 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=404920&action=review

> Source/WebCore/platform/graphics/egl/GLContextEGL.cpp:195
> +    default:
> +        RELEASE_ASSERT_NOT_REACHED();

I think it's better not to add default here, since we are handling all possible cases (supported for the given build).

> Source/WebCore/platform/graphics/egl/GLContextEGL.cpp:248
> +        WTFLogAlways("Cannot create surfaceless EGL context: required extensions missing.");

I prefer not to add messages for things that are not errors. We did that in the past, and people thought they were errors and reported them as the possible cause of other bugs.

> Source/WebCore/platform/graphics/egl/GLContextEGL.cpp:306
> +        default:
> +            RELEASE_ASSERT_NOT_REACHED();
> +        }

Same here about the default, let the compiler complain.

> Source/WebCore/platform/graphics/egl/GLContextEGL.cpp:353
> +        default:
> +            RELEASE_ASSERT_NOT_REACHED();

Ditto.
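
The "let the compiler complain" point can be illustrated with a small sketch. When a switch over an enum has no default label, GCC and Clang warn (-Wswitch, included in -Wall) if an enumerator is left unhandled, whereas a default label silences that warning. DisplayType and displayTypeName below are hypothetical names for this example, not the code under review.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum; the real code switches over the platform display type.
enum class DisplayType { X11, Wayland, WPE };

std::string displayTypeName(DisplayType type)
{
    // No default label: adding a new enumerator without a matching case
    // makes the compiler warn here at build time.
    switch (type) {
    case DisplayType::X11:
        return "X11";
    case DisplayType::Wayland:
        return "Wayland";
    case DisplayType::WPE:
        return "WPE";
    }
    // Only reachable if an out-of-range value was cast to DisplayType.
    assert(false && "invalid DisplayType");
    return "unknown";
}
```

Placing the not-reached check after the switch, rather than in a default label, keeps the runtime guard without giving up the compile-time warning.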
Comment 51 Adrian Perez 2020-07-28 07:59:14 PDT
Created attachment 405357 [details]
Patch v2
Comment 52 EWS 2020-07-28 08:31:52 PDT
Committed r264986: <https://trac.webkit.org/changeset/264986>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 405357 [details].
Comment 53 Adrian Perez 2020-07-28 08:56:59 PDT
The landed patch was to add additional logging, let's keep the
bug open until we find out the root cause and fix it :)
Comment 54 Fujii Hironori 2020-07-28 23:48:00 PDT
Committed r265031: <https://trac.webkit.org/changeset/265031>
Comment 55 Fujii Hironori 2020-07-28 23:50:21 PDT
oops, reopened.
Comment 56 Michael Catanzaro 2020-08-05 08:55:56 PDT
I hit this again today, but discovered that the RELEASE_LOG messages did not help, since that logging is not enabled by default. Can we change these to WTFLogAlways?
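
The difference between channel-gated logging and unconditional logging can be sketched as follows. This is a simplified model, not WebKit's logging implementation: LogChannel, Logger, logToChannel and logAlways are hypothetical names, and messages are collected in memory purely so the behavior is easy to inspect.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical channel: messages are dropped unless it has been switched on.
struct LogChannel {
    const char* name;
    bool enabled;
};

class Logger {
public:
    // Emits only when the channel is enabled, so nothing is recorded by default.
    void logToChannel(const LogChannel& channel, const std::string& message)
    {
        assert(channel.name != nullptr);
        if (!channel.enabled)
            return;
        m_lines.push_back(std::string(channel.name) + ": " + message);
    }

    // Emits regardless of any channel state.
    void logAlways(const std::string& message)
    {
        m_lines.push_back(message);
    }

    const std::vector<std::string>& lines() const { return m_lines; }

private:
    std::vector<std::string> m_lines;
};
```

For a failure that only shows up at random on end-user machines, the unconditional variant is the one that leaves a trace in the journal without any prior setup.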

Here's what I see:

Aug 05 10:50:42 chargestone-cave geary[14241]: Cannot create EGL context: invalid display (last error: EGL_SUCCESS)
Aug 05 10:50:42 chargestone-cave kernel: WebKitWebProces[14241]: segfault at 0 ip 00007f6ca3dee168 sp 00007ffd6b29e798 error 4 in libwebkit2gtk-4.0.so.37.49.0[7f6ca17db000+32cb000]
Aug 05 10:50:42 chargestone-cave kernel: Code: c4 08 48 01 d8 5b 5d c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f3 0f 1e fa 48 8b 7f 10 <48> 8b 07 ff 60 10 66 90 f3 0f 1e fa 48 8b 7f 10 48 8b 07 ff 60 50
Aug 05 10:50:42 chargestone-cave audit[14241]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=14241 comm="WebKitWebProces" exe="/usr/libexec/webkit2gtk-4.0/WebKitWebProcess" sig=11 res=1
Aug 05 10:50:42 chargestone-cave audit: BPF prog-id=72 op=LOAD
Aug 05 10:50:42 chargestone-cave audit: BPF prog-id=73 op=LOAD
Aug 05 10:50:42 chargestone-cave audit: BPF prog-id=74 op=LOAD
Aug 05 10:50:42 chargestone-cave audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@5-16310-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Aug 05 10:50:42 chargestone-cave systemd[1]: Started Process Core Dump (PID 16310/UID 0).
Aug 05 10:50:42 chargestone-cave epiphany[5820]: Web process crashed
Aug 05 10:50:43 chargestone-cave systemd-coredump[16311]: Process 14241 (WebKitWebProces) of user 1000 dumped core.
                                                          
                                                          Stack trace of thread 2305:
                                                          #0  0x00007f6ca3dee168 n/a (/usr/lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37.49.0 + 0x2613168)
Aug 05 10:50:43 chargestone-cave systemd[1]: systemd-coredump@5-16310-0.service: Succeeded.
Aug 05 10:50:43 chargestone-cave audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@5-16310-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Aug 05 10:50:43 chargestone-cave geary[16318]: Cannot get default EGL display: EGL_BAD_PARAMETER
Aug 05 10:50:43 chargestone-cave geary[16318]: PlatformDisplayLibWPE: could not create the EGL display: EGL_SUCCESS.

Notice that geary seems to be failing in the same way at exactly the same time that Epiphany crashed, so whatever has gone wrong has happened in multiple WebKit UI processes at the same time. I've never noticed that before. But unlike Epiphany, Geary never crashes.