Bug 155536

Summary: Wrong use of EGL_DEPTH_SIZE
Product: WebKit Reporter: Erwin Rol <erwin>
Component: WebCore Misc.Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Major CC: cgarcia, clopez, mario, mcatanzaro, tpopela, yoon, zan
Priority: P2    
Version: WebKit Local Build   
Hardware: All   
OS: All   
Attachments:
Description Flags
Patch mcatanzaro: review+

Description Erwin Rol 2016-03-16 02:14:30 PDT
In Source/WebCore/platform/graphics/egl/GLContextEGL.cpp function std::unique_ptr<GLContextEGL> GLContextEGL::createPixmapContext(EGLContext sharingContext) uses;

eglGetConfigAttrib(display, config, EGL_DEPTH_SIZE, &depth)

to (mistakenly) get the depth buffer size instead of the color depth. That depth (0 because no depth buffer was requested) is later passed to;

XCreatePixmap(x11Display, DefaultRootWindow(x11Display), 1, 1, depth) 

to request a pixmap with the same color depth as the screen. And passing 0 causes that call to fail with a;

(WebKitWebProcess:669): Gdk-ERROR **: The program 'WebKitWebProcess' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue'.
  (Details: serial 148 error_code 2 request_code 53 (core protocol) minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the GDK_SYNCHRONIZE environment
   variable to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)


On my hardware setup (Tegra3 Linux) setting depth hard to 32 makes things work, this is ofcourse not a real fix because other setups might not use 32bit screens.
Comment 1 Carlos Alberto Lopez Perez 2016-03-16 11:03:28 PDT
So the issue is that we are requesting an EGL config without specifying the depth in eglChooseConfig, or the issue is that the value of EGL_DEPTH_SIZE is not what we should be passing to XCreatePixmap ?
Comment 2 Erwin Rol 2016-03-16 15:50:52 PDT
The intention of the code seems to be to get the number of bits per pixel (used by the screen for color) and to create a pixmap with that same number of bits per pixel. 

The EGL_DEPTH_SIZE has nothing to do with color, it is the depth buffer size (the number of bits for the Z axis).
https://www.khronos.org/registry/egl/sdk/docs/man/html/eglGetConfigAttrib.xhtml

A better way to get the color depth of the DefaultRootWindow is maybe to use the XGetWindowAttributes() call and use the XWindowAttributes.depth member. 

https://tronche.com/gui/x/xlib/window-information/XGetWindowAttributes.html
Comment 3 Carlos Alberto Lopez Perez 2016-03-17 04:17:21 PDT
(In reply to comment #2)
> The intention of the code seems to be to get the number of bits per pixel
> (used by the screen for color) and to create a pixmap with that same number
> of bits per pixel. 
> 
> The EGL_DEPTH_SIZE has nothing to do with color, it is the depth buffer size
> (the number of bits for the Z axis).
> https://www.khronos.org/registry/egl/sdk/docs/man/html/eglGetConfigAttrib.
> xhtml
> 

So instead of EGL_DEPTH_SIZE we should request EGL_BUFFER_SIZE ?

 - EGL_BUFFER_SIZE: Returns the depth of the color buffer. It is the sum of EGL_RED_SIZE, EGL_GREEN_SIZE, EGL_BLUE_SIZE, and EGL_ALPHA_SIZE. 

If you want, feel free to upload a patch here for review. Don't forget to include a Changelog in your patch as explained here: https://webkit.org/contributing-code/
Comment 4 Erwin Rol 2016-03-17 04:36:55 PDT
I can make a patch, but it will take a few days.
Comment 5 Michael Catanzaro 2016-03-21 10:23:55 PDT
(In reply to comment #4)
> I can make a patch, but it will take a few days.

Much appreciated, thanks.
Comment 6 Carlos Garcia Campos 2016-10-19 06:03:14 PDT
I've run into this issue too, both EGL_DEPTH_SIZE and EGL_BUFFER_SIZE are indeed confusing. What happens here is that the driver doesn't implement EGL_DEPTH_SIZE and the default value, which is 0, is returned. Then XCreatePixmap fails because 0 is not a valid depth. The thing is that even if EGL_DEPTH_SIZE or EGL_BUFFER_SIZE retuned a valid depth, it still might not be supported by the default screen and XCreatePixmap can fail. What we need to ensure is that the depth we pass is compatible with the X display, not only with the EGL config, to avoid failres when creating the pixmap. So, I think we can use EGL_NATIVE_VISUAL_ID instead, and then ask X for the visual info for that id. If not found then we just return before creating the pixmap. If the visual is found then we can be sure that the depth of the visual will not make the pixmap creation fail. However, with the driver I'm using it doesn't matter how we create the pixmap that eglCreatePixmapSurface always fails, again with X errors that are fatal by default. Since the driver is not free, I assume it doesn't support eglCreatePixmapSurface or it's just buggy, so the only option we have here is trap the x errors and ignore them. It turns out that the x errors are not fatal in this case, because eglCreatePixmapSurface ends up returning a surface, and since these are offscreen contexts, it doesn't really matter if they contains an invalid pixmap, because we never do swap buffer on them, so just ignoring the X errors fixes the crashes and makes everythig work.
Comment 7 Carlos Garcia Campos 2016-10-19 06:23:49 PDT
Created attachment 292062 [details]
Patch
Comment 8 Michael Catanzaro 2016-10-19 09:04:26 PDT
Comment on attachment 292062 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=292062&action=review

> Source/WebCore/ChangeLog:10
> +        EGL_BUFFER_SIZE retuned a valid depth, it still might not be supported by the default screen and XCreatePixmap

retuned -> returned

> Source/WebCore/ChangeLog:12
> +        EGL config, to avoid failures when creating the pixmap. So, We can use EGL_NATIVE_VISUAL_ID instead, and

We -> we

> Source/WebCore/ChangeLog:28
> +        the depthj to be passed to XCreatePixmap. Also use the XErrorTrapper class to ignore all BadDrawable errors

depthj -> depth

> Source/WebCore/platform/graphics/egl/GLContextEGL.cpp:210
> +    // Some drivers fail to create the surface producing BadDrawable X error and the default XError handler normally aborts.
> +    // However, if the X error is ignored, eglCreatePixmapSurface() ends up returning a surface and we can continue creating
> +    // the context. Since this is an offscreen context, it doesn't matter if the pixmap used is not valid because we never do
> +    // swap buffers. So, we use a custom XError handler here that ignores BadDrawable errors and only warns about any other
> +    // errors without aborting in any case.
> +    XErrorTrapper trapper(x11Display, XErrorTrapper::Policy::Warn, { BadDrawable });

This seems like a workaround to a broken crap proprietary graphics driver. Are you sure we really want to upstream this workaround? IMO this should be carried downstream in the project for the client that made the mistake of using a proprietary graphics stack.
Comment 9 Carlos Garcia Campos 2016-10-19 09:17:02 PDT
(In reply to comment #8)
> Comment on attachment 292062 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=292062&action=review
> 
> > Source/WebCore/ChangeLog:10
> > +        EGL_BUFFER_SIZE retuned a valid depth, it still might not be supported by the default screen and XCreatePixmap
> 
> retuned -> returned
> 
> > Source/WebCore/ChangeLog:12
> > +        EGL config, to avoid failures when creating the pixmap. So, We can use EGL_NATIVE_VISUAL_ID instead, and
> 
> We -> we
> 
> > Source/WebCore/ChangeLog:28
> > +        the depthj to be passed to XCreatePixmap. Also use the XErrorTrapper class to ignore all BadDrawable errors
> 
> depthj -> depth
> 
> > Source/WebCore/platform/graphics/egl/GLContextEGL.cpp:210
> > +    // Some drivers fail to create the surface producing BadDrawable X error and the default XError handler normally aborts.
> > +    // However, if the X error is ignored, eglCreatePixmapSurface() ends up returning a surface and we can continue creating
> > +    // the context. Since this is an offscreen context, it doesn't matter if the pixmap used is not valid because we never do
> > +    // swap buffers. So, we use a custom XError handler here that ignores BadDrawable errors and only warns about any other
> > +    // errors without aborting in any case.
> > +    XErrorTrapper trapper(x11Display, XErrorTrapper::Policy::Warn, { BadDrawable });
> 
> This seems like a workaround to a broken crap proprietary graphics driver.
> Are you sure we really want to upstream this workaround? IMO this should be
> carried downstream in the project for the client that made the mistake of
> using a proprietary graphics stack.

It's a workaround for a buggy driver, but also a better way to handle X errors in general. You can see other libs using X like GTK+ itself, that are full of blocks trapping X errors. I tried to be conservative here, because I don't want an error trapper like this to hide real bugs, but the plan is to use it every time we detect an X error crashing the web process for no reason and we know it's not a our fault. So, yes I want to upstream this.
Comment 10 Carlos Garcia Campos 2016-10-20 01:36:42 PDT
Committed r207590: <http://trac.webkit.org/changeset/207590>