Bug 164303 - [GTK] BadDamage X Window System error in WebKit::AcceleratedBackingStoreX11::update when called from WebPageProxy::exitAcceleratedCompositingMode
Summary: [GTK] BadDamage X Window System error in WebKit::AcceleratedBackingStoreX11::...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: Other
Hardware: PC Linux
: P2 Normal
Assignee: Nobody
URL:
Keywords:
: 164221 (view as bug list)
Depends on:
Blocks:
 
Reported: 2016-11-01 17:11 PDT by Michael Catanzaro
Modified: 2016-12-16 14:04 PST (History)
8 users (show)

See Also:


Attachments
Backtrace with breakpoint set on gdk_x_error() (17.71 KB, text/plain)
2016-11-01 17:11 PDT, Michael Catanzaro
no flags Details
Patch (2.72 KB, patch)
2016-11-02 07:08 PDT, Carlos Garcia Campos
mcatanzaro: review+
Details | Formatted Diff | Diff
Patch (1.58 KB, patch)
2016-11-24 05:46 PST, Carlos Garcia Campos
mcatanzaro: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Catanzaro 2016-11-01 17:11:23 PDT
Created attachment 293621 [details]
Backtrace with breakpoint set on gdk_x_error()

With Epiphany master, if you close the browser with Ctrl+C, about half the time you will receive this:

(epiphany:28727): Gdk-ERROR **: The program 'epiphany' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadDamage (invalid Damage parameter)'.
  (Details: serial 83341 error_code 149 request_code 142 (DAMAGE) minor_code 2)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the GDK_SYNCHRONIZE environment
   variable to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)

I think you can reproduce it with a stable version of Epiphany by using Ctrl+Q or the Quit app menu item.

Detailed backtrace attached.
Comment 1 Carlos Garcia Campos 2016-11-02 06:43:51 PDT
What WebKit version is this?
Comment 2 Carlos Garcia Campos 2016-11-02 06:59:00 PDT
This can happen if the web process exits before we have cleaned up the accelerated surface. I think we should just trap BadDrawable and BadDamage and ignore them.
Comment 3 Carlos Garcia Campos 2016-11-02 07:08:28 PDT
Created attachment 293657 [details]
Patch

I can't reproduce it, but this patch should fix it.
Comment 4 Tomas Popela 2016-11-02 07:48:47 PDT
There is a possible duplicate (from Evolution) - bug 164221
Comment 5 Carlos Garcia Campos 2016-11-02 07:56:36 PDT
*** Bug 164221 has been marked as a duplicate of this bug. ***
Comment 6 Michael Catanzaro 2016-11-02 08:34:45 PDT
(In reply to comment #1)
> What WebKit version is this?

trunk
Comment 7 Carlos Garcia Campos 2016-11-02 09:38:15 PDT
Committed r208285: <http://trac.webkit.org/changeset/208285>
Comment 8 Michael Catanzaro 2016-11-20 13:48:54 PST
Unfortunately it's still crashing in the same function AcceleratedBackingStoreX11::update with the same X Window System error in trunk. It's quite easy for me to reproduce this, it happens roughly half the time when closing Epiphany. I see you trapped the X11 API use at the top of the function, but not at the bottom of the function, so I guess it's coming from the bottom now.
Comment 9 Michael Catanzaro 2016-11-21 09:39:52 PST
Yes, the crash occurs during the XSync on the very last line of this function. Is it right to trap the errors here too?
Comment 10 Carlos Garcia Campos 2016-11-24 05:42:52 PST
When closing AcceleratedBackingStoreX11::update() is called with a null LayerTreeContext, so the function should always return from the !pixmap early return. That's why I didn't add the trapper at the bottom. I don't understand why it happens, but I guess it's safe to trap xerrors in that case too.
Comment 11 Carlos Garcia Campos 2016-11-24 05:46:19 PST
Created attachment 295406 [details]
Patch
Comment 12 Michael Catanzaro 2016-11-24 06:38:56 PST
Comment on attachment 295406 [details]
Patch

I'd like Zan to look at this.

Questions:

 (a) Isn't this equivalent to trapping the entire function, now? Both calls to X11 APIs are within the scope of one trapper or another, we could just put the trapper at the top so we only need one instead of two.
 (b) It occurs to me that it's probably a bad idea to trap errors unconditionally here. We really want to ignore errors only if shutting down, right? Is there any way we can do that, or are we just out of luck and have to trap always?
Comment 13 Carlos Garcia Campos 2016-11-24 08:27:31 PST
(In reply to comment #12)
> Comment on attachment 295406 [details]
> Patch
> 
> I'd like Zan to look at this.
> 
> Questions:
> 
>  (a) Isn't this equivalent to trapping the entire function, now? Both calls
> to X11 APIs are within the scope of one trapper or another, we could just
> put the trapper at the top so we only need one instead of two.

Not exactly the same, first one only happens if we already have a surface and second one if there's a pixmap and drawing area. We could just move it, though, but I prefer to make more expliit what we are protecting.

>  (b) It occurs to me that it's probably a bad idea to trap errors
> unconditionally here. We really want to ignore errors only if shutting down,
> right? Is there any way we can do that, or are we just out of luck and have
> to trap always?

Not really, there are a lot of situation in which this can happen, for example if the web process crashes in the middle of update, the pixmap is gone but we have a pixmap id in the Ui process. Also because of the async way X11 works, sometimes errors happen later, on a sync, so we could be failing due to previous errors. If you take a look at other projects using X11 directly like GTK+, cairo or clutter you will see they are full of xerror trappers.
Comment 14 Carlos Garcia Campos 2016-11-28 23:27:36 PST
Committed r209063: <http://trac.webkit.org/changeset/209063>
Comment 15 Carlos Alberto Lopez Perez 2016-12-16 09:42:38 PST
*** Bug 165656 has been marked as a duplicate of this bug. ***
Comment 16 Carlos Alberto Lopez Perez 2016-12-16 09:56:56 PST
(In reply to comment #13)
> (In reply to comment #12)
> > Comment on attachment 295406 [details]
> > Patch
> > 
> > I'd like Zan to look at this.
> > 
> > Questions:
> > 
> >  (a) Isn't this equivalent to trapping the entire function, now? Both calls
> > to X11 APIs are within the scope of one trapper or another, we could just
> > put the trapper at the top so we only need one instead of two.
> 
> Not exactly the same, first one only happens if we already have a surface
> and second one if there's a pixmap and drawing area. We could just move it,
> though, but I prefer to make more expliit what we are protecting.
> 

Look at bug 164303

I think we should set the trapper for the whole function.

I would say more: I *really* would like that we trap all X error for the whole UIProcess and we just print them on stdout instead of crashing.


I really hate when things crash. I rather prefer to be on the wrong side by protecting things more than needed than be on the wrong side by crashing on corner cases that were not detected when this was implemented.
Comment 17 Michael Catanzaro 2016-12-16 13:57:35 PST
I would rather have a crash report whenever there is a bug in our code....
Comment 18 Andres Gomez Garcia 2016-12-16 14:04:04 PST
(In reply to comment #16)
...
> I would say more: I *really* would like that we trap all X error for the
> whole UIProcess and we just print them on stdout instead of crashing.
> 
> 
> I really hate when things crash. I rather prefer to be on the wrong side by
> protecting things more than needed than be on the wrong side by crashing on
> corner cases that were not detected when this was implemented.

Like with asserts, crash should be left for debug builds. For a release build, this should be avoided, IMHO.