Bug 161862

Summary: [GTK] WebProcess from WebKitGtk+ 2.13.91 hangs my Intel GPU
Product: WebKit Reporter: Andres Gomez Garcia <agomez>
Component: WebKitGTK Assignee: Nobody <webkit-unassigned>
Status: RESOLVED CONFIGURATION CHANGED    
Severity: Normal CC: bugs-noreply, cgarcia, magomez
Priority: P2    
Version: WebKit Nightly Build   
Hardware: PC   
OS: Linux   
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=97785
Attachments:
Description Flags
GPU dump error at /sys/class/drm/card0/error none

Description Andres Gomez Garcia 2016-09-12 08:16:09 PDT
I'm using WebKitGtk+ with my own JHBuild setup:
https://github.com/tanty/jhbuild-epiphany/tree/master

Epiphany 3.20.3 and WebKit 2.13.91.

I'm running Epiphany with the dconf key:

"process-model" = "shared-secondary-process"

The compilation was done with CMake args:

'-DPORT=GTK -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_FLAGS_RELEASE="-O0 -g1 -DNDEBUG -DG_DISABLE_CAST_CHECKS" -DCMAKE_CXX_FLAGS_RELEASE="-O0 -g1 -DNDEBUG -DG_DISABLE_CAST_CHECKS"'

Just by opening the last closed session, the desktop stalls, even though there is apparently no high CPU usage.

I can see this in the logs:

$ less /var/log/kern.log

[snip]

Sep 12 18:02:33 pomeron kernel: [  142.207298] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... render ring idle
Sep 12 18:03:33 pomeron kernel: [  202.140067] [drm] no progress on render ring
Sep 12 18:03:33 pomeron kernel: [  202.141208] [drm] GPU HANG: ecode 6:-1:0x00000000, reason: Ring hung, action: reset
Sep 12 18:03:33 pomeron kernel: [  202.141212] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Sep 12 18:03:33 pomeron kernel: [  202.141214] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Sep 12 18:03:33 pomeron kernel: [  202.141217] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Sep 12 18:03:33 pomeron kernel: [  202.141219] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Sep 12 18:03:33 pomeron kernel: [  202.141222] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Sep 12 18:03:33 pomeron kernel: [  202.141409] [drm:i915_switch_context [i915]] *ERROR* ring init context: -11
Sep 12 18:03:33 pomeron kernel: [  202.143473] drm/i915: Resetting chip after gpu hang
Sep 12 18:03:33 pomeron kernel: [  202.396871] [drm:ironlake_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
Sep 12 18:03:33 pomeron kernel: [  202.399002] [drm:intel_set_pch_fifo_underrun_reporting [i915]] *ERROR* uncleared pch fifo underrun on pch transcoder A
Sep 12 18:03:33 pomeron kernel: [  202.399026] [drm:cpt_irq_handler [i915]] *ERROR* PCH transcoder A FIFO underrun
Sep 12 18:03:33 pomeron kernel: [  202.408353] [drm:ironlake_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
Sep 12 18:03:33 pomeron kernel: [  202.432732] [drm:intel_check_pch_fifo_underruns [i915]] *ERROR* pch fifo underrun on pch transcoder B
Sep 12 18:03:41 pomeron kernel: [  210.163587] [drm] stuck on render ring
Sep 12 18:03:41 pomeron kernel: [  210.163990] [drm] GPU HANG: ecode 6:0:0x0009090b, in WebKitWebProces [4143], reason: Ring hung, action: reset
Sep 12 18:03:41 pomeron kernel: [  210.166054] drm/i915: Resetting chip after gpu hang
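
For anyone triaging a similar log, the hang lines can be separated from the FIFO underrun noise with a small script. This is only a sketch: the regular expression is inferred from the two "GPU HANG" lines above, and `find_gpu_hangs` and `SAMPLE` are hypothetical names, not part of any tool. Other kernel versions may format these lines differently.

```python
import re

# Pattern for i915 "GPU HANG" lines as they appear in the log above.
# The " in <process> [<pid>]," part is optional: the first hang in the
# log does not have it, the second names WebKitWebProces [4143].
HANG_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\] \[drm\] GPU HANG: ecode (?P<ecode>[^,]+),"
    r"(?: in (?P<process>\S+) \[(?P<pid>\d+)\],)?"
    r" reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def find_gpu_hangs(lines):
    """Return one dict per GPU HANG line; process and pid are None if absent."""
    hangs = []
    for line in lines:
        match = HANG_RE.search(line)
        if match is None:
            continue  # not a hang line (e.g. a FIFO underrun message)
        info = match.groupdict()
        info["uptime"] = float(info["uptime"])
        if info["pid"] is not None:
            info["pid"] = int(info["pid"])
        hangs.append(info)
    return hangs


# Three lines copied from the log above: two hangs and one unrelated error.
SAMPLE = [
    "Sep 12 18:03:33 pomeron kernel: [  202.141208] [drm] GPU HANG: "
    "ecode 6:-1:0x00000000, reason: Ring hung, action: reset",
    "Sep 12 18:03:33 pomeron kernel: [  202.396871] [drm:ironlake_irq_handler "
    "[i915]] *ERROR* CPU pipe A FIFO underrun",
    "Sep 12 18:03:41 pomeron kernel: [  210.163990] [drm] GPU HANG: "
    "ecode 6:0:0x0009090b, in WebKitWebProces [4143], reason: Ring hung, "
    "action: reset",
]

if __name__ == "__main__":
    for hang in find_gpu_hangs(SAMPLE):
        print(hang["uptime"], hang["ecode"], hang["process"], hang["pid"])
```

Passing `open("/var/log/kern.log")` instead of `SAMPLE` should work the same way, assuming the file is readable by the current user.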


$ lspci -v

[snip]

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
        Subsystem: Lenovo 2nd Generation Core Processor Family Integrated Graphics Controller
        Flags: bus master, fast devsel, latency 0, IRQ 34
        Memory at f0000000 (64-bit, non-prefetchable) [size=4M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 5000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [a4] PCI Advanced Features
        Kernel driver in use: i915
        Kernel modules: i915

$ glxinfo | grep string
server glx vendor string: SGI
server glx version string: 1.4
client glx vendor string: Mesa Project and SGI
client glx version string: 1.4
OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) Sandybridge Mobile 
OpenGL core profile version string: 3.3 (Core Profile) Mesa 11.2.2
OpenGL core profile shading language version string: 3.30
OpenGL version string: 3.0 Mesa 11.2.2
OpenGL shading language version string: 1.30
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 11.2.2
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00

$ uname -a
Linux pomeron 4.6.0-1-amd64 #1 SMP Debian 4.6.1-1 (2016-06-06) x86_64 GNU/Linux
Comment 1 Andres Gomez Garcia 2016-09-12 08:18:39 PDT
Created attachment 288571 [details]
GPU dump error at /sys/class/drm/card0/error
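
The kernel messages ask for the crash dump to be attached to any report. A minimal sketch of saving it to a regular file follows. It assumes the dump is still available at the path the kernel printed and that the current user may read it (this typically needs root). `save_gpu_error_state` and the default output file name are hypothetical.

```python
from pathlib import Path


def save_gpu_error_state(src="/sys/class/drm/card0/error",
                         dest="gpu-error-dump.txt"):
    """Copy the i915 error state to a regular file; return its size in bytes."""
    # Read the whole file explicitly rather than trusting the size reported
    # by the filesystem, which is not reliable for sysfs entries.
    data = Path(src).read_bytes()
    Path(dest).write_bytes(data)
    return len(data)

# Example call, typically run as root on the affected machine:
#   save_gpu_error_state()
```

The resulting file can then be attached to the bug as-is.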
Comment 2 Carlos Garcia Campos 2016-09-12 08:46:07 PDT
Have you tried filing a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel? GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Comment 3 Andres Gomez Garcia 2016-09-12 09:15:56 PDT
(In reply to comment #2)
> Have you tried filing a _new_ bug report on bugs.freedesktop.org against DRI
> -> DRM/Intel? GPU hangs can indicate a bug anywhere in the entire gfx stack,
> including userspace.

I can do that, of course, but since this only started happening between 2.12.5 and 2.13.91, it seems that WKGTK is also doing something differently ...
Comment 4 Carlos Garcia Campos 2016-09-12 09:48:56 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > Have you tried filing a _new_ bug report on bugs.freedesktop.org against DRI
> > -> DRM/Intel? GPU hangs can indicate a bug anywhere in the entire gfx stack,
> > including userspace.
> 
> I can do that, of course, but since this only started happening between 2.12.5
> and 2.13.91, it seems that WKGTK is also doing something differently ...

Yes, but we are doing too many things differently from 2.12 to 2.14. We have switched to the threaded compositor, which also implies that we now use coordinated graphics, so too many things are different. On top of that, there are 6 months of WebKit development :-)
Comment 5 Andres Gomez Garcia 2016-09-13 03:52:45 PDT
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > Have you tried filing a _new_ bug report on bugs.freedesktop.org against DRI
> > > -> DRM/Intel? GPU hangs can indicate a bug anywhere in the entire gfx stack,
> > > including userspace.
> > 
> > I can do that, of course, but since this only started happening between 2.12.5
> > and 2.13.91, it seems that WKGTK is also doing something differently ...
> 
> Yes, but we are doing too many things differently from 2.12 to 2.14. We have
> switched to the threaded compositor, which also implies that we now use
> coordinated graphics, so too many things are different. On top of that, there
> are 6 months of WebKit development :-)

Maybe I didn't explain myself well.

This is not just happening from 2.12 to 2.14; it is happening from an earlier 2.13.x to the latest 2.13.x.

I will look for the version in which this started happening but I think it was from 2.13.4 to 2.13.90 ...

I'll keep you posted :)
Comment 6 Andres Gomez Garcia 2016-09-13 04:07:13 PDT
(In reply to comment #5)

> I will look for the version in which this started happening but I think it
> was from 2.13.4 to 2.13.90 ...

Yes, I think this was the case.

In bug 160389 I commented that, with the proposed patch applied, my whole desktop was freezing. I didn't investigate much further, but I suspect that the GPU hang was introduced by that patch.

Then the patch was committed and I moved back to the 2.12.x series for my daily use, because 2.13.x was no longer usable on my box ...
Comment 7 Andres Gomez Garcia 2016-09-13 04:09:00 PDT
(In reply to comment #6)
> (In reply to comment #5)
> 
> > I will look for the version in which this started happening but I think it
> > was from 2.13.4 to 2.13.90 ...
> 
> Yes, I think this was the case.
> 
> In bug 160389 I commented that, with the proposed patch applied, my whole
> desktop was freezing. I didn't investigate much further, but I suspect that
> the GPU hang was introduced by that patch.
> 
> Then the patch was committed and I moved back to the 2.12.x series for my
> daily use, because 2.13.x was no longer usable on my box ...

FTR, I'm now testing 2.13.4 from Debian experimental and I've already gotten this:

$ cat /var/log/kern.log

[snip]

Sep 13 13:58:15 pomeron kernel: [11660.199863] [drm:ironlake_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
Sep 13 13:58:15 pomeron kernel: [11660.200645] [drm:intel_set_pch_fifo_underrun_reporting [i915]] *ERROR* uncleared pch fifo underrun on pch transcoder A
Sep 13 13:58:15 pomeron kernel: [11660.200671] [drm:cpt_irq_handler [i915]] *ERROR* PCH transcoder A FIFO underrun
Sep 13 13:58:15 pomeron kernel: [11660.202018] [drm:ironlake_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
Sep 13 13:58:16 pomeron kernel: [11660.223535] [drm:intel_set_pch_fifo_underrun_reporting [i915]] *ERROR* uncleared pch fifo underrun on pch transcoder B
Sep 13 13:58:16 pomeron kernel: [11660.689752] [drm:ironlake_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
Sep 13 13:58:16 pomeron kernel: [11660.722393] [drm:intel_check_pch_fifo_underruns [i915]] *ERROR* pch fifo underrun on pch transcoder B


However, after this small hiccup, I could keep using the desktop/browser without problems ...
Comment 8 Andres Gomez Garcia 2016-09-13 05:17:03 PDT
(In reply to comment #7)

> $ cat /var/log/kern.log
> 
> [snip]
> 
> Sep 13 13:58:15 pomeron kernel: [11660.199863] [drm:ironlake_irq_handler
> [i915]] *ERROR* CPU pipe A FIFO underrun
> Sep 13 13:58:15 pomeron kernel: [11660.200645]
> [drm:intel_set_pch_fifo_underrun_reporting [i915]] *ERROR* uncleared pch
> fifo underrun on pch transcoder A
> Sep 13 13:58:15 pomeron kernel: [11660.200671] [drm:cpt_irq_handler [i915]]
> *ERROR* PCH transcoder A FIFO underrun
> Sep 13 13:58:15 pomeron kernel: [11660.202018] [drm:ironlake_irq_handler
> [i915]] *ERROR* CPU pipe B FIFO underrun
> Sep 13 13:58:16 pomeron kernel: [11660.223535]
> [drm:intel_set_pch_fifo_underrun_reporting [i915]] *ERROR* uncleared pch
> fifo underrun on pch transcoder B
> Sep 13 13:58:16 pomeron kernel: [11660.689752] [drm:ironlake_irq_handler
> [i915]] *ERROR* CPU pipe B FIFO underrun
> Sep 13 13:58:16 pomeron kernel: [11660.722393]
> [drm:intel_check_pch_fifo_underruns [i915]] *ERROR* pch fifo underrun on pch
> transcoder B

This doesn't seem related. It happens every time I change from the X session to a terminal ...
Comment 9 Miguel Gomez 2017-04-18 07:35:09 PDT
Is this still an issue? Or can we close it? Many things have changed since this was reported, so I guess it's obsolete.
Comment 10 Andres Gomez Garcia 2017-04-18 08:21:41 PDT
(In reply to Miguel Gomez from comment #9)
> Is this still an issue? Or can we close it? Many things changed since this
> was reported so I guess it's obsolete.

This is already closed on bugs.freedesktop.org, so I suppose there is not much more we can do.