Bug 161908 - [GTK] WebKitGTK+ stalls my whole desktop
Status: RESOLVED WORKSFORME
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK
Version: WebKit Nightly Build
Hardware: PC Linux
Importance: P2 Normal
Assignee: Nobody
Reported: 2016-09-13 05:14 PDT by Andres Gomez Garcia
Modified: 2017-05-04 08:54 PDT


Attachments
syslog showing a whole epiphany session from launch till it is killed (148.23 KB, text/plain)
2016-09-16 07:56 PDT, Andres Gomez Garcia

Description Andres Gomez Garcia 2016-09-13 05:14:58 PDT
Just installed 2.13.91 from Debian experimental.

I'm running Epiphany with the dconf key:

"process-model" = "shared-secondary-process"

Upon starting ephy, recovering a session with lots of windows and tabs, my desktop gets frozen.

Checking CPU, swap and disk, none of them seems to be suffering much; the desktop just freezes. Every now and then the mouse responds slowly and a new window pops up. I can also switch to a VT, but at some point that stops working and the only solution is to force a reboot.

I can see this repeated again and again:

$ cat  /var/log/kern.log

[snip]

Sep 13 14:55:36 pomeron kernel: [15010.344186] swapper/3: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
Sep 13 14:55:36 pomeron kernel: [15010.344197] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G           OE   4.6.0-1-amd64 #1 Debian 4.6.1-1
Sep 13 14:55:36 pomeron kernel: [15010.344201] Hardware name: LENOVO 42914BG/42914BG, BIOS 8DET70WW (1.40 ) 05/14/2015
Sep 13 14:55:36 pomeron kernel: [15010.344204]  0000000000000086 1364666ac54b1e78 ffffffff81311425 0000000000000000
Sep 13 14:55:36 pomeron kernel: [15010.344211]  ffff88021e2c3bc8 ffffffff81177bb1 020800209c4987c0 ffff88009c4987c0
Sep 13 14:55:36 pomeron kernel: [15010.344215]  ffff88021e5f6330 ffff880139d65b00 0000000000000000 ffff88009c498848
Sep 13 14:55:36 pomeron kernel: [15010.344220] Call Trace:
Sep 13 14:55:36 pomeron kernel: [15010.344223]  <IRQ>  [<ffffffff81311425>] ? dump_stack+0x5c/0x77
Sep 13 14:55:36 pomeron kernel: [15010.344248]  [<ffffffff81177bb1>] ? warn_alloc_failed+0x101/0x160
Sep 13 14:55:36 pomeron kernel: [15010.344255]  [<ffffffff8117b474>] ? __alloc_pages_nodemask+0x524/0xc90
Sep 13 14:55:36 pomeron kernel: [15010.344263]  [<ffffffff8117bdfc>] ? __alloc_page_frag+0x15c/0x180
Sep 13 14:55:36 pomeron kernel: [15010.344270]  [<ffffffff814b8a98>] ? __netdev_alloc_skb+0x98/0x100
Sep 13 14:55:36 pomeron kernel: [15010.344295]  [<ffffffffc004d6ae>] ? e1000_alloc_rx_buffers+0x25e/0x2c0 [e1000e]
Sep 13 14:55:36 pomeron kernel: [15010.344310]  [<ffffffffc004a4b3>] ? e1000_clean_rx_irq+0x2f3/0x430 [e1000e]
Sep 13 14:55:36 pomeron kernel: [15010.344315]  [<ffffffff810b4a4c>] ? load_balance+0x71c/0x8f0
Sep 13 14:55:36 pomeron kernel: [15010.344329]  [<ffffffffc005180c>] ? e1000e_poll+0x7c/0x2d0 [e1000e]
Sep 13 14:55:36 pomeron kernel: [15010.344334]  [<ffffffff814c7b13>] ? net_rx_action+0x233/0x370
Sep 13 14:55:36 pomeron kernel: [15010.344342]  [<ffffffff815c90b8>] ? __do_softirq+0xf8/0x28e
Sep 13 14:55:36 pomeron kernel: [15010.344347]  [<ffffffff8107fe3b>] ? irq_exit+0x9b/0xa0
Sep 13 14:55:36 pomeron kernel: [15010.344352]  [<ffffffff815c8e0f>] ? do_IRQ+0x4f/0xd0
Sep 13 14:55:36 pomeron kernel: [15010.344357]  [<ffffffff815c6f42>] ? common_interrupt+0x82/0x82
Sep 13 14:55:36 pomeron kernel: [15010.344359]  <EOI>  [<ffffffff8148f6f5>] ? cpuidle_enter_state+0x115/0x2c0
Sep 13 14:55:36 pomeron kernel: [15010.344368]  [<ffffffff8148f6e5>] ? cpuidle_enter_state+0x105/0x2c0
Sep 13 14:55:36 pomeron kernel: [15010.344374]  [<ffffffff810bb090>] ? cpu_startup_entry+0x290/0x330
Sep 13 14:55:36 pomeron kernel: [15010.344380]  [<ffffffff8104e37a>] ? start_secondary+0x15a/0x190
Sep 13 14:55:36 pomeron kernel: [15010.344383] Mem-Info:
Sep 13 14:55:36 pomeron kernel: [15010.344391] active_anon:1011948 inactive_anon:796705 isolated_anon:0
Sep 13 14:55:36 pomeron kernel: [15010.344391]  active_file:12988 inactive_file:12939 isolated_file:0
Sep 13 14:55:36 pomeron kernel: [15010.344391]  unevictable:60 dirty:1 writeback:308511 unstable:0
Sep 13 14:55:36 pomeron kernel: [15010.344391]  slab_reclaimable:10937 slab_unreclaimable:128252
Sep 13 14:55:36 pomeron kernel: [15010.344391]  mapped:150111 shmem:619183 pagetables:14991 bounce:0
Sep 13 14:55:36 pomeron kernel: [15010.344391]  free:7075 free_pcp:8 free_cma:0
Sep 13 14:55:36 pomeron kernel: [15010.344398] Node 0 DMA free:4kB min:128kB low:160kB high:192kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15360kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:15356kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Sep 13 14:55:36 pomeron kernel: [15010.344411] lowmem_reserve[]: 0 3382 7834 7834 7834
Sep 13 14:55:36 pomeron kernel: [15010.344417] Node 0 DMA32 free:25492kB min:29120kB low:36400kB high:43680kB active_anon:1765444kB inactive_anon:1345168kB active_file:17580kB inactive_file:17496kB unevictable:96kB isolated(anon):0kB isolated(file):0kB present:3561088kB managed:3484920kB mlocked:96kB dirty:0kB writeback:552032kB mapped:247436kB shmem:1009408kB slab_reclaimable:16756kB slab_unreclaimable:247948kB kernel_stack:4976kB pagetables:24860kB unstable:0kB bounce:0kB free_pcp:32kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:357536 all_unreclaimable? no
Sep 13 14:55:36 pomeron kernel: [15010.344429] lowmem_reserve[]: 0 0 4452 4452 4452
Sep 13 14:55:36 pomeron kernel: [15010.344435] Node 0 Normal free:2804kB min:38332kB low:47912kB high:57492kB active_anon:2282348kB inactive_anon:1841652kB active_file:34372kB inactive_file:34260kB unevictable:144kB isolated(anon):0kB isolated(file):0kB present:4691968kB managed:4559088kB mlocked:144kB dirty:4kB writeback:682012kB mapped:353008kB shmem:1467324kB slab_reclaimable:26992kB slab_unreclaimable:249704kB kernel_stack:6688kB pagetables:35104kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:229628 all_unreclaimable? no
Sep 13 14:55:36 pomeron kernel: [15010.344444] lowmem_reserve[]: 0 0 0 0 0
Sep 13 14:55:36 pomeron kernel: [15010.344450] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4kB
Sep 13 14:55:36 pomeron kernel: [15010.344464] Node 0 DMA32: 6223*4kB (UME) 3*8kB (ME) 4*16kB (UME) 2*32kB (UE) 1*64kB (M) 1*128kB (E) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 25492kB
Sep 13 14:55:36 pomeron kernel: [15010.344481] Node 0 Normal: 189*4kB (UME) 66*8kB (UME) 47*16kB (UM) 20*32kB (UM) 2*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2804kB
Sep 13 14:55:36 pomeron kernel: [15010.344498] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Sep 13 14:55:36 pomeron kernel: [15010.344501] 953796 total pagecache pages
Sep 13 14:55:36 pomeron kernel: [15010.344503] 308686 pages in swap cache
Sep 13 14:55:36 pomeron kernel: [15010.344506] Swap cache stats: add 2459899, delete 2151213, find 1010596/1077900
Sep 13 14:55:36 pomeron kernel: [15010.344509] Free swap  = 4809604kB
Sep 13 14:55:36 pomeron kernel: [15010.344511] Total swap = 7811068kB
Sep 13 14:55:36 pomeron kernel: [15010.344513] 2067260 pages RAM
Sep 13 14:55:36 pomeron kernel: [15010.344515] 0 pages HighMem/MovableOnly
Sep 13 14:55:36 pomeron kernel: [15010.344517] 52418 pages reserved
Sep 13 14:55:36 pomeron kernel: [15010.344519] 0 pages hwpoisoned

[snip]
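
One way to read the Mem-Info dump above is to compare each zone's free memory with its min watermark. The following is a minimal Python sketch, not part of the original report: the function name zones_below_min and the shortened LOG sample are illustrative assumptions, and the numbers are copied from the three "Node 0 ..." zone lines in the log.

```python
import re

# Zone summary lines, shortened from the kernel log in the description.
LOG = """\
Node 0 DMA free:4kB min:128kB low:160kB high:192kB
Node 0 DMA32 free:25492kB min:29120kB low:36400kB high:43680kB
Node 0 Normal free:2804kB min:38332kB low:47912kB high:57492kB
"""

# Matches the per-zone summary lines; the buddy-list lines ("Node 0 DMA: 1*4kB ...")
# do not match because the zone name there is followed by a colon.
ZONE_RE = re.compile(r"Node (\d+) (\w+) free:(\d+)kB min:(\d+)kB")


def zones_below_min(log_text):
    """Return (zone, free_kb, min_kb) for each zone whose free memory is under its min watermark."""
    result = []
    for match in ZONE_RE.finditer(log_text):
        zone = match.group(2)
        free_kb = int(match.group(3))
        min_kb = int(match.group(4))
        if free_kb < min_kb:
            result.append((zone, free_kb, min_kb))
    return result


for zone, free_kb, min_kb in zones_below_min(LOG):
    print(f"{zone}: free {free_kb} kB < min {min_kb} kB")
```

For this dump, all three zones (DMA, DMA32 and Normal) come out below their min watermark. That would be consistent with atomic allocations, which cannot wait for reclaim, failing even though swap is far from full.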
Comment 1 Andres Gomez Garcia 2016-09-13 05:16:05 PDT
With this problem and bug 161862 (which may be interrelated), I just cannot use webkit since 2.13.90. I think this started with the fix for bug 160389.
Comment 2 Andres Gomez Garcia 2016-09-16 07:56:39 PDT
Created attachment 289062
syslog showing a whole epiphany session from launch till it is killed

* I've used WebKitGtk+ with my own JHBuild setting:
  https://github.com/tanty/jhbuild-epiphany/tree/master

  Epiphany 3.20.3 and WebKit 2.13.92.

  The compilation was done with CMake args:

  '-DUSE_LD_GOLD=OFF -DPORT=GTK -DCMAKE_BUILD_TYPE=Release -DENABLE_MINIBROWSER=ON -DCMAKE_C_FLAGS_RELEASE="-O0 -g1 -DNDEBUG -DG_DEBUG=fatal-criticals -DG_DISABLE_CAST_CHECKS" -DCMAKE_CXX_FLAGS_RELEASE="-O0 -g1 -DNDEBUG -DG_DEBUG=fatal-criticals -DG_DISABLE_CAST_CHECKS"'


* I could also reproduce exactly the same behaviour with a freshly installed 2.13.92 from Debian experimental, courtesy of Alberto García.

* I'm running Epiphany with the dconf key:

  "process-model" = "shared-secondary-process"

---

I think the problem was also the same with 2.13.90

I kept checking what was going on with the RAM, as I was seeing "page allocation failure" again and again whenever my computer froze.

My computer has 8 GB of RAM and ~8 GB of swap space. Thanks to Adrián Pérez, I tweaked my vm.swappiness value from the default of "60" to "5".

Now, running the desktop and Epiphany was smoother but, eventually, the desktop still kept freezing. However, it now recovered from the freezes.
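
For anyone wanting to check the same setting: on Linux the current value is exposed in /proc/sys/vm/swappiness, and it is usually changed as root with something like "sysctl vm.swappiness=5". The snippet below is only a small sketch for reading the value; the helper name read_swappiness is an assumption made for illustration, and it does not change anything on the system.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the vm.swappiness value stored in the given file as an integer."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    proc_file = Path("/proc/sys/vm/swappiness")
    # Only read the real file when it exists (i.e. on a Linux system).
    if proc_file.exists():
        print("vm.swappiness =", read_swappiness(proc_file))
```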


Basically, it seems that with the new threaded compositor and with the dconf key:

"process-model" = "shared-secondary-process"

The WebProcess is growing hugely and it has to be killed.
Comment 3 Andres Gomez Garcia 2016-10-06 04:13:10 PDT
I keep experiencing this, even with the WKGTK+ from Debian Testing.

Versions:
WebKit 2.12.5, Ephy 3.20.3

The only difference from a standard installation is that I have set the dconf key:

"process-model" = "shared-secondary-process"
Comment 4 Michael Catanzaro 2016-10-06 06:51:48 PDT
(In reply to comment #3)
> I keep experiencing this, even with the WKGTK+ from Debian Testing.
> 
> Versions:
> WebKit 2.12.5, Ephy 3.20.3

Well 2.12.5 was affected by bug #126122, so it's almost expected. Are you sure this is really the same issue? This started out as a bug report against 2.13.9x, which has a completely redone graphics architecture.
Comment 5 Andres Gomez Garcia 2016-10-06 07:33:53 PDT
(In reply to comment #4)
> (In reply to comment #3)
> > I keep experiencing this, even with the WKGTK+ from Debian Testing.
> > 
> > Versions:
> > WebKit 2.12.5, Ephy 3.20.3
> 
> Well 2.12.5 was affected by bug #126122, so it's almost expected. Are you
> sure this is really the same issue? This started out as a bug report against
> 2.13.9x, which has a completely redone graphics architecture.

I'm completely sure my desktop freezes and I can see the same kernel messages.
Comment 6 Zan Dobersek 2016-10-07 06:42:43 PDT
What environment are you running Ephy under (in terms of window management)?

I see both gdm-x-session and gdm-wayland-session running at the same time. Is this normal? Xwayland and Xorg I can understand since those would be running under gdm-wayland-session to operate with the X11 plugins.

Backtrace in Ephy shows being stuck on the main thread, waiting for an X protocol reply while shutting down a plugin.

... which is odd, because AFAIU WebKit::WebProcessConnection::didClose() call-chain that's on Ephy's (i.e. UIProcess') main thread should be executed in the PluginProcess.
Comment 7 Andres Gomez Garcia 2016-10-07 07:32:42 PDT
(In reply to comment #6)
> What environment are you running Ephy under (in terms of window management)?
> 
> I see both gdm-x-session and gdm-wayland-session running at the same time.
> Is this normal? Xwayland and Xorg I can understand since those would be
> running under gdm-wayland-session to operate with the X11 plugins.

I may be wrong but this is how it goes:

Since GNOME 3.20 GDM defaults to Wayland. So, when my system starts up, GDM is launched under Wayland by the Debian-+ user:

Debian-+  2962  0.0  0.0 191964  5260 tty1     Ssl+ 14:54   0:00 /usr/lib/gdm3/gdm-wayland-session gnome-session --autostart /usr/share/gdm/greeter/autostart

However, when I log-in into my GNOME-Session, I bring up a Xorg session, run by my user:

tanty     3118  0.0  0.0 202048  5564 tty2     Ssl+ 14:54   0:00 /usr/lib/gdm3/gdm-x-session --run-script default

Of course, I can change GDM's behavior to always use the Xorg backend, even at startup, if you think that could be causing some problems ...

$ cat /etc/gdm3/daemon.conf 
...
[daemon]
# Uncoment the line below to force the login screen to use Xorg
#WaylandEnable=false
...
Comment 8 Zan Dobersek 2016-10-07 07:53:12 PDT
(In reply to comment #7)
> 
> Of course, I can change GDM's behavior to always use the Xorg backend, even
> at startup, if you think that could be causing some problems ...
> 

Maybe. Can you also try and disable plugins?
Comment 9 Michael Catanzaro 2016-10-07 09:25:22 PDT
(In reply to comment #7) 
> Since GNOME 3.20 GDM defaults to Wayland.

Actually it's been that way since at least GNOME 3.16, I think maybe even GNOME 3.14.

> So, when my system starts up, GDM
> is launched under Wayland by the Debian-+ user:
> 
> However, when I log-in into my GNOME-Session, I bring up a Xorg session, run
> by my user:

Yes
Comment 10 Carlos Alberto Lopez Perez 2017-03-06 18:52:54 PST
I have some questions. If you can find some time to test them, that may help us isolate the problem a bit more.

1) Does disabling accelerated compositing (AC), by setting the environment variable WEBKIT_DISABLE_COMPOSITING_MODE=1 before starting the browser, make any difference?


2) Does disabling bmalloc, by setting the environment variable Malloc=1 before starting the browser, make any difference?


3) Does starting the browser with the default "process-model" = "one-secondary-process-per-web-view" dconf value make any difference?
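
Tests 1) and 2) could be scripted roughly as follows. This is a hedged sketch rather than an official procedure: the environment variable names come from the questions above, while the helper names diagnostic_env and launch_browser and the command name "epiphany" are assumptions (the binary may be named differently on some distributions).

```python
import os
import subprocess


def diagnostic_env(disable_compositing=False, disable_bmalloc=False, base_env=None):
    """Build an environment with the requested diagnostic variables set."""
    env = dict(os.environ if base_env is None else base_env)
    if disable_compositing:
        # Question 1: disable accelerated compositing.
        env["WEBKIT_DISABLE_COMPOSITING_MODE"] = "1"
    if disable_bmalloc:
        # Question 2: disable bmalloc.
        env["Malloc"] = "1"
    return env


def launch_browser(command=("epiphany",), **toggles):
    """Start the browser (assumed command name) with the chosen toggles applied."""
    return subprocess.Popen(list(command), env=diagnostic_env(**toggles))


if __name__ == "__main__":
    # Show which variables each test would add, without launching anything.
    print(diagnostic_env(disable_compositing=True, base_env={}))
    print(diagnostic_env(disable_bmalloc=True, base_env={}))
```

Running one toggle at a time, for example launch_browser(disable_compositing=True), keeps the effect of each variable separable.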
Comment 11 Miguel Gomez 2017-04-18 09:06:10 PDT
From the logs, what I see is that the WebProcess starts devouring memory (like 6.5GB of VM), the Memory Pressure Handler is running constantly, and eventually the WebProcess gets killed by the OOM killer. During this, the kernel fails several times to allocate memory for other tasks, like buffers for network packets and such.

This huge memory consumption is probably due to the threaded compositor, which was enabled in 2.13.4, and the fact that accelerated compositing was enabled by default (with the extra memory consumption in Mesa due to the usage of the software rasterizer).

This memory issue was greatly improved by returning to on-demand accelerated compositing and by using a higher OpenGL version (no more software rasterizer in Mesa), together with other tweaks in garbage collection. The problem should not be reproducible using the latest 2.15 and 2.16 versions.

Can you verify this, Andres? Should we close this?
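
To put rough numbers on the memory pressure, the page counts in the Mem-Info dump from the description can be converted to kB. This is a back-of-the-envelope sketch only: it assumes 4 kB pages (the usual x86-64 base page size, which matches the smallest buddy-list entries in the dump), it uses the snapshot from the description rather than the attached syslog, and the names PAGE_KB, pages_to_kb and summarize are illustrative.

```python
PAGE_KB = 4  # assumption: 4 kB pages

# Page counts copied from the Mem-Info dump in the description.
mem_info_pages = {
    "active_anon": 1011948,
    "inactive_anon": 796705,
    "shmem": 619183,
    "writeback": 308511,
    "free": 7075,
    "ram_total": 2067260,
}
total_swap_kb = 7811068  # "Total swap"
free_swap_kb = 4809604   # "Free swap"


def pages_to_kb(pages, page_kb=PAGE_KB):
    """Convert a page count to kB."""
    return pages * page_kb


def summarize(pages, total_swap, free_swap):
    """Return the main memory figures of the dump in kB."""
    return {
        "ram_kb": pages_to_kb(pages["ram_total"]),
        "anon_kb": pages_to_kb(pages["active_anon"] + pages["inactive_anon"]),
        "shmem_kb": pages_to_kb(pages["shmem"]),
        "writeback_kb": pages_to_kb(pages["writeback"]),
        "free_kb": pages_to_kb(pages["free"]),
        "swap_used_kb": total_swap - free_swap,
    }


summary = summarize(mem_info_pages, total_swap_kb, free_swap_kb)
for name, kb in summary.items():
    print(f"{name}: {kb} kB ({kb / 1024 / 1024:.2f} GiB)")
```

Under the 4 kB assumption, the anonymous LRU lists account for roughly 6.9 GiB of about 7.9 GiB of RAM, with about 2.9 GiB of swap in use and only about 28 MB free. The shmem, writeback and free totals also agree with the sums of the per-zone figures in the same dump.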
Comment 12 Michael Catanzaro 2017-05-04 08:54:28 PDT
Since we don't have anything actionable here, and we do not have a NEEDINFO state on WebKit Bugzilla, I'm going to close this bug for now.

Andres, please reopen if you can reply to Miguel's questions.
Comment 13 Michael Catanzaro 2017-05-04 08:54:44 PDT
And Carlos Lopez's questions too.