Bug 248267 - JSC init crashes WebKit with overcommit limit enabled
Summary: JSC init crashes WebKit with overcommit limit enabled
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: bmalloc
Version: Other
Hardware: PC Linux
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2022-11-23 01:27 PST by Paul van Tilburg
Modified: 2022-12-06 11:05 PST
CC List: 7 users

See Also:


Attachments

Description Paul van Tilburg 2022-11-23 01:27:08 PST
After the update of WebKitGTK in Ubuntu (both 20.04 LTS and 22.04 LTS) from 2.36.8 to 2.38.2, on a system with a VM overcommit limit enabled, the process now crashes on WebKit initialization via `webkit_web_context_new_ephemeral()` (or `webkit_web_context_new()`) without any error message.

I use the following overcommit configuration on 2 GiB and 4 GiB RAM systems:

  vm.overcommit_memory = 2
  vm.overcommit_ratio = 80

I get the following backtrace (unfortunately incomplete because of unavailable debug symbols):

  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
  [Current thread is 1 (Thread 0x7f2bf1762ac0 (LWP 11568))]

  Thread 4 (Thread 0x7f2be8822700 (LWP 11574)):
  #0  futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f2be8821b00, clockid=<optimized out>, expected=0, futex_word=0x7f2bd400e690) at ../sysdeps/nptl/futex-internal.h:320
  #1  __pthread_cond_wait_common (abstime=0x7f2be8821b00, clockid=<optimized out>, mutex=0x7f2bd400e640, cond=0x7f2bd400e668) at pthread_cond_wait.c:520
  #2  __pthread_cond_timedwait (cond=0x7f2bd400e668, mutex=0x7f2bd400e640, abstime=0x7f2be8821b00) at pthread_cond_wait.c:665
  #3  0x00007f2bf74c64ac in  () at /lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18
  #4  0x00007f2bf74c6746 in  () at /lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18
  #5  0x00007f2bf5b65609 in start_thread (arg=<optimized out>) at pthread_create.c:477
  #6  0x00007f2bf5a8a133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

  Thread 3 (Thread 0x7f2be9879700 (LWP 11570)):
  #0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x55fe89c24e28) at ../sysdeps/nptl/futex-internal.h:183
  #1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x55fe89c24dd8, cond=0x55fe89c24e00) at pthread_cond_wait.c:508
  #2  __pthread_cond_wait (cond=0x55fe89c24e00, mutex=0x55fe89c24dd8) at pthread_cond_wait.c:647
  #3  0x00007f2bef52a5eb in  () at /usr/lib/x86_64-linux-gnu/dri/iris_dri.so
  #4  0x00007f2bef52a1eb in  () at /usr/lib/x86_64-linux-gnu/dri/iris_dri.so
  #5  0x00007f2bf5b65609 in start_thread (arg=<optimized out>) at pthread_create.c:477
  #6  0x00007f2bf5a8a133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

  Thread 2 (Thread 0x7f2be9038700 (LWP 11572)):
  #0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x55fe89ca0f60) at ../sysdeps/nptl/futex-internal.h:183
  #1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x55fe89ca0f10, cond=0x55fe89ca0f38) at pthread_cond_wait.c:508
  #2  __pthread_cond_wait (cond=0x55fe89ca0f38, mutex=0x55fe89ca0f10) at pthread_cond_wait.c:647
  #3  0x00007f2bef52a5eb in  () at /usr/lib/x86_64-linux-gnu/dri/iris_dri.so
  #4  0x00007f2bef52a1eb in  () at /usr/lib/x86_64-linux-gnu/dri/iris_dri.so
  #5  0x00007f2bf5b65609 in start_thread (arg=<optimized out>) at pthread_create.c:477
  #6  0x00007f2bf5a8a133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

  Thread 1 (Thread 0x7f2bf1762ac0 (LWP 11568)):
  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
  #1  0x00007f2bf598d859 in __GI_abort () at abort.c:79
  #2  0x00007f2bf61aee1d in  () at /lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18
  #3  0x00007f2bf6eb2847 in  () at /lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18
  #4  0x00007f2bf5b6e4df in __pthread_once_slow (once_control=0x7f2bf78dfee8, init_routine=0x7f2bf5dc1c20 <__once_proxy>) at pthread_once.c:116
  #5  0x00007f2bf6eb86e1 in JSC::initialize() () at /lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18
  #6  0x00007f2bf8e31e81 in  () at /lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
  #7  0x00007f2bf8f93845 in  () at /lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
  #8  0x00007f2bf5b6e4df in __pthread_once_slow (once_control=0x7f2bfc2d8d90, init_routine=0x7f2bf5dc1c20 <__once_proxy>) at pthread_once.c:116
  #9  0x00007f2bf8f93c11 in  () at /lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
  #10 0x00007f2bf8fcb93c in  () at /lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
  #11 0x00007f2bf60301d1 in g_type_class_ref () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
  #12 0x00007f2bf60135e1 in g_object_new_valist () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
  #13 0x00007f2bf60136cd in g_object_new () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
  #14 0x00007f2bf8fb13e0 in webkit_web_context_new_ephemeral () at /lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
  #15 0x000055fe88b8b4c6 in main(int, char**) (argc=<optimized out>, argv=<optimized out>) at main.cpp:1342


and strace output:

[pid  9244] mmap(NULL, 1073750016, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f952bffe000
[pid  9244] madvise(0x7f952bffe000, 1073750016, MADV_DONTNEED) = 0
[pid  9244] futex(0x7f9591a750d8, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  9244] mmap(NULL, 8589934592, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid  9244] munmap(0x100000000, 4294967296) = 0
[pid  9244] mmap(NULL, 6442450944, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid  9244] munmap(0x80000000, 4294967296) = 0
[pid  9244] mmap(NULL, 5368709120, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid  9244] munmap(0x40000000, 4294967296) = 0
[pid  9244] mmap(NULL, 4831838208, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid  9244] munmap(0x20000000, 4294967296) = 0
[pid  9244] mmap(NULL, 4563402752, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid  9244] munmap(0x10000000, 4294967296) = 0
[pid  9244] mmap(NULL, 4429185024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid  9244] munmap(0x8000000, 4294967296) = 0
[pid  9244] mmap(NULL, 4362076160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid  9244] munmap(0x4000000, 4294967296) = 0
[pid  9244] mmap(NULL, 4328521728, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
[pid  9244] munmap(0x2000000, 4294967296) = 0
[pid  9244] rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
[pid  9244] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
[pid  9244] getpid()                    = 9244
[pid  9244] gettid()                    = 9244
[pid  9244] tgkill(9244, 9244, SIGABRT) = 0
[pid  9244] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  9244] --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=9244, si_uid=1000} ---
Comment 1 Paul van Tilburg 2022-11-23 01:28:38 PST
Related history: https://bugs.webkit.org/show_bug.cgi?id=183329
Comment 2 Michael Catanzaro 2022-11-23 13:07:49 PST
Why are debug symbols unavailable?
Comment 3 Paul van Tilburg 2022-11-24 00:50:23 PST
For some reason, they were not uploaded to the ddebs (debug symbols) repository and I could not find them.

  $ apt policy libwebkit2gtk-4.0-37 libwebkit2gtk-4.0-37-dbgsym
  libwebkit2gtk-4.0-37:
    Installed: 2.38.2-0ubuntu0.20.04.1
    Candidate: 2.38.2-0ubuntu0.20.04.1
    Version table:
   *** 2.38.2-0ubuntu0.20.04.1 500
          500 http://nl.archive.ubuntu.com/ubuntu focal-updates/main amd64   Packages
          500 http://security.ubuntu.com/ubuntu focal-security/main amd64   Packages
          100 /var/lib/dpkg/status
       2.28.1-1 500
          500 http://nl.archive.ubuntu.com/ubuntu focal/main amd64 Packages

  libwebkit2gtk-4.0-37-dbgsym:
    Installed: (none)
    Candidate: 2.36.8-0ubuntu0.20.04.1
    Version table:
       2.36.8-0ubuntu0.20.04.1 500
          500 http://ddebs.ubuntu.com focal-updates/main amd64 Packages
       2.28.1-1 500
          500 http://ddebs.ubuntu.com focal/main amd64 Packages
Comment 4 Michael Catanzaro 2022-11-24 06:52:17 PST
Weird. :/

Fortunately, the package at least exists: https://launchpad.net/ubuntu/+archive/primary/+files/libjavascriptcoregtk-4.0-18_2.38.2-0ubuntu0.20.04.1_amd64.deb

That should fill in the top frames of the backtrace and show us where the crash occurs.
Comment 5 Michael Catanzaro 2022-11-24 06:53:10 PST
Um sorry, that's the wrong link. Correct link: https://launchpad.net/ubuntu/+archive/primary/+files/libjavascriptcoregtk-4.0-18-dbgsym_2.38.2-0ubuntu0.20.04.1_amd64.ddeb
Comment 6 Paul van Tilburg 2022-11-24 07:32:34 PST
Indeed, great!

Here it is:

Thread 1 (Thread 0x7ff906b3cac0 (LWP 90112)):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ff90ad67859 in __GI_abort () at abort.c:79
#2  0x00007ff90b588e1d in WTFCrashWithInfo(int, char const*, char const*, int) () at WTF/Headers/wtf/Assertions.h:778
#3  JSC::StructureMemoryManager::StructureMemoryManager() () at ../Source/JavaScriptCore/heap/StructureAlignedMemoryAllocator.cpp:90
#4  WTF::LazyNeverDestroyed<JSC::StructureMemoryManager, WTF::AnyThreadsAccessTraits>::constructWithoutAccessCheck<>() () at WTF/Headers/wtf/NeverDestroyed.h:130
#5  WTF::LazyNeverDestroyed<JSC::StructureMemoryManager, WTF::AnyThreadsAccessTraits>::construct<>() () at WTF/Headers/wtf/NeverDestroyed.h:120
#6  JSC::StructureAlignedMemoryAllocator::initializeStructureAddressSpace() () at ../Source/JavaScriptCore/heap/StructureAlignedMemoryAllocator.cpp:155
#7  0x00007ff90c28c847 in operator() () at ../Source/JavaScriptCore/runtime/InitializeThreading.cpp:91
#8  __invoke_impl<void, JSC::initialize()::<lambda()> > () at /usr/include/c++/9/bits/invoke.h:60
#9  __invoke<JSC::initialize()::<lambda()> > () at /usr/include/c++/9/bits/invoke.h:95
#10 operator() () at /usr/include/c++/9/mutex:671
#11 operator() () at /usr/include/c++/9/mutex:676
#12 _FUN() () at /usr/include/c++/9/mutex:676
#13 0x00007ff90af484df in __pthread_once_slow (once_control=0x7ff90ccb9ee8 <JSC::initialize()::onceFlag>, init_routine=0x7ff90b19bc20 <__once_proxy>) at pthread_once.c:116
#14 0x00007ff90c2926e1 in __gthread_once () at /usr/include/x86_64-linux-gnu/c++/9/bits/gthr-default.h:700
#15 call_once<JSC::initialize()::<lambda()> > () at /usr/include/c++/9/mutex:683
#16 JSC::initialize() () at ../Source/JavaScriptCore/runtime/InitializeThreading.cpp:69
#17 0x00007ff90e20be81 in  () at /lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
#18 0x00007ff90e36d845 in  () at /lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
#19 0x00007ff90af484df in __pthread_once_slow (once_control=0x7ff9116b2d90, init_routine=0x7ff90b19bc20 <__once_proxy>) at pthread_once.c:116
#20 0x00007ff90e36dc11 in  () at /lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
#21 0x00007ff90e3a593c in  () at /lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
#22 0x00007ff90b40a1d1 in g_type_class_ref () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#23 0x00007ff90b3ed5e1 in g_object_new_valist () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#24 0x00007ff90b3ed6cd in g_object_new () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#25 0x00007ff90e38b3e0 in webkit_web_context_new_ephemeral () at /lib/x86_64-linux-gnu/libwebkit2gtk-4.0.so.37
#26 0x00005646fd77e4c6 in main(int, char**) (argc=<optimized out>, argv=<optimized out>) at LCMain.cpp:1342
Comment 7 Michael Catanzaro 2022-11-24 10:12:33 PST
OK, good backtrace. I'm not sure what to do about it, but at least we know where the allocation is failing now.
Comment 8 Paul van Tilburg 2022-11-24 12:51:43 PST
I tried to get a trace to find out what happens in 2.36.8 at that point and whether it also tries to map such large regions.

Unfortunately I failed; the output was too massive to correlate.
Comment 9 Carlos Alberto Lopez Perez 2022-11-24 17:42:49 PST
(In reply to Paul van Tilburg from comment #6)
> Indeed, great!
> 
> Here it is:
> 
> Thread 1 (Thread 0x7ff906b3cac0 (LWP 90112)):
> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007ff90ad67859 in __GI_abort () at abort.c:79
> #2  0x00007ff90b588e1d in WTFCrashWithInfo(int, char const*, char const*,
> int) () at WTF/Headers/wtf/Assertions.h:778
> #3  JSC::StructureMemoryManager::StructureMemoryManager() () at
> ../Source/JavaScriptCore/heap/StructureAlignedMemoryAllocator.cpp:90

It crashes there on a RELEASE_ASSERT:

[...]
#if CPU(ADDRESS64) && !ENABLE(STRUCTURE_ID_WITH_SHIFT)

class StructureMemoryManager {
public:
    StructureMemoryManager()
    {
        // Don't use the first page because zero is used as the empty StructureID and the first allocation will conflict.
        m_usedBlocks.set(0);

        uintptr_t mappedHeapSize = structureHeapAddressSize;
        for (unsigned i = 0; i < 8; ++i) {
            g_jscConfig.startOfStructureHeap = reinterpret_cast<uintptr_t>(OSAllocator::tryReserveUncommittedAligned(mappedHeapSize, structureHeapAddressSize, OSAllocator::FastMallocPages));
            if (g_jscConfig.startOfStructureHeap)
                break;
            mappedHeapSize /= 2;
        }
        g_jscConfig.sizeOfStructureHeap = mappedHeapSize;
        RELEASE_ASSERT(g_jscConfig.startOfStructureHeap && ((g_jscConfig.startOfStructureHeap & ~StructureID::structureIDMask) == g_jscConfig.startOfStructureHeap));
[...]

This assertion was added in 250199@main (bug 239957)
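
The loop above appears to line up with the strace output in the description. A minimal sketch, assuming (this is inferred from the strace sizes, not from the OSAllocator source) that an aligned reservation asks the kernel for size plus alignment bytes. `requestedBytes` and `attemptSizes` are hypothetical names used only for illustration:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t MiB = 1024ull * 1024ull;
constexpr std::uint64_t GiB = 1024ull * MiB;

// Assumption: an aligned reservation over-allocates by the alignment, so the
// kernel sees size + alignment bytes. Inferred from the strace output only.
std::uint64_t requestedBytes(std::uint64_t size, std::uint64_t alignment)
{
    // The alignment is expected to be a non-zero power of two.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return size + alignment;
}

// Sizes requested by the eight attempts of the constructor loop: the heap size
// starts at structureHeapAddressSize and is halved after every failure, while
// the alignment stays at structureHeapAddressSize.
std::array<std::uint64_t, 8> attemptSizes(std::uint64_t structureHeapAddressSize)
{
    std::array<std::uint64_t, 8> sizes{};
    std::uint64_t mappedHeapSize = structureHeapAddressSize;
    for (auto& size : sizes) {
        size = requestedBytes(mappedHeapSize, structureHeapAddressSize);
        mappedHeapSize /= 2;
    }
    return sizes;
}
```

Calling `attemptSizes(4 * GiB)` yields 8589934592, 6442450944, 5368709120, 4831838208, 4563402752, 4429185024, 4362076160 and 4328521728, the same eight sizes as the failing mmap() calls. If the assumption holds, halving mappedHeapSize never brings the request below 4 GiB, because the alignment slack stays constant.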
Comment 10 Carlos Alberto Lopez Perez 2022-11-24 17:57:12 PST
(In reply to Paul van Tilburg from comment #0)
> With the update of WebKitGTK in Ubuntu (both 20.04LTS and 22.04LTS) of
> 2.36.8 to 2.38.2 on a system with a VM overcommit limit enabled, it now
> crashes the process on WebKit initialization via
> `webkit_web_context_new_ephemeral()` (or `webkit_web_context_new()`) without
> any error message.
> 
> I use the following overcommit configuration on 2 GiB and 4 GiB RAM systems:
> 
>   vm.overcommit_memory = 2
>   vm.overcommit_ratio = 80
> 

I'm a bit confused with respect to what those settings achieve.

After reading the kernel documentation <https://www.kernel.org/doc/Documentation/vm/overcommit-accounting>, it looks to me like you are disabling overcommit, but then I also read this <https://unix.stackexchange.com/questions/348415/overcommit-memory-and-overcommit-ratio>, which suggests another interpretation of those values (overcommit up to 80%).

Which one is correct? What do those parameters actually do?

Also, what happens if you set vm.overcommit_memory = 0?
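
For reference, the kernel overcommit-accounting document linked above describes mode 2 as strict accounting: the total committed address space may not exceed swap plus overcommit_ratio percent of physical RAM, and the MAP_NORESERVE flag is ignored in that mode. A rough sketch of that arithmetic follows. `commitLimitBytes` is a hypothetical helper, the swap size of the affected systems is not stated in this report (zero is assumed in the example), and huge pages and vm.overcommit_kbytes are ignored:

```cpp
#include <cassert>
#include <cstdint>

// Approximate commit limit for vm.overcommit_memory = 2:
//   limit = swap + RAM * overcommit_ratio / 100
// Simplified: the kernel works in pages and subtracts huge pages from RAM.
std::uint64_t commitLimitBytes(std::uint64_t ramBytes,
                               std::uint64_t swapBytes,
                               unsigned overcommitRatio)
{
    // Guard against overflow in the multiplication below.
    assert(overcommitRatio == 0 || ramBytes <= UINT64_MAX / overcommitRatio);
    return swapBytes + ramBytes * overcommitRatio / 100;
}
```

Under these assumptions, `commitLimitBytes(2ull << 30, 0, 80)` is about 1.6 GiB and `commitLimitBytes(4ull << 30, 0, 80)` is about 3.2 GiB. Both are below the smallest request seen in the strace output (4328521728 bytes), which would be consistent with every attempt returning ENOMEM.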
Comment 11 Paul van Tilburg 2022-11-24 23:04:19 PST
Yes, if `overcommit_memory` is set to 0, then it works fine. It is also our current workaround.

The limit is there (possibly for historical reasons) to restrict the leeway of user space, because of issues where the kernel still got stuck. I am dealing with unattended devices that, if stuck, are lost to us.

See also https://bugs.webkit.org/show_bug.cgi?id=183329
Comment 12 Radar WebKit Bug Importer 2022-11-30 01:28:17 PST
<rdar://problem/102803563>
Comment 13 Mark Lam 2022-12-06 11:05:26 PST
The RELEASE_ASSERT there:

    RELEASE_ASSERT(g_jscConfig.startOfStructureHeap && ((g_jscConfig.startOfStructureHeap & ~StructureID::structureIDMask) == g_jscConfig.startOfStructureHeap));

... is enforcing that the start of the StructureHeap is always aligned with structureHeapAddressSize.

#elif CPU(ADDRESS64)
    static constexpr CPURegister structureIDMask = structureHeapAddressSize - 1;
#endif

#if !ENABLE(STRUCTURE_ID_WITH_SHIFT)
#if defined(STRUCTURE_HEAP_ADDRESS_SIZE_IN_MB) && STRUCTURE_HEAP_ADDRESS_SIZE_IN_MB > 0
constexpr uintptr_t structureHeapAddressSize = STRUCTURE_HEAP_ADDRESS_SIZE_IN_MB * MB;
#elif PLATFORM(IOS_FAMILY) && CPU(ARM64) && !CPU(ARM64E)
constexpr uintptr_t structureHeapAddressSize = 512 * MB;
#else
constexpr uintptr_t structureHeapAddressSize = 4 * GB;
#endif
#endif // !ENABLE(STRUCTURE_ID_WITH_SHIFT)

I don't know what `vm.overcommit_memory = 2` does, but my guess is that it affected the allocation of the StructureHeap such that the invariant is now broken.
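
The check can be read as two conditions: the start address is non-zero, and its low bits (those covered by structureIDMask) are all zero. A minimal sketch of that predicate, with `isValidStructureHeapStart` as a hypothetical stand-in for the expression inside the RELEASE_ASSERT:

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the asserted expression: start must be non-zero and aligned to
// heapAddressSize, where the mask plays the role of structureIDMask
// (structureHeapAddressSize - 1).
bool isValidStructureHeapStart(std::uint64_t start, std::uint64_t heapAddressSize)
{
    // The mask trick only works for a non-zero power-of-two size.
    assert(heapAddressSize != 0 && (heapAddressSize & (heapAddressSize - 1)) == 0);
    const std::uint64_t mask = heapAddressSize - 1;
    return start != 0 && (start & ~mask) == start;
}
```

With a 4 GiB heap address size, an address such as 0x7f9500000000 passes, 0x7f952bffe000 fails on alignment, and 0 fails on the first condition. Since the strace output in the description shows every reservation attempt returning ENOMEM, it seems likely that startOfStructureHeap was still 0 when the assertion fired, so the first condition failed rather than the alignment one.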