Bug 209236 - REGRESSION(r249808): [GTK] Crash in JSC Config::permanentlyFreeze() on architecture ppc64el
Summary: REGRESSION(r249808): [GTK] Crash in JSC Config::permanentlyFreeze() on archit...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Michael Catanzaro
URL:
Keywords: InRadar
: 204017 209171 (view as bug list)
Depends on: 209670
Blocks:
  Show dependency treegraph
 
Reported: 2020-03-18 08:55 PDT by seb128
Modified: 2020-04-29 09:04 PDT (History)
24 users (show)

See Also:


Attachments
small script testcase (395 bytes, text/x-python)
2020-03-18 08:55 PDT, seb128
no flags Details
gdb (17.66 KB, text/plain)
2020-03-18 13:42 PDT, seb128
no flags Details
Patch (7.33 KB, patch)
2020-03-19 18:28 PDT, Michael Catanzaro
no flags Details | Formatted Diff | Diff
Patch (7.54 KB, patch)
2020-03-20 08:48 PDT, Michael Catanzaro
no flags Details | Formatted Diff | Diff
Patch (7.74 KB, patch)
2020-03-20 09:00 PDT, Michael Catanzaro
no flags Details | Formatted Diff | Diff
Patch (7.74 KB, patch)
2020-03-20 12:35 PDT, Michael Catanzaro
no flags Details | Formatted Diff | Diff
Patch (7.61 KB, patch)
2020-03-23 09:51 PDT, Michael Catanzaro
no flags Details | Formatted Diff | Diff
Patch (7.56 KB, patch)
2020-03-23 09:52 PDT, Michael Catanzaro
no flags Details | Formatted Diff | Diff
JSCOnly aarch64 strace log (19.66 KB, text/plain)
2020-03-27 11:36 PDT, Zan Dobersek
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description seb128 2020-03-18 08:55:11 PDT
Created attachment 393861 [details]
small script testcase

Unsure what useful information to provide but the webkit2gtk 2.26 -> 2.28 upgrade is blocked in Ubuntu because some reverse depends tests fail on s390x/ppc64el

Trying a simple python webkitgtk script, using 2.26 the webpage loads fine but 2.28 gives an empty view...

Any hint on how to debug would be useful since that blocks the new serie to make it to the current Ubuntu serie
Comment 1 seb128 2020-03-18 09:31:09 PDT
The MiniBrowser provided by webkitgtk has the same issue
Comment 2 Carlos Alberto Lopez Perez 2020-03-18 09:46:24 PDT
(In reply to seb128 from comment #1)
> The MiniBrowser provided by webkitgtk has the same issue

Can you try running the script after exporting the environment variable WEBKIT_DISABLE_COMPOSITING_MODE=1 ? and with the environment variable JavaScriptCoreUseJIT=0 ? does any of it help?
Comment 3 seb128 2020-03-18 11:25:25 PDT
WEBKIT_DISABLE_COMPOSITING_MODE=1 doesn't seem to make a difference
Comment 4 Michael Catanzaro 2020-03-18 11:33:32 PDT
BTW we know JavaScriptCore tests on both of these architectures started failing sometime between 2.26 and 2.28, so wouldn't surprise me if they're totally busted. I have pending to bisect the issues and then report bugs, but no promises as to when.

 * ppc64le is special because it uses large page sizes (and different sizes on different distros!)
 * s390x is special because it's big endian

Ideally, we would set up JSCOnly EWS for these architectures, but due to the nature of the architectures, that's not likely going to be easy, especially for s390x.
Comment 5 Michael Catanzaro 2020-03-18 11:37:51 PDT
(In reply to Michael Catanzaro from comment #4)
> BTW we know JavaScriptCore tests on both of these architectures started
> failing sometime between 2.26 and 2.28, so wouldn't surprise me if they're
> totally busted.

(Separate issues, btw, because of course it would be too easy for there to be just one problem for both arches. ;)
Comment 6 Michael Catanzaro 2020-03-18 11:48:34 PDT
(In reply to Carlos Alberto Lopez Perez from comment #2)
> and with the environment variable JavaScriptCoreUseJIT=0 ? does any of it help?

Hopefully that should not make any difference, because these arches should both be using cloop.
Comment 7 seb128 2020-03-18 13:18:44 PDT
One of the test failing is ruby-gnome's webkit test unit
https://salsa.debian.org/ruby-team/ruby-gnome/-/blob/master/webkit2-gtk/test/run-test.rb

The Ubuntu package calls it under xvfb.  It makes a bit easier to test for the issue/regression from a command line only environment.

I confirmed the issues on a graphical environment by using ssh -X to a porter box
Comment 8 seb128 2020-03-18 13:21:12 PDT
Sorry, I overlooked earlier than you asked also about JavaScriptCoreUseJIT=0 ... that doesn't make a difference either
Comment 9 Carlos Alberto Lopez Perez 2020-03-18 13:25:32 PDT
If your test case, can you see a a process named WebKitWebProcess still alive after several seconds (let's say: 1 minute after loading the test) or does this process dies?
Comment 10 seb128 2020-03-18 13:30:35 PDT
> If your test case, can you see a a process named WebKitWebProcess still alive after several seconds 

No, that process goes away a few seconds after the view is opened
Comment 11 seb128 2020-03-18 13:31:52 PDT
attaching with gdb, it gives 

Thread 1 "WebKitWebProces" received signal SIGABRT, Aborted.
0x000078724d300468 in __libc_signal_restore_set (set=0x7ffffda70cb8)
    at ../sysdeps/unix/sysv/linux/internal-signals.h:86
86	../sysdeps/unix/sysv/linux/internal-signals.h: No such file or directory.
(gdb) bt
#0  0x000078724d300468 in __libc_signal_restore_set (set=0x7ffffda70cb8)
    at ../sysdeps/unix/sysv/linux/internal-signals.h:86
#1  __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:48
#2  0x000078724d2d7cd0 in __GI_abort () at abort.c:79
#3  0x000078724b27eed4 in JSC::Config::permanentlyFreeze() ()
    at /lib/powerpc64le-linux-gnu/libjavascriptcoregtk-4.0.so.18
#4  0x000078724b4b0fd0 in JSC::VM::VM(JSC::VM::VMType, JSC::HeapType) ()
    at /lib/powerpc64le-linux-gnu/libjavascriptcoregtk-4.0.so.18
#5  0x000078724b4b2ad4 in JSC::VM::create(JSC::HeapType) ()


I will try to install debug symbols
Comment 12 Carlos Alberto Lopez Perez 2020-03-18 13:36:26 PDT
(In reply to seb128 from comment #10)
> > If your test case, can you see a a process named WebKitWebProcess still alive after several seconds 
> 
> No, that process goes away a few seconds after the view is opened

great.. we have to know why it dies.

Don't attach gdb to it, just enable core generation by running "ulimit -c unlimited", then run your test case and wait until it dies and finishes writting the core file (can take a while).

Then generate a backtrace from the core file with this command.

gdb --batch -ex "thread apply all bt full" /path/to/WebKitWebProcess core &> core_backtrace.txt

and upload here the txt file.

btw: install the dbg-sym packages before genearating the backtrace
Comment 13 seb128 2020-03-18 13:42:25 PDT
Created attachment 393897 [details]
gdb

Using gdb on the process should be the same no? 

the ulimit/core doesn't work, probably because of apport

Anyway, I got one with gdb on the process, attaching the file now, let me know if I should better try your suggested way for some reason
Comment 14 seb128 2020-03-18 13:45:13 PDT
in fact apport collected the dump, using gdb on it gives the same stacktrace that it does on the process
Comment 15 Carlos Alberto Lopez Perez 2020-03-18 13:57:04 PDT
So its crashing here:

#3  0x0000705dcc8beed4 in CRASH_WITH_INFO(...) () at DerivedSources/ForwardingHeaders/wtf/Assertions.h:660
#4  JSC::Config::permanentlyFreeze() () at ../Source/JavaScriptCore/runtime/JSCConfig.cpp:78
#5  0x0000705dccaf0fd0 in JSC::VM::VM(JSC::VM::VMType, JSC::HeapType) () at ../Source/JavaScriptCore/runtime/VM.cpp:586
#6  0x0000705dccaf2ad4 in JSC::VM::create(JSC::HeapType) () at ../Source/JavaScriptCore/runtime/VM.cpp:703
#7  0x0000705dd06d0b48 in WebCore::commonVMSlow() () at ../Source/WebCore/bindings/js/CommonVM.cpp:55
#8  0x0000705dd0e9fe74 in WebCore::commonVM() () at ../Source/WebCore/bindings/js/CommonVM.h:52
#9  WebCore::PageScriptDebugServer::PageScriptDebugServer(WebCore::Page&) () at ../Source/WebCore/inspector/PageScriptDebugServer.cpp:58
#10 0x0000705dd0e87548 in WebCore::InspectorController::InspectorController(WebCore::Page&, WebCore::InspectorClient*) () at ../Source/WebCore/inspector/InspectorController.cpp:105
#11 0x0000705dd116e254 in std::make_unique<WebCore::InspectorController, WebCore::Page&, WebCore::InspectorClient*&>(WebCore::Page&, WebCore::InspectorClient*&) () at /usr/include/c++/9/bits/unique_ptr.h:857
#12 WTF::makeUnique<WebCore::InspectorController, WebCore::Page&, WebCore::InspectorClient*&>(WebCore::Page&, WebCore::InspectorClient*&) () at DerivedSources/ForwardingHeaders/wtf/StdLibExtras.h:483
#13 WebCore::Page::Page(WebCore::PageConfiguration&&) () at ../Source/WebCore/page/Page.cpp:279
#14 0x0000705dcfbdf4f8 in std::make_unique<WebCore::Page, WebCore::PageConfiguration>(WebCore::PageConfiguration&&) () at /usr/include/c++/9/bits/unique_ptr.h:857
#15 WTF::makeUnique<WebCore::Page, WebCore::PageConfiguration>(WebCore::PageConfiguration&&) () at DerivedSources/ForwardingHeaders/wtf/StdLibExtras.h:483
#16 WebKit::WebPage::WebPage(WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters&&) () at ../Source/WebKit/WebProcess/WebPage/WebPage.cpp:536
#17 0x0000705dcfbe0254 in WebKit::WebPage::create(WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters&&) () at ../Source/WebKit/WebProcess/WebPage/WebPage.cpp:379
#18 0x0000705dcf994ad8 in WebKit::WebProcess::createWebPage(WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters&&) () at ../Source/WebKit/WebProcess/WebProcess.cpp:685
#19 0x0000705dcf434e08 in IPC::callMemberFunctionImpl<WebKit::WebProcess, void (WebKit::WebProcess::*)(WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters&&), std::tuple<WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters>, 0ul, 1ul>(WebKit::WebProcess*, void (WebKit::WebProcess::*)(WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters&&), std::tuple<WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters>&&, std::integer_sequence<unsigned long, 0ul, 1ul>) () at ../Source/WebKit/Platform/IPC/HandleMessage.h:41
#20 IPC::callMemberFunction<WebKit::WebProcess, void (WebKit::WebProcess::*)(WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters&&), std::tuple<WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters>, std::integer_sequence<unsigned long, 0ul, 1ul> >(std::tuple<WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters>&&, WebKit::WebProcess*, void (WebKit::WebProcess::*)(WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters&&)) () at ../Source/WebKit/Platform/IPC/HandleMessage.h:47
#21 IPC::handleMessage<Messages::WebProcess::CreateWebPage, WebKit::WebProcess, void (WebKit::WebProcess::*)(WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters&&)>(IPC::Decoder&, WebKit::WebProcess*, void (WebKit::WebProcess::*)(WTF::ObjectIdentifier<WebCore::PageIdentifierType>, WebKit::WebPageCreationParameters&&)) () at ../Source/WebKit/Platform/IPC/HandleMessage.h:120
#22 0x0000705dcf42ab14 in WebKit::WebProcess::didReceiveWebProcessMessage(IPC::Connection&, IPC::Decoder&) () at DerivedSources/WebKit/WebProcessMessageReceiver.cpp:291
#23 0x0000705dcf99ce1c in WebKit::WebProcess::didReceiveMessage(IPC::Connection&, IPC::Decoder&) () at ../Source/WebKit/WebProcess/WebProcess.cpp:750
#24 WebKit::WebProcess::didReceiveMessage(IPC::Connection&, IPC::Decoder&) () at ../Source/WebKit/WebProcess/WebProcess.cpp:744
#25 0x0000705dcf5f1298 in IPC::Connection::dispatchMessage(IPC::Decoder&) () at ../Source/WebKit/Platform/IPC/Connection.cpp:1008
#26 0x0000705dcf5f3014 in IPC::Connection::dispatchMessage(std::unique_ptr<IPC::Decoder, std::default_delete<IPC::Decoder> >) () at ../Source/WebKit/Platform/IPC/Connection.cpp:1077
#27 0x0000705dcf5f39e4 in IPC::Connection::dispatchOneIncomingMessage() () at ../Source/WebKit/Platform/IPC/Connection.cpp:1146
#28 0x0000705dcf5f3f34 in operator() () at ../Source/WebKit/Platform/IPC/Connection.cpp:985
#29 call() () at DerivedSources/ForwardingHeaders/wtf/Function.h:52
#30 0x0000705dccbc26f8 in WTF::Function<void ()>::operator()() const () at ../Source/WTF/wtf/Function.h:84
#31 WTF::RunLoop::performWork() () at ../Source/WTF/wtf/RunLoop.cpp:124
#32 0x0000705dccc3e5e8 in operator() () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:68
#33 _FUN() () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:70
#34 0x0000705dccc3e670 in operator() () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:45
#35 _FUN() () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:46
#36 0x0000705dcd4c5d14 in g_main_context_dispatch () at /lib/powerpc64le-linux-gnu/libglib-2.0.so.0
#37 0x0000705dcd4c6258 in  () at /lib/powerpc64le-linux-gnu/libglib-2.0.so.0
#38 0x0000705dcd4c67bc in g_main_loop_run () at /lib/powerpc64le-linux-gnu/libglib-2.0.so.0
#39 0x0000705dccc3f8a4 in WTF::RunLoop::run() () at ../Source/WTF/wtf/glib/RunLoopGLib.cpp:96
#40 0x0000705dcfc17d24 in WebKit::AuxiliaryProcessMain<WebKit::WebProcess, WebKit::WebProcessMainGtk>(int, char**) () at ../Source/WebKit/Shared/AuxiliaryProcessMain.h:68
#41 WebKit::WebProcessMain(int, char**) () at ../Source/WebKit/WebProcess/gtk/WebProcessMainGtk.cpp:68
#42 0x000003e4f4b607c0 in main() () at ../Source/WebKit/WebProcess/EntryPoint/unix/WebProcessMain.cpp:45


Which translates to crashing in the RELEASE_ASSERT of line 78:


ebkitgtk-2.28.0 $ cat -n Source/JavaScriptCore/runtime/JSCConfig.cpp|tail -30
    53	    
    54	void Config::permanentlyFreeze()
    55	{
    56	#if PLATFORM(COCOA)
    57	    RELEASE_ASSERT(roundUpToMultipleOf(vmPageSize(), ConfigSizeToProtect) == ConfigSizeToProtect);
    58	#endif
    59	
    60	    if (!g_jscConfig.isPermanentlyFrozen)
    61	        g_jscConfig.isPermanentlyFrozen = true;
    62	
    63	    int result = 0;
    64	#if OS(DARWIN)
    65	    enum {
    66	        AllowPermissionChangesAfterThis = false,
    67	        DisallowPermissionChangesAfterThis = true
    68	    };
    69	
    70	    // There's no going back now!
    71	    result = vm_protect(mach_task_self(), reinterpret_cast<vm_address_t>(&g_jscConfig), ConfigSizeToProtect, DisallowPermissionChangesAfterThis, VM_PROT_READ);
    72	#elif OS(LINUX)
    73	    result = mprotect(&g_jscConfig, ConfigSizeToProtect, PROT_READ);
    74	#elif OS(WINDOWS)
    75	    // FIXME: Implement equivalent, maybe with VirtualProtect.
    76	    // Also need to fix WebKitTestRunner.
    77	#endif
    78	    RELEASE_ASSERT(!result); // <--- HERE IT CRASHES
    79	    RELEASE_ASSERT(g_jscConfig.isPermanentlyFrozen);
    80	}
    81	
    82	} // namespace JSC
Comment 16 Carlos Alberto Lopez Perez 2020-03-18 14:01:49 PDT
That code crashing was added on r249808
Comment 17 Mark Lam 2020-03-18 14:10:48 PDT
(In reply to Carlos Alberto Lopez Perez from comment #16)
> That code crashing was added on r249808

This crash means that ppc64el/s390x is lacking this support.  If you don't want the feature, you can add an #elif for that platform and make this a no-op for it.  Alternatively, you can implement the support.
Comment 18 Carlos Alberto Lopez Perez 2020-03-18 14:38:45 PDT
(In reply to Mark Lam from comment #17)
> (In reply to Carlos Alberto Lopez Perez from comment #16)
> > That code crashing was added on r249808
> 
> This crash means that ppc64el/s390x is lacking this support.  

Do you mean it lacks support for mprotect() with PROT_READ ?

>If you don't want the feature, you can add an #elif for that platform and make this a no-op for it.  Alternatively, you can implement the support.

Making it a no-op for this architectures seems fine to me.


@seb128: can you test if this patch fixes the issue?

diff --git a/Source/JavaScriptCore/runtime/JSCConfig.cpp b/Source/JavaScriptCore/runtime/JSCConfig.cpp
index 01e0e63..9c57da8 100644
--- a/Source/JavaScriptCore/runtime/JSCConfig.cpp
+++ b/Source/JavaScriptCore/runtime/JSCConfig.cpp
@@ -70,6 +70,7 @@ void Config::permanentlyFreeze()
     // There's no going back now!
     result = vm_protect(mach_task_self(), reinterpret_cast<vm_address_t>(&g_jscConfig), ConfigSizeToProtect, DisallowPermissionChangesAfterThis, VM_PROT_READ);
 #elif OS(LINUX)
+    return;
     result = mprotect(&g_jscConfig, ConfigSizeToProtect, PROT_READ);
 #elif OS(WINDOWS)
     // FIXME: Implement equivalent, maybe with VirtualProtect.


And if it does can you let me know which one its the value of CMAKE_SYSTEM_PROCESSOR for s390x? you can find this by grepping on the build directory for this string.

And other question: does this only affect ppc64el or also ppc64 and ppc?
Comment 19 Carlos Alberto Lopez Perez 2020-03-18 14:42:27 PDT
(In reply to Carlos Alberto Lopez Perez from comment #18)
> And if it does can you let me know which one its the value of
> CMAKE_SYSTEM_PROCESSOR for s390x? you can find this by grepping on the build
> directory for this string.
> 

Also upload here a txt file with the output of the command "echo | gcc -E -dM -" on s390x
Comment 20 Michael Catanzaro 2020-03-18 16:37:36 PDT
Sebastien, you told me there was no crash. :( A crash makes everything way easier!

(In reply to Carlos Alberto Lopez Perez from comment #18)
> Do you mean it lacks support for mprotect() with PROT_READ ?

errno is set on failure, so WebKit should check errno and print a proper error using strerror(errno) to give us an idea of what's going wrong. But I'm pretty sure it's going to be EINVAL. Look:

EINVAL addr is not a valid pointer, or not a  multiple  of  the  system
              page size.

And I see we have in JSCConfig.h:

#if !OS(WINDOWS)
constexpr size_t ConfigSizeToProtect = 16 * KB;
#else
constexpr size_t ConfigSizeToProtect = 4 * KB;
#endif

which is definitely incorrect. This is reminiscent of bug #182923, fixed in r232909. Note that in that revision, the task was to find a block size that was *at least as big* as the actual page size, so it was not a precise fix: we just guessed 64 KB was big enough for strange architectures without attempting to be precise about it. Hopefully the same approach would work here?

I'm not sure where to find correct pages sizes to assume. Looking at an internal email that doesn't cite any sources, it looks like s390x uses 4 KB pages, ppc64le uses 64 KB pages on RHEL but 4 KB on various other distros (would be too easy otherwise!), aarch64 uses 64 KB pages... though aarch64 is getting 16 KB block sizes in MarkedBlock.h without crashing, because we have CPU(ARM64) so it shouldn't be CPU(UNKNOWN). And afaik all of the above can be changed at kernel compile time, so we just hope by convention that it's the same on all distros, except ppc64le where we know it isn't. So conclusion: ?????????

The best solution is to get page size at runtime using sysconf(_SC_PAGESIZE), but it looks like the code really wants a compile-time solution. So maybe just hardcode 64 KB for these CPUs and for CPU(UNKNOWN)? Ideally we would share the value with MarkedBlock.h? clopez, what do you think?

Anyway, Sebastien, try changing ConfigSizeToProtect to 64 * KB and see if that works. My guess is that will fix ppc64le, but something somewhere else will be broken for s390x.

P.S. You know if Debian and Red Hat can't agree on whether to spell ppc64le or ppc64el, we definitely won't be able to agree on page size. :P It looks like Debian distros are saying ppc64el while Red Hat and SUSE say ppc64le. Whatever....
Comment 21 Michael Catanzaro 2020-03-18 17:08:31 PDT
(In reply to Carlos Alberto Lopez Perez from comment #18)
> And other question: does this only affect ppc64el or also ppc64 and ppc?

(Looks like neither Debian nor Red Hat supports ppc64 or ppc anymore, so it probably doesn't matter much.)
Comment 22 Mark Lam 2020-03-18 17:45:37 PDT
(In reply to Michael Catanzaro from comment #20)
> Sebastien, you told me there was no crash. :( A crash makes everything way
> easier!
> 
> (In reply to Carlos Alberto Lopez Perez from comment #18)
> > Do you mean it lacks support for mprotect() with PROT_READ ?
> 
> errno is set on failure, so WebKit should check errno and print a proper
> error using strerror(errno) to give us an idea of what's going wrong. But
> I'm pretty sure it's going to be EINVAL. Look:
> 
> EINVAL addr is not a valid pointer, or not a  multiple  of  the  system
>               page size.
> 
> And I see we have in JSCConfig.h:
> 
> #if !OS(WINDOWS)
> constexpr size_t ConfigSizeToProtect = 16 * KB;
> #else
> constexpr size_t ConfigSizeToProtect = 4 * KB;
> #endif
> 
> which is definitely incorrect. This is reminiscent of bug #182923, fixed in
> r232909. Note that in that revision, the task was to find a block size that
> was *at least as big* as the actual page size, so it was not a precise fix:
> we just guessed 64 KB was big enough for strange architectures without
> attempting to be precise about it. Hopefully the same approach would work
> here?

I was surprised that this could be the issue because there's an assert in Config::permanentlyFreeze() that ensures that ConfigSizeToProtect is an integral multiple of page size.  However, checking the code, I see:

#if PLATFORM(COCOA)
    RELEASE_ASSERT(roundUpToMultipleOf(vmPageSize(), ConfigSizeToProtect) == ConfigSizeToProtect);
#endif

It's a pity that vmPageSize() is not available on non-cocoa ports, or this issue would be more clearly identified.
Comment 23 Michael Catanzaro 2020-03-18 18:26:47 PDT
(In reply to Mark Lam from comment #22)
> It's a pity that vmPageSize() is not available on non-cocoa ports, or this
> issue would be more clearly identified.

Looks like everything in ResourceUsage.h is guarded by #if PLATFORM(COCOA).

At least vmPageSize() would be very easy to implement. The Cocoa impl is:

size_t vmPageSize()
{
#if PLATFORM(IOS_FAMILY)
    return vm_kernel_page_size;
#else
    static size_t cached = sysconf(_SC_PAGESIZE);
    return cached;
#endif
}

The Linux impl would be 100% identical to the non-IOS_FAMILY path there. Just needs to return sysconf(_SC_PAGESIZE). There's also _SC_PAGE_SIZE, which is a synonym. Both of these are specified by POSIX.1, so they should work on every Unix, not just Linux.
Comment 24 Tomas Popela 2020-03-19 00:23:54 PDT
(In reply to Michael Catanzaro from comment #20)
> I'm not sure where to find correct pages sizes to assume. Looking at an
> internal email that doesn't cite any sources, it looks like s390x uses 4 KB
> pages, ppc64le uses 64 KB pages on RHEL but 4 KB on various other distros
> (would be too easy otherwise!), aarch64 uses 64 KB pages...

Some correction:

s390x, ppc64(le) in RHEL and Fedora use 64 KB page size
aarch64 on RHEL uses 64 KB and on Fedora 4 KB page size
Comment 25 seb128 2020-03-19 01:02:58 PDT
> @seb128: can you test if this patch fixes the issue?

@Carlos, your return patch does fix the issue indeed!


> And if it does can you let me know which one its the value of 
> CMAKE_SYSTEM_PROCESSOR for s390x? you can find this by grepping 
> on the build directory for this string.

CMakeFiles/3.16.3/CMakeSystem.cmake:set(CMAKE_SYSTEM_PROCESSOR "s390x")


> And other question: does this only affect ppc64el or also ppc64 and ppc?

Ubuntu also provides the first variant and Debian doesn't run its tests on those archs so I can't easily say


> Sebastien, you told me there was no crash. :( A crash makes everything way easier!

@Michael, sorry about that, the small script doesn't print anything on stdout/stderr and neither does MiniBrowser. Since the UI was still open and no eror was printer I assumed there was no crash, I didn't know there was a background rendering process that could crash like this (I should have checked the apport directory but since it's in a remote server I didn't get notified about that either)
Comment 26 seb128 2020-03-19 01:07:43 PDT
I just tested on s390x and the rendering in fact works there, we have some autopkgtest failures but they seem different from the issue described on this bug. 
I've updated the title now to mention only ppc64el now
Sorry if that created confusion or extra work
Comment 27 seb128 2020-03-19 02:18:18 PDT
> Anyway, Sebastien, try changing ConfigSizeToProtect to 64 * KB and see if
> that works. My guess is that will fix ppc64le, but something somewhere
> else will be broken for s390x.

The change seems to resolve the issue on ppc indeed, and as stated in the previous comment we can ignore s390x now
Comment 28 Michael Catanzaro 2020-03-19 09:06:08 PDT
(In reply to Tomas Popela from comment #24)
> Some correction:
> 
> s390x, ppc64(le) in RHEL and Fedora use 64 KB page size
> aarch64 on RHEL uses 64 KB and on Fedora 4 KB page size

That makes a *lot* more sense than the mail I was looking at.

In that case, aarch64 must be broken on RHEL for a while now, because MarkedBlock.h uses 16 KB block size on that architecture.
Comment 29 Michael Catanzaro 2020-03-19 09:25:49 PDT
(In reply to Michael Catanzaro from comment #28)
> In that case, aarch64 must be broken on RHEL for a while now, because
> MarkedBlock.h uses 16 KB block size on that architecture.

Tomas pointed me to a downstream patch:

diff --git a/Source/JavaScriptCore/heap/MarkedBlock.h b/Source/JavaScriptCore/heap/MarkedBlock.h
index e240f0ae..6bf88692 100644
--- a/Source/JavaScriptCore/heap/MarkedBlock.h
+++ b/Source/JavaScriptCore/heap/MarkedBlock.h
@@ -68,7 +68,7 @@ public:
     static constexpr size_t atomSize = 16; // bytes
 
     // Block size must be at least as large as the system page size.
-#if CPU(PPC64) || CPU(PPC64LE) || CPU(PPC) || CPU(UNKNOWN)
+#if CPU(PPC64) || CPU(PPC64LE) || CPU(PPC) || CPU(ARM64) || CPU(UNKNOWN)
     static constexpr size_t blockSize = 64 * KB;
 #else
     static constexpr size_t blockSize = 16 * KB;
Comment 30 Michael Catanzaro 2020-03-19 10:07:33 PDT
Hey Mark, how was ConfigSizeToProtect chosen?

#if !OS(WINDOWS)
constexpr size_t ConfigSizeToProtect = 16 * KB;
#else
constexpr size_t ConfigSizeToProtect = 4 * KB;
#endif

Is it supposed to match page size (in which case most Linux architectures should be using 4 KB rather than 16 KB)? Or is it desired to be exactly 16 KB everywhere regardless of page size unless page size is bigger than 16 KB? Why is Windows using 4 KB while everything else uses 16 KB?

Same question applies to the blockSize in MarkedBlock.h. In that case, the code is a bit more clear, and I guess the desired block size is min(16 KB, page size)?
Comment 31 Michael Catanzaro 2020-03-19 10:09:42 PDT
Seb, here's your workaround patch for Ubuntu, which should be safe until we figure out a proper upstream fix. It maintains the current behavior except on architectures that are already likely broken:

diff --git a/Source/JavaScriptCore/runtime/JSCConfig.h b/Source/JavaScriptCore/runtime/JSCConfig.h
index 1ae53a56431a..ec67610057b8 100644
--- a/Source/JavaScriptCore/runtime/JSCConfig.h
+++ b/Source/JavaScriptCore/runtime/JSCConfig.h
@@ -34,10 +34,12 @@ class ExecutableAllocator;
 class FixedVMPoolExecutableAllocator;
 class VM;
 
-#if !OS(WINDOWS)
-constexpr size_t ConfigSizeToProtect = 16 * KB;
-#else
+#if OS(WINDOWS)
 constexpr size_t ConfigSizeToProtect = 4 * KB;
+#elif CPU(PPC64) || CPU(PPC64LE) || CPU(PPC) || CPU(UNKNOWN)
+constexpr size_t ConfigSizeToProtect = 64 * KB;
+#else
+constexpr size_t ConfigSizeToProtect = 16 * KB;
 #endif
 
 #if ENABLE(SEPARATED_WX_HEAP)
Comment 32 Mark Lam 2020-03-19 10:14:58 PDT
(In reply to Michael Catanzaro from comment #30)
> Hey Mark, how was ConfigSizeToProtect chosen?
> 
> #if !OS(WINDOWS)
> constexpr size_t ConfigSizeToProtect = 16 * KB;
> #else
> constexpr size_t ConfigSizeToProtect = 4 * KB;
> #endif
> 
> Is it supposed to match page size (in which case most Linux architectures
> should be using 4 KB rather than 16 KB)? Or is it desired to be exactly 16
> KB everywhere regardless of page size unless page size is bigger than 16 KB?
> Why is Windows using 4 KB while everything else uses 16 KB?
> 
> Same question applies to the blockSize in MarkedBlock.h. In that case, the
> code is a bit more clear, and I guess the desired block size is min(16 KB,
> page size)?

Look at JSC::Config::ensureSize.  It is padded so that sizeof(JSC::Config) is a multiple of OS page size.  ConfigSizeToProtect is shone to be that rounded up multiple of OS page size.
Comment 33 Mark Lam 2020-03-19 10:15:56 PDT
(In reply to Mark Lam from comment #32)
> Look at JSC::Config::ensureSize.  It is padded so that sizeof(JSC::Config)
> is a multiple of OS page size.  ConfigSizeToProtect is shone to be that
> rounded up multiple of OS page size.

/shone/set/
Comment 34 Carlos Alberto Lopez Perez 2020-03-19 11:15:31 PDT
(In reply to Michael Catanzaro from comment #28)
> The best solution is to get page size at runtime using
> sysconf(_SC_PAGESIZE), but it looks like the code really wants a
> compile-time solution. So maybe just hardcode 64 KB for these CPUs and for
> CPU(UNKNOWN)? Ideally we would share the value with MarkedBlock.h? clopez,
> what do you think?
> 


(In reply to Tomas Popela from comment #24)
> 
> s390x, ppc64(le) in RHEL and Fedora use 64 KB page size
> aarch64 on RHEL uses 64 KB and on Fedora 4 KB page size


I wonder about doing something like the patch below to avoid future problems in cases where we can't predict the page size.


diff --git a/Source/JavaScriptCore/runtime/JSCConfig.cpp b/Source/JavaScriptCore/runtime/JSCConfig.cpp
index 79cc2b67ba9..b85393e1def 100644
--- a/Source/JavaScriptCore/runtime/JSCConfig.cpp
+++ b/Source/JavaScriptCore/runtime/JSCConfig.cpp
@@ -33,6 +33,7 @@
 #include <mach/mach.h>
 #elif OS(LINUX)
 #include <sys/mman.h>
+#include <unistd.h>
 #endif
 
 namespace JSC {
@@ -70,7 +71,10 @@ void Config::permanentlyFreeze()
     // There's no going back now!
     result = vm_protect(mach_task_self(), reinterpret_cast<vm_address_t>(&g_jscConfig), ConfigSizeToProtect, DisallowPermissionChangesAfterThis, VM_PROT_READ);
 #elif OS(LINUX)
-    result = mprotect(&g_jscConfig, ConfigSizeToProtect, PROT_READ);
+    // Some architectures on Linux may have non-default page size.
+    // In that cases, avoid the crash in the RELEASE_ASSERT below due to using mprotect with a wrong page size.
+    if (sysconf(_SC_PAGESIZE) == ConfigSizeToProtect)
+        result = mprotect(&g_jscConfig, ConfigSizeToProtect, PROT_READ);
 #elif OS(WINDOWS)
     // FIXME: Implement equivalent, maybe with VirtualProtect.
     // Also need to fix WebKitTestRunner



Does this looks like a good idea?
Comment 35 Mark Lam 2020-03-19 11:21:47 PDT
(In reply to Carlos Alberto Lopez Perez from comment #34)
> (In reply to Michael Catanzaro from comment #28)
> > The best solution is to get page size at runtime using
> > sysconf(_SC_PAGESIZE), but it looks like the code really wants a
> > compile-time solution. So maybe just hardcode 64 KB for these CPUs and for
> > CPU(UNKNOWN)? Ideally we would share the value with MarkedBlock.h? clopez,
> > what do you think?
> > 

Hardcoding to 64K is a good approach in addition to the check below.

> (In reply to Tomas Popela from comment #24)
> > 
> > s390x, ppc64(le) in RHEL and Fedora use 64 KB page size
> > aarch64 on RHEL uses 64 KB and on Fedora 4 KB page size
> 
> 
> I wonder about doing something like the patch below to avoid future problems
> in cases where we can't predict the page size.
> 
> 
> diff --git a/Source/JavaScriptCore/runtime/JSCConfig.cpp
> b/Source/JavaScriptCore/runtime/JSCConfig.cpp
> index 79cc2b67ba9..b85393e1def 100644
> --- a/Source/JavaScriptCore/runtime/JSCConfig.cpp
> +++ b/Source/JavaScriptCore/runtime/JSCConfig.cpp
> @@ -33,6 +33,7 @@
>  #include <mach/mach.h>
>  #elif OS(LINUX)
>  #include <sys/mman.h>
> +#include <unistd.h>
>  #endif
>  
>  namespace JSC {
> @@ -70,7 +71,10 @@ void Config::permanentlyFreeze()
>      // There's no going back now!
>      result = vm_protect(mach_task_self(),
> reinterpret_cast<vm_address_t>(&g_jscConfig), ConfigSizeToProtect,
> DisallowPermissionChangesAfterThis, VM_PROT_READ);
>  #elif OS(LINUX)
> -    result = mprotect(&g_jscConfig, ConfigSizeToProtect, PROT_READ);
> +    // Some architectures on Linux may have non-default page size.
> +    // In that cases, avoid the crash in the RELEASE_ASSERT below due to
> using mprotect with a wrong page size.
> +    if (sysconf(_SC_PAGESIZE) == ConfigSizeToProtect)
> +        result = mprotect(&g_jscConfig, ConfigSizeToProtect, PROT_READ);

This check is not correct.  ConfigSizeToProtect does not need to be strictly equal to sysconf(_SC_PAGESIZE).  It needs to be a multiple of sysconf(_SC_PAGESIZE), where the multiple can be 1 or higher.  You can use WTF::roundUpToMultipleOf() to compute that.


> 
> 
> 
> Does this looks like a good idea?
Comment 36 Mark Lam 2020-03-19 11:26:45 PDT
(In reply to Mark Lam from comment #35)
> (In reply to Carlos Alberto Lopez Perez from comment #34)
> > (In reply to Michael Catanzaro from comment #28)
> > > The best solution is to get page size at runtime using
> > > sysconf(_SC_PAGESIZE), but it looks like the code really wants a
> > > compile-time solution. So maybe just hardcode 64 KB for these CPUs and for
> > > CPU(UNKNOWN)? Ideally we would share the value with MarkedBlock.h? clopez,
> > > what do you think?
> > > 
> 
> Hardcoding to 64K is a good approach in addition to the check below.

I should qualify this statement: hardcoding to 64K is a better workaround than disabling this feature outright.

The feature is a security mitigation.  If preventing the crash is a higher priority, the proposed check is good at the price of disabling this mitigation.
Comment 37 Carlos Alberto Lopez Perez 2020-03-19 11:34:43 PDT Comment hidden (obsolete)
Comment 38 Michael Catanzaro 2020-03-19 12:06:56 PDT
I think RELEASE_ASSERT() is the right thing to do if we somehow get the page size wrong. But let's RELEASE_ASSERT() on the sysconf(_SC_PAGESIZE) check instead of later when mprotect() fails. I'll cook a patch.
Comment 39 Michael Catanzaro 2020-03-19 18:26:36 PDT
Here's an initial patch.

Mark, any idea what's up with WTF::pageSize() vs. WTF::vmPageSize()? They do the same thing on all platforms except iOS family, where pageSize() uses the userspace page size (16 KiB, vm_page_size) whereas vmPageSize() uses the kernel page size (4 KiB,  vm_kernel_page_size). That's a strange difference. At first I had implemented vmPageSize() for OS(UNIX), but eventually I realized I could just get rid of that because it's really only needed in ResourceUsageCocoa.cpp, and changed JSCConfig to use pageSize() rather than vmPageSize(). Should be fine for iOS family because roundUpToMultipleOf(4 KiB, 16 KiB) == roundUpToMultipleOf(4 KiB, 16 KiB) == 16 KiB. I wound up not touching vmPageSize() at all, but it seems really suspicious to me. There's also bmalloc::vmPageSize(), which is the same as WTF::pageSize(), i.e. on iOS bmalloc::vmPageSize() returns a different value than WTF::vmPageSize() because bmalloc uses the userspace value while WTF uses the kernel value.)

Behavior change: I reduced the size of JSCConfig from 16 KiB down to 4 KiB on macOS and most Linux architectures. I assume that makes sense because there's no reason for it to be any bigger than a single page, yes? I can be the hero who saved 12 KiB per web process?

Then I made no behavior change in MarkedBlock.h. I used std::max(16 * KiB, CeilingOnPageSize) there, which I assume that makes sense because any changes there would affect object fragmentation, which could have performance impacts. Didn't want to mess with that.

Oh, and I changed pageSize() to use sysconf(_SC_PAGESIZE) instead of getpagesize() just because that's deprecated. And added RELEASE_ASSERTs in pageSize().

Let's see if EWS likes it....
Comment 40 Michael Catanzaro 2020-03-19 18:28:39 PDT
Created attachment 394053 [details]
Patch
Comment 41 Michael Catanzaro 2020-03-19 18:29:46 PDT
(In reply to Michael Catanzaro from comment #39)
> Here's an initial patch.
> 
> Mark, any idea what's up with WTF::pageSize() vs. WTF::vmPageSize()? They do
> the same thing on all platforms except iOS family,

(Um, well it's only implemented on Cocoa, so they do the same thing on Cocoa platforms, except iOS family.)
Comment 42 Michael Catanzaro 2020-03-19 18:31:03 PDT
(In reply to Michael Catanzaro from comment #39)
> Should be fine for iOS family because roundUpToMultipleOf(4
> KiB, 16 KiB) == roundUpToMultipleOf(4 KiB, 16 KiB) == 16 KiB.

I meant to write: roundUpToMultipleOf(4 KiB, 16 KiB) == roundUpToMultipleOf(16 KiB, 16 KiB) == 16 KiB.
Comment 43 Tomas Popela 2020-03-19 23:53:44 PDT
Comment on attachment 394053 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394053&action=review

> Source/WTF/wtf/PageBlock.h:47
> +#if CPU(UNKNOWN) || CPU(PPC) || (CPU(ARM64) && !PLATFORM(IOS_FAMILY))

I think that PPC here means the old 32-bit PowerPC architecture no? Also CPU(ARM64) here will be wrong for Fedora and others, but I fail to see, where the 4 KB will be chosen over these 64 KB.
Comment 44 Mark Lam 2020-03-20 00:13:04 PDT
Comment on attachment 394053 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394053&action=review

> Source/JavaScriptCore/runtime/JSCConfig.h:38
> +constexpr size_t ConfigSizeToProtect = CeilingOnPageSize;

We want Mac to match iOS.  Please only change this for non-Darwin/Cocoa platforms.
Comment 45 Michael Catanzaro 2020-03-20 06:07:24 PDT
(In reply to Tomas Popela from comment #43)
> I think that PPC here means the old 32-bit PowerPC architecture no? 

No, it means any variety of PPC. The old condition was redundant.

> Also
> CPU(ARM64) here will be wrong for Fedora and others, but I fail to see,
> where the 4 KB will be chosen over these 64 KB.

That's OK because it's just a ceiling. We only need to make sure the value is large enough for everyone.

(In reply to Mark Lam from comment #44)
> We want Mac to match iOS.  Please only change this for non-Darwin/Cocoa
> platforms.

Sure, but did you read comment #39? The iOS value seems to be wrong. I don't know much about iOS, but surely using the kernel page size for userspace pages isn't right? Seems like it can probably be reduced to 4 KB on iOS as well? (I bet EWS will prove me wrong if I try it... but would be interesting to see.)
Comment 46 Saam Barati 2020-03-20 08:12:26 PDT
Comment on attachment 394053 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394053&action=review

> Source/WTF/wtf/PageBlock.h:50
> +constexpr size_t CeilingOnPageSize = 16 * KB;

This number should be 16k for all Darwin.
Comment 47 Saam Barati 2020-03-20 08:17:21 PDT
You probably also need the same thing inside bmalloc unless you’re not using it.
Comment 48 Michael Catanzaro 2020-03-20 08:39:48 PDT
I'll upload a new patch that always uses 16 KB for all Cocoa builds, so you can have the same value everywhere.

(In reply to Saam Barati from comment #47)
> You probably also need the same thing inside bmalloc unless you’re not using
> it.

bmalloc determines page size at runtime, so it always does the right thing.

(In reply to Michael Catanzaro from comment #45)
> Sure, but did you read comment #39? The iOS value seems to be wrong. I don't
> know much about iOS, but surely using the kernel page size for userspace
> pages isn't right? Seems like it can probably be reduced to 4 KB on iOS as
> well? (I bet EWS will prove me wrong if I try it... but would be interesting
> to see.)

Sorry, I got this completely backwards. iOS kernel page size is 4 KB, userspace page size is 16 KB. So I think ResourceUsageCocoa.cpp is returning bad results on iOS, by a factor of four. Someone from Apple could confirm and report a separate bug for this?
Comment 49 Michael Catanzaro 2020-03-20 08:45:04 PDT
(In reply to Michael Catanzaro from comment #48)
> bmalloc determines page size at runtime, so it always does the right thing.

Actually, we disable bmalloc on architectures that don't use 4 KB pages because it doesn't work. I looked for a place within bmalloc where it assumes page size at compile time and found bmalloc/Sizes.h. I guess if we update this, bmalloc might work on more architectures?

We've had bmalloc disabled on these architectures for years. I never realized it might be so simple as changing page size. Anyway, we can address this in a separate bug.
Comment 50 Michael Catanzaro 2020-03-20 08:48:20 PDT
Created attachment 394085 [details]
Patch
Comment 51 Michael Catanzaro 2020-03-20 08:50:32 PDT
Sebastien, please check that this new version of the patch also works for Ubuntu. Thanks!
Comment 52 Michael Catanzaro 2020-03-20 08:51:52 PDT
(In reply to Michael Catanzaro from comment #45)
> (In reply to Tomas Popela from comment #43)
> > I think that PPC here means the old 32-bit PowerPC architecture no? 
> 
> No, it means any variety of PPC. The old condition was redundant.

Ugh sorry, you're right, it's broken. New version incoming.
Comment 53 Michael Catanzaro 2020-03-20 08:58:25 PDT
(In reply to Michael Catanzaro from comment #52)
> Ugh sorry, you're right, it's broken. New version incoming.

CPU(PPC) means 32-bit PPC only.

CPU(PPC64) means ppc64 OR ppc64le, so having both in the condition is redundant.
Comment 54 Michael Catanzaro 2020-03-20 09:00:35 PDT
Created attachment 394087 [details]
Patch
Comment 55 Tomas Popela 2020-03-20 12:02:25 PDT
(In reply to Michael Catanzaro from comment #53)
> (In reply to Michael Catanzaro from comment #52)
> > Ugh sorry, you're right, it's broken. New version incoming.
> 
> CPU(PPC64) means ppc64 OR ppc64le, so having both in the condition is
> redundant.

Is it?

/* CPU(PPC64) - PowerPC 64-bit Big Endian */
#if (  defined(__ppc64__)      \
    || defined(__PPC64__))     \
    && defined(__BYTE_ORDER__) \
    && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define WTF_CPU_PPC64 1
#define WTF_CPU_KNOWN 1
#endif

/* CPU(PPC64LE) - PowerPC 64-bit Little Endian */
#if (   defined(__ppc64__)     \
    || defined(__PPC64__)      \
    || defined(__ppc64le__)    \
    || defined(__PPC64LE__))   \
    && defined(__BYTE_ORDER__) \
    && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#define WTF_CPU_PPC64LE 1
#define WTF_CPU_KNOWN 1
#endif


from wtf/PlatformCPU.h
Comment 56 Michael Catanzaro 2020-03-20 12:34:38 PDT
(In reply to Tomas Popela from comment #55)
>     && defined(__BYTE_ORDER__) \
>     && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
> #define WTF_CPU_PPC64LE 1
> #define WTF_CPU_KNOWN 1
> #endif

Apparently I am awful at reading code.
Comment 57 Michael Catanzaro 2020-03-20 12:35:58 PDT
Created attachment 394110 [details]
Patch
Comment 58 seb128 2020-03-20 15:44:55 PDT
@Michael,

I will give a try to the patch on monday, the JSCConfig.h code in webkitgtk 3.28 is different so the patch doesn't apply, I tried to adapt it but probably screwed something since it failed to build but it's a bit late to rework that today now
Comment 59 Carlos Alberto Lopez Perez 2020-03-23 06:50:01 PDT
*** Bug 209171 has been marked as a duplicate of this bug. ***
Comment 60 Daniel Kahn Gillmor 2020-03-23 08:08:18 PDT
I believe the same crash is happening on mipsel as well, so if these fixes can be tested on that platform, that would be great.

Here's a build of geary on the debian mipsel buildd that shows this failure during the test suite:

 https://buildd.debian.org/status/fetch.php?pkg=geary&arch=mipsel&ver=3.36.0-2&stamp=1584411277&raw=0
Comment 61 Carlos Alberto Lopez Perez 2020-03-23 09:04:04 PDT
(In reply to Daniel Kahn Gillmor from comment #60)
> I believe the same crash is happening on mipsel as well, so if these fixes
> can be tested on that platform, that would be great.
> 
> Here's a build of geary on the debian mipsel buildd that shows this failure
> during the test suite:
> 
>  https://buildd.debian.org/status/fetch.php?pkg=geary&arch=mipsel&ver=3.36.0-
> 2&stamp=1584411277&raw=0

We do extensive testing of JSC on mipsel (32-bits) and we have not noticed a problem with this (AFAIK)

The MIPS board we use for testing (ci20 with buidroot) have a pagesize of 4096 (same than Linux/x86_64)

Can you confirm which page size does Debian uses for the mipsel port? You can get that by running the command "getconf PAGESIZE" on that mipsel machine.
Comment 62 Michael Catanzaro 2020-03-23 09:49:08 PDT
Comment on attachment 394110 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394110&action=review

> Source/WTF/wtf/PageBlock.h:50
> +// Use 64 KiB for any unknown CPUs to be conservative. This covers s390x, which doesn't currently
> +// have its own CPU() macro and which also uses 64 KiB pages.

This is wrong. s390x uses 4 KiB.
Comment 63 Michael Catanzaro 2020-03-23 09:51:31 PDT
Created attachment 394268 [details]
Patch
Comment 64 Michael Catanzaro 2020-03-23 09:52:07 PDT
Created attachment 394269 [details]
Patch
Comment 65 Tomas Popela 2020-03-23 09:58:20 PDT
(In reply to Michael Catanzaro from comment #62)
> Comment on attachment 394110 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=394110&action=review
> 
> > Source/WTF/wtf/PageBlock.h:50
> > +// Use 64 KiB for any unknown CPUs to be conservative. This covers s390x, which doesn't currently
> > +// have its own CPU() macro and which also uses 64 KiB pages.
> 
> This is wrong. s390x uses 4 KiB.

Sorry about that.. I was wrong/mistaken for the last 7 years..
Comment 66 Mark Lam 2020-03-23 10:00:38 PDT
Comment on attachment 394269 [details]
Patch

r=me if EWS bots are happy.
Comment 67 Michael Catanzaro 2020-03-23 10:19:30 PDT
Thanks Mark!

We've confirmed this fixes ppc64le on Red Hat's CI, so I can be pretty confident it will fix Ubuntu's problem too. I trust Sebastien will let us know when he's tested it.

(In reply to Daniel Kahn Gillmor from comment #60)
> I believe the same crash is happening on mipsel as well, so if these fixes
> can be tested on that platform, that would be great.
> 
> Here's a build of geary on the debian mipsel buildd that shows this failure
> during the test suite:
> 
>  https://buildd.debian.org/status/fetch.php?pkg=geary&arch=mipsel&ver=3.36.0-
> 2&stamp=1584411277&raw=0

How do you know it's this particular crash?

This patch is definitely not going to affect anything on MIPS, because it assumes 4 kiB page size on MIPS. We can change that if you have access to an affected MIPS system and report the 'getconf PAGESIZE' you see on that machine. (If it returns 4096, then your bug is something else.) I don't have access to any MIPS systems, so I can't test this for you.
Comment 68 EWS 2020-03-23 10:34:48 PDT
Committed r258857: <https://trac.webkit.org/changeset/258857>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 394269 [details].
Comment 69 Radar WebKit Bug Importer 2020-03-23 10:35:19 PDT
<rdar://problem/60780778>
Comment 70 Saam Barati 2020-03-24 11:26:17 PDT
(In reply to Michael Catanzaro from comment #49)
> (In reply to Michael Catanzaro from comment #48)
> > bmalloc determines page size at runtime, so it always does the right thing.
> 
> Actually, we disable bmalloc on architectures that don't use 4 KB pages
> because it doesn't work. I looked for a place within bmalloc where it
> assumes page size at compile time and found bmalloc/Sizes.h. I guess if we
> update this, bmalloc might work on more architectures?
> 
> We've had bmalloc disabled on these architectures for years. I never
> realized it might be so simple as changing page size. Anyway, we can address
> this in a separate bug.

I meant the "config size to protect" thing inside Gigacage code.
Comment 71 Saam Barati 2020-03-24 11:27:29 PDT
(In reply to Saam Barati from comment #70)
> (In reply to Michael Catanzaro from comment #49)
> > (In reply to Michael Catanzaro from comment #48)
> > > bmalloc determines page size at runtime, so it always does the right thing.
> > 
> > Actually, we disable bmalloc on architectures that don't use 4 KB pages
> > because it doesn't work. I looked for a place within bmalloc where it
> > assumes page size at compile time and found bmalloc/Sizes.h. I guess if we
> > update this, bmalloc might work on more architectures?
> > 
> > We've had bmalloc disabled on these architectures for years. I never
> > realized it might be so simple as changing page size. Anyway, we can address
> > this in a separate bug.
> 
> I meant the "config size to protect" thing inside Gigacage code.

I believe the "smallPageSize" stuff inside bmalloc's Sizes.h isn't required to be what the OS says it is. But I need to read a bit closer on how it's used.
Comment 72 Carlos Alberto Lopez Perez 2020-03-27 11:29:15 PDT
(In reply to EWS from comment #68)
> Committed r258857: <https://trac.webkit.org/changeset/258857>
> 
> All reviewed patches have been landed. Closing bug and clearing flags on
> attachment 394269 [details].

This has caused a regression on our ARM64 JSCOnly testers (which use 4KB of page size).
The tester its crashing all the time.
https://build.webkit.org/builders/JSCOnly%20Linux%20AArch64%20Release

Its also making the JSCOnly ARM64 EWS queue worthless (since it fails all the time) https://build.webkit.org/buildslaves/jsconly-linux-igalia-bot-2 


So I'm reverting this now, and I'll see if we can add here more info about why its crashing in order to rework the patch.
Comment 73 Carlos Alberto Lopez Perez 2020-03-27 11:32:39 PDT
(In reply to Carlos Alberto Lopez Perez from comment #72)
> Its also making the JSCOnly ARM64 EWS queue worthless (since it fails all
> the time) https://build.webkit.org/buildslaves/jsconly-linux-igalia-bot-2 
> 

Agh! forgot about that, we don't have an EWS for JSCOnly ARM64. 
That link was just the same than the one for the buildbot, just different view.
Comment 74 Zan Dobersek 2020-03-27 11:36:27 PDT
Created attachment 394736 [details]
JSCOnly aarch64 strace log

Here's output of strace for the SIGSEGV that occurs immediately after launch.
Comment 75 Daniel Kahn Gillmor 2020-04-08 06:52:38 PDT
should this remain RESOLVED FIXED if the patches have been reverted?  I'm sorry i haven't followed the changes closely enough upstream.  Is there a version where these fixes have landed?
Comment 76 Michael Catanzaro 2020-04-08 07:19:13 PDT
We wound up not reverting the patches.

aarch64 is still broken (bug #209670), but ppc64le is now fine.

Solving bug #209670 is basically impossible. We are being expected to provide a value at compile time that cannot be known until runtime. Very frustrating.

Oh, and I discovered people are actually using huge pages, so WebKit is guaranteed to be broken for them no matter what. *shrug!*
Comment 77 Mark Lam 2020-04-08 07:37:57 PDT
(In reply to Michael Catanzaro from comment #76)
> We wound up not reverting the patches.
> 
> aarch64 is still broken (bug #209670), but ppc64le is now fine.
> 
> Solving bug #209670 is basically impossible. We are being expected to
> provide a value at compile time that cannot be known until runtime. Very
> frustrating.
> 
> Oh, and I discovered people are actually using huge pages, so WebKit is
> guaranteed to be broken for them no matter what. *shrug!*

One way around this is to disable this mitigation for those platforms that cannot be statically configured: basically, on those platforms, at runtime, check if page size is within acceptable limits (and whatever other criteria you need), and if not, disable the mitigation and don’t do the page freeze.

On platforms that can guarantee their page sizes, this option to disable the mitigation should be #if’d out.
Comment 78 Michael Catanzaro 2020-04-08 07:57:55 PDT
Yeah you're right, that's exactly what we should do. We can check page size at runtime only on Linux platforms and disable config freezing, JIT, and bmalloc when pages are large. Will handle in bug #209670.
Comment 79 Mark Lam 2020-04-18 08:25:40 PDT
Comment on attachment 394269 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=394269&action=review

> Source/WTF/ChangeLog:11
> +        will be even easier to detect if we change RELEASE_ASSERT_WITH_MESSAGE() to actually print

We should not change RELEASE_ALERT_WITH_MESSAGE() to unconditionally print its message.  That message is only meant to be used in debug builds.  But if you really want a message in a release build, you can always check the condition first and print your message before doing the RELEASE_ASSERT.  I do not recommend doing this in the general case (for performance reasons), but in your implementation here, it’ll only be executed once.  So checking and printing that error message won’t hurt much.
Comment 80 Michael Catanzaro 2020-04-18 08:31:07 PDT
(In reply to Mark Lam from comment #79)
> We should not change RELEASE_ALERT_WITH_MESSAGE() to unconditionally print
> its message.

You probably want to look at bug #204399 ;)
Comment 81 Michael Catanzaro 2020-04-29 09:04:00 PDT
*** Bug 204017 has been marked as a duplicate of this bug. ***