Bug 217989 - Add new build option USE(64KB_PAGE_BLOCK)
Summary: Add new build option USE(64KB_PAGE_BLOCK)
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: WebKit Nightly Build
Hardware: PC Linux
: P2 Normal
Assignee: Michael Catanzaro
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-10-20 14:41 PDT by Michael Catanzaro
Modified: 2022-08-17 08:59 PDT (History)
17 users (show)

See Also:


Attachments
Patch (6.34 KB, patch)
2020-10-20 14:46 PDT, Michael Catanzaro
no flags Details | Formatted Diff | Diff
Patch (6.32 KB, patch)
2020-10-20 14:54 PDT, Michael Catanzaro
no flags Details | Formatted Diff | Diff
Patch (6.30 KB, patch)
2020-10-20 16:30 PDT, Michael Catanzaro
no flags Details | Formatted Diff | Diff
Patch (6.30 KB, patch)
2020-11-04 12:59 PST, Michael Catanzaro
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Catanzaro 2020-10-20 14:41:41 PDT
Add new build option USE(64KB_PAGE_BLOCK)

Why do we need this option? Because JSC and bmalloc both want to know the userspace page size at compile time, which is impossible on Linux because it's a runtime setting. We cannot test the system page size at build time in hopes that it will be the same on the target system, because (a) cross compiling wouldn't work, and (b) the build system could use a different page size than the target system (which will be true for Fedora aarch64, because Fedora is built using RHEL), so the best we can do is guess based on based on the target CPU architecture. In practice, guessing works for all architectures except aarch64 (unless unusual page sizes are used), but it fails for aarch64 because distros are split between using 4 KB and 64 KB pages there. Most distros (including Fedora) use 4 KB, but RHEL uses 16 KB. SUSE actually supports both. Since there is no way to guess correctly, the best we can do is provide an option for it. You should probably only to use this if building for aarch64. Otherwise, known CPUs except PowerPC will use 4 KB, while PowerPC and unknown CPUs will use 64 KB (see wtf/PageBlock.h). aarch64 will continue to default to 4 KB because this is a better default on systems where it doesn't crash.

Having one flag will help avoid mistakes. E.g. both RHEL and SUSE were manually passing -DENABLE_JIT=OFF and -DUSE_SYSTEM_MALLOC=ON, but we missed -DENABLE_C_LOOP=ON and -DENABLE_SAMPLING_PROFILER=OFF, so wound up running with both JIT and cloop disabled, a configuration not otherwise used on Linux (and not supported by GTK or WPE ports). It will be easier to not mess up if we only have to pass one special build option.
Comment 1 Michael Catanzaro 2020-10-20 14:46:45 PDT
Created attachment 411919 [details]
Patch
Comment 2 Michael Catanzaro 2020-10-20 14:47:11 PDT
(This will also allow us to stop patching PageBlock.h downstream, because I don't like downstream patches that we have to keep forever.)
Comment 3 Michael Catanzaro 2020-10-20 14:54:20 PDT
Created attachment 411920 [details]
Patch
Comment 4 Carlos Alberto Lopez Perez 2020-10-20 16:13:03 PDT
(In reply to Michael Catanzaro from comment #0)
> Having one flag will help avoid mistakes. E.g. both RHEL and SUSE were
> manually passing -DENABLE_JIT=OFF and -DUSE_SYSTEM_MALLOC=ON, but we missed
> -DENABLE_C_LOOP=ON and -DENABLE_SAMPLING_PROFILER=OFF, so wound up running
> with both JIT and cloop disabled, a configuration not otherwise used on
> Linux (and not supported by GTK or WPE ports). It will be easier to not mess
> up if we only have to pass one special build option.

Any idea why the JSC JIT or bmalloc don't work on Aarch64 with a 64kb pagesize?
Comment 5 Carlos Alberto Lopez Perez 2020-10-20 16:15:31 PDT
Comment on attachment 411920 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=411920&action=review

> Source/WTF/wtf/PageBlock.h:53
> +#elif USE(64KB_PAGE_BLOCK) || CPU(PPC) || CPU(PPC64) || CPU(PPC64LE) || CPU(UNKNOWN)
> +constexpr size_t CeilingOnPageSize = 64 * KB;

On Source/bmalloc/bmalloc/mbmalloc.cpp I see the variables ConfigAlignment and ConfigSizeToProtect that likely should match this. There is a FIXME for that in such file.

> Source/cmake/WebKitFeatures.cmake:71
> +# there. Most distros (including Fedora) use 4 KB, but RHEL uses 16 KB. SUSE actually supports both.

I guess you mean here 64KB and not 16KB
Comment 6 Michael Catanzaro 2020-10-20 16:28:09 PDT
(In reply to Carlos Alberto Lopez Perez from comment #5)
> > Source/WTF/wtf/PageBlock.h:53
> > +#elif USE(64KB_PAGE_BLOCK) || CPU(PPC) || CPU(PPC64) || CPU(PPC64LE) || CPU(UNKNOWN)
> > +constexpr size_t CeilingOnPageSize = 64 * KB;
> 
> On Source/bmalloc/bmalloc/mbmalloc.cpp I see the variables ConfigAlignment
> and ConfigSizeToProtect that likely should match this. There is a FIXME for
> that in such file.

Well there are more places that need tweaked too (see bug #209360), and I don't know how to do it. Even if I did know what was needed, I'd have to figure out how to access a machine to test my changes on (that would be an... adventure).

> > Source/cmake/WebKitFeatures.cmake:71
> > +# there. Most distros (including Fedora) use 4 KB, but RHEL uses 16 KB. SUSE actually supports both.
> 
> I guess you mean here 64KB and not 16KB

Yup, I'll fix that.
Comment 7 Michael Catanzaro 2020-10-20 16:29:18 PDT
(In reply to Michael Catanzaro from comment #6)
> Well there are more places that need tweaked too (see bug #209360), and I
> don't know how to do it. Even if I did know what was needed, I'd have to
> figure out how to access a machine to test my changes on (that would be
> an... adventure).

Oh that's just for bmalloc. For JSC JIT, I don't know what all would need to be changed. (Fixing ConfigSizeToProtect was needed for cloop to work. I really don't know about JIT.)
Comment 8 Michael Catanzaro 2020-10-20 16:30:03 PDT
Created attachment 411938 [details]
Patch
Comment 9 Carlos Alberto Lopez Perez 2020-10-20 17:03:33 PDT
(In reply to Michael Catanzaro from comment #6)
> (In reply to Carlos Alberto Lopez Perez from comment #5)
> > > Source/WTF/wtf/PageBlock.h:53
> > > +#elif USE(64KB_PAGE_BLOCK) || CPU(PPC) || CPU(PPC64) || CPU(PPC64LE) || CPU(UNKNOWN)
> > > +constexpr size_t CeilingOnPageSize = 64 * KB;
> > 
> > On Source/bmalloc/bmalloc/mbmalloc.cpp I see the variables ConfigAlignment
> > and ConfigSizeToProtect that likely should match this. There is a FIXME for
> > that in such file.
> 
> Well there are more places that need tweaked too (see bug #209360), and I
> don't know how to do it. Even if I did know what was needed, I'd have to
> figure out how to access a machine to test my changes on (that would be
> an... adventure).
> 

I see.. having access to a test machine with this non-standard configurations is usually the biggest problem

I wonder if you tested to define the pre-processor defines ENABLE_UNIFIED_AND_FREEZABLE_CONFIG_RECORD BENABLE_UNIFIED_AND_FREEZABLE_CONFIG_RECORD? they seem to be guarding this code and seem to be disabled for OS(WINDOWS)

Maybe is a better idea to disable this feature of UNIFIED_AND_FREEZABLE_CONFIG_RECORD than to disable bmalloc or the JSC JIT =?
Comment 10 Michael Catanzaro 2020-10-21 08:20:14 PDT
(In reply to Carlos Alberto Lopez Perez from comment #9)
> Maybe is a better idea to disable this feature of
> UNIFIED_AND_FREEZABLE_CONFIG_RECORD than to disable bmalloc or the JSC JIT =?

Well... if we can make it work without crashing, then yes, of course it's better to not disable bmalloc. But again, there will be more work needed in at least Sizes.h and Gigacage.h that is not guarded by BENABLE_UNIFIED_AND_FREEZABLE_CONFIG_RECORD.

If we do make it work, we would still need USE(64KB_PAGE_BLOCK), it just wouldn't need to disable bmalloc or JIT anymore. And we could probably then enable bmalloc for CPU(UNKNOWN). (But JIT would still need to stay disabled for CPU(UNKNOWN).)
Comment 11 Michael Catanzaro 2020-10-26 09:45:33 PDT
(In reply to Michael Catanzaro from comment #10)
> If we do make it work, we would still need USE(64KB_PAGE_BLOCK), it just
> wouldn't need to disable bmalloc or JIT anymore. And we could probably then
> enable bmalloc for CPU(UNKNOWN). (But JIT would still need to stay disabled
> for CPU(UNKNOWN).)

So... I think we should add USE(64KB_PAGE_BLOCK) regardless. OK? :)
Comment 12 Carlos Alberto Lopez Perez 2020-10-26 12:05:40 PDT
Comment on attachment 411938 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=411938&action=review

> ChangeLog:26
> +        Having one flag will help avoid mistakes. E.g. both RHEL and SUSE were manually passing
> +        -DENABLE_JIT=OFF and -DUSE_SYSTEM_MALLOC=ON, but we missed -DENABLE_C_LOOP=ON and
> +        -DENABLE_SAMPLING_PROFILER=OFF, so wound up running with both JIT and cloop disabled, a
> +        configuration not otherwise used on Linux (and not supported by GTK or WPE ports). It will

I think we should add a check in cmake to ensure that one (and only one) of JIT or C_LOOP is enabled, so that we don't allow building with both options disabled (or with both enabled)
Comment 13 Michael Catanzaro 2020-10-26 13:14:26 PDT
OK, bug #218201.
Comment 14 Michael Catanzaro 2020-10-27 07:34:31 PDT
Comment on attachment 411938 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=411938&action=review

> Source/cmake/WebKitFeatures.cmake:73
> +# Since there is no way to guess correctly, the best we can do is provide an option for it. You
> +# should probably only to use this if building for aarch64. Otherwise, known CPUs except PowerPC

Typo to fix before landing: "You should probably only to use this"

> ChangeLog:18
> +        can do is provide an option for it. You should probably only to use this if building for

Typo to fix before landing: "You should probably only to use this"

>> ChangeLog:26
>> +        configuration not otherwise used on Linux (and not supported by GTK or WPE ports). It will
> 
> I think we should add a check in cmake to ensure that one (and only one) of JIT or C_LOOP is enabled, so that we don't allow building with both options disabled (or with both enabled)

Proposed in bug #218201, but we decided not to do this.
Comment 15 Michael Catanzaro 2020-11-04 08:05:57 PST
Ping reviewers
Comment 16 Yusuke Suzuki 2020-11-04 12:10:48 PST
Comment on attachment 411938 [details]
Patch

r=me
Comment 17 Michael Catanzaro 2020-11-04 12:14:34 PST
Thanks.
Comment 18 Michael Catanzaro 2020-11-04 12:59:24 PST
Created attachment 413198 [details]
Patch
Comment 19 EWS 2020-11-04 15:07:39 PST
Committed r269396: <https://trac.webkit.org/changeset/269396>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 413198 [details].
Comment 20 Carlos Alberto Lopez Perez 2020-11-04 15:48:17 PST
Comment on attachment 413198 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=413198&action=review

> Source/cmake/WebKitFeatures.cmake:86
> -    if (WTF_CPU_ARM64 OR WTF_CPU_X86_64)
> +    if (USE_64KB_PAGE_BLOCK)
> +        set(ENABLE_JIT_DEFAULT OFF)
> +        set(ENABLE_FTL_DEFAULT OFF)
> +        set(USE_SYSTEM_MALLOC_DEFAULT ON)
> +        set(ENABLE_C_LOOP_DEFAULT ON)

I wonder if it would be better to disable C_LOOP for the case of "USE_64KB_PAGE_BLOCK && WTF_CPU_ARM64" ? 
According to https://bugs.webkit.org/show_bug.cgi?id=218201#c6 asm LLInt+no JIT is faster than C_LOOP and it seems well supported on ARM64
Comment 21 Michael Catanzaro 2020-11-04 16:12:35 PST
(In reply to Carlos Alberto Lopez Perez from comment #20)
> I wonder if it would be better to disable C_LOOP for the case of
> "USE_64KB_PAGE_BLOCK && WTF_CPU_ARM64" ? 
> According to https://bugs.webkit.org/show_bug.cgi?id=218201#c6 asm LLInt+no
> JIT is faster than C_LOOP and it seems well supported on ARM64

I'm OK with changing this to whatever Igalia prefers. My inclination would be to lean towards using cloop, since I know you're not testing 64 KB aarch64, and I think you're not testing llint without JIT on any architecture (right?), and cloop ought to work well on every architecture. So I figure the chances of RHEL/SUSE-specific bugs should be lower using cloop than with llint without JIT, right? But if Guij and Xan think it's safe, then OK. Whatever you prefer.

I really do not care one at all about performance. I'd just rather not wind up with crashes or correctness problems that nobody else can reproduce.
Comment 22 Xan Lopez 2020-11-05 01:45:16 PST
(In reply to Michael Catanzaro from comment #21)
> (In reply to Carlos Alberto Lopez Perez from comment #20)
> > I wonder if it would be better to disable C_LOOP for the case of
> > "USE_64KB_PAGE_BLOCK && WTF_CPU_ARM64" ? 
> > According to https://bugs.webkit.org/show_bug.cgi?id=218201#c6 asm LLInt+no
> > JIT is faster than C_LOOP and it seems well supported on ARM64
> 
> I'm OK with changing this to whatever Igalia prefers. My inclination would
> be to lean towards using cloop, since I know you're not testing 64 KB
> aarch64, and I think you're not testing llint without JIT on any
> architecture (right?), and cloop ought to work well on every architecture.
> So I figure the chances of RHEL/SUSE-specific bugs should be lower using
> cloop than with llint without JIT, right? But if Guij and Xan think it's
> safe, then OK. Whatever you prefer.
> 
> I really do not care one at all about performance. I'd just rather not wind
> up with crashes or correctness problems that nobody else can reproduce.

LLInt on aarch64 is really well tested, I think having or not having JIT should not change things much in that regard (sure, it's a moving part and weird things could happen). In the long term it's also probably better if we test that instead of CLOOP. It is true, though, that going for CLOOP might be slightly safer.

That being said the difference in the end is going to be pretty small, for users in that architecture the deal breaker is not having any working tier above LLInt tbh. So if you are planning to figure that out at some point it might be worth just going for asm LLInt to laid the ground for future improvements.
Comment 23 Michael Catanzaro 2020-11-05 06:15:04 PST
OK, bug #218613.