Bug 214142 - [WTF] Fix PackedAlignedPtr for X86_64 canonical addresses
Summary: [WTF] Fix PackedAlignedPtr for X86_64 canonical addresses
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-07-09 08:58 PDT by Jim Mason
Modified: 2020-07-15 09:33 PDT (History)
11 users (show)

See Also:


Attachments
Patch (3.49 KB, patch)
2020-07-09 09:06 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (4.66 KB, patch)
2020-07-09 11:56 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (5.41 KB, patch)
2020-07-10 02:54 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (5.50 KB, patch)
2020-07-10 05:22 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (6.35 KB, patch)
2020-07-13 02:14 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (6.37 KB, patch)
2020-07-13 02:50 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (6.41 KB, patch)
2020-07-13 03:12 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (6.53 KB, patch)
2020-07-13 05:55 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (5.52 KB, patch)
2020-07-14 02:42 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (6.13 KB, patch)
2020-07-14 07:22 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (6.13 KB, patch)
2020-07-14 07:42 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (6.30 KB, patch)
2020-07-15 04:53 PDT, Jim Mason
no flags Details | Formatted Diff | Diff
Patch (6.33 KB, patch)
2020-07-15 08:34 PDT, Jim Mason
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jim Mason 2020-07-09 08:58:43 PDT
PackedAlignedPtr assumes the effective address width is <= 48 bits in 64-bit architectures.  This facilitates NaN boxing with the remaining bits.  (ref: wtf/PlatformOS.h)

This is fine for x86_64, as it has a 48-bit address space.  However, PackedAlignedPtr incorrectly assumes the 'hole' in the address space is at the high-end (i.e., that the most significant 16 bits are always zero).  For x86_64, the high order bits may be zero or one: "The AMD specification requires that the most significant 16 bits of any virtual address, bits 48 through 63, must be copies of bit 47 (in a manner akin to sign extension)." Reference: https://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details

The patch to wtf/Packed.h fixes the getter of PackedAlignedPtr to reconstitute the canonical address by 'sign-extending' bit 47 (which we already store) across the most significant 16-bits:

#if CPU(X86_64)                                                               
    if(value & 0x1ULL << 47)                                              
        value |= 0xffffULL << 48;                                         
#endif

In this way, both the getter and setter of PackedAlignedPtr will work correctly for all X86_64 canonical addresses: The setter continues to discard the upper 16 bits and store only the lower 48 bits, while the getter now reconstitutes the upper 16 bits at retrieval time.


The patch to wasm/js/WebAssemblyFunction.cpp is a related fix for an address pointer test:  Currently, the test uses a signed value (intptr_t) for the comparison.  If the high order bit is set, this number is negative, which makes the test fail.  The fix changes intptr_t to uintptr_t so the test will work as expected for all addresses.
Comment 1 Jim Mason 2020-07-09 09:06:16 PDT
Created attachment 403873 [details]
Patch
Comment 2 Mark Lam 2020-07-09 09:15:09 PDT
Jim, what platform actually allocates user space memory in the range where bit 47 of the address is set?  Can you share the detail?
Comment 3 Jim Mason 2020-07-09 09:25:11 PDT
(In reply to Mark Lam from comment #2)
> Jim, what platform actually allocates user space memory in the range where
> bit 47 of the address is set?  Can you share the detail?

Solaris/illumos.  See https://www.oracle.com/solaris/technologies/memory.html

I first thought this was a Solaris quirk, and for a long time, just self-applied the attached patch to make it work in my environment.  When I learned that X86_64 actually specifies the canonical address space, such that the 'hole' of unusable addresses is in the middle of the virtual space, I reckoned it would be worthwhile to share the fix, so that webkit always does the right thing for this processor architecture.
Comment 4 Mark Lam 2020-07-09 09:28:52 PDT
(In reply to Jim Mason from comment #3)
> (In reply to Mark Lam from comment #2)
> > Jim, what platform actually allocates user space memory in the range where
> > bit 47 of the address is set?  Can you share the detail?
> 
> Solaris/illumos.  See https://www.oracle.com/solaris/technologies/memory.html
> 
> I first thought this was a Solaris quirk, and for a long time, just
> self-applied the attached patch to make it work in my environment.  When I
> learned that X86_64 actually specifies the canonical address space, such
> that the 'hole' of unusable addresses is in the middle of the virtual space,
> I reckoned it would be worthwhile to share the fix, so that webkit always
> does the right thing for this processor architecture.

Thanks for the patch and the justification for it.  Will look into the issue.
Comment 5 Mark Lam 2020-07-09 09:43:29 PDT
(In reply to Jim Mason from comment #3)
> (In reply to Mark Lam from comment #2)
> > Jim, what platform actually allocates user space memory in the range where
> > bit 47 of the address is set?  Can you share the detail?
> 
> Solaris/illumos.  See https://www.oracle.com/solaris/technologies/memory.html
> 
> I first thought this was a Solaris quirk, and for a long time, just
> self-applied the attached patch to make it work in my environment.  When I
> learned that X86_64 actually specifies the canonical address space, such
> that the 'hole' of unusable addresses is in the middle of the virtual space,
> I reckoned it would be worthwhile to share the fix, so that webkit always
> does the right thing for this processor architecture.

According to https://www.oracle.com/solaris/technologies/memory.html, addresses with bit 47 set are all reserved for Kernel space.  In fact, the upper bound in user space memory is 0x00008000.00000000 i.e. you should never see an address with bit 47 set in WebKit.  Am I missing something?
Comment 6 Jim Mason 2020-07-09 09:57:37 PDT
(In reply to Mark Lam from comment #5)
> According to https://www.oracle.com/solaris/technologies/memory.html,
> addresses with bit 47 set are all reserved for Kernel space.  In fact, the
> upper bound in user space memory is 0x00008000.00000000 i.e. you should
> never see an address with bit 47 set in WebKit.  Am I missing something?

I think you may be looking at SPARC.  X86_64 has the user stack and shared objects up there.  It's about half way down:

      64-bit Kernel's Virtual Memory Layout. (assuming 64 bit app)

 0xFFFFFD80.00000000  |-----------------------|- KERNELBASE (lower if > 1TB)
                      |     User stack        |- User space memory
                      |                       |
                      | shared objects, etc   |       (grows downwards)
                      :                       :
                      |                       |
 0xFFFF8000.00000000  |-----------------------|
                      |                       |
                      | VA Hole / unused      |
                      |                       |
 0x00008000.00000000  |-----------------------|
                      |                       |
                      |                       |
                      :                       :
                      |       user heap       |       (grows upwards)
                      |                       |
                      |       user data       |
                      |-----------------------|
                      |       user text       |
 0x00000000.04000000  |-----------------------|
                      |       invalid         |
 0x00000000.00000000  +-----------------------+


I can confirm on Solaris that I see addresses with bit 47 both set and not set in PackedAlignedPtr, and if I don't sign-extend the ones with bit 47 set, it will SIGSEGV.
Comment 7 Mark Lam 2020-07-09 10:01:45 PDT
(In reply to Jim Mason from comment #6)
> (In reply to Mark Lam from comment #5)
> > According to https://www.oracle.com/solaris/technologies/memory.html,
> > addresses with bit 47 set are all reserved for Kernel space.  In fact, the
> > upper bound in user space memory is 0x00008000.00000000 i.e. you should
> > never see an address with bit 47 set in WebKit.  Am I missing something?
> 
> I think you may be looking at SPARC.  X86_64 has the user stack and shared
> objects up there.  It's about half way down:
> 
>       64-bit Kernel's Virtual Memory Layout. (assuming 64 bit app)
> 
>  0xFFFFFD80.00000000  |-----------------------|- KERNELBASE (lower if > 1TB)
>                       |     User stack        |- User space memory
>                       |                       |
>                       | shared objects, etc   |       (grows downwards)
>                       :                       :
>                       |                       |
>  0xFFFF8000.00000000  |-----------------------|
>                       |                       |
>                       | VA Hole / unused      |
>                       |                       |
>  0x00008000.00000000  |-----------------------|
>                       |                       |
>                       |                       |
>                       :                       :
>                       |       user heap       |       (grows upwards)
>                       |                       |
>                       |       user data       |
>                       |-----------------------|
>                       |       user text       |
>  0x00000000.04000000  |-----------------------|
>                       |       invalid         |
>  0x00000000.00000000  +-----------------------+
> 
> 
> I can confirm on Solaris that I see addresses with bit 47 both set and not
> set in PackedAlignedPtr, and if I don't sign-extend the ones with bit 47
> set, it will SIGSEGV.

This was exactly what I was looking at.  Oh, I missed the "User stack and shared objects, etc" section.
Comment 8 Mark Lam 2020-07-09 10:05:21 PDT
Comment on attachment 403873 [details]
Patch

r=me
Comment 9 Mark Lam 2020-07-09 10:08:42 PDT
Comment on attachment 403873 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=403873&action=review

> Source/WTF/wtf/Packed.h:155
> +#if CPU(X86_64)
> +        // The AMD specification requires that the most significant 16
> +        // bits of any virtual address, bits 48 through 63, must be
> +        // copies of bit 47 (in a manner akin to sign extension).
> +        //
> +        // Reference: https://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details
> +        if (value & 0x1ULL << 47)
> +            value |= 0xffffULL << 48;
> +#endif

Please add an ASSERT in set() to verify that if bit 47 is set that the upper 16 bits are also set.
Comment 10 Mark Lam 2020-07-09 10:11:12 PDT
Comment on attachment 403873 [details]
Patch

Taking back the r+.  Looks like you also need to update TEST(WTF_Packed, AssignAndGet) and the api tests' Packed.cpp.
Comment 11 Jim Mason 2020-07-09 11:56:45 PDT
Created attachment 403897 [details]
Patch
Comment 12 Mark Lam 2020-07-09 12:18:57 PDT
Comment on attachment 403897 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=403897&action=review

> Source/WTF/wtf/Packed.h:166
> +        // Shift bit 47 to high order bit, then arithmetic shift back.
> +        // If value is different, address is not canonical.

Just remove these 2 lines.  It's frowned upon in WebKit code to have comments that say exactly the operations the code is doing as opposed to why it's doing it.

> Source/WTF/wtf/Packed.h:167
> +        ASSERT(bitwise_cast<intptr_t>(value) == bitwise_cast<intptr_t>(value) << 16 >> 16);

Please put parens around bitwise_cast<intptr_t>(value) << 16:
    ASSERT(bitwise_cast<intptr_t>(value) == (bitwise_cast<intptr_t>(value) << 16) >> 16);

> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:72
> +#if CPU(X86_64)
> +        // validate X86_64 "higher half" address
> +        key = max = bitwise_cast<uint8_t*>(static_cast<uintptr_t>(-1));
> +        EXPECT_EQ(key.get(), max);
> +#endif

This isn't want I was getting at.  Your earlier patch was failing the API tests (see https://ews-build.webkit.org/#/builders/3/builds/28085) with the following details:

Failed
    TestWTF.WTF_Packed.AssignAndGet
        
        /Volumes/Data/worker/macOS-Mojave-Release-Build-EWS/build/Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:67
        Expected equality of these values:
          key.get()
            Which is: 0xffffffffffffffff
          max
            Which is: 0xffffffffffff

The reason it failed is because the test fabricated an address in "max" of 0xffffffffffff, which is an invalid address according to the spec.  Naturally, the test would fail.

You should fix the test to handle the following cases:
    0xffff ffff ffff ffff
    0xffff ffff ffff if not X86_64
    0x7fff ffff ffff
    0x0001 2345 6789
    0x0000 0000 0000

Does that make sense?

Also, since the only difference here between each test case is the input pointer value, I would put these values in an array, and just run the test in a loop.
Comment 13 Mark Lam 2020-07-09 12:19:30 PDT
Comment on attachment 403897 [details]
Patch

r- because the patch needs more work.
Comment 14 Yusuke Suzuki 2020-07-09 12:37:06 PDT
Comment on attachment 403897 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=403897&action=review

> Source/WTF/wtf/Packed.h:147
> +#if CPU(X86_64)

When doing that, please do it only in Solaris.
Comment 15 Jim Mason 2020-07-09 14:59:36 PDT
(In reply to Yusuke Suzuki from comment #14)
> Comment on attachment 403897 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=403897&action=review
> 
> > Source/WTF/wtf/Packed.h:147
> > +#if CPU(X86_64)
> 
> When doing that, please do it only in Solaris.

I was planning to make the changes requested by Mark tomorrow, but the whole point of this patch is to provide support for X86_64 canonical addressing in webkit.  It is not just a Solaris thing.

If webkit does not want to do the right thing for X86_64, let me know and I will  just close the bug.
Comment 16 Mark Lam 2020-07-09 15:16:51 PDT
(In reply to Jim Mason from comment #15)
> (In reply to Yusuke Suzuki from comment #14)
> > Comment on attachment 403897 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=403897&action=review
> > 
> > > Source/WTF/wtf/Packed.h:147
> > > +#if CPU(X86_64)
> > 
> > When doing that, please do it only in Solaris.
> 
> I was planning to make the changes requested by Mark tomorrow, but the whole
> point of this patch is to provide support for X86_64 canonical addressing in
> webkit.  It is not just a Solaris thing.
> 
> If webkit does not want to do the right thing for X86_64, let me know and I
> will  just close the bug.

Just to clarify.  I think Yusuke's point is not that this is strictly incorrect, but that it is unnecessary for any other OS than Solaris.  In WebKit, we do a lot of work to make sure the code is optimal.  While this extra work does not hurt in terms of correctness, it is guaranteed to be an effective no-op on any OS that is not Solaris because for other OSes, user space never has an address with a set bit 47.  So, it is not incorrect to omit this extra code for any OS that is not Solaris.  Yusuke's concern here is with regards to performance.

That said. I don't see WTF_OS_SOLARIS defined in PlatformOS.h.  So, in order to add the #if OS(SOLARIS) check, one would have to figure out how to detect a build targeting Solaris and add the WTF_OS_SOLARIS #define in PlatformOS.h.
Comment 17 Jim Mason 2020-07-09 15:50:17 PDT
(In reply to Mark Lam from comment #16)
> (In reply to Jim Mason from comment #15)
> > (In reply to Yusuke Suzuki from comment #14)
> > > Comment on attachment 403897 [details]
> > > Patch
> > > 
> > > View in context:
> > > https://bugs.webkit.org/attachment.cgi?id=403897&action=review
> > > 
> > > > Source/WTF/wtf/Packed.h:147
> > > > +#if CPU(X86_64)
> > > 
> > > When doing that, please do it only in Solaris.
> > 
> > I was planning to make the changes requested by Mark tomorrow, but the whole
> > point of this patch is to provide support for X86_64 canonical addressing in
> > webkit.  It is not just a Solaris thing.
> > 
> > If webkit does not want to do the right thing for X86_64, let me know and I
> > will  just close the bug.
> 
> Just to clarify.  I think Yusuke's point is not that this is strictly
> incorrect, but that it is unnecessary for any other OS than Solaris.  In
> WebKit, we do a lot of work to make sure the code is optimal.  While this
> extra work does not hurt in terms of correctness, it is guaranteed to be an
> effective no-op on any OS that is not Solaris because for other OSes, user
> space never has an address with a set bit 47.  So, it is not incorrect to
> omit this extra code for any OS that is not Solaris.  Yusuke's concern here
> is with regards to performance.

With -O2 or -O3 in gcc, the test becomes two instructions, a bit test and a jump.  Probably similar in other compilers.  In the PackedAlignedPtr getter, there is already a test -- if (isAlignmentShiftProfitable) -- that is probably a no op on most every platform.  This test is similarly trivial.

The reason I submit this patch is that an operating system can map memory however it likes.  If webkit does the right thing, it ensures every OS will work correctly, no matter what.  As it stands now, the operating system has to make 'the right choices' based on the current assumptions, or webkit will fail, which means someone has to chase bugs down.

As you pointed out, the current TEST(WTF_Packed, AssignAndGet) is testing an address that is *invalid* for the X86_64 processor architecture.  And in the web assembly code, a signed value is being used for a pointer test.  Cleaning up these sorts of things helps make webkit robust.
Comment 18 Jim Mason 2020-07-10 02:54:26 PDT
Created attachment 403954 [details]
Patch
Comment 19 Jim Mason 2020-07-10 05:22:01 PDT
Created attachment 403960 [details]
Patch
Comment 20 Jim Mason 2020-07-10 05:29:44 PDT
(In reply to Mark Lam from comment #12)
> You should fix the test to handle the following cases:
>     0xffff ffff ffff ffff
>     0xffff ffff ffff if not X86_64
>     0x7fff ffff ffff
>     0x0001 2345 6789
>     0x0000 0000 0000

Done.

As for, 0x123456789, I am running that test case for all 64-bit architectures (not just X86_64), while testing 0x12345678 on 32-bit.  If you want to do something else, let me know.
Comment 21 Mark Lam 2020-07-10 07:57:31 PDT
(In reply to Jim Mason from comment #17)
> (In reply to Mark Lam from comment #16)
> > (In reply to Jim Mason from comment #15)
> > > (In reply to Yusuke Suzuki from comment #14)
> > > > Comment on attachment 403897 [details]
> > > > Patch
> > > > 
> > > > View in context:
> > > > https://bugs.webkit.org/attachment.cgi?id=403897&action=review
> > > > 
> > > > > Source/WTF/wtf/Packed.h:147
> > > > > +#if CPU(X86_64)
> > > > 
> > > > When doing that, please do it only in Solaris.
> > > 
> > > I was planning to make the changes requested by Mark tomorrow, but the whole
> > > point of this patch is to provide support for X86_64 canonical addressing in
> > > webkit.  It is not just a Solaris thing.
> > > 
> > > If webkit does not want to do the right thing for X86_64, let me know and I
> > > will  just close the bug.
> > 
> > Just to clarify.  I think Yusuke's point is not that this is strictly
> > incorrect, but that it is unnecessary for any other OS than Solaris.  In
> > WebKit, we do a lot of work to make sure the code is optimal.  While this
> > extra work does not hurt in terms of correctness, it is guaranteed to be an
> > effective no-op on any OS that is not Solaris because for other OSes, user
> > space never has an address with a set bit 47.  So, it is not incorrect to
> > omit this extra code for any OS that is not Solaris.  Yusuke's concern here
> > is with regards to performance.
> 
> With -O2 or -O3 in gcc, the test becomes two instructions, a bit test and a
> jump.  Probably similar in other compilers.  In the PackedAlignedPtr getter,
> there is already a test -- if (isAlignmentShiftProfitable) -- that is
> probably a no op on most every platform.  This test is similarly trivial.

isAlignmentShiftProfitable is a constexpr.  This check will be constant folded.  The shift will either be emitted unconditionally, or elided away.  Hence, this is not a justification for adding a conditional branch into this code, especially one that we know are not needed for OS(DARWIN), OS(LINUX) and OS(WIN).

Also, this get() accessor is used all over the place.  Yusuke is the engineer who applied PackedAlignedPtr pointers throughout the code, and he did a lot of benchmarking back then to ensure that there is no performance regression when it was applied.  I trust he has good reason to ask to have this #if'd out.  This performance issue is also why I asked what OS you were seeing this on in the first place, and I tried to take the time to think about whether we really need this.  I would take Yusuke's word on this over mine.

> The reason I submit this patch is that an operating system can map memory
> however it likes.  If webkit does the right thing, it ensures every OS will
> work correctly, no matter what.  As it stands now, the operating system has
> to make 'the right choices' based on the current assumptions, or webkit will
> fail, which means someone has to chase bugs down.

Your claim here is not correct.  We're not assuming that the OS will make this choice.  We know for a fact that OS(DARWIN), OS(LINUX)  (see https://www.kernel.org/doc/html/latest/x86/x86_64/mm.html?highlight=virtual%20memory), and OS(WIN) (see https://docs.microsoft.com/en-us/windows/win32/memory/memory-limits-for-windows-releases#memory-and-address-space-limits) do not need this.  It's in their ABI.  And the software is written based on that ABI.  Solaris is the weird duck here.
 
> As you pointed out, the current TEST(WTF_Packed, AssignAndGet) is testing an
> address that is *invalid* for the X86_64 processor architecture.  And in the
> web assembly code, a signed value is being used for a pointer test. 
> Cleaning up these sorts of things helps make webkit robust.

Yes, there is value in this patch.  We're not rejecting it.  We're just trying to refine it.
Comment 22 Mark Lam 2020-07-10 08:46:28 PDT
Comment on attachment 403960 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=403960&action=review

Almost there.

> Source/WTF/wtf/Packed.h:147
> +#if CPU(X86_64)

Please do one of the following:
1. In PlatformOS.h, add a #define for WTF_OS_SOLARIS based on what gcc would define for Solaris.
    Change this line to `#if CPU(X86_64) && OS(SOLARIS)`

Or ...

2. Change this line to `#if CPU(X86_64) && !(OS(DARWIN) || OS(LINUX) || OS(WINDOWS))`

I prefer option (1) but if needed, I can accept option (2).

> Source/WTF/wtf/Packed.h:154
> +        // The AMD specification requires that the most significant 16
> +        // bits of any virtual address, bits 48 through 63, must be
> +        // copies of bit 47 (in a manner akin to sign extension).
> +        //
> +        // Reference: https://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details
> +        if (value & 0x1ULL << 47)
> +            value |= 0xffffULL << 48;

After thinking about this some more, I realized that this quirk only applies to X86_64 implementations that have 48 bit addresses.  The spec actually allows for implementations with > 48 bit addresses.  However, I'm not aware that those implementations actually exist today.  Also, lots of WebKit code relies on addresses not exceeding 48 bits.  

Please add the following just above this if statement:
        static_assert(OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) == 48);

This will prevent this quirk from silently causing bad behavior on x86_64 implementations that have > 48 bits addressing (not that a fix will be easy).  It also documents our expectation.

> Source/WTF/wtf/Packed.h:166
> +#if CPU(X86_64)
> +        // Ensure bits 48-63 track the value of bit 47, per the AMD spec.
> +        ASSERT(bitwise_cast<intptr_t>(value) == (bitwise_cast<intptr_t>(value) << 16) >> 16);
> +#endif

The reason I asked for this before is because it will catch the issue if someone ends up building for another OS with an unusual virtual memory map like Solaris.  Having this assertion here allows us to use Option 1 above so that all ports will get the optimized behavior by default.  And if it doesn't work for any specific port, it won't fail silently anymore.

> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:67
> +        uint8_t* candidates[] =
> +            {
> +                0,

WebKit style requires that this be written as:
        uint8_t* candidates[] = {
            0,
            ...
        };

> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:83
> +#if CPU(ADDRESS64)
> +                bitwise_cast<uint8_t*>(0x123456789ULL),
> +#else
> +                bitwise_cast<uint8_t*>(0x12345678UL),
> +#endif
> +#if CPU(X86_64)
> +                bitwise_cast<uint8_t*>((1ULL << 47) - 1), // max lower half
> +                bitwise_cast<uint8_t*>(~((1ULL << 47) - 1)), // min higher half
> +                bitwise_cast<uint8_t*>(static_cast<uintptr_t>(-1)), // max higher half
> +#else
> +                // Caution: This test should be processor-specific.
> +                //
> +                // In absence of a specific test, assume that all
> +                // bits of the EFFECTIVE_ADDRESS_WIDTH can be used.
> +                bitwise_cast<uint8_t*>(static_cast<uintptr_t>((1ULL << OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH)) - 1)),
> +#endif

I would write these as:

        bitwise_cast<uint8_t*>(static_cast<uintptr_t>((1ULL << (OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) / 2)) - 1)),
        bitwise_cast<uint8_t*>(static_cast<uintptr_t>((1ULL << (OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) - 1)) - 1)),
#if CPU(X86_64) && OS(SOLARIS)
        bitwise_cast<uint8_t*>(~((1ULL << 47) - 1)), // min higher half
        bitwise_cast<uint8_t*>(std::numeric_limits<uintptr_t>::max()),
#else
        bitwise_cast<uint8_t*>(static_cast<uintptr_t>((1ULL << OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH)) - 1)),
#endif
Comment 23 Mark Lam 2020-07-10 08:48:05 PDT
Comment on attachment 403960 [details]
Patch

r- because the patch needs a bit more work.
Comment 24 Jim Mason 2020-07-10 10:38:32 PDT
You have given this a lot of thought, Mark.  Thank you!  This is becoming quite good.

> 1. In PlatformOS.h, add a #define for WTF_OS_SOLARIS based on what gcc would
> define for Solaris.
>     Change this line to `#if CPU(X86_64) && OS(SOLARIS)`
> 
> Or ...
> 
> 2. Change this line to `#if CPU(X86_64) && !(OS(DARWIN) || OS(LINUX) ||
> OS(WINDOWS))`
> 
> I prefer option (1) but if needed, I can accept option (2).

How about a touch of option (1) and option (2)?  Instead of bringing OS-specifics into the test, we could define a new symbol in one of Platform* files, something like USE_CANONICAL_ADDRESSING.  There, it could be setup as an 'opt-in' option, like (1) above:

    #if CPU(X86_64) && (OS(SOLARIS) || ...)
    #define USE_CANONICAL_ADDRESSING
    #endif

or opt-out, like (2) above:

    #if CPU(X86_64) && !(OS(DARWIN) || ...)
    #define USE_CANONICAL_ADDRESSING
    #endif

and the code in Packed.h would read something like:
    #if USE(CANONICAL_ADDRESSING)
        if (value & 0x1ULL << 47)
            value |= 0xffffULL << 48;
    #endif

It avoids having to go touch the source file if we want to change from opt-in to opt-out, or add or remove an OS from the list.

> After thinking about this some more, I realized that this quirk only applies
> to X86_64 implementations that have 48 bit addresses.  The spec actually
> allows for implementations with > 48 bit addresses.  However, I'm not aware
> that those implementations actually exist today.  Also, lots of WebKit code
> relies on addresses not exceeding 48 bits.  
> 
> Please add the following just above this if statement:
>         static_assert(OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) == 48);
> 
> This will prevent this quirk from silently causing bad behavior on x86_64
> implementations that have > 48 bits addressing (not that a fix will be
> easy).  It also documents our expectation.


What if we use EFFECTIVE_ADDRESS_WIDTH to determine the bit and mask?  e.g,

    #if USE(CANONICAL_ADDRESSING)
        if (value & 0x1ULL << (OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) - 1))
            value |= ~((1ULL << OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH)) - 1);
    #endif

The ASSERT in the setter could be updated similarly.

In this way, if a 56-bit CPU emerges down the road, no adjustment to this code would be needed.  (NaN boxing and other webkit code that rely on 48 bits is another matter... but at least this code would continue to work as expected...)
Comment 25 Jim Mason 2020-07-13 02:14:05 PDT
Created attachment 404134 [details]
Patch
Comment 26 Jim Mason 2020-07-13 02:50:33 PDT
Created attachment 404137 [details]
Patch
Comment 27 Jim Mason 2020-07-13 03:12:19 PDT
Created attachment 404139 [details]
Patch
Comment 28 Jim Mason 2020-07-13 05:55:48 PDT
Created attachment 404143 [details]
Patch
Comment 29 Mark Lam 2020-07-13 15:14:23 PDT
Comment on attachment 404143 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=404143&action=review

> Source/WTF/wtf/Packed.h:147
> +#if HAVE(CANONICAL_ADDRESSING)

This is wrong because the name does not match what the code is trying to do.  x86_64 on OS(DARWIN), OS(LINUX), and OS(WINDOWS) all have canonical addresses, but none of them want this.  Really, this is truly an OS(SOLARIS) thing.  There's currently no evidence that any other OS would lay out memory this way.  Declaring this HAVE(CANONICAL_ADDRESSING) condition is just opening a door for confusion later because it doesn't really guard what it says.

Please just use #if OS(SOLARIS) here.

> Source/WTF/wtf/Packed.h:151
> +        //

You can add a comment here saying: "No other OS needs to do this because they do not map user space data into any address range with bit 47 set."

> Source/WTF/wtf/Packed.h:172
> +#if HAVE(CANONICAL_ADDRESSING)
> +        // Ensure the address is in canonical form (see note above).
> +        ASSERT(bitwise_cast<intptr_t>(value) == (bitwise_cast<intptr_t>(value)
> +            << (64 - OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH)))
> +            >> (64 - OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH)));
> +#elif CPU(X86_64)
> +        // Ensure the address is in the lower half.
> +        // If this assert fails, you may need to enable CANONICAL_ADDRESSING.
> +        ASSERT(value < 1ULL << (OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) - 1));
> +#endif

Let's remove this.  There's a much easier way to do this assert.

After the memcpy below, add:
    ASSERT(bitwise_cast<uintptr_t>(get()) == value);

This is also an improvement because it tests how we will actually encode the address bits in the packed pointer, not how we "expect" we will encode the address bits.  And there's no need for any conditionals.  The assertion should always be true.

> Source/WTF/wtf/PlatformHave.h:678
> +#if CPU(X86_64) && !(OS(DARWIN) || OS(LINUX) || OS(WINDOWS))
> +#define HAVE_CANONICAL_ADDRESSING 1
> +#endif

Please remove this.  Instead, please add the following to PlatformOS.h:

#if defined(__sun)
#define WTF_OS_SOLARIS 1
#endif

> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:69
> +#if HAVE(CANONICAL_ADDRESSING)

Let's use `#if OS(SOLARIS)` instead.

> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:72
> +#elif !CPU(X86_64)

Use `#else`

> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:76
> +            // Caution: This test should be processor-specific.
> +            //
> +            // In absence of a specific test, assume that all
> +            // bits of the EFFECTIVE_ADDRESS_WIDTH can be used.

Just say "Other OSes will never allocate user space addresses with bit 47 (i.e. OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) - 1) set."
Comment 30 Jim Mason 2020-07-14 02:42:53 PDT
Created attachment 404218 [details]
Patch
Comment 31 Mark Lam 2020-07-14 06:22:46 PDT
Comment on attachment 404218 [details]
Patch

I just noticed that you’re missing Tools/ChangeLog.
Comment 32 Mark Lam 2020-07-14 06:29:41 PDT
Comment on attachment 404218 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=404218&action=review

r- because of the missing ChangeLog.

> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:69
> +#if CPU(X86_64) && !(OS(DARWIN) || OS(LINUX) || OS(WINDOWS))

nit: while you’re adding the ChangeLog, can you also flip these 2 cases so that we can test for CPU(X86_64) && (OS(DARWIN) || OS(LINUX) || OS(WINDOWS)) instead?  It’s better to test for a positive condition than a negative one.

> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:73
> +            // Other OSes will never allocate user space addresses with

Since we’re no testing for OS(SOLARIS), can you change this comment to say “These OSes” instead of “Other OSes”?
Comment 33 Jim Mason 2020-07-14 07:22:17 PDT
Created attachment 404233 [details]
Patch
Comment 34 Jim Mason 2020-07-14 07:28:51 PDT
(In reply to Mark Lam from comment #32)
> nit: while you’re adding the ChangeLog, can you also flip these 2 cases so
> that we can test for CPU(X86_64) && (OS(DARWIN) || OS(LINUX) || OS(WINDOWS))
> instead?  It’s better to test for a positive condition than a negative one.

I am having a doubt now.  The #else will now run for all CPUs and OSes not named in the #if, not just X86_64.  I don't think that is right.  Anything not X86_64 ought to be running the same test as in the #if.
Comment 35 Jim Mason 2020-07-14 07:32:11 PDT
(In reply to Jim Mason from comment #34)
> (In reply to Mark Lam from comment #32)
> > nit: while you’re adding the ChangeLog, can you also flip these 2 cases so
> > that we can test for CPU(X86_64) && (OS(DARWIN) || OS(LINUX) || OS(WINDOWS))
> > instead?  It’s better to test for a positive condition than a negative one.
> 
> I am having a doubt now.  The #else will now run for all CPUs and OSes not
> named in the #if, not just X86_64.  I don't think that is right.  Anything
> not X86_64 ought to be running the same test as in the #if.

I think the correct test with the sense reversed would be:

#if !CPU(X86_64) || OS(DARWIN) || OS(LINUX) || OS(WINDOWS)
   ... run the legacy test ...
#else
   ... run the new min/max upper test ...
#endif
Comment 36 Mark Lam 2020-07-14 07:39:05 PDT
(In reply to Jim Mason from comment #35)
> (In reply to Jim Mason from comment #34)
> > (In reply to Mark Lam from comment #32)
> > > nit: while you’re adding the ChangeLog, can you also flip these 2 cases so
> > > that we can test for CPU(X86_64) && (OS(DARWIN) || OS(LINUX) || OS(WINDOWS))
> > > instead?  It’s better to test for a positive condition than a negative one.
> > 
> > I am having a doubt now.  The #else will now run for all CPUs and OSes not
> > named in the #if, not just X86_64.  I don't think that is right.  Anything
> > not X86_64 ought to be running the same test as in the #if.
> 
> I think the correct test with the sense reversed would be:
> 
> #if !CPU(X86_64) || OS(DARWIN) || OS(LINUX) || OS(WINDOWS)
>    ... run the legacy test ...
> #else
>    ... run the new min/max upper test ...
> #endif

You are correct.  Sorry about that.  Can you fix the patch?
Comment 37 Jim Mason 2020-07-14 07:42:41 PDT
Created attachment 404235 [details]
Patch
Comment 38 Mark Lam 2020-07-14 08:09:11 PDT
Comment on attachment 404235 [details]
Patch

r=me.  Jim, if you’re not a committer, the way to get the patch committed is to set the cq flag to ?.  Thereafter, any WebKit committer or reviewer can cq+ it to automatically commit it.  That said, I’d wait for the api test EWS bots to be green before committing.
Comment 39 Mark Lam 2020-07-14 16:38:17 PDT
Comment on attachment 404235 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=404235&action=review

The patch looks ready to land.  If you set the cq? flag, we can cq+ it to land it.

> Source/WTF/wtf/Packed.h:154
> +        if (value & 0x1ULL << (OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) - 1))
> +            value |= ~((1ULL << OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH)) - 1);

FWIW, after thinking about this some more, I think you can implement a fast version of this:
    unsigned shiftBits = countOfBits<uintptr_t> - OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH);
    value = (bitwise_cast<intptr_t>(value) << shiftBits) >> shiftBits;

The CPU should be able to compute the shifts much faster than the masking + conditional branch + masking.  Plus the bytes needed to encode the shift instructions is probably less too.

Jim, if you want to apply this, you'll have to build and test it out on your Solaris machine yourself to make sure it works first.  The EWS bots will only ensure that the patch does not introduce bugs to other ports.  They won't be able to test its correctness since this code is not needed for other ports.
Comment 40 Jim Mason 2020-07-15 01:34:11 PDT
(In reply to Mark Lam from comment #39)
> FWIW, after thinking about this some more, I think you can implement a fast
> version of this:
>     unsigned shiftBits = countOfBits<uintptr_t> -
> OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH);
>     value = (bitwise_cast<intptr_t>(value) << shiftBits) >> shiftBits;
> 
> The CPU should be able to compute the shifts much faster than the masking +
> conditional branch + masking.  Plus the bytes needed to encode the shift
> instructions is probably less too.

Thanks for the suggestion, Mark.  I think you're right.  The old way generates something like this:

	btq	$47, %rax
	jnc	.L2
	movabsq	$-281474976710656, %rdx
	orq	%rdx, %rax
    .L2:

while your revision generates this:

        salq    $16, %rax
        sarq    $16, %rax

I am going to apply the revision, but will first validate in my environment before committing.
Comment 41 Jim Mason 2020-07-15 04:53:01 PDT
Created attachment 404330 [details]
Patch
Comment 42 Mark Lam 2020-07-15 08:06:09 PDT
Comment on attachment 404330 [details]
Patch

r=me
Comment 43 EWS 2020-07-15 08:13:29 PDT
/Volumes/Data/worker/Commit-Queue/build/Source/JavaScriptCore/ChangeLog neither lists a valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case insensitive).
Comment 44 Jim Mason 2020-07-15 08:19:49 PDT
(In reply to EWS from comment #43)
> /Volumes/Data/worker/Commit-Queue/build/Source/JavaScriptCore/ChangeLog
> neither lists a valid reviewer nor contains the string "Unreviewed" or
> "Rubber stamp" (case insensitive).

Is this something that I need to fix?
Comment 45 Jim Mason 2020-07-15 08:34:36 PDT
Created attachment 404344 [details]
Patch
Comment 46 Jim Mason 2020-07-15 08:37:17 PDT
(In reply to Jim Mason from comment #44)
> (In reply to EWS from comment #43)
> > /Volumes/Data/worker/Commit-Queue/build/Source/JavaScriptCore/ChangeLog
> > neither lists a valid reviewer nor contains the string "Unreviewed" or
> > "Rubber stamp" (case insensitive).
> 
> Is this something that I need to fix?

Looks like it was missing Reviewed by NOBODY (OOPS!) in the ChangeLog.

Just updated the ChangeLog entries.  Let's try it again.  Sorry about that.
Comment 47 EWS 2020-07-15 09:32:06 PDT
Committed r264400: <https://trac.webkit.org/changeset/264400>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 404344 [details].
Comment 48 Radar WebKit Bug Importer 2020-07-15 09:33:15 PDT
<rdar://problem/65609790>