WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
214142
[WTF] Fix PackedAlignedPtr for X86_64 canonical addresses
https://bugs.webkit.org/show_bug.cgi?id=214142
Summary
[WTF] Fix PackedAlignedPtr for X86_64 canonical addresses
Jim Mason
Reported
2020-07-09 08:58:43 PDT
PackedAlignedPtr assumes the effective address width is <= 48 bits in 64-bit architectures. This facilitates NaN boxing with the remaining bits. (ref: wtf/PlatformOS.h) This is fine for x86_64, as it has a 48-bit address space. However, PackedAlignedPtr incorrectly assumes the 'hole' in the address space is at the high-end (i.e., that the most significant 16 bits are always zero). For x86_64, the high order bits may be zero or one: "The AMD specification requires that the most significant 16 bits of any virtual address, bits 48 through 63, must be copies of bit 47 (in a manner akin to sign extension)." Reference:
https://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details
The patch to wtf/Packed.h fixes the getter of PackedAlignedPtr to reconstitute the canonical address by 'sign-extending' bit 47 (which we already store) across the most significant 16-bits: #if CPU(X86_64) if(value & 0x1ULL << 47) value |= 0xffffULL << 48; #endif In this way, both the getter and setter of PackedAlignedPtr will work correctly for all X86_64 canonical addresses: The setter continues to discard the upper 16 bits and store only the lower 48 bits, while the getter now reconstitutes the upper 16 bits at retrieval time. The patch to wasm/js/WebAssemblyFunction.cpp is a related fix for an address pointer test: Currently, the test uses a signed value (intptr_t) for the comparison. If the high order bit is set, this number is negative, which makes the test fail. The fix changes intptr_t to uintptr_t so the test will work as expected for all addresses.
Attachments
Patch
(3.49 KB, patch)
2020-07-09 09:06 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(4.66 KB, patch)
2020-07-09 11:56 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(5.41 KB, patch)
2020-07-10 02:54 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(5.50 KB, patch)
2020-07-10 05:22 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(6.35 KB, patch)
2020-07-13 02:14 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(6.37 KB, patch)
2020-07-13 02:50 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(6.41 KB, patch)
2020-07-13 03:12 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(6.53 KB, patch)
2020-07-13 05:55 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(5.52 KB, patch)
2020-07-14 02:42 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(6.13 KB, patch)
2020-07-14 07:22 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(6.13 KB, patch)
2020-07-14 07:42 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(6.30 KB, patch)
2020-07-15 04:53 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Patch
(6.33 KB, patch)
2020-07-15 08:34 PDT
,
Jim Mason
no flags
Details
Formatted Diff
Diff
Show Obsolete
(12)
View All
Add attachment
proposed patch, testcase, etc.
Jim Mason
Comment 1
2020-07-09 09:06:16 PDT
Created
attachment 403873
[details]
Patch
Mark Lam
Comment 2
2020-07-09 09:15:09 PDT
Jim, what platform actually allocates user space memory in the range where bit 47 of the address is set? Can you share the detail?
Jim Mason
Comment 3
2020-07-09 09:25:11 PDT
(In reply to Mark Lam from
comment #2
)
> Jim, what platform actually allocates user space memory in the range where > bit 47 of the address is set? Can you share the detail?
Solaris/illumos. See
https://www.oracle.com/solaris/technologies/memory.html
I first thought this was a Solaris quirk, and for a long time, just self-applied the attached patch to make it work in my environment. When I learned that X86_64 actually specifies the canonical address space, such that the 'hole' of unusable addresses is in the middle of the virtual space, I reckoned it would be worthwhile to share the fix, so that webkit always does the right thing for this processor architecture.
Mark Lam
Comment 4
2020-07-09 09:28:52 PDT
(In reply to Jim Mason from
comment #3
)
> (In reply to Mark Lam from
comment #2
) > > Jim, what platform actually allocates user space memory in the range where > > bit 47 of the address is set? Can you share the detail? > > Solaris/illumos. See
https://www.oracle.com/solaris/technologies/memory.html
> > I first thought this was a Solaris quirk, and for a long time, just > self-applied the attached patch to make it work in my environment. When I > learned that X86_64 actually specifies the canonical address space, such > that the 'hole' of unusable addresses is in the middle of the virtual space, > I reckoned it would be worthwhile to share the fix, so that webkit always > does the right thing for this processor architecture.
Thanks for the patch and the justification for it. Will look into the issue.
Mark Lam
Comment 5
2020-07-09 09:43:29 PDT
(In reply to Jim Mason from
comment #3
)
> (In reply to Mark Lam from
comment #2
) > > Jim, what platform actually allocates user space memory in the range where > > bit 47 of the address is set? Can you share the detail? > > Solaris/illumos. See
https://www.oracle.com/solaris/technologies/memory.html
> > I first thought this was a Solaris quirk, and for a long time, just > self-applied the attached patch to make it work in my environment. When I > learned that X86_64 actually specifies the canonical address space, such > that the 'hole' of unusable addresses is in the middle of the virtual space, > I reckoned it would be worthwhile to share the fix, so that webkit always > does the right thing for this processor architecture.
According to
https://www.oracle.com/solaris/technologies/memory.html
, addresses with bit 47 set are all reserved for Kernel space. In fact, the upper bound in user space memory is 0x00008000.00000000 i.e. you should never see an address with bit 47 set in WebKit. Am I missing something?
Jim Mason
Comment 6
2020-07-09 09:57:37 PDT
(In reply to Mark Lam from
comment #5
)
> According to
https://www.oracle.com/solaris/technologies/memory.html
, > addresses with bit 47 set are all reserved for Kernel space. In fact, the > upper bound in user space memory is 0x00008000.00000000 i.e. you should > never see an address with bit 47 set in WebKit. Am I missing something?
I think you may be looking at SPARC. X86_64 has the user stack and shared objects up there. It's about half way down: 64-bit Kernel's Virtual Memory Layout. (assuming 64 bit app) 0xFFFFFD80.00000000 |-----------------------|- KERNELBASE (lower if > 1TB) | User stack |- User space memory | | | shared objects, etc | (grows downwards) : : | | 0xFFFF8000.00000000 |-----------------------| | | | VA Hole / unused | | | 0x00008000.00000000 |-----------------------| | | | | : : | user heap | (grows upwards) | | | user data | |-----------------------| | user text | 0x00000000.04000000 |-----------------------| | invalid | 0x00000000.00000000 +-----------------------+ I can confirm on Solaris that I see addresses with bit 47 both set and not set in PackedAlignedPtr, and if I don't sign-extend the ones with bit 47 set, it will SIGSEGV.
Mark Lam
Comment 7
2020-07-09 10:01:45 PDT
(In reply to Jim Mason from
comment #6
)
> (In reply to Mark Lam from
comment #5
) > > According to
https://www.oracle.com/solaris/technologies/memory.html
, > > addresses with bit 47 set are all reserved for Kernel space. In fact, the > > upper bound in user space memory is 0x00008000.00000000 i.e. you should > > never see an address with bit 47 set in WebKit. Am I missing something? > > I think you may be looking at SPARC. X86_64 has the user stack and shared > objects up there. It's about half way down: > > 64-bit Kernel's Virtual Memory Layout. (assuming 64 bit app) > > 0xFFFFFD80.00000000 |-----------------------|- KERNELBASE (lower if > 1TB) > | User stack |- User space memory > | | > | shared objects, etc | (grows downwards) > : : > | | > 0xFFFF8000.00000000 |-----------------------| > | | > | VA Hole / unused | > | | > 0x00008000.00000000 |-----------------------| > | | > | | > : : > | user heap | (grows upwards) > | | > | user data | > |-----------------------| > | user text | > 0x00000000.04000000 |-----------------------| > | invalid | > 0x00000000.00000000 +-----------------------+ > > > I can confirm on Solaris that I see addresses with bit 47 both set and not > set in PackedAlignedPtr, and if I don't sign-extend the ones with bit 47 > set, it will SIGSEGV.
This was exactly what I was looking at. Oh, I missed the "User stack and shared objects, etc" section.
Mark Lam
Comment 8
2020-07-09 10:05:21 PDT
Comment on
attachment 403873
[details]
Patch r=me
Mark Lam
Comment 9
2020-07-09 10:08:42 PDT
Comment on
attachment 403873
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=403873&action=review
> Source/WTF/wtf/Packed.h:155 > +#if CPU(X86_64) > + // The AMD specification requires that the most significant 16 > + // bits of any virtual address, bits 48 through 63, must be > + // copies of bit 47 (in a manner akin to sign extension). > + // > + // Reference:
https://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details
> + if (value & 0x1ULL << 47) > + value |= 0xffffULL << 48; > +#endif
Please add an ASSERT in set() to verify that if bit 47 is set that the upper 16 bits are also set.
Mark Lam
Comment 10
2020-07-09 10:11:12 PDT
Comment on
attachment 403873
[details]
Patch Taking back the r+. Looks like you also need to update TEST(WTF_Packed, AssignAndGet) and the api tests' Packed.cpp.
Jim Mason
Comment 11
2020-07-09 11:56:45 PDT
Created
attachment 403897
[details]
Patch
Mark Lam
Comment 12
2020-07-09 12:18:57 PDT
Comment on
attachment 403897
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=403897&action=review
> Source/WTF/wtf/Packed.h:166 > + // Shift bit 47 to high order bit, then arithmetic shift back. > + // If value is different, address is not canonical.
Just remove these 2 lines. It's frowned upon in WebKit code to have comments that say exactly the operations the code is doing as opposed to why it's doing it.
> Source/WTF/wtf/Packed.h:167 > + ASSERT(bitwise_cast<intptr_t>(value) == bitwise_cast<intptr_t>(value) << 16 >> 16);
Please put parens around bitwise_cast<intptr_t>(value) << 16: ASSERT(bitwise_cast<intptr_t>(value) == (bitwise_cast<intptr_t>(value) << 16) >> 16);
> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:72 > +#if CPU(X86_64) > + // validate X86_64 "higher half" address > + key = max = bitwise_cast<uint8_t*>(static_cast<uintptr_t>(-1)); > + EXPECT_EQ(key.get(), max); > +#endif
This isn't want I was getting at. Your earlier patch was failing the API tests (see
https://ews-build.webkit.org/#/builders/3/builds/28085
) with the following details: Failed TestWTF.WTF_Packed.AssignAndGet /Volumes/Data/worker/macOS-Mojave-Release-Build-EWS/build/Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:67 Expected equality of these values: key.get() Which is: 0xffffffffffffffff max Which is: 0xffffffffffff The reason it failed is because the test fabricated an address in "max" of 0xffffffffffff, which is an invalid address according to the spec. Naturally, the test would fail. You should fix the test to handle the following cases: 0xffff ffff ffff ffff 0xffff ffff ffff if not X86_64 0x7fff ffff ffff 0x0001 2345 6789 0x0000 0000 0000 Does that make sense? Also, since the only difference here between each test case is the input pointer value, I would put these values in an array, and just run the test in a loop.
Mark Lam
Comment 13
2020-07-09 12:19:30 PDT
Comment on
attachment 403897
[details]
Patch r- because the patch needs more work.
Yusuke Suzuki
Comment 14
2020-07-09 12:37:06 PDT
Comment on
attachment 403897
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=403897&action=review
> Source/WTF/wtf/Packed.h:147 > +#if CPU(X86_64)
When doing that, please do it only in Solaris.
Jim Mason
Comment 15
2020-07-09 14:59:36 PDT
(In reply to Yusuke Suzuki from
comment #14
)
> Comment on
attachment 403897
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=403897&action=review
> > > Source/WTF/wtf/Packed.h:147 > > +#if CPU(X86_64) > > When doing that, please do it only in Solaris.
I was planning to make the changes requested by Mark tomorrow, but the whole point of this patch is to provide support for X86_64 canonical addressing in webkit. It is not just a Solaris thing. If webkit does not want to do the right thing for X86_64, let me know and I will just close the bug.
Mark Lam
Comment 16
2020-07-09 15:16:51 PDT
(In reply to Jim Mason from
comment #15
)
> (In reply to Yusuke Suzuki from
comment #14
) > > Comment on
attachment 403897
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=403897&action=review
> > > > > Source/WTF/wtf/Packed.h:147 > > > +#if CPU(X86_64) > > > > When doing that, please do it only in Solaris. > > I was planning to make the changes requested by Mark tomorrow, but the whole > point of this patch is to provide support for X86_64 canonical addressing in > webkit. It is not just a Solaris thing. > > If webkit does not want to do the right thing for X86_64, let me know and I > will just close the bug.
Just to clarify. I think Yusuke's point is not that this is strictly incorrect, but that it is unnecessary for any other OS than Solaris. In WebKit, we do a lot of work to make sure the code is optimal. While this extra work does not hurt in terms of correctness, it is guaranteed to be an effective no-op on any OS that is not Solaris because for other OSes, user space never has an address with a set bit 47. So, it is not incorrect to omit this extra code for any OS that is not Solaris. Yusuke's concern here is with regards to performance. That said. I don't see WTF_OS_SOLARIS defined in PlatformOS.h. So, in order to add the #if OS(SOLARIS) check, one would have to figure out how to detect a build targeting Solaris and add the WTF_OS_SOLARIS #define in PlatformOS.h.
Jim Mason
Comment 17
2020-07-09 15:50:17 PDT
(In reply to Mark Lam from
comment #16
)
> (In reply to Jim Mason from
comment #15
) > > (In reply to Yusuke Suzuki from
comment #14
) > > > Comment on
attachment 403897
[details]
> > > Patch > > > > > > View in context: > > >
https://bugs.webkit.org/attachment.cgi?id=403897&action=review
> > > > > > > Source/WTF/wtf/Packed.h:147 > > > > +#if CPU(X86_64) > > > > > > When doing that, please do it only in Solaris. > > > > I was planning to make the changes requested by Mark tomorrow, but the whole > > point of this patch is to provide support for X86_64 canonical addressing in > > webkit. It is not just a Solaris thing. > > > > If webkit does not want to do the right thing for X86_64, let me know and I > > will just close the bug. > > Just to clarify. I think Yusuke's point is not that this is strictly > incorrect, but that it is unnecessary for any other OS than Solaris. In > WebKit, we do a lot of work to make sure the code is optimal. While this > extra work does not hurt in terms of correctness, it is guaranteed to be an > effective no-op on any OS that is not Solaris because for other OSes, user > space never has an address with a set bit 47. So, it is not incorrect to > omit this extra code for any OS that is not Solaris. Yusuke's concern here > is with regards to performance.
With -O2 or -O3 in gcc, the test becomes two instructions, a bit test and a jump. Probably similar in other compilers. In the PackedAlignedPtr getter, there is already a test -- if (isAlignmentShiftProfitable) -- that is probably a no op on most every platform. This test is similarly trivial. The reason I submit this patch is that an operating system can map memory however it likes. If webkit does the right thing, it ensures every OS will work correctly, no matter what. As it stands now, the operating system has to make 'the right choices' based on the current assumptions, or webkit will fail, which means someone has to chase bugs down. As you pointed out, the current TEST(WTF_Packed, AssignAndGet) is testing an address that is *invalid* for the X86_64 processor architecture. And in the web assembly code, a signed value is being used for a pointer test. Cleaning up these sorts of things helps make webkit robust.
Jim Mason
Comment 18
2020-07-10 02:54:26 PDT
Created
attachment 403954
[details]
Patch
Jim Mason
Comment 19
2020-07-10 05:22:01 PDT
Created
attachment 403960
[details]
Patch
Jim Mason
Comment 20
2020-07-10 05:29:44 PDT
(In reply to Mark Lam from
comment #12
)
> You should fix the test to handle the following cases: > 0xffff ffff ffff ffff > 0xffff ffff ffff if not X86_64 > 0x7fff ffff ffff > 0x0001 2345 6789 > 0x0000 0000 0000
Done. As for, 0x123456789, I am running that test case for all 64-bit architectures (not just X86_64), while testing 0x12345678 on 32-bit. If you want to do something else, let me know.
Mark Lam
Comment 21
2020-07-10 07:57:31 PDT
(In reply to Jim Mason from
comment #17
)
> (In reply to Mark Lam from
comment #16
) > > (In reply to Jim Mason from
comment #15
) > > > (In reply to Yusuke Suzuki from
comment #14
) > > > > Comment on
attachment 403897
[details]
> > > > Patch > > > > > > > > View in context: > > > >
https://bugs.webkit.org/attachment.cgi?id=403897&action=review
> > > > > > > > > Source/WTF/wtf/Packed.h:147 > > > > > +#if CPU(X86_64) > > > > > > > > When doing that, please do it only in Solaris. > > > > > > I was planning to make the changes requested by Mark tomorrow, but the whole > > > point of this patch is to provide support for X86_64 canonical addressing in > > > webkit. It is not just a Solaris thing. > > > > > > If webkit does not want to do the right thing for X86_64, let me know and I > > > will just close the bug. > > > > Just to clarify. I think Yusuke's point is not that this is strictly > > incorrect, but that it is unnecessary for any other OS than Solaris. In > > WebKit, we do a lot of work to make sure the code is optimal. While this > > extra work does not hurt in terms of correctness, it is guaranteed to be an > > effective no-op on any OS that is not Solaris because for other OSes, user > > space never has an address with a set bit 47. So, it is not incorrect to > > omit this extra code for any OS that is not Solaris. Yusuke's concern here > > is with regards to performance. > > With -O2 or -O3 in gcc, the test becomes two instructions, a bit test and a > jump. Probably similar in other compilers. In the PackedAlignedPtr getter, > there is already a test -- if (isAlignmentShiftProfitable) -- that is > probably a no op on most every platform. This test is similarly trivial.
isAlignmentShiftProfitable is a constexpr. This check will be constant folded. The shift will either be emitted unconditionally, or elided away. Hence, this is not a justification for adding a conditional branch into this code, especially one that we know are not needed for OS(DARWIN), OS(LINUX) and OS(WIN). Also, this get() accessor is used all over the place. Yusuke is the engineer who applied PackedAlignedPtr pointers throughout the code, and he did a lot of benchmarking back then to ensure that there is no performance regression when it was applied. I trust he has good reason to ask to have this #if'd out. This performance issue is also why I asked what OS you were seeing this on in the first place, and I tried to take the time to think about whether we really need this. I would take Yusuke's word on this over mine.
> The reason I submit this patch is that an operating system can map memory > however it likes. If webkit does the right thing, it ensures every OS will > work correctly, no matter what. As it stands now, the operating system has > to make 'the right choices' based on the current assumptions, or webkit will > fail, which means someone has to chase bugs down.
Your claim here is not correct. We're not assuming that the OS will make this choice. We know for a fact that OS(DARWIN), OS(LINUX) (see
https://www.kernel.org/doc/html/latest/x86/x86_64/mm.html?highlight=virtual%20memory
), and OS(WIN) (see
https://docs.microsoft.com/en-us/windows/win32/memory/memory-limits-for-windows-releases#memory-and-address-space-limits
) do not need this. It's in their ABI. And the software is written based on that ABI. Solaris is the weird duck here.
> As you pointed out, the current TEST(WTF_Packed, AssignAndGet) is testing an > address that is *invalid* for the X86_64 processor architecture. And in the > web assembly code, a signed value is being used for a pointer test. > Cleaning up these sorts of things helps make webkit robust.
Yes, there is value in this patch. We're not rejecting it. We're just trying to refine it.
Mark Lam
Comment 22
2020-07-10 08:46:28 PDT
Comment on
attachment 403960
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=403960&action=review
Almost there.
> Source/WTF/wtf/Packed.h:147 > +#if CPU(X86_64)
Please do one of the following: 1. In PlatformOS.h, add a #define for WTF_OS_SOLARIS based on what gcc would define for Solaris. Change this line to `#if CPU(X86_64) && OS(SOLARIS)` Or ... 2. Change this line to `#if CPU(X86_64) && !(OS(DARWIN) || OS(LINUX) || OS(WINDOWS))` I prefer option (1) but if needed, I can accept option (2).
> Source/WTF/wtf/Packed.h:154 > + // The AMD specification requires that the most significant 16 > + // bits of any virtual address, bits 48 through 63, must be > + // copies of bit 47 (in a manner akin to sign extension). > + // > + // Reference:
https://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details
> + if (value & 0x1ULL << 47) > + value |= 0xffffULL << 48;
After thinking about this some more, I realized that this quirk only applies to X86_64 implementations that have 48 bit addresses. The spec actually allows for implementations with > 48 bit addresses. However, I'm not aware that those implementations actually exist today. Also, lots of WebKit code relies on addresses not exceeding 48 bits. Please add the following just above this if statement: static_assert(OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) == 48); This will prevent this quirk from silently causing bad behavior on x86_64 implementations that have > 48 bits addressing (not that a fix will be easy). It also documents our expectation.
> Source/WTF/wtf/Packed.h:166 > +#if CPU(X86_64) > + // Ensure bits 48-63 track the value of bit 47, per the AMD spec. > + ASSERT(bitwise_cast<intptr_t>(value) == (bitwise_cast<intptr_t>(value) << 16) >> 16); > +#endif
The reason I asked for this before is because it will catch the issue if someone ends up building for another OS with an unusual virtual memory map like Solaris. Having this assertion here allows us to use Option 1 above so that all ports will get the optimized behavior by default. And if it doesn't work for any specific port, it won't fail silently anymore.
> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:67 > + uint8_t* candidates[] = > + { > + 0,
WebKit style requires that this be written as: uint8_t* candidates[] = { 0, ... };
> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:83 > +#if CPU(ADDRESS64) > + bitwise_cast<uint8_t*>(0x123456789ULL), > +#else > + bitwise_cast<uint8_t*>(0x12345678UL), > +#endif > +#if CPU(X86_64) > + bitwise_cast<uint8_t*>((1ULL << 47) - 1), // max lower half > + bitwise_cast<uint8_t*>(~((1ULL << 47) - 1)), // min higher half > + bitwise_cast<uint8_t*>(static_cast<uintptr_t>(-1)), // max higher half > +#else > + // Caution: This test should be processor-specific. > + // > + // In absence of a specific test, assume that all > + // bits of the EFFECTIVE_ADDRESS_WIDTH can be used. > + bitwise_cast<uint8_t*>(static_cast<uintptr_t>((1ULL << OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH)) - 1)), > +#endif
I would write these as: bitwise_cast<uint8_t*>(static_cast<uintptr_t>((1ULL << (OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) / 2)) - 1)), bitwise_cast<uint8_t*>(static_cast<uintptr_t>((1ULL << (OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) - 1)) - 1)), #if CPU(X86_64) && OS(SOLARIS) bitwise_cast<uint8_t*>(~((1ULL << 47) - 1)), // min higher half bitwise_cast<uint8_t*>(std::numeric_limits<uintptr_t>::max()), #else bitwise_cast<uint8_t*>(static_cast<uintptr_t>((1ULL << OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH)) - 1)), #endif
Mark Lam
Comment 23
2020-07-10 08:48:05 PDT
Comment on
attachment 403960
[details]
Patch r- because the patch needs a bit more work.
Jim Mason
Comment 24
2020-07-10 10:38:32 PDT
You have given this a lot of thought, Mark. Thank you! This is becoming quite good.
> 1. In PlatformOS.h, add a #define for WTF_OS_SOLARIS based on what gcc would > define for Solaris. > Change this line to `#if CPU(X86_64) && OS(SOLARIS)` > > Or ... > > 2. Change this line to `#if CPU(X86_64) && !(OS(DARWIN) || OS(LINUX) || > OS(WINDOWS))` > > I prefer option (1) but if needed, I can accept option (2).
How about a touch of option (1) and option (2)? Instead of bringing OS-specifics into the test, we could define a new symbol in one of Platform* files, something like USE_CANONICAL_ADDRESSING. There, it could be setup as an 'opt-in' option, like (1) above: #if CPU(X86_64) && (OS(SOLARIS) || ...) #define USE_CANONICAL_ADDRESSING #endif or opt-out, like (2) above: #if CPU(X86_64) && !(OS(DARWIN) || ...) #define USE_CANONICAL_ADDRESSING #endif and the code in Packed.h would read something like: #if USE(CANONICAL_ADDRESSING) if (value & 0x1ULL << 47) value |= 0xffffULL << 48; #endif It avoids having to go touch the source file if we want to change from opt-in to opt-out, or add or remove an OS from the list.
> After thinking about this some more, I realized that this quirk only applies > to X86_64 implementations that have 48 bit addresses. The spec actually > allows for implementations with > 48 bit addresses. However, I'm not aware > that those implementations actually exist today. Also, lots of WebKit code > relies on addresses not exceeding 48 bits. > > Please add the following just above this if statement: > static_assert(OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) == 48); > > This will prevent this quirk from silently causing bad behavior on x86_64 > implementations that have > 48 bits addressing (not that a fix will be > easy). It also documents our expectation.
What if we use EFFECTIVE_ADDRESS_WIDTH to determine the bit and mask? e.g, #if USE(CANONICAL_ADDRESSING) if (value & 0x1ULL << (OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) - 1)) value |= ~((1ULL << OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH)) - 1); #endif The ASSERT in the setter could be updated similarly. In this way, if a 56-bit CPU emerges down the road, no adjustment to this code would be needed. (NaN boxing and other webkit code that rely on 48 bits is another matter... but at least this code would continue to work as expected...)
Jim Mason
Comment 25
2020-07-13 02:14:05 PDT
Created
attachment 404134
[details]
Patch
Jim Mason
Comment 26
2020-07-13 02:50:33 PDT
Created
attachment 404137
[details]
Patch
Jim Mason
Comment 27
2020-07-13 03:12:19 PDT
Created
attachment 404139
[details]
Patch
Jim Mason
Comment 28
2020-07-13 05:55:48 PDT
Created
attachment 404143
[details]
Patch
Mark Lam
Comment 29
2020-07-13 15:14:23 PDT
Comment on
attachment 404143
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=404143&action=review
> Source/WTF/wtf/Packed.h:147 > +#if HAVE(CANONICAL_ADDRESSING)
This is wrong because the name does not match what the code is trying to do. x86_64 on OS(DARWIN), OS(LINUX), and OS(WINDOWS) all have canonical addresses, but none of them want this. Really, this is truly an OS(SOLARIS) thing. There's currently no evidence that any other OS would lay out memory this way. Declaring this HAVE(CANONICAL_ADDRESSING) condition is just opening a door for confusion later because it doesn't really guard what it says. Please just use #if OS(SOLARIS) here.
> Source/WTF/wtf/Packed.h:151 > + //
You can add a comment here saying: "No other OS needs to do this because they do not map user space data into any address range with bit 47 set."
> Source/WTF/wtf/Packed.h:172 > +#if HAVE(CANONICAL_ADDRESSING) > + // Ensure the address is in canonical form (see note above). > + ASSERT(bitwise_cast<intptr_t>(value) == (bitwise_cast<intptr_t>(value) > + << (64 - OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH))) > + >> (64 - OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH))); > +#elif CPU(X86_64) > + // Ensure the address is in the lower half. > + // If this assert fails, you may need to enable CANONICAL_ADDRESSING. > + ASSERT(value < 1ULL << (OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) - 1)); > +#endif
Let's remove this. There's a much easier way to do this assert. After the memcpy below, add: ASSERT(bitwise_cast<uintptr_t>(get()) == value); This is also an improvement because it tests how we will actually encode the address bits in the packed pointer, not how we "expect" we will encode the address bits. And there's no need for any conditionals. The assertion should always be true.
> Source/WTF/wtf/PlatformHave.h:678 > +#if CPU(X86_64) && !(OS(DARWIN) || OS(LINUX) || OS(WINDOWS)) > +#define HAVE_CANONICAL_ADDRESSING 1 > +#endif
Please remove this. Instead, please add the following to PlatformOS.h: #if defined(__sun) #define WTF_OS_SOLARIS 1 #endif
> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:69 > +#if HAVE(CANONICAL_ADDRESSING)
Let's use `#if OS(SOLARIS)` instead.
> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:72 > +#elif !CPU(X86_64)
Use `#else`
> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:76 > + // Caution: This test should be processor-specific. > + // > + // In absence of a specific test, assume that all > + // bits of the EFFECTIVE_ADDRESS_WIDTH can be used.
Just say "Other OSes will never allocate user space addresses with bit 47 (i.e. OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) - 1) set."
Jim Mason
Comment 30
2020-07-14 02:42:53 PDT
Created
attachment 404218
[details]
Patch
Mark Lam
Comment 31
2020-07-14 06:22:46 PDT
Comment on
attachment 404218
[details]
Patch I just noticed that you’re missing Tools/ChangeLog.
Mark Lam
Comment 32
2020-07-14 06:29:41 PDT
Comment on
attachment 404218
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=404218&action=review
r- because of the missing ChangeLog.
> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:69 > +#if CPU(X86_64) && !(OS(DARWIN) || OS(LINUX) || OS(WINDOWS))
nit: while you’re adding the ChangeLog, can you also flip these 2 cases so that we can test for CPU(X86_64) && (OS(DARWIN) || OS(LINUX) || OS(WINDOWS)) instead? It’s better to test for a positive condition than a negative one.
> Tools/TestWebKitAPI/Tests/WTF/Packed.cpp:73 > + // Other OSes will never allocate user space addresses with
Since we’re no testing for OS(SOLARIS), can you change this comment to say “These OSes” instead of “Other OSes”?
Jim Mason
Comment 33
2020-07-14 07:22:17 PDT
Created
attachment 404233
[details]
Patch
Jim Mason
Comment 34
2020-07-14 07:28:51 PDT
(In reply to Mark Lam from
comment #32
)
> nit: while you’re adding the ChangeLog, can you also flip these 2 cases so > that we can test for CPU(X86_64) && (OS(DARWIN) || OS(LINUX) || OS(WINDOWS)) > instead? It’s better to test for a positive condition than a negative one.
I am having a doubt now. The #else will now run for all CPUs and OSes not named in the #if, not just X86_64. I don't think that is right. Anything not X86_64 ought to be running the same test as in the #if.
Jim Mason
Comment 35
2020-07-14 07:32:11 PDT
(In reply to Jim Mason from
comment #34
)
> (In reply to Mark Lam from
comment #32
) > > nit: while you’re adding the ChangeLog, can you also flip these 2 cases so > > that we can test for CPU(X86_64) && (OS(DARWIN) || OS(LINUX) || OS(WINDOWS)) > > instead? It’s better to test for a positive condition than a negative one. > > I am having a doubt now. The #else will now run for all CPUs and OSes not > named in the #if, not just X86_64. I don't think that is right. Anything > not X86_64 ought to be running the same test as in the #if.
I think the correct test with the sense reversed would be: #if !CPU(X86_64) || OS(DARWIN) || OS(LINUX) || OS(WINDOWS) ... run the legacy test ... #else ... run the new min/max upper test ... #endif
Mark Lam
Comment 36
2020-07-14 07:39:05 PDT
(In reply to Jim Mason from
comment #35
)
> (In reply to Jim Mason from
comment #34
) > > (In reply to Mark Lam from
comment #32
) > > > nit: while you’re adding the ChangeLog, can you also flip these 2 cases so > > > that we can test for CPU(X86_64) && (OS(DARWIN) || OS(LINUX) || OS(WINDOWS)) > > > instead? It’s better to test for a positive condition than a negative one. > > > > I am having a doubt now. The #else will now run for all CPUs and OSes not > > named in the #if, not just X86_64. I don't think that is right. Anything > > not X86_64 ought to be running the same test as in the #if. > > I think the correct test with the sense reversed would be: > > #if !CPU(X86_64) || OS(DARWIN) || OS(LINUX) || OS(WINDOWS) > ... run the legacy test ... > #else > ... run the new min/max upper test ... > #endif
You are correct. Sorry about that. Can you fix the patch?
Jim Mason
Comment 37
2020-07-14 07:42:41 PDT
Created
attachment 404235
[details]
Patch
Mark Lam
Comment 38
2020-07-14 08:09:11 PDT
Comment on
attachment 404235
[details]
Patch r=me. Jim, if you’re not a committer, the way to get the patch committed is to set the cq flag to ?. Thereafter, any WebKit committer or reviewer can cq+ it to automatically commit it. That said, I’d wait for the api test EWS bots to be green before committing.
Mark Lam
Comment 39
2020-07-14 16:38:17 PDT
Comment on
attachment 404235
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=404235&action=review
The patch looks ready to land. If you set the cq? flag, we can cq+ it to land it.
> Source/WTF/wtf/Packed.h:154 > + if (value & 0x1ULL << (OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH) - 1)) > + value |= ~((1ULL << OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH)) - 1);
FWIW, after thinking about this some more, I think you can implement a fast version of this: unsigned shiftBits = countOfBits<uintptr_t> - OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH); value = (bitwise_cast<intptr_t>(value) << shiftBits) >> shiftBits; The CPU should be able to compute the shifts much faster than the masking + conditional branch + masking. Plus the bytes needed to encode the shift instructions is probably less too. Jim, if you want to apply this, you'll have to build and test it out on your Solaris machine yourself to make sure it works first. The EWS bots will only ensure that the patch does not introduce bugs to other ports. They won't be able to test its correctness since this code is not needed for other ports.
Jim Mason
Comment 40
2020-07-15 01:34:11 PDT
(In reply to Mark Lam from
comment #39
)
> FWIW, after thinking about this some more, I think you can implement a fast > version of this: > unsigned shiftBits = countOfBits<uintptr_t> - > OS_CONSTANT(EFFECTIVE_ADDRESS_WIDTH); > value = (bitwise_cast<intptr_t>(value) << shiftBits) >> shiftBits; > > The CPU should be able to compute the shifts much faster than the masking + > conditional branch + masking. Plus the bytes needed to encode the shift > instructions is probably less too.
Thanks for the suggestion, Mark. I think you're right. The old way generates something like this: btq $47, %rax jnc .L2 movabsq $-281474976710656, %rdx orq %rdx, %rax .L2: while your revision generates this: salq $16, %rax sarq $16, %rax I am going to apply the revision, but will first validate in my environment before committing.
Jim Mason
Comment 41
2020-07-15 04:53:01 PDT
Created
attachment 404330
[details]
Patch
Mark Lam
Comment 42
2020-07-15 08:06:09 PDT
Comment on
attachment 404330
[details]
Patch r=me
EWS
Comment 43
2020-07-15 08:13:29 PDT
/Volumes/Data/worker/Commit-Queue/build/Source/JavaScriptCore/ChangeLog neither lists a valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case insensitive).
Jim Mason
Comment 44
2020-07-15 08:19:49 PDT
(In reply to EWS from
comment #43
)
> /Volumes/Data/worker/Commit-Queue/build/Source/JavaScriptCore/ChangeLog > neither lists a valid reviewer nor contains the string "Unreviewed" or > "Rubber stamp" (case insensitive).
Is this something that I need to fix?
Jim Mason
Comment 45
2020-07-15 08:34:36 PDT
Created
attachment 404344
[details]
Patch
Jim Mason
Comment 46
2020-07-15 08:37:17 PDT
(In reply to Jim Mason from
comment #44
)
> (In reply to EWS from
comment #43
) > > /Volumes/Data/worker/Commit-Queue/build/Source/JavaScriptCore/ChangeLog > > neither lists a valid reviewer nor contains the string "Unreviewed" or > > "Rubber stamp" (case insensitive). > > Is this something that I need to fix?
Looks like it was missing Reviewed by NOBODY (OOPS!) in the ChangeLog. Just updated the ChangeLog entries. Let's try it again. Sorry about that.
EWS
Comment 47
2020-07-15 09:32:06 PDT
Committed
r264400
: <
https://trac.webkit.org/changeset/264400
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 404344
[details]
.
Radar WebKit Bug Importer
Comment 48
2020-07-15 09:33:15 PDT
<
rdar://problem/65609790
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug