Summary: | [JSC] Butterfly allocation from LargeAllocation should try "realloc" behavior if collector thread is not active | ||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Yusuke Suzuki <ysuzuki> | ||||||||||||||||||||||||||||||||||||||||||||||
Component: | JavaScriptCore | Assignee: | Yusuke Suzuki <ysuzuki> | ||||||||||||||||||||||||||||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||||||||||||||||||||||||||||
Severity: | Normal | CC: | benjamin, cdumez, cmarcelo, dbates, ews-watchlist, ggaren, keith_miller, mark.lam, msaboff, rniwa, saam, webkit-bug-importer | ||||||||||||||||||||||||||||||||||||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||||||||||||||||||||||||||||||||||||
Version: | WebKit Nightly Build | ||||||||||||||||||||||||||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||||||||||||||||||||||||||
Attachments: |
|
Description
Yusuke Suzuki
2019-03-22 14:45:16 PDT
I don't think this hurts performance even if we make it by default. This is simply because allocating LargeAllocation should be already super slow. And this cost is amortized by the exponential size increase policy. But anyway, we can measure it. And if we hit some performance problems, we could do, 1. enable it only if sweepSynchronously() is true 2. find some way for mitigation I think we do not need to do this and believe it is performance-neutral. Created attachment 365781 [details]
Patch
Start working
Created attachment 365786 [details]
Patch
more
Comment on attachment 365786 [details] Patch Attachment 365786 [details] did not pass mac-wk2-ews (mac-wk2): Output: https://webkit-queues.webkit.org/results/11619055 Number of test failures exceeded the failure limit. Created attachment 365789 [details]
Archive of layout-test-results from ews107 for mac-highsierra-wk2
The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews107 Port: mac-highsierra-wk2 Platform: Mac OS X 10.13.6
Comment on attachment 365786 [details] Patch Attachment 365786 [details] did not pass mac-ews (mac): Output: https://webkit-queues.webkit.org/results/11619108 Number of test failures exceeded the failure limit. Created attachment 365790 [details]
Archive of layout-test-results from ews103 for mac-highsierra
The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews103 Port: mac-highsierra Platform: Mac OS X 10.13.6
Comment on attachment 365786 [details] Patch Attachment 365786 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/11619062 Number of test failures exceeded the failure limit. Created attachment 365791 [details]
Archive of layout-test-results from ews112 for mac-highsierra
The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews112 Port: mac-highsierra Platform: Mac OS X 10.13.6
Comment on attachment 365786 [details] Patch Attachment 365786 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/11619219 Number of test failures exceeded the failure limit. Created attachment 365795 [details]
Archive of layout-test-results from ews125 for ios-simulator-wk2
The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
It turns out that "realloc" behavior is very important. While bmalloc uses "malloc", "memcpy" and "free" operations for "realloc" request, system malloc attempts to use in-place expansion. Created attachment 365796 [details]
Patch
Created attachment 365797 [details]
Patch
Created attachment 365800 [details]
Patch
Created attachment 365801 [details]
Patch
(In reply to Yusuke Suzuki from comment #12) > It turns out that "realloc" behavior is very important. While bmalloc uses > "malloc", "memcpy" and "free" operations for "realloc" request, system > malloc attempts to use in-place expansion. Darwin libmalloc explicitly states that "posix_memalign + realloc is valid" in its man page. In glibc, there is no documentation, but it works. In bmalloc, it is valid. In windows, using realloc with _aligned_malloc is not valid. But it has _aligned_realloc. So if we want, we can use it. Comment on attachment 365801 [details] Patch Attachment 365801 [details] did not pass win-ews (win): Output: https://webkit-queues.webkit.org/results/11620968 New failing tests: js/sort-large-array.html js/dom/put-direct-index-beyond-vector-length-resize.html js/array-iterate-backwards.html js/regress-155776.html fast/events/bogus-event-listener-invalidation.html js/promises-tests/promises-in-workers.html js/dom/array-with-double-assign.html fast/xpath/xpath-result-eventlistener-crash.html js/arrowfunction-lexical-bind-arguments-strict.html js/reentrant-caching.html js/dom/array-with-double-push.html editing/selection/move-left-right.html fast/workers/worker-gc2.html js/function-apply-many-args.html js/arrowfunction-lexical-bind-arguments-non-strict.html fast/dom/SelectorAPI/resig-SelectorsAPI-test.xhtml editing/selection/move-by-word-visually-crash-test-5.html fast/css/css-selector-deeply-nested.html fast/css/rule-selector-overflow.html js/dom/document-all-between-frames.html js/slow-stress/array-prototype-filter.html Created attachment 365803 [details]
Archive of layout-test-results from ews205 for win-future
The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews205 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Created attachment 365805 [details]
Patch
Created attachment 365888 [details]
Patch
Created attachment 365893 [details]
Patch
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review > Source/JavaScriptCore/heap/LargeAllocation.cpp:81 > + if (isOnList()) > + remove(); > + unsigned originalCellSize = m_cellSize; > + > + void* space = subspace->alignedMemoryAllocator()->tryReallocate(this, sizeIncludingDistancingAndAlignmentAdjustment); > + if (!space) > + return nullptr; I do not want to call LargeAllocation's destructor here since we still use oldAllocation if tryReallocate fails. Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review > Source/WTF/wtf/FastMalloc.cpp:271 > +bool canUseReallocOnToAlignedMallocedPointer() > +{ > +#if OS(DARWIN) > + // `man posix_memalign` explicitly states that this is allowed. > + return true; > +#else > + return false; > +#endif > +} Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >> Source/WTF/wtf/FastMalloc.cpp:271 >> +} > > Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. So, in Darwin, we can just return `true`, and the code is doing this. 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. And yes. As you said, we can make 1) fastMalloc to call _aligned_malloc with 8 or 16 2) fastRealloc to call _aligned_realloc with 8 or 16 3) fastFree to call _aligned_free But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. So, due to (3), anyway, we need to have this concept. Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>> Source/WTF/wtf/FastMalloc.cpp:271 >>> +} >> >> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? > > 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. > So, in Darwin, we can just return `true`, and the code is doing this. > > 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. > And yes. As you said, we can make > 1) fastMalloc to call _aligned_malloc with 8 or 16 > 2) fastRealloc to call _aligned_realloc with 8 or 16 > 3) fastFree to call _aligned_free > But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. > > 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. > > So, due to (3), anyway, we need to have this concept. I see. Thanks for the explanation! I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>> +} >>> >>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >> >> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >> So, in Darwin, we can just return `true`, and the code is doing this. >> >> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >> And yes. As you said, we can make >> 1) fastMalloc to call _aligned_malloc with 8 or 16 >> 2) fastRealloc to call _aligned_realloc with 8 or 16 >> 3) fastFree to call _aligned_free >> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >> >> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >> >> So, due to (3), anyway, we need to have this concept. > > I see. Thanks for the explanation! > > I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. 1. fastMalloced memory can be freed by fastFree 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>> +} >>>> >>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>> >>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>> So, in Darwin, we can just return `true`, and the code is doing this. >>> >>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>> And yes. As you said, we can make >>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>> 3) fastFree to call _aligned_free >>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>> >>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>> >>> So, due to (3), anyway, we need to have this concept. >> >> I see. Thanks for the explanation! >> >> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) > > Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. > The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. > > 1. fastMalloced memory can be freed by fastFree > 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). > > This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". Ah! No, unfortunately :( "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, realloc(pointer, largerSize) => void* memory = malloc(largerSize); memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! free(pointer); But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>> +} >>>>> >>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>> >>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>> >>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>> And yes. As you said, we can make >>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>> 3) fastFree to call _aligned_free >>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>> >>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>> >>>> So, due to (3), anyway, we need to have this concept. >>> >>> I see. Thanks for the explanation! >>> >>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >> >> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >> >> 1. fastMalloced memory can be freed by fastFree >> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >> >> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". > > Ah! No, unfortunately :( > "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, > > realloc(pointer, largerSize) > => > > void* memory = malloc(largerSize); > memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! > free(pointer); > > But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, 1. anyway, we already have a code handling misalignment for realloc-ed memory. 2. we do not need to rely on this "realloc"'s platform behavior. I'll try to do this. Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>>> +} >>>>>> >>>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>>> >>>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>>> >>>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>>> And yes. As you said, we can make >>>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>>> 3) fastFree to call _aligned_free >>>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>>> >>>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>>> >>>>> So, due to (3), anyway, we need to have this concept. >>>> >>>> I see. Thanks for the explanation! >>>> >>>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >>> >>> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >>> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >>> >>> 1. fastMalloced memory can be freed by fastFree >>> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >>> >>> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". >> >> Ah! No, unfortunately :( >> "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, >> >> realloc(pointer, largerSize) >> => >> >> void* memory = malloc(largerSize); >> memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! >> free(pointer); >> >> But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". > > Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. > This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, > > 1. anyway, we already have a code handling misalignment for realloc-ed memory. > 2. we do not need to rely on this "realloc"'s platform behavior. > > I'll try to do this. I've tried, but it seems that this is also not so good than I thought. We need to add non-aligned version of allocation-related member functions to JSC::AlignedMemoryAllocator, Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>>>> +} >>>>>>> >>>>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>>>> >>>>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>>>> >>>>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>>>> And yes. As you said, we can make >>>>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>>>> 3) fastFree to call _aligned_free >>>>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>>>> >>>>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>>>> >>>>>> So, due to (3), anyway, we need to have this concept. >>>>> >>>>> I see. Thanks for the explanation! >>>>> >>>>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >>>> >>>> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >>>> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >>>> >>>> 1. fastMalloced memory can be freed by fastFree >>>> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >>>> >>>> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". >>> >>> Ah! No, unfortunately :( >>> "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, >>> >>> realloc(pointer, largerSize) >>> => >>> >>> void* memory = malloc(largerSize); >>> memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! >>> free(pointer); >>> >>> But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". >> >> Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. >> This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, >> >> 1. anyway, we already have a code handling misalignment for realloc-ed memory. >> 2. we do not need to rely on this "realloc"'s platform behavior. >> >> I'll try to do this. > > I've tried, but it seems that this is also not so good than I thought. We need to add non-aligned version of allocation-related member functions to JSC::AlignedMemoryAllocator, But it would be better. I'll continue trying. Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>>>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>>>>> +} >>>>>>>> >>>>>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>>>>> >>>>>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>>>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>>>>> >>>>>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>>>>> And yes. As you said, we can make >>>>>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>>>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>>>>> 3) fastFree to call _aligned_free >>>>>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>>>>> >>>>>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>>>>> >>>>>>> So, due to (3), anyway, we need to have this concept. >>>>>> >>>>>> I see. Thanks for the explanation! >>>>>> >>>>>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >>>>> >>>>> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >>>>> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >>>>> >>>>> 1. fastMalloced memory can be freed by fastFree >>>>> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >>>>> >>>>> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". >>>> >>>> Ah! No, unfortunately :( >>>> "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, >>>> >>>> realloc(pointer, largerSize) >>>> => >>>> >>>> void* memory = malloc(largerSize); >>>> memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! >>>> free(pointer); >>>> >>>> But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". >>> >>> Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. >>> This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, >>> >>> 1. anyway, we already have a code handling misalignment for realloc-ed memory. >>> 2. we do not need to rely on this "realloc"'s platform behavior. >>> >>> I'll try to do this. >> >> I've tried, but it seems that this is also not so good than I thought. We need to add non-aligned version of allocation-related member functions to JSC::AlignedMemoryAllocator, > > But it would be better. I'll continue trying. I've uploaded the patch. In this patch, I just stop using this platform-specific feature. LargeAllocation is always managed by malloc & free. Stop using posix_memalign. Then, we can use realloc safely in all the platforms. Created attachment 366030 [details]
Patch
Created attachment 366033 [details]
Patch
Created attachment 366034 [details]
Patch
Created attachment 366035 [details]
Patch
Created attachment 366039 [details]
Patch
Comment on attachment 366039 [details] Patch Attachment 366039 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/11678869 New failing tests: fast/visual-viewport/ios/min-scale-greater-than-one.html Created attachment 366044 [details]
Archive of layout-test-results from ews123 for ios-simulator-wk2
The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews123 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
Created attachment 366054 [details]
Patch
Comment on attachment 366054 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366054&action=review > Source/JavaScriptCore/heap/CompleteSubspace.cpp:167 > + ASSERT(oldAllocation->weakSet().isTriviallyDestructible()); How is this guaranteed? Why is this needed? > Source/JavaScriptCore/heap/CompleteSubspace.cpp:189 > + m_largeAllocations.append(oldAllocation); Why do we need to do this if we failed to allocate? > Source/JavaScriptCore/heap/LargeAllocation.cpp:77 > + // This includes padding at the end of the allocation to maintain the distancing constraint. We should remove this. > Source/JavaScriptCore/heap/LargeAllocation.cpp:105 > + // We have 4 patterns. This is super unfortunate if we have to do this memmove. Doesn’t Bmalloc have an aligned malloc? Can’t we just use that? Comment on attachment 366054 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366054&action=review >> Source/JavaScriptCore/heap/CompleteSubspace.cpp:167 >> + ASSERT(oldAllocation->weakSet().isTriviallyDestructible()); > > How is this guaranteed? Why is this needed? Weak<> only accept JSCell*, so Auxiliary CompleteSubspace never uses it. Since we perform "realloc" without calling destructors, this needs to be ensured. If we call a destructor in "realloc", recovering the oldAllocation becomes super tricky instead when "realloc" fails. >> Source/JavaScriptCore/heap/CompleteSubspace.cpp:189 >> + m_largeAllocations.append(oldAllocation); > > Why do we need to do this if we failed to allocate? Because we removed this from Subspace in LargeAllocation::tryReallocate. If we do not remove oldAllocation from the linked list in "tryReallocate", we miss the chance to remove oldAllocation from Subspace's doubly linked list when "realloc" succeeds since address may be changed by "realloc". >> Source/JavaScriptCore/heap/LargeAllocation.cpp:77 >> + // This includes padding at the end of the allocation to maintain the distancing constraint. > > We should remove this. I'll remove this in "distancing removing patch" side :) >> Source/JavaScriptCore/heap/LargeAllocation.cpp:105 >> + // We have 4 patterns. > > This is super unfortunate if we have to do this memmove. > > Doesn’t Bmalloc have an aligned malloc? Can’t we just use that? bmalloc has aligned malloc. But it is not sufficient... What we need is aligned realloc. and nobody has it! (except for Windows). Even if we use aligned malloc in the first malloc (LargeAllocation::tryCreate), we do not have any guarantee that realloc-ed malloc has the same alignment (this is explicitly described in `man posix_memalign`). So, anyway, we need this complicated code. Let's consider the following example, 1. 1st LargeAllocation is allocated from aligned-malloc (posix_memalign) 2. After that, this LargeAllocation is realloc-ed (This is only valid on Darwin platforms. In Windows, it crashes. In Linux, it works but no guarantee). 3. Then, when we again want to expand this LargeAllocation, we need this adjustment anyway since result in (2)'s realloc may not have an alignment specified in (1). If we have an API like, posix_memaligned_realloc or aligned_realloc thing, we can consistently use that. But it does not exist in Darwin unfortunately :( We can add it to bmalloc. But since system malloc does not have that, we cannot use it anyway here. Good news: In Darwin libmalloc, returned memory from malloc is always 16B aligned (undocumented, but this is correct in the current implementation IIRC). So basically, memmove does not happen if we use this code with system malloc in Darwin. Comment on attachment 366054 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366054&action=review >>> Source/JavaScriptCore/heap/CompleteSubspace.cpp:189 >>> + m_largeAllocations.append(oldAllocation); >> >> Why do we need to do this if we failed to allocate? > > Because we removed this from Subspace in LargeAllocation::tryReallocate. If we do not remove oldAllocation from the linked list in "tryReallocate", we miss the chance to remove oldAllocation from Subspace's doubly linked list when "realloc" succeeds since address may be changed by "realloc". I changed this code a bit: moving if (isOnList()) remove(); operations out of tryReallocate to make code cleaner. >>> Source/JavaScriptCore/heap/LargeAllocation.cpp:77 >>> + // This includes padding at the end of the allocation to maintain the distancing constraint. >> >> We should remove this. > > I'll remove this in "distancing removing patch" side :) Removed. So I'll update the patch. Created attachment 366331 [details]
Patch
It is ready. Comment on attachment 366331 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366331&action=review r=me > Source/JavaScriptCore/heap/AlignedMemoryAllocator.h:53 > + // Only some of carefully-designed allocator has this. “Only some carefully-designed allocators have this.” That said, not sure this comment adds much. It’s better to explain why > Source/JavaScriptCore/heap/LargeAllocation.cpp:70 > + size_t allocationSizeAndAlignmentAdjustment = headerSize() + size + halfAlignment; Nit: i think a better name is “alignmentAdjustedAllocationSize” > Source/JavaScriptCore/heap/LargeAllocation.cpp:76 > + bool oldAlignmentAdjusted = m_alignmentAdjusted; Nit: another name suggestion, I think this makes more sense as “adjusted alignment” instead of “alignment adjusted”, e.g “m_adjustedAlignment” Comment on attachment 366331 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366331&action=review Thank you! >> Source/JavaScriptCore/heap/AlignedMemoryAllocator.h:53 >> + // Only some of carefully-designed allocator has this. > > “Only some carefully-designed allocators have this.” > > That said, not sure this comment adds much. It’s better to explain why We do not add realloc features to IsoAlignedMemoryAllocator, this is simply because it is not necessary: IsoAlignedMemoryAllocator is used for IsoSubspace, so allocation size is always the same. It means that we never have "realloc" use case. Changed the comment to explain this. >> Source/JavaScriptCore/heap/LargeAllocation.cpp:70 >> + size_t allocationSizeAndAlignmentAdjustment = headerSize() + size + halfAlignment; > > Nit: i think a better name is “alignmentAdjustedAllocationSize” Sounds good. Changed. >> Source/JavaScriptCore/heap/LargeAllocation.cpp:76 >> + bool oldAlignmentAdjusted = m_alignmentAdjusted; > > Nit: another name suggestion, I think this makes more sense as “adjusted alignment” instead of “alignment adjusted”, e.g “m_adjustedAlignment” Sounds nice. Fixed. Committed r243688: <https://trac.webkit.org/changeset/243688> |