RESOLVED FIXED 196160
[JSC] Butterfly allocation from LargeAllocation should try "realloc" behavior if collector thread is not active
https://bugs.webkit.org/show_bug.cgi?id=196160
Summary [JSC] Butterfly allocation from LargeAllocation should try "realloc" behavior...
Yusuke Suzuki
Reported 2019-03-22 14:45:16 PDT
...
Attachments
Patch (15.09 KB, patch)
2019-03-22 17:08 PDT, Yusuke Suzuki
no flags
Patch (15.12 KB, patch)
2019-03-22 17:24 PDT, Yusuke Suzuki
no flags
Archive of layout-test-results from ews107 for mac-highsierra-wk2 (1.90 MB, application/zip)
2019-03-22 18:32 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews103 for mac-highsierra (1.39 MB, application/zip)
2019-03-22 18:36 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews112 for mac-highsierra (2.15 MB, application/zip)
2019-03-22 18:42 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews125 for ios-simulator-wk2 (10.76 MB, application/zip)
2019-03-22 19:21 PDT, EWS Watchlist
no flags
Patch (24.83 KB, patch)
2019-03-22 19:40 PDT, Yusuke Suzuki
no flags
Patch (24.96 KB, patch)
2019-03-22 19:44 PDT, Yusuke Suzuki
no flags
Patch (26.21 KB, patch)
2019-03-22 20:32 PDT, Yusuke Suzuki
no flags
Patch (26.27 KB, patch)
2019-03-22 21:17 PDT, Yusuke Suzuki
no flags
Archive of layout-test-results from ews205 for win-future (12.86 MB, application/zip)
2019-03-22 23:58 PDT, EWS Watchlist
no flags
Patch (30.39 KB, patch)
2019-03-23 00:05 PDT, Yusuke Suzuki
no flags
Patch (33.37 KB, patch)
2019-03-25 13:07 PDT, Yusuke Suzuki
no flags
Patch (33.37 KB, patch)
2019-03-25 13:45 PDT, Yusuke Suzuki
no flags
Patch (32.96 KB, patch)
2019-03-26 18:03 PDT, Yusuke Suzuki
no flags
Patch (32.96 KB, patch)
2019-03-26 18:27 PDT, Yusuke Suzuki
no flags
Patch (29.55 KB, patch)
2019-03-26 18:30 PDT, Yusuke Suzuki
no flags
Patch (33.26 KB, patch)
2019-03-26 18:35 PDT, Yusuke Suzuki
no flags
Patch (32.84 KB, patch)
2019-03-26 18:40 PDT, Yusuke Suzuki
no flags
Archive of layout-test-results from ews123 for ios-simulator-wk2 (2.59 MB, application/zip)
2019-03-26 20:56 PDT, EWS Watchlist
no flags
Patch (32.85 KB, patch)
2019-03-26 23:19 PDT, Yusuke Suzuki
no flags
Patch (31.93 KB, patch)
2019-03-29 17:26 PDT, Yusuke Suzuki
saam: review+
Yusuke Suzuki
Comment 1 2019-03-22 14:47:18 PDT
I don't think this hurts performance even if we make it by default. This is simply because allocating LargeAllocation should be already super slow. And this cost is amortized by the exponential size increase policy. But anyway, we can measure it. And if we hit some performance problems, we could do, 1. enable it only if sweepSynchronously() is true 2. find some way for mitigation I think we do not need to do this and believe it is performance-neutral.
Yusuke Suzuki
Comment 2 2019-03-22 17:08:04 PDT
Created attachment 365781 [details] Patch Start working
Yusuke Suzuki
Comment 3 2019-03-22 17:24:05 PDT
Created attachment 365786 [details] Patch more
EWS Watchlist
Comment 4 2019-03-22 18:32:33 PDT
Comment on attachment 365786 [details] Patch Attachment 365786 [details] did not pass mac-wk2-ews (mac-wk2): Output: https://webkit-queues.webkit.org/results/11619055 Number of test failures exceeded the failure limit.
EWS Watchlist
Comment 5 2019-03-22 18:32:35 PDT
Created attachment 365789 [details] Archive of layout-test-results from ews107 for mac-highsierra-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-highsierra-wk2 Platform: Mac OS X 10.13.6
EWS Watchlist
Comment 6 2019-03-22 18:36:40 PDT
Comment on attachment 365786 [details] Patch Attachment 365786 [details] did not pass mac-ews (mac): Output: https://webkit-queues.webkit.org/results/11619108 Number of test failures exceeded the failure limit.
EWS Watchlist
Comment 7 2019-03-22 18:36:42 PDT
Created attachment 365790 [details] Archive of layout-test-results from ews103 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-highsierra Platform: Mac OS X 10.13.6
EWS Watchlist
Comment 8 2019-03-22 18:42:36 PDT
Comment on attachment 365786 [details] Patch Attachment 365786 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/11619062 Number of test failures exceeded the failure limit.
EWS Watchlist
Comment 9 2019-03-22 18:42:37 PDT
Created attachment 365791 [details] Archive of layout-test-results from ews112 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews112 Port: mac-highsierra Platform: Mac OS X 10.13.6
EWS Watchlist
Comment 10 2019-03-22 19:21:48 PDT
Comment on attachment 365786 [details] Patch Attachment 365786 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/11619219 Number of test failures exceeded the failure limit.
EWS Watchlist
Comment 11 2019-03-22 19:21:50 PDT
Created attachment 365795 [details] Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
Yusuke Suzuki
Comment 12 2019-03-22 19:39:54 PDT
It turns out that "realloc" behavior is very important. While bmalloc uses "malloc", "memcpy" and "free" operations for "realloc" request, system malloc attempts to use in-place expansion.
Yusuke Suzuki
Comment 13 2019-03-22 19:40:11 PDT
Yusuke Suzuki
Comment 14 2019-03-22 19:44:52 PDT
Yusuke Suzuki
Comment 15 2019-03-22 20:32:26 PDT
Yusuke Suzuki
Comment 16 2019-03-22 21:17:47 PDT
Yusuke Suzuki
Comment 17 2019-03-22 23:43:58 PDT
(In reply to Yusuke Suzuki from comment #12) > It turns out that "realloc" behavior is very important. While bmalloc uses > "malloc", "memcpy" and "free" operations for "realloc" request, system > malloc attempts to use in-place expansion. Darwin libmalloc explicitly states that "posix_memalign + realloc is valid" in its man page. In glibc, there is no documentation, but it works. In bmalloc, it is valid. In windows, using realloc with _aligned_malloc is not valid. But it has _aligned_realloc. So if we want, we can use it.
EWS Watchlist
Comment 18 2019-03-22 23:58:29 PDT
Comment on attachment 365801 [details] Patch Attachment 365801 [details] did not pass win-ews (win): Output: https://webkit-queues.webkit.org/results/11620968 New failing tests: js/sort-large-array.html js/dom/put-direct-index-beyond-vector-length-resize.html js/array-iterate-backwards.html js/regress-155776.html fast/events/bogus-event-listener-invalidation.html js/promises-tests/promises-in-workers.html js/dom/array-with-double-assign.html fast/xpath/xpath-result-eventlistener-crash.html js/arrowfunction-lexical-bind-arguments-strict.html js/reentrant-caching.html js/dom/array-with-double-push.html editing/selection/move-left-right.html fast/workers/worker-gc2.html js/function-apply-many-args.html js/arrowfunction-lexical-bind-arguments-non-strict.html fast/dom/SelectorAPI/resig-SelectorsAPI-test.xhtml editing/selection/move-by-word-visually-crash-test-5.html fast/css/css-selector-deeply-nested.html fast/css/rule-selector-overflow.html js/dom/document-all-between-frames.html js/slow-stress/array-prototype-filter.html
EWS Watchlist
Comment 19 2019-03-22 23:58:41 PDT
Created attachment 365803 [details] Archive of layout-test-results from ews205 for win-future The attached test failures were seen while running run-webkit-tests on the win-ews. Bot: ews205 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Yusuke Suzuki
Comment 20 2019-03-23 00:05:46 PDT
Yusuke Suzuki
Comment 21 2019-03-25 13:07:20 PDT
Yusuke Suzuki
Comment 22 2019-03-25 13:45:47 PDT
Yusuke Suzuki
Comment 23 2019-03-25 20:19:16 PDT
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review > Source/JavaScriptCore/heap/LargeAllocation.cpp:81 > + if (isOnList()) > + remove(); > + unsigned originalCellSize = m_cellSize; > + > + void* space = subspace->alignedMemoryAllocator()->tryReallocate(this, sizeIncludingDistancingAndAlignmentAdjustment); > + if (!space) > + return nullptr; I do not want to call LargeAllocation's destructor here since we still use oldAllocation if tryReallocate fails.
Geoffrey Garen
Comment 24 2019-03-26 14:16:34 PDT
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review > Source/WTF/wtf/FastMalloc.cpp:271 > +bool canUseReallocOnToAlignedMallocedPointer() > +{ > +#if OS(DARWIN) > + // `man posix_memalign` explicitly states that this is allowed. > + return true; > +#else > + return false; > +#endif > +} Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc?
Yusuke Suzuki
Comment 25 2019-03-26 14:29:40 PDT
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >> Source/WTF/wtf/FastMalloc.cpp:271 >> +} > > Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. So, in Darwin, we can just return `true`, and the code is doing this. 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. And yes. As you said, we can make 1) fastMalloc to call _aligned_malloc with 8 or 16 2) fastRealloc to call _aligned_realloc with 8 or 16 3) fastFree to call _aligned_free But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. So, due to (3), anyway, we need to have this concept.
Geoffrey Garen
Comment 26 2019-03-26 16:09:39 PDT
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>> Source/WTF/wtf/FastMalloc.cpp:271 >>> +} >> >> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? > > 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. > So, in Darwin, we can just return `true`, and the code is doing this. > > 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. > And yes. As you said, we can make > 1) fastMalloc to call _aligned_malloc with 8 or 16 > 2) fastRealloc to call _aligned_realloc with 8 or 16 > 3) fastFree to call _aligned_free > But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. > > 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. > > So, due to (3), anyway, we need to have this concept. I see. Thanks for the explanation! I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.)
Yusuke Suzuki
Comment 27 2019-03-26 16:39:27 PDT
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>> +} >>> >>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >> >> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >> So, in Darwin, we can just return `true`, and the code is doing this. >> >> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >> And yes. As you said, we can make >> 1) fastMalloc to call _aligned_malloc with 8 or 16 >> 2) fastRealloc to call _aligned_realloc with 8 or 16 >> 3) fastFree to call _aligned_free >> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >> >> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >> >> So, due to (3), anyway, we need to have this concept. > > I see. Thanks for the explanation! > > I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. 1. fastMalloced memory can be freed by fastFree 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free".
Yusuke Suzuki
Comment 28 2019-03-26 16:50:56 PDT
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>> +} >>>> >>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>> >>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>> So, in Darwin, we can just return `true`, and the code is doing this. >>> >>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>> And yes. As you said, we can make >>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>> 3) fastFree to call _aligned_free >>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>> >>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>> >>> So, due to (3), anyway, we need to have this concept. >> >> I see. Thanks for the explanation! >> >> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) > > Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. > The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. > > 1. fastMalloced memory can be freed by fastFree > 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). > > This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". Ah! No, unfortunately :( "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, realloc(pointer, largerSize) => void* memory = malloc(largerSize); memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! free(pointer); But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer".
Yusuke Suzuki
Comment 29 2019-03-26 16:59:26 PDT
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>> +} >>>>> >>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>> >>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>> >>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>> And yes. As you said, we can make >>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>> 3) fastFree to call _aligned_free >>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>> >>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>> >>>> So, due to (3), anyway, we need to have this concept. >>> >>> I see. Thanks for the explanation! >>> >>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >> >> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >> >> 1. fastMalloced memory can be freed by fastFree >> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >> >> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". > > Ah! No, unfortunately :( > "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, > > realloc(pointer, largerSize) > => > > void* memory = malloc(largerSize); > memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! > free(pointer); > > But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, 1. anyway, we already have a code handling misalignment for realloc-ed memory. 2. we do not need to rely on this "realloc"'s platform behavior. I'll try to do this.
Yusuke Suzuki
Comment 30 2019-03-26 17:17:49 PDT
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>>> +} >>>>>> >>>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>>> >>>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>>> >>>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>>> And yes. As you said, we can make >>>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>>> 3) fastFree to call _aligned_free >>>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>>> >>>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>>> >>>>> So, due to (3), anyway, we need to have this concept. >>>> >>>> I see. Thanks for the explanation! >>>> >>>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >>> >>> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >>> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >>> >>> 1. fastMalloced memory can be freed by fastFree >>> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >>> >>> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". >> >> Ah! No, unfortunately :( >> "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, >> >> realloc(pointer, largerSize) >> => >> >> void* memory = malloc(largerSize); >> memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! >> free(pointer); >> >> But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". > > Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. > This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, > > 1. anyway, we already have a code handling misalignment for realloc-ed memory. > 2. we do not need to rely on this "realloc"'s platform behavior. > > I'll try to do this. I've tried, but it seems that this is also not so good than I thought. We need to add non-aligned version of allocation-related member functions to JSC::AlignedMemoryAllocator,
Yusuke Suzuki
Comment 31 2019-03-26 17:18:20 PDT
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>>>> +} >>>>>>> >>>>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>>>> >>>>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>>>> >>>>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>>>> And yes. As you said, we can make >>>>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>>>> 3) fastFree to call _aligned_free >>>>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>>>> >>>>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>>>> >>>>>> So, due to (3), anyway, we need to have this concept. >>>>> >>>>> I see. Thanks for the explanation! >>>>> >>>>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >>>> >>>> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >>>> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >>>> >>>> 1. fastMalloced memory can be freed by fastFree >>>> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >>>> >>>> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". >>> >>> Ah! No, unfortunately :( >>> "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, >>> >>> realloc(pointer, largerSize) >>> => >>> >>> void* memory = malloc(largerSize); >>> memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! >>> free(pointer); >>> >>> But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". >> >> Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. >> This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, >> >> 1. anyway, we already have a code handling misalignment for realloc-ed memory. >> 2. we do not need to rely on this "realloc"'s platform behavior. >> >> I'll try to do this. > > I've tried, but it seems that this is also not so good than I thought. We need to add non-aligned version of allocation-related member functions to JSC::AlignedMemoryAllocator, But it would be better. I'll continue trying.
Yusuke Suzuki
Comment 32 2019-03-26 18:00:50 PDT
Comment on attachment 365893 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=365893&action=review >>>>>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>>>>> +} >>>>>>>> >>>>>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>>>>> >>>>>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>>>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>>>>> >>>>>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>>>>> And yes. As you said, we can make >>>>>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>>>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>>>>> 3) fastFree to call _aligned_free >>>>>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>>>>> >>>>>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>>>>> >>>>>>> So, due to (3), anyway, we need to have this concept. >>>>>> >>>>>> I see. Thanks for the explanation! >>>>>> >>>>>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >>>>> >>>>> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >>>>> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >>>>> >>>>> 1. fastMalloced memory can be freed by fastFree >>>>> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >>>>> >>>>> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". >>>> >>>> Ah! No, unfortunately :( >>>> "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, >>>> >>>> realloc(pointer, largerSize) >>>> => >>>> >>>> void* memory = malloc(largerSize); >>>> memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! >>>> free(pointer); >>>> >>>> But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". >>> >>> Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. >>> This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, >>> >>> 1. anyway, we already have a code handling misalignment for realloc-ed memory. >>> 2. we do not need to rely on this "realloc"'s platform behavior. >>> >>> I'll try to do this. >> >> I've tried, but it seems that this is also not so good than I thought. We need to add non-aligned version of allocation-related member functions to JSC::AlignedMemoryAllocator, > > But it would be better. I'll continue trying. I've uploaded the patch. In this patch, I just stop using this platform-specific feature. LargeAllocation is always managed by malloc & free. Stop using posix_memalign. Then, we can use realloc safely in all the platforms.
Yusuke Suzuki
Comment 33 2019-03-26 18:03:05 PDT
Yusuke Suzuki
Comment 34 2019-03-26 18:27:45 PDT
Yusuke Suzuki
Comment 35 2019-03-26 18:30:25 PDT
Yusuke Suzuki
Comment 36 2019-03-26 18:35:21 PDT
Yusuke Suzuki
Comment 37 2019-03-26 18:40:28 PDT
EWS Watchlist
Comment 38 2019-03-26 20:56:51 PDT
Comment on attachment 366039 [details] Patch Attachment 366039 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: https://webkit-queues.webkit.org/results/11678869 New failing tests: fast/visual-viewport/ios/min-scale-greater-than-one.html
EWS Watchlist
Comment 39 2019-03-26 20:56:54 PDT
Created attachment 366044 [details] Archive of layout-test-results from ews123 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews123 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
Yusuke Suzuki
Comment 40 2019-03-26 23:19:40 PDT
Saam Barati
Comment 41 2019-03-28 22:56:36 PDT
Comment on attachment 366054 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366054&action=review > Source/JavaScriptCore/heap/CompleteSubspace.cpp:167 > + ASSERT(oldAllocation->weakSet().isTriviallyDestructible()); How is this guaranteed? Why is this needed? > Source/JavaScriptCore/heap/CompleteSubspace.cpp:189 > + m_largeAllocations.append(oldAllocation); Why do we need to do this if we failed to allocate? > Source/JavaScriptCore/heap/LargeAllocation.cpp:77 > + // This includes padding at the end of the allocation to maintain the distancing constraint. We should remove this. > Source/JavaScriptCore/heap/LargeAllocation.cpp:105 > + // We have 4 patterns. This is super unfortunate if we have to do this memmove. Doesn’t Bmalloc have an aligned malloc? Can’t we just use that?
Yusuke Suzuki
Comment 42 2019-03-29 00:39:55 PDT
Comment on attachment 366054 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366054&action=review >> Source/JavaScriptCore/heap/CompleteSubspace.cpp:167 >> + ASSERT(oldAllocation->weakSet().isTriviallyDestructible()); > > How is this guaranteed? Why is this needed? Weak<> only accept JSCell*, so Auxiliary CompleteSubspace never uses it. Since we perform "realloc" without calling destructors, this needs to be ensured. If we call a destructor in "realloc", recovering the oldAllocation becomes super tricky instead when "realloc" fails. >> Source/JavaScriptCore/heap/CompleteSubspace.cpp:189 >> + m_largeAllocations.append(oldAllocation); > > Why do we need to do this if we failed to allocate? Because we removed this from Subspace in LargeAllocation::tryReallocate. If we do not remove oldAllocation from the linked list in "tryReallocate", we miss the chance to remove oldAllocation from Subspace's doubly linked list when "realloc" succeeds since address may be changed by "realloc". >> Source/JavaScriptCore/heap/LargeAllocation.cpp:77 >> + // This includes padding at the end of the allocation to maintain the distancing constraint. > > We should remove this. I'll remove this in "distancing removing patch" side :) >> Source/JavaScriptCore/heap/LargeAllocation.cpp:105 >> + // We have 4 patterns. > > This is super unfortunate if we have to do this memmove. > > Doesn’t Bmalloc have an aligned malloc? Can’t we just use that? bmalloc has aligned malloc. But it is not sufficient... What we need is aligned realloc. and nobody has it! (except for Windows). Even if we use aligned malloc in the first malloc (LargeAllocation::tryCreate), we do not have any guarantee that realloc-ed malloc has the same alignment (this is explicitly described in `man posix_memalign`). So, anyway, we need this complicated code. Let's consider the following example, 1. 1st LargeAllocation is allocated from aligned-malloc (posix_memalign) 2. After that, this LargeAllocation is realloc-ed (This is only valid on Darwin platforms. In Windows, it crashes. In Linux, it works but no guarantee). 3. Then, when we again want to expand this LargeAllocation, we need this adjustment anyway since result in (2)'s realloc may not have an alignment specified in (1). If we have an API like, posix_memaligned_realloc or aligned_realloc thing, we can consistently use that. But it does not exist in Darwin unfortunately :( We can add it to bmalloc. But since system malloc does not have that, we cannot use it anyway here. Good news: In Darwin libmalloc, returned memory from malloc is always 16B aligned (undocumented, but this is correct in the current implementation IIRC). So basically, memmove does not happen if we use this code with system malloc in Darwin.
Yusuke Suzuki
Comment 43 2019-03-29 17:13:39 PDT
Comment on attachment 366054 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366054&action=review >>> Source/JavaScriptCore/heap/CompleteSubspace.cpp:189 >>> + m_largeAllocations.append(oldAllocation); >> >> Why do we need to do this if we failed to allocate? > > Because we removed this from Subspace in LargeAllocation::tryReallocate. If we do not remove oldAllocation from the linked list in "tryReallocate", we miss the chance to remove oldAllocation from Subspace's doubly linked list when "realloc" succeeds since address may be changed by "realloc". I changed this code a bit: moving if (isOnList()) remove(); operations out of tryReallocate to make code cleaner. >>> Source/JavaScriptCore/heap/LargeAllocation.cpp:77 >>> + // This includes padding at the end of the allocation to maintain the distancing constraint. >> >> We should remove this. > > I'll remove this in "distancing removing patch" side :) Removed. So I'll update the patch.
Yusuke Suzuki
Comment 44 2019-03-29 17:26:18 PDT
Yusuke Suzuki
Comment 45 2019-03-29 18:18:31 PDT
It is ready.
Saam Barati
Comment 46 2019-03-31 16:44:36 PDT
Comment on attachment 366331 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366331&action=review r=me > Source/JavaScriptCore/heap/AlignedMemoryAllocator.h:53 > + // Only some of carefully-designed allocator has this. “Only some carefully-designed allocators have this.” That said, not sure this comment adds much. It’s better to explain why > Source/JavaScriptCore/heap/LargeAllocation.cpp:70 > + size_t allocationSizeAndAlignmentAdjustment = headerSize() + size + halfAlignment; Nit: i think a better name is “alignmentAdjustedAllocationSize” > Source/JavaScriptCore/heap/LargeAllocation.cpp:76 > + bool oldAlignmentAdjusted = m_alignmentAdjusted; Nit: another name suggestion, I think this makes more sense as “adjusted alignment” instead of “alignment adjusted”, e.g “m_adjustedAlignment”
Yusuke Suzuki
Comment 47 2019-03-31 23:40:36 PDT
Comment on attachment 366331 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=366331&action=review Thank you! >> Source/JavaScriptCore/heap/AlignedMemoryAllocator.h:53 >> + // Only some of carefully-designed allocator has this. > > “Only some carefully-designed allocators have this.” > > That said, not sure this comment adds much. It’s better to explain why We do not add realloc features to IsoAlignedMemoryAllocator, this is simply because it is not necessary: IsoAlignedMemoryAllocator is used for IsoSubspace, so allocation size is always the same. It means that we never have "realloc" use case. Changed the comment to explain this. >> Source/JavaScriptCore/heap/LargeAllocation.cpp:70 >> + size_t allocationSizeAndAlignmentAdjustment = headerSize() + size + halfAlignment; > > Nit: i think a better name is “alignmentAdjustedAllocationSize” Sounds good. Changed. >> Source/JavaScriptCore/heap/LargeAllocation.cpp:76 >> + bool oldAlignmentAdjusted = m_alignmentAdjusted; > > Nit: another name suggestion, I think this makes more sense as “adjusted alignment” instead of “alignment adjusted”, e.g “m_adjustedAlignment” Sounds nice. Fixed.
Yusuke Suzuki
Comment 48 2019-03-31 23:51:22 PDT
Radar WebKit Bug Importer
Comment 49 2019-03-31 23:54:45 PDT
Note You need to log in before you can comment on or make changes to this bug.