WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
196160
[JSC] Butterfly allocation from LargeAllocation should try "realloc" behavior if collector thread is not active
https://bugs.webkit.org/show_bug.cgi?id=196160
Summary
[JSC] Butterfly allocation from LargeAllocation should try "realloc" behavior...
Yusuke Suzuki
Reported
2019-03-22 14:45:16 PDT
...
Attachments
Patch
(15.09 KB, patch)
2019-03-22 17:08 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(15.12 KB, patch)
2019-03-22 17:24 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews107 for mac-highsierra-wk2
(1.90 MB, application/zip)
2019-03-22 18:32 PDT
,
EWS Watchlist
no flags
Details
Archive of layout-test-results from ews103 for mac-highsierra
(1.39 MB, application/zip)
2019-03-22 18:36 PDT
,
EWS Watchlist
no flags
Details
Archive of layout-test-results from ews112 for mac-highsierra
(2.15 MB, application/zip)
2019-03-22 18:42 PDT
,
EWS Watchlist
no flags
Details
Archive of layout-test-results from ews125 for ios-simulator-wk2
(10.76 MB, application/zip)
2019-03-22 19:21 PDT
,
EWS Watchlist
no flags
Details
Patch
(24.83 KB, patch)
2019-03-22 19:40 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(24.96 KB, patch)
2019-03-22 19:44 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(26.21 KB, patch)
2019-03-22 20:32 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(26.27 KB, patch)
2019-03-22 21:17 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews205 for win-future
(12.86 MB, application/zip)
2019-03-22 23:58 PDT
,
EWS Watchlist
no flags
Details
Patch
(30.39 KB, patch)
2019-03-23 00:05 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(33.37 KB, patch)
2019-03-25 13:07 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(33.37 KB, patch)
2019-03-25 13:45 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(32.96 KB, patch)
2019-03-26 18:03 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(32.96 KB, patch)
2019-03-26 18:27 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(29.55 KB, patch)
2019-03-26 18:30 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(33.26 KB, patch)
2019-03-26 18:35 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(32.84 KB, patch)
2019-03-26 18:40 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews123 for ios-simulator-wk2
(2.59 MB, application/zip)
2019-03-26 20:56 PDT
,
EWS Watchlist
no flags
Details
Patch
(32.85 KB, patch)
2019-03-26 23:19 PDT
,
Yusuke Suzuki
no flags
Details
Formatted Diff
Diff
Patch
(31.93 KB, patch)
2019-03-29 17:26 PDT
,
Yusuke Suzuki
saam
: review+
Details
Formatted Diff
Diff
Show Obsolete
(15)
View All
Add attachment
proposed patch, testcase, etc.
Yusuke Suzuki
Comment 1
2019-03-22 14:47:18 PDT
I don't think this hurts performance even if we make it by default. This is simply because allocating LargeAllocation should be already super slow. And this cost is amortized by the exponential size increase policy. But anyway, we can measure it. And if we hit some performance problems, we could do, 1. enable it only if sweepSynchronously() is true 2. find some way for mitigation I think we do not need to do this and believe it is performance-neutral.
Yusuke Suzuki
Comment 2
2019-03-22 17:08:04 PDT
Created
attachment 365781
[details]
Patch Start working
Yusuke Suzuki
Comment 3
2019-03-22 17:24:05 PDT
Created
attachment 365786
[details]
Patch more
EWS Watchlist
Comment 4
2019-03-22 18:32:33 PDT
Comment on
attachment 365786
[details]
Patch
Attachment 365786
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
https://webkit-queues.webkit.org/results/11619055
Number of test failures exceeded the failure limit.
EWS Watchlist
Comment 5
2019-03-22 18:32:35 PDT
Created
attachment 365789
[details]
Archive of layout-test-results from ews107 for mac-highsierra-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-highsierra-wk2 Platform: Mac OS X 10.13.6
EWS Watchlist
Comment 6
2019-03-22 18:36:40 PDT
Comment on
attachment 365786
[details]
Patch
Attachment 365786
[details]
did not pass mac-ews (mac): Output:
https://webkit-queues.webkit.org/results/11619108
Number of test failures exceeded the failure limit.
EWS Watchlist
Comment 7
2019-03-22 18:36:42 PDT
Created
attachment 365790
[details]
Archive of layout-test-results from ews103 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-highsierra Platform: Mac OS X 10.13.6
EWS Watchlist
Comment 8
2019-03-22 18:42:36 PDT
Comment on
attachment 365786
[details]
Patch
Attachment 365786
[details]
did not pass mac-debug-ews (mac): Output:
https://webkit-queues.webkit.org/results/11619062
Number of test failures exceeded the failure limit.
EWS Watchlist
Comment 9
2019-03-22 18:42:37 PDT
Created
attachment 365791
[details]
Archive of layout-test-results from ews112 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews112 Port: mac-highsierra Platform: Mac OS X 10.13.6
EWS Watchlist
Comment 10
2019-03-22 19:21:48 PDT
Comment on
attachment 365786
[details]
Patch
Attachment 365786
[details]
did not pass ios-sim-ews (ios-simulator-wk2): Output:
https://webkit-queues.webkit.org/results/11619219
Number of test failures exceeded the failure limit.
EWS Watchlist
Comment 11
2019-03-22 19:21:50 PDT
Created
attachment 365795
[details]
Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
Yusuke Suzuki
Comment 12
2019-03-22 19:39:54 PDT
It turns out that "realloc" behavior is very important. While bmalloc uses "malloc", "memcpy" and "free" operations for "realloc" request, system malloc attempts to use in-place expansion.
Yusuke Suzuki
Comment 13
2019-03-22 19:40:11 PDT
Created
attachment 365796
[details]
Patch
Yusuke Suzuki
Comment 14
2019-03-22 19:44:52 PDT
Created
attachment 365797
[details]
Patch
Yusuke Suzuki
Comment 15
2019-03-22 20:32:26 PDT
Created
attachment 365800
[details]
Patch
Yusuke Suzuki
Comment 16
2019-03-22 21:17:47 PDT
Created
attachment 365801
[details]
Patch
Yusuke Suzuki
Comment 17
2019-03-22 23:43:58 PDT
(In reply to Yusuke Suzuki from
comment #12
)
> It turns out that "realloc" behavior is very important. While bmalloc uses > "malloc", "memcpy" and "free" operations for "realloc" request, system > malloc attempts to use in-place expansion.
Darwin libmalloc explicitly states that "posix_memalign + realloc is valid" in its man page. In glibc, there is no documentation, but it works. In bmalloc, it is valid. In windows, using realloc with _aligned_malloc is not valid. But it has _aligned_realloc. So if we want, we can use it.
EWS Watchlist
Comment 18
2019-03-22 23:58:29 PDT
Comment on
attachment 365801
[details]
Patch
Attachment 365801
[details]
did not pass win-ews (win): Output:
https://webkit-queues.webkit.org/results/11620968
New failing tests: js/sort-large-array.html js/dom/put-direct-index-beyond-vector-length-resize.html js/array-iterate-backwards.html js/regress-155776.html fast/events/bogus-event-listener-invalidation.html js/promises-tests/promises-in-workers.html js/dom/array-with-double-assign.html fast/xpath/xpath-result-eventlistener-crash.html js/arrowfunction-lexical-bind-arguments-strict.html js/reentrant-caching.html js/dom/array-with-double-push.html editing/selection/move-left-right.html fast/workers/worker-gc2.html js/function-apply-many-args.html js/arrowfunction-lexical-bind-arguments-non-strict.html fast/dom/SelectorAPI/resig-SelectorsAPI-test.xhtml editing/selection/move-by-word-visually-crash-test-5.html fast/css/css-selector-deeply-nested.html fast/css/rule-selector-overflow.html js/dom/document-all-between-frames.html js/slow-stress/array-prototype-filter.html
EWS Watchlist
Comment 19
2019-03-22 23:58:41 PDT
Created
attachment 365803
[details]
Archive of layout-test-results from ews205 for win-future The attached test failures were seen while running run-webkit-tests on the win-ews. Bot: ews205 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Yusuke Suzuki
Comment 20
2019-03-23 00:05:46 PDT
Created
attachment 365805
[details]
Patch
Yusuke Suzuki
Comment 21
2019-03-25 13:07:20 PDT
Created
attachment 365888
[details]
Patch
Yusuke Suzuki
Comment 22
2019-03-25 13:45:47 PDT
Created
attachment 365893
[details]
Patch
Yusuke Suzuki
Comment 23
2019-03-25 20:19:16 PDT
Comment on
attachment 365893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=365893&action=review
> Source/JavaScriptCore/heap/LargeAllocation.cpp:81 > + if (isOnList()) > + remove(); > + unsigned originalCellSize = m_cellSize; > + > + void* space = subspace->alignedMemoryAllocator()->tryReallocate(this, sizeIncludingDistancingAndAlignmentAdjustment); > + if (!space) > + return nullptr;
I do not want to call LargeAllocation's destructor here since we still use oldAllocation if tryReallocate fails.
Geoffrey Garen
Comment 24
2019-03-26 14:16:34 PDT
Comment on
attachment 365893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=365893&action=review
> Source/WTF/wtf/FastMalloc.cpp:271 > +bool canUseReallocOnToAlignedMallocedPointer() > +{ > +#if OS(DARWIN) > + // `man posix_memalign` explicitly states that this is allowed. > + return true; > +#else > + return false; > +#endif > +}
Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc?
Yusuke Suzuki
Comment 25
2019-03-26 14:29:40 PDT
Comment on
attachment 365893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=365893&action=review
>> Source/WTF/wtf/FastMalloc.cpp:271 >> +} > > Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc?
1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. So, in Darwin, we can just return `true`, and the code is doing this. 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. And yes. As you said, we can make 1) fastMalloc to call _aligned_malloc with 8 or 16 2) fastRealloc to call _aligned_realloc with 8 or 16 3) fastFree to call _aligned_free But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. So, due to (3), anyway, we need to have this concept.
Geoffrey Garen
Comment 26
2019-03-26 16:09:39 PDT
Comment on
attachment 365893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=365893&action=review
>>> Source/WTF/wtf/FastMalloc.cpp:271 >>> +} >> >> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? > > 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. > So, in Darwin, we can just return `true`, and the code is doing this. > > 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. > And yes. As you said, we can make > 1) fastMalloc to call _aligned_malloc with 8 or 16 > 2) fastRealloc to call _aligned_realloc with 8 or 16 > 3) fastFree to call _aligned_free > But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. > > 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. > > So, due to (3), anyway, we need to have this concept.
I see. Thanks for the explanation! I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.)
Yusuke Suzuki
Comment 27
2019-03-26 16:39:27 PDT
Comment on
attachment 365893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=365893&action=review
>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>> +} >>> >>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >> >> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >> So, in Darwin, we can just return `true`, and the code is doing this. >> >> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >> And yes. As you said, we can make >> 1) fastMalloc to call _aligned_malloc with 8 or 16 >> 2) fastRealloc to call _aligned_realloc with 8 or 16 >> 3) fastFree to call _aligned_free >> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >> >> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >> >> So, due to (3), anyway, we need to have this concept. > > I see. Thanks for the explanation! > > I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.)
Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. 1. fastMalloced memory can be freed by fastFree 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free".
Yusuke Suzuki
Comment 28
2019-03-26 16:50:56 PDT
Comment on
attachment 365893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=365893&action=review
>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>> +} >>>> >>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>> >>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>> So, in Darwin, we can just return `true`, and the code is doing this. >>> >>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>> And yes. As you said, we can make >>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>> 3) fastFree to call _aligned_free >>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>> >>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>> >>> So, due to (3), anyway, we need to have this concept. >> >> I see. Thanks for the explanation! >> >> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) > > Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. > The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. > > 1. fastMalloced memory can be freed by fastFree > 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). > > This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free".
Ah! No, unfortunately :( "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, realloc(pointer, largerSize) => void* memory = malloc(largerSize); memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! free(pointer); But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer".
Yusuke Suzuki
Comment 29
2019-03-26 16:59:26 PDT
Comment on
attachment 365893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=365893&action=review
>>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>> +} >>>>> >>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>> >>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>> >>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>> And yes. As you said, we can make >>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>> 3) fastFree to call _aligned_free >>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>> >>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>> >>>> So, due to (3), anyway, we need to have this concept. >>> >>> I see. Thanks for the explanation! >>> >>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >> >> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >> >> 1. fastMalloced memory can be freed by fastFree >> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >> >> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". > > Ah! No, unfortunately :( > "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, > > realloc(pointer, largerSize) > => > > void* memory = malloc(largerSize); > memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! > free(pointer); > > But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer".
Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, 1. anyway, we already have a code handling misalignment for realloc-ed memory. 2. we do not need to rely on this "realloc"'s platform behavior. I'll try to do this.
Yusuke Suzuki
Comment 30
2019-03-26 17:17:49 PDT
Comment on
attachment 365893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=365893&action=review
>>>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>>> +} >>>>>> >>>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>>> >>>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>>> >>>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>>> And yes. As you said, we can make >>>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>>> 3) fastFree to call _aligned_free >>>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>>> >>>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>>> >>>>> So, due to (3), anyway, we need to have this concept. >>>> >>>> I see. Thanks for the explanation! >>>> >>>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >>> >>> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >>> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >>> >>> 1. fastMalloced memory can be freed by fastFree >>> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >>> >>> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". >> >> Ah! No, unfortunately :( >> "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, >> >> realloc(pointer, largerSize) >> => >> >> void* memory = malloc(largerSize); >> memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! >> free(pointer); >> >> But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". > > Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. > This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, > > 1. anyway, we already have a code handling misalignment for realloc-ed memory. > 2. we do not need to rely on this "realloc"'s platform behavior. > > I'll try to do this.
I've tried, but it seems that this is also not so good than I thought. We need to add non-aligned version of allocation-related member functions to JSC::AlignedMemoryAllocator,
Yusuke Suzuki
Comment 31
2019-03-26 17:18:20 PDT
Comment on
attachment 365893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=365893&action=review
>>>>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>>>> +} >>>>>>> >>>>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>>>> >>>>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>>>> >>>>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>>>> And yes. As you said, we can make >>>>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>>>> 3) fastFree to call _aligned_free >>>>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>>>> >>>>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>>>> >>>>>> So, due to (3), anyway, we need to have this concept. >>>>> >>>>> I see. Thanks for the explanation! >>>>> >>>>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >>>> >>>> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >>>> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >>>> >>>> 1. fastMalloced memory can be freed by fastFree >>>> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >>>> >>>> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". >>> >>> Ah! No, unfortunately :( >>> "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, >>> >>> realloc(pointer, largerSize) >>> => >>> >>> void* memory = malloc(largerSize); >>> memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! >>> free(pointer); >>> >>> But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". >> >> Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. >> This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, >> >> 1. anyway, we already have a code handling misalignment for realloc-ed memory. >> 2. we do not need to rely on this "realloc"'s platform behavior. >> >> I'll try to do this. > > I've tried, but it seems that this is also not so good than I thought. We need to add non-aligned version of allocation-related member functions to JSC::AlignedMemoryAllocator,
But it would be better. I'll continue trying.
Yusuke Suzuki
Comment 32
2019-03-26 18:00:50 PDT
Comment on
attachment 365893
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=365893&action=review
>>>>>>>>> Source/WTF/wtf/FastMalloc.cpp:271 >>>>>>>>> +} >>>>>>>> >>>>>>>> Do we really need this concept? Which OS will fail a realloc on an aligned malloc? Is it just Windows? If so, can the Windows FastMalloc path just add a call to _aligned_realloc? >>>>>>> >>>>>>> 1. In Darwin libmalloc, it succeeds and it is guaranteed, and documented in `man`. >>>>>>> So, in Darwin, we can just return `true`, and the code is doing this. >>>>>>> >>>>>>> 2. In Windows, _aligned_malloc & realloc fail. _aligned_malloc & _aligned_realloc succeed. >>>>>>> And yes. As you said, we can make >>>>>>> 1) fastMalloc to call _aligned_malloc with 8 or 16 >>>>>>> 2) fastRealloc to call _aligned_realloc with 8 or 16 >>>>>>> 3) fastFree to call _aligned_free >>>>>>> But not sure about the impact in terms of performance, but I think it does not have an impact as long as we use 8 or 16 for default alignment. >>>>>>> >>>>>>> 3. In Linux GLibC, it succeeds, but it is not guaranteed. It depends on the current implementation. In the other libc, it depends on the implementation of each Libc. >>>>>>> >>>>>>> So, due to (3), anyway, we need to have this concept. >>>>>> >>>>>> I see. Thanks for the explanation! >>>>>> >>>>>> I wonder if it would be better to have a Windows path in FastMalloc (to call _aligned_realloc) and libc path in the non-Darwin DebugHeap (to call posix_memalign and memcpy). Would that work? (It's nice for client code when FastMalloc can paper over these platform differences.) >>>>> >>>>> Sounds like a good idea! If we change Windows to always using _aligned_malloc / _aligned_realloc / _aligned_free, we can completely hide this concept behind the FastMalloc. >>>>> The problem was Windows' fastAlignedMalloc and fastAlignedFree. We cannot call realloc and free on fastAlignedMalloc memory, and this problem leaded to this complexity. But if we change Windows' fastMalloc / fastFree implementations to always using _aligned_malloc etc. we can meet the following features. >>>>> >>>>> 1. fastMalloced memory can be freed by fastFree >>>>> 2. fastAlignedMalloced memory can be freed by fastFree (this was not possible, and it prevented us from introducing fallback implementation to fastRealloc (we need to perform "free" onto the given memory, but we do not know whether the memory is allocated from fastMalloc or fastAlignedMalloc)). >>>>> >>>>> This allows us to implement fastRealloc for the non Darwin & non Windows environments as "malloc, memcpy and free". >>>> >>>> Ah! No, unfortunately :( >>>> "realloc" only takes (1) a pointer to the previously allocated memory and (2) size for newly allocated memory. But we need to know the size of (1)'s memory region, since, >>>> >>>> realloc(pointer, largerSize) >>>> => >>>> >>>> void* memory = malloc(largerSize); >>>> memcpy(memory, pointer, min(largerSize, sizeOfPointer)); // !!! >>>> free(pointer); >>>> >>>> But we do not know "sizeOfPointer" here. We can do that if we pass the allocated memory size to "realloc" function, but it adds additional complexity, and I think it is more complex and error prone than checking "canUseReallocOnToAlignedMallocedPointer". >>> >>> Another way is stopping using "posix_memalign" for allocating LargeAllocation in JSC GC. Use "malloc" and adjust the pointer to align it to 16B. >>> This way, we can always use "realloc" onto LargeAllocation since we know that this is allocated by "malloc". I think it would be cleaner since, >>> >>> 1. anyway, we already have a code handling misalignment for realloc-ed memory. >>> 2. we do not need to rely on this "realloc"'s platform behavior. >>> >>> I'll try to do this. >> >> I've tried, but it seems that this is also not so good than I thought. We need to add non-aligned version of allocation-related member functions to JSC::AlignedMemoryAllocator, > > But it would be better. I'll continue trying.
I've uploaded the patch. In this patch, I just stop using this platform-specific feature. LargeAllocation is always managed by malloc & free. Stop using posix_memalign. Then, we can use realloc safely in all the platforms.
Yusuke Suzuki
Comment 33
2019-03-26 18:03:05 PDT
Created
attachment 366030
[details]
Patch
Yusuke Suzuki
Comment 34
2019-03-26 18:27:45 PDT
Created
attachment 366033
[details]
Patch
Yusuke Suzuki
Comment 35
2019-03-26 18:30:25 PDT
Created
attachment 366034
[details]
Patch
Yusuke Suzuki
Comment 36
2019-03-26 18:35:21 PDT
Created
attachment 366035
[details]
Patch
Yusuke Suzuki
Comment 37
2019-03-26 18:40:28 PDT
Created
attachment 366039
[details]
Patch
EWS Watchlist
Comment 38
2019-03-26 20:56:51 PDT
Comment on
attachment 366039
[details]
Patch
Attachment 366039
[details]
did not pass ios-sim-ews (ios-simulator-wk2): Output:
https://webkit-queues.webkit.org/results/11678869
New failing tests: fast/visual-viewport/ios/min-scale-greater-than-one.html
EWS Watchlist
Comment 39
2019-03-26 20:56:54 PDT
Created
attachment 366044
[details]
Archive of layout-test-results from ews123 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews123 Port: ios-simulator-wk2 Platform: Mac OS X 10.13.6
Yusuke Suzuki
Comment 40
2019-03-26 23:19:40 PDT
Created
attachment 366054
[details]
Patch
Saam Barati
Comment 41
2019-03-28 22:56:36 PDT
Comment on
attachment 366054
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=366054&action=review
> Source/JavaScriptCore/heap/CompleteSubspace.cpp:167 > + ASSERT(oldAllocation->weakSet().isTriviallyDestructible());
How is this guaranteed? Why is this needed?
> Source/JavaScriptCore/heap/CompleteSubspace.cpp:189 > + m_largeAllocations.append(oldAllocation);
Why do we need to do this if we failed to allocate?
> Source/JavaScriptCore/heap/LargeAllocation.cpp:77 > + // This includes padding at the end of the allocation to maintain the distancing constraint.
We should remove this.
> Source/JavaScriptCore/heap/LargeAllocation.cpp:105 > + // We have 4 patterns.
This is super unfortunate if we have to do this memmove. Doesn’t Bmalloc have an aligned malloc? Can’t we just use that?
Yusuke Suzuki
Comment 42
2019-03-29 00:39:55 PDT
Comment on
attachment 366054
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=366054&action=review
>> Source/JavaScriptCore/heap/CompleteSubspace.cpp:167 >> + ASSERT(oldAllocation->weakSet().isTriviallyDestructible()); > > How is this guaranteed? Why is this needed?
Weak<> only accept JSCell*, so Auxiliary CompleteSubspace never uses it. Since we perform "realloc" without calling destructors, this needs to be ensured. If we call a destructor in "realloc", recovering the oldAllocation becomes super tricky instead when "realloc" fails.
>> Source/JavaScriptCore/heap/CompleteSubspace.cpp:189 >> + m_largeAllocations.append(oldAllocation); > > Why do we need to do this if we failed to allocate?
Because we removed this from Subspace in LargeAllocation::tryReallocate. If we do not remove oldAllocation from the linked list in "tryReallocate", we miss the chance to remove oldAllocation from Subspace's doubly linked list when "realloc" succeeds since address may be changed by "realloc".
>> Source/JavaScriptCore/heap/LargeAllocation.cpp:77 >> + // This includes padding at the end of the allocation to maintain the distancing constraint. > > We should remove this.
I'll remove this in "distancing removing patch" side :)
>> Source/JavaScriptCore/heap/LargeAllocation.cpp:105 >> + // We have 4 patterns. > > This is super unfortunate if we have to do this memmove. > > Doesn’t Bmalloc have an aligned malloc? Can’t we just use that?
bmalloc has aligned malloc. But it is not sufficient... What we need is aligned realloc. and nobody has it! (except for Windows). Even if we use aligned malloc in the first malloc (LargeAllocation::tryCreate), we do not have any guarantee that realloc-ed malloc has the same alignment (this is explicitly described in `man posix_memalign`). So, anyway, we need this complicated code. Let's consider the following example, 1. 1st LargeAllocation is allocated from aligned-malloc (posix_memalign) 2. After that, this LargeAllocation is realloc-ed (This is only valid on Darwin platforms. In Windows, it crashes. In Linux, it works but no guarantee). 3. Then, when we again want to expand this LargeAllocation, we need this adjustment anyway since result in (2)'s realloc may not have an alignment specified in (1). If we have an API like, posix_memaligned_realloc or aligned_realloc thing, we can consistently use that. But it does not exist in Darwin unfortunately :( We can add it to bmalloc. But since system malloc does not have that, we cannot use it anyway here. Good news: In Darwin libmalloc, returned memory from malloc is always 16B aligned (undocumented, but this is correct in the current implementation IIRC). So basically, memmove does not happen if we use this code with system malloc in Darwin.
Yusuke Suzuki
Comment 43
2019-03-29 17:13:39 PDT
Comment on
attachment 366054
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=366054&action=review
>>> Source/JavaScriptCore/heap/CompleteSubspace.cpp:189 >>> + m_largeAllocations.append(oldAllocation); >> >> Why do we need to do this if we failed to allocate? > > Because we removed this from Subspace in LargeAllocation::tryReallocate. If we do not remove oldAllocation from the linked list in "tryReallocate", we miss the chance to remove oldAllocation from Subspace's doubly linked list when "realloc" succeeds since address may be changed by "realloc".
I changed this code a bit: moving if (isOnList()) remove(); operations out of tryReallocate to make code cleaner.
>>> Source/JavaScriptCore/heap/LargeAllocation.cpp:77 >>> + // This includes padding at the end of the allocation to maintain the distancing constraint. >> >> We should remove this. > > I'll remove this in "distancing removing patch" side :)
Removed. So I'll update the patch.
Yusuke Suzuki
Comment 44
2019-03-29 17:26:18 PDT
Created
attachment 366331
[details]
Patch
Yusuke Suzuki
Comment 45
2019-03-29 18:18:31 PDT
It is ready.
Saam Barati
Comment 46
2019-03-31 16:44:36 PDT
Comment on
attachment 366331
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=366331&action=review
r=me
> Source/JavaScriptCore/heap/AlignedMemoryAllocator.h:53 > + // Only some of carefully-designed allocator has this.
“Only some carefully-designed allocators have this.” That said, not sure this comment adds much. It’s better to explain why
> Source/JavaScriptCore/heap/LargeAllocation.cpp:70 > + size_t allocationSizeAndAlignmentAdjustment = headerSize() + size + halfAlignment;
Nit: i think a better name is “alignmentAdjustedAllocationSize”
> Source/JavaScriptCore/heap/LargeAllocation.cpp:76 > + bool oldAlignmentAdjusted = m_alignmentAdjusted;
Nit: another name suggestion, I think this makes more sense as “adjusted alignment” instead of “alignment adjusted”, e.g “m_adjustedAlignment”
Yusuke Suzuki
Comment 47
2019-03-31 23:40:36 PDT
Comment on
attachment 366331
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=366331&action=review
Thank you!
>> Source/JavaScriptCore/heap/AlignedMemoryAllocator.h:53 >> + // Only some of carefully-designed allocator has this. > > “Only some carefully-designed allocators have this.” > > That said, not sure this comment adds much. It’s better to explain why
We do not add realloc features to IsoAlignedMemoryAllocator, this is simply because it is not necessary: IsoAlignedMemoryAllocator is used for IsoSubspace, so allocation size is always the same. It means that we never have "realloc" use case. Changed the comment to explain this.
>> Source/JavaScriptCore/heap/LargeAllocation.cpp:70 >> + size_t allocationSizeAndAlignmentAdjustment = headerSize() + size + halfAlignment; > > Nit: i think a better name is “alignmentAdjustedAllocationSize”
Sounds good. Changed.
>> Source/JavaScriptCore/heap/LargeAllocation.cpp:76 >> + bool oldAlignmentAdjusted = m_alignmentAdjusted; > > Nit: another name suggestion, I think this makes more sense as “adjusted alignment” instead of “alignment adjusted”, e.g “m_adjustedAlignment”
Sounds nice. Fixed.
Yusuke Suzuki
Comment 48
2019-03-31 23:51:22 PDT
Committed
r243688
: <
https://trac.webkit.org/changeset/243688
>
Radar WebKit Bug Importer
Comment 49
2019-03-31 23:54:45 PDT
<
rdar://problem/49469428
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug