WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
REOPENED
213071
Replace JSC::FreeList linked list with a Bitmap.
https://bugs.webkit.org/show_bug.cgi?id=213071
Summary
Replace JSC::FreeList linked list with a Bitmap.
Mark Lam
Reported
2020-06-11 01:44:58 PDT
This is an investigation to see if using a Bitmap is faster. We expect it to be.
Attachments
work in progress.
(33.30 KB, patch)
2020-06-11 02:20 PDT
,
Mark Lam
no flags
Details
Formatted Diff
Diff
test patch 1: does NOT use adjusted bitmap start.
(33.65 KB, patch)
2020-06-11 13:41 PDT
,
Mark Lam
no flags
Details
Formatted Diff
Diff
test patch 2: does use adjusted bitmap start.
(33.65 KB, patch)
2020-06-11 13:42 PDT
,
Mark Lam
no flags
Details
Formatted Diff
Diff
test patch 3: post-tuned
(39.06 KB, patch)
2020-06-12 23:05 PDT
,
Mark Lam
no flags
Details
Formatted Diff
Diff
proposed patch.
(45.22 KB, patch)
2020-06-17 15:58 PDT
,
Mark Lam
no flags
Details
Formatted Diff
Diff
proposed patch.
(45.17 KB, patch)
2020-06-17 16:12 PDT
,
Mark Lam
no flags
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
Mark Lam
Comment 1
2020-06-11 02:20:26 PDT
Created
attachment 401630
[details]
work in progress.
Mark Lam
Comment 2
2020-06-11 13:41:52 PDT
Created
attachment 401678
[details]
test patch 1: does NOT use adjusted bitmap start.
Mark Lam
Comment 3
2020-06-11 13:42:45 PDT
Created
attachment 401679
[details]
test patch 2: does use adjusted bitmap start.
Mark Lam
Comment 4
2020-06-12 23:05:24 PDT
Created
attachment 401826
[details]
test patch 3: post-tuned
Mark Lam
Comment 5
2020-06-17 15:58:42 PDT
Created
attachment 402163
[details]
proposed patch.
Mark Lam
Comment 6
2020-06-17 16:12:07 PDT
Created
attachment 402164
[details]
proposed patch.
Filip Pizlo
Comment 7
2020-06-17 18:44:34 PDT
Comment on
attachment 402164
[details]
proposed patch. View in context:
https://bugs.webkit.org/attachment.cgi?id=402164&action=review
> Source/JavaScriptCore/heap/MarkedBlockInlines.h:341 > + freeAtoms ^= cellLocations;
You could probably make this faster if it was one loop that did the whole thing. If I’m understanding right we will have a loop here for the ^= and loops above for other parts of the freeAtoms computation.
Mark Lam
Comment 8
2020-06-17 18:53:43 PDT
(In reply to Filip Pizlo from
comment #7
)
> Comment on
attachment 402164
[details]
> proposed patch. > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=402164&action=review
> > > Source/JavaScriptCore/heap/MarkedBlockInlines.h:341 > > + freeAtoms ^= cellLocations; > > You could probably make this faster if it was one loop that did the whole > thing. If I’m understanding right we will have a loop here for the ^= and > loops above for other parts of the freeAtoms computation.
Thanks. I'll look into doing this in a follow up patch as it might get a messy. I'll land the current patch first which has already been tested.
Mark Lam
Comment 9
2020-06-17 18:58:08 PDT
Landed in
r263195
: <
http://trac.webkit.org/r263195
>.
Radar WebKit Bug Importer
Comment 10
2020-06-17 18:59:16 PDT
<
rdar://problem/64472738
>
Robin Morisset
Comment 11
2020-06-17 19:15:52 PDT
Comment on
attachment 402164
[details]
proposed patch. View in context:
https://bugs.webkit.org/attachment.cgi?id=402164&action=review
This patch looks perfectly reasonable to me, but I am not familiar enough with this code to feel comfortable r+-ing it alone.
> Source/JavaScriptCore/ChangeLog:98 > + there is n the location of m_cellSize. It is now moved up next to m_remaining,
typo: "n" ?
> Source/JavaScriptCore/heap/FreeList.h:108 > + // if there atoms still available for allocation. See comment blob below
typo: "there atoms" => "there are atoms"
> Source/JavaScriptCore/heap/FreeList.h:119 > + static ptrdiff_t offsetOfBitmapRows() { return OBJECT_OFFSETOF(FreeList, m_bitmap) - sizeof(AtomsBitmap::Word); }
Maybe rename to offsetOfBitmapRowsMinusOne, or offsetOneBeforeBitmapRows, or something like this? Just in case anyone tries to use it without looking at where it is defined.
> Source/JavaScriptCore/heap/FreeListInlines.h:100 > + while (rowBitmap) {
It probably does not matter, but I can think of a way to make this loop a tad more efficient: ``` unsigned atomIndexInRow = 0; while (rowBitmap) { atomIndexInRow += ctz(rowBitmap); auto* cell = bitwise_cast<HeapCell*>(¤tMarkedBlockRowAddress[atomIndexInRow]); rowBitmap >>= (++atomIndexInRow); func(cell); } ```
> Source/JavaScriptCore/jit/AssemblyHelpers.cpp:561 > +#if CPU(ARM64)
I am a bit wary that this code will never be tested since BITMAP_FREELIST is currently only set on x86_64. I don't have a better solution though. It is also a bit weird to have both "#if CPU(ARM64)" and "if (isARM64())", but just "if (isx86_64())". What is the criterion for picking one of these ways of checking the CPU?
Mark Lam
Comment 12
2020-06-17 20:09:00 PDT
Comment on
attachment 402164
[details]
proposed patch. View in context:
https://bugs.webkit.org/attachment.cgi?id=402164&action=review
Thanks.
>> Source/JavaScriptCore/ChangeLog:98 >> + there is n the location of m_cellSize. It is now moved up next to m_remaining, > > typo: "n" ?
I think I meant to say "in".
>> Source/JavaScriptCore/heap/FreeList.h:108 >> + // if there atoms still available for allocation. See comment blob below > > typo: "there atoms" => "there are atoms"
Will fix.
>> Source/JavaScriptCore/heap/FreeList.h:119 >> + static ptrdiff_t offsetOfBitmapRows() { return OBJECT_OFFSETOF(FreeList, m_bitmap) - sizeof(AtomsBitmap::Word); } > > Maybe rename to offsetOfBitmapRowsMinusOne, or offsetOneBeforeBitmapRows, or something like this? Just in case anyone tries to use it without looking at where it is defined.
offsetOfBitmapRowsMinusOne sounds good. I'll try to apply this in my follow up patch to address Fil's idea for an optimization.
>> Source/JavaScriptCore/heap/FreeListInlines.h:100 >> + while (rowBitmap) { > > It probably does not matter, but I can think of a way to make this loop a tad more efficient: > ``` > unsigned atomIndexInRow = 0; > while (rowBitmap) { > atomIndexInRow += ctz(rowBitmap); > auto* cell = bitwise_cast<HeapCell*>(¤tMarkedBlockRowAddress[atomIndexInRow]); > rowBitmap >>= (++atomIndexInRow); > func(cell); > } > ```
I don't see how this can work. Let's go over an example: say we have bits at index 2, 3, and 5. rowBitmap is ...0000101100b. 1st iteration: atomIndexInRow is 0, and becomes 2. We compute the cell to be ¤tMarkedBlockRowAddress[2]. All is well. atomIndexInRow becomes 3. rowBitmap becomes ...0000101b. 2nd iteration: atomIndexInRow is 3, and stays 3. We compute the cell to be ¤tMarkedBlockRowAddress[3]. All is well. atomIndexInRow becomes 4. rowBitmap becomes ...000b. 3rd iteration should have computed cell at ¤tMarkedBlockRowAddress[5], but actually thinks there is no cells left because rowBitmap has become 0. The bug lies in the right shift on rowBitmap. I think it needs to be shifted by the result of the ctz + 1, not atomIndexInRow + 1. And yeah, I'm not sure this is worth it. The code will definitely be trickier and harder to understand, and therefore error prone.
>> Source/JavaScriptCore/jit/AssemblyHelpers.cpp:561 >> +#if CPU(ARM64) > > I am a bit wary that this code will never be tested since BITMAP_FREELIST is currently only set on x86_64. I don't have a better solution though. > It is also a bit weird to have both "#if CPU(ARM64)" and "if (isARM64())", but just "if (isx86_64())". What is the criterion for picking one of these ways of checking the CPU?
It's unfortunate but the #if CPU(ARM64) is needed because getCachedMemoryTempRegisterIDAndInvalidate() is only available for ARM64. We'll be testing this code when we try to turn this on for ARM64 later after we try to do more to improve perf.
Saam Barati
Comment 13
2020-06-17 22:30:14 PDT
Comment on
attachment 402164
[details]
proposed patch. View in context:
https://bugs.webkit.org/attachment.cgi?id=402164&action=review
LGTM too
> Source/JavaScriptCore/ChangeLog:87 > + The middle path checks m_currentRowIndex to see if we have more rows to allocate > + from. For each m_currentRowIndex, we check its corresponding AtomsBitmap::Word > + in m_bitmap. If the word is non-zero, we copy it to m_currentRowBitmap and > + jump to the fast path to do the allocation. The middle path will update > + m_currentRowIndex to point to the current row we're allocating from.
I think it'd be faster if m_currentRowIndex were instead a bitmap itself telling you the next row to allocate out of was. Might matter for blocks with larger cell sizes.
> Source/JavaScriptCore/heap/FreeList.cpp:137 > + return m_currentRowBitmap & (one << atomIndexInRow);
using the constant 1 is more readable imo
> Source/JavaScriptCore/heap/FreeListInlines.h:51 > + rowBitmap &= ~(one << atomIndexInRow);
nit: I think it's easier to read with the constant 1 than the name "one"
> Source/JavaScriptCore/heap/MarkedBlockInlines.h:322 > + auto handleDeadCell = [&] (size_t i) { > + HeapCell* cell = reinterpret_cast_ptr<HeapCell*>(atomAt(i)); > + > + if (destructionMode != BlockHasNoDestructors) > + destroy(cell); > + > + if (sweepMode == SweepToFreeList) { > + if (scribbleMode == Scribble) > + scribble(cell, cellSize); > + ++count; > + } > + };
It'd be a lot easier to read if this just were inline where you used it. Anyways, you check destructionMode twice
Saam Barati
Comment 14
2020-06-17 22:31:34 PDT
Comment on
attachment 402164
[details]
proposed patch. View in context:
https://bugs.webkit.org/attachment.cgi?id=402164&action=review
> Source/JavaScriptCore/ChangeLog:12 > + 2 loads - m_currentRowBitmap, m_currentMarkedBlockRowAddress
on arm64 we can should turn this into a single loadpair
Mark Lam
Comment 15
2020-06-17 23:50:33 PDT
Comment on
attachment 402164
[details]
proposed patch. View in context:
https://bugs.webkit.org/attachment.cgi?id=402164&action=review
>> Source/JavaScriptCore/ChangeLog:12 >> + 2 loads - m_currentRowBitmap, m_currentMarkedBlockRowAddress > > on arm64 we can should turn this into a single loadpair
I can experiment with this to see if it's worth it. It impacts the layout of the fields. I think our cache line should be large enough that it doesn't matter, but I'll need to test it to be sure.
>> Source/JavaScriptCore/ChangeLog:87 >> + m_currentRowIndex to point to the current row we're allocating from. > > I think it'd be faster if m_currentRowIndex were instead a bitmap itself telling you the next row to allocate out of was. Might matter for blocks with larger cell sizes.
This is an intriguing idea. I'll give it a try.
>> Source/JavaScriptCore/heap/FreeList.cpp:137 >> + return m_currentRowBitmap & (one << atomIndexInRow); > > using the constant 1 is more readable imo
It is necessary to use "one" here because the alternative would be to use static_cast<AtomsBitmap::Word>(1). It is incorrect to just use literal 1 because literal 1 is of type int, and this shift will never produce a 64-bit value if we use that (which has been a source of bugs). We don't use 1ull either because technically, AtomsBitmap::Word can be 32-bits eventually if 32-bit ports adopts BITMAP_FREELIST. Using "one" is the portable and always correct solution here.
>> Source/JavaScriptCore/heap/MarkedBlockInlines.h:322 >> + }; > > It'd be a lot easier to read if this just were inline where you used it. > > Anyways, you check destructionMode twice
This is a carry over from the linked list implementation that calls handleDeadCell in 2 places. For the bitmap implementation, we only need to call it in one place. I'll move it to the call site. As for the redundant destructionMode check, you found a bug! handleDeadCell should be called unconditionally. The bug here is that: 1. we won't scribble dead cells if the cell does not have a destructor. 2. we won't count the dead cell. This results in FreeList::originalSize() always returning 0, which leads to a 0 being wrongly reported to Heap::didAllocate().
WebKit Commit Bot
Comment 16
2020-08-09 02:07:00 PDT
Re-opened since this is blocked by
bug 215312
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug