Bug 121007 - Clearing MarkedBlock::m_newlyAllocated should be separate from MarkedBlock::clearMarks
Summary: Clearing MarkedBlock::m_newlyAllocated should be separate from MarkedBlock::c...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Hahnenberg
URL:
Keywords:
Depends on:
Blocks: 121074
  Show dependency treegraph
 
Reported: 2013-09-08 09:25 PDT by Mark Hahnenberg
Modified: 2013-09-12 10:57 PDT (History)
1 user (show)

See Also:


Attachments
Patch (7.25 KB, patch)
2013-09-08 09:27 PDT, Mark Hahnenberg
no flags Details | Formatted Diff | Diff
Patch (7.56 KB, patch)
2013-09-08 14:47 PDT, Mark Hahnenberg
oliver: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Hahnenberg 2013-09-08 09:25:58 PDT
We call clearMarks on every MarkedBlock in the Heap, whereas we only need to clear m_newlyAllocated for the m_currentBlock at the time of the last canonicalizeCellLiveness() for each MarkedAllocator. We also need to call it on every block in the largeAllocators because each one of their blocks is canonicalized as it is used.
Comment 1 Mark Hahnenberg 2013-09-08 09:27:09 PDT
Created attachment 210981 [details]
Patch
Comment 2 Geoffrey Garen 2013-09-08 10:34:16 PDT
Comment on attachment 210981 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=210981&action=review

> Source/JavaScriptCore/heap/Heap.cpp:471
> +        m_objectSpace.clearNewlyAllocated();

I think it would be a little clearer to do this after the mark phase, in Heap::collect(). The newly allocated bits are the inverse of the mark phase: you need them in order to mark (for the conservative phase), and immediately after marking, they're null by definition. So, I think it's clearer to set them before marking (which we do), and then unset them right after marking.

> Source/JavaScriptCore/heap/MarkedAllocator.h:30
> +    MarkedBlock* getAndClearCanonicalizedBlock()

We usually call "getAndClear" "take": takeCanonicalizedBlock.

Also, let's name this accessor the same as its data member. So, either takeCanonicalizedBlock, and rename the data member to m_canonicalizedBlock, or takeCurrentBlockAtLastCanonicalize. My preference is to rename the data member. It's a mouthful.

> Source/JavaScriptCore/heap/MarkedSpace.cpp:253
> +    // We have to iterate all of the blocks in the large allocators because they are
> +    // canonicalized as they are used up (see MarkedAllocator::tryAllocateHelper)
> +    // which creates the m_newlyAllocated bitmap.

Why do we do that? I think it would be a bit clearer just to canonicalize all large blocks at the start of GC, with the rest of our blocks.
Comment 3 Mark Hahnenberg 2013-09-08 12:22:04 PDT
(In reply to comment #2)
> (From update of attachment 210981 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=210981&action=review
> 
> > Source/JavaScriptCore/heap/Heap.cpp:471
> > +        m_objectSpace.clearNewlyAllocated();
> 
> I think it would be a little clearer to do this after the mark phase, in Heap::collect(). The newly allocated bits are the inverse of the mark phase: you need them in order to mark (for the conservative phase), and immediately after marking, they're null by definition. So, I think it's clearer to set them before marking (which we do), and then unset them right after marking.
The main reason I do it before clearMarks() is so that clearMarks() can ASSERT for every block that m_newlyAllocated == null. I expect clearMarks() to be the only time where iterate all blocks in the Heap. I suppose I could write a debug-only Heap walk for this purpose.

> 
> > Source/JavaScriptCore/heap/MarkedAllocator.h:30
> > +    MarkedBlock* getAndClearCanonicalizedBlock()
> 
> We usually call "getAndClear" "take": takeCanonicalizedBlock.
> 
> Also, let's name this accessor the same as its data member. So, either takeCanonicalizedBlock, and rename the data member to m_canonicalizedBlock, or takeCurrentBlockAtLastCanonicalize. My preference is to rename the data member. It's a mouthful.
> 
Sounds good.

> > Source/JavaScriptCore/heap/MarkedSpace.cpp:253
> > +    // We have to iterate all of the blocks in the large allocators because they are
> > +    // canonicalized as they are used up (see MarkedAllocator::tryAllocateHelper)
> > +    // which creates the m_newlyAllocated bitmap.
> 
> Why do we do that? I think it would be a bit clearer just to canonicalize all large blocks at the start of GC, with the rest of our blocks.
I think the reason we eagerly canonicalize is because MarkedBlocks don't maintain their own free list after they're swept. Instead, the MarkedAllocator holds the reference to the FreeList for its m_currentBlock. Since we need the FreeList for a block while canonicalizing it, we're forced to do it eagerly when we advance to a new block in the largeAllocator. As it stands now, if we were to wait until collection to canonicalize all blocks in largeAllocators, we'd have to re-sweep each block to build the FreeList. 

Of course, that doesn't prevent us from storing the FreeList in MarkedBlocks, which is what we should do in the long run so incremental sweeping could also build free lists in addition to running destructors/finalizers. There's a bug from a while back assigned to me to support this: https://bugs.webkit.org/show_bug.cgi?id=87918.
Comment 4 Mark Hahnenberg 2013-09-08 14:31:33 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > (From update of attachment 210981 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=210981&action=review
> > 
> > > Source/JavaScriptCore/heap/Heap.cpp:471
> > > +        m_objectSpace.clearNewlyAllocated();
> > 
> > I think it would be a little clearer to do this after the mark phase, in Heap::collect(). The newly allocated bits are the inverse of the mark phase: you need them in order to mark (for the conservative phase), and immediately after marking, they're null by definition. So, I think it's clearer to set them before marking (which we do), and then unset them right after marking.
> The main reason I do it before clearMarks() is so that clearMarks() can ASSERT for every block that m_newlyAllocated == null. I expect clearMarks() to be the only time where iterate all blocks in the Heap. I suppose I could write a debug-only Heap walk for this purpose.
I ran into an issue with this. We use the newly allocated bits to indicate liveness, and we use liveness during the marking phase for determining whether weak references are live and need to be visited. Thus, either we need to invent a new form of "liveness" that only consults the mark bit and ignores the newly allocated bit, or we could just clear all liveness-related data at once prior to the beginning of the mark phase. I'm kind of leaning toward just clearing all liveness data (i.e. mark bits and newly allocated bits) together prior to marking. Inventing a subtly different definition of liveness just so we can clear some bits later on during collection seems a bit odd.
Comment 5 Mark Hahnenberg 2013-09-08 14:47:21 PDT
Created attachment 210995 [details]
Patch
Comment 6 Geoffrey Garen 2013-09-08 16:17:23 PDT
> We use the newly allocated bits to indicate liveness, and we use liveness during the marking phase for determining whether weak references are live and need to be visited.

I'm happy with the patch Ollie r+ed, but I wanted to get more clarity on this.

At the end of GC, when visiting weak references, why do we want to visit pointers that are newly allocated but not marked? Those pointers are garbage, so we shouldn't visit them -- otherwise, we're artificially keeping new garbage alive for one GC cycle, right?

My understanding is that we only need the newly allocated bits for the sake of the conservative scan, to tell it whether a pointer is valid memory or not. After the conservative scan, the mark bit is "the truth".
Comment 7 Mark Hahnenberg 2013-09-08 16:26:32 PDT
(In reply to comment #6)
> > We use the newly allocated bits to indicate liveness, and we use liveness during the marking phase for determining whether weak references are live and need to be visited.
> 
> I'm happy with the patch Ollie r+ed, but I wanted to get more clarity on this.
> 
> At the end of GC, when visiting weak references, why do we want to visit pointers that are newly allocated but not marked? Those pointers are garbage, so we shouldn't visit them -- otherwise, we're artificially keeping new garbage alive for one GC cycle, right?

We don't want to visit pointers that are newly allocated, which is why we want to clear the newly allocated bits before we visit weak references.

> 
> My understanding is that we only need the newly allocated bits for the sake of the conservative scan, to tell it whether a pointer is valid memory or not. After the conservative scan, the mark bit is "the truth".

This is correct, which is why we should clear the newly allocated bits after the conservative scan but before visiting weak references. An alternative would be to have a different version of MarkedBlock::isLive that ignored newly allocated bits.
Comment 8 Mark Hahnenberg 2013-09-08 16:35:42 PDT
Committed r155316: <http://trac.webkit.org/changeset/155316>