Bug 166878

Summary: Streamline the GC barrier slowpath
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED    
Severity: Normal CC: ggaren, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
See Also: https://bugs.webkit.org/show_bug.cgi?id=166847
Bug Depends on:    
Bug Blocks: 165909    
Attachments:
Description Flags
the patch saam: review+

Description Filip Pizlo 2017-01-09 20:58:38 PST
Remove the write barrier buffer. Teach the barrier slow path to re-white unmarked-black objects.
Comment 1 Filip Pizlo 2017-01-09 21:00:39 PST
Created attachment 298444 [details]
the patch
Comment 2 Radar WebKit Bug Importer 2017-01-09 21:01:48 PST
<rdar://problem/29942167>
Comment 3 Geoffrey Garen 2017-01-10 11:33:45 PST
Comment on attachment 298444 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=298444&action=review

r=me

> Source/JavaScriptCore/heap/CellState.h:45
> +// The CellState of a cell is a kind of hint about what the state of the cell is.
>  enum class CellState : uint8_t {
>      // The object is either currently being scanned, or it has finished being scanned, or this
>      // is a full collection and it's actually a white object (you'd know because its mark bit
>      // would be clear).
> -    PossiblyOldOrBlack = 0,
> +    PossiblyBlack = 0,
>      
>      // The object is in eden. During GC, this means that the object has not been marked yet.
> -    DefinitelyNewAndWhite = 1,
> +    DefinitelyWhite = 1,
>  
> -    // The object is grey - i.e. it will be scanned.
> -    DefinitelyGrey = 2,
> +    // This sorta means that the object is grey - i.e. it will be scanned. Or it could be white
> +    // during a full collection if its mark bit is clear. That would happen if it had been black,
> +    // got barriered, and we did a full collection.
> +    PossiblyGrey = 2

It would help to explain that these values indicate an upper bound on the visiting state of an object:

Black means the object has probably transitioned all the way to Black, but it might still be back at Grey or White.

Grey means the object has probably transitioned to Grey, but it might still be back at White.

White means White.

Otherwise, "PossiblyGrey" is ambiguous, and it's not clear why it's OK to skip Black object behaviors when PossiblyGrey.

You could even consider renaming to { White, GreyOrLower, BlackOrLower }, or renaming the enum to CellStateUpperBound.

> Source/JavaScriptCore/heap/Heap.cpp:957
> +            // During a full collection a store into an unmarked object that had surivived past
> +            // collections will manifest as a store to an unmarked black object. If the object gets
> +            // marked at some time after this then it will go down the normal marking path. We can
> +            // safely ignore these stores.

Can we ASSERT here that we're doing a full collection?

"store to an unmarked black object" => "store to an unmarked possibly black object"

Instead of "If the object....", I would say: In this situation, the object is actually white. By synchronizing with the GC below, we can prove that the object is white and update its state so it doesn't take the slow path again.

> Source/JavaScriptCore/heap/Heap.cpp:973
> +            if (cell->atomicCompareExchangeCellStateStrong(CellState::PossiblyBlack, CellState::DefinitelyWhite) == CellState::PossiblyBlack) {
> +                // Now we protect against this race:
> +                //
> +                //     1) Object starts out black + unmarked.
> +                //     --> We do isMarkedConcurrently here.
> +                //     2) Object is marked and greyed.
> +                //     3) Object is scanned and blacked.
> +                //     --> We do atomicCompareExchangeCellStateStrong here.
> +                //
> +                // In this case we would have made the object white again, even though it should
> +                // be black. This check lets us correct our mistake. This relies on the fact that
> +                // isMarkedConcurrently converges monotonically to true.
> +                if (!isMarkedConcurrently(cell)) {
> +                    // OK - the object really deserves to be white!
> +                    return;
> +                }

I wonder if it would be better (i.e., simpler and not more expensive) just to grey the object unconditionally.

I think the advantage of this check-and-white approach is that it allows an old object to die even if we store to it during GC. I wonder if this advantage is really all that important for old objects, since it's impossible for the mutator to create a death spiral of old objects, and old objects tend to survive anyway.
Comment 4 Filip Pizlo 2017-01-10 11:50:33 PST
(In reply to comment #3)
> Comment on attachment 298444 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=298444&action=review
> 
> r=me
> 
> > Source/JavaScriptCore/heap/CellState.h:45
> > +// The CellState of a cell is a kind of hint about what the state of the cell is.
> >  enum class CellState : uint8_t {
> >      // The object is either currently being scanned, or it has finished being scanned, or this
> >      // is a full collection and it's actually a white object (you'd know because its mark bit
> >      // would be clear).
> > -    PossiblyOldOrBlack = 0,
> > +    PossiblyBlack = 0,
> >      
> >      // The object is in eden. During GC, this means that the object has not been marked yet.
> > -    DefinitelyNewAndWhite = 1,
> > +    DefinitelyWhite = 1,
> >  
> > -    // The object is grey - i.e. it will be scanned.
> > -    DefinitelyGrey = 2,
> > +    // This sorta means that the object is grey - i.e. it will be scanned. Or it could be white
> > +    // during a full collection if its mark bit is clear. That would happen if it had been black,
> > +    // got barriered, and we did a full collection.
> > +    PossiblyGrey = 2
> 
> It would help to explain that these values indicate an upper bound on the
> visiting state of an object:
> 
> Black means the object has probably transitioned all the way to Black, but
> it might still be back at Grey or White.
> 
> Grey means the object has probably transitioned to Grey, but it might still
> be back at White.
> 
> White means White.
> 
> Otherwise, "PossiblyGrey" is ambiguous, and it's not clear why it's OK to
> skip Black object behaviors when PossiblyGrey.
> 
> You could even consider renaming to { White, GreyOrLower, BlackOrLower }, or
> renaming the enum to CellStateUpperBound.

I get what "GreyOrLower" and "BlackOrLower" are supposed to mean, but I wouldn't have thought that without reading your explanation.

There's a ton of nuance to these states.  You're right that DefinitelyWhite is the only one that has guaranteed meaning.  To understand the meaning of the other states, you have to consider the mark bit and possibly other information.  That's why after many attempts at naming these states in a more descriptive way, I decided to go with the "Possibly" prefix for states whose meaning has caveats and the "Definitely" prefix for the state whose meaning has no caveats.

> 
> > Source/JavaScriptCore/heap/Heap.cpp:957
> > +            // During a full collection a store into an unmarked object that had surivived past
> > +            // collections will manifest as a store to an unmarked black object. If the object gets
> > +            // marked at some time after this then it will go down the normal marking path. We can
> > +            // safely ignore these stores.
> 
> Can we ASSERT here that we're doing a full collection?
> 

Yes.  I added it.

> "store to an unmarked black object" => "store to an unmarked possibly black
> object"
> 
> Instead of "If the object....", I would say: In this situation, the object
> is actually white. By synchronizing with the GC below, we can prove that the
> object is white and update its state so it doesn't take the slow path again.

That doesn't mean the same thing.  This is speaking of the span of time beginning immediately after the isMarkedConcurrently check, and ending at the end of GC.  During any moment in that span of time, if the object gets marked, it will go down the normal marking path.  This paragraph is justifying why we can ignore PossiblyBlack unmarked objects.  It's not meant as a justification for the optimization below, which re-whites the object.

I rewrote this paragraph:

            // During a full collection a store into an unmarked object that had surivived past
            // collections will manifest as a store to an unmarked PossiblyBlack object. If the
            // object gets marked at some time after this then it will go down the normal marking
            // path. So, we don't have to remember this object. We could return here. But we go
            // further and attempt to re-white the object.

> 
> > Source/JavaScriptCore/heap/Heap.cpp:973
> > +            if (cell->atomicCompareExchangeCellStateStrong(CellState::PossiblyBlack, CellState::DefinitelyWhite) == CellState::PossiblyBlack) {
> > +                // Now we protect against this race:
> > +                //
> > +                //     1) Object starts out black + unmarked.
> > +                //     --> We do isMarkedConcurrently here.
> > +                //     2) Object is marked and greyed.
> > +                //     3) Object is scanned and blacked.
> > +                //     --> We do atomicCompareExchangeCellStateStrong here.
> > +                //
> > +                // In this case we would have made the object white again, even though it should
> > +                // be black. This check lets us correct our mistake. This relies on the fact that
> > +                // isMarkedConcurrently converges monotonically to true.
> > +                if (!isMarkedConcurrently(cell)) {
> > +                    // OK - the object really deserves to be white!
> > +                    return;
> > +                }
> 
> I wonder if it would be better (i.e., simpler and not more expensive) just
> to grey the object unconditionally.

We can't do that because that would be incorrect. A grey object will get ignored by the barrier.  You cannot grey an object that might be black.  The GC is changing the object's color using a naked store - no CAS.  So, just as we change the object's color to grey the GC could be changing it to black.  In between the !isMarkedConcurrently check and when we run our CAS, the object could have been greyed, blackened, and scanned.  Now it still looks black.  If we just change it to grey we will introduce a bug.

> 
> I think the advantage of this check-and-white approach is that it allows an
> old object to die even if we store to it during GC. I wonder if this
> advantage is really all that important for old objects, since it's
> impossible for the mutator to create a death spiral of old objects, and old
> objects tend to survive anyway.

This isn't an optimization.  We can't unblacken an object without being extremely careful.  The GC may have blackened the object to signal that the barrier is needed.  We can't forget this information.  That's why we first whiten it, and then check the mark bit again.  We can prove that the bad interleaving that I speak of would get caught by this extra check.
Comment 5 Saam Barati 2017-01-10 11:56:58 PST
Comment on attachment 298444 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=298444&action=review

> Source/JavaScriptCore/heap/Heap.cpp:977
> +                // It's difficult to work out whether the object should be grey or black at
> +                // this point. We say black conservatively.
> +                cell->setCellState(CellState::PossiblyBlack);

What's the point of doing this if we fall through to the code below that does greying?
Maybe it should return here?
Comment 6 Saam Barati 2017-01-10 11:58:51 PST
Comment on attachment 298444 [details]
the patch

(I didn't mean to clear r+)
Comment 7 Filip Pizlo 2017-01-10 12:02:04 PST
(In reply to comment #5)
> Comment on attachment 298444 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=298444&action=review
> 
> > Source/JavaScriptCore/heap/Heap.cpp:977
> > +                // It's difficult to work out whether the object should be grey or black at
> > +                // this point. We say black conservatively.
> > +                cell->setCellState(CellState::PossiblyBlack);
> 
> What's the point of doing this if we fall through to the code below that
> does greying?
> Maybe it should return here?

That's true - it *should* return here.

We already confirmed above that the object was not marked immediately after the store that this barrier is protecting.  This means that the object had since been marked normally, and has most likely been scanned the normal way.  There is no need to barrier the object now.

In fact, we can return early no matter what happens in the !isMarkedConcurrently path.  There is no need to remember the object if it was not marked at the time that the barrier slow path ran.
Comment 8 Filip Pizlo 2017-01-10 12:05:04 PST
(In reply to comment #4)
> (In reply to comment #3)
> > Comment on attachment 298444 [details]
> > the patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=298444&action=review
> > 
> > r=me
> > 
> > > Source/JavaScriptCore/heap/CellState.h:45
> > > +// The CellState of a cell is a kind of hint about what the state of the cell is.
> > >  enum class CellState : uint8_t {
> > >      // The object is either currently being scanned, or it has finished being scanned, or this
> > >      // is a full collection and it's actually a white object (you'd know because its mark bit
> > >      // would be clear).
> > > -    PossiblyOldOrBlack = 0,
> > > +    PossiblyBlack = 0,
> > >      
> > >      // The object is in eden. During GC, this means that the object has not been marked yet.
> > > -    DefinitelyNewAndWhite = 1,
> > > +    DefinitelyWhite = 1,
> > >  
> > > -    // The object is grey - i.e. it will be scanned.
> > > -    DefinitelyGrey = 2,
> > > +    // This sorta means that the object is grey - i.e. it will be scanned. Or it could be white
> > > +    // during a full collection if its mark bit is clear. That would happen if it had been black,
> > > +    // got barriered, and we did a full collection.
> > > +    PossiblyGrey = 2
> > 
> > It would help to explain that these values indicate an upper bound on the
> > visiting state of an object:
> > 
> > Black means the object has probably transitioned all the way to Black, but
> > it might still be back at Grey or White.
> > 
> > Grey means the object has probably transitioned to Grey, but it might still
> > be back at White.
> > 
> > White means White.
> > 
> > Otherwise, "PossiblyGrey" is ambiguous, and it's not clear why it's OK to
> > skip Black object behaviors when PossiblyGrey.
> > 
> > You could even consider renaming to { White, GreyOrLower, BlackOrLower }, or
> > renaming the enum to CellStateUpperBound.
> 
> I get what "GreyOrLower" and "BlackOrLower" are supposed to mean, but I
> wouldn't have thought that without reading your explanation.
> 
> There's a ton of nuance to these states.  You're right that DefinitelyWhite
> is the only one that has guaranteed meaning.  To understand the meaning of
> the other states, you have to consider the mark bit and possibly other
> information.  That's why after many attempts at naming these states in a
> more descriptive way, I decided to go with the "Possibly" prefix for states
> whose meaning has caveats and the "Definitely" prefix for the state whose
> meaning has no caveats.
> 
> > 
> > > Source/JavaScriptCore/heap/Heap.cpp:957
> > > +            // During a full collection a store into an unmarked object that had surivived past
> > > +            // collections will manifest as a store to an unmarked black object. If the object gets
> > > +            // marked at some time after this then it will go down the normal marking path. We can
> > > +            // safely ignore these stores.
> > 
> > Can we ASSERT here that we're doing a full collection?
> > 
> 
> Yes.  I added it.
> 
> > "store to an unmarked black object" => "store to an unmarked possibly black
> > object"
> > 
> > Instead of "If the object....", I would say: In this situation, the object
> > is actually white. By synchronizing with the GC below, we can prove that the
> > object is white and update its state so it doesn't take the slow path again.
> 
> That doesn't mean the same thing.  This is speaking of the span of time
> beginning immediately after the isMarkedConcurrently check, and ending at
> the end of GC.  During any moment in that span of time, if the object gets
> marked, it will go down the normal marking path.  This paragraph is
> justifying why we can ignore PossiblyBlack unmarked objects.  It's not meant
> as a justification for the optimization below, which re-whites the object.
> 
> I rewrote this paragraph:
> 
>             // During a full collection a store into an unmarked object that
> had surivived past
>             // collections will manifest as a store to an unmarked
> PossiblyBlack object. If the
>             // object gets marked at some time after this then it will go
> down the normal marking
>             // path. So, we don't have to remember this object. We could
> return here. But we go
>             // further and attempt to re-white the object.
> 
> > 
> > > Source/JavaScriptCore/heap/Heap.cpp:973
> > > +            if (cell->atomicCompareExchangeCellStateStrong(CellState::PossiblyBlack, CellState::DefinitelyWhite) == CellState::PossiblyBlack) {
> > > +                // Now we protect against this race:
> > > +                //
> > > +                //     1) Object starts out black + unmarked.
> > > +                //     --> We do isMarkedConcurrently here.
> > > +                //     2) Object is marked and greyed.
> > > +                //     3) Object is scanned and blacked.
> > > +                //     --> We do atomicCompareExchangeCellStateStrong here.
> > > +                //
> > > +                // In this case we would have made the object white again, even though it should
> > > +                // be black. This check lets us correct our mistake. This relies on the fact that
> > > +                // isMarkedConcurrently converges monotonically to true.
> > > +                if (!isMarkedConcurrently(cell)) {
> > > +                    // OK - the object really deserves to be white!
> > > +                    return;
> > > +                }
> > 
> > I wonder if it would be better (i.e., simpler and not more expensive) just
> > to grey the object unconditionally.
> 
> We can't do that because that would be incorrect. A grey object will get
> ignored by the barrier.  You cannot grey an object that might be black.  The
> GC is changing the object's color using a naked store - no CAS.  So, just as
> we change the object's color to grey the GC could be changing it to black. 
> In between the !isMarkedConcurrently check and when we run our CAS, the
> object could have been greyed, blackened, and scanned.  Now it still looks
> black.  If we just change it to grey we will introduce a bug.
> 
> > 
> > I think the advantage of this check-and-white approach is that it allows an
> > old object to die even if we store to it during GC. I wonder if this
> > advantage is really all that important for old objects, since it's
> > impossible for the mutator to create a death spiral of old objects, and old
> > objects tend to survive anyway.
> 
> This isn't an optimization.  We can't unblacken an object without being
> extremely careful.  The GC may have blackened the object to signal that the
> barrier is needed.  We can't forget this information.  That's why we first
> whiten it, and then check the mark bit again.  We can prove that the bad
> interleaving that I speak of would get caught by this extra check.

I think that the code did have a bug, see https://bugs.webkit.org/show_bug.cgi?id=166878#c7

You're right that we could just grey the object and fall through to the barrier.  Regardless of whether that approach is a good idea or not, it's clearly a major change in behavior - much bigger than this patch is right now.

Right now, the patch has no effect on floating garbage, wavefronts, etc. It's just trying to make the barrier remember the fact that the object doesn't need to hit the barrier slow path next time. In trunk, we alraedy return early for black-unmarked objects. But even though the slow path returns early for those objects, we still take the slow path every time we store to those objects. So - those objects are never seen by GC except in this slow path, which rejects them immediately.

If we greyed those objects, then this is a big change in behavior. Now, the GC will see - and visit - every dead object we store to during a full collection. I don't think we want that!
Comment 9 Filip Pizlo 2017-01-10 12:06:45 PST
Here's the new version of that re-whitening code, incorporating all of the feedback:

        if (!isMarkedConcurrently(cell)) {
            // During a full collection a store into an unmarked object that had surivived past
            // collections will manifest as a store to an unmarked PossiblyBlack object. If the
            // object gets marked at some time after this then it will go down the normal marking
            // path. So, we don't have to remember this object. We could return here. But we go
            // further and attempt to re-white the object.
            
            RELEASE_ASSERT(m_collectionScope == CollectionScope::Full);
            
            if (cell->atomicCompareExchangeCellStateStrong(CellState::PossiblyBlack, CellState::DefinitelyWhite) == CellState::PossiblyBlack) {
                // Now we protect against this race:
                //
                //     1) Object starts out black + unmarked.
                //     --> We do isMarkedConcurrently here.
                //     2) Object is marked and greyed.
                //     3) Object is scanned and blacked.
                //     --> We do atomicCompareExchangeCellStateStrong here.
                //
                // In this case we would have made the object white again, even though it should
                // be black. This check lets us correct our mistake. This relies on the fact that
                // isMarkedConcurrently converges monotonically to true.
                if (isMarkedConcurrently(cell)) {
                    // It's difficult to work out whether the object should be grey or black at
                    // this point. We say black conservatively.
                    cell->setCellState(CellState::PossiblyBlack);
                }
                
                // Either way, we can return. Most likely, the object was not marked, and so the
                // object is now labeled white. This means that future barrier executions will not
                // fire. In the unlikely even that the object had become marked, we can still
                // return anyway, since we proved that the object was not marked at the time that
                // we executed this slow path.
            }
            
            return;
        }
Comment 10 Filip Pizlo 2017-01-10 15:43:15 PST
Landed in https://trac.webkit.org/changeset/210565
Comment 11 Geoffrey Garen 2017-01-11 12:47:50 PST
> If we greyed those objects, then this is a big change in behavior. Now, the
> GC will see - and visit - every dead object we store to during a full
> collection. I don't think we want that!

Every dead *old* object. New objects won't take this path because we allocate White.
Comment 12 Filip Pizlo 2017-01-11 13:49:03 PST
(In reply to comment #11)
> > If we greyed those objects, then this is a big change in behavior. Now, the
> > GC will see - and visit - every dead object we store to during a full
> > collection. I don't think we want that!
> 
> Every dead *old* object. New objects won't take this path because we
> allocate White.

Right!  In splay a stray pointer from the past might keep a huge sub tree alive. We want to be precise about this.