Bug 165712 - The DOM should have an advancing wavefront opaque root barrier
Summary: The DOM should have an advancing wavefront opaque root barrier
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore JavaScript (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords:
Depends on:
Blocks: 149432
  Show dependency treegraph
 
Reported: 2016-12-09 18:50 PST by Filip Pizlo
Modified: 2016-12-12 14:47 PST (History)
4 users (show)

See Also:


Attachments
work in progress (48.62 KB, patch)
2016-12-09 20:14 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
more (52.88 KB, patch)
2016-12-09 22:32 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
more (90.72 KB, patch)
2016-12-10 10:15 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (96.84 KB, patch)
2016-12-10 18:55 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch (96.38 KB, patch)
2016-12-10 19:49 PST, Filip Pizlo
no flags Details | Formatted Diff | Diff
better patch (98.46 KB, patch)
2016-12-10 20:29 PST, Filip Pizlo
ysuzuki: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2016-12-09 18:50:04 PST
I thought that I wouldn't need a barrier because of how the weak fixpoint works, but we still do need a barrier when root relationships change.

Fortunately, this isn't so difficult.

We could do a retreating wavefront barrier eventually, which would improve performance in programs that churn a lot of DOM things during an eden collection.  But I think that's a rare enough case - since eden collections are short - that I'm OK with not caring initially.

Thanks @rniwa for coming up with the algorithm on IRC.
Comment 1 Filip Pizlo 2016-12-09 19:03:54 PST
For reference:

[18:17:49]  <rniwa_>	ConatinerNode::notifyChildInserted might be a good place
[18:19:04]  <rniwa_>	hm.... maybe ContainerNode::appendChildCommon and ContainerNode::insertBeforeCommon
[18:19:35]  <rniwa_>	and probably ContainerNode::notifyChildRemoved
[18:19:59]  <rniwa_>	and there's a special code path for removing the entire document from GC
[18:20:21]  <rniwa_>	in ContainerNode::removeDetachedChildren()
[18:20:40]  <rniwa_>	but that doesn't call the functions I mentioned above so we should be fine
Comment 2 Filip Pizlo 2016-12-09 20:14:31 PST
Created attachment 296772 [details]
work in progress
Comment 3 Filip Pizlo 2016-12-09 22:32:50 PST
Created attachment 296781 [details]
more
Comment 4 Filip Pizlo 2016-12-10 10:15:10 PST
Created attachment 296805 [details]
more
Comment 5 Filip Pizlo 2016-12-10 18:55:58 PST
Created attachment 296838 [details]
the patch
Comment 6 WebKit Commit Bot 2016-12-10 18:58:28 PST
Attachment 296838 [details] did not pass style-queue:


ERROR: Source/WebCore/dom/ContainerNodeAlgorithms.cpp:106:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/WebCore/dom/ContainerNodeAlgorithms.cpp:158:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/WebCore/bindings/js/CommonVM.cpp:40:  g_commonVMOrNull is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/WebCore/bindings/js/CommonVM.cpp:41:  g_opaqueRootWriteBarrierEnabled is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/WebKit2/Shared/mac/WebMemorySampler.mac.mm:39:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Source/WebCore/page/cocoa/ResourceUsageOverlayCocoa.mm:33:  Bad include order. Mixing system and custom headers.  [build/include_order] [4]
Total errors found: 6 in 58 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 7 Filip Pizlo 2016-12-10 19:49:11 PST
Created attachment 296843 [details]
the patch
Comment 8 WebKit Commit Bot 2016-12-10 19:51:05 PST
Attachment 296843 [details] did not pass style-queue:


ERROR: Source/WebCore/dom/ContainerNodeAlgorithms.cpp:106:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/WebCore/dom/ContainerNodeAlgorithms.cpp:158:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/WebCore/bindings/js/CommonVM.cpp:40:  g_commonVMOrNull is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/WebCore/bindings/js/CommonVM.cpp:41:  g_opaqueRootWriteBarrierEnabled is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/WebKit2/Shared/mac/WebMemorySampler.mac.mm:39:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Source/WebCore/page/cocoa/ResourceUsageOverlayCocoa.mm:33:  Bad include order. Mixing system and custom headers.  [build/include_order] [4]
Total errors found: 6 in 57 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 9 Filip Pizlo 2016-12-10 20:29:19 PST
Created attachment 296845 [details]
better patch
Comment 10 WebKit Commit Bot 2016-12-10 20:31:41 PST
Attachment 296845 [details] did not pass style-queue:


ERROR: Source/WebCore/dom/ContainerNodeAlgorithms.cpp:106:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/WebCore/dom/ContainerNodeAlgorithms.cpp:158:  More than one command on the same line  [whitespace/newline] [4]
ERROR: Source/WebCore/bindings/js/CommonVM.cpp:40:  g_commonVMOrNull is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/WebCore/bindings/js/CommonVM.cpp:41:  g_opaqueRootWriteBarrierEnabled is incorrectly named. Don't use underscores in your identifier names.  [readability/naming/underscores] [4]
ERROR: Source/WebKit2/Shared/mac/WebMemorySampler.mac.mm:39:  Alphabetical sorting problem.  [build/include_order] [4]
ERROR: Source/WebCore/page/cocoa/ResourceUsageOverlayCocoa.mm:33:  Bad include order. Mixing system and custom headers.  [build/include_order] [4]
Total errors found: 6 in 59 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 11 Yusuke Suzuki 2016-12-11 09:48:18 PST
Comment on attachment 296845 [details]
better patch

View in context: https://bugs.webkit.org/attachment.cgi?id=296845&action=review

r=me

> Source/WebCore/ChangeLog:13
> +           1) DOM at start: D->X->Y

OK, this is just C++ pointers. So these references are not observed by GC.
We just track liveness of these things by using the opaque root.

> Source/WebCore/ChangeLog:14
> +           2) Mark X, X->visitChildren, addOpaqueRoot(D)

The problem is, when removing X now, Y's opaque root becomes X. But it is not in the remember set.
But this should be live since X is marked.

> Source/WebCore/ChangeLog:19
> +           1) DOM at start: D, X->Y

Now, Y's opaque root is X.

> Source/WebCore/ChangeLog:22
> +           4) Y thinks it's not reachable (its opaque root, D, is not in the set).

Now, Y's opaque root becomes D. But D is not in the set.
But since Y is reachable from X, this should be live.

> Source/WebCore/ChangeLog:27
> +           Removal: add X (the removed child) to the opaque root set.
> +           Insertion: add D (the insertion point) to the opaque root set.

To keep the liveness of the descendants of the X.

> Source/WebCore/dom/ContainerNodeAlgorithms.cpp:106
> +    writeBarrierOpaqueRoot([&insertionPoint] () -> void* { return insertionPoint.opaqueRoot(); });

OK, when the node is inserted, we should perform write barrier for the insertion point's opaque root.

> Source/WebCore/dom/ContainerNodeAlgorithms.cpp:158
> +    writeBarrierOpaqueRoot([&child] () -> void* { return &child; });

And when removing, we should perform writeBarrierOpaqueRoot for the removed child node itself.
Without it, the descendants of the child's opaque root, `child` is not in the set.
Comment 12 Filip Pizlo 2016-12-11 10:20:17 PST
Landed in https://trac.webkit.org/changeset/209683
Comment 13 Geoffrey Garen 2016-12-11 21:26:29 PST
I don't think this strategy can be correct. The opaque roots algorithm covers lots of relationships in the DOM, not just ConatinerNode relationships.
Comment 14 Filip Pizlo 2016-12-11 21:45:13 PST
(In reply to comment #13)
> I don't think this strategy can be correct. The opaque roots algorithm
> covers lots of relationships in the DOM, not just ConatinerNode
> relationships.

I'm not sure I follow.

We only need a barrier when an object's visitChildren method would change its mind about what it adds as an opaque root.  So long as a wrapper's visitChildren continues to report the same opaque root over the course of its lifetime, the GC needs no barrier.  The GC still evaluates weak references with the world stop, and still re-evaluates all of them every time it does an iteration.

It all comes down to what our JSBlahCustom.cpp code calls "root()".  I spot-checked one, maybe two, other places at the time of landing the patch and found that their "root()" was OK.  I don't doubt that I will find more ways in which root() and other things are either racy or need a barrier, and I'll fix them as I find them!
Comment 15 Geoffrey Garen 2016-12-12 08:50:01 PST
> It all comes down to what our JSBlahCustom.cpp code calls "root()".  I
> spot-checked one, maybe two, other places at the time of landing the patch
> and found that their "root()" was OK.  I don't doubt that I will find more
> ways in which root() and other things are either racy or need a barrier, and
> I'll fix them as I find them!

Yes, it's the other places I'm talking about -- namely, any visitChildren function that is not in the JSNode hierarchy.
Comment 16 Filip Pizlo 2016-12-12 09:02:13 PST
(In reply to comment #15)
> > It all comes down to what our JSBlahCustom.cpp code calls "root()".  I
> > spot-checked one, maybe two, other places at the time of landing the patch
> > and found that their "root()" was OK.  I don't doubt that I will find more
> > ways in which root() and other things are either racy or need a barrier, and
> > I'll fix them as I find them!
> 
> Yes, it's the other places I'm talking about -- namely, any visitChildren
> function that is not in the JSNode hierarchy.

When you said that you didn't think that the strategy can be correct, were you merely saying that this patch does not insert all of the barriers that we will eventually insert?  I agree that it most likely does not.

Or do you believe that this strategy will never ever work for some of the other visitChildren() methods because of a fundamental issue?
Comment 17 Filip Pizlo 2016-12-12 09:20:15 PST
(In reply to comment #16)
> (In reply to comment #15)
> > > It all comes down to what our JSBlahCustom.cpp code calls "root()".  I
> > > spot-checked one, maybe two, other places at the time of landing the patch
> > > and found that their "root()" was OK.  I don't doubt that I will find more
> > > ways in which root() and other things are either racy or need a barrier, and
> > > I'll fix them as I find them!
> > 
> > Yes, it's the other places I'm talking about -- namely, any visitChildren
> > function that is not in the JSNode hierarchy.
> 
> When you said that you didn't think that the strategy can be correct, were
> you merely saying that this patch does not insert all of the barriers that
> we will eventually insert?  I agree that it most likely does not.
> 
> Or do you believe that this strategy will never ever work for some of the
> other visitChildren() methods because of a fundamental issue?

BTW - if such a thing did exist, then I'd just give those visitChildren methods this API:

SlotVisitor::revisitOnEachFixpoint()

I just don't think I'll need it.
Comment 18 Geoffrey Garen 2016-12-12 10:27:43 PST
> Or do you believe that this strategy will never ever work for some of the
> other visitChildren() methods because of a fundamental issue?

Sorry, I should have been clearer: I think this strategy could work in most cases; and that we've missed some cases; and that some cases like JSManagedValue may never work with this strategy because the thing that would do the barrier is outside of our code base.
Comment 19 Filip Pizlo 2016-12-12 10:35:36 PST
(In reply to comment #18)
> > Or do you believe that this strategy will never ever work for some of the
> > other visitChildren() methods because of a fundamental issue?
> 
> Sorry, I should have been clearer: I think this strategy could work in most
> cases; and that we've missed some cases; and that some cases like
> JSManagedValue may never work with this strategy because the thing that
> would do the barrier is outside of our code base.

OK - that's not for off from what I expected.

I've got a patch coming for AudioTrack, which needs a very simple barrier.

JSManagedValue can use rescanOnEachIteration or whatever.
Comment 20 Geoffrey Garen 2016-12-12 11:38:48 PST
Here's an alternative design, which might simplify and save us from having to dig up the remaining missing barriers:

Change the API from

    virtual bool isReachableFromOpaqueRoots(Handle<Unknown>, void* context, SlotVisitor&);

to

    virtual void* opaqueRoot(Handle<Unknown>, void* context, SlotVisitor&);
    virtual bool isReachableForWeirdReasons(Handle<Unknown>, void* context, SlotVisitor&);

Then the weak pointer scan says:

(1) Observe weird reasons.

if (isReachableForWeirdReasons())
    mark object // for efficiency, real implementation just puts object on the grey list

This covers things that have nothing to do with object graph relationships and instead are "I am loading" or "I am playing audio" and things like that.

(2) Mark things if necessary:

o = opaqueRoot();
if (object is marked)
    add o to set
else if (o is in set and object is not marked)
    mark object // for efficiency, real implementation just puts object on the grey list

This algorithm is immune to mutations because it observes all opaque pointer relationships during the weak pointer scan, which runs with the world stopped.

I think I prefer this because it solves an immediate problem and it frees future DOM programmers from having to invent new kinds of barriers.
Comment 21 Filip Pizlo 2016-12-12 11:54:03 PST
(In reply to comment #20)
> Here's an alternative design, which might simplify and save us from having
> to dig up the remaining missing barriers:
> 
> Change the API from
> 
>     virtual bool isReachableFromOpaqueRoots(Handle<Unknown>, void* context,
> SlotVisitor&);
> 
> to
> 
>     virtual void* opaqueRoot(Handle<Unknown>, void* context, SlotVisitor&);

What does this do?

>     virtual bool isReachableForWeirdReasons(Handle<Unknown>, void* context,
> SlotVisitor&);
> 
> Then the weak pointer scan says:
> 
> (1) Observe weird reasons.
> 
> if (isReachableForWeirdReasons())
>     mark object // for efficiency, real implementation just puts object on
> the grey list
> 
> This covers things that have nothing to do with object graph relationships
> and instead are "I am loading" or "I am playing audio" and things like that.

So, this is the weak scan?

> 
> (2) Mark things if necessary:
> 
> o = opaqueRoot();
> if (object is marked)
>     add o to set
> else if (o is in set and object is not marked)
>     mark object // for efficiency, real implementation just puts object on
> the grey list

When does this algorithm run?  Is this a barrier?

> 
> This algorithm is immune to mutations because it observes all opaque pointer
> relationships during the weak pointer scan, which runs with the world
> stopped.

That's what rescanOnEachIteration would give to any class that wanted to opt into it.

Another form of this API might be:
- addOpaqueRoot puts the caller into the rescan: his visitChildren will get called on each weak fixpoint iteration.
- addOpaqueRootWithoutRescan behaves like addOpaqueRoot does now, and requires you to use a barrier.

Using rescan means that the weak scan to has to both evaluate what to put into the opaque set (execute what now happens in visitChildren) and evaluate what to mark based on what is in the opaque set (execute what now happens in weak visiting).  In my simple approach, the cost would probably be 50/50 between the stuff you have to do to all visitChildren and the hashtable lookup itself.

> 
> I think I prefer this because it solves an immediate problem and it frees
> future DOM programmers from having to invent new kinds of barriers.

I think that I'm implementing a superset of what you are proposing, in the sense that one of the modes does not require any changes to how DOM code gets written.  I want both modes because I want to measure the difference between them by changing which visitChildren methods opt into the rescan.  I have a hunch that we'll end up with something like this:

- Most code uses the default addOpaqueRoot, which puts you into the rescan so you're safe.
- Node hierarchy uses addOpaqueRootWithoutRescan to get the barrier-based algorithm.
- Some other DOM things get the barrier-based algorithm because they either have no barriers or the barrier is trivial.

I don't really know how your proposed algorithm stacks up, because I don't quite grok what it does.
Comment 22 Geoffrey Garen 2016-12-12 13:24:49 PST
> >     virtual void* opaqueRoot(Handle<Unknown>, void* context, SlotVisitor&);
> 
> What does this do?

It returns the object's opaque root. Usually like so:

return root(impl())

> So, this is the weak scan?

This algorithm runs, with the world stopped, when we scan WeakBlocks.

> > (2) Mark things if necessary:
> > 
> > o = opaqueRoot();
> > if (object is marked)
> >     add o to set
> > else if (o is in set and object is not marked)
> >     mark object // for efficiency, real implementation just puts object on
> > the grey list
> 
> When does this algorithm run?  Is this a barrier?

This algorithm runs, with the world stopped, when we scan WeakBlocks.

This is not a barrier.

> That's what rescanOnEachIteration would give to any class that wanted to opt
> into it.

I agree that it is also possible to apply this algorithm to a fraction of objects, and apply barriers to the other fraction. I also agree that it's possible to implement a similar algorithm by just calling visitChildren again -- although that has the additional cost of visiting a bunch of stuff we don't need to visit.

> Another form of this API might be:
> - addOpaqueRoot puts the caller into the rescan: his visitChildren will get
> called on each weak fixpoint iteration.
> - addOpaqueRootWithoutRescan behaves like addOpaqueRoot does now, and
> requires you to use a barrier.

Seems reasonable. I think it's a clear improvement to make the no-barrier behavior the default.

> > I think I prefer this because it solves an immediate problem and it frees
> > future DOM programmers from having to invent new kinds of barriers.
> 
> I think that I'm implementing a superset of what you are proposing, in the
> sense that one of the modes does not require any changes to how DOM code
> gets written.  I want both modes because I want to measure the difference
> between them by changing which visitChildren methods opt into the rescan.

Seems reasonable to benchmark the alternatives.

> I don't really know how your proposed algorithm stacks up, because I don't
> quite grok what it does.

It's the equivalent of rescanOnEachIteration for all objects, with the cost savings of not doing other visitChildren things and the clarity advantage of providing an API where DOM programmers just need to write a function to identify a root, rather than a full visitChildren function.
Comment 23 Filip Pizlo 2016-12-12 14:45:39 PST
OK - I've investigating what the DOM does some more.

I think that I'm going to just make addOpaqueRoot do rescanning.  And I'll remove the opaque root barrier.  Then I'll measure everything.

I don't want to devote effort to optimizing this if it costs nothing anyway!
Comment 24 Filip Pizlo 2016-12-12 14:47:37 PST
Tracked here:
Comment 25 Filip Pizlo 2016-12-12 14:47:47 PST
(In reply to comment #24)
> Tracked here:

https://bugs.webkit.org/show_bug.cgi?id=165760