Bug 172548 - ObjectToStringAdaptiveInferredPropertyValueWatchpoint should not reinstall itself nor handleFire if it's dying shortly.
Summary: ObjectToStringAdaptiveInferredPropertyValueWatchpoint should not reinstall it...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Lam
URL:
Keywords: InRadar
Depends on:
Blocks: 172549
  Show dependency treegraph
 
Reported: 2017-05-24 11:12 PDT by Mark Lam
Modified: 2017-05-25 10:58 PDT (History)
7 users (show)

See Also:


Attachments
proposed patch. (12.33 KB, patch)
2017-05-24 11:28 PDT, Mark Lam
mark.lam: review-
buildbot: commit-queue-
Details | Formatted Diff | Diff
proposed patch. (16.64 KB, patch)
2017-05-24 15:47 PDT, Mark Lam
fpizlo: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Lam 2017-05-24 11:12:30 PDT
Consider the following scenario:

1. A ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1, watches for structure transitions, e.g. structure S2 transitioning to structure S3.  In this case, O1 would be installed in S2's watchpoint set.
2. When the structure transition happens, structure S2 will fire watchpoint O1.
3. O1's handler will normally re-install itself in the watchpoint set of the new "transitioned to" structure S3.
4. "Installation" here requires writing into the StructureRareData SD3 of the new structure S3.  If SD3 does not exist yet, the installation process will trigger the allocation of StructureRareData SD3.
5. It is possible that the Structure S1, and StructureRareData SD1 that owns the ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1 is no longer reachable by the GC, and therefore will be collected soon.
6. The allocation of SD3 in (4) may trigger the sweeping of the StructureRareData SD1.  This, in turn, triggers the deletion of the ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1.

After O1 is deleted in (6) and SD3 is allocated in (4), execution continues in AdaptiveInferredPropertyValueWatchpointBase::fire() where O1 gets installed in structure S3's watchpoint set.  This is obviously incorrect because O1 is already deleted.  The result is that badness happens later when S3's watchpoint set fires its watchpoints and accesses the deleted O1.

The fix is to enhance AdaptiveInferredPropertyValueWatchpointBase::fire() to check if "this" is still valid before proceeding to re-install itself or to invoke its handleFire() method.  ObjectToStringAdaptiveInferredPropertyValueWatchpoint (which extends AdaptiveInferredPropertyValueWatchpointBase) will override its isValid() method, and return false its owner StructureRareData is no longer reachable by the GC.  This ensures that it won't be deleted while it's installed to any watchpoint set.

Additional considerations and notes:
1. In the above, I talked about the ObjectToStringAdaptiveInferredPropertyValueWatchpoint being installed in watchpoint sets.  What actually happens is that ObjectToStringAdaptiveInferredPropertyValueWatchpoint has 2 members (m_structureWatchpoint and m_propertyWatchpoint) which may be installed in watchpoint sets.  The ObjectToStringAdaptiveInferredPropertyValueWatchpoint is not itself a Watchpoint object.  But for brevity, in the above, I refer to the ObjectToStringAdaptiveInferredPropertyValueWatchpoint instead of its Watchpoint members.  The description of the issue is still accurate given the life-cycle of the Watchpoint members are embedded in the enclosing ObjectToStringAdaptiveInferredPropertyValueWatchpoint object, and hence, they share the same life-cycle.

2. The top of AdaptiveInferredPropertyValueWatchpointBase::fire() removes its m_structureWatchpoint and m_propertyWatchpoint if they have been added to any watchpoint sets.  This is safe to do even if the owner StructureRareData is no longer reachable by the GC.  This is because the only way we can get to AdaptiveInferredPropertyValueWatchpointBase::fire() is if its Watchpoint members are still installed in some watchpoint set that fired.  This means that the AdaptiveInferredPropertyValueWatchpointBase instance has not been deleted yet, because its destructor will automatically remove the Watchpoint members from any watchpoint sets.

<rdar://problem/31458393>
Comment 1 Mark Lam 2017-05-24 11:28:24 PDT
Created attachment 311139 [details]
proposed patch.
Comment 2 Build Bot 2017-05-24 12:12:05 PDT
Comment on attachment 311139 [details]
proposed patch.

Attachment 311139 [details] did not pass jsc-ews (mac):
Output: http://webkit-queues.webkit.org/results/3808194

New failing tests:
stress/symbol-tostringtag-watchpoints.js.dfg-eager-no-cjit-validate
stress/symbol-tostringtag-watchpoints.js.ftl-eager
stress/symbol-tostringtag-watchpoints.js.no-llint
stress/symbol-tostringtag-watchpoints.js.dfg-maximal-flush-validate-no-cjit
stress/symbol-tostringtag-watchpoints.js.ftl-no-cjit-validate-sampling-profiler
stress/symbol-tostringtag-watchpoints.js.ftl-no-cjit-b3o1
stress/symbol-tostringtag-watchpoints.js.no-cjit-validate-phases
stress/symbol-tostringtag-watchpoints.js.ftl-no-cjit-no-inline-validate
stress/symbol-tostringtag-watchpoints.js.ftl-eager-no-cjit-b3o1
stress/symbol-tostringtag-watchpoints.js.no-ftl
stress/symbol-tostringtag-watchpoints.js.ftl-no-cjit-no-put-stack-validate
stress/symbol-tostringtag-watchpoints.js.default
stress/symbol-tostringtag-watchpoints.js.ftl-no-cjit-small-pool
apiTests
Comment 3 Mark Lam 2017-05-24 12:37:32 PDT
Comment on attachment 311139 [details]
proposed patch.

Tests found bugs. Will fix.
Comment 4 Mark Lam 2017-05-24 15:47:50 PDT
Created attachment 311163 [details]
proposed patch.
Comment 5 Saam Barati 2017-05-24 19:40:20 PDT
Comment on attachment 311163 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=311163&action=review

> Source/JavaScriptCore/ChangeLog:25
> +        5. It is possible that the Structure S1, and StructureRareData SD1 that owns the
> +           ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1 is no longer reachable
> +           by the GC, and therefore will be collected soon.
> +        6. The allocation of SD3 in (4) may trigger the sweeping of the StructureRareData
> +           SD1.  This, in turn, triggers the deletion of the
> +           ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1.

What is SD1 here? How is it relevant? You're saying we're doing "S2->S3" transition.

Also, why can't this lifetime issue just be solved with moving the ObjectToStringAdaptiveInferredPropertyValueWatchpoint out of the rare data?

I don't get how this thing that was proven unreachable is actually reachable. Can you explain that a bit more?
Comment 6 Filip Pizlo 2017-05-24 19:59:22 PDT
(In reply to Saam Barati from comment #5)
> Comment on attachment 311163 [details]
> proposed patch.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=311163&action=review
> 
> > Source/JavaScriptCore/ChangeLog:25
> > +        5. It is possible that the Structure S1, and StructureRareData SD1 that owns the
> > +           ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1 is no longer reachable
> > +           by the GC, and therefore will be collected soon.
> > +        6. The allocation of SD3 in (4) may trigger the sweeping of the StructureRareData
> > +           SD1.  This, in turn, triggers the deletion of the
> > +           ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1.
> 
> What is SD1 here? How is it relevant? You're saying we're doing "S2->S3"
> transition.
> 
> Also, why can't this lifetime issue just be solved with moving the
> ObjectToStringAdaptiveInferredPropertyValueWatchpoint out of the rare data?

The problem is that isWatchable() allocates and so it may call the rare data's destructor.

Say that you have the ObjectToStringAdaptiveInferredPropertyValueWatchpoint be a separate object owned by the rare data.  Then when we do isWatchable(), the rare data destructor will run, and the watchpoint will be deleted.  That's still bad.

You could ref-count the ObjectToStringAdaptiveInferredPropertyValueWatchpoint, and it could ref itself before calling isWatchable, but I somewhat prefer being able to answer this via GC.  I want objects to be allowed to ask: "am I really alive?"

> 
> I don't get how this thing that was proven unreachable is actually
> reachable. Can you explain that a bit more?
Comment 7 Saam Barati 2017-05-24 20:05:08 PDT
Can somebody explain to me exactly what the object graph is in this example? The structures in question, their prototype fields, etc. I’m a bit confused still. 

I feel like we’re missing an easier solution
Comment 8 Filip Pizlo 2017-05-24 20:06:22 PDT
Comment on attachment 311163 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=311163&action=review

> Source/JavaScriptCore/heap/FreeList.cpp:48
> +bool FreeList::contains(const void* target) const
> +{
> +    if (remaining) {
> +        const void* start = (payloadEnd - remaining);
> +        const void* end = payloadEnd;
> +        return (start <= target) && (target < end);
> +    }
> +
> +    FreeCell* candidate = head;
> +    while (candidate) {
> +        if (candidate == target)
> +            return true;
> +        candidate = candidate->next;
> +    }
> +
> +    return false;
> +}
> +

Nice.

> Source/JavaScriptCore/heap/HeapCellInlines.h:42
> +    if (markedBlockHandle.isFreeListed())
> +        return !markedBlockHandle.isFreeListedCell(this);

I think you commented to me earlier that we could say that free-listed objects are "live".  I get now that this would be correct for the purposes of your patch, but would just be inconsistent with the GC's lingo (in other contexts it's important to know that you're not "live" if you're still on the free list).

Maybe we want to eventually add something like HeapCell::isPendingDestruction(), which could just be:

if (isLargeAllocation())
    return !largeAllocation().isLive();
auto& markedBlockHandle = markedBlock().handle();
if (markedBlockhandle.isFreeListed())
    return false;
return !markedBlockHandle.isLive(this);

Basically, we'd say that isPendingDestruction() can only be called on objects that are either live right now, or objects that ceased to be live but have not yet been destructed.

You don't have to make this change.  It's just an observation.
Comment 9 Filip Pizlo 2017-05-24 20:10:06 PDT
(In reply to Saam Barati from comment #7)
> Can somebody explain to me exactly what the object graph is in this example?
> The structures in question, their prototype fields, etc. I’m a bit confused
> still. 
> 
> I feel like we’re missing an easier solution

Structure S has a rare data R.

Rare data R has a watchpoint W.

Watchpoint W is installed in a watchpoint set in structure T.

T != S.

Then:

1) GC happens, marks T but not S or R.  Note that S and R are dead but not destructed.  Therefore W is still installed in T.
2) T does something that fires W.
3) W calls isWatchable(), which allocates, which finally calls R's destructor - but not before we allocate another rare data right over it.
4) Heap corruption - the destruction of R, and the removal of W, overwrites a newly allocated rare data.

One way to view this problem is that watchpoints shouldn't allocate.

Another way to view this problem is that watchpoints owned by GC objects shouldn't allocate.

An even more specific way to view this problem is that watchpoints owned by GC objects shouldn't do anything if those objects are pending destruction.
Comment 10 Saam Barati 2017-05-24 22:54:04 PDT
(In reply to Filip Pizlo from comment #9)
> (In reply to Saam Barati from comment #7)
> > Can somebody explain to me exactly what the object graph is in this example?
> > The structures in question, their prototype fields, etc. I’m a bit confused
> > still. 
> > 
> > I feel like we’re missing an easier solution
> 
> Structure S has a rare data R.
> 
> Rare data R has a watchpoint W.
> 
> Watchpoint W is installed in a watchpoint set in structure T.
> 
> T != S.
> 
> Then:
> 
> 1) GC happens, marks T but not S or R.  Note that S and R are dead but not
> destructed.  Therefore W is still installed in T.
> 2) T does something that fires W.
> 3) W calls isWatchable(), which allocates, which finally calls R's
> destructor - but not before we allocate another rare data right over it.
> 4) Heap corruption - the destruction of R, and the removal of W, overwrites
> a newly allocated rare data.
> 
> One way to view this problem is that watchpoints shouldn't allocate.
> 
> Another way to view this problem is that watchpoints owned by GC objects
> shouldn't allocate.
> 
> An even more specific way to view this problem is that watchpoints owned by
> GC objects shouldn't do anything if those objects are pending destruction.
This makes sense. Thanks for explaining. I see exactly what’s happening.

This patch employs this fix only for this particular adaptive watchpoint. Why don’t we apply this to all of them. My suggestion would be: adaptive watchpoints take an optional JSCell that represents their owner. If the owner is null, they’re always valid. Otherwise, they’re only valid if their owner is live.
Comment 11 Keith Miller 2017-05-24 23:04:57 PDT
Interesting I was under the impression we deferred GC while relocating watchpoints. I guess that was a bad assumption...
Comment 12 Keith Miller 2017-05-24 23:09:49 PDT
I still kinda feel like deferring GC when firing watchpoints would be a more comprehensive solution. Since I think this problem could happen with any adaptive watchpoint.
Comment 13 Saam Barati 2017-05-24 23:18:41 PDT
(In reply to Keith Miller from comment #12)
> I still kinda feel like deferring GC when firing watchpoints would be a more
> comprehensive solution. Since I think this problem could happen with any
> adaptive watchpoint.

I don’t think this solves the issue since we’ve already done the GC at this point. Does DeferGC guarantee we won’t destruct something on the free list?
Comment 14 Filip Pizlo 2017-05-25 08:51:50 PDT
(In reply to Keith Miller from comment #12)
> I still kinda feel like deferring GC when firing watchpoints would be a more
> comprehensive solution. Since I think this problem could happen with any
> adaptive watchpoint.

Deferring GC does not prevent sweeping.

The bug is sweeping, not GC.
Comment 15 Filip Pizlo 2017-05-25 08:52:13 PDT
(In reply to Saam Barati from comment #13)
> (In reply to Keith Miller from comment #12)
> > I still kinda feel like deferring GC when firing watchpoints would be a more
> > comprehensive solution. Since I think this problem could happen with any
> > adaptive watchpoint.
> 
> I don’t think this solves the issue since we’ve already done the GC at this
> point. Does DeferGC guarantee we won’t destruct something on the free list?

No. There is no way to guarantee that we don't destruct when allocating.
Comment 16 Filip Pizlo 2017-05-25 09:04:54 PDT
(In reply to Saam Barati from comment #10)
> (In reply to Filip Pizlo from comment #9)
> > (In reply to Saam Barati from comment #7)
> > > Can somebody explain to me exactly what the object graph is in this example?
> > > The structures in question, their prototype fields, etc. I’m a bit confused
> > > still. 
> > > 
> > > I feel like we’re missing an easier solution
> > 
> > Structure S has a rare data R.
> > 
> > Rare data R has a watchpoint W.
> > 
> > Watchpoint W is installed in a watchpoint set in structure T.
> > 
> > T != S.
> > 
> > Then:
> > 
> > 1) GC happens, marks T but not S or R.  Note that S and R are dead but not
> > destructed.  Therefore W is still installed in T.
> > 2) T does something that fires W.
> > 3) W calls isWatchable(), which allocates, which finally calls R's
> > destructor - but not before we allocate another rare data right over it.
> > 4) Heap corruption - the destruction of R, and the removal of W, overwrites
> > a newly allocated rare data.
> > 
> > One way to view this problem is that watchpoints shouldn't allocate.
> > 
> > Another way to view this problem is that watchpoints owned by GC objects
> > shouldn't allocate.
> > 
> > An even more specific way to view this problem is that watchpoints owned by
> > GC objects shouldn't do anything if those objects are pending destruction.
> This makes sense. Thanks for explaining. I see exactly what’s happening.
> 
> This patch employs this fix only for this particular adaptive watchpoint.
> Why don’t we apply this to all of them. My suggestion would be: adaptive
> watchpoints take an optional JSCell that represents their owner. If the
> owner is null, they’re always valid. Otherwise, they’re only valid if their
> owner is live.

This would only be better if the adaptive watchpoints that had an owner didn't already point to that owner in some way.

I do wonder how many other adaptive watchpoints have this bug.  Probably not many.  The reason why this all worked before is that watchpoints didn't allocate.  I think that there are a bunch of watchpoints that "just work" because of the lack of allocation.

I think that if we wanted to make this code make more sense, then we would teach Watchpoint about its m_owner. This would add some space bloat, but I think it would be worth it. Then watchpoints would not do anything if the owner was dead.

If we did this, then maybe we could also introduce the policy that watchpoints get fired when their owner dies. Perhaps it could be a special kind of firing - maybe a separate virtual method just for this purpose. If we were willing to make Watchpoints be JSCells, then we could trivially do this (allocate Watchpoints in a special Subspace that does a finalization action that fires all watchpoints whose owners died - note that doing it as a finalization action ensures that it does not happen during sweep.  The reason why we currently don't fire watchpoints when owners die is that we don't want to fire watchpoints while sweeping - that's even worse than sweeping while firing watchpoints.).
Comment 17 Filip Pizlo 2017-05-25 09:06:44 PDT
(In reply to Filip Pizlo from comment #16)
> (In reply to Saam Barati from comment #10)
> > (In reply to Filip Pizlo from comment #9)
> > > (In reply to Saam Barati from comment #7)
> > > > Can somebody explain to me exactly what the object graph is in this example?
> > > > The structures in question, their prototype fields, etc. I’m a bit confused
> > > > still. 
> > > > 
> > > > I feel like we’re missing an easier solution
> > > 
> > > Structure S has a rare data R.
> > > 
> > > Rare data R has a watchpoint W.
> > > 
> > > Watchpoint W is installed in a watchpoint set in structure T.
> > > 
> > > T != S.
> > > 
> > > Then:
> > > 
> > > 1) GC happens, marks T but not S or R.  Note that S and R are dead but not
> > > destructed.  Therefore W is still installed in T.
> > > 2) T does something that fires W.
> > > 3) W calls isWatchable(), which allocates, which finally calls R's
> > > destructor - but not before we allocate another rare data right over it.
> > > 4) Heap corruption - the destruction of R, and the removal of W, overwrites
> > > a newly allocated rare data.
> > > 
> > > One way to view this problem is that watchpoints shouldn't allocate.
> > > 
> > > Another way to view this problem is that watchpoints owned by GC objects
> > > shouldn't allocate.
> > > 
> > > An even more specific way to view this problem is that watchpoints owned by
> > > GC objects shouldn't do anything if those objects are pending destruction.
> > This makes sense. Thanks for explaining. I see exactly what’s happening.
> > 
> > This patch employs this fix only for this particular adaptive watchpoint.
> > Why don’t we apply this to all of them. My suggestion would be: adaptive
> > watchpoints take an optional JSCell that represents their owner. If the
> > owner is null, they’re always valid. Otherwise, they’re only valid if their
> > owner is live.
> 
> This would only be better if the adaptive watchpoints that had an owner
> didn't already point to that owner in some way.
> 
> I do wonder how many other adaptive watchpoints have this bug.  Probably not
> many.  The reason why this all worked before is that watchpoints didn't
> allocate.  I think that there are a bunch of watchpoints that "just work"
> because of the lack of allocation.
> 
> I think that if we wanted to make this code make more sense, then we would
> teach Watchpoint about its m_owner. This would add some space bloat, but I
> think it would be worth it. Then watchpoints would not do anything if the
> owner was dead.
> 
> If we did this, then maybe we could also introduce the policy that
> watchpoints get fired when their owner dies. Perhaps it could be a special
> kind of firing - maybe a separate virtual method just for this purpose. If
> we were willing to make Watchpoints be JSCells, then we could trivially do
> this (allocate Watchpoints in a special Subspace that does a finalization
> action that fires all watchpoints whose owners died - note that doing it as
> a finalization action ensures that it does not happen during sweep.  The
> reason why we currently don't fire watchpoints when owners die is that we
> don't want to fire watchpoints while sweeping - that's even worse than
> sweeping while firing watchpoints.).

Anyway - I still think we should land Mark's fix. It's a step in the right direction.
Comment 18 Mark Lam 2017-05-25 10:04:01 PDT
Thanks for the review.  Landed in r217429: <http://trac.webkit.org/r217429>.
Comment 19 Saam Barati 2017-05-25 10:58:36 PDT
(In reply to Filip Pizlo from comment #17)
> (In reply to Filip Pizlo from comment #16)
> > (In reply to Saam Barati from comment #10)
> > > (In reply to Filip Pizlo from comment #9)
> > > > (In reply to Saam Barati from comment #7)
> > > > > Can somebody explain to me exactly what the object graph is in this example?
> > > > > The structures in question, their prototype fields, etc. I’m a bit confused
> > > > > still. 
> > > > > 
> > > > > I feel like we’re missing an easier solution
> > > > 
> > > > Structure S has a rare data R.
> > > > 
> > > > Rare data R has a watchpoint W.
> > > > 
> > > > Watchpoint W is installed in a watchpoint set in structure T.
> > > > 
> > > > T != S.
> > > > 
> > > > Then:
> > > > 
> > > > 1) GC happens, marks T but not S or R.  Note that S and R are dead but not
> > > > destructed.  Therefore W is still installed in T.
> > > > 2) T does something that fires W.
> > > > 3) W calls isWatchable(), which allocates, which finally calls R's
> > > > destructor - but not before we allocate another rare data right over it.
> > > > 4) Heap corruption - the destruction of R, and the removal of W, overwrites
> > > > a newly allocated rare data.
> > > > 
> > > > One way to view this problem is that watchpoints shouldn't allocate.
> > > > 
> > > > Another way to view this problem is that watchpoints owned by GC objects
> > > > shouldn't allocate.
> > > > 
> > > > An even more specific way to view this problem is that watchpoints owned by
> > > > GC objects shouldn't do anything if those objects are pending destruction.
> > > This makes sense. Thanks for explaining. I see exactly what’s happening.
> > > 
> > > This patch employs this fix only for this particular adaptive watchpoint.
> > > Why don’t we apply this to all of them. My suggestion would be: adaptive
> > > watchpoints take an optional JSCell that represents their owner. If the
> > > owner is null, they’re always valid. Otherwise, they’re only valid if their
> > > owner is live.
> > 
> > This would only be better if the adaptive watchpoints that had an owner
> > didn't already point to that owner in some way.
> > 
> > I do wonder how many other adaptive watchpoints have this bug.  Probably not
> > many.  The reason why this all worked before is that watchpoints didn't
> > allocate.  I think that there are a bunch of watchpoints that "just work"
> > because of the lack of allocation.
> > 
> > I think that if we wanted to make this code make more sense, then we would
> > teach Watchpoint about its m_owner. This would add some space bloat, but I
> > think it would be worth it. Then watchpoints would not do anything if the
> > owner was dead.
> > 
> > If we did this, then maybe we could also introduce the policy that
> > watchpoints get fired when their owner dies. Perhaps it could be a special
> > kind of firing - maybe a separate virtual method just for this purpose. If
> > we were willing to make Watchpoints be JSCells, then we could trivially do
> > this (allocate Watchpoints in a special Subspace that does a finalization
> > action that fires all watchpoints whose owners died - note that doing it as
> > a finalization action ensures that it does not happen during sweep.  The
> > reason why we currently don't fire watchpoints when owners die is that we
> > don't want to fire watchpoints while sweeping - that's even worse than
> > sweeping while firing watchpoints.).
> 
> Anyway - I still think we should land Mark's fix. It's a step in the right
> direction.
I agree. I also think it’s worth auditing other watchpoints for this exact bug. We can either employ a similar solution as we did here or one of your other suggested fixes.