Consider the following scenario: 1. A ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1, watches for structure transitions, e.g. structure S2 transitioning to structure S3. In this case, O1 would be installed in S2's watchpoint set. 2. When the structure transition happens, structure S2 will fire watchpoint O1. 3. O1's handler will normally re-install itself in the watchpoint set of the new "transitioned to" structure S3. 4. "Installation" here requires writing into the StructureRareData SD3 of the new structure S3. If SD3 does not exist yet, the installation process will trigger the allocation of StructureRareData SD3. 5. It is possible that the Structure S1, and StructureRareData SD1 that owns the ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1 is no longer reachable by the GC, and therefore will be collected soon. 6. The allocation of SD3 in (4) may trigger the sweeping of the StructureRareData SD1. This, in turn, triggers the deletion of the ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1. After O1 is deleted in (6) and SD3 is allocated in (4), execution continues in AdaptiveInferredPropertyValueWatchpointBase::fire() where O1 gets installed in structure S3's watchpoint set. This is obviously incorrect because O1 is already deleted. The result is that badness happens later when S3's watchpoint set fires its watchpoints and accesses the deleted O1. The fix is to enhance AdaptiveInferredPropertyValueWatchpointBase::fire() to check if "this" is still valid before proceeding to re-install itself or to invoke its handleFire() method. ObjectToStringAdaptiveInferredPropertyValueWatchpoint (which extends AdaptiveInferredPropertyValueWatchpointBase) will override its isValid() method, and return false its owner StructureRareData is no longer reachable by the GC. This ensures that it won't be deleted while it's installed to any watchpoint set. Additional considerations and notes: 1. In the above, I talked about the ObjectToStringAdaptiveInferredPropertyValueWatchpoint being installed in watchpoint sets. What actually happens is that ObjectToStringAdaptiveInferredPropertyValueWatchpoint has 2 members (m_structureWatchpoint and m_propertyWatchpoint) which may be installed in watchpoint sets. The ObjectToStringAdaptiveInferredPropertyValueWatchpoint is not itself a Watchpoint object. But for brevity, in the above, I refer to the ObjectToStringAdaptiveInferredPropertyValueWatchpoint instead of its Watchpoint members. The description of the issue is still accurate given the life-cycle of the Watchpoint members are embedded in the enclosing ObjectToStringAdaptiveInferredPropertyValueWatchpoint object, and hence, they share the same life-cycle. 2. The top of AdaptiveInferredPropertyValueWatchpointBase::fire() removes its m_structureWatchpoint and m_propertyWatchpoint if they have been added to any watchpoint sets. This is safe to do even if the owner StructureRareData is no longer reachable by the GC. This is because the only way we can get to AdaptiveInferredPropertyValueWatchpointBase::fire() is if its Watchpoint members are still installed in some watchpoint set that fired. This means that the AdaptiveInferredPropertyValueWatchpointBase instance has not been deleted yet, because its destructor will automatically remove the Watchpoint members from any watchpoint sets. <rdar://problem/31458393>
Created attachment 311139 [details] proposed patch.
Comment on attachment 311139 [details] proposed patch. Attachment 311139 [details] did not pass jsc-ews (mac): Output: http://webkit-queues.webkit.org/results/3808194 New failing tests: stress/symbol-tostringtag-watchpoints.js.dfg-eager-no-cjit-validate stress/symbol-tostringtag-watchpoints.js.ftl-eager stress/symbol-tostringtag-watchpoints.js.no-llint stress/symbol-tostringtag-watchpoints.js.dfg-maximal-flush-validate-no-cjit stress/symbol-tostringtag-watchpoints.js.ftl-no-cjit-validate-sampling-profiler stress/symbol-tostringtag-watchpoints.js.ftl-no-cjit-b3o1 stress/symbol-tostringtag-watchpoints.js.no-cjit-validate-phases stress/symbol-tostringtag-watchpoints.js.ftl-no-cjit-no-inline-validate stress/symbol-tostringtag-watchpoints.js.ftl-eager-no-cjit-b3o1 stress/symbol-tostringtag-watchpoints.js.no-ftl stress/symbol-tostringtag-watchpoints.js.ftl-no-cjit-no-put-stack-validate stress/symbol-tostringtag-watchpoints.js.default stress/symbol-tostringtag-watchpoints.js.ftl-no-cjit-small-pool apiTests
Comment on attachment 311139 [details] proposed patch. Tests found bugs. Will fix.
Created attachment 311163 [details] proposed patch.
Comment on attachment 311163 [details] proposed patch. View in context: https://bugs.webkit.org/attachment.cgi?id=311163&action=review > Source/JavaScriptCore/ChangeLog:25 > + 5. It is possible that the Structure S1, and StructureRareData SD1 that owns the > + ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1 is no longer reachable > + by the GC, and therefore will be collected soon. > + 6. The allocation of SD3 in (4) may trigger the sweeping of the StructureRareData > + SD1. This, in turn, triggers the deletion of the > + ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1. What is SD1 here? How is it relevant? You're saying we're doing "S2->S3" transition. Also, why can't this lifetime issue just be solved with moving the ObjectToStringAdaptiveInferredPropertyValueWatchpoint out of the rare data? I don't get how this thing that was proven unreachable is actually reachable. Can you explain that a bit more?
(In reply to Saam Barati from comment #5) > Comment on attachment 311163 [details] > proposed patch. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=311163&action=review > > > Source/JavaScriptCore/ChangeLog:25 > > + 5. It is possible that the Structure S1, and StructureRareData SD1 that owns the > > + ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1 is no longer reachable > > + by the GC, and therefore will be collected soon. > > + 6. The allocation of SD3 in (4) may trigger the sweeping of the StructureRareData > > + SD1. This, in turn, triggers the deletion of the > > + ObjectToStringAdaptiveInferredPropertyValueWatchpoint O1. > > What is SD1 here? How is it relevant? You're saying we're doing "S2->S3" > transition. > > Also, why can't this lifetime issue just be solved with moving the > ObjectToStringAdaptiveInferredPropertyValueWatchpoint out of the rare data? The problem is that isWatchable() allocates and so it may call the rare data's destructor. Say that you have the ObjectToStringAdaptiveInferredPropertyValueWatchpoint be a separate object owned by the rare data. Then when we do isWatchable(), the rare data destructor will run, and the watchpoint will be deleted. That's still bad. You could ref-count the ObjectToStringAdaptiveInferredPropertyValueWatchpoint, and it could ref itself before calling isWatchable, but I somewhat prefer being able to answer this via GC. I want objects to be allowed to ask: "am I really alive?" > > I don't get how this thing that was proven unreachable is actually > reachable. Can you explain that a bit more?
Can somebody explain to me exactly what the object graph is in this example? The structures in question, their prototype fields, etc. I’m a bit confused still. I feel like we’re missing an easier solution
Comment on attachment 311163 [details] proposed patch. View in context: https://bugs.webkit.org/attachment.cgi?id=311163&action=review > Source/JavaScriptCore/heap/FreeList.cpp:48 > +bool FreeList::contains(const void* target) const > +{ > + if (remaining) { > + const void* start = (payloadEnd - remaining); > + const void* end = payloadEnd; > + return (start <= target) && (target < end); > + } > + > + FreeCell* candidate = head; > + while (candidate) { > + if (candidate == target) > + return true; > + candidate = candidate->next; > + } > + > + return false; > +} > + Nice. > Source/JavaScriptCore/heap/HeapCellInlines.h:42 > + if (markedBlockHandle.isFreeListed()) > + return !markedBlockHandle.isFreeListedCell(this); I think you commented to me earlier that we could say that free-listed objects are "live". I get now that this would be correct for the purposes of your patch, but would just be inconsistent with the GC's lingo (in other contexts it's important to know that you're not "live" if you're still on the free list). Maybe we want to eventually add something like HeapCell::isPendingDestruction(), which could just be: if (isLargeAllocation()) return !largeAllocation().isLive(); auto& markedBlockHandle = markedBlock().handle(); if (markedBlockhandle.isFreeListed()) return false; return !markedBlockHandle.isLive(this); Basically, we'd say that isPendingDestruction() can only be called on objects that are either live right now, or objects that ceased to be live but have not yet been destructed. You don't have to make this change. It's just an observation.
(In reply to Saam Barati from comment #7) > Can somebody explain to me exactly what the object graph is in this example? > The structures in question, their prototype fields, etc. I’m a bit confused > still. > > I feel like we’re missing an easier solution Structure S has a rare data R. Rare data R has a watchpoint W. Watchpoint W is installed in a watchpoint set in structure T. T != S. Then: 1) GC happens, marks T but not S or R. Note that S and R are dead but not destructed. Therefore W is still installed in T. 2) T does something that fires W. 3) W calls isWatchable(), which allocates, which finally calls R's destructor - but not before we allocate another rare data right over it. 4) Heap corruption - the destruction of R, and the removal of W, overwrites a newly allocated rare data. One way to view this problem is that watchpoints shouldn't allocate. Another way to view this problem is that watchpoints owned by GC objects shouldn't allocate. An even more specific way to view this problem is that watchpoints owned by GC objects shouldn't do anything if those objects are pending destruction.
(In reply to Filip Pizlo from comment #9) > (In reply to Saam Barati from comment #7) > > Can somebody explain to me exactly what the object graph is in this example? > > The structures in question, their prototype fields, etc. I’m a bit confused > > still. > > > > I feel like we’re missing an easier solution > > Structure S has a rare data R. > > Rare data R has a watchpoint W. > > Watchpoint W is installed in a watchpoint set in structure T. > > T != S. > > Then: > > 1) GC happens, marks T but not S or R. Note that S and R are dead but not > destructed. Therefore W is still installed in T. > 2) T does something that fires W. > 3) W calls isWatchable(), which allocates, which finally calls R's > destructor - but not before we allocate another rare data right over it. > 4) Heap corruption - the destruction of R, and the removal of W, overwrites > a newly allocated rare data. > > One way to view this problem is that watchpoints shouldn't allocate. > > Another way to view this problem is that watchpoints owned by GC objects > shouldn't allocate. > > An even more specific way to view this problem is that watchpoints owned by > GC objects shouldn't do anything if those objects are pending destruction. This makes sense. Thanks for explaining. I see exactly what’s happening. This patch employs this fix only for this particular adaptive watchpoint. Why don’t we apply this to all of them. My suggestion would be: adaptive watchpoints take an optional JSCell that represents their owner. If the owner is null, they’re always valid. Otherwise, they’re only valid if their owner is live.
Interesting I was under the impression we deferred GC while relocating watchpoints. I guess that was a bad assumption...
I still kinda feel like deferring GC when firing watchpoints would be a more comprehensive solution. Since I think this problem could happen with any adaptive watchpoint.
(In reply to Keith Miller from comment #12) > I still kinda feel like deferring GC when firing watchpoints would be a more > comprehensive solution. Since I think this problem could happen with any > adaptive watchpoint. I don’t think this solves the issue since we’ve already done the GC at this point. Does DeferGC guarantee we won’t destruct something on the free list?
(In reply to Keith Miller from comment #12) > I still kinda feel like deferring GC when firing watchpoints would be a more > comprehensive solution. Since I think this problem could happen with any > adaptive watchpoint. Deferring GC does not prevent sweeping. The bug is sweeping, not GC.
(In reply to Saam Barati from comment #13) > (In reply to Keith Miller from comment #12) > > I still kinda feel like deferring GC when firing watchpoints would be a more > > comprehensive solution. Since I think this problem could happen with any > > adaptive watchpoint. > > I don’t think this solves the issue since we’ve already done the GC at this > point. Does DeferGC guarantee we won’t destruct something on the free list? No. There is no way to guarantee that we don't destruct when allocating.
(In reply to Saam Barati from comment #10) > (In reply to Filip Pizlo from comment #9) > > (In reply to Saam Barati from comment #7) > > > Can somebody explain to me exactly what the object graph is in this example? > > > The structures in question, their prototype fields, etc. I’m a bit confused > > > still. > > > > > > I feel like we’re missing an easier solution > > > > Structure S has a rare data R. > > > > Rare data R has a watchpoint W. > > > > Watchpoint W is installed in a watchpoint set in structure T. > > > > T != S. > > > > Then: > > > > 1) GC happens, marks T but not S or R. Note that S and R are dead but not > > destructed. Therefore W is still installed in T. > > 2) T does something that fires W. > > 3) W calls isWatchable(), which allocates, which finally calls R's > > destructor - but not before we allocate another rare data right over it. > > 4) Heap corruption - the destruction of R, and the removal of W, overwrites > > a newly allocated rare data. > > > > One way to view this problem is that watchpoints shouldn't allocate. > > > > Another way to view this problem is that watchpoints owned by GC objects > > shouldn't allocate. > > > > An even more specific way to view this problem is that watchpoints owned by > > GC objects shouldn't do anything if those objects are pending destruction. > This makes sense. Thanks for explaining. I see exactly what’s happening. > > This patch employs this fix only for this particular adaptive watchpoint. > Why don’t we apply this to all of them. My suggestion would be: adaptive > watchpoints take an optional JSCell that represents their owner. If the > owner is null, they’re always valid. Otherwise, they’re only valid if their > owner is live. This would only be better if the adaptive watchpoints that had an owner didn't already point to that owner in some way. I do wonder how many other adaptive watchpoints have this bug. Probably not many. The reason why this all worked before is that watchpoints didn't allocate. I think that there are a bunch of watchpoints that "just work" because of the lack of allocation. I think that if we wanted to make this code make more sense, then we would teach Watchpoint about its m_owner. This would add some space bloat, but I think it would be worth it. Then watchpoints would not do anything if the owner was dead. If we did this, then maybe we could also introduce the policy that watchpoints get fired when their owner dies. Perhaps it could be a special kind of firing - maybe a separate virtual method just for this purpose. If we were willing to make Watchpoints be JSCells, then we could trivially do this (allocate Watchpoints in a special Subspace that does a finalization action that fires all watchpoints whose owners died - note that doing it as a finalization action ensures that it does not happen during sweep. The reason why we currently don't fire watchpoints when owners die is that we don't want to fire watchpoints while sweeping - that's even worse than sweeping while firing watchpoints.).
(In reply to Filip Pizlo from comment #16) > (In reply to Saam Barati from comment #10) > > (In reply to Filip Pizlo from comment #9) > > > (In reply to Saam Barati from comment #7) > > > > Can somebody explain to me exactly what the object graph is in this example? > > > > The structures in question, their prototype fields, etc. I’m a bit confused > > > > still. > > > > > > > > I feel like we’re missing an easier solution > > > > > > Structure S has a rare data R. > > > > > > Rare data R has a watchpoint W. > > > > > > Watchpoint W is installed in a watchpoint set in structure T. > > > > > > T != S. > > > > > > Then: > > > > > > 1) GC happens, marks T but not S or R. Note that S and R are dead but not > > > destructed. Therefore W is still installed in T. > > > 2) T does something that fires W. > > > 3) W calls isWatchable(), which allocates, which finally calls R's > > > destructor - but not before we allocate another rare data right over it. > > > 4) Heap corruption - the destruction of R, and the removal of W, overwrites > > > a newly allocated rare data. > > > > > > One way to view this problem is that watchpoints shouldn't allocate. > > > > > > Another way to view this problem is that watchpoints owned by GC objects > > > shouldn't allocate. > > > > > > An even more specific way to view this problem is that watchpoints owned by > > > GC objects shouldn't do anything if those objects are pending destruction. > > This makes sense. Thanks for explaining. I see exactly what’s happening. > > > > This patch employs this fix only for this particular adaptive watchpoint. > > Why don’t we apply this to all of them. My suggestion would be: adaptive > > watchpoints take an optional JSCell that represents their owner. If the > > owner is null, they’re always valid. Otherwise, they’re only valid if their > > owner is live. > > This would only be better if the adaptive watchpoints that had an owner > didn't already point to that owner in some way. > > I do wonder how many other adaptive watchpoints have this bug. Probably not > many. The reason why this all worked before is that watchpoints didn't > allocate. I think that there are a bunch of watchpoints that "just work" > because of the lack of allocation. > > I think that if we wanted to make this code make more sense, then we would > teach Watchpoint about its m_owner. This would add some space bloat, but I > think it would be worth it. Then watchpoints would not do anything if the > owner was dead. > > If we did this, then maybe we could also introduce the policy that > watchpoints get fired when their owner dies. Perhaps it could be a special > kind of firing - maybe a separate virtual method just for this purpose. If > we were willing to make Watchpoints be JSCells, then we could trivially do > this (allocate Watchpoints in a special Subspace that does a finalization > action that fires all watchpoints whose owners died - note that doing it as > a finalization action ensures that it does not happen during sweep. The > reason why we currently don't fire watchpoints when owners die is that we > don't want to fire watchpoints while sweeping - that's even worse than > sweeping while firing watchpoints.). Anyway - I still think we should land Mark's fix. It's a step in the right direction.
Thanks for the review. Landed in r217429: <http://trac.webkit.org/r217429>.
(In reply to Filip Pizlo from comment #17) > (In reply to Filip Pizlo from comment #16) > > (In reply to Saam Barati from comment #10) > > > (In reply to Filip Pizlo from comment #9) > > > > (In reply to Saam Barati from comment #7) > > > > > Can somebody explain to me exactly what the object graph is in this example? > > > > > The structures in question, their prototype fields, etc. I’m a bit confused > > > > > still. > > > > > > > > > > I feel like we’re missing an easier solution > > > > > > > > Structure S has a rare data R. > > > > > > > > Rare data R has a watchpoint W. > > > > > > > > Watchpoint W is installed in a watchpoint set in structure T. > > > > > > > > T != S. > > > > > > > > Then: > > > > > > > > 1) GC happens, marks T but not S or R. Note that S and R are dead but not > > > > destructed. Therefore W is still installed in T. > > > > 2) T does something that fires W. > > > > 3) W calls isWatchable(), which allocates, which finally calls R's > > > > destructor - but not before we allocate another rare data right over it. > > > > 4) Heap corruption - the destruction of R, and the removal of W, overwrites > > > > a newly allocated rare data. > > > > > > > > One way to view this problem is that watchpoints shouldn't allocate. > > > > > > > > Another way to view this problem is that watchpoints owned by GC objects > > > > shouldn't allocate. > > > > > > > > An even more specific way to view this problem is that watchpoints owned by > > > > GC objects shouldn't do anything if those objects are pending destruction. > > > This makes sense. Thanks for explaining. I see exactly what’s happening. > > > > > > This patch employs this fix only for this particular adaptive watchpoint. > > > Why don’t we apply this to all of them. My suggestion would be: adaptive > > > watchpoints take an optional JSCell that represents their owner. If the > > > owner is null, they’re always valid. Otherwise, they’re only valid if their > > > owner is live. > > > > This would only be better if the adaptive watchpoints that had an owner > > didn't already point to that owner in some way. > > > > I do wonder how many other adaptive watchpoints have this bug. Probably not > > many. The reason why this all worked before is that watchpoints didn't > > allocate. I think that there are a bunch of watchpoints that "just work" > > because of the lack of allocation. > > > > I think that if we wanted to make this code make more sense, then we would > > teach Watchpoint about its m_owner. This would add some space bloat, but I > > think it would be worth it. Then watchpoints would not do anything if the > > owner was dead. > > > > If we did this, then maybe we could also introduce the policy that > > watchpoints get fired when their owner dies. Perhaps it could be a special > > kind of firing - maybe a separate virtual method just for this purpose. If > > we were willing to make Watchpoints be JSCells, then we could trivially do > > this (allocate Watchpoints in a special Subspace that does a finalization > > action that fires all watchpoints whose owners died - note that doing it as > > a finalization action ensures that it does not happen during sweep. The > > reason why we currently don't fire watchpoints when owners die is that we > > don't want to fire watchpoints while sweeping - that's even worse than > > sweeping while firing watchpoints.). > > Anyway - I still think we should land Mark's fix. It's a step in the right > direction. I agree. I also think it’s worth auditing other watchpoints for this exact bug. We can either employ a similar solution as we did here or one of your other suggested fixes.