The file reader tests fast/files/read-blob-async.html and fast/files/read-file-async.html crash when COLLECT_ON_EVERY_ALLOCATION is turned on. It appears that the overridden FileReader::hasPendingActivity() returns false when m_state is set to DONE, which can allow the related JSFileReader object to be collected. From inspection, FileWriter has a similar issue. This can be fixed by eliminating the overriding hasPendingActivity() and using the ActiveDOMObject::hasPendingActivity() and placing setPendingActivity() /unsetPendingActivity() at appropriate places in the code. <rdar://problem/11501481>
Created attachment 143361 [details] Patch
What is the situation where m_state is set to DONE, script doesn't have a reference to FileReader, and yet the wrapper needs to be protected? Does the collection happen while a callback is executed?
(In reply to comment #2) > What is the situation where m_state is set to DONE, script doesn't have a reference to FileReader, and yet the wrapper needs to be protected? Does the collection happen while a callback is executed? Yes, a collection can happen when one of the event handlers is running.
Interesting. I wonder why the object's wrapper is not protected automatically when it's a target of an event that's being handled. Could this be a bug elsewhere?
(In reply to comment #4) > Interesting. I wonder why the object's wrapper is not protected automatically when it's a target of an event that's being handled. Could this be a bug elsewhere? The problem occurs because the garbage collection happens during GC after the event handler start executing. Because hasPendingActivity() returns false, the JSFileReader and all the event handlers have been GC'ed. When we invoke the handlers, we need a JSFileReader object and we create a new one, since the old one is gone. The new one now references the handlers via the FileReader. The error is actually that these handlers themselves were garbage collected. In all the traces I've seen, we actually die because the ScopeChainNode attached to the JSFunction object has been collected and put on a free list (zapped). We fail a GC sanity check when visiting the ScopeChainNode. The JSFunction has been collected, but its memory hasn't been reused. Remember, we need to induce this crash by turning on COLLECT_ON_EVERY_ALLOCATION, (which really doesn't GC on every allocation but makes GCs happen much more often). The timing window is small and can be induced by COLLECT_ON_EVERY_ALLOCATION, but it is still a vulnerability. I think there is a possibility that this issue exists in other code. The pattern to look for is executing a handler after the object's hasPendingActivity() would return false. I will be looking for this pattern in other code.
My question is why JSFileReader is not protected through event.target attribute. I think that it should be.
Comment on attachment 143361 [details] Patch Discussed this in person. The issue is that garbage collection can happen after state is set to DONE before window.event.target chain is fully established. Also, the suggested design is similar to what XMLHttpRequest does, which is good for consistency. r=me, but please add comments explaining why unsetPendingActivity are in the places they are in.
Committed r118253: <http://trac.webkit.org/changeset/118253>
Comment on attachment 143361 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=143361&action=review I think the safe way to do this is: Call setPendingActivity in doOperation, at the calls to writer()->write() and writer()->truncate(). Don't call setPendingActivity for an abort. Call unsetPendingActivity in didFail, didTruncate, and didWrite /after/ the check for m_operationInProgress == OperationAbort. Call unsetPendingActivity in completeAbort /after/ the call to doOperation. This needs a test, too, ideally one that would have crashed or leaked given what was just committed. > Source/WebCore/Modules/filesystem/FileWriter.cpp:94 > + setPendingActivity(this); I'm sorry for the late response, but I don't think this is correct. You're incrementing the refcount when stop() is called, but if you've gotten to this code, there's a write in progress, so the refcount was already incremented. > Source/WebCore/Modules/filesystem/FileWriter.cpp:301 > + unsetPendingActivity(this); You're dereffing the object here, but if this code is hit as the result of an abort call, there may be an outstanding didFail call that will arrive shortly. I say "may" because the abort may have been called when m_operationPending was already abort [if the user did write-abort-write-abort before the first abort completed]. If this deref kills the object, we'll crash when the didFail hits.
See code review comments.
(In reply to comment #9) > (From update of attachment 143361 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=143361&action=review > > I think the safe way to do this is: > Call setPendingActivity in doOperation, at the calls to writer()->write() and writer()->truncate(). > Don't call setPendingActivity for an abort. > Call unsetPendingActivity in didFail, didTruncate, and didWrite /after/ the check for m_operationInProgress == OperationAbort. > Call unsetPendingActivity in completeAbort /after/ the call to doOperation. > > This needs a test, too, ideally one that would have crashed or leaked given what was just committed. > > > Source/WebCore/Modules/filesystem/FileWriter.cpp:94 > > + setPendingActivity(this); > > I'm sorry for the late response, but I don't think this is correct. You're incrementing the refcount when stop() is called, but if you've gotten to this code, there's a write in progress, so the refcount was already incremented. > > > Source/WebCore/Modules/filesystem/FileWriter.cpp:301 > > + unsetPendingActivity(this); > > You're dereffing the object here, but if this code is hit as the result of an abort call, there may be an outstanding didFail call that will arrive shortly. > I say "may" because the abort may have been called when m_operationPending was already abort [if the user did write-abort-write-abort before the first abort completed]. If this deref kills the object, we'll crash when the didFail hits. Thanks for the suggestions. I will investigate these suggested changes and improving the tests.
It seems to me that the placement for unsetPendingActivity(this) in didWrite() needs to be inside the if (complete && numAborts == m_numAborts) then block after the signalCompletion call. The other unsetPendingActivity() calls need to be after signalCompletion(). That way the event handlers are called when hasPendingActivity() returns true.
(In reply to comment #12) > It seems to me that the placement for unsetPendingActivity(this) in didWrite() needs to be inside the if (complete && numAborts == m_numAborts) then block after the signalCompletion call. The other unsetPendingActivity() calls need to be after signalCompletion(). That way the event handlers are called when hasPendingActivity() returns true. You do want to restrict it to if (complete), but I don't think you want to guard on numAborts. You don't really care about those, because: you're just decrementing for the current operation's increment; aborting in the onwrite handler won't actually result in a didFail call, since there wasn't a write in progress, because complete was true; if there's both an abort and a write call during the onwrite handler...damn, that looks like a bug. We'll set m_operationInProgress to OperationAbort, but then we'll never get a didFail call. I think we want to set m_operationInProgress to OperationNone before we fire the progress event. But anyway, the new write call will do its own incrementation, and we'll do our own decrementation. Let me throw together a patch; this is getting to messy to do in English.
Created attachment 146076 [details] Patch
(In reply to comment #14) > Created an attachment (id=146076) [details] > Patch This patch has a test for the bug I talked about in which the didFail never comes; it doesn't have a test for extra refs. I'm not sure we have any way of detecting extra refs in a layout test, and hitting the extra deref may be tricky/flaky because of the timing.
Created attachment 146138 [details] Patch
Thanks Michael--I had to make a one-line change, though. In doOperation, we shouldn't be clearing m_operationInProgress during abort if it's already abort. This comes up in nested aborts [see fast/filesystem/file-writer-abort-depth.html]. If I hear no screams, I'll land the fixed version on Monday.
Committed r120013: <http://trac.webkit.org/changeset/120013>