87165 – Crash in fast/files/read tests during Garbage Collection

RESOLVED FIXED 87165

Crash in fast/files/read tests during Garbage Collection

https://bugs.webkit.org/show_bug.cgi?id=87165

Summary Crash in fast/files/read tests during Garbage Collection

Michael Saboff

Reported 2012-05-22 14:16:35 PDT

The file reader tests fast/files/read-blob-async.html and fast/files/read-file-async.html crash when COLLECT_ON_EVERY_ALLOCATION is turned on. It appears that the overridden FileReader::hasPendingActivity() returns false when m_state is set to DONE, which can allow the related JSFileReader object to be collected. From inspection, FileWriter has a similar issue. This can be fixed by eliminating the overriding hasPendingActivity() and using the ActiveDOMObject::hasPendingActivity() and placing setPendingActivity() /unsetPendingActivity() at appropriate places in the code. <rdar://problem/11501481>

Attachments
Patch (5.64 KB, patch) 2012-05-22 14:25 PDT, Michael Saboff	no flags	Details Formatted Diff Diff
Patch (13.87 KB, patch) 2012-06-06 11:36 PDT, Eric U.	no flags	Details Formatted Diff Diff
Patch (13.92 KB, patch) 2012-06-06 16:12 PDT, Eric U.	msaboff: review+	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Michael Saboff

Comment 1 2012-05-22 14:25:03 PDT

Created attachment 143361 [details] Patch

Alexey Proskuryakov

Comment 2 2012-05-22 15:52:23 PDT

What is the situation where m_state is set to DONE, script doesn't have a reference to FileReader, and yet the wrapper needs to be protected? Does the collection happen while a callback is executed?

Michael Saboff

Comment 3 2012-05-22 20:52:49 PDT

(In reply to comment #2) > What is the situation where m_state is set to DONE, script doesn't have a reference to FileReader, and yet the wrapper needs to be protected? Does the collection happen while a callback is executed? Yes, a collection can happen when one of the event handlers is running.

Alexey Proskuryakov

Comment 4 2012-05-23 10:49:59 PDT

Interesting. I wonder why the object's wrapper is not protected automatically when it's a target of an event that's being handled. Could this be a bug elsewhere?

Michael Saboff

Comment 5 2012-05-23 11:03:02 PDT

(In reply to comment #4) > Interesting. I wonder why the object's wrapper is not protected automatically when it's a target of an event that's being handled. Could this be a bug elsewhere? The problem occurs because the garbage collection happens during GC after the event handler start executing. Because hasPendingActivity() returns false, the JSFileReader and all the event handlers have been GC'ed. When we invoke the handlers, we need a JSFileReader object and we create a new one, since the old one is gone. The new one now references the handlers via the FileReader. The error is actually that these handlers themselves were garbage collected. In all the traces I've seen, we actually die because the ScopeChainNode attached to the JSFunction object has been collected and put on a free list (zapped). We fail a GC sanity check when visiting the ScopeChainNode. The JSFunction has been collected, but its memory hasn't been reused. Remember, we need to induce this crash by turning on COLLECT_ON_EVERY_ALLOCATION, (which really doesn't GC on every allocation but makes GCs happen much more often). The timing window is small and can be induced by COLLECT_ON_EVERY_ALLOCATION, but it is still a vulnerability. I think there is a possibility that this issue exists in other code. The pattern to look for is executing a handler after the object's hasPendingActivity() would return false. I will be looking for this pattern in other code.

Alexey Proskuryakov

Comment 6 2012-05-23 11:36:21 PDT

My question is why JSFileReader is not protected through event.target attribute. I think that it should be.

Alexey Proskuryakov

Comment 7 2012-05-23 12:16:53 PDT

Comment on attachment 143361 [details] Patch Discussed this in person. The issue is that garbage collection can happen after state is set to DONE before window.event.target chain is fully established. Also, the suggested design is similar to what XMLHttpRequest does, which is good for consistency. r=me, but please add comments explaining why unsetPendingActivity are in the places they are in.

Michael Saboff

Comment 8 2012-05-23 14:58:57 PDT

Committed r118253: <http://trac.webkit.org/changeset/118253>

Eric U.

Comment 9 2012-06-05 10:10:02 PDT

Comment on attachment 143361 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=143361&action=review I think the safe way to do this is: Call setPendingActivity in doOperation, at the calls to writer()->write() and writer()->truncate(). Don't call setPendingActivity for an abort. Call unsetPendingActivity in didFail, didTruncate, and didWrite /after/ the check for m_operationInProgress == OperationAbort. Call unsetPendingActivity in completeAbort /after/ the call to doOperation. This needs a test, too, ideally one that would have crashed or leaked given what was just committed. > Source/WebCore/Modules/filesystem/FileWriter.cpp:94 > + setPendingActivity(this); I'm sorry for the late response, but I don't think this is correct. You're incrementing the refcount when stop() is called, but if you've gotten to this code, there's a write in progress, so the refcount was already incremented. > Source/WebCore/Modules/filesystem/FileWriter.cpp:301 > + unsetPendingActivity(this); You're dereffing the object here, but if this code is hit as the result of an abort call, there may be an outstanding didFail call that will arrive shortly. I say "may" because the abort may have been called when m_operationPending was already abort [if the user did write-abort-write-abort before the first abort completed]. If this deref kills the object, we'll crash when the didFail hits.

Eric U.

Comment 10 2012-06-05 10:10:42 PDT

See code review comments.

Michael Saboff

Comment 11 2012-06-05 11:14:15 PDT

(In reply to comment #9) > (From update of attachment 143361 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=143361&action=review > > I think the safe way to do this is: > Call setPendingActivity in doOperation, at the calls to writer()->write() and writer()->truncate(). > Don't call setPendingActivity for an abort. > Call unsetPendingActivity in didFail, didTruncate, and didWrite /after/ the check for m_operationInProgress == OperationAbort. > Call unsetPendingActivity in completeAbort /after/ the call to doOperation. > > This needs a test, too, ideally one that would have crashed or leaked given what was just committed. > > > Source/WebCore/Modules/filesystem/FileWriter.cpp:94 > > + setPendingActivity(this); > > I'm sorry for the late response, but I don't think this is correct. You're incrementing the refcount when stop() is called, but if you've gotten to this code, there's a write in progress, so the refcount was already incremented. > > > Source/WebCore/Modules/filesystem/FileWriter.cpp:301 > > + unsetPendingActivity(this); > > You're dereffing the object here, but if this code is hit as the result of an abort call, there may be an outstanding didFail call that will arrive shortly. > I say "may" because the abort may have been called when m_operationPending was already abort [if the user did write-abort-write-abort before the first abort completed]. If this deref kills the object, we'll crash when the didFail hits. Thanks for the suggestions. I will investigate these suggested changes and improving the tests.

Michael Saboff

Comment 12 2012-06-05 11:56:17 PDT

It seems to me that the placement for unsetPendingActivity(this) in didWrite() needs to be inside the if (complete && numAborts == m_numAborts) then block after the signalCompletion call. The other unsetPendingActivity() calls need to be after signalCompletion(). That way the event handlers are called when hasPendingActivity() returns true.

Eric U.

Comment 13 2012-06-05 13:14:37 PDT

(In reply to comment #12) > It seems to me that the placement for unsetPendingActivity(this) in didWrite() needs to be inside the if (complete && numAborts == m_numAborts) then block after the signalCompletion call. The other unsetPendingActivity() calls need to be after signalCompletion(). That way the event handlers are called when hasPendingActivity() returns true. You do want to restrict it to if (complete), but I don't think you want to guard on numAborts. You don't really care about those, because: you're just decrementing for the current operation's increment; aborting in the onwrite handler won't actually result in a didFail call, since there wasn't a write in progress, because complete was true; if there's both an abort and a write call during the onwrite handler...damn, that looks like a bug. We'll set m_operationInProgress to OperationAbort, but then we'll never get a didFail call. I think we want to set m_operationInProgress to OperationNone before we fire the progress event. But anyway, the new write call will do its own incrementation, and we'll do our own decrementation. Let me throw together a patch; this is getting to messy to do in English.

Eric U.

Comment 14 2012-06-06 11:36:12 PDT

Created attachment 146076 [details] Patch

Eric U.

Comment 15 2012-06-06 13:06:59 PDT

(In reply to comment #14) > Created an attachment (id=146076) [details] > Patch This patch has a test for the bug I talked about in which the didFail never comes; it doesn't have a test for extra refs. I'm not sure we have any way of detecting extra refs in a layout test, and hitting the extra deref may be tricky/flaky because of the timing.

Eric U.

Comment 16 2012-06-06 16:12:20 PDT

Created attachment 146138 [details] Patch

Eric U.

Comment 17 2012-06-06 16:14:30 PDT

Thanks Michael--I had to make a one-line change, though. In doOperation, we shouldn't be clearing m_operationInProgress during abort if it's already abort. This comes up in nested aborts [see fast/filesystem/file-writer-abort-depth.html]. If I hear no screams, I'll land the fixed version on Monday.

Eric U.

Comment 18 2012-06-11 15:29:49 PDT

Committed r120013: <http://trac.webkit.org/changeset/120013>

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component JavaScriptCore

Assignee

Michael Saboff

Reported

2012-05-22 14:16 PDT

Modified

2012-06-11 15:29 PDT History

CC List

2 users Show

URL

Keywords InRadar

Depends on

Blocks