Bug 67684 - [Chromium/FileWriter] race condition in FileWriter completion can lead to assert
Summary: [Chromium/FileWriter] race condition in FileWriter completion can lead to assert
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Eric U.
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-06 16:40 PDT by Eric U.
Modified: 2011-09-27 18:45 PDT (History)
4 users (show)

See Also:


Attachments
Repro case (1.31 KB, text/html)
2011-09-06 16:40 PDT, Eric U.
no flags Details
Patch (21.37 KB, patch)
2011-09-21 17:20 PDT, Eric U.
no flags Details | Formatted Diff | Diff
Patch (26.06 KB, patch)
2011-09-27 12:51 PDT, Eric U.
no flags Details | Formatted Diff | Diff
Patch (26.58 KB, patch)
2011-09-27 14:29 PDT, Eric U.
no flags Details | Formatted Diff | Diff
Patch (26.59 KB, patch)
2011-09-27 17:12 PDT, Eric U.
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric U. 2011-09-06 16:40:15 PDT
In FileWriter.cpp, when we complete a FileWriter action [didWrite, didTruncate, or didFail] we post a task to notify the user asynchronously.  We only update m_readyState when that task gets executed.  If, in the window between completion and the notification, we get a cancel request, we'll pass it through to a backend that knows it's already done.  On Chromium's implementation, this leads to an assertion failure.

The fix is to prevent any action that would be illegal in that interim state, perhaps by keying off of m_blobBeingWritten not being set or m_truncateLength being invalid.  The assertion fix is just to fix stop(), but abort() should also be handled.

See http://code.google.com/p/chromium/issues/detail?id=94895 for sample code that will cause the assertion failure [also attached].
Comment 1 Eric U. 2011-09-06 16:40:48 PDT
Created attachment 106516 [details]
Repro case
Comment 2 Eric U. 2011-09-21 17:09:24 PDT
The spec for these events has just changed as well [rolling in changes discussed in May].  This bug will require enough changes that it's worth taking the new spec into account while fixing it.
Comment 3 Eric U. 2011-09-21 17:20:50 PDT
Created attachment 108256 [details]
Patch
Comment 4 David Levin 2011-09-26 15:22:37 PDT
Comment on attachment 108256 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=108256&action=review

Looks close!

> LayoutTests/fast/filesystem/resources/file-writer-abort-continue.js:20
> +    onwritestart : onWriteStart0,

I think these methods like "onWriteStart0" would be better named if they indicated what they did as opposed to when they are called.

For example "onWriteStart0" could be "abortWriter".

"onAbort0" could be "logAbort"

> LayoutTests/fast/filesystem/resources/file-writer-abort-continue.js:21
> +    onwrite : onError,

I don't see onError defined anywhere.

> LayoutTests/fast/filesystem/resources/file-writer-abort-continue.js:62
> +// the followon action.

follow on?

> LayoutTests/fast/filesystem/resources/file-writer-abort.js:58
> +    cleanUp();

where is cleanUp?

> Source/WebCore/fileapi/FileWriter.cpp:82
> +    if (writer() && m_readyState == WRITING && m_backendIsActive && (m_queuedOperation == QueuedOperationNone))

Odd to add parens for the == at the end when the other == doesn't have them. (I think we avoid them in this case but if you want them, add them for m_readyState as well.)

> Source/WebCore/fileapi/FileWriter.cpp:100
> +    if (m_recursionDepth > kMaxRecursionDepth) {

Where is the test for this?

> Source/WebCore/fileapi/FileWriter.cpp:109
> +    if (m_backendIsActive) // We must be waiting for an abort to complete.

" // We must be waiting for an abort to complete."
How do you know that? ok, I see because m_readyState != WRITING and m_backendIsActive

> Source/WebCore/fileapi/FileWriter.cpp:112
> +        writer()->write(position(), m_blobBeingWritten.get());

It feels ugly to have this here an in the switch statement.

Can you pull that out to one place?

I'm thinking of something like this:

ASSERT(m_readyState != WRITING);
ASSERT(m_queuedOperation == QueuedOperationNone);

if (m_backendIsActive) // We must be waiting for an abort to complete.
  m_queuedOperation = operation;
else {
  switch (operation) {
// Something similar to what you have in FileWriter::didFail right now.
  }
}

> Source/WebCore/fileapi/FileWriter.cpp:136
> +    if (m_recursionDepth > kMaxRecursionDepth) {

Where is the test for this?

> Source/WebCore/fileapi/FileWriter.cpp:-132
> -        setError(FileError::INVALID_STATE_ERR, ec);

Is there is test that verifies this is no longer an error?

> Source/WebCore/fileapi/FileWriter.cpp:211
> +            writer()->write(position(), m_blobBeingWritten.get());

Do I get an event from this? If so, could I do an abort and then another write which would get messed up by the m_queuedOperation = QueuedOperationNone; below?

> Source/WebCore/fileapi/FileWriter.cpp:219
> +        default:

Omit the "default" to get the benefit of the compiler detecting when other enums are added.

Shouldn't this case also be able to do ASSERT(m_readyState == WRITING);?

> Source/WebCore/fileapi/FileWriter.cpp:224
> +    } else {

ASSERT(m_queuedOperation == QueuedOperationNone); ?

> Source/WebCore/fileapi/FileWriter.cpp:234
> +    m_truncateLength = -1;

Is there a test that would fail if this were not set here?

> Source/WebCore/fileapi/FileWriter.cpp:240
> +            fireEvent(eventNames().errorEvent);

There were no tests affected by this change? Why not?

(Does one of the added tests fail if I were to revert this part of the change?)

> Source/WebCore/fileapi/FileWriter.h:110
> +      QueuedOperationNone,

indent further

> Source/WebCore/fileapi/FileWriter.h:118
> +    bool m_backendIsActive;

What does this mean "backendIsAction"?

I think it means "m_isOperationInProgress"?
Comment 5 David Levin 2011-09-26 15:23:25 PDT
(In reply to comment #2)
> The spec for these events has just changed as well [rolling in changes discussed in May].  This bug will require enough changes that it's worth taking the new spec into account while fixing it.

It is nice to keep changes more focused if possible. :)
Comment 6 Eric U. 2011-09-27 12:51:24 PDT
Created attachment 108884 [details]
Patch
Comment 7 Eric U. 2011-09-27 12:52:12 PDT
(In reply to comment #4)
> (From update of attachment 108256 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=108256&action=review
> 
> Looks close!
> 
> > LayoutTests/fast/filesystem/resources/file-writer-abort-continue.js:20
> > +    onwritestart : onWriteStart0,
> 
> I think these methods like "onWriteStart0" would be better named if they indicated what they did as opposed to when they are called.
> 
> For example "onWriteStart0" could be "abortWriter".
> 
> "onAbort0" could be "logAbort"

Done.

> > LayoutTests/fast/filesystem/resources/file-writer-abort-continue.js:21
> > +    onwrite : onError,
> 
> I don't see onError defined anywhere.

It's in file-writer-utils.js.

> > LayoutTests/fast/filesystem/resources/file-writer-abort-continue.js:62
> > +// the followon action.
> 
> follow on?

Done.

> > LayoutTests/fast/filesystem/resources/file-writer-abort.js:58
> > +    cleanUp();
> 
> where is cleanUp?

file-writer-utils.js.  Also, I'd forgotten to set fileEntryForCleanup for it; fixed.

> > Source/WebCore/fileapi/FileWriter.cpp:82
> > +    if (writer() && m_readyState == WRITING && m_backendIsActive && (m_queuedOperation == QueuedOperationNone))
> 
> Odd to add parens for the == at the end when the other == doesn't have them. (I think we avoid them in this case but if you want them, add them for m_readyState as well.)

Yup; removed.

> > Source/WebCore/fileapi/FileWriter.cpp:100
> > +    if (m_recursionDepth > kMaxRecursionDepth) {
> 
> Where is the test for this?

Added file-writer-abort-depth.js

> > Source/WebCore/fileapi/FileWriter.cpp:109
> > +    if (m_backendIsActive) // We must be waiting for an abort to complete.
> 
> " // We must be waiting for an abort to complete."
> How do you know that? ok, I see because m_readyState != WRITING and m_backendIsActive

Expanded comment.

> > Source/WebCore/fileapi/FileWriter.cpp:112
> > +        writer()->write(position(), m_blobBeingWritten.get());
> 
> It feels ugly to have this here an in the switch statement.
> 
> Can you pull that out to one place?
> 
> I'm thinking of something like this:
> 
> ASSERT(m_readyState != WRITING);
> ASSERT(m_queuedOperation == QueuedOperationNone);
> 
> if (m_backendIsActive) // We must be waiting for an abort to complete.
>   m_queuedOperation = operation;
> else {
>   switch (operation) {
> // Something similar to what you have in FileWriter::didFail right now.
>   }
> }

I see what you're getting at, but I don't see how to make it cleaner than it is.
The three callsites that start operations in the backend are in write, truncate, and didFail.
The first is picking between write and enqueue-a-write.  The second is choosing between truncate and enqueue-a-truncation.  The third is choosing between start-enqueued-write, start-enqueued-truncation, and signalCompletion.  If I combine them, I need a function that can choose between write, truncate, [start-enqueued is pretty much the same code with different asserts], enqueue-a-write, enqueue-a-truncation, and signalCompletion.  That seems like a much more complex solution than what I've got now.  Also, either I pass in flags to tell it which operation I'm requesting, making it a lot like two functions combined with a flag to pick between them, or I don't, and it's got to figure it out all over again, despite the caller knowing which it was.

In terms of code size, I think you're shortening write and truncate by a maximum of 3 lines each, and adding significantly more than that to the shared function.

Given the choice, I guess I'd go with the flags approach; I'll do that if you feel strongly about it.

> > Source/WebCore/fileapi/FileWriter.cpp:136
> > +    if (m_recursionDepth > kMaxRecursionDepth) {
> 
> Where is the test for this?

Added file-writer-abort-depth.js.

> > Source/WebCore/fileapi/FileWriter.cpp:-132
> > -        setError(FileError::INVALID_STATE_ERR, ec);
> 
> Is there is test that verifies this is no longer an error?

Added to file-writer-abort.js

> > Source/WebCore/fileapi/FileWriter.cpp:211
> > +            writer()->write(position(), m_blobBeingWritten.get());
> 
> Do I get an event from this? If so, could I do an abort and then another write which would get messed up by the m_queuedOperation = QueuedOperationNone; below?

No, the event gets send when the user calls write() or truncate(), so we don't need to send an event here.
And those methods check+set readyState, so you can't queue more than one operation at a time; you have to abort in between.

> > Source/WebCore/fileapi/FileWriter.cpp:219
> > +        default:
> 
> Omit the "default" to get the benefit of the compiler detecting when other enums are added.

Hmm...I usually add the default to get the benefit of a sensible runtime error on invalid/uninitialized values.  Does WebKit have a standard behavior in this case?  It seems ugly to add a runtime check before the switch, then use the switch for the compile-time check.

> Shouldn't this case also be able to do ASSERT(m_readyState == WRITING);?

Yup; added.

> > Source/WebCore/fileapi/FileWriter.cpp:224
> > +    } else {
> 
> ASSERT(m_queuedOperation == QueuedOperationNone); ?

Added.

> > Source/WebCore/fileapi/FileWriter.cpp:234
> > +    m_truncateLength = -1;
> 
> Is there a test that would fail if this were not set here?

No; I'll add some asserts.  It's really just for cleanliness and ease of debugging problems--there's no way a user should be able to detect that value, and no path that should use it.  But since the asserts can check it, I should add them.

> > Source/WebCore/fileapi/FileWriter.cpp:240
> > +            fireEvent(eventNames().errorEvent);
> 
> There were no tests affected by this change? Why not?
> 
> (Does one of the added tests fail if I were to revert this part of the change?)

file-writer-abort-continue.js will fail if you revert this part of the change.
It sets onerror to onError, which will log it if we get an error event when we shouldn't.
Previously we weren't testing the abort path at all.

> > Source/WebCore/fileapi/FileWriter.h:110
> > +      QueuedOperationNone,
> 
> indent further

Fixed.

> > Source/WebCore/fileapi/FileWriter.h:118
> > +    bool m_backendIsActive;
> 
> What does this mean "backendIsAction"?
> 
> I think it means "m_isOperationInProgress"?

If you prefer ;'>.
Changed.
Comment 8 WebKit Review Bot 2011-09-27 13:25:13 PDT
Comment on attachment 108884 [details]
Patch

Attachment 108884 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9879447

New failing tests:
fast/filesystem/file-writer-abort-depth.html
Comment 9 David Levin 2011-09-27 14:11:06 PDT
Comment on attachment 108884 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=108884&action=review

r- because I know you are doing one more revision (regarding the code rearrangement thing we discussed in email).

There are a few things below that might be nice to address but it looks good.

> LayoutTests/fast/filesystem/resources/file-writer-abort-continue.js:23
> +    onwriteend : checkLengthAndCallFollowOnAction

Thanks. I find it much easier to figure out what this test is doing now!

I personally think this FollowOnAction makes the code harder to read for very minimal gain but it is ok to leave it as is....

> LayoutTests/fast/filesystem/resources/file-writer-abort-continue.js:47
> +    onwriteend : onWriteEnd3

"onWriteEnd3" vestige of the old unreadable code? :)
Comment 10 Eric U. 2011-09-27 14:29:37 PDT
Created attachment 108898 [details]
Patch
Comment 11 Eric U. 2011-09-27 14:30:42 PDT
(In reply to comment #9)
> (From update of attachment 108884 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=108884&action=review
> 
> r- because I know you are doing one more revision (regarding the code rearrangement thing we discussed in email).

Done; thanks again.

> There are a few things below that might be nice to address but it looks good.
> 
> > LayoutTests/fast/filesystem/resources/file-writer-abort-continue.js:23
> > +    onwriteend : checkLengthAndCallFollowOnAction
> 
> Thanks. I find it much easier to figure out what this test is doing now!
> 
> I personally think this FollowOnAction makes the code harder to read for very minimal gain but it is ok to leave it as is....
> 
> > LayoutTests/fast/filesystem/resources/file-writer-abort-continue.js:47
> > +    onwriteend : onWriteEnd3
> 
> "onWriteEnd3" vestige of the old unreadable code? :)

Fixed.

Also fixed a typo in a test name and changed QueuedOperation to Operation, since it's now in more general use.
Comment 12 David Levin 2011-09-27 14:32:34 PDT
(In reply to comment #11)
>  changed QueuedOperation to Operation, since it's now in more general use.

Yeah, I was thinking something like that might be good with the code re-use change (but I didn't mention it). I'm glad you did that!
Comment 13 WebKit Review Bot 2011-09-27 15:15:58 PDT
Comment on attachment 108898 [details]
Patch

Attachment 108898 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9877469

New failing tests:
fast/filesystem/file-writer-abort-depth.html
Comment 14 Eric U. 2011-09-27 17:12:01 PDT
Created attachment 108938 [details]
Patch
Comment 15 David Levin 2011-09-27 17:13:11 PDT
Comment on attachment 108938 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=108938&action=review

> LayoutTests/fast/filesystem/resources/file-writer-abort-depth.js:53
> +        }

So this was the culprit :)
Comment 16 WebKit Review Bot 2011-09-27 18:45:16 PDT
Comment on attachment 108938 [details]
Patch

Clearing flags on attachment: 108938

Committed r96177: <http://trac.webkit.org/changeset/96177>
Comment 17 WebKit Review Bot 2011-09-27 18:45:22 PDT
All reviewed patches have been landed.  Closing bug.