Bug 157525 - Modern IDB: Some blob tests ASSERT sometimes on the bots
Summary: Modern IDB: Some blob tests ASSERT sometimes on the bots
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Brady Eidson
URL:
Keywords:
: 157543 (view as bug list)
Depends on:
Blocks: 149117 154968
  Show dependency treegraph
 
Reported: 2016-05-10 10:31 PDT by Brady Eidson
Modified: 2016-05-23 16:29 PDT (History)
6 users (show)

See Also:


Attachments
main thread (128.52 KB, text/plain)
2016-05-15 11:22 PDT, Alexey Proskuryakov
no flags Details
BlobUtility thread (136.30 KB, text/plain)
2016-05-15 11:22 PDT, Alexey Proskuryakov
no flags Details
Speculative fix (6.17 KB, patch)
2016-05-23 15:38 PDT, Brady Eidson
achristensen: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Brady Eidson 2016-05-10 10:31:33 PDT
Modern IDB: Some blob tests ASSERT sometimes on the bots

https://build.webkit.org/results/Apple%20El%20Capitan%20Debug%20WK1%20(Tests)/r200618%20(5111)/results.html
Comment 1 Brady Eidson 2016-05-10 10:34:41 PDT
Without even thinking about, this seems obvious - Blobs were done before Workers, and that means they were done before the threading assertions were added.

Blobs do a little bit of thread hopping, so I'm sure there's just one or two adjustments that need to be made.
Comment 2 Brady Eidson 2016-05-10 15:55:14 PDT
*** Bug 157543 has been marked as a duplicate of this bug. ***
Comment 3 Brady Eidson 2016-05-10 16:11:43 PDT
I will not be able to look into this for awhile, nor do I have a clean enough check out to do anything about it.

We should comment out the assert for now, in lieu of skipping all blob tests.
Comment 4 Brady Eidson 2016-05-11 11:41:51 PDT
I cannot reproduce this no matter how many iterations and combinations of tests I try.
Comment 5 Brady Eidson 2016-05-11 11:45:21 PDT
It's somewhat confusing.

In these tests that are failing, the TransactionOperations are being created on the main thread, so they must be torn down on a background thread for the ASSERT to fire.

However, I can't see the code path where it's possible for them to be torn down on a background thread.
Comment 6 Alexey Proskuryakov 2016-05-15 11:21:56 PDT
This assertion can fire on both main thread and secondary threads. Attaching a couple crash logs.
Comment 7 Alexey Proskuryakov 2016-05-15 11:22:15 PDT
Created attachment 278975 [details]
main thread
Comment 8 Alexey Proskuryakov 2016-05-15 11:22:40 PDT
Created attachment 278976 [details]
BlobUtility thread
Comment 9 Alexey Proskuryakov 2016-05-19 13:28:19 PDT
This is pretty frequent on the bots, sadly.
Comment 10 Brady Eidson 2016-05-19 14:08:20 PDT
I'll be taking care of this after I resolve https://bugs.webkit.org/show_bug.cgi?id=157917, which seems much more pressing.
Comment 11 Brady Eidson 2016-05-20 09:37:16 PDT
The "main thread" crash Alexey attached is not actually related to blobs - It's a worker-related crash.

That one seems more imminently explorable.
Comment 12 Brady Eidson 2016-05-20 09:37:44 PDT
(In reply to comment #11)
> The "main thread" crash Alexey attached is not actually related to blobs -
> It's a worker-related crash.
> 
> That one seems more imminently explorable.

Maybe not, I don't know why I thought that. 
Sucks I can't reproduce any of these.

But there's at least two different things going on here (blobs and workers)
Comment 13 Brady Eidson 2016-05-20 09:41:29 PDT
It's related to nesting these lambdas, having one call the other, etc etc., and then the order in which they are torn down.

The teardown order is obviously racey.

This would be *simply* solved if we had C++14 lambda capture semantics, because we could move the RefPtr around instead of copying it.

*sigh*

I'll figure something out.
Comment 14 Brady Eidson 2016-05-20 09:57:34 PDT
In general, all of the blob writing code gets Workers wrong, as it presumes MainThreadedness for the callbacks.

Filed https://bugs.webkit.org/show_bug.cgi?id=157947 for that
Comment 15 Brady Eidson 2016-05-23 12:39:36 PDT
We have C++14 now. I'm working on a resolution to this today.
Comment 16 Brady Eidson 2016-05-23 15:38:02 PDT
Created attachment 279598 [details]
Speculative fix

This is speculative in that I could never nail down a way to reproduce the bug nearly as reliably as the bots do.

But reproducibility aside, this should be a positive change.
Comment 17 Alex Christensen 2016-05-23 15:46:48 PDT
Comment on attachment 279598 [details]
Speculative fix

Move all the things!
Comment 18 Brady Eidson 2016-05-23 16:14:41 PDT
Landed that in http://trac.webkit.org/changeset/201302

It should significantly improve things for the blob test.

The other one that Alexey attached is still potentially a problem. The patch landed is likely to improve things, making it less likely to happen. But it's still definitely a potential problem.
Comment 19 Brady Eidson 2016-05-23 16:16:20 PDT
The big problem is in IDBConnectionProxy:

template<typename T, typename... Parameters, typename... Arguments>
void performCallbackOnCorrectThread(T& object, void (T::*method)(Parameters...), Arguments&&... arguments)
{
    ASSERT(isMainThread());

    if (object.originThreadID() == currentThread()) {
        (object.*method)(arguments...);
        return;
    }

    ScriptExecutionContext* context = object.scriptExecutionContext();
    if (!context)
        return;

    context->postCrossThreadTask(object, method, arguments...);
}

It is possible that the worker thread's context has been stopped by the time a TransactionOperation that was created on the worker thread is completed.

In that case, it falls into the "if (!context) return;" clause.

And when it does that, the operation will be released on the main thread, which is its only option.
Comment 20 Brady Eidson 2016-05-23 16:21:14 PDT
(In reply to comment #19)
> 
> It is possible that the worker thread's context has been stopped by the time
> a TransactionOperation that was created on the worker thread is completed.
> 
> In that case, it falls into the "if (!context) return;" clause.
> 
> And when it does that, the operation will be released on the main thread,
> which is its only option.

So, what do we do for this case?

We either convince ourselves that its okay in such cases to destroy that TransactionOperation on the main thread, or we create a way to keep the thread itself alive past the lifetime of the WorkerThread so the transaction operation can be destroyed appropriately.

I think the later is the better choice, but *might* be complicated.
Comment 21 Brady Eidson 2016-05-23 16:28:59 PDT
Since this bug was originally and generally about the blob-related ASSERTs, I'll close it.

The remaining TransactionOperation issue is now covered separately by https://bugs.webkit.org/show_bug.cgi?id=158004
Comment 22 Brady Eidson 2016-05-23 16:29:23 PDT
And, of course, since this fix was speculative, we'll just reopen this if it didn't work.

But it should.