Bug 75111

Summary: fast/workers/storage/use-same-database-in-page-and-workers.html times out frequently
Product: WebKit Reporter: Ryosuke Niwa <rniwa>
Component: WebCore Misc.Assignee: Michael Nordman <michaeln>
Status: REOPENED ---    
Severity: Normal CC: abarth, commit-queue, dslomov, dumi, japhet, levin, levin+threading, michaeln, tony, webkit.review.bot, zhenghao
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 50995    
Attachments:
Description Flags
v8bindings
levin: review-, michaeln: commit-queue-
v8bindings
none
Archive of layout-test-results from webkit-cq-02 none

Description Ryosuke Niwa 2011-12-22 12:23:19 PST
fast/workers/storage/use-same-database-in-page-and-workers.html times out on various platforms according to:
http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=fast%2Fworkers%2Fstorage%2Fuse-same-database-in-page-and-workers.html
Comment 1 David Levin 2011-12-22 22:13:14 PST
Adding Michael because I think he is now handling database instead of Dumi.
Comment 2 Michael Nordman 2012-01-04 13:46:05 PST
This test runs 400 transactions which actually does take some time. The harness claims 'timeout' after 6 seconds, I suspect this test just ends up taking longer when being run in parallel on a busy system, >15ms/transaction would exceed 6 seconds.

Possibly of greater interest, there are some crashes in debug builds. Looks like we're hitting this assertion in some cases.

ASSERTION FAILED: m_workerContext->hasOneRef()
third_party/WebKit/Source/WebCore/workers/WorkerThread.cpp(161)
Comment 3 Ryosuke Niwa 2012-01-04 13:51:00 PST
(In reply to comment #2)
> This test runs 400 transactions which actually does take some time. The harness claims 'timeout' after 6 seconds, I suspect this test just ends up taking longer when being run in parallel on a busy system, >15ms/transaction would exceed 6 seconds.

Can we split the test into smaller pieces? Or add SLOW so that it won't time out?

> Possibly of greater interest, there are some crashes in debug builds. Looks like we're hitting this assertion in some cases.
> 
> ASSERTION FAILED: m_workerContext->hasOneRef()
> third_party/WebKit/Source/WebCore/workers/WorkerThread.cpp(161)

Please file a bug for this.
Comment 4 Michael Nordman 2012-01-04 14:22:30 PST
> Can we split the test into smaller pieces? Or add SLOW so that it won't time out?

It's initiating a total of 400 write xactions from 4 different threads which are all serialized. We could change the tests to write less and read more instead. Read xactions can run in parallel with one another and don't require flushing to disk.
Comment 5 Michael Nordman 2012-01-06 11:51:40 PST
There is separate bug filed about tripping on this ASSERTION in this test.
https://bugs.webkit.org/show_bug.cgi?id=75048

> > Possibly of greater interest, there are some crashes in debug builds.
> > Looks like we're hitting this assertion in some cases.
> > 
> > ASSERTION FAILED: m_workerContext->hasOneRef()
> > third_party/WebKit/Source/WebCore/workers/WorkerThread.cpp(161)
> 
> Please file a bug for this.
Comment 6 Ryosuke Niwa 2012-01-10 02:04:51 PST
use-same-database-in-page-and-workers.html is now crashing on cr-mac:
http://build.webkit.org/results/Chromium%20Mac%20Release%20(Tests)/r104540%20(11740)/results.html
Comment 7 Ryosuke Niwa 2012-01-10 02:06:28 PST
Also see the bug 48457
Comment 8 Hao Zheng 2012-02-15 01:02:22 PST
https://bugs.webkit.org/show_bug.cgi?id=75048 (Linux debug crash at ASSERT) is fixed.

Mac Release still crashes occasionly:
http://build.chromium.org/p/chromium.webkit/builders/Webkit%20Mac10.6/builds/12239
http://build.chromium.org/p/chromium.webkit/builders/Webkit%20Mac10.6/builds/12220
http://build.chromium.org/p/chromium.webkit/builders/Webkit%20Mac10.6/builds/12137

No stacktrace, and hard to reproduce.

For the timeout issue, the test runs for 25 secs in Debug on my Z600. Shall we mark it SLOW, or reduce the number of transactions in the test?
Comment 9 Tony Chang 2012-02-15 10:18:58 PST
(In reply to comment #8)
> For the timeout issue, the test runs for 25 secs in Debug on my Z600. Shall we mark it SLOW, or reduce the number of transactions in the test?

Reducing the number of transaction or splitting into multiple tests is normally better, but SLOW is also OK.
Comment 10 Hao Zheng 2012-02-15 22:20:33 PST
Sadly, we still have a rare crash in Linux Debug (no crash for the recent 30 builds). I guess the Mac Release crash may be rooted from the same cause.

http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=fast%2Fworkers%2Fstorage%2Fuse-same-database-in-page-and-workers.html

http://build.chromium.org/p/chromium.webkit/builders/Webkit%20Linux%20(dbg)/builds/3401/steps/webkit_tests/logs/stdio

2012-02-15 04:19:54,057 36685 single_test_runner.py:202 DEBUG worker/27 Stacktrace for fast/workers/storage/use-same-database-in-page-and-workers.html:
ERROR: Unable to turn on incremental auto-vacuum (0 not an error)
third_party/WebKit/Source/WebCore/storage/AbstractDatabase.cpp(272) : virtual bool WebCore::AbstractDatabase::performOpenAndVerify(bool, WebCore::ExceptionCode&, WTF::String&)
ERROR: Error (5) reading text result from database (SELECT value FROM __WebKitDatabaseInfoTable__ WHERE key = 'WebKitDatabaseVersionKey';)
third_party/WebKit/Source/WebCore/storage/AbstractDatabase.cpp(85) : bool WebCore::retrieveTextResultFromDatabase(WebCore::SQLiteDatabase&, const WTF::String&, WTF::String&)
ERROR: Failed to retrieve version from database file://::UseSameDatabaseInPageAndWorkers
third_party/WebKit/Source/WebCore/storage/AbstractDatabase.cpp(426) : bool WebCore::AbstractDatabase::getVersionFromDatabase(WTF::String&, bool)
ERROR: Unable to turn on incremental auto-vacuum (5 database is locked)
third_party/WebKit/Source/WebCore/storage/AbstractDatabase.cpp(272) : virtual bool WebCore::AbstractDatabase::performOpenAndVerify(bool, WebCore::ExceptionCode&, WTF::String&)
ERROR: Error (5) preparing statement to read text result from database (SELECT value FROM __WebKitDatabaseInfoTable__ WHERE key = 'WebKitDatabaseVersionKey';)
third_party/WebKit/Source/WebCore/storage/AbstractDatabase.cpp(71) : bool WebCore::retrieveTextResultFromDatabase(WebCore::SQLiteDatabase&, const WTF::String&, WTF::String&)
ERROR: Failed to retrieve version from database file://::UseSameDatabaseInPageAndWorkers
third_party/WebKit/Source/WebCore/storage/AbstractDatabase.cpp(426) : bool WebCore::AbstractDatabase::getVersionFromDatabase(WTF::String&, bool)
ERROR: Unable to turn on incremental auto-vacuum (5 database is locked)
third_party/WebKit/Source/WebCore/storage/AbstractDatabase.cpp(272) : virtual bool WebCore::AbstractDatabase::performOpenAndVerify(bool, WebCore::ExceptionCode&, WTF::String&)
ERROR: Error (5) preparing statement to read text result from database (SELECT value FROM __WebKitDatabaseInfoTable__ WHERE key = 'WebKitDatabaseVersionKey';)
third_party/WebKit/Source/WebCore/storage/AbstractDatabase.cpp(71) : bool WebCore::retrieveTextResultFromDatabase(WebCore::SQLiteDatabase&, const WTF::String&, WTF::String&)
ERROR: Failed to retrieve version from database file://::UseSameDatabaseInPageAndWorkers
third_party/WebKit/Source/WebCore/storage/AbstractDatabase.cpp(426) : bool WebCore::AbstractDatabase::getVersionFromDatabase(WTF::String&, bool)
ERROR: Failed to set maximum size of database to 881664 bytes
third_party/WebKit/Source/WebCore/platform/sql/SQLiteDatabase.cpp(182) : void WebCore::SQLiteDatabase::setMaximumSize(int64_t)
ERROR: Failed to set maximum size of database to 881664 bytes
third_party/WebKit/Source/WebCore/platform/sql/SQLiteDatabase.cpp(182) : void WebCore::SQLiteDatabase::setMaximumSize(int64_t)
1   0x187f72c
2   0x148582f
3   0x1484d49
4   0x146a5c6
5   0x14b6e7b
6   0x14b6a3c
7   0x14b67a9
8   0x14baf6f
9   0x14a8083
10  0x14bae70
11  0x14bac1a
12  0x1d97dcd
13  0x7f47c22a79ca
14  0x7f47bfef570d clone
[37735:39791:14821647597541:ERROR:process_util_posix.cc(142)] Received signal 11
	base::debug::StackTrace::StackTrace() [0x72d5a2]
	base::(anonymous namespace)::StackDumpSignalHandler() [0x6e7309]
	0x7f47bfe42af0
	WebCore::V8SQLTransactionCallback::handleEvent() [0x187f736]
	WebCore::SQLTransaction::deliverTransactionCallback() [0x148582f]
	WebCore::SQLTransaction::performPendingCallback() [0x1484d49]
	WebCore::DeliverPendingCallbackTask::performTask() [0x146a5c6]
	WebCore::WorkerRunLoop::Task::performTask() [0x14b6e7b]
	WebCore::WorkerRunLoop::runInMode() [0x14b6a3c]
	WebCore::WorkerRunLoop::run() [0x14b67a9]
	WebCore::WorkerThread::runEventLoop() [0x14baf6f]
	WebCore::DedicatedWorkerThread::runEventLoop() [0x14a8083]
	WebCore::WorkerThread::workerThread() [0x14bae70]
	WebCore::WorkerThread::workerThreadStart() [0x14bac1a]
	WTF::threadEntryPoint() [0x1d97dcd]
	start_thread [0x7f47c22a79ca]
	0x7f47bfef570d
None
2012-02-15 04:19:54,058 36685 worker.py:190 DEBUG worker/27 killing driver
2012-02-15 04:19:54,058 36685 worker.py:212 DEBUG worker/27 fast/workers/storage/use-same-database-in-page-and-workers.html failed:
2012-02-15 04:19:54,058 36611 printing.py:462 INFO   fast/workers/storage/use-same-database-in-page-and-workers.html -> unexpected DumpRenderTree crash
2012-02-15 04:19:54,059 36685 worker.py:214 DEBUG worker/27  DumpRenderTree crashed




I think it's a race condition that when the test timed out, the SQLTransaction running is disposed in another thread. However, it's harder to reproduce than the fixed crash.

Code at the crash spot (generated code):

bool V8SQLTransactionCallback::handleEvent(SQLTransaction* transaction)
{
    if (!canInvokeCallback())
        return true;

    v8::HandleScope handleScope;

    v8::Handle<v8::Context> v8Context = toV8Context(scriptExecutionContext(), m_worldContext);
    if (v8Context.IsEmpty())
        return true;

    v8::Context::Scope scope(v8Context);

    v8::Handle<v8::Value> transactionHandle = toV8(transaction);
    if (transactionHandle.IsEmpty()) {
        CRASH();                      // <----------- possibly crash here
        return true;
    }

    v8::Handle<v8::Value> argv[] = {
        transactionHandle
    };

    bool callbackReturnValue = false;
    return !invokeCallback(m_callback, 1, argv, callbackReturnValue, scriptExecutionContext());
}
Comment 11 Michael Nordman 2012-02-16 11:31:13 PST
(In reply to comment #10)
> Sadly, we still have a rare crash in Linux Debug (no crash for the recent 30 builds).

bummer :(

> I think it's a race condition that when the test timed out, the SQLTransaction running is disposed in another thread. However, it's harder to reproduce than the fixed crash.

What leads you to believe that the SQLTransaction has been deleted? I see DeliverPendingCallbackTask up on the stack which holds a RefPtr<> to the SQLTransaction. Looks more like the v8Wrapper (terminology?) around the SQLTransaction has been disposed of.

Where's the source for the canInvokeCallback() up at the top? I imagine this is happening during worker shutdown. If thats true, should canInvokeCallback() be detecting that and returning false in that case?

> Code at the crash spot (generated code):
> 
> bool V8SQLTransactionCallback::handleEvent(SQLTransaction* transaction)
> {
>     if (!canInvokeCallback())
>         return true;
> 
>     v8::HandleScope handleScope;
> 
>     v8::Handle<v8::Context> v8Context = toV8Context(scriptExecutionContext(), m_worldContext);
>     if (v8Context.IsEmpty())
>         return true;
> 
>     v8::Context::Scope scope(v8Context);
> 
>     v8::Handle<v8::Value> transactionHandle = toV8(transaction);
>     if (transactionHandle.IsEmpty()) {
>         CRASH();                      // <----------- possibly crash here
>         return true;
>     }
> 
>     v8::Handle<v8::Value> argv[] = {
>         transactionHandle
>     };
> 
>     bool callbackReturnValue = false;
>     return !invokeCallback(m_callback, 1, argv, callbackReturnValue, scriptExecutionContext());
> }
Comment 12 Hao Zheng 2012-02-17 00:50:24 PST
(In reply to comment #11)
> (In reply to comment #10)
> > Sadly, we still have a rare crash in Linux Debug (no crash for the recent 30 builds).
> 
> bummer :(
> 
> > I think it's a race condition that when the test timed out, the SQLTransaction running is disposed in another thread. However, it's harder to reproduce than the fixed crash.
> 
> What leads you to believe that the SQLTransaction has been deleted? I see DeliverPendingCallbackTask up on the stack which holds a RefPtr<> to the SQLTransaction. Looks more like the v8Wrapper (terminology?) around the SQLTransaction has been disposed of.
> 
> Where's the source for the canInvokeCallback() up at the top? I imagine this is happening during worker shutdown. If thats true, should canInvokeCallback() be detecting that and returning false in that case?

Yes, you are right. SQLTransaction is not deleted. I finally reproduced a similar crash on my machine at V8SQLTransactionErrorCallback::handleEvent. The problem is toV8 returned an empty handle. Inside toV8, wrapSlow is called, and I guess some DOM objects have been disposed in the main thread after timeout but I haven't traced deeper here. We need to be very careful when shutting down the worker thread, because the main thread may be stopped already. I actually got another crash in workerThread() (see below for clearness).

v8::Handle<v8::Object> V8SQLError::wrapSlow(SQLError* impl)
{
    v8::Handle<v8::Object> wrapper;
    V8Proxy* proxy = 0;
    wrapper = V8DOMWrapper::instantiateV8Object(proxy, &info, impl);
    if (wrapper.IsEmpty())
        return wrapper;

    impl->ref();
    v8::Persistent<v8::Object> wrapperHandle = v8::Persistent<v8::Object>::New(wrapper);

    if (!hasDependentLifetime)
        wrapperHandle.MarkIndependent();
    getDOMObjectMap().set(impl, wrapperHandle);
    return wrapper;
}

 66     v8::Handle<v8::Value> errorHandle = toV8(error);
 67     if (errorHandle.IsEmpty()) {
 68         CRASH();
 69         return true;
 70     }

#0  0x000000000196b710 in WebCore::V8SQLTransactionErrorCallback::handleEvent (this=0x7ff65d909280, error=0x7ff65d9379a0) at out/Debug/obj/gen/webcore/bindings/V8SQLTransactionErrorCallback.cpp:68
#1  0x000000000145cdb5 in WebCore::SQLTransaction::deliverTransactionErrorCallback (this=0x7ff65d773840) at third_party/WebKit/Source/WebCore/storage/SQLTransaction.cpp:604
#2  0x000000000145aec3 in WebCore::SQLTransaction::performPendingCallback (this=0x7ff65d773840) at third_party/WebKit/Source/WebCore/storage/SQLTransaction.cpp:215
#3  0x000000000144034c in WebCore::DeliverPendingCallbackTask::performTask (this=0x7ff662bed980) at third_party/WebKit/Source/WebCore/storage/Database.cpp:342
#4  0x000000000148d6f5 in WebCore::WorkerRunLoop::Task::performTask (this=0x7ff65d9379c0, runLoop=..., context=0x7ff662259d00) at third_party/WebKit/Source/WebCore/workers/WorkerRunLoop.cpp:224
#5  0x000000000148d1a6 in WebCore::WorkerRunLoop::runInMode (this=0x7ff6621c5a30, context=0x7ff662259d00, predicate=...) at third_party/WebKit/Source/WebCore/workers/WorkerRunLoop.cpp:168
#6  0x000000000148ce75 in WebCore::WorkerRunLoop::run (this=0x7ff6621c5a30, context=0x7ff662259d00) at third_party/WebKit/Source/WebCore/workers/WorkerRunLoop.cpp:135
#7  0x00000000014919db in WebCore::WorkerThread::runEventLoop (this=0x7ff6621c5a00) at third_party/WebKit/Source/WebCore/workers/WorkerThread.cpp:189
#8  0x000000000147e593 in WebCore::DedicatedWorkerThread::runEventLoop (this=0x7ff6621c5a00) at third_party/WebKit/Source/WebCore/workers/DedicatedWorkerThread.cpp:66
#9  0x0000000001491842 in WebCore::WorkerThread::workerThread (this=0x7ff6621c5a00) at third_party/WebKit/Source/WebCore/workers/WorkerThread.cpp:161
#10 0x0000000001491542 in WebCore::WorkerThread::workerThreadStart (thread=0x7ff6621c5a00) at third_party/WebKit/Source/WebCore/workers/WorkerThread.cpp:128
#11 0x0000000001d4fc11 in WTF::threadEntryPoint (contextData=0x7ff662352640) at third_party/WebKit/Source/JavaScriptCore/wtf/Threading.cpp:67
#12 0x00007ff66b1c59ca in start_thread (arg=<optimized out>) at pthread_create.c:300
#13 0x00007ff668e1370d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#14 0x0000000000000000 in ?? ()




Another crash could happen:

#0  0x000000000050c517 in WebCore::PlatformSupport::didStopWorkerRunLoop (loop=0x7fffea7f58f0) at third_party/WebKit/Source/WebKit/chromium/src/PlatformSupport.cpp:1130
#1  0x000000000149187e in WebCore::WorkerThread::workerThread (this=0x7fffea7f58c0) at third_party/WebKit/Source/WebCore/workers/WorkerThread.cpp:165
#2  0x000000000149151e in WebCore::WorkerThread::workerThreadStart (thread=0x7fffea7f58c0) at third_party/WebKit/Source/WebCore/workers/WorkerThread.cpp:128
#3  0x0000000001d4fbed in WTF::threadEntryPoint (contextData=0x7fffeb18d4b0) at third_party/WebKit/Source/JavaScriptCore/wtf/Threading.cpp:67
#4  0x00007ffff37b69ca in start_thread (arg=<optimized out>) at pthread_create.c:300
#5  0x00007ffff140470d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

void PlatformSupport::didStopWorkerRunLoop(WorkerRunLoop* loop)
{
    webKitPlatformSupport()->didStopWorkerRunLoop(WebWorkerRunLoop(loop));
}

webKitPlatformSupport() could return NULL here, as it has been shutdown() already (see chromium/src/WebKit.cpp).
Comment 13 Michael Nordman 2012-02-17 11:40:29 PST
> Yes, you are right. SQLTransaction is not deleted. I finally reproduced a similar crash on my machine at V8SQLTransactionErrorCallback::handleEvent. The problem is toV8 returned an empty handle. Inside toV8, wrapSlow is called, and I guess some DOM objects have been disposed in the main thread after timeout but I haven't traced deeper here. We need to be very careful when shutting down the worker thread, because the main thread may be stopped already. I actually got another crash in workerThread() (see below for clearness).

This is fragile stuff. I'm less concerned about the process shutdown case than the worker shutdown case where the process will continue. (Crashing in the former case means we shutdown faster ;)

    v8::Handle<v8::Value> transactionHandle = toV8(transaction);
    if (transactionHandle.IsEmpty()) {
        CRASH();                      // <----------- possibly crash here
        return true;
    }

If we're convinced this is happening only in odd worker shutdown corner cases, a very proximate workaround for CRASH in particular could be to use 'custom' handlers with the CRASH() line removed. A more general solution, where canInvokeCallback() would return false while in this state of shutting down, might be better if we can really determine we're past the point of no return in some shutdown sequence.

> 
> void PlatformSupport::didStopWorkerRunLoop(WorkerRunLoop* loop)
> {
>     webKitPlatformSupport()->didStopWorkerRunLoop(WebWorkerRunLoop(loop));
> }
> 
> webKitPlatformSupport() could return NULL here, as it has been shutdown() already (see chromium/src/WebKit.cpp).

Ideally, worker threads should be stopped prior to shutdown() returning, idk if that's actually the case though. If not the case, shutdown() could monitor how many WebWorkerRunLoop are still spinning and wait till it drops to zero. This would only be an issue at process shutdown. The v8 related guys are more concerning.
Comment 14 Hao Zheng 2012-03-01 01:22:40 PST
Now, Linux Release has a crash, as CRASH() is effective in both Release and Debug. The crash can be seen more often on Android, partly because Android is slower than desktop. We can turn CRASH into ASSERT so that crash is only seen in Debug build. However, I think a better way is to fix the race condition.

http://build.chromium.org/p/chromium.webkit/builders/Webkit%20Linux/builds/21589
Comment 15 David Levin 2012-03-01 09:28:08 PST
See https://bugs.webkit.org/show_bug.cgi?id=50995

This is making the commit queue very unstable.
Comment 16 Michael Nordman 2012-03-02 13:03:05 PST
I think i see a way to determine if we're "past the point of no return" and should not bother trying to invoke the callback.

    bool ActiveDOMObjectCallbackImpl::canInvokeCallback()
    {
        MutexLocker locker(m_mutex);   // As an aside, why is there a mutex in this class???
        if (m_suspended || m_stopped)
            return false;

        // The new code is to get if execution has been terminated.
        ScriptExecutionContext* context = ActiveDOMObject::scriptExecutionContext();
        if (context && context->isWorkerContext()) {
            WorkerContext* workerContext = (WorkerContext*) context;
            if (workerContext->script()->isExecutionForbidden())
                return false;
        }
        return true;
    }


bool WorkerScriptController::isExecutionForbidden() const
{
    ASSERT(m_workerContext->isContextThread());
    if (m_executionForbidden)
        return true;

    // new code to detect termination and to pull that up into the forbidden flag
    m_executionForbidden = v8::V8::IsExecutionTerminating(m_isolate);
    return m_executionForbidden;
}

WorkerThread::stop() calls scheduleExecutionTermination() at the very start of the shutdown process.
Comment 17 Michael Nordman 2012-03-02 13:27:36 PST
Better yet... change the binding generated code to check for the 'isTerminating' condition prior to committing it's suicidal CRASH()


bool V8SQLTransactionErrorCallback::handleEvent(SQLError* error)
{
  .....
    v8::Handle<v8::Value> errorHandle = toV8(error);
    if (errorHandle.IsEmpty()) {
        if (!isScriptControllerTerminating())
            CRASH();
        return true;
    }
  .....
}


    bool ActiveDOMObjectCallbackImpl::isScriptControllerTerminating()
    {
        ScriptExecutionContext* context = ActiveDOMObject::scriptExecutionContext();
        if (!context)
            return true;
        if (context->isWorkerContext()) {
            WorkerContext* workerContext = (WorkerContext*) context;
            if (workerContext->script()->isExecutionForbidden())
                return true;
        }
        return false;
    }


bool WorkerScriptController::isExecutionForbidden() const
{
    ASSERT(m_workerContext->isContextThread());
    if (m_executionForbidden)
        return true;

    // new code to detect termination and to pull that up into the forbidden flag
    m_executionForbidden = v8::V8::IsExecutionTerminating(m_isolate);
    return m_executionForbidden;
}
Comment 18 Dmitry Lomov 2012-03-02 13:34:01 PST
(In reply to comment #17)
> Better yet... change the binding generated code to check for the 'isTerminating' condition prior to committing it's suicidal CRASH()

this actually sounds good to me.

> 
> 
> bool V8SQLTransactionErrorCallback::handleEvent(SQLError* error)
> {
>   .....
>     v8::Handle<v8::Value> errorHandle = toV8(error);
>     if (errorHandle.IsEmpty()) {
>         if (!isScriptControllerTerminating())
>             CRASH();
>         return true;
>     }
>   .....
> }
> 
> 
>     bool ActiveDOMObjectCallbackImpl::isScriptControllerTerminating()
>     {
>         ScriptExecutionContext* context = ActiveDOMObject::scriptExecutionContext();
>         if (!context)
>             return true;
>         if (context->isWorkerContext()) {
>             WorkerContext* workerContext = (WorkerContext*) context;
>             if (workerContext->script()->isExecutionForbidden())
>                 return true;
>         }
>         return false;
>     }
> 
> 
> bool WorkerScriptController::isExecutionForbidden() const
> {
>     ASSERT(m_workerContext->isContextThread());
>     if (m_executionForbidden)
>         return true;
> 
>     // new code to detect termination and to pull that up into the forbidden flag
>     m_executionForbidden = v8::V8::IsExecutionTerminating(m_isolate);
>     return m_executionForbidden;
> }
Comment 19 Michael Nordman 2012-03-02 13:48:07 PST
(In reply to comment #18)
> (In reply to comment #17)
> > Better yet... change the binding generated code to check for the 'isTerminating' condition prior to committing it's suicidal CRASH()
> 
> this actually sounds good to me.

you sound surprised :)
Comment 20 Dmitry Lomov 2012-03-02 13:51:30 PST
(In reply to comment #19)
> (In reply to comment #18)
> > (In reply to comment #17)
> > > Better yet... change the binding generated code to check for the 'isTerminating' condition prior to committing it's suicidal CRASH()
> > 
> > this actually sounds good to me.
> 
> you sound surprised :)

;) yes, I do not really understand how this callback gets called after V8 bindings shut down, but this might be a solution for now..
Comment 21 Michael Nordman 2012-03-02 19:09:55 PST
I have a patch in the works, will post for review next week.

> ;) yes, I do not really understand how this callback gets called after V8 bindings shut down, but this might be a solution for now..

What does "binding shut down" mean and when does that happen?

Here's my working theory...

WorkerThread::stop() is called at arbitrary time wrt whatever is happening on worker. It immediately calls scheduleExecutionTermination() and then post a WorkerThreadShutdownStartTask to the worker thread proper. On the worker thread, c++ is still running as if nothing has happened... lalala... comes upon time to invoke a callback and CRASH() since toV8() fails to produce anything... how kind of the bindings glopware to do that.

The patch should make it so that if that c++ code tries to call into v8 via one of these generated callback classes after scheduleExecutionTermination() has been invoked... will return harmlessly instead of CRASH()'ing.

I'll look for custom callbacks to poke at to similarly test for isTerminating().
Comment 22 Michael Nordman 2012-03-02 19:25:19 PST
> I have a patch in the works, will post for review next week.

Here's a preview, i'll jump thru the pita (webkit patching making / changelog editting / after moving changes over to a 'real' webkit client) hoops next week.

https://chromiumcodereview.appspot.com/9572031/
Comment 23 Michael Nordman 2012-03-05 17:55:54 PST
I've updated the patch in chromiumcodereview tool. The patch also simplifies the ActiveDOMCallback impl, https://chromiumcodereview.appspot.com/9572031/

I'll move it over to a this bug soon'ish.

(In reply to comment #22)
> > I have a patch in the works, will post for review next week.
> 
> Here's a preview, i'll jump thru the pita (webkit patching making / changelog editting / after moving changes over to a 'real' webkit client) hoops next week.
> 
> https://chromiumcodereview.appspot.com/9572031/
Comment 24 Michael Nordman 2012-03-06 16:24:03 PST
Created attachment 130464 [details]
v8bindings
Comment 25 David Levin 2012-03-06 16:28:22 PST
Comment on attachment 130464 [details]
v8bindings

View in context: https://bugs.webkit.org/attachment.cgi?id=130464&action=review

> Source/WebCore/bindings/v8/WorkerScriptController.cpp:105
> +    MutexLocker locker(m_scheduledTerminationMutex);

quick comment. I don't understand why one should protect this bool using a mutex? (What does that do?)
Comment 26 Michael Nordman 2012-03-06 16:40:34 PST
Comment on attachment 130464 [details]
v8bindings

View in context: https://bugs.webkit.org/attachment.cgi?id=130464&action=review

>> Source/WebCore/bindings/v8/WorkerScriptController.cpp:105
>> +    MutexLocker locker(m_scheduledTerminationMutex);
> 
> quick comment. I don't understand why one should protect this bool using a mutex? (What does that do?)

The setter is called as the first step of the worker shutdown process on the main thread. The getter is called after toV8(xxx) fails to return a valid value on the worker thread. If there's another way to guarantee that the getter will return true after the setter has been called, that'd be fine by me. Is there another way to make that guarantee?
Comment 27 Michael Nordman 2012-03-06 16:52:15 PST
> > quick comment. I don't understand why one should protect this bool using a mutex? (What does that do?)
> 
> The setter is called as the first step of the worker shutdown process on the main thread. The getter is called after toV8(xxx) fails to return a valid value on the worker thread. If there's another way to guarantee that the getter will return true after the setter has been called, that'd be fine by me. Is there another way to make that guarantee?

Maybe something in Atomics.h?
Comment 28 David Levin 2012-03-06 17:05:13 PST
(In reply to comment #26)
> (From update of attachment 130464 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=130464&action=review
> 
> >> Source/WebCore/bindings/v8/WorkerScriptController.cpp:105
> >> +    MutexLocker locker(m_scheduledTerminationMutex);
> > 
> > quick comment. I don't understand why one should protect this bool using a mutex? (What does that do?)
> 
> The setter is called as the first step of the worker shutdown process on the main thread. The getter is called after toV8(xxx) fails to return a valid value on the worker thread. If there's another way to guarantee that the getter will return true after the setter has been called, that'd be fine by me. Is there another way to make that guarantee?

I just don't understand how this guarantees anything.

Suppose thread 2 calls isExecutionTerminating is called just before scheduleExecutionTermination is called on thread 1. It returns false but right after it returned false scheduleExecutionTermination is called on thread 1.

You've got some deeper insight here but I'm not getting it.
Comment 29 Michael Nordman 2012-03-06 17:16:39 PST
toV8(xxx) can return empty anytime after scheduleExecutionTermination() has been called, we shouldn't crash in that case.

> Suppose thread 2 calls isExecutionTerminating is called just before scheduleExecutionTermination is called on thread 1. It returns false but right after it returned false scheduleExecutionTermination is called on thread 1.

in the bindings layer, the logic looks like this on your thread2

>   .....
>     v8::Handle<v8::Value> errorHandle = toV8(error);
>     if (errorHandle.IsEmpty()) {
>         if (!isScriptControllerTerminating())
>             CRASH();
>         return true;
>     }
>   .....

at the isScriptControllerTerminating() callsite, what happens afterwards on thread1 doesn't matter, what matters is what thread1 did prior to that call... if it has called scheduleExecutionTermination(), we shouldn't crash
Comment 30 David Levin 2012-03-06 17:42:16 PST
(In reply to comment #29)
> toV8(xxx) can return empty anytime after scheduleExecutionTermination() has been called, we shouldn't crash in that case.
> 
> > Suppose thread 2 calls isExecutionTerminating is called just before scheduleExecutionTermination is called on thread 1. It returns false but right after it returned false scheduleExecutionTermination is called on thread 1.
> 
> in the bindings layer, the logic looks like this on your thread2
> 
> >   .....
> >     v8::Handle<v8::Value> errorHandle = toV8(error);
> >     if (errorHandle.IsEmpty()) {
> >         if (!isScriptControllerTerminating())
> >             CRASH();
> >         return true;
> >     }
> >   .....
> 
> at the isScriptControllerTerminating() callsite, what happens afterwards on thread1 doesn't matter, what matters is what thread1 did prior to that call... if it has called scheduleExecutionTermination(), we shouldn't crash

I think I get it. It would be good to have a comment in the code here as it is pretty subtle.
Comment 31 David Levin 2012-03-06 17:55:38 PST
Comment on attachment 130464 [details]
v8bindings

View in context: https://bugs.webkit.org/attachment.cgi?id=130464&action=review

> Source/WebCore/ChangeLog:8
> +        - simplify ActiveDOMCallback, there is no need for it to support

Should be here:
bindings/generic/ActiveDOMCallback.cpp

> Source/WebCore/bindings/js/WorkerScriptController.cpp:165
> +    MutexLocker locker(m_scheduledTerminationMutex);

Why is the mutex needed here?

> Source/WebCore/bindings/v8/WorkerScriptController.cpp:2
> + * Copyright (C) 2012 Google Inc. All rights reserved.

Typically in WebKit the new year is appended.
 2009, 2012

> Source/WebCore/bindings/v8/WorkerScriptController.cpp:56
> +    , m_executionScheduledToTerminate(false)

Actually why add this? Doesn't WorkerRunLoop::terminated() suffice?

> Source/WebCore/dom/ScriptExecutionContext.cpp:95
> +    , m_reasonForSuspendingActiveDOMObjects(static_cast<ActiveDOMObject::ReasonForSuspension>(0))

Why did this get added?

Also if it must be initialized why not pick a value rather than do this typecasting thing.
Comment 32 Michael Nordman 2012-03-06 18:49:58 PST
No new patch yet...

>  It would be good to have a comment in the code here as it is pretty subtle.

This is the output of the code generator. Typically, there are no comments in the generated code. An example of its output can be seen in V8TestCallback.cpp. Also, i'm not sure it's all that subtle?

  // Only crash if we're not terminating
  if (!isScriptControllerTerminating())
      CRASH();

Not sure that really adds value, what comment would you like me to put there?


> > Source/WebCore/ChangeLog:8
> > +        - simplify ActiveDOMCallback, there is no need for it to support
> 
> Should be here:
> bindings/generic/ActiveDOMCallback.cpp
> 
> > Source/WebCore/bindings/js/WorkerScriptController.cpp:165
> > +    MutexLocker locker(m_scheduledTerminationMutex);
> 
> Why is the mutex needed here?

I'm confused, that's what we were just discussing?
Or maybe your asking "why add this to the JS class too?" If that's the question, purely to keep the two WorkerScriptController classes in sync.

> > Source/WebCore/bindings/v8/WorkerScriptController.cpp:2
> > + * Copyright (C) 2012 Google Inc. All rights reserved.
> 
> Typically in WebKit the new year is appended.
>  2009, 2012

will do

> > Source/WebCore/bindings/v8/WorkerScriptController.cpp:56
> > +    , m_executionScheduledToTerminate(false)
> 
> Actually why add this? Doesn't WorkerRunLoop::terminated() suffice?

Ditto, that's what we were just discussing?
It does not suffice, the worker run loop isn't terminated until after scheduleExecutionTermination() has been invoked.

> > Source/WebCore/dom/ScriptExecutionContext.cpp:95
> > +    , m_reasonForSuspendingActiveDOMObjects(static_cast<ActiveDOMObject::ReasonForSuspension>(0))
> 
> Why did this get added?

Purely code cleanup. I was assuming it was good practice to initialize data members in webcore, this one just wasn't being initialized. This initialization isn't needed to fix the bug and if its webkit practice to not initialize data members, i can remove it.

> Also if it must be initialized why not pick a value rather than do this typecasting thing.

An attempt to make it more clear that the value wasn't particularly important (maybe a not so successful attempt). I suppose if i had picked some value, you could ask "why did you pick that value rather than this value" as a review comment :)
Comment 33 David Levin 2012-03-06 19:30:50 PST
Almost done.

(In reply to comment #32)
> No new patch yet...
> 
> >  It would be good to have a comment in the code here as it is pretty subtle.
> 
> This is the output of the code generator. Typically, there are no comments in the generated code. An example of its output can be seen in V8TestCallback.cpp. Also, i'm not sure it's all that subtle?


I'm was attempting to refer to the mutex lock (which isn't autogenerated). The issue with using a mutex for a bool seemed subtle to me and could easily get "cleaned up" by someone in the future if there wasn't a warning.

Basically you're using the Mutex as a read barrier not for the mutex itself.


> > bindings/generic/ActiveDOMCallback.cpp
> > 
> > > Source/WebCore/bindings/js/WorkerScriptController.cpp:165
> > > +    MutexLocker locker(m_scheduledTerminationMutex);
> > 
> > Why is the mutex needed here?
> 
> I'm confused, that's what we were just discussing?

Different place.

> Or maybe your asking "why add this to the JS class too?" If that's the question, purely to keep the two WorkerScriptController classes in sync.

Yes, I guess this is another read barrier.

For these read barrier cases, it would be nice to note that it is being used as a read barrier and why you need a read barrier. Hope that makes sense.



> > > Source/WebCore/bindings/v8/WorkerScriptController.cpp:56
> > > +    , m_executionScheduledToTerminate(false)
> > 
> > Actually why add this? Doesn't WorkerRunLoop::terminated() suffice?
> 
> Ditto, that's what we were just discussing?
> It does not suffice, the worker run loop isn't terminated until after scheduleExecutionTermination() has been invoked.

I'm pretty sure that WorkerRunLoop::terminated is set as soon as scheduleExecutionTermination is called. 

See WorkerThread::stop()

> 
> > > Source/WebCore/dom/ScriptExecutionContext.cpp:95
> > > +    , m_reasonForSuspendingActiveDOMObjects(static_cast<ActiveDOMObject::ReasonForSuspension>(0))
> > 
> > Why did this get added?
> 
> Purely code cleanup. I was assuming it was good practice to initialize data members in webcore, this one just wasn't being initialized. This initialization isn't needed to fix the bug and if its webkit practice to not initialize data members, i can remove it.

I like deterministic behavior too. Perhaps it should be "-1" so that folks will still set it appropriately and not rely on 0. (If you want to keep 0, then use a real value -- otherwise it seems fragile).

> 
> if i had picked some value, you could ask "why did you pick that value rather than this value" as a review comment :)

Indeed I would have. (ChangeLog comments would help if you can project the questions of others looking at the code. If not, we'll do it this way :).)
Comment 34 David Levin 2012-03-07 07:55:53 PST
Thought about the barriers a bit more.

Given your usage there is no need to define them as class members.

Something like this will do the same thing:
  // Read barrier.
  {
    Mutex barrier;
    MutexLocker locker(barrier);
  }
  return whatever;


Or 
    write whatever 
  // Write barrier.
  {
    Mutex barrier;
    MutexLocker locker(barrier);
  }
This seems better overall.
Comment 35 Michael Nordman 2012-03-07 13:13:24 PST
About using WorkerRunLoop's terminated bit (instead of adding a new bit to WorkerScriptController), that looks like it might work.
At a minimum, would need some careful changes to WorkerThread.

void WorkerThread::stop()
{
    // Mutex protection is necessary because stop() can be called before the context is fully created.
    MutexLocker lock(m_threadCreationMutex);

    // Ensure that tasks are being handled by thread event loop. If script execution weren't forbidden, a while(1) loop in JS could keep the thread alive forever.
    if (m_workerContext) {
        m_workerContext->script()->scheduleExecutionTermination();  // *** WOULD HAVE TO MOVE THIS CALL BELOW .terminate()

#if ENABLE(SQL_DATABASE)
        DatabaseTracker::tracker().interruptAllDatabasesForContext(m_workerContext.get());
#endif
        m_runLoop.postTask(WorkerThreadShutdownStartTask::create());
    }
    m_runLoop.terminate();
}

And to move that callsite down below .terminate(), would need to modify WorkerThread::workerThread()
to enter/exit the m_threadCreationMutex prior to nulling out m_workerContext.

ActiveDOMCallback would need to make a call WorkerRunLoop::termniated() in it's new isScriptControllerTerminating() method, that's simple enough given the existing public interfaces...
    workerContext->thread()->runLoop().termniated()

Looks like we could get rid of the WorkerScriptController::isExecutionTerminating() method/mutex/bool in the patch in this way.

I'd be willing to give that a try, do you see any pitfalls in there?


> Given your usage there is no need to define them as class members.
> This seems better overall.

I think that's too subtle for my tastes. Not clear enough that it would actually yield the desired behavior.
Comment 36 David Levin 2012-03-07 13:45:08 PST
(In reply to comment #35)
> I'd be willing to give that a try, do you see any pitfalls in there?

Hmmm... thanks for spelling this out for me.
Let's avoid this. Sorry for the distrations.

> > Given your usage there is no need to define them as class members.
> > This seems better overall.
> 
> I think that's too subtle for my tastes. Not clear enough that it would actually yield the desired behavior.

How is it different? (The use of a mutex here is pretty subtle but from what you've said it sounds like you just want to use it as a memory barrier. I guess you could use a local mutex around the whole things to put up memory barriers on each side.)

For the jsc case it may be different since they may be doing more than setting a variable.  btw, is it safe to call m_globalData->terminator.shouldTerminate(); from other threads? (when m_globalData may be in use on another thread)
Comment 37 David Levin 2012-03-07 13:55:31 PST
Comment on attachment 130464 [details]
v8bindings

View in context: https://bugs.webkit.org/attachment.cgi?id=130464&action=review

>> Source/WebCore/bindings/js/WorkerScriptController.cpp:165
>> +    MutexLocker locker(m_scheduledTerminationMutex);
> 
> Why is the mutex needed here?

Comment 1.

> Source/WebCore/bindings/js/WorkerScriptController.cpp:171
> +    MutexLocker locker(m_scheduledTerminationMutex);

Comment 2.

> Source/WebCore/bindings/v8/WorkerScriptController.cpp:97
> +        MutexLocker locker(m_scheduledTerminationMutex);

Comment 3.

>>>> Source/WebCore/bindings/v8/WorkerScriptController.cpp:105
>>>> +    MutexLocker locker(m_scheduledTerminationMutex);
>>> 
>>> quick comment. I don't understand why one should protect this bool using a mutex? (What does that do?)
>> 
>> The setter is called as the first step of the worker shutdown process on the main thread. The getter is called after toV8(xxx) fails to return a valid value on the worker thread. If there's another way to guarantee that the getter will return true after the setter has been called, that'd be fine by me. Is there another way to make that guarantee?
> 
> I just don't understand how this guarantees anything.
> 
> Suppose thread 2 calls isExecutionTerminating is called just before scheduleExecutionTermination is called on thread 1. It returns false but right after it returned false scheduleExecutionTermination is called on thread 1.
> 
> You've got some deeper insight here but I'm not getting it.

Comment 4.
Comment 38 Michael Nordman 2012-03-07 14:51:13 PST
(In reply to comment #36)
> (In reply to comment #35)
> > I'd be willing to give that a try, do you see any pitfalls in there?
> 
> Hmmm... thanks for spelling this out for me.
> Let's avoid this. Sorry for the distrations.

Right, it didn't work out so well given the ScriptController can get cleared out prior to having a chance to invoke scheduleExecutionTermination() on it.

> > I think that's too subtle for my tastes. Not clear enough that it would actually yield the desired behavior.
> 
> How is it different? (The use of a mutex here is pretty subtle but from what you've said it sounds like you just want to use it as a memory barrier. I guess you could use a local mutex around the whole things to put up memory barriers on each side.)

For the sake of clarity, i want to enter/exit the same mutex in the getter and setter, not an arbitrary mutex. It's not clear to me that using an artbitrary mutex could be expected to yield correct results on mult-processor systems. It's crystal clear to me that using the same mutex should.

> For the jsc case it may be different since they may be doing more than setting a variable.  btw, is it safe to call m_globalData->terminator.shouldTerminate(); from other threads? (when m_globalData may be in use on another thread)

It should be fine, m_globalData->terminator just holds a simple bool under the covers and gets/sets it w/o any other artifacts.


sgtm... static_cast<ActiveDOMObject::ReasonForSuspension>(-1)
Comment 39 Michael Nordman 2012-03-07 15:09:17 PST
Created attachment 130706 [details]
v8bindings

new patch along the lines of the original with minor mods
- added comments about mutex usage (if you'd like other words, lmk and please be specific)
- init to invalid ActiveDOMObject::ReasonForSuspension>(-1) value
- poked at ChangeLog entry
- fixed up copyright dates in headers
Comment 40 Michael Nordman 2012-03-07 15:24:44 PST
*** Bug 50995 has been marked as a duplicate of this bug. ***
Comment 41 Michael Nordman 2012-03-07 17:36:39 PST
Comment on attachment 130706 [details]
v8bindings

putting in the cq
Comment 42 WebKit Review Bot 2012-03-07 19:08:02 PST
Comment on attachment 130706 [details]
v8bindings

Clearing flags on attachment: 130706

Committed r110131: <http://trac.webkit.org/changeset/110131>
Comment 43 WebKit Review Bot 2012-03-07 19:08:09 PST
All reviewed patches have been landed.  Closing bug.
Comment 44 Michael Nordman 2012-03-07 19:37:53 PST
ooops... busted the qt build... fixed here (i hope)
http://trac.webkit.org/changeset/110136
Comment 45 WebKit Commit Bot 2013-05-23 06:08:02 PDT
The commit-queue just saw fast/workers/storage/use-same-database-in-page-and-workers.html flake (text diff) while processing attachment 202664 [details] on bug 116659.
Bot: webkit-cq-02  Port: <class 'webkitpy.common.config.ports.MacPort'>  Platform: Mac OS X 10.8.3
Comment 46 WebKit Commit Bot 2013-05-23 06:08:06 PDT
Created attachment 202675 [details]
Archive of layout-test-results from webkit-cq-02