Bug 89123

Summary: JSLock should be per-JSGlobalData
Product: WebKit Reporter: Mark Hahnenberg <mhahnenberg>
Component: JavaScriptCoreAssignee: Mark Hahnenberg <mhahnenberg>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, ap, ddkilzer, dglazkov, ggaren, gustavo, haraken, japhet, jochen, kbalazs, levin+threading, ossy, philn, rakuco, webkit.review.bot, xan.lopez
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 89807, 89809    
Bug Blocks:    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Try to fix Qt
none
Try to fix Qt again
none
Trying to fix windows now
none
Trying windows again
none
Windows
none
Windows
none
Win
none
Patch ggaren: review+

Description Mark Hahnenberg 2012-06-14 13:32:47 PDT
Currently JSLock is a global lock. It also doesn't quite work with our GCActivityCallback. To make it easier for clients to use the JSC API safely in a concurrent fashion, we should change JSLock to be per-JSGlobalData. The locking model is then: whenever something is about to touch JSC, it needs to take the JSLock in the JSGlobalData for the VM it is about to interact with.
Comment 1 Alexey Proskuryakov 2012-06-14 21:59:17 PDT
JSLock is legacy. If you plan to introduce a new threading model in JSC, it should probably not build on top of that.
Comment 2 Mark Hahnenberg 2012-06-15 12:40:34 PDT
(In reply to comment #1)
> JSLock is legacy. If you plan to introduce a new threading model in JSC, it should probably not build on top of that.

I'm not sure what you mean by this. Does the fact that it's "legacy" mean I can't modify its functionality? Because whatever new thing I create needs to be locked in all the same places as JSLock. So why not just call it JSLock?
Comment 3 Mark Hahnenberg 2012-06-15 12:44:23 PDT
To be more concrete, I've written a patch that changes how JSLock works so that each JSGlobalData has its own JSLock (which I've dubbed an API lock), and whenever anybody wants to interact with that JSGlobalData or its associated context, they must first acquire its API lock.

It seems to work fine. It passes all unit tests and passes most unit tests with --threaded turned on. The only ones that it doesn't pass with --threaded are just timeouts from all the stuff that the extra JS threads are doing.
Comment 4 Mark Hahnenberg 2012-06-15 17:32:34 PDT
Created attachment 147925 [details]
Patch
Comment 5 Build Bot 2012-06-15 17:51:24 PDT
Comment on attachment 147925 [details]
Patch

Attachment 147925 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/12961734
Comment 6 Early Warning System Bot 2012-06-15 17:53:59 PDT
Comment on attachment 147925 [details]
Patch

Attachment 147925 [details] did not pass qt-wk2-ews (qt):
Output: http://queues.webkit.org/results/12960749
Comment 7 Early Warning System Bot 2012-06-15 17:56:39 PDT
Comment on attachment 147925 [details]
Patch

Attachment 147925 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/12966541
Comment 8 Gustavo Noronha (kov) 2012-06-15 18:49:55 PDT
Comment on attachment 147925 [details]
Patch

Attachment 147925 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/12964692
Comment 9 Gyuyoung Kim 2012-06-15 19:02:46 PDT
Comment on attachment 147925 [details]
Patch

Attachment 147925 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/12955938
Comment 10 Geoffrey Garen 2012-06-17 21:45:16 PDT
Qt build failed:

/usr/bin/gold: /storage/WebKit-qt-ews/WebKitBuild/Release/Source/JavaScriptCore/release/libJavaScriptCore.a(JSGlobalData.o): in function JSC::JSGlobalData::~JSGlobalData():JSGlobalData.cpp(.text+0x2aba): error: undefined reference to 'JSC::IncrementalSweeper::signalAndWaitForExit()'
/usr/bin/gold: /storage/WebKit-qt-ews/WebKitBuild/Release/Source/JavaScriptCore/release/libJavaScriptCore.a(JSGlobalData.o): in function JSC::JSGlobalData::~JSGlobalData():JSGlobalData.cpp(.text+0x402e): error: undefined reference to 'JSC::IncrementalFailed to run "['Tools/Scripts/build-webkit', '--release', '--qt', '--makeargs="-j12"']" exit_code: 2

Windows build failed:

3>..\..\runtime\JSLock.cpp(268) : error C2511: 'JSC::JSLock::JSLock(JSC::ExecState *)' : overloaded member function not found in 'JSC::JSLock'
3>        c:\cygwin\home\buildbot\webkit\source\javascriptcore\runtime\JSLock.h(79) : see declaration of 'JSC::JSLock'
3>..\..\runtime\JSLock.cpp(299) : error C2059: syntax error : '}'
3>..\..\runtime\JSLock.cpp(300) : error C2612: trailing 'end of file' illegal in base/member initializer list
3>..\..\runtime\JSLock.cpp(300) : fatal error C1004: unexpected end-of-file found

GTK build failed:

./.libs/libjavascriptcoregtk-3.0.so: error: undefined reference to 'JSC::IncrementalSweeper::signalAndWaitForExit()'
collect2: error: ld returned 1 exit status
Comment 11 Geoffrey Garen 2012-06-17 22:26:41 PDT
Comment on attachment 147925 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=147925&action=review

Good first step, but it looks like this needs a bit more work.

> Source/JavaScriptCore/ChangeLog:109
> +        (JSC::GlobalJSLock::GlobalJSLock): This is our new global lock that must be taken when creating new contexts.
> +        (JSC::GlobalJSLock::~GlobalJSLock):

I didn't understand this part. What data does the global lock protect?

> Source/JavaScriptCore/jsc.cpp:686
> +    globalData.clear();

This should be automatic.

> Source/JavaScriptCore/heap/Heap.cpp:173
> +    if (globalData->isSharedInstance() && !isValidSharedInstanceThreadState(globalData))

Even non-"shared instance" VMs need locking, since they need mutual exclusion against GC on another thread. So, I think you should remove the isSharedInstance() escape from this ASSERT, and similar ASSERTs.

> Source/JavaScriptCore/heap/Heap.cpp:326
> +    ASSERT(m_globalData->apiLock().currentThreadIsHoldingLock() || !m_globalData->isSharedInstance());

Same comment about isSharedInstance() ASSERTs.

> Source/JavaScriptCore/heap/Heap.cpp:337
> +    ASSERT(m_globalData->apiLock().currentThreadIsHoldingLock() || !m_globalData->isSharedInstance());

Same comment about isSharedInstance() ASSERTs.

> Source/JavaScriptCore/heap/Heap.cpp:691
> +    ASSERT(globalData()->apiLock().currentThreadIsHoldingLock());

Same comment about isSharedInstance() ASSERTs.

> Source/JavaScriptCore/runtime/JSGlobalData.cpp:230
> +    // This must be the very first thing we do before tearing down any more of this object
> +    // because the Heap's GCActivityCallback could run another GC on a remote thread.
> +    heap.activityCallback()->signalAndWaitForExit();
> +    heap.sweeper()->signalAndWaitForExit();

Trying to reason through this signal and wait tear-down phase hurt my brain a little bit.

The fundamental problem we're trying to solve here is that a timer might fire with a pointer to an invalid context (since JSGlobalData tear-down may have freed the memory pointed to by the context). (The timer itself will still be valid since it's referenced by the run loop.)

One possible alternative is to have our timers ref / deref the JSGlobalData. Maybe we can come up with something better.

> Source/JavaScriptCore/runtime/JSGlobalData.h:185
> +        JSLock m_apiLock;

Do we only need this lock for API clients, or does WebKit need to use it too?

> Source/JavaScriptCore/runtime/JSLock.cpp:79
> +    if (m_globalData->refCount() > 1)
> +        m_globalData->apiLock().unlock();

It's a bit sketchy to test the refCount of something that can be ref'd/deref'd on multiple threads. Is there another way to do this?

> Source/JavaScriptCore/runtime/JSLock.cpp:99
> +    if (m_lockBehavior == SilenceAssertionsOnly)

I don't think we can support SilenceAssertionsOnly mode anymore, can we?

> Source/JavaScriptCore/runtime/JSLock.cpp:106
> +            SpinLockHolder holder(&m_spinLock);

I don't think the extra layer of locking buys us anything. Is this a speedup on something?
Comment 12 Geoffrey Garen 2012-06-18 08:25:41 PDT
Comment on attachment 147925 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=147925&action=review

> Source/JavaScriptCore/API/APIShims.h:-45
> -        m_globalData->heap.activityCallback()->synchronize();

I think you need to keep this call to synchronize(). Otherwise, if JSC usage starts on thread A, then continues on thread B, and thread A exits, the GC timer stops firing forever.

For the same reason, I think you need to *add* a synchronize function to the incremental sweeper.

I believe it's OK to remove the other two synchronize() calls, though.
Comment 13 Geoffrey Garen 2012-06-18 09:13:29 PDT
Comment on attachment 147925 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=147925&action=review

> Source/JavaScriptCore/heap/IncrementalSweeper.cpp:78
> +void IncrementalSweeper::signalAndWaitForExit()

Now that my brain hurts a little less, here's one scenario that could go make signalAndWaitForExit go wrong:
(1) Timer fires on thread A, enters function prologue for timerDidFire
(2) Thread A is interrupted
(3) Thread B deletes JSGlobalData
(4) Thread A resumes timerDidFire and dereferences a bunch of deleted memory

The simplest thing that could go wrong in (4) is that thisObject->m_deathMutex is garbage, so the lock deadlocks.

Here's an alternative strategy that I think would work:
(1) Give the GC activity callback and incremental sweeper a mutex, and a JSGlobalData* m_globalData
(2) Change timerDidFire to lock the mutex, then, if m_globalData is NULL, release the mutex, delete the callback object (canceling the timer), and return early
    (*) This is thread-safe because, once NULL, m_globalData will never become non-NULL again
(5) Change ~JSGlobalData to call invalidate(), which:
    (a) locks the mutex
    (b) if the timer's run loop is the current run loop, releases the mutex and deletes the callback object (canceling the timer)
    (b) else, sets m_globalData to NULL and changes the timer fire date to the distant past (this ensures timely termination on another thread)
Comment 14 Mark Hahnenberg 2012-06-20 18:45:32 PDT
Created attachment 148705 [details]
Patch
Comment 15 Early Warning System Bot 2012-06-20 19:10:31 PDT
Comment on attachment 148705 [details]
Patch

Attachment 148705 [details] did not pass qt-wk2-ews (qt):
Output: http://queues.webkit.org/results/13018070
Comment 16 Build Bot 2012-06-20 19:15:46 PDT
Comment on attachment 148705 [details]
Patch

Attachment 148705 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/13012086
Comment 17 Build Bot 2012-06-20 19:29:23 PDT
Comment on attachment 148705 [details]
Patch

Attachment 148705 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/13011078
Comment 18 Early Warning System Bot 2012-06-20 19:41:03 PDT
Comment on attachment 148705 [details]
Patch

Attachment 148705 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/13022042
Comment 19 WebKit Review Bot 2012-06-20 20:04:24 PDT
Comment on attachment 148705 [details]
Patch

Attachment 148705 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13025004
Comment 20 Gyuyoung Kim 2012-06-20 21:16:53 PDT
Comment on attachment 148705 [details]
Patch

Attachment 148705 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/13022067
Comment 21 Mark Hahnenberg 2012-06-21 11:05:13 PDT
Created attachment 148843 [details]
Patch
Comment 22 Build Bot 2012-06-21 11:28:38 PDT
Comment on attachment 148843 [details]
Patch

Attachment 148843 [details] did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/13034204
Comment 23 Early Warning System Bot 2012-06-21 11:39:22 PDT
Comment on attachment 148843 [details]
Patch

Attachment 148843 [details] did not pass qt-wk2-ews (qt):
Output: http://queues.webkit.org/results/13039155
Comment 24 Early Warning System Bot 2012-06-21 11:49:59 PDT
Comment on attachment 148843 [details]
Patch

Attachment 148843 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/13033189
Comment 25 WebKit Review Bot 2012-06-21 12:29:05 PDT
Comment on attachment 148843 [details]
Patch

Attachment 148843 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13039169
Comment 26 Build Bot 2012-06-21 13:55:31 PDT
Comment on attachment 148843 [details]
Patch

Attachment 148843 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/13039188
Comment 27 Mark Hahnenberg 2012-06-21 14:50:59 PDT
Created attachment 148890 [details]
Patch
Comment 28 Early Warning System Bot 2012-06-21 15:14:54 PDT
Comment on attachment 148890 [details]
Patch

Attachment 148890 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/13028255
Comment 29 Mark Hahnenberg 2012-06-21 15:17:45 PDT
Created attachment 148898 [details]
Try to fix Qt
Comment 30 Early Warning System Bot 2012-06-21 15:43:41 PDT
Comment on attachment 148898 [details]
Try to fix Qt

Attachment 148898 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/13035207
Comment 31 Mark Hahnenberg 2012-06-21 15:45:37 PDT
Created attachment 148904 [details]
Try to fix Qt again
Comment 32 Build Bot 2012-06-21 17:50:30 PDT
Comment on attachment 148904 [details]
Try to fix Qt again

Attachment 148904 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/13025256
Comment 33 Mark Hahnenberg 2012-06-21 17:51:13 PDT
Created attachment 148925 [details]
Trying to fix windows now
Comment 34 Build Bot 2012-06-21 19:05:44 PDT
Comment on attachment 148925 [details]
Trying to fix windows now

Attachment 148925 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/13028307
Comment 35 Mark Hahnenberg 2012-06-21 19:33:30 PDT
Created attachment 148944 [details]
Trying windows again
Comment 36 Build Bot 2012-06-21 20:03:22 PDT
Comment on attachment 148944 [details]
Trying windows again

Attachment 148944 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/13038289
Comment 37 Mark Hahnenberg 2012-06-21 20:23:21 PDT
Created attachment 148950 [details]
Windows
Comment 38 Mark Hahnenberg 2012-06-21 20:46:44 PDT
Created attachment 148954 [details]
Windows
Comment 39 Build Bot 2012-06-21 21:26:35 PDT
Comment on attachment 148954 [details]
Windows

Attachment 148954 [details] did not pass win-ews (win):
Output: http://queues.webkit.org/results/13027355
Comment 40 Mark Hahnenberg 2012-06-22 07:46:25 PDT
Created attachment 149025 [details]
Win
Comment 41 Gavin Barraclough 2012-06-22 14:00:25 PDT
Comment on attachment 149025 [details]
Win

View in context: https://bugs.webkit.org/attachment.cgi?id=149025&action=review

> Source/JavaScriptCore/runtime/JSLock.cpp:42
> +static pthread_mutex_t giantGlobalJSLock = PTHREAD_MUTEX_INITIALIZER;

I'm not sure that 'giant' adds much here, but sure, why not. :o)

> Source/JavaScriptCore/runtime/JSLock.cpp:100
> +    {

Looks like this extra block is redundant, the scope goes to the end of the function anyway – should be able to remove.

> Source/WebCore/bridge/c/c_class.cpp:51
> +    // FIXME: Should we acquire a JSLock here?

Doesn't look like this code touches JSC, I'd delete this FIXME, looks like this was just junk code.

If you think there is a real risk, then keep the fixme & file new bugzilla bugs to track that this should be investigated. (though I think this is unnecessary)

> Source/WebCore/bridge/jni/jsc/JavaClassJSC.cpp:66
> +                // FIXME: Should we acquire a JSLock here?

Ditto to c_class, delete the FIXME.

> Source/WebCore/bridge/jni/jsc/JavaClassJSC.cpp:82
> +                // FIXME: Should we acquire a JSLock here?

Ditto to c_class, delete the FIXME.

> Source/WebCore/bridge/jni/jsc/JavaClassJSC.cpp:103
> +    // FIXME: Should we acquire a JSLock here?

Ditto to c_class, delete the FIXME.

> Source/WebCore/bridge/jni/jsc/JavaMethodJSC.cpp:114
> +        // FIXME: Should we acquire a JSLock here?

Ditto to c_class, delete the FIXME.
Comment 42 Mark Hahnenberg 2012-06-22 14:43:27 PDT
Committed r121058: <http://trac.webkit.org/changeset/121058>
Comment 43 Csaba Osztrogonác 2012-06-23 01:53:55 PDT
(In reply to comment #42)
> Committed r121058: <http://trac.webkit.org/changeset/121058>

It made plugin tests assert, I filed a new bug report for it - https://bugs.webkit.org/show_bug.cgi?id=89807
Comment 44 WebKit Review Bot 2012-06-23 06:28:29 PDT
Re-opened since this is blocked by 89809
Comment 45 Mark Hahnenberg 2012-06-25 16:46:36 PDT
Created attachment 149392 [details]
Patch
Comment 46 Balazs Kelemen 2012-06-26 09:43:12 PDT
(In reply to comment #45)
> Created an attachment (id=149392) [details]
> Patch

Isn't it already landed?
Comment 47 Mark Hahnenberg 2012-06-26 09:47:54 PDT
(In reply to comment #46)
> (In reply to comment #45)
> > Created an attachment (id=149392) [details] [details]
> > Patch
> 
> Isn't it already landed?

No, it was rolled out.
Comment 48 Csaba Osztrogonác 2012-06-26 09:49:12 PDT
(In reply to comment #46)
> (In reply to comment #45)
> > Created an attachment (id=149392) [details] [details]
> > Patch
> 
> Isn't it already landed?
No, it isn't. http://trac.webkit.org/changeset/121058 was the original patch, which was rolled out by http://trac.webkit.org/changeset/121098. I think it is the fixed patch.
Comment 49 Geoffrey Garen 2012-06-27 12:13:13 PDT
Comment on attachment 149392 [details]
Patch

r=me on the new changes

I think "sharedInstanceLock" would be a better name than "giantGlobalJSLock".
Comment 50 Mark Hahnenberg 2012-06-27 16:08:41 PDT
Committed r121381: <http://trac.webkit.org/changeset/121381>