169819 – [JSC] MachineThreads does not consider situation that one thread has multiple VMs

RESOLVED FIXED 169819

[JSC] MachineThreads does not consider situation that one thread has multiple VMs

https://bugs.webkit.org/show_bug.cgi?id=169819

Summary [JSC] MachineThreads does not consider situation that one thread has multiple...

Yusuke Suzuki

Reported 2017-03-17 11:04:02 PDT

[JSC] MachineThreads does not consider situation that one thread has multiple VMs

Attachments
Patch (17.58 KB, patch) 2017-03-17 11:04 PDT, Yusuke Suzuki	no flags	Details Formatted Diff Diff
Patch (18.77 KB, patch) 2017-03-17 11:11 PDT, Yusuke Suzuki	no flags	Details Formatted Diff Diff
Archive of layout-test-results from ews107 for mac-elcapitan-wk2 (1.00 MB, application/zip) 2017-03-17 12:19 PDT, Build Bot	no flags	Details
Patch (19.16 KB, patch) 2017-03-18 07:28 PDT, Yusuke Suzuki	no flags	Details Formatted Diff Diff
Patch (19.13 KB, patch) 2017-03-21 14:47 PDT, Yusuke Suzuki	no flags	Details Formatted Diff Diff
Patch (17.41 KB, patch) 2017-03-23 08:17 PDT, Yusuke Suzuki	no flags	Details Formatted Diff Diff
Patch (17.74 KB, patch) 2017-03-23 14:03 PDT, Yusuke Suzuki	no flags	Details Formatted Diff Diff
Show Obsolete (6) View All Add attachment proposed patch, testcase, etc.

Yusuke Suzuki

Comment 1 2017-03-17 11:04:20 PDT

Created attachment 304797 [details] Patch

Yusuke Suzuki

Comment 2 2017-03-17 11:11:10 PDT

Created attachment 304800 [details] Patch

Build Bot

Comment 3 2017-03-17 12:19:39 PDT

Comment on attachment 304800 [details] Patch Attachment 304800 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/3350965 New failing tests: webrtc/peer-connection-audio-mute2.html

Build Bot

Comment 4 2017-03-17 12:19:43 PDT

Created attachment 304808 [details] Archive of layout-test-results from ews107 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6

Yusuke Suzuki

Comment 5 2017-03-18 02:37:17 PDT

I think this patch should include futher changes: Start suspending threads in predefined manner to avoid deadlock.

Yusuke Suzuki

Comment 6 2017-03-18 07:28:11 PDT

Created attachment 304867 [details] Patch

Yusuke Suzuki

Comment 7 2017-03-18 07:28:37 PDT

(In reply to comment #5) > I think this patch should include futher changes: Start suspending threads > in predefined manner to avoid deadlock. As a first step, this patch just separates ThreadData from Thread.

Yusuke Suzuki

Comment 8 2017-03-21 14:47:11 PDT

Created attachment 305030 [details] Patch

Yusuke Suzuki

Comment 9 2017-03-22 10:59:02 PDT

Ping?

Mark Lam

Comment 10 2017-03-22 21:43:19 PDT

Comment on attachment 305030 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=305030&action=review > Source/JavaScriptCore/ChangeLog:10 > + While MachineThreads::Thread is created for every MachineThreads. Thus, one thread > + can have multiple MachineThreads::Thread if the thread has multiple VMs. > + It actually causes deadlock in GTK port DatabaseProcess. I think I understand what the issue is now. It's not so much that every VM allocates its own MachineThreads::Thread per thread it runs on, but that the Linux implementation of thread suspend/resume relies on having a thread specific singleton that houses the semaphore needed to communicate with the originating thread that requested the suspend/resume. As a result, when there's potentially more than one VM running on the same thread, MachineThreads::Thread cannot serve as that thread specific singleton that the Linux port needs. Hence, you're factoring out the ThreadData part as the thread specific singleton. Did I understand the issue correctly? > Source/JavaScriptCore/heap/MachineStackMarker.cpp:73 > -thread_local static std::atomic<JSC::MachineThreads::Thread*> threadLocalCurrentThread; > +static std::atomic<JSC::MachineThreads::ThreadData*> globalThreadDataToSuspend { nullptr }; Why is this needed? Shouldn't there only be 1 ThreadData per thread, and hence, you can use something like: thread_local static std::atomic<JSC::MachineThreads::ThreadData*> threadLocalCurrentThreadData; ... and revert the conversion to using globalThreadDataToSuspend below? > Source/JavaScriptCore/heap/MachineStackMarker.cpp:207 > + static NeverDestroyed<ThreadSpecific<MachineThreads::ThreadData, CanBeGCThread::True>> threadData; > + return threadData.get(); I'm not too clear on the interaction of NeverDestroyed with ThreadSpecific here. My concern is whether this will leak memory for every thread that is started and ended. For example, let's say you have the following scenario: 1. Thread T1 creates a VM. 2. Thread T2 enters the VM, thereby creating a ThreadSpecific<ThreadData> for T2. 3. Thread T2 exits the VM, and terminates itself. 4. Thread T3 now enters the VM, and creates a ThreadSpecific<ThreadData> for T3. 5. Thread T3 exits the VM, and terminated itself. 6. Repeat steps 2 and 3 for thread T4, T5, etc. just like we did for thread T3 in steps 4 and 5. On thread destruction, I know that ThreadSpecific will take care of destructing ThreadData and freeing its memory. What I suspect is that NeverDestroyed will allocated a memory for ThreadSpecific, but not free it. As a result, for every thread that is created and dies, we'll end up leaking sizeof(ThreadSpecific). Can you test and confirm that we won't be leaking memory here? > Source/JavaScriptCore/heap/MachineStackMarker.h:67 > + class ThreadData { This needs to be WTF_MAKE_FAST_ALLOCATED because ThreadSpecific::destroy() will call fastFree() on it. > Source/JavaScriptCore/heap/MachineStackMarker.h:147 > + bool suspend() > + { > + return data->suspend(); > + } > + void resume() > + { > + data->resume(); > + } > + size_t getRegisters(Registers& regs) > + { > + return data->getRegisters(regs); > + } > + void freeRegisters(Registers& regs) > + { > + data->freeRegisters(regs); > + } > + std::pair<void*, size_t> captureStack(void* stackTop) > + { > + return data->captureStack(stackTop); > + } nit: since these only have a single statement in their body, you can make them all single line just like the other methods on Thread.

Yusuke Suzuki

Comment 11 2017-03-23 06:04:31 PDT

Comment on attachment 305030 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=305030&action=review >> Source/JavaScriptCore/ChangeLog:10 >> + It actually causes deadlock in GTK port DatabaseProcess. > > I think I understand what the issue is now. It's not so much that every VM allocates its own MachineThreads::Thread per thread it runs on, but that the Linux implementation of thread suspend/resume relies on having a thread specific singleton that houses the semaphore needed to communicate with the originating thread that requested the suspend/resume. As a result, when there's potentially more than one VM running on the same thread, MachineThreads::Thread cannot serve as that thread specific singleton that the Linux port needs. Hence, you're factoring out the ThreadData part as the thread specific singleton. Did I understand the issue correctly? Right. >> Source/JavaScriptCore/heap/MachineStackMarker.cpp:207 >> + return threadData.get(); > > I'm not too clear on the interaction of NeverDestroyed with ThreadSpecific here. My concern is whether this will leak memory for every thread that is started and ended. For example, let's say you have the following scenario: > > 1. Thread T1 creates a VM. > 2. Thread T2 enters the VM, thereby creating a ThreadSpecific<ThreadData> for T2. > 3. Thread T2 exits the VM, and terminates itself. > 4. Thread T3 now enters the VM, and creates a ThreadSpecific<ThreadData> for T3. > 5. Thread T3 exits the VM, and terminated itself. > 6. Repeat steps 2 and 3 for thread T4, T5, etc. just like we did for thread T3 in steps 4 and 5. > > On thread destruction, I know that ThreadSpecific will take care of destructing ThreadData and freeing its memory. > What I suspect is that NeverDestroyed will allocated a memory for ThreadSpecific, but not free it. As a result, for every thread that is created and dies, we'll end up leaking sizeof(ThreadSpecific). > > Can you test and confirm that we won't be leaking memory here? Oooooops! Nice catch. Right, NeverDestroyed will not call the destructor. I'll offer another way to allocate thread local storage for that, this should be something like PerThread in bmalloc. >> Source/JavaScriptCore/heap/MachineStackMarker.h:67 >> + class ThreadData { > > This needs to be WTF_MAKE_FAST_ALLOCATED because ThreadSpecific::destroy() will call fastFree() on it. Thanks! Fixed. >> Source/JavaScriptCore/heap/MachineStackMarker.h:147 >> + } > > nit: since these only have a single statement in their body, you can make them all single line just like the other methods on Thread. OK, fixed.

Yusuke Suzuki

Comment 12 2017-03-23 07:59:36 PDT

Comment on attachment 305030 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=305030&action=review >>> Source/JavaScriptCore/heap/MachineStackMarker.cpp:207 >>> + return threadData.get(); >> >> I'm not too clear on the interaction of NeverDestroyed with ThreadSpecific here. My concern is whether this will leak memory for every thread that is started and ended. For example, let's say you have the following scenario: >> >> 1. Thread T1 creates a VM. >> 2. Thread T2 enters the VM, thereby creating a ThreadSpecific<ThreadData> for T2. >> 3. Thread T2 exits the VM, and terminates itself. >> 4. Thread T3 now enters the VM, and creates a ThreadSpecific<ThreadData> for T3. >> 5. Thread T3 exits the VM, and terminated itself. >> 6. Repeat steps 2 and 3 for thread T4, T5, etc. just like we did for thread T3 in steps 4 and 5. >> >> On thread destruction, I know that ThreadSpecific will take care of destructing ThreadData and freeing its memory. >> What I suspect is that NeverDestroyed will allocated a memory for ThreadSpecific, but not free it. As a result, for every thread that is created and dies, we'll end up leaking sizeof(ThreadSpecific). >> >> Can you test and confirm that we won't be leaking memory here? > > Oooooops! Nice catch. Right, NeverDestroyed will not call the destructor. > I'll offer another way to allocate thread local storage for that, this should be something like PerThread in bmalloc. Ah, no. NeverDestroyed<> is created as static. And its storage for ThreadSpecific itself is allocated per process (by static). Thus, I think it does not cause any memory leaks. static NeverDestroyed<ThreadSpecific<T>> t; In the above case, ThreadSpecific<T>'s storage itself is statically allocated by NeverDestroyed<> per process. And T()'s memory is allocated per thread in ThreadSpecific. And this allocated per thread memory should be deleted by the registered destructor by using pthread_key_create. See ThreadSpecific<T>::destroy(void* ptr) function. It finally calls `delete ptr`.

Yusuke Suzuki

Comment 13 2017-03-23 08:06:36 PDT

Comment on attachment 305030 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=305030&action=review >> Source/JavaScriptCore/heap/MachineStackMarker.cpp:73 >> +static std::atomic<JSC::MachineThreads::ThreadData*> globalThreadDataToSuspend { nullptr }; > > Why is this needed? Shouldn't there only be 1 ThreadData per thread, and hence, you can use something like: > thread_local static std::atomic<JSC::MachineThreads::ThreadData*> threadLocalCurrentThreadData; > > ... and revert the conversion to using globalThreadDataToSuspend below? I'm a bit worried about using thread_local inside the signal handler. But seeing the wiki https://sourceware.org/glibc/wiki/TLSandSignals, it seems that thread_local is async signal safe. So, we can just use thread_local. Reverting the changes about globalThreadDataToSuspend.

Yusuke Suzuki

Comment 14 2017-03-23 08:17:00 PDT

Created attachment 305196 [details] Patch

Mark Lam

Comment 15 2017-03-23 10:07:21 PDT

Comment on attachment 305196 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=305196&action=review > Source/JavaScriptCore/ChangeLog:10 > + While MachineThreads::Thread is created for every MachineThreads. Thus, one thread > + can have multiple MachineThreads::Thread if the thread has multiple VMs. > + It actually causes deadlock in GTK port DatabaseProcess. I suggest adding a little bit of context here like this: "The Linux port of PlatformThread suspend/resume mechanism relies on having a thread specific singleton thread data, and was relying on MachineThreads::Thread to be this thread specific singleton. But because MachineThreads::Thread is not a thread specific singleton, we can get a deadlock in the GTK port's DatabaseProcess." > Source/JavaScriptCore/ChangeLog:13 > + As a first step, this patch simply moves per thread data from MachineThreads::Thread > + to MachineThreads::ThreadData. This patch is no longer just a first step, right? It actually solves the issue, right? How about rephrasing this as: "This patch fixes this issue by moving per thread data from MachineThreads::Thread to MachineThreads::ThreadData, where there will only be one instance of MachineThreads::ThreadData per thread. Each MachineThreads::Thread will now point to the same MachineThreads::ThreadData for any given thread." > Source/JavaScriptCore/heap/MachineStackMarker.cpp:207 > + static NeverDestroyed<ThreadSpecific<MachineThreads::ThreadData, CanBeGCThread::True>> threadData; > + return threadData.get(); Looking at the implementation of ThreadSpecific<T>, it appears that it does not instantiate T until you invoke operator T*() on it. Invoking get() returns null if it's not initialized. Can you assert that threadData.get() is actually returning what you expect? > Source/JavaScriptCore/heap/MachineStackMarker.h:141 > + class Thread { > + WTF_MAKE_FAST_ALLOCATED; > + Thread(ThreadData*); > + > + public: > + using Registers = ThreadData::Registers; > + > + static Thread* createForCurrentThread(); > + > + bool operator==(const PlatformThread& other) const; > + bool operator!=(const PlatformThread& other) const { return !(*this == other); } > + > + bool suspend() { return data->suspend(); } > + void resume() { data->resume(); } > + size_t getRegisters(Registers& regs) { return data->getRegisters(regs); } > + void freeRegisters(Registers& regs) { data->freeRegisters(regs); } > + std::pair<void*, size_t> captureStack(void* stackTop) { return data->captureStack(stackTop); } > + > + const PlatformThread& platformThread() { return data->platformThread; } > + void* stackBase() const { return data->stackBase; } > + void* stackEnd() const { return data->stackEnd; } > + > + Thread* next; > + ThreadData* data; > + }; Thinking about this some more, I realized that this Thread class now only serves one purpose: to be a SinglyLinkedList. I was thinking you could get rid of it and just change MachineThread::m_registeredThreads to a WTF::SinglyLinkedList<ThreadData>, but found out that our WTF::SinglyLinkedList implementation does not yet support the iteration protocol. I suppose we can beef up the SinglyLinkedList impl later, and refactor this to use this to use it. The operator== and operator!= will need to move back into ThreadData. I think that that would make the code cleaner. You can even forego the renaming of MachineThreads::Thread to MachineThreads::ThreadData, and having to create all these wrapper functions. I do like the ThreadData name better, and style-wise, I prefer accessor functions but that's just a refactoring that can be applied later if needed. What do you think? If it's too much work, I'm ok with landing this patch for now and filing a separate bug to use a SinglyLinkedList later.

Yusuke Suzuki

Comment 16 2017-03-23 13:59:21 PDT

Comment on attachment 305196 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=305196&action=review Thanks! >> Source/JavaScriptCore/ChangeLog:10 >> + It actually causes deadlock in GTK port DatabaseProcess. > > I suggest adding a little bit of context here like this: > > "The Linux port of PlatformThread suspend/resume mechanism relies on having a thread specific singleton thread data, and was relying on MachineThreads::Thread to be this thread specific singleton. But because MachineThreads::Thread is not a thread specific singleton, we can get a deadlock in the GTK port's DatabaseProcess." OK, fixed. >> Source/JavaScriptCore/ChangeLog:13 >> + to MachineThreads::ThreadData. > > This patch is no longer just a first step, right? It actually solves the issue, right? How about rephrasing this as: > > "This patch fixes this issue by moving per thread data from MachineThreads::Thread to MachineThreads::ThreadData, where there will only be one instance of MachineThreads::ThreadData per thread. Each MachineThreads::Thread will now point to the same MachineThreads::ThreadData for any given thread." Sounds nice! Fixed. >> Source/JavaScriptCore/heap/MachineStackMarker.cpp:207 >> + return threadData.get(); > > Looking at the implementation of ThreadSpecific<T>, it appears that it does not instantiate T until you invoke operator T*() on it. Invoking get() returns null if it's not initialized. Can you assert that threadData.get() is actually returning what you expect? No. This get() is NeverDestoyed<T>::get(). It returns ThreadSpecific<>&. Then, ThreadSpecific<T>::operator T* is invoked here, which includes initializing logic. And ThreadSpecific<T>::get() is private function. Thus it cannot be accessed from outside of ThreadSpecific<>. But, inserting assertion is good idea. I've added ASSERT to MachineThreads::Thread constructor. >> Source/JavaScriptCore/heap/MachineStackMarker.h:141 >> + }; > > Thinking about this some more, I realized that this Thread class now only serves one purpose: to be a SinglyLinkedList. I was thinking you could get rid of it and just change MachineThread::m_registeredThreads to a WTF::SinglyLinkedList<ThreadData>, but found out that our WTF::SinglyLinkedList implementation does not yet support the iteration protocol. I suppose we can beef up the SinglyLinkedList impl later, and refactor this to use this to use it. The operator== and operator!= will need to move back into ThreadData. I think that that would make the code cleaner. You can even forego the renaming of MachineThreads::Thread to MachineThreads::ThreadData, and having to create all these wrapper functions. I do like the ThreadData name better, and style-wise, I prefer accessor functions but that's just a refactoring that can be applied later if needed. > > What do you think? If it's too much work, I'm ok with landing this patch for now and filing a separate bug to use a SinglyLinkedList later. Yeah. While using SinglyLinkedList is nice, I think we still need Thread separeted from ThreadData. Consider the situation, There are two VMs A and B. And there are two user threads i, ii, and iii. i use A. And later use B (with JSLock). ii use A. And later use B. iii use A. In that case, A's thread list should include i, ii, iii while B's thread list should include i and ii. If we move `Thread* next` to ThreadData, it breaks the above situation. Several clean up can be considered. 1. Creating SinglyListNode. Define the class like, class SinglyListNode<ThreadData>; It has operator-> to access ThreadData's accessor. And SinglyListNode adds m_next feature. And SinglyList<ThreadData> will be managed by this node. BTW, I think this is the same to the STL's std::list<ThreadData>. 2. Introduce IntrusiveList. Create the IntrusiveList class. And use it for the Thread. Like, Thread : public IntrusiveListNode<Thread>. I think it's a bit more generic approach. And maybe, it is useful. BTW, I think the above changes require some code addition to WTF. So I like doing this in the separate patch :)

Yusuke Suzuki

Comment 17 2017-03-23 14:03:16 PDT

Created attachment 305225 [details] Patch

Mark Lam

Comment 18 2017-03-23 14:04:44 PDT

Comment on attachment 305196 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=305196&action=review >>> Source/JavaScriptCore/heap/MachineStackMarker.cpp:207 >>> + return threadData.get(); >> >> Looking at the implementation of ThreadSpecific<T>, it appears that it does not instantiate T until you invoke operator T*() on it. Invoking get() returns null if it's not initialized. Can you assert that threadData.get() is actually returning what you expect? > > No. This get() is NeverDestoyed<T>::get(). It returns ThreadSpecific<>&. > Then, ThreadSpecific<T>::operator T* is invoked here, which includes initializing logic. > And ThreadSpecific<T>::get() is private function. Thus it cannot be accessed from outside of ThreadSpecific<>. > > But, inserting assertion is good idea. I've added ASSERT to MachineThreads::Thread constructor. Ah yes. My mistake. >>> Source/JavaScriptCore/heap/MachineStackMarker.h:141 >>> + }; >> >> Thinking about this some more, I realized that this Thread class now only serves one purpose: to be a SinglyLinkedList. I was thinking you could get rid of it and just change MachineThread::m_registeredThreads to a WTF::SinglyLinkedList<ThreadData>, but found out that our WTF::SinglyLinkedList implementation does not yet support the iteration protocol. I suppose we can beef up the SinglyLinkedList impl later, and refactor this to use this to use it. The operator== and operator!= will need to move back into ThreadData. I think that that would make the code cleaner. You can even forego the renaming of MachineThreads::Thread to MachineThreads::ThreadData, and having to create all these wrapper functions. I do like the ThreadData name better, and style-wise, I prefer accessor functions but that's just a refactoring that can be applied later if needed. >> >> What do you think? If it's too much work, I'm ok with landing this patch for now and filing a separate bug to use a SinglyLinkedList later. > > Yeah. While using SinglyLinkedList is nice, I think we still need Thread separeted from ThreadData. > Consider the situation, > > There are two VMs A and B. And there are two user threads i, ii, and iii. > > i use A. And later use B (with JSLock). > ii use A. And later use B. > iii use A. > > In that case, A's thread list should include i, ii, iii while B's thread list should include i and ii. > If we move `Thread* next` to ThreadData, it breaks the above situation. > > Several clean up can be considered. > > 1. Creating SinglyListNode. > > Define the class like, class SinglyListNode<ThreadData>; > It has operator-> to access ThreadData's accessor. And SinglyListNode adds m_next feature. > And SinglyList<ThreadData> will be managed by this node. BTW, I think this is the same to the STL's std::list<ThreadData>. > > 2. Introduce IntrusiveList. > > Create the IntrusiveList class. And use it for the Thread. Like, Thread : public IntrusiveListNode<Thread>. > I think it's a bit more generic approach. And maybe, it is useful. > > BTW, I think the above changes require some code addition to WTF. So I like doing this in the separate patch :) Or just use WTF::DoublyLinkedList. But you can do that in a separate patch.

Mark Lam

Comment 19 2017-03-23 14:05:33 PDT

Comment on attachment 305225 [details] Patch r=me

Yusuke Suzuki

Comment 20 2017-03-23 14:26:47 PDT

Comment on attachment 305225 [details] Patch Thanks!

WebKit Commit Bot

Comment 21 2017-03-23 14:55:02 PDT

Comment on attachment 305225 [details] Patch Clearing flags on attachment: 305225 Committed r214319: <http://trac.webkit.org/changeset/214319>

WebKit Commit Bot

Comment 22 2017-03-23 14:55:08 PDT

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component New Bugs

Assignee

Yusuke Suzuki

Reported

2017-03-17 11:04 PDT

Modified

2017-03-23 14:55 PDT History

CC List

12 users Show

URL

Keywords

Depends on

Blocks