WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
144824
[GTK] WorkQueue objects are not released
https://bugs.webkit.org/show_bug.cgi?id=144824
Summary
[GTK] WorkQueue objects are not released
Michael Catanzaro
Reported
2015-05-08 18:50:54 PDT
**FAIL** WTF_WorkQueue.Simple ../../Tools/TestWebKitAPI/Tests/WTF/WorkQueue.cpp:57 Value of: initialRefCount Actual: 2 Expected: 1 This is not a bug: we need the second ref to keep the WorkQueue alive in the lambda function where the main loop runs in another thread, in WorkQueue::platformInitialize: RefPtr<WorkQueue> protector(this); m_workQueueThread = createThread(threadName, [protector] { g_main_context_push_thread_default(protector->m_eventContext.get()); g_main_loop_run(protector->m_eventLoop.get()); g_main_context_pop_thread_default(protector->m_eventContext.get()); }); Since we need to keep the WorkQueue alive in another thread, I think the only option is to change the test, not the implementation. CC Brent since he wrote this test. We could guard the reference count checks with #if !PLATFORM(GTK), or just remove the checks completely.
Attachments
Patch
(3.39 KB, patch)
2015-05-09 08:21 PDT
,
Michael Catanzaro
no flags
Details
Formatted Diff
Diff
Patch
(3.49 KB, patch)
2015-05-09 10:35 PDT
,
Michael Catanzaro
no flags
Details
Formatted Diff
Diff
Patch
(3.14 KB, patch)
2015-05-10 04:09 PDT
,
Carlos Garcia Campos
no flags
Details
Formatted Diff
Diff
Updated patch
(3.71 KB, patch)
2015-05-11 02:39 PDT
,
Carlos Garcia Campos
zan
: review+
Details
Formatted Diff
Diff
Show Obsolete
(3)
View All
Add attachment
proposed patch, testcase, etc.
Michael Catanzaro
Comment 1
2015-05-08 18:59:57 PDT
So all of the reference count checks in this test check the initial reference count, except for this one: EXPECT_GT(queue->refCount(), 1); m_testCompleted.wait(m_lock); But that one would fail (on any platform) if each task completes prior to the assertion being run.
Michael Catanzaro
Comment 2
2015-05-09 08:21:44 PDT
Created
attachment 252775
[details]
Patch
Brent Fulgham
Comment 3
2015-05-09 09:51:35 PDT
Comment on
attachment 252775
[details]
Patch It doesn't look like we have the right behavior on iOS/mac with this change. :(
Brent Fulgham
Comment 4
2015-05-09 09:53:28 PDT
Comment on
attachment 252775
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=252775&action=review
> Tools/TestWebKitAPI/Tests/WTF/WorkQueue.cpp:71 > + EXPECT_GT(queue->refCount(), expectedInitialRefcount);
I think I originally tried to have something like this, but ran into problems with race conditions.
Brent Fulgham
Comment 5
2015-05-09 09:55:46 PDT
Comment on
attachment 252775
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=252775&action=review
>> Tools/TestWebKitAPI/Tests/WTF/WorkQueue.cpp:71 >> + EXPECT_GT(queue->refCount(), expectedInitialRefcount); > > I think I originally tried to have something like this, but ran into problems with race conditions.
Never mind -- this might be ok. It's just clang is unhappy with the difference in sign. I think refCount has a different type (maybe size_t?)
Michael Catanzaro
Comment 6
2015-05-09 10:34:29 PDT
The refcount is an unsigned int, but it was being stored in a signed int local variable. I'll switch the local variable declaration to use auto instead. Hopefully that will placate clang.
Michael Catanzaro
Comment 7
2015-05-09 10:35:52 PDT
Created
attachment 252780
[details]
Patch
Zan Dobersek
Comment 8
2015-05-09 10:56:09 PDT
So, in WorkQueue::platformInitialize() we create a thread and pass in a lambda that captured a RefPtr<WorkQueue>, essentially ensuring it doesn't get destroyed during that thread's lifetime. In createThread() the lambda (now wrapped in std::function<>) is stored on the heap, in the NewThreadContext object. That is again accessed on the newly-created thread, in threadEntryPoint(), moving the std::function<> off the heap, deleting the NewThreadContext, and passing the thread control to said std::function<> which is now spinning the WorkQueue's m_eventLoop. The kicker is that because a reference to the WorkQueue is held on the WorkQueue-owned thread, the WorkQueue and that thread don't get destroyed when the user drops his last reference to it, as he'd expect. We also don't interrupt the main loop anywhere else but the WorkQueue destructor (well, ::platformInvalidate()). I haven't confirmed this analysis in real world, but if it's correct, we're not destroying any WorkQueue and the underlying thread that we create.
Martin Robinson
Comment 9
2015-05-09 13:08:52 PDT
(In reply to
comment #8
)
> I haven't confirmed this analysis in real world, but if it's correct, we're > not destroying any WorkQueue and the underlying thread that we create.
So this sounds like a memory leak, no?
Michael Catanzaro
Comment 10
2015-05-09 15:47:54 PDT
As a sanity check I added this to WorkQueue.cpp: #ifndef NDEBUG #include <wtf/RefCountedLeakCounter.h> #endif DEFINE_DEBUG_ONLY_GLOBAL(WTF::RefCountedLeakCounter, workQueueCounter, ("WorkQueue")); Then browsed through a few pages on a debug build. We seem to be leaking everything except WorkQueues.
Michael Catanzaro
Comment 11
2015-05-09 15:58:27 PDT
(In reply to
comment #10
)
> As a sanity check I added this to WorkQueue.cpp:
It wasn't a good sanity check because I forgot to increment and decrement the counter in the constructor/destructor. We leak 6 WorkQueues per WebView.
Michael Catanzaro
Comment 12
2015-05-09 18:01:29 PDT
A solution that seems seductively simple: GRefPtr<GMainContext> eventContextProtector(m_eventContext.get()); GRefPtr<GMainLoop> mainLoopProtector(m_eventLoop.get()); m_workQueueThread = createThread(threadName, [eventContextProtector, mainLoopProtector] { g_main_context_push_thread_default(eventContextProtector.get()); g_main_loop_run(mainLoopProtector.get()); g_main_context_pop_thread_default(eventContextProtector.get()); }); Looks good to me...? This fixes the test, but only brings us down to 5 leaks per WebView, so we're still losing a bunch of threads. Aside: Is there a reason to pop the thread default main context immediately before the thread ends, Carlos? Aside: Thanks for the test, Brent.
Carlos Garcia Campos
Comment 13
2015-05-10 02:53:59 PDT
(In reply to
comment #8
)
> So, in WorkQueue::platformInitialize() we create a thread and pass in a > lambda that captured a RefPtr<WorkQueue>, essentially ensuring it doesn't > get destroyed during that thread's lifetime.
Yes, but that was not the original intention, see
bug #140998
. We wanted to make sure that the thread was detached when the work queue is destroyed before the thread body has started. Since the WorkQueue was changed to be refcounted, every source takes a reference of the workQueue, so we don't really need to keep an extra reference for the whole like of the worker thread. I think the right solution could be to go back to the first version of the patch in
bug #140998
.
Carlos Garcia Campos
Comment 14
2015-05-10 03:34:51 PDT
(In reply to
comment #12
)
> A solution that seems seductively simple: > > GRefPtr<GMainContext> eventContextProtector(m_eventContext.get()); > GRefPtr<GMainLoop> mainLoopProtector(m_eventLoop.get()); > m_workQueueThread = createThread(threadName, [eventContextProtector, > mainLoopProtector] { > g_main_context_push_thread_default(eventContextProtector.get()); > g_main_loop_run(mainLoopProtector.get()); > g_main_context_pop_thread_default(eventContextProtector.get()); > }); > > Looks good to me...? This fixes the test, but only brings us down to 5 leaks > per WebView, so we're still losing a bunch of threads. > > Aside: Is there a reason to pop the thread default main context immediately > before the thread ends, Carlos?
Actually, we don't even need to pop the context, since we push it right after creating the thread and the thread-specific context queue will be freed when the thread exits.
> Aside: Thanks for the test, Brent.
Carlos Garcia Campos
Comment 15
2015-05-10 03:51:40 PDT
(In reply to
comment #14
)
> (In reply to
comment #12
) > > A solution that seems seductively simple: > > > > GRefPtr<GMainContext> eventContextProtector(m_eventContext.get()); > > GRefPtr<GMainLoop> mainLoopProtector(m_eventLoop.get()); > > m_workQueueThread = createThread(threadName, [eventContextProtector, > > mainLoopProtector] { > > g_main_context_push_thread_default(eventContextProtector.get()); > > g_main_loop_run(mainLoopProtector.get()); > > g_main_context_pop_thread_default(eventContextProtector.get()); > > }); > > > > Looks good to me...? This fixes the test, but only brings us down to 5 leaks > > per WebView, so we're still losing a bunch of threads. > > > > Aside: Is there a reason to pop the thread default main context immediately > > before the thread ends, Carlos? > > Actually, we don't even need to pop the context, since we push it right > after creating the thread and the thread-specific context queue will be > freed when the thread exits.
We could also push/pop for every task that is run in the queue, but i think it's simpler to push right after creating the queue and never pop.
> > Aside: Thanks for the test, Brent.
Carlos Garcia Campos
Comment 16
2015-05-10 04:09:01 PDT
Created
attachment 252812
[details]
Patch This should fix the test and the work queue leak.
Michael Catanzaro
Comment 17
2015-05-10 08:37:33 PDT
Carlos, I would change the first line of your commit message: "[GTK] WorkQueue runs forever" would be more descriptive. I tried to write a test to ensure the thread actually quits, but it's hard: detaching the thread in the destructor makes it impossible to join on it to ensure it completes, and counting threads in /proc is racy without a join, so we'd have to add new API to WorkQueue to change the behavior of the destructor just for such a test... and then we're not really testing the same functionality anymore. So I think we should be content with the existing refcount test.
Carlos Garcia Campos
Comment 18
2015-05-11 02:39:03 PDT
Created
attachment 252853
[details]
Updated patch Zan pointed out that the thread could keep running forever if the work queue is released before the thread has run the main loop. This updated patch fixes that case by scheduling a main loop quit when the work queue is released before the thread has started. As Michael said, this is not easy to test, I ran some custom tests with printfs to verify the behaviour.
Zan Dobersek
Comment 19
2015-05-11 02:50:36 PDT
Comment on
attachment 252853
[details]
Updated patch View in context:
https://bugs.webkit.org/attachment.cgi?id=252853&action=review
> Source/WTF/wtf/gtk/WorkQueueGtk.cpp:72 > if (m_eventLoop) {
This check shouldn't be necessary, we ASSERT the m_eventLoop existence in the constructor and don't destroy it anywhere.
Carlos Garcia Campos
Comment 20
2015-05-11 02:53:58 PDT
(In reply to
comment #19
)
> Comment on
attachment 252853
[details]
> Updated patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=252853&action=review
> > > Source/WTF/wtf/gtk/WorkQueueGtk.cpp:72 > > if (m_eventLoop) { > > This check shouldn't be necessary, we ASSERT the m_eventLoop existence in > the constructor and don't destroy it anywhere.
Yes, this is because in the past the platformInvalidate was not called by the destructor, but by invalidate method called by the user.
Carlos Garcia Campos
Comment 21
2015-05-11 03:12:01 PDT
I'm going to leave the checks in the end, to ensure it keeps working even if the implementation changes again and platformInvalidate is moved somewhere else.
Carlos Garcia Campos
Comment 22
2015-05-11 03:23:12 PDT
Committed
r184072
: <
http://trac.webkit.org/changeset/184072
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug