114432 – [EFL][WK2] Connection is leaking on process termination

RESOLVED WONTFIX 114432

[EFL][WK2] Connection is leaking on process termination

https://bugs.webkit.org/show_bug.cgi?id=114432

Summary [EFL][WK2] Connection is leaking on process termination

Rafael Brandao

Reported 2013-04-11 03:50:24 PDT

ConnectionUnix is leaking on process termination

Attachments
Patch (6.97 KB, patch) 2013-04-11 03:56 PDT, Rafael Brandao	no flags	Details Formatted Diff Diff
Patch (2.69 KB, patch) 2013-04-11 05:02 PDT, Rafael Brandao	no flags	Details Formatted Diff Diff
Archive of layout-test-results from webkit-ews-14 for mac-mountainlion-wk2 (949.01 KB, application/zip) 2013-04-11 05:24 PDT, Build Bot	no flags	Details
Patch for efl's work queue (2.93 KB, patch) 2013-04-11 13:47 PDT, Lauro Moura Maranhao Neto	no flags	Details Formatted Diff Diff
Updated patch for efl's work queue (1.35 KB, patch) 2013-04-11 14:52 PDT, Lauro Moura Maranhao Neto	no flags	Details Formatted Diff Diff
Patch (3.27 KB, patch) 2013-04-12 09:41 PDT, Rafael Brandao	no flags	Details Formatted Diff Diff
Patch (4.52 KB, patch) 2013-04-12 19:33 PDT, Rafael Brandao	no flags	Details Formatted Diff Diff
Patch (7.72 KB, patch) 2013-04-24 18:29 PDT, Rafael Brandao	no flags	Details Formatted Diff Diff
Patch (7.43 KB, patch) 2013-04-25 10:23 PDT, Rafael Brandao	no flags	Details Formatted Diff Diff
Patch (3.67 KB, patch) 2013-05-15 17:18 PDT, Rafael Brandao	no flags	Details Formatted Diff Diff
Show Obsolete (8) View All Add attachment proposed patch, testcase, etc.

Rafael Brandao

Comment 1 2013-04-11 03:56:45 PDT

Created attachment 197568 [details] Patch

Rafael Brandao

Comment 2 2013-04-11 04:02:16 PDT

I've left a few printfs so it can be easier to understand what is going on: Connection should be killed as soon as ChildProcessProxy clears the reference to it. This would cause the work queue to be destroyed and the thread resources to be cleaned up. Since the work queue is holding references to Connection and Connection owns that work queue, it will never be destroyed. It looks a bit like what is done on platformInvalidate after the patch to bug #108140. Since this code is very tricky, I'd welcome feedback.

Early Warning System Bot

Comment 3 2013-04-11 04:02:24 PDT

Comment on attachment 197568 [details] Patch Attachment 197568 [details] did not pass qt-wk2-ews (qt): Output: http://webkit-commit-queue.appspot.com/results/17670128

Rafael Brandao

Comment 4 2013-04-11 04:04:18 PDT

I believe this also happens to Qt, but I couldn't figure out how to run WK2 API tests there. I wonder if this is avoided on Mac. Anders, what do you think?

Rafael Brandao

Comment 5 2013-04-11 05:02:41 PDT

Created attachment 197575 [details] Patch

Build Bot

Comment 6 2013-04-11 05:24:40 PDT

Comment on attachment 197575 [details] Patch Attachment 197575 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-commit-queue.appspot.com/results/17641090 New failing tests: fast/repaint/japanese-rl-selection-repaint-in-regions.html

Build Bot

Comment 7 2013-04-11 05:24:43 PDT

Created attachment 197577 [details] Archive of layout-test-results from webkit-ews-14 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-14 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.2

Adenilson Cavalcanti Silva

Comment 8 2013-04-11 08:36:46 PDT

About WK2 unit tests: a) Qt: WebKitBuild/Release/Tools/TestWebKitAPI/Tests/WebKit2/tst_webkit2 b) Mac: ./Tools/Scripts/run-api-tests --root=./WebKitBuild/Debug/ c) EFL: each test will be a binary in 'bin', i.e test_webkit2_api_TestName About the patch: no tests?

Lauro Moura Maranhao Neto

Comment 9 2013-04-11 13:47:55 PDT

Created attachment 197664 [details] Patch for efl's work queue I managed to fix the leak in EFL's WorkQueue by changing the socket handler (a Connection bind) to a OwnPtr instead of a plain member, so it can be deref'ed when unregistering the socket handler.

Lauro Moura Maranhao Neto

Comment 10 2013-04-11 14:52:51 PDT

Created attachment 197677 [details] Updated patch for efl's work queue Simpler efl fix by just reassiging the handler, as suggested by andersca on IRC. Both versions of this fix are crashing on debug mode though.

Rafael Brandao

Comment 11 2013-04-12 09:41:25 PDT

Created attachment 197862 [details] Patch

Rafael Brandao

Comment 12 2013-04-12 09:42:54 PDT

(In reply to comment #11) > Created an attachment (id=197862) [details] > Patch This is based on Lauro Moura's fix and also fixes the crash. We've kept the work queue running even after deleting it on the main thread. Other ports like Qt waits for thread termination, so this is also based on their approach.

Rafael Brandao

Comment 13 2013-04-12 09:44:50 PDT

(In reply to comment #8) > > About the patch: no tests? We're open to suggestions on how to test this. It's very port specific and it's about leaking stuff, like file descriptors. :(

Jesus Sanchez-Palencia

Comment 14 2013-04-12 09:53:03 PDT

Comment on attachment 197862 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=197862&action=review > Source/WebKit2/ChangeLog:1 > +2013-04-12 Rafael Brandao <rafael.lobo@openbossa.org> You should perhaps mention that it is based on previous patches by FOO or BAR...

Rafael Brandao

Comment 15 2013-04-12 09:54:48 PDT

(In reply to comment #14) > (From update of attachment 197862 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=197862&action=review > > > Source/WebKit2/ChangeLog:1 > > +2013-04-12 Rafael Brandao <rafael.lobo@openbossa.org> > > You should perhaps mention that it is based on previous patches by FOO or BAR... Indeed, I did not figure out how to that so I've put his name on the patch too. Other thing we could do is to split the crash fix in another bug since they seem unrelated.

Rafael Brandao

Comment 16 2013-04-12 10:37:58 PDT

(In reply to comment #11) > Created an attachment (id=197862) [details] > Patch I share the concern of Anders about busy waiting on thread's completion might be bad but if we try to keep the WorkQueue alive by introducing a RefPtr on workQueueThread's mainloop, then the WorkQueue will never be released even if Connection goes away, and unfortunately we depend on WorkQueue being destroyed to ask the thread to stop. I'm open to suggestions on alternatives and also propose that we could move this in and solve it separately. After all, busy waiting for thread seems much better than having unexpected behavior when executing layout tests. :(

Rafael Brandao

Comment 17 2013-04-12 19:33:21 PDT

Created attachment 197916 [details] Patch

Rafael Brandao

Comment 18 2013-04-12 19:34:44 PDT

Anders, I took care of the busy wait by scheduling a release on the WorkQueue. It's also based on how WorkQueueMac works.

Rafael Brandao

Comment 19 2013-04-24 18:29:00 PDT

Created attachment 199575 [details] Patch

Byungwoo Lee

Comment 20 2013-04-24 20:35:36 PDT

Comment on attachment 199575 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=199575&action=review > Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:133 > if (!item) > return; > > + if (m_hasDispatchedRelease) > + return; > + Why not if (!item || m_hasDispatchedRelease) ? > Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:268 > +void WorkQueue::release() > +{ > + m_threadLoop = false; > +} > + > +void WorkQueue::dispatchRelease() > +{ > + if (m_hasDispatchedRelease) > + return; > + > + dispatch(WTF::bind(&WorkQueue::release, this)); > + m_hasDispatchedRelease = true; > +} Is there any reason to maintain the two flags (m_threadLoop, m_hasDispatchedRelease)?

Rafael Brandao

Comment 21 2013-04-25 10:22:21 PDT

(In reply to comment #20) > (From update of attachment 199575 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=199575&action=review > > > Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:133 > > if (!item) > > return; > > > > + if (m_hasDispatchedRelease) > > + return; > > + > > Why not if (!item || m_hasDispatchedRelease) ? Sounds good. :) > > > Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:268 > > +void WorkQueue::release() > > +{ > > + m_threadLoop = false; > > +} > > + > > +void WorkQueue::dispatchRelease() > > +{ > > + if (m_hasDispatchedRelease) > > + return; > > + > > + dispatch(WTF::bind(&WorkQueue::release, this)); > > + m_hasDispatchedRelease = true; > > +} > > Is there any reason to maintain the two flags (m_threadLoop, m_hasDispatchedRelease)? Nice catch! After you've asked, I came to think of it and really I could use only one. Thanks! I will upload a new version.

Rafael Brandao

Comment 22 2013-04-25 10:23:01 PDT

Created attachment 199682 [details] Patch

Byungwoo Lee

Comment 23 2013-04-25 22:01:36 PDT

Comment on attachment 199682 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=199682&action=review > Source/WebKit2/Platform/CoreIPC/unix/ConnectionUnix.cpp:153 > + m_connectionQueue->dispatchRelease(); Is there any reason to call dispatchRelease() instead of release()? > Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:65 > - sendMessageToThread(finishThreadMessage); > + dispatchRelease(); Why should this be changed? Both functions seems to do the same operation. > Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:190 > + // One more run to assure we'll clean up all remaining tasks. > + workQueue->performWork(); > + ASSERT(workQueue->m_workItemQueue.isEmpty()); > + workQueue->performTimerWork(); > + ASSERT(workQueue->m_timerWorkItems.isEmpty()); > + Why should this be added?

Rafael Brandao

Comment 24 2013-04-26 16:55:06 PDT

Thanks again for your review! :) (In reply to comment #23) > (From update of attachment 199682 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=199682&action=review > > > Source/WebKit2/Platform/CoreIPC/unix/ConnectionUnix.cpp:153 > > + m_connectionQueue->dispatchRelease(); > > Is there any reason to call dispatchRelease() instead of release()? Hm, maybe because the proper release will not happen immediately. Also I believe it matches more how WorkQueueMac works. > > > Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:65 > > - sendMessageToThread(finishThreadMessage); > > + dispatchRelease(); > > Why should this be changed? > Both functions seems to do the same operation. Great catch! I've changed the patch so a "dispatchRelease" will just send the finish message to the thread. > > > Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:190 > > + // One more run to assure we'll clean up all remaining tasks. > > + workQueue->performWork(); > > + ASSERT(workQueue->m_workItemQueue.isEmpty()); > > + workQueue->performTimerWork(); > > + ASSERT(workQueue->m_timerWorkItems.isEmpty()); > > + > > Why should this be added? This is the trickiest part so far, and I didn't find a clean solution yet; in fact, that code is not totally right. At this point, we no longer have a loop to run the queue. If for any concurrency issues we get a successful dispatch after we've already turned off the loop, then we will never clean up the queues and leak whatever was being referenced there and the work queue (each item has a ref()). Initially I've thought running both queues would take care of the problem: I would go for each item and deref() them. But the problem I can still think of a case where a random thread calls a dispatch before we turn the loop off and passes the check there, then on the main thread we ask to release and then the work queue's thread set its loop off and clean its queues. If we come back to the random thread, we would continue to run the dispatch and enqueue whatever was there, again producing a leak. This is theoretical, I can't think of how I can simulate this, but I believe it's a valid concern. I'll think more about this later.

Byungwoo Lee

Comment 25 2013-04-27 07:58:30 PDT

Comment on attachment 199682 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=199682&action=review >>> Source/WebKit2/Platform/CoreIPC/unix/ConnectionUnix.cpp:153 >>> + m_connectionQueue->dispatchRelease(); >> >> Is there any reason to call dispatchRelease() instead of release()? > > Hm, maybe because the proper release will not happen immediately. Also I believe it matches more how WorkQueueMac works. As my understanding, Connection::platformInvalidate() function works on the WorkQueue thread. I still cannot clearly understand why the delay should be needed. (Why should workqueue thread dispatch release() function to itself) >>> Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:190 >>> + >> >> Why should this be added? > > This is the trickiest part so far, and I didn't find a clean solution yet; in fact, that code is not totally right. > > At this point, we no longer have a loop to run the queue. If for any concurrency issues we get a successful dispatch after we've already turned off the loop, then we will never clean up the queues and leak whatever was being referenced there and the work queue (each item has a ref()). > > Initially I've thought running both queues would take care of the problem: I would go for each item and deref() them. But the problem I can still think of a case where a random thread calls a dispatch before we turn the loop off and passes the check there, then on the main thread we ask to release and then the work queue's thread set its loop off and clean its queues. If we come back to the random thread, we would continue to run the dispatch and enqueue whatever was there, again producing a leak. This is theoretical, I can't think of how I can simulate this, but I believe it's a valid concern. I'll think more about this later. What is the difference between 'calling performWorks()' and 'deref() for each queue items' ? Can the concurrency issue that you mentioned be solved with this solution? And performWorks() and performTimerWork() will dispatch work items. Why should the items be dispatched after the thread loop is ended?

Anders Carlsson

Comment 26 2013-04-27 10:57:37 PDT

I’m still not sure if this is the right approach. Keep in mind that WorkQueue needs to support doing something like WorkQueue::create(“whatever”)->dispatch(myFunction); and after myFunction has successfully executed, the WorkQueue should be destroyed.

Byungwoo Lee

Comment 27 2013-04-28 12:38:34 PDT

With the comment from andersca, I found a problem of current efl workqueue ref()/deref() implementation. (deref() during performing work item can make dangling pointer) I filed a bug 115332 separately.

Byungwoo Lee

Comment 28 2013-04-29 03:08:30 PDT

(In reply to comment #27) > With the comment from andersca, I found a problem of current efl workqueue ref()/deref() implementation. > (deref() during performing work item can make dangling pointer) > I filed a bug 115332 separately. After filed the new bug, I read this patch and change-log again. And I found that this patch seems to try to fix similar problem also. (Initially, I cannot understand the purpose of this patch fully. Sorry about this) I didn't consider about the problem between Connection - WorkQueue (that you mentioned) in the new patch. But for the WorkQueue ref()/deref() problem, I think new approach (bug 115332) is more clear than this. Is it ok to separate the two issues? How about your opinion?

Rafael Brandao

Comment 29 2013-05-15 17:18:11 PDT

Created attachment 201899 [details] Patch

Rafael Brandao

Comment 30 2013-05-15 17:22:24 PDT

(In reply to comment #28) > (In reply to comment #27) > > With the comment from andersca, I found a problem of current efl workqueue ref()/deref() implementation. > > (deref() during performing work item can make dangling pointer) > > I filed a bug 115332 separately. > > After filed the new bug, I read this patch and change-log again. > And I found that this patch seems to try to fix similar problem also. > (Initially, I cannot understand the purpose of this patch fully. > Sorry about this) > > I didn't consider about the problem between Connection - WorkQueue (that you mentioned) in the new patch. > But for the WorkQueue ref()/deref() problem, > I think new approach (bug 115332) is more clear than this. > > Is it ok to separate the two issues? How about your opinion? I think both are trying to solve the same issue: WorkQueue being deleted while its runLoop uses it to do stuff. I have decided to go back to a simpler approach (busy-wait on the parent thread) because it is small change, easy to understand, and we can try to remove it in the long term. But my attempts so far lead to even more complicated concurrency issue, so I believe we should do something similar to what Mac port does rather than creating a thread per WorkQueue. Could you check if you can reproduce your issue with the latest patch?

Byungwoo Lee

Comment 31 2013-05-18 21:19:58 PDT

Comment on attachment 201899 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=201899&action=review (In reply to comment #30) > I think both are trying to solve the same issue: WorkQueue being deleted while its runLoop uses it to do stuff. Yes, I agree with that, both share the same issue, and this is the initial bug for the issue, it'll be better to fix it with this bug :) I'll try to keep refactoring issue back, and focus the issue of this bug. > Source/WebKit2/ChangeLog:19 > + A following issue was that Connection's WorkQueue's thread was still running after > + the WorkQueue was released and that thread could use its dangling pointer. > + This happens because each WorkQueue creates its own thread and pass its > + runLoop function to be executed in there. The solution to this dangling > + pointer was to wait on the parent thread for the end of the child thread > + before moving forward with WorkQueue destructor. Qt also does similar approach. I think waiting thread terminate is not enough to handle the dangling WorkQueue problem. As my understanding, WorkQueue::platformInvalidate() can be invoked in the workQueue Thread. I think this approach cannot handle the case when the WorkQueue is invalidated on the workQueue thread. > Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:209 > + m_socketEventHandler = Function<void()>(); This will make assertion.

Byungwoo Lee

Comment 32 2013-05-21 19:08:53 PDT

For the clear point about the WorkQueue dangling pointer issue, I want to share my understanding. As anders commented on the #26, WorkQueue should support below use case. WorkQueue::create(“whatever”)->dispatch(myFunction); This means, The WorkQueue instance from WorkQueue::create("whatever") will create an event loop, and myFunction will be requested to dispatch on the event loop, and the WorkQueue instance will be de-referenced right after it. The request will be processed on the event loop by an async way, so, If myFunction was finished before the de-ref of WorkQueue instance, the event loop is in the idle state, so the de-ref will delete WorkQueue instance and the deletion should trigger event loop termination. If myFunction is processing at the moment of de-ref of WorkQueue instance, the WorkQueue should not be deleted because WorkQueue is currently processing. To ensure this, event loop should keep the WorkQueue reference while processing the WorkQueue. (I think this is the reason that WorkQueue is ref-counted object.) And after myFunction is finished (the WorkQueue processing is finished) in the event loop, the event loop will de-ref the WorkQueue. The de-ref will delete the WorkQueue instance and the deletion of the WorkQueue instance should trigger event loop termination. In my opinion, to prevent the dangling pointer problem, it will be better to find the right way to handle WorkQueue reference count.

Rafael Brandao

Comment 33 2013-05-22 12:58:03 PDT

(In reply to comment #32) > For the clear point about the WorkQueue dangling pointer issue, > I want to share my understanding. > > As anders commented on the #26, > WorkQueue should support below use case. > WorkQueue::create(“whatever”)->dispatch(myFunction); > > This means, > The WorkQueue instance from WorkQueue::create("whatever") will create an event loop, and myFunction will be requested to dispatch on the event loop, and the WorkQueue instance will be de-referenced right after it. > > The request will be processed on the event loop by an async way, so, > > If myFunction was finished before the de-ref of WorkQueue instance, the event loop is in the idle state, so the de-ref will delete WorkQueue instance and the deletion should trigger event loop termination. > > If myFunction is processing at the moment of de-ref of WorkQueue instance, > the WorkQueue should not be deleted because WorkQueue is currently processing. > To ensure this, event loop should keep the WorkQueue reference while processing the WorkQueue. (I think this is the reason that WorkQueue is ref-counted object.) > And after myFunction is finished (the WorkQueue processing is finished) in the event loop, the event loop will de-ref the WorkQueue. The de-ref will delete the WorkQueue instance and the deletion of the WorkQueue instance should trigger event loop termination. > > In my opinion, to prevent the dangling pointer problem, it will be better to find the right way to handle WorkQueue reference count. You are right. I had the wrong assumption that destruction would always happen on mainThread instead of the workQueueThread. This detail makes the whole thing even more complicated. The problem about keeping a reference on the WorkQueueThread is that if we are stuck waiting for sync I/O, we will just leak because WorkQueue::platformInvalidate will never be called. I was trying to workaround this issue and got a WIP patch (https://gist.github.com/rafaelbrandao/81929a88aec886d5457e), but running on debug mode can still make me get into WTF::Lock asserts while executing WorkQueue::performWork (WorkQueue was already deleted). And looking at your initial approach in the other bug, they are very similar. Still I don't think this is the right way to fix it, as Christophe Dumez pointed out it would just make the problem less frequent. Maybe we should go forward with a design similar to what Mac port does, but I'm not sure how we can get it without the usage of anonymous functions (check WorkQueueMac). Those functions would be directed to something else that will manage which thread will run each function and also manage the amount of threads (it could depend on the amount of cores). Feel free to move forward with fixing this bug, I'm stuck with other stuff and I understand this should be fixed as soon as possible.

Rafael Brandao

Comment 34 2013-05-22 17:31:14 PDT

Thanks a lot for your review. :) (In reply to comment #31) > (From update of attachment 201899 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=201899&action=review > > (In reply to comment #30) > > I think both are trying to solve the same issue: WorkQueue being deleted while its runLoop uses it to do stuff. > > Yes, I agree with that, both share the same issue, and this is the initial bug for the issue, it'll be better to fix it with this bug :) > > I'll try to keep refactoring issue back, and focus the issue of this bug. I agree that we should try to leave refactoring for later, but now I'm unsure of how we should fix this. :-( > > > Source/WebKit2/ChangeLog:19 > > + A following issue was that Connection's WorkQueue's thread was still running after > > + the WorkQueue was released and that thread could use its dangling pointer. > > + This happens because each WorkQueue creates its own thread and pass its > > + runLoop function to be executed in there. The solution to this dangling > > + pointer was to wait on the parent thread for the end of the child thread > > + before moving forward with WorkQueue destructor. Qt also does similar approach. > > I think waiting thread terminate is not enough to handle the dangling WorkQueue problem. > As my understanding, WorkQueue::platformInvalidate() can be invoked in the workQueue Thread. > I think this approach cannot handle the case when the WorkQueue is invalidated on the workQueue thread. Indeed. When I compiled in debug mode I could finally see it crashing. Thanks for the notice! > > > Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:209 > > + m_socketEventHandler = Function<void()>(); > > This will make assertion. Why it would assert? This function is called on WorkQueueThread when Connection is about to close. This handler was holding a reference to the WorkQueue on WorkQueueThread so without this change, WorkQueue would never be destroyed.

Byungwoo Lee

Comment 35 2013-05-22 18:22:54 PDT

Comment on attachment 201899 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=201899&action=review >>> Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:209 >>> + m_socketEventHandler = Function<void()>(); >> >> This will make assertion. > > Why it would assert? This function is called on WorkQueueThread when Connection is about to close. This handler was holding a reference to the WorkQueue on WorkQueueThread so without this change, WorkQueue would never be destroyed. I thought that 'unregisterSocketEventHandler() can be invoked on a thread which is not the workqueue thread. If so, m_socketEventHandler() will make assersion if m_socketEventHandler is Function<void()>() because there is any locking for m_socketEventHandler. But WorkQueue::unregisterSocketEventHandler() will only work on the workqueue thread because Connection::platformInvalidate() is always working on the workqueue thread. So, setting m_socketDescriptor as invalidSocketDescriptor will be enough to prevent calling m_socketEventHandler() because the value is checked before calling m_socketEventHandler(). I think this is redundant.

Byungwoo Lee

Comment 36 2013-05-22 18:35:30 PDT

(In reply to comment #35) > (From update of attachment 201899 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=201899&action=review > > >>> Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:209 > >>> + m_socketEventHandler = Function<void()>(); > >> > >> This will make assertion. > > > > Why it would assert? This function is called on WorkQueueThread when Connection is about to close. This handler was holding a reference to the WorkQueue on WorkQueueThread so without this change, WorkQueue would never be destroyed. > > I thought that 'unregisterSocketEventHandler() can be invoked on a thread which is not the workqueue thread. > If so, m_socketEventHandler() will make assersion if m_socketEventHandler is Function<void()>() because there is any locking for m_socketEventHandler. > > But WorkQueue::unregisterSocketEventHandler() will only work on the workqueue thread because Connection::platformInvalidate() is always working on the workqueue thread. > So, setting m_socketDescriptor as invalidSocketDescriptor will be enough to prevent calling m_socketEventHandler() because the value is checked before calling m_socketEventHandler(). > I think this is redundant. And I cannot understand that 'This handler was holding a reference to the WorkQueue on WorkQueueThread' I cannot find any workqueue referencing about m_socketEventHandler.

Rafael Brandao

Comment 37 2013-05-22 22:23:54 PDT

Comment on attachment 201899 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=201899&action=review >>>>> Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:209 >>>>> + m_socketEventHandler = Function<void()>(); >>>> >>>> This will make assertion. >>> >>> Why it would assert? This function is called on WorkQueueThread when Connection is about to close. This handler was holding a reference to the WorkQueue on WorkQueueThread so without this change, WorkQueue would never be destroyed. >> >> I thought that 'unregisterSocketEventHandler() can be invoked on a thread which is not the workqueue thread. >> If so, m_socketEventHandler() will make assersion if m_socketEventHandler is Function<void()>() because there is any locking for m_socketEventHandler. >> >> But WorkQueue::unregisterSocketEventHandler() will only work on the workqueue thread because Connection::platformInvalidate() is always working on the workqueue thread. >> So, setting m_socketDescriptor as invalidSocketDescriptor will be enough to prevent calling m_socketEventHandler() because the value is checked before calling m_socketEventHandler(). >> I think this is redundant. > > And I cannot understand that 'This handler was holding a reference to the WorkQueue on WorkQueueThread' I cannot find any workqueue referencing about m_socketEventHandler. Sorry, it's not WorkQueue's reference being held on WorkQueueThread, but Connection's reference. Check on ConnectionUnix where registerSocketEventHandler is called. It does a WTF::bind of a Connection's function: this increases ref count of Connection until that function is cleared. If we don't clear m_socketEventHandler on unregisterSocketEventHandler, then a reference to Connection will keep him alive because of its own WorkQueue and then both Connection and WorkQueue will leak (thanks to Lauro for fixing it!).

Byungwoo Lee

Comment 38 2013-05-23 19:40:00 PDT

Comment on attachment 201899 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=201899&action=review >>>>>> Source/WebKit2/Platform/efl/WorkQueueEfl.cpp:209 >>>>>> + m_socketEventHandler = Function<void()>(); >>>>> >>>>> This will make assertion. >>>> >>>> Why it would assert? This function is called on WorkQueueThread when Connection is about to close. This handler was holding a reference to the WorkQueue on WorkQueueThread so without this change, WorkQueue would never be destroyed. >>> >>> I thought that 'unregisterSocketEventHandler() can be invoked on a thread which is not the workqueue thread. >>> If so, m_socketEventHandler() will make assersion if m_socketEventHandler is Function<void()>() because there is any locking for m_socketEventHandler. >>> >>> But WorkQueue::unregisterSocketEventHandler() will only work on the workqueue thread because Connection::platformInvalidate() is always working on the workqueue thread. >>> So, setting m_socketDescriptor as invalidSocketDescriptor will be enough to prevent calling m_socketEventHandler() because the value is checked before calling m_socketEventHandler(). >>> I think this is redundant. >> >> And I cannot understand that 'This handler was holding a reference to the WorkQueue on WorkQueueThread' I cannot find any workqueue referencing about m_socketEventHandler. > > Sorry, it's not WorkQueue's reference being held on WorkQueueThread, but Connection's reference. Check on ConnectionUnix where registerSocketEventHandler is called. It does a WTF::bind of a Connection's function: this increases ref count of Connection until that function is cleared. If we don't clear m_socketEventHandler on unregisterSocketEventHandler, then a reference to Connection will keep him alive because of its own WorkQueue and then both Connection and WorkQueue will leak (thanks to Lauro for fixing it!). Ok, Now I can understand about the reference chain between WorkQueue and Connection with m_socketEventHandler (Thanks!), and clearing m_socketEventHandler looks reasonable. How about separating the workqueue problem from this patch? Clearing m_socketEventHandler looks right approach to prevent the leaking of the Connection, but dangling work queue problem looks a different issue that can be handled on another bug.

Rafael Brandao

Comment 39 2013-05-24 11:03:02 PDT

(In reply to comment #38) > (From update of attachment 201899 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=201899&action=review > How about separating the workqueue problem from this patch? > Clearing m_socketEventHandler looks right approach to prevent the leaking of the Connection, but dangling work queue problem looks a different issue that can be handled on another bug. Sigh. I totally agree with this; in fact, I had also proposed that earlier. But maybe we should be careful about landing Connection fix without WorkQueue's destruction fix since one exposes the other. It will probably make our testing suit a mess (random crashes).

Byungwoo Lee

Comment 40 2013-05-24 18:58:13 PDT

(In reply to comment #39) > (In reply to comment #38) > > (From update of attachment 201899 [details] [details]) > > View in context: https://bugs.webkit.org/attachment.cgi?id=201899&action=review > > How about separating the workqueue problem from this patch? > > Clearing m_socketEventHandler looks right approach to prevent the leaking of the Connection, but dangling work queue problem looks a different issue that can be handled on another bug. > > Sigh. I totally agree with this; in fact, I had also proposed that earlier. But maybe we should be careful about landing Connection fix without WorkQueue's destruction fix since one exposes the other. It will probably make our testing suit a mess (random crashes). Then, How about creating a new bug for the dangling workqueue and setting dependency with this?

Nick Diego Yamane (diegoyam)

Comment 41 2013-06-06 16:47:39 PDT

I've just made some tests using the latest version of patch from https://bugs.webkit.org/show_bug.cgi?id=115332 + the WorkQueue's socketEventHandler cleanup at 'unregisterSocketEventHandler' function (proposed previously here). I used this API test case (https://github.com/WebKitNix/webkitnix/blob/master/Tools/TestWebKitAPI/Tests/nix/WebViewWebProcessCrashed.cpp) from WebKitNix port, which forces WebProcess to crash/reload sequentially multiple times. Bellow some details about the results: New implementation based on 'DispatchQueue' seems to be much more robust and easy to maintain/extend than the previous one, however at some point (usually after 400 crash/respawns) the test crashes with a ASSERTION fail at WTF::Mutex's destructor. More details here, gdb session log: http://paste.debian.net/8904/). From that debug session I could see that a call to pthread_mutex_destroy() at Mutex's destructor was returning error code 16, which most probably means it was trying to destroy a locked mutex[1]. m_writeToPipeDescriptorLock was the mutex causing the problem. So, I propose (after the mentioned patch has been landed) to protect the DispatchQueue destruction using a MutexLocker on that mutex to prevent this concurrency issue between ~DispatchQueue and DispatchQueue::wakeup(). [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_init.html

8179.jeong@gmail.com

Comment 42 2013-09-27 18:54:48 PDT

*** Bug 122044 has been marked as a duplicate of this bug. ***

Michael Catanzaro

Comment 43 2017-03-11 10:31:36 PST

Closing this bug because the EFL port has been removed from trunk. If you feel this bug applies to a different upstream WebKit port and was closed in error, please either update the title and reopen the bug, or leave a comment to request this.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution WONTFIX

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component New Bugs

Assignee

Rafael Brandao

Reported

2013-04-11 03:50 PDT

Modified

2017-03-11 10:31 PST History

CC List

18 users Show

URL

Keywords

Duplicates (1)

122044 View as bug list

Depends on

115332

Blocks

Dependencies

tree graph