RESOLVED FIXED 180978
Deadlock in GStreamer video sink during shutdown
https://bugs.webkit.org/show_bug.cgi?id=180978
Summary Deadlock in GStreamer video sink during shutdown
Sebastian Dröge (slomo)
Reported 2017-12-19 09:54:54 PST
There's a rare deadlock in the GStreamer video sink on shutdown. What can happen is that the video sink is waiting on the draw condition variable for the main thread to actually render the frame, while the main thread is waiting for the GStreamer video sink (and its streaming thread) to shut down. Patch follows in a minute, but it needs some longer testing to make sure this problem is solved for real. In the testcase it happens about once every 24 hours.
Attachments
0001-GStreamer-Don-t-wait-for-draw-condition-variable-whe.patch (5.24 KB, patch)
2017-12-19 09:56 PST, Sebastian Dröge (slomo)
no flags
0001-GStreamer-Don-t-wait-for-draw-condition-variable-whe.patch (4.46 KB, patch)
2017-12-19 10:03 PST, Sebastian Dröge (slomo)
no flags
Sebastian Dröge (slomo)
Comment 1 2017-12-19 09:56:27 PST
Created attachment 329768 [details] 0001-GStreamer-Don-t-wait-for-draw-condition-variable-whe.patch
Sebastian Dröge (slomo)
Comment 2 2017-12-19 10:03:51 PST
Created attachment 329769 [details] 0001-GStreamer-Don-t-wait-for-draw-condition-variable-whe.patch There was some locale confusion by GIT, this should be better now
Sebastian Dröge (slomo)
Comment 3 2017-12-21 06:14:10 PST
(In reply to Sebastian Dröge (slomo) from comment #0) > Patch follows in a minute, but it needs some longer testing to make sure > this problem is solved for real. In the testcase it happens about once every > 24 hours. Probably confirmed, this deadlock did not happen again
Carlos Garcia Campos
Comment 4 2018-01-03 03:33:48 PST
Miguel, could you review this? you were the last one dealing with the deadlocks related to the draw timer. I'll r+ it if approved by Miguel.
Xabier Rodríguez Calvar
Comment 5 2018-01-09 00:43:46 PST
(In reply to Carlos Garcia Campos from comment #4) > Miguel, could you review this? you were the last one dealing with the > deadlocks related to the draw timer. I'll r+ it if approved by Miguel. I agree, just pinged him.
Miguel Gomez
Comment 6 2018-01-09 01:12:24 PST
(In reply to Sebastian Dröge (slomo) from comment #3) > (In reply to Sebastian Dröge (slomo) from comment #0) > > > Patch follows in a minute, but it needs some longer testing to make sure > > this problem is solved for real. In the testcase it happens about once every > > 24 hours. > > Probably confirmed, this deadlock did not happen again Sorry for the late reply. Is this issue happening when accelerated compositing is disabled? Or does it happen with accelerated compositing enabled but gstreamer-gl is disabled? I ask because the patch touches the two cases but I'm not sure which is one you're trying to fix. My bet is that this is the AC enabled non gst-gl case, cause I think with AC disabled the deadlock should not be happening. In that case, I think what we need is just to modify MediaPlayerPrivateGStreamerBase::cancelRepaint() to notify drawCondition always (as you did in your patch, but we don't need the m_drawCancelled variable) instead of just doing it for the non AC case. I think that the problem we find here is that the compositing thread has been destroyed so the gstreamer thread will never get the drawCondition notification from it, and we need to send it from the main thread prior to destroying the player.
Sebastian Dröge (slomo)
Comment 7 2018-01-09 05:50:59 PST
It happens when accelerated compositing is disabled, that's the case I was debugging and which this patch is fixing. I only touched that other case for completeness to ensure that nothing is ever waiting on the condition variable during shutdown, no matter which code path is taken. Why is the cancelled boolean unneeded in your opinion? One potential problem here is that the condition variable is signalled (due to cancelling) before waiting happens, and then it would be waiting forever if we didn't have another boolean flag that can be checked before waiting to decide whether we have to wait or should not wait because nothing is ever going to wake us up again.
Miguel Gomez
Comment 8 2018-01-09 07:01:11 PST
(In reply to Sebastian Dröge (slomo) from comment #7) > It happens when accelerated compositing is disabled, that's the case I was > debugging and which this patch is fixing. > > I only touched that other case for completeness to ensure that nothing is > ever waiting on the condition variable during shutdown, no matter which code > path is taken. Ah, ok. Bad assumption from my side then. > Why is the cancelled boolean unneeded in your opinion? One potential problem > here is that the condition variable is signalled (due to cancelling) before > waiting happens, and then it would be waiting forever if we didn't have > another boolean flag that can be checked before waiting to decide whether we > have to wait or should not wait because nothing is ever going to wake us up > again. The boolean comment was in case the problem was with AC enabled. But it's not the case. cancelRepaint() gets called in the destructor after the sink's repaint callback is disconnected. I was assuming that after this no new repaint callbacks could happen, but if you were able to reproduce a situation where cancelRepaint() notifies the condition before triggerRepaint() waits on it, then it means that after disconnecting the signals there might be callbacks running anyway. The usage of the boolean makes sense then. But I'm not sure about trying to fix the AC case as well, as in that case the condition is not notified from the main thread but from the compositing one. And notifying it from the main thread could cause gstreamer to release the frame while the compositor thread is still using it. In this case I would prefer to leave the behavior as is: the main thread is blocked while trying to set the pipeline to null while the compositor thread is using the frame. Once the compositor thread notifies the gstreamer thread, this can proceed to set the null state and then the main thread can continue.
Sebastian Dröge (slomo)
Comment 9 2018-01-09 07:43:12 PST
(In reply to Miguel Gomez from comment #8) > But I'm not sure about trying to fix the AC case as well, as in that case > the condition is not notified from the main thread but from the compositing > one. And notifying it from the main thread could cause gstreamer to release > the frame while the compositor thread is still using it. In this case I > would prefer to leave the behavior as is: the main thread is blocked while > trying to set the pipeline to null while the compositor thread is using the > frame. Once the compositor thread notifies the gstreamer thread, this can > proceed to set the null state and then the main thread can continue. Is it guaranteed however that in the AC case the compositor thread is always going to wake up the condition variable, even if both the GStreamer streaming thread and the main thread are blocked? Also even in the AC case it seems to make sense to remember if we were cancelled (i.e. if we're shutting down), and if so just not wait (or render or anything) at all.
Sebastian Dröge (slomo)
Comment 10 2018-01-13 11:03:08 PST
How should we move forward here?
Miguel Gomez
Comment 11 2018-01-15 01:21:46 PST
(In reply to Sebastian Dröge (slomo) from comment #10) > How should we move forward here? Let's just go ahead with your patch. Carlos, could you r+, please?
Philippe Normand
Comment 12 2018-01-15 01:56:35 PST
Sebastian, do you want the commit-queue to land this patch? I don't remember if you have a SVN account or not.
Sebastian Dröge (slomo)
Comment 13 2018-01-15 02:27:08 PST
(In reply to Philippe Normand from comment #12) > Sebastian, do you want the commit-queue to land this patch? I don't remember > if you have a SVN account or not. I do, but let's just have the commit-queue handle that. I'm lazy :)
WebKit Commit Bot
Comment 14 2018-01-15 02:50:49 PST
Comment on attachment 329769 [details] 0001-GStreamer-Don-t-wait-for-draw-condition-variable-whe.patch Clearing flags on attachment: 329769 Committed r226947: <https://trac.webkit.org/changeset/226947>
WebKit Commit Bot
Comment 15 2018-01-15 02:50:50 PST
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 16 2018-01-15 02:51:36 PST
Note You need to log in before you can comment on or make changes to this bug.