WebKitWebSrc as it is in master has a number of race conditions leading to occasional starvation (due to cancelling the wrong request) or data corruption (due to pushing data from a cancelled request). The threading situation wasn't easy to follow, as it wasn't clear access to what members should be protected by what mutex, in what circumstances. Also, some parts of the design were also introducing addicional complexity, such as the first request being sent from the main thread whereas the rest were being sent from the streaming thread or basesrc async start. In response, this patch reworks all the locking in WebKitWebSrc to use WTF::DataMutex. This ensures all accesses to its (now explicit) protected members are locked. The two mutexes and condition variables have been simplified into one, as there was no obvious need or benefit for two of each in this case. Requests have been numbered, which allows to safely and atomically ignore results from cancelled requests, avoiding data corruption races, and makes following them in debug logs much easier. The conditions for making and cancelling requests have been simplified to a simpler and safer model: There is at most only one active request at anytime, flushes cancel the request, and the first create() call always makes the new request (both at startup and after a flush). Debug asserts and notes about the flow of operations during basesrc seeks have been provided. As this effort needed a review of the entire WebKitWebSrc, cleanups, corrections and documentation comments have been provided where appropriate. This patch introduces no visible behavior changes, just stability improvements.
Created attachment 396095 [details] Patch
Comment on attachment 396095 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=396095&action=review Other than a couple of comments, LGTM, but I would like Phil to have a look at this as well since this is a big rework. > Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:105 > - if (resource) { > - auto* client = static_cast<CachedResourceStreamingClient*>(resource->client()); > - if (client) > - client->setSourceElement(nullptr); > - > - resource->setClient(nullptr); > - } > - loader = nullptr; > + DataMutex<WebKitWebSrcPrivate::StreamingMembers>::LockedWrapper members(dataMutex); > + ASSERT(!members->resource); > + members->loader = nullptr; Is this right? It looks like we are forgetting to set the client to null. Then we are also checking for resources and then we set the loader to null? > Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:329 > - case PROP_KEEP_ALIVE: > + case PROP_KEEP_ALIVE: { > src->priv->keepAlive = g_value_get_boolean(value); > break; > + } This does not seem necessary. > Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:480 > + if (gst_pad_peer_query(GST_BASE_SRC_PAD(baseSrc), query.get())) { You seem to be sending the context query just in one way and doc says we should send it upstream and downstream, why? OTOH, there are no other GStreamer elements that are going to answer this message because this is a very specific request, I think we do nothing by performing a context query and can go directly to posting the message. > Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:826 > + RunLoop::main().dispatch([protector, resource = WTFMove(members->resource), requestNumber = members->requestNumber] { Why don't you create the protector here? If you leave it like this, you are reffing at the beginning for the function and again here.
Comment on attachment 396095 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=396095&action=review >> Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:480 >> + if (gst_pad_peer_query(GST_BASE_SRC_PAD(baseSrc), query.get())) { > > You seem to be sending the context query just in one way and doc says we should send it upstream and downstream, why? > > OTOH, there are no other GStreamer elements that are going to answer this message because this is a very specific request, I think we do nothing by performing a context query and can go directly to posting the message. Well this is a src element, so there's no sink pad, hence no real upstream I suppose :)
(In reply to Philippe Normand from comment #3) > Well this is a src element, so there's no sink pad, hence no real upstream I > suppose :) If I were dumber, I wouldn't have head already.
Comment on attachment 396095 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=396095&action=review >>> Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:480 >>> + if (gst_pad_peer_query(GST_BASE_SRC_PAD(baseSrc), query.get())) { >> >> You seem to be sending the context query just in one way and doc says we should send it upstream and downstream, why? >> >> OTOH, there are no other GStreamer elements that are going to answer this message because this is a very specific request, I think we do nothing by performing a context query and can go directly to posting the message. > > Well this is a src element, so there's no sink pad, hence no real upstream I suppose :) Calvaris is right, I agree there's not much point of sending a query no one would answer anyway.
Comment on attachment 396095 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=396095&action=review >> Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:105 >> + members->loader = nullptr; > > Is this right? It looks like we are forgetting to set the client to null. Then we are also checking for resources and then we set the loader to null? Note the ASSERT(!members->resource). Reached this point, resource is null already. We were never entering into the if. The resource is already cancelled during UnLock(), which will be called during tear down by basesrc. On the other hand, the loader is preserved in the keep-alive case. Setting `loader` to null inside the notifyAndWait() ensures that it's destroyed from the main thread, which errs in the safe side since this is WebKit network API. >>>> Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:480 >>>> + if (gst_pad_peer_query(GST_BASE_SRC_PAD(baseSrc), query.get())) { >>> >>> You seem to be sending the context query just in one way and doc says we should send it upstream and downstream, why? >>> >>> OTOH, there are no other GStreamer elements that are going to answer this message because this is a very specific request, I think we do nothing by performing a context query and can go directly to posting the message. >> >> Well this is a src element, so there's no sink pad, hence no real upstream I suppose :) > > Calvaris is right, I agree there's not much point of sending a query no one would answer anyway. We are a source element, we only have downstream indeed. Bins catch and store contexts, so after the message has been posted once -- even in a different WebKitWebSrc in the same pipeline, the query is indeed answered. (Note this is done to satisfy adaptivedemux cases where there are several HTTP source elements hanging around.) Also, this code is not new, it has just been reindented into `members.runUnlocked()`. >> Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:826 >> + RunLoop::main().dispatch([protector, resource = WTFMove(members->resource), requestNumber = members->requestNumber] { > > Why don't you create the protector here? If you leave it like this, you are reffing at the beginning for the function and again here. Inertia. I could replace it with a generalized lambda capture like [protector = WTF::ensureGRef(src)] here and in every other case in the file actually.
Created attachment 396541 [details] Patch
The updated version removes the usage of MainThreadNotifier after I identified a race condition consequence of its usage: if two tasks to make an HTTP request are passed to notify in quick succession, MainThreadNotifier discards the newest one, never sending the request. Removing MainThreadNotifier usages also simplifies the WebKitWebSrc private destruction code.
Created attachment 397449 [details] Patch
Comment on attachment 397449 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=397449&action=review > Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:446 > + if (members->readPosition == members->size) { > + GST_TRACE_OBJECT(src, "just downloaded the last chunk in the file, loadFinished() is about to be called"); > + return; > + } I uploaded a revision of the patch after identifying and fixing a bug, fixed by these lines of code. Before they were causing Resource cancellation, and therefore, loadFinished() wasn't called and EOS wasn't sent, causing the video not to play because as far as the queues could know in that state, they were starved.
/Volumes/Data/worker/Commit-Queue/build/LayoutTests/ChangeLog neither lists a valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case insensitive).
Created attachment 397678 [details] Patch for landing
ChangeLog entry in Source/WebCore/ChangeLog contains OOPS!.
Created attachment 397680 [details] Patch for landing
Committed r260755: <https://trac.webkit.org/changeset/260755> All reviewed patches have been landed. Closing bug and clearing flags on attachment 397680 [details].
*** Bug 209719 has been marked as a duplicate of this bug. ***
*** Bug 206162 has been marked as a duplicate of this bug. ***
*** Bug 203194 has been marked as a duplicate of this bug. ***
*** Bug 204654 has been marked as a duplicate of this bug. ***
Comment on attachment 397680 [details] Patch for landing View in context: https://bugs.webkit.org/attachment.cgi?id=397680&action=review > Source/WebCore/platform/graphics/gstreamer/WebKitWebSourceGStreamer.cpp:-1090 > - if (response.httpStatusCode() == 200) { > - // Range request didn't have a ranged response; resetting offset. > - priv->readPosition = 0; > - } else if (response.httpStatusCode() != 206) { Why was this removed?