Bug 191004 - [GTK] Layout test media/track/track-cue-missing.html is failing
Summary: [GTK] Layout test media/track/track-cue-missing.html is failing
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK (show other bugs)
Version: Other
Hardware: PC Linux
: P2 Normal
Assignee: Enrique Ocaña
URL:
Keywords:
: 190871 (view as bug list)
Depends on:
Blocks:
 
Reported: 2018-10-27 18:20 PDT by Michael Catanzaro
Modified: 2021-09-13 09:04 PDT (History)
16 users (show)

See Also:


Attachments
Patch (6.39 KB, patch)
2021-04-29 09:23 PDT, Enrique Ocaña
no flags Details | Formatted Diff | Diff
Patch (7.90 KB, patch)
2021-05-05 04:54 PDT, Enrique Ocaña
no flags Details | Formatted Diff | Diff
Patch (8.71 KB, patch)
2021-05-07 04:52 PDT, Enrique Ocaña
no flags Details | Formatted Diff | Diff
Patch (17.01 KB, patch)
2021-05-12 09:39 PDT, Enrique Ocaña
no flags Details | Formatted Diff | Diff
Patch (17.99 KB, patch)
2021-05-14 09:15 PDT, Enrique Ocaña
no flags Details | Formatted Diff | Diff
Patch (18.15 KB, patch)
2021-05-25 05:14 PDT, Enrique Ocaña
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Catanzaro 2018-10-27 18:20:49 PDT
Layout test media/track/track-cue-missing.html is failing since it was added in r237376 "TextTrack cues should be updated more often than every 250ms." There is a text diff:

--- /home/slave/webkitgtk/gtk-linux-64-release-tests/build/layout-test-results/media/track/track-cue-missing-expected.txt
+++ /home/slave/webkitgtk/gtk-linux-64-release-tests/build/layout-test-results/media/track/track-cue-missing-actual.txt
@@ -5,6 +5,6 @@
 EVENT(loadeddata)
 Created 121 cues.
 EVENT(ended)
-EXPECTED (missedCueCount < '50') OK
+EXPECTED (missedCueCount < '50'), OBSERVED '96' FAIL
 END OF TEST
Comment 1 Miguel Gomez 2019-06-13 06:30:16 PDT
Sometimes this is also crashing and timing out, with no information about why Updating expectations.
Comment 2 Enrique Ocaña 2021-04-29 09:23:02 PDT
Created attachment 427350 [details]
Patch
Comment 3 Alicia Boya García 2021-04-29 10:10:42 PDT
Comment on attachment 427350 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=427350&action=review

Good job. Frame accuracy is something we need to improve in WebKitGTK.

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:2849
> +        RunLoop::main().dispatchAfter(Seconds((time-currentTime).toDouble() / double(m_playbackRate)), WTFMove(task));

This is not robust enough. This breaks if the video is paused or a seek occurs. This is also vulnerable to slight timing differences. Given that a common application of this API is subtitles, I can imagine how this could cause frames being shown with the wrong subtitle for a fraction of a second.

Instead, to get a more robust implementation I suggest taking advantage of MediaPlayerPrivateGStreamer::triggerRepaint(), since that will give you frame accuracy. You can get the stream time of the PTS of each sample, and if that time is >= the time previously saved in this function, run the callback.

Looking at the implementation in AVFoundation I see there is no need to have more than one callback at a time, so you can always clear the previous one. See performTaskAtMediaTime() in Source/WebCore/platform/graphics/avfoundation/objc/MediaPlayerPrivateAVFoundationObjC.mm for reference.

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.h:187
> +    bool performTaskAtMediaTime(Function<void()>&&, const MediaTime&);

Please add the `override` keyword to indicate that this method comes from MediaPlayerPrivateInterface, as opposed to something internal of the GStreamer backend.

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.h:409
> +    MediaTime playbackPosition(bool useCached = true) const;

Using an enum would be better for readability.
Comment 4 Enrique Ocaña 2021-05-05 04:54:14 PDT
Created attachment 427755 [details]
Patch
Comment 5 Enrique Ocaña 2021-05-05 05:00:39 PDT
*** Bug 190871 has been marked as a duplicate of this bug. ***
Comment 6 Alicia Boya García 2021-05-06 14:11:53 PDT
Comment on attachment 427755 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=427755&action=review

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:3078
> +        m_cachedPosition = MediaTime(gst_segment_to_stream_time(gst_sample_get_segment(sample), GST_FORMAT_TIME, GST_BUFFER_PTS(buffer)), GST_SECOND);

The problem with this code is that it alters m_cachedPosition (and therefore currentMediaTime()) from a background thread.

From the conversation we had before you commented that a big part of the problem you were having is that you needed currentTime to be updated every frame, rather than every 250ms, and doing this here did the trick.

I did some testing to confirm it, and indeed, there is no good reason for currentTime to update only every >200 ms, at all. No other browser has such a low resolution. I propose that instead of caching currentTime for 200 ms as we currently do, we cache it only for the current iteration of the main thread loop. That could be its own separate patch and then this patch could focus on dispatching a task after a certain currentTime has been reached... although both issues are closely intermingled, so it doesn't have to be.

I asked in the Slack channel and Eric Carlson had an idea on how we can update m_cachedTime once per main thread loop iteration without unnecessary queries and without its value changing within the same iteration: After every update of m_cachedTime, set a task with HTMLMediaElement::enqueueTaskForDispatcher() that resets m_cachedTime so that the next frame it's checked again. For that to work is important that currentTime is only used from the main thread. Stream time can still be used to post tasks for performTaskAtMediaTime().

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.h:405
> +            if (isEmpty() || !currentTime.isFinite() || (playbackRate >= 0 && m_targetTime > currentTime) || (playbackRate < 0 && m_targetTime < currentTime))

nit: I would prefer if you swap currentTime and m_targetTime, because given the combination of the inverse logic (early return) and swapped arguments from what one would more likely read I got a bit confused at first.
Comment 7 Enrique Ocaña 2021-05-07 04:34:55 PDT
Comment on attachment 427755 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=427755&action=review

>> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:3078
>> +        m_cachedPosition = MediaTime(gst_segment_to_stream_time(gst_sample_get_segment(sample), GST_FORMAT_TIME, GST_BUFFER_PTS(buffer)), GST_SECOND);
> 
> The problem with this code is that it alters m_cachedPosition (and therefore currentMediaTime()) from a background thread.
> 
> From the conversation we had before you commented that a big part of the problem you were having is that you needed currentTime to be updated every frame, rather than every 250ms, and doing this here did the trick.
> 
> I did some testing to confirm it, and indeed, there is no good reason for currentTime to update only every >200 ms, at all. No other browser has such a low resolution. I propose that instead of caching currentTime for 200 ms as we currently do, we cache it only for the current iteration of the main thread loop. That could be its own separate patch and then this patch could focus on dispatching a task after a certain currentTime has been reached... although both issues are closely intermingled, so it doesn't have to be.
> 
> I asked in the Slack channel and Eric Carlson had an idea on how we can update m_cachedTime once per main thread loop iteration without unnecessary queries and without its value changing within the same iteration: After every update of m_cachedTime, set a task with HTMLMediaElement::enqueueTaskForDispatcher() that resets m_cachedTime so that the next frame it's checked again. For that to work is important that currentTime is only used from the main thread. Stream time can still be used to post tasks for performTaskAtMediaTime().

I did a quick implementation of the m_cachedPosition reset every mainloop iteration by calling g_idle_add() in the player constructor, clearing it in the destructor and adding some printf()s to verify the result. It worked, but I got scared about the huge inefficiency that running some code on *every* mainloop iteration represents (it's *a lot*, and that code will run as long as the player is alive, no matter if it's paused or unused before destruction!!!). In the end, I considered that it would be way more efficient to just RunLoop::main().dispatch() a task to update m_cachedPosition in the main thread every triggerRepaint(). Yes, that's a lot of work, but much less than running some code every mainloop iteration. I'm sending a new patch with this new approach.
Comment 8 Enrique Ocaña 2021-05-07 04:52:20 PDT
Created attachment 427991 [details]
Patch
Comment 9 Alicia Boya García 2021-05-08 03:20:55 PDT
Comment on attachment 427755 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=427755&action=review

>>> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:3078
>>> +        m_cachedPosition = MediaTime(gst_segment_to_stream_time(gst_sample_get_segment(sample), GST_FORMAT_TIME, GST_BUFFER_PTS(buffer)), GST_SECOND);
>> 
>> The problem with this code is that it alters m_cachedPosition (and therefore currentMediaTime()) from a background thread.
>> 
>> From the conversation we had before you commented that a big part of the problem you were having is that you needed currentTime to be updated every frame, rather than every 250ms, and doing this here did the trick.
>> 
>> I did some testing to confirm it, and indeed, there is no good reason for currentTime to update only every >200 ms, at all. No other browser has such a low resolution. I propose that instead of caching currentTime for 200 ms as we currently do, we cache it only for the current iteration of the main thread loop. That could be its own separate patch and then this patch could focus on dispatching a task after a certain currentTime has been reached... although both issues are closely intermingled, so it doesn't have to be.
>> 
>> I asked in the Slack channel and Eric Carlson had an idea on how we can update m_cachedTime once per main thread loop iteration without unnecessary queries and without its value changing within the same iteration: After every update of m_cachedTime, set a task with HTMLMediaElement::enqueueTaskForDispatcher() that resets m_cachedTime so that the next frame it's checked again. For that to work is important that currentTime is only used from the main thread. Stream time can still be used to post tasks for performTaskAtMediaTime().
> 
> I did a quick implementation of the m_cachedPosition reset every mainloop iteration by calling g_idle_add() in the player constructor, clearing it in the destructor and adding some printf()s to verify the result. It worked, but I got scared about the huge inefficiency that running some code on *every* mainloop iteration represents (it's *a lot*, and that code will run as long as the player is alive, no matter if it's paused or unused before destruction!!!). In the end, I considered that it would be way more efficient to just RunLoop::main().dispatch() a task to update m_cachedPosition in the main thread every triggerRepaint(). Yes, that's a lot of work, but much less than running some code every mainloop iteration. I'm sending a new patch with this new approach.

> I got scared about the huge inefficiency that running some code on *every* mainloop iteration represents

Please note that the callback to be scheduled here is not to get the position but to clear it. It's therefore only added and run if during this main thread loop iteration the position has been queried, and the only thing it has to do is clear a MediaTime field. Adding callback to the main loop shouldn't be *that* expensive, since it's essentially just adding a small object to a queue and doing a function call that clears some recently used data.
Comment 10 Alicia Boya García 2021-05-08 03:27:02 PDT
However, on the topic of performance for position queries, while looking at gstbin code I realized that it searches the entire pipeline for elements filtering those marked as sink before answering position queries. Then it queries those and returns the biggest value. My impression is that, and not a callback that clears a small data field, could have a performance impact if that search ends up being recursive (I haven't digged deeper into this).

If that ends up being the case, a faster approach would be to query the sinks ourselves directly, rather than the pipeline.
Comment 11 Alicia Boya García 2021-05-10 06:26:25 PDT
Summarizing: things to do with this patch: two parts (ideally in two separate patches):

Part 1:

We need currentTime to return the current position, not the one <=200 ms ago. In order to do this, we will schedule a task with HTMLMediaElement::enqueueTaskForDispatcher() to clear m_cachedTime in the next main thread loop iteration right after m_cachedTime is set. (That will ensure that in a given tick currentTime never changes value but currentTime is always accurate to the current frame, and that position queries are only made when someone queries currentTime.)

Performance note: Please evaluate the performance implications of querying the position to the pipeline if it's queried every frame. If it's too slow, we can -- as an optimization -- query the sinks directly and take the maximum value, doing the same gstbin would do, but avoiding the traversal of the entire pipeline.

Part 2 (the one for this Bugzilla issue):

We want to support performTaskAtMediaTime(). To reduce the complexity of the implementation, for now we will only support it when there is video (we won't support it for audio-only files in this patch, although that could be added later, potentially with additional complexity, if needed).

We will add some code in the video sink that checks the stream time against a given target time, and if that time is hit or passed, enqueue a task to run in the main thread.

Thread-safety notice and something I forgot to mention: the variable holding the object used for checking the target time and holding the task (in the latest patch, m_pendingTask) must be set atomically, since it's read and written from different, concurrent threads. You can use a mutex for this.
Comment 12 Enrique Ocaña 2021-05-12 09:39:43 PDT
Created attachment 428378 [details]
Patch
Comment 13 Enrique Ocaña 2021-05-12 09:45:11 PDT
(In reply to Alicia Boya García from comment #11)

> We need currentTime to return the current position, not the one <=200 ms
> ago. In order to do this, we will schedule a task with
> HTMLMediaElement::enqueueTaskForDispatcher() to clear m_cachedTime in the
> next main thread loop iteration right after m_cachedTime is set.

In the end, I had to use RunLoop::main().dispatch() for that, as HTMLMediaElement::enqueueTaskForDispatcher() was only meant to be used from HTMLMediaElement code. On the one hand, it's a private method, and on the other hand, even if it was public, it should have to be exposed through the MediaPlayerClient interface, which is the only "face" of HTMLMediaElement that the player private can see.

About the other suggestions, all of them are implemented in the new patch I've just uploaded. See the ChangeLog for the performance measures you asked for.
Comment 14 Alicia Boya García 2021-05-13 09:10:34 PDT
Comment on attachment 428378 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=428378&action=review

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:2859
> +    m_pendingTask = PendingTask(WTFMove(task), time, m_pendingTaskMutex);

You're constructing a new PendingTask -- its constructor takes the mutex, initializes its members while holding its mutex (totally unnecessary, since during the constructor execution no other code has received a reference to that instance). Then, the constructor withdraws the mutex, and this code runs operator=() on PendingTask, which -- now without any mutex protecting anything -- swaps the data members of the new instance with the members of the old instance.

r-

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.h:308
> +    void invalidateCachedPositionOnNextIteration() const;

invalidateCachedPositionOnNextIteration() can be private, since it's only used by MediaPlayerPrivateGStreamer. Note playbackPosition() is also private. Both are things that could be changed easily if ever needed.
Comment 15 Enrique Ocaña 2021-05-13 10:34:18 PDT
Comment on attachment 428378 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=428378&action=review

>> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:2859
>> +    m_pendingTask = PendingTask(WTFMove(task), time, m_pendingTaskMutex);
> 
> You're constructing a new PendingTask -- its constructor takes the mutex, initializes its members while holding its mutex (totally unnecessary, since during the constructor execution no other code has received a reference to that instance). Then, the constructor withdraws the mutex, and this code runs operator=() on PendingTask, which -- now without any mutex protecting anything -- swaps the data members of the new instance with the members of the old instance.
> 
> r-

Thanks for catching this one. I forgot about operator=() when thinking on the cases that should be protected. I think I'm better moving the management of the lock out of PendingTask in the next patch version. I coded that management inside to avoid boilerplate, but it has proven to be a bad idea.
Comment 16 Enrique Ocaña 2021-05-14 09:15:54 PDT
Created attachment 428626 [details]
Patch
Comment 17 Alicia Boya García 2021-05-14 11:18:46 PDT
Comment on attachment 428626 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=428626&action=review

LGTM, r+ with the rename nits.

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:2868
> +            m_playbackRate >= 0 ? TaskAtMediaTimeScheduler::LessOrEqual : TaskAtMediaTimeScheduler::GreaterOrEqual);

Shouldn't this be reversed? If m_playbackRate >= 0 we want to check when time is greater or equal to the target. The code being implemented with early return (and therefore negative condition) should not affect this.

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:2871
> +    if (taskToSchedule.hasValue())

Add a explanatory comment: // Dispatch the task synchronously if the time is already reached.

> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.h:405
> +        Optional<Function<void()>> scheduleIfTimeReached(const MediaTime& currentTime)

The name of the function is now confusing if the scheduling is not done inside. We could have a more descriptive name. What about `checkTaskForScheduling()`?
Comment 18 Enrique Ocaña 2021-05-15 15:09:37 PDT
Comment on attachment 428626 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=428626&action=review

>> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:2868
>> +            m_playbackRate >= 0 ? TaskAtMediaTimeScheduler::LessOrEqual : TaskAtMediaTimeScheduler::GreaterOrEqual);
> 
> Shouldn't this be reversed? If m_playbackRate >= 0 we want to check when time is greater or equal to the target. The code being implemented with early return (and therefore negative condition) should not affect this.

That's the problem when the enum is named TargetTimeOperator (literally targetTime.operator(), such as targetTime.lessOrEqual(currentTime)). If you want to apply the operator on currentTime, the enum should be named CurrentTimeOperator. 
Anyway, to my taste, I think that it would be much better if the enum was named PlaybackDirection, with values Forward and Backwards.

>> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.h:405
>> +        Optional<Function<void()>> scheduleIfTimeReached(const MediaTime& currentTime)
> 
> The name of the function is now confusing if the scheduling is not done inside. We could have a more descriptive name. What about `checkTaskForScheduling()`?

That's a great name.
Comment 19 Alicia Boya García 2021-05-16 14:58:47 PDT
Comment on attachment 428626 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=428626&action=review

>>> Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp:2868
>>> +            m_playbackRate >= 0 ? TaskAtMediaTimeScheduler::LessOrEqual : TaskAtMediaTimeScheduler::GreaterOrEqual);
>> 
>> Shouldn't this be reversed? If m_playbackRate >= 0 we want to check when time is greater or equal to the target. The code being implemented with early return (and therefore negative condition) should not affect this.
> 
> That's the problem when the enum is named TargetTimeOperator (literally targetTime.operator(), such as targetTime.lessOrEqual(currentTime)). If you want to apply the operator on currentTime, the enum should be named CurrentTimeOperator. 
> Anyway, to my taste, I think that it would be much better if the enum was named PlaybackDirection, with values Forward and Backwards.

Yeah, let's go with PlaybackDirection for that one and remove the ambiguity.
Comment 20 Enrique Ocaña 2021-05-25 05:14:19 PDT
Created attachment 429645 [details]
Patch
Comment 21 Enrique Ocaña 2021-05-25 05:20:57 PDT
Changed TargetTimeOperator to PlaybackDirection as agreed, but I detected some regressions before resubmitting the patch.

Some media/track/in-band tests were crashing due to an infinite recursion when the task due time was already met in performTaskAtMediaTime() (so the task was directly run in the main thread there). The patch finally submitted before this comment already has the problem solved by enqueing the task for dispatching in the main thread.
Comment 22 EWS 2021-05-25 05:52:13 PDT
Committed r278004 (238117@main): <https://commits.webkit.org/238117@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 429645 [details].
Comment 23 Enrique Ocaña 2021-08-17 10:00:53 PDT
Arcady Goldmints-Orlov reported that this patch might have created a flaky regression in http/tests/security/webaudio-render-remote-audio-allowed-crossorigin-redirect.html

That test creates an <audio> element from JavaScript and wraps it in a webaudio source. Maybe the patch broke <audio> in some way and that causes problems in the test.