With the old logic, SourceBuffer unconditionally enqueued the first frame to be appended. This was a bug because the user could append [5, 10) first and then [0, 5). With the old behavior, [5, 10) would be enqueued first despite being clearly ahead of the initial playback time (zero). By the time [0, 5) arrived, it could no longer be enqueued because the decode queue was already ahead of it. This patch fixes the logic so that it works when the first segments are appended out of order. The test media-source-first-append-not-starting-at-zero.html validates it. The test media-source-append-presentation-durations.html checks that the new logic does not break when presentation duration != decode duration.

As part of the same logic block, lastEnqueuedPresentationTime was used to decide when it is necessary to perform a reenqueue after an .erase() (it is necessary if any enqueued frames are replaced). Using lastEnqueuedPresentationTime was not entirely accurate in the presence of B-frames, as you could erase a frame that has a presentation time higher than the last enqueued one. That logic is replaced with a monotonically increasing highestEnqueuedPresentationTime and is tested by media-source-remove-b-frame.html.

Tests: media/media-source/media-source-append-presentation-durations.html
       media/media-source/media-source-first-append-not-starting-at-zero.html
       media/media-source/media-source-remove-b-frame.html
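To make the idea concrete, here is a minimal standalone sketch of the two pieces of bookkeeping described above. All types and names here are hypothetical illustrations, not WebKit's actual code: a frame whose decode position is behind what has already been enqueued is rejected instead of enqueued, and highestEnqueuedPresentationTime only ever grows, so a B-frame with an out-of-order PTS cannot lower it.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical sketch: order frames by a (DTS, PTS) pair, since two
// samples can share a decode timestamp but differ in presentation time.
using DecodeKey = std::pair<double, double>; // (DTS, PTS)

struct TrackBuffer {
    bool hasEnqueued = false;
    DecodeKey lastEnqueuedDecodeKey {};
    double highestEnqueuedPresentationTime = -1;
};

// A newly received frame may only be enqueued if it is not behind the
// frames already handed to the decoder (otherwise it is "late").
bool canEnqueue(const TrackBuffer& track, double dts, double pts)
{
    return !track.hasEnqueued || DecodeKey { dts, pts } >= track.lastEnqueuedDecodeKey;
}

void markEnqueued(TrackBuffer& track, double dts, double pts)
{
    track.hasEnqueued = true;
    track.lastEnqueuedDecodeKey = { dts, pts };
    // Monotonic: erasing decisions based on this value stay valid even
    // when B-frames arrive with presentation times below earlier ones.
    track.highestEnqueuedPresentationTime =
        std::max(track.highestEnqueuedPresentationTime, pts);
}
```

With this shape, appending [5, 10) first and then [0, 5) makes the [0, 5) frames fail canEnqueue(), which is the symptom the patch addresses by not enqueueing the first frame unconditionally.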
Created attachment 377665 [details] Patch
Comment on attachment 377665 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377665&action=review

> Source/WebCore/Modules/mediasource/SourceBuffer.cpp:1803
> +        if (trackBuffer.lastEnqueuedDecodeTime.isInvalid() || sample.decodeTime() >= trackBuffer.lastEnqueuedDecodeTime) {

decodeTime is insufficient to enforce a strict ordering of samples. Two samples with the same decodeTime but different presentationTimes need to be ordered by both the DTS and PTS, which is why the original code uses the Decode map's KeyType.

I also don't understand why this code is necessary. Why isn't the decode queue flushed and re-enqueued when the [0, 5) samples are appended? Time can't be moving on from 0 since there's nothing to display yet.
Comment on attachment 377665 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377665&action=review

>> Source/WebCore/Modules/mediasource/SourceBuffer.cpp:1803
>> +        if (trackBuffer.lastEnqueuedDecodeTime.isInvalid() || sample.decodeTime() >= trackBuffer.lastEnqueuedDecodeTime) {
>
> decodeTime is insufficient to enforce a strict ordering of samples. Two samples with the same decodeTime but different presentationTimes need to be ordered by both the DTS and PTS, which is why the original code uses the Decode map's KeyType.
>
> I also don't understand why this code is necessary. Why isn't the decode queue flushed and re-enqueued when the [0, 5) samples are appended? Time can't be moving on from 0 since there's nothing to display yet.

You have a point about the first paragraph; I will rework it to use DecodeKey instead of DecodeTime.

About why there is an if in the first place instead of just flushing... well, this function is called every time a frame is appended (therefore, in append order), which does not need to coincide with decode order. In the [5, 10) [0, 5) case this may not seem like a big deal, but imagine [0, 10) [15, 20) [10, 15): inserting frames blindly into the decode queue would require a flush while the player is potentially playing [0, 10), which would cause an unnecessary hiccup, as the decoders would have to start from scratch when [10, 15) was inserted. Never enqueueing frames too far ahead in the future avoids that.
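The strict-ordering point above can be shown with a tiny sketch (a hypothetical DecodeKey struct, assumed here purely for illustration): comparing by DTS alone cannot distinguish two B-frame samples that share a decode timestamp, while a composite (DTS, PTS) key, like the decode map's KeyType, yields a strict ordering.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical composite key: order by DTS first, and fall back to PTS
// when two samples share the same decode timestamp.
struct DecodeKey {
    double dts;
    double pts;

    bool operator<(const DecodeKey& other) const
    {
        return std::tie(dts, pts) < std::tie(other.dts, other.pts);
    }
};
```

Two samples with dts == 1.0 but pts 1.0 and 2.0 are incomparable by DTS alone, yet the composite key orders them deterministically.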
(In reply to Alicia Boya García from comment #3)
> About why the if in the first place instead of just flushing... well, this
> function is called every time a frame is appended

Is that true? provideMediaData() should only be called in sourceBufferPrivateAppendComplete(), removeCodedFrames(), and sourceBufferPrivateDidBecomeReadyForMoreSamples(), none of which are per-appended-frame.

> (therefore, in
> append-order), which does not need to coincide with decode order. In the [5,
> 10) [0, 5) case this may not seem like a big deal, but imagine [0, 10) [15,
> 20) [10, 15)... inserting frames blindly in the decode queue would require a
> flush while the player is potentially playing [0, 10), which would cause an
> unnecessary hiccup as the decoders would have to start from scratch when
> [10, 15) was inserted. Never enqueueing frames too far ahead in the future
> avoids that.

That may be true, but I'm trying to work out why the decode queue cares about presentation times. There was already code to detect an unreasonably large gap between decode times. Why is that insufficient?
(In reply to Jer Noble from comment #4)
> (In reply to Alicia Boya García from comment #3)
> > About why the if in the first place instead of just flushing... well, this
> > function is called every time a frame is appended
>
> Is that true? provideMediaData() should only be called in
> sourceBufferPrivateAppendComplete(), removeCodedFrames(), and
> sourceBufferPrivateDidBecomeReadyForMoreSamples(). None of which are
> per-appended-frame.

What is done per frame is adding frames to the decodeQueue while an append is being parsed. provideMediaData() is indeed called only once, after that append has finished parsing, and it actually sends the frames of the queue into the decoders.

The piece of code the comment was on is in sourceBufferPrivateDidReceiveSample(), which is called per frame, and it decides whether a frame can be added to the queue or whether it's late (the decode queue is ahead of that frame in decode time).

> > (therefore, in
> > append-order), which does not need to coincide with decode order. In the [5,
> > 10) [0, 5) case this may not seem like a big deal, but imagine [0, 10) [15,
> > 20) [10, 15)... inserting frames blindly in the decode queue would require a
> > flush while the player is potentially playing [0, 10), which would cause an
> > unnecessary hiccup as the decoders would have to start from scratch when
> > [10, 15) was inserted. Never enqueueing frames too far ahead in the future
> > avoids that.
>
> That may be true, but I'm trying to work out why the decode queue cares
> about presentation times. There was already code to detect an unreasonably
> large gap between decode times. Why is that insufficient?

All the numbers in the explanation above are decode times, not presentation times.
(In reply to Alicia Boya García from comment #5)
> The piece of code the comment was in was
> sourceBufferPrivateDidReceiveSample(), which is called per-frame, and it
> decides whether a frame can be added to the queue or if it's late (the
> decode queue is ahead of that frame in decode time).

Correction: if it's late (i.e. a frame with a later DTS has already been enqueued). It's always confusing because enqueuing and adding to the decodeQueue mean different things: enqueuing is the last step, when SourceBuffer calls `enqueueSample()` on the SourceBufferPrivate to feed the frame to the decoder.
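The two-stage vocabulary above can be sketched with hypothetical types standing in for SourceBuffer and SourceBufferPrivate (not WebKit's actual classes): frames are added to decodeQueue one at a time while an append is parsed, and are only "enqueued" later, when provideMediaData() hands them to the private's enqueueSample().

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Sample { double dts; };

// Stand-in for SourceBufferPrivate: enqueueSample() feeds the decoder.
struct FakeSourceBufferPrivate {
    std::vector<double> decoderInput;
    void enqueueSample(const Sample& sample) { decoderInput.push_back(sample.dts); }
};

struct SourceBufferSketch {
    std::deque<Sample> decodeQueue;
    FakeSourceBufferPrivate privateImpl;

    // Called once per frame while an append is being parsed: this only
    // *adds to the decodeQueue*, it does not enqueue anything yet.
    void didReceiveSample(Sample sample) { decodeQueue.push_back(sample); }

    // Called once after the append finishes parsing: drains the queue,
    // *enqueuing* each frame into the decoder.
    void provideMediaData()
    {
        while (!decodeQueue.empty()) {
            privateImpl.enqueueSample(decodeQueue.front());
            decodeQueue.pop_front();
        }
    }
};
```

The distinction matters for the bug: a frame can sit safely in decodeQueue, but once a frame with a later decode position has been enqueued, an earlier frame is "late" and cannot be enqueued without flushing the decoder.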
Created attachment 377820 [details] Patch
Created attachment 387229 [details] Patch
Comment on attachment 387229 [details] Patch LGTM, though maybe we could give Eric/Jer some time to have a look as well.
Created attachment 387788 [details] Patch
Comment on attachment 387788 [details] Patch Clearing flags on attachment: 387788 Committed r254670: <https://trac.webkit.org/changeset/254670>
All reviewed patches have been landed. Closing bug.
<rdar://problem/58638968>