Bug 135855

Summary: [MSE] YouTube will lose audio, video after seeking backwards to an unbuffered range.
Product: WebKit Reporter: Jer Noble <jer.noble>
Component: New BugsAssignee: Jer Noble <jer.noble>
Status: RESOLVED FIXED    
Severity: Normal CC: aboya, calvaris, commit-queue, eric.carlson, glenn, ltilve, philipj, sergio, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 135867    
Attachments:
Description Flags
Patch eric.carlson: review+

Description Jer Noble 2014-08-12 14:50:25 PDT
[MSE] YouTube will lose audio, video after seeking backwards to an unbuffered range.
Comment 1 Jer Noble 2014-08-12 14:56:50 PDT
Created attachment 236471 [details]
Patch
Comment 2 Jer Noble 2014-08-12 16:37:50 PDT
Committed r172506: <http://trac.webkit.org/changeset/172506>
Comment 3 Jer Noble 2014-12-16 13:26:53 PST
<rdar://problem/19269013>
Comment 4 Alicia Boya García 2018-04-17 09:46:05 PDT
Comment on attachment 236471 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=236471&action=review

> Source/WebCore/Modules/mediasource/SourceBuffer.cpp:1389
> +        // Do not enqueue samples spanning a significant unbuffered gap:

Why not avoid pushing such samples to the decodeQueue in the first place?
Comment 5 Jer Noble 2018-04-17 09:56:19 PDT
Comment on attachment 236471 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=236471&action=review

>> Source/WebCore/Modules/mediasource/SourceBuffer.cpp:1389
>> +        // Do not enqueue samples spanning a significant unbuffered gap:
> 
> Why not avoid pushing such samples to the decodeQueue in the first place?

You would also have to guarantee that removing a range of samples from the decodeQueue (as in removeSamplesFromTrackBuffer) also removed any samples that resulted in a large gap.
Comment 6 Alicia Boya García 2018-04-19 09:20:09 PDT
(In reply to Jer Noble from comment #5)
> Comment on attachment 236471 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=236471&action=review
> 
> >> Source/WebCore/Modules/mediasource/SourceBuffer.cpp:1389
> >> +        // Do not enqueue samples spanning a significant unbuffered gap:
> > 
> > Why not avoid pushing such samples to the decodeQueue in the first place?
> 
> You would also have to guarantee that removing a range of samples from the
> decodeQueue (as in removeSamplesFromTrackBuffer) also removed any samples
> that resulted in a large gap.

Thanks for replying. I think I get it; but let's use an example to make sure
we're on the same page. For this example PTS=DTS and all frames are sync frames
(as is the case with audio) and every frame is 1 second long to make it
simpler. Playback has not started yet and the first 10 seconds of audio were
just appended.

decodeQueue == trackBuffer.samples == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Now the user calls SourceBuffer.remove(4, 7), which in turns calls
removeSamplesFromTrackBuffer([4, 5, 6]). These are the resulting frames in the
track buffer:

decodeQueue == trackBuffer.samples = [0, 1, 2, 3, 7, 8, 9]

So the condition discussed above is necessary to avoid [7, 8, 9] to be actually
enqueued until [4, 5, 6] are re-appended, right?

Let's say right after this we append a frame with DTS=10.
sourceBufferPrivateDidReceiveSample() is called and we reach this condition:

if (trackBuffer.lastEnqueuedDecodeEndTime.isInvalid() || decodeTimestamp >= trackBuffer.lastEnqueuedDecodeEndTime) {
    DecodeOrderSampleMap::KeyType decodeKey(sample.decodeTime(), sample.presentationTime());
    trackBuffer.decodeQueue.insert(DecodeOrderSampleMap::MapType::value_type(decodeKey, &sample));
}

decodeTimestamp == lastEnqueuedDecodeEndTime == 10, so the condition is true
and 10 is added to the decodeQueue.

decodeQueue == trackBuffer.samples == [0, 1, 2, 3, 7, 8, 9, 10]
lastEnqueuedDecodeEndTime == 4

At the end of the append, provideMediaData() is called. The loop enqueues the
first 4 frames successfully. In the next iteration it reaches DTS=7 and breaks
the loop in the condition we're discussing.

The problem I see is that DTS=7 is still removed from the decodeQueue because
the condition is after the `.erase()`. The next frames (8 to 10) remain in the
decodeQueue, at least until the next call. I'm not the first person to point
this problem: https://bugs.webkit.org/show_bug.cgi?id=180643