Bug 226481 - [MSE] When mediaSamples with the same PTS and DTS are received consecutively, the mediaSample received later is discarded.
Status: RESOLVED INVALID
Alias: None
Product: WebKit
Classification: Unclassified
Component: Media
Version: WebKit Nightly Build
Hardware: Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-05-31 22:15 PDT by Toshio Ogasawara
Modified: 2022-02-10 16:38 PST
CC: 12 users

See Also:


Attachments
patch (6.18 KB, text/plain)
2021-05-31 22:44 PDT, Toshio Ogasawara
no flags
patch (6.18 KB, text/plain)
2021-05-31 23:41 PDT, Toshio Ogasawara
no flags
patch (6.27 KB, patch)
2021-06-01 01:57 PDT, Toshio Ogasawara
no flags

Description Toshio Ogasawara 2021-05-31 22:15:57 PDT
When mediaSamples with the same PTS and DTS are received consecutively, SourceBufferPrivate::didReceiveSample() discards the mediaSample received later.
Remove the mediaSample received earlier so that the mediaSample received later takes priority.
Comment 1 Toshio Ogasawara 2021-05-31 22:44:17 PDT
Created attachment 430231 [details]
patch
Comment 2 Toshio Ogasawara 2021-05-31 23:41:19 PDT
Created attachment 430233 [details]
patch

Fixed patch
Comment 3 Toshio Ogasawara 2021-06-01 01:57:16 PDT
Created attachment 430239 [details]
patch

Fixed patch
Comment 4 Jer Noble 2021-06-01 13:22:09 PDT
I'm not sure I understand; how did multiple samples with the same PTS & DTS get put in the trackBuffer without evicting previously appended samples?
Comment 5 Toshio Ogasawara 2021-06-01 21:13:24 PDT
>+        // When there are MediaSamples in track buffer which have the same PTS and DTS, all of them except for the last MediaSample should be removed.
>+        if (trackBuffer.lastDecodeTimestamp.isValid() && decodeTimestamp == trackBuffer.lastDecodeTimestamp) {
>+            auto iterPair = trackBuffer.samples.presentationOrder().findSamplesBetweenPresentationTimes(presentationTimestamp, frameEndTimestamp);
>+            if (iterPair.first != trackBuffer.samples.presentationOrder().end())
>+                erasedSamples.addRange(iterPair.first, iterPair.second);
>+        }
>+
>         // 1.15 Remove decoding dependencies of the coded frames removed in the previous step:
>         DecodeOrderSampleMap::MapType dependentSamples;
>         if (!erasedSamples.empty()) {

Previously added samples are collected into a range with erasedSamples.addRange() and removed in the next step, "1.15 Remove decoding dependencies of the coded frames removed in the previous step:".
Samples with the same PTS and DTS get put into the trackBuffer by first removing the previously added samples.
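The eviction described above can be sketched with simplified stand-in types (these are not WebKit's actual classes; findSamplesBetween and eraseOverlappingSamples are hypothetical analogues of findSamplesBetweenPresentationTimes() and erasedSamples.addRange() followed by removal):

```cpp
// Simplified model of evicting previously appended samples whose
// presentation range [pts, frameEnd) overlaps the incoming sample.
#include <cassert>
#include <iterator>
#include <map>
#include <utility>

struct Sample { double duration; bool isSync; int resolution; };

// Stand-in for trackBuffer.samples.presentationOrder(): keyed by PTS.
using PresentationOrder = std::map<double, Sample>;

// Hypothetical analogue of findSamplesBetweenPresentationTimes():
// iterators spanning samples with PTS in [pts, frameEnd).
std::pair<PresentationOrder::iterator, PresentationOrder::iterator>
findSamplesBetween(PresentationOrder& order, double pts, double frameEnd)
{
    return { order.lower_bound(pts), order.lower_bound(frameEnd) };
}

// Mirrors erasedSamples.addRange() plus the subsequent removal step:
// erase every overlapping sample, returning how many were evicted.
size_t eraseOverlappingSamples(PresentationOrder& order, double pts, double frameEnd)
{
    auto [first, last] = findSamplesBetween(order, pts, frameEnd);
    size_t count = std::distance(first, last);
    order.erase(first, last);
    return count;
}
```

With one 720p sample buffered at PTS 0 and duration 3, appending a 480p sample with the same PTS and frame end 3 would first evict the old sample, letting the later one take its place.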
Comment 6 Toshio Ogasawara 2021-06-02 01:09:24 PDT
(In reply to Jer Noble from comment #4)
> I'm not sure I understand; how did multiple samples with the same PTS & DTS
> get put in the trackBuffer without evicting previously appended samples?

For example, if the resolution switches during video playback, segment data with a different resolution but the same PTS and DTS at the beginning may be appended.
When a MediaSample is created from multiple frames, the PTS and DTS of the first frame become the PTS and DTS of the MediaSample.
In this case, MediaSamples with the same PTS and DTS are appended in succession.
Comment 7 Radar WebKit Bug Importer 2021-06-07 22:16:17 PDT
<rdar://problem/78983232>
Comment 8 Toshio Ogasawara 2021-06-10 02:30:00 PDT
If a MediaSample is created from a single video frame, as in the GTK port, this issue does not occur.
I created MediaSamples from multiple video frames in a port I developed myself.

Example)
* "MediaSample = one frame"
ReceiveSample
  MediaSample-00:[frame00 PTS:0 DTS:0 Duration:1 Resolution:720 I-frame]
  MediaSample-01:[frame01 PTS:1 DTS:1 Duration:1 Resolution:720 P-frame]
  MediaSample-02:[frame02 PTS:2 DTS:2 Duration:1 Resolution:720 B-frame]

ReceiveSample
  MediaSample-03:[frame03 PTS:0 DTS:0 Duration:1 Resolution:480 I-frame]
  MediaSample-04:[frame04 PTS:1 DTS:1 Duration:1 Resolution:480 P-frame]
  MediaSample-05:[frame05 PTS:2 DTS:2 Duration:1 Resolution:480 B-frame]

* "MediaSample = multiple frames"
ReceiveSample
  MediaSample-00:[PTS:0 DTS:0 Duration:3 Resolution:720 I-frame
                  (frame00 PTS:0 DTS:0 Duration:1 Resolution:720 I-frame,
                   frame01 PTS:1 DTS:1 Duration:1 Resolution:720 P-frame,
                   frame02 PTS:2 DTS:2 Duration:1 Resolution:720 B-frame)]

ReceiveSample
  MediaSample-01:[PTS:0 DTS:0 Duration:3 Resolution:480 I-frame
                  (frame03 PTS:0 DTS:0 Duration:1 Resolution:480 I-frame,
                   frame04 PTS:1 DTS:1 Duration:1 Resolution:480 P-frame,
                   frame05 PTS:2 DTS:2 Duration:1 Resolution:480 B-frame)]

As shown above, didReceiveSample never receives MediaSamples with the same PTS and DTS consecutively when "MediaSample = one frame".
It may receive MediaSamples with the same PTS and DTS consecutively when "MediaSample = multiple frames".
Comment 9 Jean-Yves Avenard [:jya] 2021-06-15 03:52:48 PDT
(In reply to Toshio Ogasawara from comment #8)
> If you create a MediaSample with one frame of video, such as a GTK port, you
> will not encounter this issue.
> I created a mediasample from multiple frames of video on my own port I
> developed.

A MediaSample can only contain frames whose PTS is monotonically increasing.

MediaSample was only ever designed to contain multiple *audio* frames, not video.
Comment 10 Jean-Yves Avenard [:jya] 2021-06-15 04:01:33 PDT
Comment on attachment 430239 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=430239&action=review

> Source/WebCore/platform/graphics/SourceBufferPrivate.cpp:1165
> +        if (trackBuffer.lastDecodeTimestamp.isValid() && decodeTimestamp == trackBuffer.lastDecodeTimestamp) {

This code is fundamentally incorrect. A DTS value is meaningless on its own; its only purpose is to determine the order in which frames should be decoded.
It should never be used to determine which frame should be removed or where it should be inserted.
There's no guarantee that a stream will use the same timeframe from one sample to another. The only requirement is that DTS values are monotonically increasing.

Only the presentation timestamp is relevant and should be used to determine what should be removed and where.

When a sample at a given PTS is to be removed, the entire GOP (group of pictures) should be removed from that sample forward.
E.g. you remove all samples from that sample until the next keyframe is found.

The concept is touched on in MSE's Coded Frame Removal algorithm
https://w3c.github.io/media-source/#dfn-coded-frame-removal
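The GOP-removal idea from the Coded Frame Removal algorithm can be sketched as follows. The types are simplified stand-ins (a map keyed by PTS with a sync-sample flag), not WebKit's actual classes:

```cpp
// Remove a sample and the rest of its GOP: everything from the sample at
// `pts` forward, up to (but not including) the next keyframe.
#include <cassert>
#include <iterator>
#include <map>

struct Sample { bool isSync; };
using PresentationOrder = std::map<double, Sample>; // keyed by PTS

size_t eraseGopFrom(PresentationOrder& order, double pts)
{
    auto first = order.lower_bound(pts);
    if (first == order.end())
        return 0;
    // Extend the range past every dependent (non-keyframe) sample.
    auto last = std::next(first);
    while (last != order.end() && !last->second.isSync)
        ++last;
    size_t count = std::distance(first, last);
    order.erase(first, last);
    return count;
}
```

Removing the keyframe at PTS 0 then also drops the dependent frames at PTS 1 and 2, but stops at the next keyframe at PTS 3.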
Comment 11 Toshio Ogasawara 2021-06-15 23:55:37 PDT
(In reply to Jean-Yves Avenard [:jya] from comment #9)
> (In reply to Toshio Ogasawara from comment #8)
> > If you create a MediaSample with one frame of video, such as a GTK port, you
> > will not encounter this issue.
> > I created a mediasample from multiple frames of video on my own port I
> > developed.
> 
> A MediaSample can only contain frames that have their pts only ever
> monotonically increasing.
> 
> MediaSample was only ever designed to contained multiple *audio* frames ;
> not video.

Thank you for explaining the design of MediaSample.
We made a mistake in our design around MediaSample.
So this ticket is closed as INVALID.