In WebKit's SourceBuffer design, MediaSamples can handle plural frames but the eviction process seemingly expects that a MediaSample should be linked to one frame. With this precondition, some unnecessary MediaSamples in SourceBuffer are not deleted when overlapped MediaSamples are appended. When overlapped MediaSamples are composed of multiple frame data with DTS order and the following Condition A) or Condition B) is valid, the overlapped MediaSamples are not deleted; such MediaSamples are leaked. Condition A): A MediaSample's PTS + its duration exceeds the next MediaSample's PTS Condition B): The I-Frame start position of a newly appended MediaSample doesn't match the I-Frame start position of a MediaSample which is already buffered. To make the SourceBuffer eviction algorithm more general, the eviction process should check the end of MediaSample instead of presentation timestamp (PTS).
Created attachment 421614 [details] patch
Comment on attachment 421614 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=421614&action=review Happy to see your first patch here! I'll hunt some reviewers for you that should be cced to this bug. The EWS is red because the Mac bots are not passing the test you added so that needs to be resolved before this can land. > Source/WebCore/ChangeLog:8 > + Reviewed by NOBODY (OOPS!). > + > + Test: media/media-source/media-source-append-overlapping-mediasample-with-multiple-samples.html You did a very descriptive bug report so you should include some of that language in here to communicate the change.
(In reply to Don Olmstead from comment #2) > Comment on attachment 421614 [details] > The EWS is red because the Mac bots are not passing the test you added so > that needs to be resolved before this can land. Looks like the macOS EWS bots are unhappy about the capitalization of "bufferedSamplesForTrackID", which should be "bufferedSamplesForTrackId".
Comment on attachment 421614 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=421614&action=review > Source/WebCore/platform/graphics/SourceBufferPrivate.cpp:1141 > - // If highest presentation timestamp for track buffer is set and less than or equal to presentation timestamp > - if (trackBuffer.highestPresentationTimestamp.isValid() && trackBuffer.highestPresentationTimestamp - contiguousFrameTolerance <= presentationTimestamp) { > + // If highest presentation timestamp for track buffer is set and less than frame end timestamp. > + if (trackBuffer.highestPresentationTimestamp.isValid() && trackBuffer.highestPresentationTimestamp - contiguousFrameTolerance < frameEndTimestamp) { This text comes directly from the MSE standard, and it looks like it has been changed since this comment was written. The new text of "3.5.8 Coded Frame Processing" step 1.14, part 2 says: > If highest end timestamp for track buffer is set and less than or equal to presentation timestamp: It looks like we do correctly set "highestPresentationTimestamp" to the "frame end timestamp" later in step 1.19, but I do wonder if there were other changes to the specification around "frame end timestamp" that may have been missed. So your proposed change would be a willful departure from the text of the specification. Should this be brought up to the MSE spec authors first? Or is there another way to solve the issue you're attempting to fix?
(In reply to Jer Noble from comment #4) > Comment on attachment 421614 [details] > patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=421614&action=review > > This text comes directly from the MSE standard, and it looks like it has > been changed since this comment was written. The new text of "3.5.8 Coded > Frame Processing" step 1.14, part 2 says: > > > If highest end timestamp for track buffer is set and less than or equal to presentation timestamp: Looks like that rename happened as part of: https://github.com/w3c/media-source/commit/ae1709ea0b5152febc81b7992d8a0046f1f51ba6
Created attachment 421902 [details] patch Fixed patch
(In reply to Jer Noble from comment #4) > Comment on attachment 421614 [details] > patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=421614&action=review > > > Source/WebCore/platform/graphics/SourceBufferPrivate.cpp:1141 > > - // If highest presentation timestamp for track buffer is set and less than or equal to presentation timestamp > > - if (trackBuffer.highestPresentationTimestamp.isValid() && trackBuffer.highestPresentationTimestamp - contiguousFrameTolerance <= presentationTimestamp) { > > + // If highest presentation timestamp for track buffer is set and less than frame end timestamp. > > + if (trackBuffer.highestPresentationTimestamp.isValid() && trackBuffer.highestPresentationTimestamp - contiguousFrameTolerance < frameEndTimestamp) { > > This text comes directly from the MSE standard, and it looks like it has > been changed since this comment was written. The new text of "3.5.8 Coded > Frame Processing" step 1.14, part 2 says: > > > If highest end timestamp for track buffer is set and less than or equal to presentation timestamp: > > It looks like we do correctly set "highestPresentationTimestamp" to the > "frame end timestamp" later in step 1.19, but I do wonder if there were > other changes to the specification around "frame end timestamp" that may > have been missed. So your proposed change would be a willful departure from > the text of the specification. Should this be brought up to the MSE spec > authors first? Or is there another way to solve the issue you're attempting > to fix? I will ask 'https://github.com/w3c/media-source/issues' about this issue.
Created attachment 421910 [details] patch Fixed patch
Created attachment 422048 [details] patch Fixed patch
<rdar://problem/75059534>
I am discussing this issue at w3c/media-source/issue. https://github.com/w3c/media-source/issues/269
Do you have a sample file or stream that makes use of plural frames?
Sample URL: https://tama.tok.access-company.com/public/WebKit/sample00 In this sample, 0s-6s video's MediaSamples overlap.
> If highest end timestamp for track buffer is set and less than or equal to presentation timestamp: Currently, I am inquiring about the description of this MSE spec in the following issue, but the MSE spec authors has not responded for 3 weeks. https://github.com/w3c/media-source/issues/269 Please let me know if there is a good way to solve this issue.
Thank you for your cooperation. https://bugs.webkit.org/show_bug.cgi?id=226481#c9 >A MediaSample can only contain frames that have their pts only ever monotonically increasing. >MediaSample was only ever designed to contained multiple *audio* frames ; not video. I understand the specifications of MediaSample. For video frames, multiple frames are not packaged into the same MediaSample. We made a mistake in the design for MediaSample So this ticket is closed as INVALID.