Summary: | [MSE] SourceBuffer sample time increment vs. last frame duration check is broken | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | up | ||||||||||||||
Component: | Media | Assignee: | Nobody <webkit-unassigned> | ||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||
Severity: | Blocker | CC: | commit-queue, eric.carlson, jer.noble, lietz, webkit-bug-importer | ||||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||||
Version: | Safari Technology Preview | ||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||
OS: | All | ||||||||||||||||
Attachments: |
|
Description
up
2019-02-16 09:06:35 PST
@up, are you describing “decode time stamp” with “dts” or “display time stamp”? Also, decode duration or display duration? Also, we have a pretty capable “mock” MSE implementation for running layout tests that isn’t dependent on any particular demuxer/decoder. I wonder if we could translate your examples into a LayoutTest and verify that it is indeed failing the way you describe. FWIW, for audio samples, we have a way of “splitting” a large sample group into smaller ones for the purposes of overlap avoidance. So the “large sample” issue should be less of a problem in practice. FWIW #2, other container formats (most notably Matroshka) are much less rigorous than ISO BMFF when it comes to timing info. @up, do you have a public ally available trstcase that demonstrates the issue? @jer Thank you for the fast response. I was referring to decode time and decode duration, as this is what is checked in this case. dts vs. pts does not really matter, or is a more special question, because for video without frame reordering, e.g. H.264 Baseline Profile, dts and pts are the same, even if there is a constant offset sometimes, and will break just as described. Sorry, I'm not familiar (yet) with the tests in WebKit. I have seen the mock source, but not 100% sure how it works. Can somebody assist to model the first frame sequence I have described in a mock source test? Regarding a test case, I will try to provide one, but as I said the issue can be understood logically and that's what I tried to explain in the report. I suggest to focus on creating the mock test first. Regarding the split up of large audio sample groups, what are the limits that will cause the split up? Problem is that especially in low latency live streaming use cases, the fMP4 fragments are kept pretty small, might be even video frame length. Really appreciating your help. (In reply to up from comment #5) > @jer Thank you for the fast response. I was referring to decode time and > decode duration, as this is what is checked in this case. dts vs. pts does > not really matter, or is a more special question, because for video without > frame reordering, e.g. H.264 Baseline Profile, dts and pts are the same, > even if there is a constant offset sometimes, and will break just as > described. > > Sorry, I'm not familiar (yet) with the tests in WebKit. I have seen the mock > source, but not 100% sure how it works. Can somebody assist to model the > first frame sequence I have described in a mock source test? Sure, here's a representative sort of test case for this kind of test: https://trac.webkit.org/browser/webkit/trunk/LayoutTests/media/media-source/media-source-dropped-iframe.html Which has the expected results of: https://trac.webkit.org/browser/webkit/trunk/LayoutTests/media/media-source/media-source-dropped-iframe-expected.txt You can see the interface to "makeAInitSegment()" and "makeASample()" here: https://trac.webkit.org/browser/webkit/trunk/LayoutTests/media/media-source/mock-media-source.js From your initial example, you would need something like: var samples = concatenateSamples( makeASample(0, 0, 10, 100, 1, SAMPLE_FLAG.SYNC, 1), makeASample(10, 10, 90, 100, 1, SAMPLE_FLAG.SYNC, 1), makeASample(100, 100, 10, 100, 1, SAMPLE_FLAG.SYNC, 1), ); ..for a sample sequence of DTS,Duration [{0,10}, {10, 90}, {100, 10}], all with a timebase of 100 where each sample was an I-frame. > Regarding a test case, I will try to provide one, but as I said the issue > can be understood logically and that's what I tried to explain in the > report. > I suggest to focus on creating the mock test first. > > Regarding the split up of large audio sample groups, what are the limits > that will cause the split up? Problem is that especially in low latency live > streaming use cases, the fMP4 fragments are kept pretty small, might be even > video frame length. I believe in theory that on platforms using CoreMedia (i.e., macOS and iOS), a composite audio sample can be divided up to the sample rate. So a composite sample with a DTS,Duration of {0,44100} and a timescale of 44100 (which is also the audio track's sample rate) could be cleanly divided into two samples of {0, 12345} and {12345, 31755}. (Note that this will probably never line up perfectly with the timescale of a video track, whose own timebases tend to be not be based on the audio track's sample rate.) Created attachment 362395 [details]
Test case file for the reported issue
Created attachment 362396 [details]
Expected output file for the reported issue
Created attachment 362398 [details]
Correct test case file for the reported issue
@jer I have created a simple test case and expected output file. The expectation is that none of the added samples gets dropped. The case will fail with current webkit version due to the describe issue. Sample 3 and all following samples will get dropped until the next sync sample, which might take multiple seconds and might repeat for the next group of pictures/samples. @jer What's the best way to proceed? I can propose a patch that will maintain the current logic as far as possible while fixing the issue.? (In reply to up from comment #11) > @jer What's the best way to proceed? I can propose a patch that will > maintain the current logic as far as possible while fixing the issue.? @up, sure, that'd be fine! When you propose a patch, the EWS (early warning system) bots will build the patch and run all the existing tests, including all the MSE ones to ensure that there are no regressions to previous fixes. Created attachment 362766 [details]
Patch Rev1
@jer I have added a patch. Code only, so without test case files and change log yet. I will add more comments on the idea of the patch. Is this ok for now, or should I submit the complete patch already before review?
(In reply to up from comment #13) > Created attachment 362766 [details] > Patch Rev1 > > @jer I have added a patch. Code only, so without test case files and change > log yet. I will add more comments on the idea of the patch. Is this ok for > now, or should I submit the complete patch already before review? @up I think what you've done so far is fine. You can signal that your patch is ready for review by setting the "r?" flag on the attachment, but attaching "work in progress" patches like yours is perfectly acceptable. Next step is to force the EWS bots to try to apply, build, and test your patch. The EWS bots will automatically do so for any "r?" patch, but for "work in progress" patches, you have to do that manually by clicking the "Submit for EWS analysis" button next to the attachment. Created attachment 362781 [details]
Patch Rev2
@jer Patch Rev2 has passed the EWS. Intentions: - Keeping changes local, compact and clear - Avoiding to modify the state of the SourceBuffer object - Not mixing time differences and durations in greatestDecodeDuration in general - Avoiding to make the final conditional statement more complex and nested - Respecting the current greatestDecodeDuration approach and complementing it only as far as required - Fixing the issue while not breaking out-of-order appends This looks like a good approach. For Rev 3, please add the tests you attached to this bug in LayoutTests/media/media-source, and generate a ChangeLog using 'prepare-ChangeLog -b 194747', and I'll review it. @jer Thanks, I will do so. Created attachment 363004 [details]
Patch Rev3
Comment on attachment 363004 [details]
Patch Rev3
r=me. If you'd like me (or any other WebKit contributor) to land it on your behalf, please set the 'cq' flag to 'cq?'.
@jer Thank you! Patch Rev3 passed the EWS. I have set the commit-queue flag for it to ?. Comment on attachment 363004 [details] Patch Rev3 Clearing flags on attachment: 363004 Committed r242129: <https://trac.webkit.org/changeset/242129> All reviewed patches have been landed. Closing bug. (In reply to up from comment #21) > @jer Thank you! Patch Rev3 passed the EWS. I have set the commit-queue flag > for it to ?. @up thanks for finding, reporting and fixing this! Thank you all, and especially @jer for the support and guidance! @jer Hi, just checked the recent STP 77 branch. As it seems to be branched a bit back in time this patch is not included. Is there something else required from our side, except patience, to hopefully find it in the next STP(s) and the next Safari OSX release based on recent STPs? @up, this change will automatically get picked up in the next STP, so there patience is all that’s required. It will also get picked up in a future Safari release, and I will ping this thread once that release is available. @jer Seems like the issue has been released in Safari 12.1 without this patch. Hello everybody, we are excited to learn how the patch we proposed might land in the Safari releases. We noticed that Safari 12.1 still has the bug we worked on. This has a severe impact on our live streaming business applications. Is there a way to find out the ETA for getting this into production? It would be great to hear back about this. Please pm or mail me. Thank you! Oliver Lietz Founder/CEO nanocosmos |