Bug 190085

Summary: [MSE] Use tolerance when growing the coded frame group
Product: WebKit Reporter: Alicia Boya García <aboya>
Component: MediaAssignee: Alicia Boya García <aboya>
Status: RESOLVED FIXED    
Severity: Normal CC: aperez, commit-queue, ews-watchlist, jer.noble, mcatanzaro, rniwa, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Archive of layout-test-results from ews100 for mac-sierra
none
Archive of layout-test-results from ews104 for mac-sierra-wk2
none
Patch
none
Patch aboya: commit-queue+

Description Alicia Boya García 2018-09-28 13:03:38 PDT
This patch introduces a millisecond tolerance in the range of
potential frames that should be erased frame from the track buffer
when the coded frame group is growing.

This is necessary because some files have imprecise overlapping
timestamps (especially WebM files).

This fixes a stall when seeking back and forth in YouTube with WebM
video.

A test case simulating the problem with video/mock using timestamps
similar to those of a typical 30 fps WebM video is also added.
Comment 1 Alicia Boya García 2018-09-28 16:42:14 PDT
Created attachment 351139 [details]
Patch
Comment 2 EWS Watchlist 2018-09-28 18:01:00 PDT
Comment on attachment 351139 [details]
Patch

Attachment 351139 [details] did not pass mac-ews (mac):
Output: https://webkit-queues.webkit.org/results/9388444

New failing tests:
imported/w3c/web-platform-tests/media-source/mediasource-config-change-mp4-av-audio-bitrate.html
imported/w3c/web-platform-tests/media-source/mediasource-config-change-mp4-av-video-bitrate.html
Comment 3 EWS Watchlist 2018-09-28 18:01:02 PDT
Created attachment 351152 [details]
Archive of layout-test-results from ews100 for mac-sierra

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews100  Port: mac-sierra  Platform: Mac OS X 10.12.6
Comment 4 Alicia Boya García 2018-09-28 18:24:18 PDT
The bot failures don't give me very useful information... assert_unreached: Unexpected event 'error'

I can't reproduce the issue either in my platform. Any help on what's happening there?
Comment 5 EWS Watchlist 2018-09-28 18:53:20 PDT
Comment on attachment 351139 [details]
Patch

Attachment 351139 [details] did not pass mac-wk2-ews (mac-wk2):
Output: https://webkit-queues.webkit.org/results/9388966

New failing tests:
imported/w3c/web-platform-tests/media-source/mediasource-config-change-mp4-av-audio-bitrate.html
Comment 6 EWS Watchlist 2018-09-28 18:53:22 PDT
Created attachment 351158 [details]
Archive of layout-test-results from ews104 for mac-sierra-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-sierra-wk2  Platform: Mac OS X 10.12.6
Comment 7 Jer Noble 2018-10-01 13:30:05 PDT
(In reply to Alicia Boya García from comment #4)
> The bot failures don't give me very useful information... assert_unreached:
> Unexpected event 'error'

I'm guessing that the media element has thrown an "error" event. This is likely because this change introduced behavior that broke the decoder path.

If there are idiosyncrasies in WebM's timestamps, I think this is a bug that is better fixed at the parser level, not in SourceBuffer.  E.g., if WebM can't produce timestamps (accurately) with sub-millisecond precision, you may be able to work around this by truncating all WebM timestamps to 1ms, rather than introduce 1ms of "slop" for every parser.
Comment 8 Alicia Boya García 2018-10-02 01:54:20 PDT
(In reply to Jer Noble from comment #7)
> (In reply to Alicia Boya García from comment #4)
> > The bot failures don't give me very useful information... assert_unreached:
> > Unexpected event 'error'
> 
> I'm guessing that the media element has thrown an "error" event. This is
> likely because this change introduced behavior that broke the decoder path.

Yes, this is what I was able to know looking at the code.

> If there are idiosyncrasies in WebM's timestamps, I think this is a bug that
> is better fixed at the parser level, not in SourceBuffer.  E.g., if WebM
> can't produce timestamps (accurately) with sub-millisecond precision, you
> may be able to work around this by truncating all WebM timestamps to 1ms,
> rather than introduce 1ms of "slop" for every parser.

That's my alternative plan. I would rather do this in SourceBuffer though because sloppiness is not completely WebM specific (althout most of it happens in WebM).
Comment 9 Jer Noble 2018-10-02 09:18:19 PDT
Alicia, do you have a link to the failing fMP4 content? I'd like to see if it fails the same way on our parser.
Comment 10 Alicia Boya García 2018-10-02 09:37:06 PDT
(In reply to Jer Noble from comment #9)
> Alicia, do you have a link to the failing fMP4 content? I'd like to see if
> it fails the same way on our parser.

https://github.com/youtube/js_mse_eme/issues/29
Comment 11 Jer Noble 2018-10-02 09:42:03 PDT
(In reply to Alicia Boya García from comment #10)
> (In reply to Jer Noble from comment #9)
> > Alicia, do you have a link to the failing fMP4 content? I'd like to see if
> > it fails the same way on our parser.
> 
> https://github.com/youtube/js_mse_eme/issues/29

YouTube has a significant number of broken muxes which they clean up in JS before feeding into Safari. I wonder if they need to enable these same fixes for non-Safari WebKit based browsers.
Comment 12 Alicia Boya García 2018-10-02 10:25:29 PDT
(In reply to Jer Noble from comment #11)
> (In reply to Alicia Boya García from comment #10)
> > (In reply to Jer Noble from comment #9)
> > > Alicia, do you have a link to the failing fMP4 content? I'd like to see if
> > > it fails the same way on our parser.
> > 
> > https://github.com/youtube/js_mse_eme/issues/29
> 
> YouTube has a significant number of broken muxes which they clean up in JS
> before feeding into Safari.

That's a horrifying hack.

So far, I've only seen this issue in a YouTube test. It would be a bit strange that they force implementors trying to comply with the YouTube test suite to handle these broken files while at the same time they patch them on runtime in the actual YouTube, but I would not discard it.

Have you look at the failure? For some reason it seems to happen only with specific versions of OS X.
Comment 13 Michael Catanzaro 2018-10-17 15:53:19 PDT
I'm not comfortable with this being fixed for 2.22 but not on trunk. We're going to forget and it will come back to bit us in 2.24.
Comment 14 Alicia Boya García 2018-10-17 17:16:11 PDT
We are working on the bug that was exposed by this patch.

https://bugs.webkit.org/show_bug.cgi?id=190590

Once that bug is out, this patch can be sent to EWS again.
Comment 15 Alicia Boya García 2018-10-23 09:40:58 PDT
Created attachment 352979 [details]
Patch
Comment 16 Alicia Boya García 2018-10-23 16:38:37 PDT
Yay, indeed after fixing the bug linked above the patch passes EWS.
Comment 17 Michael Catanzaro 2018-10-30 11:23:37 PDT
(In reply to Michael Catanzaro from comment #13)
> I'm not comfortable with this being fixed for 2.22 but not on trunk. We're
> going to forget and it will come back to bit us in 2.24.

Still nervous about this. Who is supposed to review it?
Comment 18 Jer Noble 2018-10-30 13:19:20 PDT
Comment on attachment 352979 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=352979&action=review

r=me with nit.

> LayoutTests/media/media-source/media-source-append-acb-tolerance.html:39
> +    function makeAWebMLikeSample(presentationTimeMilliseconds, durationMicroseconds, trackID, flags, generation) {
> +        // The original makeASample() function uses millisecond precision for all the fields, but for this test we need
> +        // to test what happens with overlaps smaller than that.
> +
> +        var byteLength = 30;
> +        var buffer = new ArrayBuffer(byteLength);
> +        var array = new Uint8Array(buffer);
> +        array.set(stringToArray('smpl'));
> +
> +        var view = new DataView(buffer);
> +        view.setUint32(4, byteLength, true);
> +
> +        var timeScale = 1e6;
> +        view.setInt32(8, timeScale, true);
> +        view.setInt32(12, presentationTimeMilliseconds * 1000, true);
> +        view.setInt32(16, presentationTimeMilliseconds * 1000, true);
> +        view.setInt32(20, durationMicroseconds, true);
> +        view.setInt32(24, trackID, true);
> +        view.setUint8(28, flags);
> +        view.setUint8(29, generation);
> +
> +        return buffer
> +    }

mock-media-source.js's makeASample() now takes a timeScale for the last parameter (after generation), so you can likely remove this method...

> LayoutTests/media/media-source/media-source-append-acb-tolerance.html:54
> +            const ptsMilliseconds = Math.round(1000 * (start + numFrame / fps));
> +            const durationMicroseconds = 33333;
> +            samples.push(makeAWebMLikeSample(ptsMilliseconds, durationMicroseconds, 1, numFrame === 0 ? SAMPLE_FLAG.SYNC : SAMPLE_FLAG.NONE));

...And update this section to use makeASample().
Comment 19 Michael Catanzaro 2018-10-30 19:22:05 PDT
Thanks Jer!
Comment 20 Alicia Boya García 2018-10-31 04:32:43 PDT
Comment on attachment 352979 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=352979&action=review

>> LayoutTests/media/media-source/media-source-append-acb-tolerance.html:39
>> +    }
> 
> mock-media-source.js's makeASample() now takes a timeScale for the last parameter (after generation), so you can likely remove this method...

The problem with makeASample(), even if it accepts a timescale argument, is that it forces you to pass timestamps in seconds, incurring in unsafe float division and multiplication and rounding errors. Now, those could (probably) be amended with rounding instead of truncating when calling setInt*(), but that's not easy to check if that's safe either.

I think it would be better if we split makeASample() into makeASample() and makeASampleWithTimeScale(), the later using integer timescale units instead of (float) seconds.
Comment 21 Jer Noble 2018-10-31 08:36:17 PDT
(In reply to Alicia Boya García from comment #20)
> Comment on attachment 352979 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=352979&action=review
> 
> >> LayoutTests/media/media-source/media-source-append-acb-tolerance.html:39
> >> +    }
> > 
> > mock-media-source.js's makeASample() now takes a timeScale for the last parameter (after generation), so you can likely remove this method...
> 
> The problem with makeASample(), even if it accepts a timescale argument, is
> that it forces you to pass timestamps in seconds, incurring in unsafe float
> division and multiplication and rounding errors. Now, those could (probably)
> be amended with rounding instead of truncating when calling setInt*(), but
> that's not easy to check if that's safe either.
> 
> I think it would be better if we split makeASample() into makeASample() and
> makeASampleWithTimeScale(), the later using integer timescale units instead
> of (float) seconds.

There's not that many tests; we could just update makeASample to take media time values and time scales rather than seconds, and change all the tests that use makeASample.
Comment 22 Alicia Boya García 2018-10-31 13:18:30 PDT
(In reply to Jer Noble from comment #21)
> There's not that many tests; we could just update makeASample to take media
> time values and time scales rather than seconds, and change all the tests
> that use makeASample.

OK, there you have: https://bugs.webkit.org/show_bug.cgi?id=191128
Comment 23 Alicia Boya García 2018-10-31 14:38:29 PDT
Created attachment 353534 [details]
Patch
Comment 24 WebKit Commit Bot 2018-10-31 15:05:59 PDT
Comment on attachment 353534 [details]
Patch

Clearing flags on attachment: 353534

Committed r237657: <https://trac.webkit.org/changeset/237657>
Comment 25 WebKit Commit Bot 2018-10-31 15:06:01 PDT
All reviewed patches have been landed.  Closing bug.
Comment 26 Radar WebKit Bug Importer 2018-10-31 15:06:28 PDT
<rdar://problem/45713452>