Bug 209119

Summary: [GStreamer][MSE] Youtube 'live stream'/H264 URLs fail to play, VP8/9 URLs play OK
Product: WebKit Reporter: Chris <cbillington>
Component: WebKitGTKAssignee: Philippe Normand <pnormand>
Status: RESOLVED FIXED    
Severity: Normal CC: aboya, bugs-noreply, calvaris, cbillington, cgarcia, eocanha, eric.carlson, ews-watchlist, glenn, gustavo, jer.noble, menard, philipj, pnormand, sergio, vjaquez
Priority: P2    
Version: Other   
Hardware: PC   
OS: Linux   
See Also: https://bugs.webkit.org/show_bug.cgi?id=206873
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch none

Description Chris 2020-03-14 21:32:30 PDT
Arch Linux x86_64
Since updating to Gnome 3.36/Epiphany(Web) 3.36  Epiphany doesn't play H264 Youtube videos. 
Eventually YT message appears 'If playback doesn't begin shortly, try restarting your device'

These are usually 'live streams', since they are not often streamed using the YT-preferred VP8/9 codec.

An example is https://www.youtube.com/watch?v=9Auq9mYxFEE (Sky UK News live stream): 

'Stats for nerds' show the video codec as 'avc1.4d401f(136). Other live streams using this codec also fail. 

Tested using Epiphany/Gnome Web and also Webkit2gtk MiniBrowser (with default User Agent string) on two different systems with different Intel graphics carss.

Previously these kind of YT URLs worked with Epiphany 3.34/Webkit2gtk 2.26, though with noticeable video decoding delay relative to audio.
Comment 1 Chris 2020-03-14 21:35:49 PDT
$ youtube-dl -F https://www.youtube.com/watch?v=9Auq9mYxFEE
[youtube] 9Auq9mYxFEE: Downloading webpage
[youtube] 9Auq9mYxFEE: Downloading m3u8 information
[youtube] 9Auq9mYxFEE: Downloading MPD manifest
[info] Available formats for 9Auq9mYxFEE:
format code  extension  resolution note
91           mp4        256x144    HLS  197k , avc1.42c00b, 30.0fps, mp4a.40.5@ 48k
92           mp4        426x240    HLS  338k , avc1.4d4015, 30.0fps, mp4a.40.5@ 48k
93           mp4        640x360    HLS  829k , avc1.4d401e, 30.0fps, mp4a.40.2@128k
94           mp4        854x480    HLS 1380k , avc1.4d401f, 30.0fps, mp4a.40.2@128k
95           mp4        1280x720   HLS 2593k , avc1.4d401f, 30.0fps, mp4a.40.2@256k
96           mp4        1920x1080  HLS 4715k , avc1.640028, 30.0fps, mp4a.40.2@256k (best)
Comment 2 Philippe Normand 2020-03-15 03:05:47 PDT
Hi Chris, thanks for the bug-report. I confirm this is indeed a regression in 2.28. The stream plays fine in 2.26.
Comment 3 Philippe Normand 2020-03-19 09:34:32 PDT
Created attachment 393986 [details]
Patch
Comment 4 Xabier Rodríguez Calvar 2020-03-19 10:03:59 PDT
Comment on attachment 393986 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=393986&action=review

If Alicia says it's ok, I'm good as well.

> Source/WebCore/platform/graphics/gstreamer/mse/AppendPipeline.cpp:464
> +    static MediaTime decodeTimeOffset = MediaTime::invalidTime();

I think this should be MediaTime m_decodeTimeOffset in AppendPipeline.h.
Comment 5 Alicia Boya García 2020-03-19 11:21:55 PDT
Comment on attachment 393986 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=393986&action=review

This is not the correct solution. In the review I gave some background on the purpose of the code that is being altered by the patch, which we don't want to break, plus some other problems it introduces. In any case, tweaking the timestamps exposed to WebKit or GStreamer elements is not the solution in this case.

I've gave a look to the streams YouTube is serving for live streams. They don't have edit lists or any other MP4 time offseting mechanisms. The problem is somewhere else, and actually rather simple: we're just given a stream that starts at time 13:00:00, not at 0:00:00. But YouTube is also setting currentTime to 13:00:00. We should be able to play that way!

You can confirm this by opening the inspector and querying the buffered ranges and currentTime:

> $("video").buffered.start(0) / 3600
13
> $("video").currentTime / 3600
13.00218702527778

Also I will say this worked fine in my Epiphany 3.34.3 with WebKitGTK 2.26.4 with avc1 and mp4a, while this seems to be broken in my master branch with the same codecs, so watch out for regressions.

> Source/WebCore/platform/graphics/gstreamer/MediaSampleGStreamer.cpp:-98
> -void MediaSampleGStreamer::applyPtsOffset(MediaTime timestampOffset)

For some context: the original purpose of applyPtsOffset() is to ensure the believed first sample of the stream starts in clean PTS boundaries.

This is a hack. We shouldn't have to do this, but because of a bug in qtdemux in gst < 1.16, this is how we have been managing. (The fix https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/-/commit/c2a0da8096009f0f99943f78dc18066965be60f9)

Also, this is only relevant for MP4 streams, Matroska doesn't have this problem. In particular, the problem is qtdemux used to obviate edit lists for all push-mode based streams. For an explanation of why edit lists are a thing and why they are important, I have a section in my media notes: https://github.com/ntrrgc/media-notes/wiki/MP4-Notes#simple-edit-lists

Put simply, edit lists define a mapping between the timestamps the tracks in a MP4 movie are encoded with, and the timestamps these sections of the track are placed in the overall movie... And they are only a thing because of technicalities.

Here is the example from my media notes. Applications like WebKit are interested in Movie DTS and Movie PTS. Unfortunately, before the patch previously mentioned, there was no way to get these (and even now, you have to go through the extra step of converting to stream time to get them, which, for consistency between supported GStreamer runtimes, we're not doing yet).

                     ·---·   ·---·   ·---·
Decode order:        | A |   | C |   | B |
                     ·---·   ·---·   ·---·
Duration:             100     100     100
Track DTS:              0     100     200
Track PTS (AKA CTS)   100     300     200
Movie DTS:           -100       0     100
Movie PTS:              0     200     100

Note Track PTS starts at 100, not 0. This is our biggest problem, as playing starting at zero will fail as no frame with PTS=0 seems to arrive. So, what is the hack we did?

    // Add a gap sample if a gap is detected before the first sample.
    if (mediaSample->decodeTime() == MediaTime::zeroTime() && mediaSample->presentationTime() > MediaTime::zeroTime() && mediaSample->presentationTime() <= MediaTime(1, 10)) {
        GST_DEBUG("Adding gap offset");
        mediaSample->applyPtsOffset(MediaTime::zeroTime());
    }

If we have a frame with DTS=0 (track DTS always starts implicitly at zero), but with PTS > 0 (and not too long), we basically extend the frame to the left, by setting its PTS = 0 and enlarging its duration to compensate.

Note this is intentionally the only use of applyPtsOffset(), and the argument is arguably redundant, since it's always zero. Also you could argue the name is not descriptive, since it's doing more of an extension than an offset. Also remember this is only done *for the first sample* (since that's the only one that can have DTS=0).

void MediaSampleGStreamer::applyPtsOffset(MediaTime timestampOffset)
{
    if (m_pts > timestampOffset) {
        m_duration = m_duration + (m_pts - timestampOffset);
        m_pts = timestampOffset;
    }
}

The consequences of this hack are:

1) First frame is slightly (typically one frame) longer.
2) Because we're not actually playing frames on the correct movie PTS, we can have slight A/V synchronization problems, also in the order of one frame.

None of these are the end of the world, since their effects are too slight to be noticed by users in typical files. Lots of players in the wild have broken edit list handling and happy users.

Also, this fix has a very important property: The resulting PTS and DTS that WebKit keeps don't vary no matter the order of appended media segments. Break that property and you'll have seek bugs eventually, so you should be very careful with adding offsets in MSE code.

>> Source/WebCore/platform/graphics/gstreamer/mse/AppendPipeline.cpp:464
>> +    static MediaTime decodeTimeOffset = MediaTime::invalidTime();
> 
> I think this should be MediaTime m_decodeTimeOffset in AppendPipeline.h.

This should be, if anything, a member variable. Otherwise, wrong offsets would be applied to other AppendPipeline streams, potentially corrupting them.

> Source/WebCore/platform/graphics/gstreamer/mse/AppendPipeline.cpp:466
> +        decodeTimeOffset = mediaSample->decodeTime();

This will break any playback use case where the first appended sample is not from the beginning of the movie.
Comment 6 Philippe Normand 2020-03-20 02:58:56 PDT
(In reply to Alicia Boya García from comment #5)

> 
> Also I will say this worked fine in my Epiphany 3.34.3 with WebKitGTK 2.26.4
> with avc1 and mp4a, while this seems to be broken in my master branch with
> the same codecs, so watch out for regressions.
> 

As mentioned in comment 2 indeed.

> > Source/WebCore/platform/graphics/gstreamer/MediaSampleGStreamer.cpp:-98
> > -void MediaSampleGStreamer::applyPtsOffset(MediaTime timestampOffset)
> 
> For some context: the original purpose of applyPtsOffset() is to ensure the
> believed first sample of the stream starts in clean PTS boundaries.
> 
> This is a hack.

OK but it seems this has nothing to do with the issue at hand. As mentioned in the ChangeLog, the SourceBuffer interprets the huge DTS (not PTS) as a gap and thus exits early from provideMediaData().
Comment 7 Alicia Boya García 2020-03-20 05:31:16 PDT
(In reply to Philippe Normand from comment #6)
> > > Source/WebCore/platform/graphics/gstreamer/MediaSampleGStreamer.cpp:-98
> > > -void MediaSampleGStreamer::applyPtsOffset(MediaTime timestampOffset)
> > 
> > For some context: the original purpose of applyPtsOffset() is to ensure the
> > believed first sample of the stream starts in clean PTS boundaries.
> > 
> > This is a hack.
> 
> OK but it seems this has nothing to do with the issue at hand.

Yeah, it has nothing to do with the issue at hand, but since you were changing that code, I wanted to provide an explanation for why it exists, since it's something I don't consider obvious.

> As mentioned
> in the ChangeLog, the SourceBuffer interprets the huge DTS (not PTS) as a
> gap and thus exits early from provideMediaData().

A huge DTS shouldn't be a problem if currentTime is also huge. Any frame with PTS <= currentTime should be enough to start playback. I wouldn't discard the possibility of a bug in SourceBuffer either. Mock codec tests are useful to test this. See LayoutTests/media/media-source/media-source-append-overlapping-dts.html for an example.
Comment 8 Philippe Normand 2020-03-23 08:18:07 PDT
(In reply to Alicia Boya García from comment #7)
> > As mentioned
> > in the ChangeLog, the SourceBuffer interprets the huge DTS (not PTS) as a
> > gap and thus exits early from provideMediaData().
> 
> A huge DTS shouldn't be a problem if currentTime is also huge. Any frame
> with PTS <= currentTime should be enough to start playback. I wouldn't
> discard the possibility of a bug in SourceBuffer either. Mock codec tests
> are useful to test this. See
> LayoutTests/media/media-source/media-source-append-overlapping-dts.html for
> an example.

OK but as Safari doesn't seem to suffer from this bug, I'm not sure this is an issue in SourceBuffer or any of the cross-platform code... But I'll have a look, anyway...
Comment 9 Enrique Ocaña 2020-03-26 05:57:27 PDT
I've been debugging the issue. This is a case of initial append and playback start at a non-zero position.

What is happening is that the unbuffered gap bailout code added in bug 201323 is preventing the appended data to be automatically enqueed as they are appended. That's not necessarily bad, because a later reenqueue after the seek to the playback start position should enquee the data.

However, the seek is never propagated to MediaSource and SourceBuffer because doSeek() detects that the pipeline is in "async change READY --> PAUSED" and just waits for an hypotetical start (from zero) to finish, but that'll never happen because no sample has been fed to the pipeline.
Comment 10 Enrique Ocaña 2020-04-21 07:23:14 PDT
Created attachment 397079 [details]
Patch
Comment 11 Xabier Rodríguez Calvar 2020-04-21 08:00:59 PDT
Comment on attachment 397079 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=397079&action=review

> Source/WebCore/platform/graphics/gstreamer/mse/MediaPlayerPrivateGStreamerMSE.cpp:264
> +        && (!(state == GST_STATE_PLAYING && newState == GST_STATE_PAUSED)
> +            && !(state == GST_STATE_READY && newState >= GST_STATE_PAUSED)))

"De-morganize" this, please.

> Source/WebCore/platform/graphics/gstreamer/mse/MediaPlayerPrivateGStreamerMSE.cpp:270
> +            && (!(state == GST_STATE_PLAYING && newState == GST_STATE_PAUSED)
> +                && !(state == GST_STATE_READY && newState >= GST_STATE_PAUSED))) {

Ditto.
Comment 12 Alicia Boya García 2020-04-21 08:13:43 PDT
Comment on attachment 397079 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=397079&action=review

Overall LGTM.

> Source/WebCore/platform/graphics/gstreamer/mse/WebKitMediaSourceGStreamer.cpp:738
> +        gst_event_parse_segment(event, const_cast<const GstSegment**>(&originalSegment));

The segment in GST_EVENT_SEGMENT is immutable. Instead of forcing it to mutate with const_casts, try to replace the whole info->data with a new GstEvent* with the desired segment.

Tip: When doing so, it's useful to use gst_event_replace(&info->data, newEvent) so GStreamer unrefs the old one.
Comment 13 Enrique Ocaña 2020-04-21 10:28:10 PDT
Created attachment 397093 [details]
Patch
Comment 14 Xabier Rodríguez Calvar 2020-04-22 03:27:35 PDT
Comment on attachment 397093 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=397093&action=review

> Source/WebCore/platform/graphics/gstreamer/mse/MediaPlayerPrivateGStreamerMSE.cpp:253
> +static bool checkStateChangeShouldDelaySeek(GstStateChangeReturn getStateResult, GstState currentState, GstState newState)
> +{
> +    if (getStateResult != GST_STATE_CHANGE_ASYNC)
> +        return false;
> +    if (GST_STATE_TRANSITION(currentState, newState) == GST_STATE_CHANGE_PLAYING_TO_PAUSED)
> +        return false;
> +    if (currentState == GST_STATE_READY && newState >= GST_STATE_PAUSED)
> +        return false;
> +    return true;
> +}

This function looks much better than the code before. If I can do a nit, I would do just: shouldDelaySeek. And similar for the variables you use below.

> Source/WebCore/platform/graphics/gstreamer/mse/MediaPlayerPrivateGStreamerMSE.cpp:364
> +        GST_DEBUG("doSeek(): Initial seek succeeded, returning true");

You don't need to specify the function in the seek, it's included in the log already.

> Source/WebCore/platform/graphics/gstreamer/mse/MediaPlayerPrivateGStreamerMSE.cpp:373
> +            GST_DEBUG("doSeek(): gst_element_seek() failed, returning false");

Ditto

> Source/WebCore/platform/graphics/gstreamer/mse/MediaPlayerPrivateGStreamerMSE.cpp:377
> +        GST_DEBUG("doSeek(): gst_element_seek() succeeded, returning true");

Ditto

> Source/WebCore/platform/graphics/gstreamer/mse/WebKitMediaSourceGStreamer.cpp:774
> +            initialSeekSegmentFixerProbe, gst_segment_copy(segment.get()), reinterpret_cast<GDestroyNotify>(gst_segment_free));

I don't think it is needed to copy the segment here, right? We are not using it anymore so I think we can either .release() the GUniquePtr or just not use it.
Comment 15 Enrique Ocaña 2020-04-22 03:45:10 PDT
Comment on attachment 397093 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=397093&action=review

>> Source/WebCore/platform/graphics/gstreamer/mse/WebKitMediaSourceGStreamer.cpp:774
>> +            initialSeekSegmentFixerProbe, gst_segment_copy(segment.get()), reinterpret_cast<GDestroyNotify>(gst_segment_free));
> 
> I don't think it is needed to copy the segment here, right? We are not using it anymore so I think we can either .release() the GUniquePtr or just not use it.

I wish we could do that, but unfortunately this code is inside a loop, so the same segment can be passed to more than one "instance" of the probe. The probe that runs first would use and free the segment and the probe that runs second (assuming 1 audio and 1 video streams) would operate on a dangling pointer.
Comment 16 Enrique Ocaña 2020-04-22 04:01:23 PDT
Created attachment 397185 [details]
Patch
Comment 17 Enrique Ocaña 2020-04-22 04:04:17 PDT
I ended up naming the function as checkShouldDelaySeek() and the variable as shouldDelaySeek because the compiler complains that they can't have the same name.
Comment 18 EWS 2020-04-22 04:36:31 PDT
Committed r260506: <https://trac.webkit.org/changeset/260506>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 397185 [details].