Bug 230435 - Regression (r282201 - r282220?) : [ MacOS ] http/tests/media/track-in-band-hls-metadata.html is a flaky failure
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Media
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
Keywords: InRadar
Depends on:
Reported: 2021-09-17 16:35 PDT by ayumi_kojima
Modified: 2021-10-12 09:03 PDT
Description ayumi_kojima 2021-09-17 16:35:43 PDT

http/tests/media/track-in-band-hls-metadata.html is a flaky failure on macOS.

History: https://results.webkit.org/?suite=layout-tests&test=http%2Ftests%2Fmedia%2Ftrack-in-band-hls-metadata.html

The test was marked as Pass/Timeout in Bug 140022 and, as far as the history shows, it has been timing out flakily ever since.

The flaky failure (as opposed to the timeout) started showing up around r282220.


--- /Volumes/Data/worker/bigsur-debug-applesilicon-tests-wk2/build/layout-test-results/http/tests/media/track-in-band-hls-metadata-expected.txt
+++ /Volumes/Data/worker/bigsur-debug-applesilicon-tests-wk2/build/layout-test-results/http/tests/media/track-in-band-hls-metadata-actual.txt
@@ -20,26 +20,26 @@
 * 1
 EXPECTED (typeof(cue) != 'undefined') OK
 EXPECTED (cue.data == 'null') OK
-EXPECTED (cue.type == 'com.apple.quicktime.HLS') OK
+EXPECTED (cue.type == 'com.apple.quicktime.HLS'), OBSERVED 'org.id3' FAIL
 EXPECTED (cue.value != 'null') OK
-EXPECTED (cue.value.key == '"X-START-OFFSET"') OK
-EXPECTED (cue.value.data == '"0.000000"') OK
+EXPECTED (cue.value.key == '"X-START-OFFSET"'), OBSERVED '"TIT2"' FAIL
+EXPECTED (cue.value.data == '"0.000000"'), OBSERVED '"Stream Counting"' FAIL
 * 2
 EXPECTED (typeof(cue) != 'undefined') OK
 EXPECTED (cue.data == 'null') OK
 EXPECTED (cue.type == 'com.apple.quicktime.HLS') OK
 EXPECTED (cue.value != 'null') OK
-EXPECTED (cue.value.key == '"X-END-OFFSET"') OK
-EXPECTED (cue.value.data == '"5.000000"') OK
+EXPECTED (cue.value.data == '"5.000000"'), OBSERVED '"0.000000"' FAIL
 * 3
 EXPECTED (typeof(cue) != 'undefined') OK
 EXPECTED (cue.data == 'null') OK
-EXPECTED (cue.type == 'org.id3') OK
+EXPECTED (cue.type == 'org.id3'), OBSERVED 'com.apple.quicktime.HLS' FAIL
 EXPECTED (cue.value != 'null') OK
-EXPECTED (cue.value.key == '"TIT2"') OK
-EXPECTED (cue.value.data == '"Stream Counting"') OK
+EXPECTED (cue.value.key == '"TIT2"'), OBSERVED '"X-END-OFFSET"' FAIL
+EXPECTED (cue.value.data == '"Stream Counting"'), OBSERVED '"5.000000"' FAIL
 * 4
 EXPECTED (typeof(cue) != 'undefined') OK
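
Reading the diff above, cues 1 to 3 look like the same three metadata cues delivered in a different order, rather than cues with wrong contents. The sketch below is only an illustration of that reading: the tuples are transcribed from the EXPECTED and OBSERVED values in the diff, and the key of the second observed cue is not visible there, so "X-START-OFFSET" is an assumption.

```python
from collections import Counter

# (type, key, data) for cues 1-3, in the order the test expects them.
expected = [
    ("com.apple.quicktime.HLS", "X-START-OFFSET", "0.000000"),
    ("com.apple.quicktime.HLS", "X-END-OFFSET", "5.000000"),
    ("org.id3", "TIT2", "Stream Counting"),
]

# Order read off the OBSERVED values in the diff.
# ASSUMPTION: the key of the second cue is not shown in the diff;
# "X-START-OFFSET" is inferred from its data value "0.000000".
observed = [
    ("org.id3", "TIT2", "Stream Counting"),
    ("com.apple.quicktime.HLS", "X-START-OFFSET", "0.000000"),
    ("com.apple.quicktime.HLS", "X-END-OFFSET", "5.000000"),
]


def same_order(a, b):
    """True when both runs delivered identical cues in the same order."""
    return a == b


def same_cues_any_order(a, b):
    """True when both runs delivered the same cues, ignoring order."""
    return Counter(a) == Counter(b)
```

Under that assumption, `same_order(expected, observed)` is false while `same_cues_any_order(expected, observed)` is true, which would point at cue delivery order rather than cue parsing.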
Comment 1 Radar WebKit Bug Importer 2021-09-17 16:37:27 PDT
Comment 2 ayumi_kojima 2021-09-17 16:41:39 PDT
Marked test expectations: https://trac.webkit.org/changeset/282709/webkit
Comment 3 ayumi_kojima 2021-10-12 09:00:50 PDT
I was able to reproduce the failure at TOT locally on Big Sur using: run-webkit-tests -1 --force --debug --iterations 50 http/tests/media/track-in-band-hls-metadata.html --exit-after-n-failures 1

According to the history, the test has been a flaky timeout for a long time, but at r282220 it also started failing flakily. Around that revision, the timeout became flakier as well.

The test passed on r282200 locally. It timed out on r282205 and r282217, and hung when run with --no-timeout.

Based on the reproduction, I think the regression range is r282201 - r282220, but I was not able to find the exact revision because I couldn't reproduce the failure.
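
For anyone narrowing the range further, a bisection over the revisions could be sketched as below. This is a minimal sketch, not an existing WebKit tool: `first_bad_revision` and the `is_bad` callback are hypothetical names. The callback is assumed to check out and build the given revision, run the reproduction command from this comment, and report whether the failure showed up. Because the test is flaky, a clean run does not prove a revision is good, so the callback would need enough iterations to be trusted.

```python
def first_bad_revision(good, bad, is_bad):
    """Binary-search for the first revision where is_bad(revision) is true.

    good   -- a revision known to pass (r282200 passed locally)
    bad    -- a revision known to fail (the history points at r282220)
    is_bad -- hypothetical callback: builds the revision, runs the test
              repeatedly, and returns True if the failure reproduces
    """
    # Invariant: `good` passes, `bad` fails, and good < bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid   # failure is already present at mid
        else:
            good = mid  # failure appears somewhere after mid
    return bad
```

With good = 282200 and bad = 282220 this needs at most five builds. The timeouts seen at r282205 and r282217 would have to be classified by the callback one way or the other, which is a judgment call this sketch leaves open.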