Bug 227366 - REGRESSION: http/tests/preload/onload_event.html is a flaky timeout on Big Sur wk1 Release
Summary: REGRESSION: http/tests/preload/onload_event.html is a flaky timeout on Big Su...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Jer Noble
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-06-24 11:07 PDT by ayumi_kojima
Modified: 2021-08-08 14:53 PDT (History)
15 users (show)

See Also:


Attachments
Expectations (1.58 KB, patch)
2021-06-25 15:02 PDT, ayumi_kojima
no flags Details | Formatted Diff | Diff
sample (118.19 KB, text/plain)
2021-06-25 16:50 PDT, Alexey Proskuryakov
no flags Details
Patch (7.58 KB, patch)
2021-08-05 14:22 PDT, Jer Noble
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description ayumi_kojima 2021-06-24 11:07:29 PDT
http/tests/preload/onload_event.html (layout-tests)

Flaky time out failure 

History: https://results.webkit.org/?suite=layout-tests&test=http%2Ftests%2Fpreload%2Fonload_event.html
Comment 1 Radar WebKit Bug Importer 2021-06-24 11:09:19 PDT
<rdar://problem/79733280>
Comment 2 ayumi_kojima 2021-06-24 16:46:01 PDT
I was able to reproduce this timeout using command:run-webkit-tests http/tests/preload/onload_event.html --iterations 500 -f --force -1
Comment 3 Alexey Proskuryakov 2021-06-24 18:33:34 PDT
This test is usually very fast, so timing out means that it fails.

This started 2020-06-03. It seems important to find out when this started.
Comment 4 ayumi_kojima 2021-06-25 14:08:58 PDT
I was able to reproduce the issues with the earliest revision and it seems like the issue has been happening since the test began running on Big Sur
Comment 5 ayumi_kojima 2021-06-25 15:02:04 PDT
Created attachment 432294 [details]
Expectations
Comment 6 Alexey Proskuryakov 2021-06-25 15:05:34 PDT
What were your steps to reproduce? This looks like a regression the bots.
Comment 7 Ryan Haddad 2021-06-25 15:47:58 PDT
(In reply to Alexey Proskuryakov from comment #6)
> What were your steps to reproduce? This looks like a regression the bots.
The steps are in https://bugs.webkit.org/show_bug.cgi?id=227366#c2, but it would be good to know what "the earliest revision" tested was.
Comment 8 Truitt Savell 2021-06-25 15:57:26 PDT
(In reply to Alexey Proskuryakov from comment #6)
> What were your steps to reproduce? This looks like a regression the bots.

Ayumi was able to reproduce this with command:
run-webkit-tests http/tests/preload/onload_event.html --iterations 500 -f --force -1

I confirmed this repro case on my system as well. 

I just ran this test on an old spade (278188) and was unable to reproduce it there. So I think we will be able to bisect to find where this regressed.
Comment 9 EWS 2021-06-25 16:17:22 PDT
Committed r279300 (239176@main): <https://commits.webkit.org/239176@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 432294 [details].
Comment 10 Alexey Proskuryakov 2021-06-25 16:50:01 PDT
Created attachment 432313 [details]
sample
Comment 11 Alexey Proskuryakov 2021-06-25 16:57:58 PDT
Not sure if it's truly a regression. Did we happen to update the bots on 2020-06-03?

Samples captured by bots happen to show what happens (and I reproduced the same locally). DumpRenderTree permanently freezes in AVFoundation!

    +                     989 WebCore::ThreadTimers::sharedTimerFiredInternal()  (in WebCore) + 198  [0x11337c4a6]  ThreadTimers.cpp:127
    +                       989 WebCore::PageGroup::captionPreferencesChanged()  (in WebCore) + 248  [0x1132d2d88]  PageGroup.cpp:108
    +                         989 WebCore::Page::captionPreferencesChanged()  (in WebCore) + 53  [0x1132c46d5]  Page.cpp:2807
    +                           989 WebCore::Page::forEachDocument(WTF::Function<void (WebCore::Document&)> const&) const  (in WebCore) + 334  [0x1132bb5be]  Page.cpp:3334
    +                             989 WebCore::Document::captionPreferencesChanged()  (in WebCore) + 52  [0x112c9f134]  Document.cpp:5706
    +                               989 WTF::WeakHashSet<WebCore::HTMLMediaElement, WTF::EmptyCounter>::forEach(WTF::Function<void (WebCore::HTMLMediaElement&)> const&)  (in WebCore) + 352  [0x112c8d7f0]  WeakHashSet.h:148
    +                                 989 WebCore::HTMLMediaElement::captionPreferencesChanged()  (in WebCore) + 66  [0x112ee6a82]  HTMLMediaElement.cpp:6550
    +                                   989 WebCore::MediaPlayerPrivateAVFoundationObjC::tracksChanged()  (in WebCore) + 728  [0x111d3d3b8]  MediaPlayerPrivateAVFoundationObjC.mm:2057
    +                                     989 -[AVAsset tracksWithMediaCharacteristics:]  (in AVFCore) + 54  [0x7fff316eba1b]
    +                                       989 -[AVAssetTrackEnumerator initWithAsset:mediaCharacteristics:]  (in AVFCore) + 30  [0x7fff316f24f7]
    +                                         989 -[AVAssetTrackEnumerator initWithAsset:]  (in AVFCore) + 82  [0x7fff316b8fb3]
    +                                           989 -[AVURLAsset tracks]  (in AVFCore) + 136  [0x7fff316b898c]
    +                                             989 _dispatch_lane_barrier_sync_invoke_and_complete  (in libdispatch.dylib) + 60  [0x7fff203925ce]
    +                                               989 _dispatch_client_callout  (in libdispatch.dylib) + 8  [0x7fff20385806]
    +                                                 989 __20-[AVURLAsset tracks]_block_invoke  (in AVFCore) + 72  [0x7fff316f151b]
    +                                                   989 -[AVAsset _tracksWithClass:]  (in AVFCore) + 129  [0x7fff316eb8cd]
    +                                                     989 -[AVFigAssetInspector trackCount]  (in AVFCore) + 32  [0x7fff31772de4]
    +                                                       989 -[AVFigObjectInspector _nonNilArrayForProperty:]  (in AVFCore) + 17  [0x7fff317bed76]
    +                                                         989 -[AVFigObjectInspector _tollFreeBridgedObjectForProperty:]  (in AVFCore) + 17  [0x7fff317bed38]
    +                                                           989 -[AVFigAssetInspector _valueAsCFTypeForProperty:]  (in AVFCore) + 101  [0x7fff31772acf]
    +                                                             989 ???  (in MediaToolbox)  load address 0x7fff2c0d3000 + 0x4e1bbd  [0x7fff2c5b4bbd]
    +                                                               989 _dispatch_group_wait_slow  (in libdispatch.dylib) + 49  [0x7fff2038626b]
    +                                                                 989 _dlock_wait  (in libdispatch.dylib) + 44  [0x7fff20385fd7]
    +                                                                   989 __ulock_wait  (in libsystem_kernel.dylib) + 10  [0x7fff204fb9ee]


And that in turn is caused by a freeze between WebCoreResourceHandleAsOperationQueueDelegate and CFNetwork.
Comment 12 Ryan Haddad 2021-06-25 17:49:20 PDT
This may line up with when we updated the Big Sur bots to macOS 11.4
Comment 13 Jer Noble 2021-08-05 13:51:10 PDT
Classic AVFoundation deadlock:

- MainThread is querying for state on AVAssetTrack
- URLAssetWorkQueue.I thread is requesting more data from WebCore to fulfill query
- WebCore dispatches to MainThread to do networking

The real fix is to move either all of networking to a dedicated runloop/thread/queue, or to move all of AVFoundation access to a dedicated runloop/thread/queue.
Comment 14 Jer Noble 2021-08-05 13:54:32 PDT
Short term, we should query assetStatus() before iterating over an asset's tracks.
Comment 15 Jer Noble 2021-08-05 14:22:42 PDT
Created attachment 435025 [details]
Patch
Comment 16 EWS 2021-08-08 14:53:18 PDT
Committed r280766 (240351@main): <https://commits.webkit.org/240351@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 435025 [details].