Bug 198867

Summary: REGRESSION: Layout Test webgl/many-contexts.html is a flaky timeout on Mojave+
Product: WebKit
Reporter: Ryan Haddad <ryanhaddad>
Component: WebGL
Assignee: Nobody <webkit-unassigned>
Status: NEW
Severity: Normal
CC: ap, dino, jbedard, justin_fan, rniwa, simon.fraser, tsavell, webkit-bot-watchers-bugzilla, webkit-bug-importer, youennf
Priority: P2
Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 239640    

Description Ryan Haddad 2019-06-14 13:57:13 PDT
The following layout test is a flaky timeout on Mojave:

webgl/many-contexts.html

Flakiness Dashboard:

https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=webgl%2Fmany-contexts.html

--- /Volumes/Data/slave/mojave-release-tests-wk1/build/layout-test-results/webgl/many-contexts-expected.txt
+++ /Volumes/Data/slave/mojave-release-tests-wk1/build/layout-test-results/webgl/many-contexts-actual.txt
@@ -982,4 +982,5 @@
 CONSOLE MESSAGE: line 12: There are too many active WebGL contexts on this page, the oldest context will be lost.
 CONSOLE MESSAGE: line 12: There are too many active WebGL contexts on this page, the oldest context will be lost.
 CONSOLE MESSAGE: line 12: There are too many active WebGL contexts on this page, the oldest context will be lost.
+FAIL: Timed out waiting for notifyDone to be called
 PASS if this test did not crash.
Comment 1 Ryan Haddad 2019-06-14 14:00:32 PDT
This test had been taking about 20 seconds to run until Jun 8, 2019 ~r246232, and now it waffles between the high 20s and timeouts.
Comment 2 Truitt Savell 2019-06-17 09:45:52 PDT
When I try to reproduce this locally, all I get is crashes.
Comment 3 Radar WebKit Bug Importer 2019-06-17 10:19:10 PDT
<rdar://problem/51810342>
Comment 4 Truitt Savell 2019-06-19 11:07:36 PDT
Ryan found that when he rebooted the leaks bot, this test stopped timing out there and its run time began slowly ticking back up. It went from 35 seconds to 8 on restart and is back up to 10 seconds now.

The number of CONSOLE MESSAGE: lines in this test's output depends on how long the test runs before it times out.
Comment 5 Jonathan Bedard 2019-07-12 09:08:57 PDT
This sounds like we're leaking an OS-level resource. This can probably affect other tests too.
Comment 6 Truitt Savell 2019-10-18 09:28:41 PDT
Marked this test as skip for Mojave+ while this is being investigated: https://trac.webkit.org/changeset/251286/webkit
Comment 7 Ryosuke Niwa 2019-10-30 22:49:55 PDT
This appears to be always passing now. We probably need to just get rid of the test expectation.
Comment 8 Alexey Proskuryakov 2019-10-30 23:50:23 PDT
It's skipped, so we don't have any history on flakiness dashboard.

LayoutTests/platform/mac/TestExpectations:2018:webkit.org/b/198867 [ Mojave+ ] webgl/many-contexts.html [ Skip ]

> This sounds like we're leaking an OS-level resource. This can probably affect other tests too.

Removing the expectation may be necessary to check if this is still happening, but it's also very dangerous as it can destabilize other tests.
Comment 9 Truitt Savell 2019-11-05 16:26:01 PST
I found another WebGL test that times out, and whose run time drops back to about 1 second (passing) after the bot is restarted.

webgl/1.0.2/conformance/uniforms/out-of-bounds-uniform-array-access.html

History:
https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=webgl%2F1.0.2%2Fconformance%2Funiforms%2Fout-of-bounds-uniform-array-access.html

I rebooted the bot on 11/5 at 4pm.
Comment 10 Truitt Savell 2019-11-06 08:01:48 PST
*** Bug 203856 has been marked as a duplicate of this bug. ***
Comment 11 Alexey Proskuryakov 2019-11-06 17:39:52 PST
So something in our tests makes WebGL slower with time, even as all of our processes get restarted. And the culprit may well be some other test or tests, not the ones listed here.

It seems like it may still be a problem on Catalina, as run times seem to be increasing - although they haven't reached 30 seconds yet.

I think that this has to be a GPU driver issue, but hopefully we can isolate and maybe work around it.
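
One rough way to check whether run times are still creeping up after a reboot is to fit a line to the per-run durations and extrapolate to the timeout. The following is only a sketch: the function names are hypothetical, the durations would have to be copied by hand from the flakiness dashboard, and the 30 second default is an assumption taken from the figure mentioned above.

```python
from statistics import mean


def runtime_slope(durations):
    """Least-squares slope of run time (seconds) against run index."""
    n = len(durations)
    if n < 2:
        raise ValueError("need at least two runs")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(durations)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, durations))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def runs_until_timeout(durations, timeout_seconds=30.0):
    """Estimate how many more runs until the trend reaches the timeout.

    Returns None when the run times are flat or decreasing.
    timeout_seconds is an assumed value, not read from any test configuration.
    """
    slope = runtime_slope(durations)
    if slope <= 0:
        return None
    remaining = timeout_seconds - durations[-1]
    return max(0.0, remaining / slope)
```

For example, `runs_until_timeout([8.0, 9.0, 10.0])` fits a slope of one second per run and estimates about 20 more runs before reaching 30 seconds. A linear fit is a simplification; the real growth may not be linear.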