Bug 223767 - [Metal ANGLE] fast/canvas/webgl/out-of-bounds-simulated-vertexAttrib0-drawArrays.html causes GPURestarts on some machines
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: ANGLE
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on: 223926
Blocks: anglemetal
Reported: 2021-03-25 15:32 PDT by Kyle Piddington
Modified: 2021-03-30 05:34 PDT

Attachments
Patch (1.70 KB, patch)
2021-03-25 15:36 PDT, Kyle Piddington
darin: review+

Description Kyle Piddington 2021-03-25 15:32:26 PDT
When running fast/canvas/webgl/out-of-bounds-simulated-vertexAttrib0-drawArrays.html, some Intel machines experience GPU restarts. These restarts can have waterfall effects: tests running in parallel fail, and the test-running process is sometimes blacklisted from executing GPU work. This results in a cascade of unrelated failures.
The actual issue is not in ANGLE but in the GPU driver on the host systems. Submitting a large render workload (1 trillion+ points) causes the Intel GPU driver to restart.
To mitigate this issue, we can submit a smaller workload for this test, until the underlying driver bug is fixed.
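As a rough illustration of the mitigation described above, the following TypeScript sketch issues a point draw with either a large or a reduced vertex count. It is not the attached patch: the `PointDrawer` interface, the `drawPoints` helper, and both count constants are hypothetical, and the actual values used by the test are not stated in this report. Only `drawArrays` and `POINTS` correspond to real WebGL API members.

```typescript
// Minimal slice of the WebGL API needed here; a real WebGLRenderingContext
// satisfies this shape structurally.
interface PointDrawer {
  readonly POINTS: number;
  drawArrays(mode: number, first: number, count: number): void;
}

// Hypothetical counts, chosen only for illustration.
const LARGE_COUNT = 0x40000000;
const REDUCED_COUNT = 0x400000;

// Submits one point draw and returns the vertex count that was requested.
// With reduceWorkload set, the draw is far smaller, which is the kind of
// change the description proposes as a temporary workaround.
function drawPoints(gl: PointDrawer, reduceWorkload: boolean): number {
  const count = reduceWorkload ? REDUCED_COUNT : LARGE_COUNT;
  gl.drawArrays(gl.POINTS, 0, count);
  return count;
}
```

A smaller count keeps the draw call itself in place while reducing the amount of work handed to the driver.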
Comment 1 Kyle Piddington 2021-03-25 15:33:59 PDT
Note: these restart issues were only seen in automation on Macmini8,1 and iMac16,2.
Comment 2 Kyle Piddington 2021-03-25 15:36:03 PDT
Created attachment 424287
Patch
Comment 3 Dean Jackson 2021-03-25 20:20:52 PDT
Committed r275075 (235787@main): <https://commits.webkit.org/235787@main>
Comment 4 Radar WebKit Bug Importer 2021-03-25 20:21:15 PDT
<rdar://problem/75869407>
Comment 5 Kimmo Kinnunen 2021-03-25 23:51:04 PDT
I think the test originally tested the thing it tried to test?
If a test uses 0x40000000 to test for a bug, it's not really OK to change it to 0x1, 0x40, or 0x400000, is it? It appears that the bug the test checks for really does manifest.

Would it have been better to revert this change and skip the test instead?
If a test fails elsewhere because of a bug in the implementation, we typically don't change the test...
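To put the hexadecimal values mentioned in this comment in perspective, here is a small TypeScript sketch that prints each one in decimal along with how many times smaller it is than 0x40000000. The `describeCount` helper is hypothetical and exists only for this illustration; nothing here is taken from the test or the patch.

```typescript
// The largest value mentioned in the comment, used as the baseline.
const BASELINE = 0x40000000;

// The alternative values named in the comment.
const candidates: number[] = [0x40000000, 0x400000, 0x40, 0x1];

// Formats a count as hex, decimal, and its ratio to the baseline.
function describeCount(count: number): string {
  const ratio = BASELINE / count;
  return `0x${count.toString(16)} = ${count} (baseline / ${ratio})`;
}

for (const c of candidates) {
  console.log(describeCount(c));
}
```

Even the largest alternative, 0x400000, is 256 times smaller than 0x40000000, so a reduced value exercises a very different workload size.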