After I submitted the patch for https://bugs.webkit.org/show_bug.cgi?id=153116 for EWS analysis, the build bot reported that the patch caused some tests to fail, for example http/tests/contentextensions/font-display-none-repeated-layout.html. However, the change in the patch should not affect content extensions at all, and the layout tests passed when I ran them locally. So the layout tests can produce false positive results.
The way EWS works is that when there are failures, it runs the tests again, and if failures are the same, it rolls out the patch, and runs the tests once again to check whether the failure(s) are a regression. This is what happened here - a flaky test failed twice in a row, and didn't fail on the third try. I don't think that there is anything actionable here for EWS - we probably shouldn't make it slower by trying even more times.

http/tests/contentextensions/font-display-none-repeated-layout.html flakily fails with this diff:

-CONSOLE MESSAGE: Content blocker prevented frame displaying http://127.0.0.1:8000/contentextensions/font-display-none-repeated-layout.html from loading a resource from http://127.0.0.1:8000/resources/Ahem.woff
+CONSOLE MESSAGE: line 13: Content blocker prevented frame displaying http://127.0.0.1:8000/contentextensions/font-display-none-repeated-layout.html from loading a resource from http://127.0.0.1:8000/resources/Ahem.woff

It's a bug in console support code that content blocker messages flakily depend on other state.
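
For illustration, the retry sequence described above can be sketched roughly as follows. This is not the actual EWS code: the function names, the verdict strings, and the handling of a second run whose failures differ from the first are all assumptions made for the example. It only shows why a test that flakily fails twice with the patch and then passes without it ends up blamed on the patch.

```python
from typing import Callable, FrozenSet, Iterable

Failures = FrozenSet[str]


def classify_patch(run_with_patch: Callable[[], Failures],
                   run_without_patch: Callable[[], Failures]) -> str:
    """Hypothetical sketch of the three-run check described in this bug."""
    first = run_with_patch()
    if not first:
        return "pass"
    # Failures seen: run the tests again with the patch applied.
    second = run_with_patch()
    if second != first:
        # Assumption: the thread does not say what happens in this case.
        return "inconsistent"
    # Same failures twice: roll out the patch and run once more.
    baseline = run_without_patch()
    # Anything that fails with the patch but not without it is blamed on it.
    return "regression" if first - baseline else "pre-existing"


def scripted(results: Iterable[Iterable[str]]) -> Callable[[], Failures]:
    """Build a fake test runner that returns the given failure sets in order."""
    it = iter(results)
    return lambda: frozenset(next(it))


# A flaky test fails twice with the patch, then passes on the clean tree.
flaky = "http/tests/contentextensions/font-display-none-repeated-layout.html"
verdict = classify_patch(scripted([[flaky], [flaky]]), scripted([[]]))
print(verdict)  # prints "regression", although the patch is not at fault
```

Adding more retries would lower the odds of this outcome, at the cost of a slower EWS, which is the tradeoff mentioned above.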
<rdar://problem/24231374>
I tried to fix this in http://trac.webkit.org/changeset/195161 but it looks like it didn't work :(
Is this still occurring?
Unsure; it's hard to tell which tests are flaky on EWS. It could indeed be fixed. This test is still flaky on GuardMalloc bots, but that's a separate issue.
FWIW, with the second patch I submitted for https://bugs.webkit.org/show_bug.cgi?id=153116, the tests did not fail. We should reduce or eliminate the flakiness of these tests. Flaky tests are not necessarily better than no tests at all, since it takes engineering time to work out whether each test failure is a false positive.