Bug 196096 - Many failures caused by async overflow scrolling weren't caught by EWS
Summary: Many failures caused by async overflow scrolling weren't caught by EWS
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-21 11:19 PDT by Simon Fraser (smfr)
Modified: 2019-03-22 16:56 PDT

See Also:


Description Simon Fraser (smfr) 2019-03-21 11:19:20 PDT
EWS said this patch passed:
https://webkit-queues.webkit.org/patch/365357/ios-sim-ews

but it caused lots of failures and crashes:
https://build.webkit.org/results/Apple%20iOS%2012%20Simulator%20Debug%20WK2%20(Tests)/r243288%20(2909)/results.html

WTF
Comment 1 Alexey Proskuryakov 2019-03-21 17:41:12 PDT
Wow, much clickbait! Re-titling.

The crashes in post-commit are debug assertions, so those are expected to not show up in release mode.

I checked the logs on the EWS bot, and there was absolutely nothing interesting there, the tests just plain passed:

2019-03-20 12:22:25,295 - Running: webkit-patch --status-host=webkit-queues.webkit.org --bot-id=ews124 build-and-test --no-clean --no-update --test --non-interactive --build-style=release --group=None --port=ios-simulator-wk2
2019-03-20 12:33:20,915 - Passed tests

Simon, what was the reason for the failures, and why was it hidden from EWS?
Comment 2 Simon Fraser (smfr) 2019-03-21 18:09:37 PDT
(In reply to Alexey Proskuryakov from comment #1)
> Wow, much clickbait! Re-titling.
> 
> The crashes in post-commit are debug assertions, so those are expected to
> not show up in release mode.

The crash (assertion) is explained in bug 196123.

> I checked the logs on the EWS bot, and there was absolutely nothing
> interesting there, the tests just plain passed:
> 
> 2019-03-20 12:22:25,295 - Running: webkit-patch
> --status-host=webkit-queues.webkit.org --bot-id=ews124 build-and-test
> --no-clean --no-update --test --non-interactive --build-style=release
> --group=None --port=ios-simulator-wk2
> 2019-03-20 12:33:20,915 - Passed tests
> 
> Simon, what was the reason for the failures, and why was it hidden from EWS?

The failures were legitimate test failures caused by differences in clipping and box sizes when elements with overflow:scroll were composited. They affected iOS WK2 tests, both release and debug. The patch should absolutely have caused these failures.
Comment 3 Alexey Proskuryakov 2019-03-21 20:08:39 PDT
So... why did it not cause these failures on EWS?

The only difference between EWS and post-commit bots is that EWS enables retries, thus hiding some flakiness. This is not good, but EWS can't function with the current level of flakiness, and it's better to hide some failures sometimes than to lose EWS entirely.

If these 100+ failures were hidden by retry, that's bad news. I can think of several explanations:
- A state leak between tests. If these tests only fail when run after some test that passes but leaks state (such as preferences), then they will pass on retry.
- Severe timing dependency. Retries are serial, so tests that can't run on a busy machine will pass on retry.
- Similar to #1, but persistent conditions related to iOS simulator state, not WebKit state.

In any of these cases, it sounds like there will be more work fixing things in WebKit.

Alternatively, if there is some other mysterious reason, then it is, well, mysterious.
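
To make the first explanation concrete, here is a minimal sketch of how a retry can hide a failure caused by a state leak. It is a toy model, not WebKit's test runner: the names make_suite, run and run_with_retry are hypothetical, as is the pref name, and it assumes the retry re-runs only the failed tests against fresh state.

```python
# Toy model of a state leak hidden by retry. All names here are
# hypothetical; this is not how run-webkit-tests is implemented.

def make_suite():
    # Shared state that is supposed to be reset between tests.
    prefs = {"async_overflow_scrolling": False}

    def test_enables_pref():
        # Passes, but leaks state: the pref is never restored.
        prefs["async_overflow_scrolling"] = True
        return True

    def test_expects_default():
        # Only passes when the pref still has its default value.
        return prefs["async_overflow_scrolling"] is False

    def reset():
        # Stands in for starting a fresh process on retry (an assumption).
        prefs["async_overflow_scrolling"] = False

    tests = [
        ("enables_pref", test_enables_pref),
        ("expects_default", test_expects_default),
    ]
    return tests, reset


def run(tests):
    # Run tests in order and return the names of the ones that failed.
    return [name for name, test in tests if not test()]


def run_with_retry(tests, reset):
    # First pass: the full suite, in order, sharing state.
    first_pass_failures = run(tests)
    by_name = dict(tests)
    # Retry pass: each failed test alone, with fresh state.
    still_failing = []
    for name in first_pass_failures:
        reset()
        if not by_name[name]():
            still_failing.append(name)
    return first_pass_failures, still_failing


tests, reset = make_suite()
first_pass_failures, still_failing = run_with_retry(tests, reset)
print("failed on first pass:", first_pass_failures)
print("failed after retry:", still_failing)
```

In this model a bot that reports only the post-retry result shows a clean run, while a bot without retries reports the failure.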
Comment 4 Simon Fraser (smfr) 2019-03-21 20:48:38 PDT
Do we know what trunk revision EWS built against? Can we try to recreate the scenario?
Comment 5 Alexey Proskuryakov 2019-03-22 16:56:41 PDT
The bot updated the checkout at 2019-03-20 12:04:12, so that would be r243222.