| Summary: | results.webkit.org should provide API for EWS to check flakiness of tests | ||||||
|---|---|---|---|---|---|---|---|
| Product: | WebKit | Reporter: | Aakash Jain <aakash_jain> | ||||
| Component: | Tools / Tests | Assignee: | Jonathan Bedard <jbedard> | ||||
| Status: | ASSIGNED --- | ||||||
| Severity: | Normal | CC: | aakash_jain, ap, cgambrell, clopez, jbedard, jenner, ryanhaddad, webkit-bug-importer | ||||
| Priority: | P2 | Keywords: | InRadar | ||||
| Version: | WebKit Nightly Build | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| See Also: | https://bugs.webkit.org/show_bug.cgi?id=224434 https://bugs.webkit.org/show_bug.cgi?id=224435 https://bugs.webkit.org/show_bug.cgi?id=204368 | ||||||
| Attachments: |
|
||||||
|
Description
Aakash Jain
2021-04-07 10:05:21 PDT
This might need discussion about the specifics of the API we might need for EWS, specifically for flakiness information. I think we can tackle the problem in two parts: an API for dealing with flaky failures in EWS, and an API for dealing with consistent failures in EWS. For consistent failures, I filed two specific API requests in Bug 224434 and Bug 224435.
I think the way that this API should work is that it should provide a "percent likelihood" for each outcome of a given test with a given configuration at a given commit. We will need to toy with the algorithm a bit to figure out the appropriate way to rank the commits surrounding the commit in question. I'm envisioning a result that looks something like this:
{
"PASS": 80,
"FAIL": 10,
"TIMEOUT": 5,
"CRASH": 5
}
Meaning that, given the configuration that the user provided, we would expect the given test to pass 80% of the time, fail 10% of the time, time out 5% of the time and crash 5% of the time. From that point, EWS can decide if the pass percentage is high enough to justify failing the build.
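As a rough illustration of how EWS might consume a result like the one above, here is a minimal Python sketch. The function names (`pass_percentage`, `should_fail_build`) and the 90% default threshold are assumptions made up for this example, not part of any existing results.webkit.org or EWS API. The only input taken from this bug is the example result itself.

```python
import json

# Example result copied from the comment above. The real response format
# is still under discussion, so treat this shape as an assumption.
EXAMPLE_RESPONSE = '{"PASS": 80, "FAIL": 10, "TIMEOUT": 5, "CRASH": 5}'


def pass_percentage(likelihoods):
    """Return the PASS likelihood as a percentage of all reported outcomes.

    Normalizing by the sum keeps the result meaningful even if the
    reported values do not add up to exactly 100.
    """
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError('no outcome data for this test and configuration')
    return 100.0 * likelihoods.get('PASS', 0) / total


def should_fail_build(likelihoods, pass_threshold=90.0):
    """Hypothetical EWS policy: blame the patch only for reliable tests.

    If the test normally passes at least pass_threshold percent of the
    time, a failure seen in EWS is treated as caused by the patch.
    Otherwise the test is considered too flaky to justify failing the build.
    """
    return pass_percentage(likelihoods) >= pass_threshold


likelihoods = json.loads(EXAMPLE_RESPONSE)
print(pass_percentage(likelihoods))
print(should_fail_build(likelihoods))
```

With the example result, the test only passes 80% of the time, so under the assumed 90% threshold a failure in EWS would not fail the build. The threshold is the knob EWS would need to tune.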
Created attachment 429332 [details]
Current mock-up of script
|