Bug 196004 - [ews-build] Retry API test in case of failures
Summary: [ews-build] Retry API test in case of failures
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Aakash Jain
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-03-20 07:30 PDT by Aakash Jain
Modified: 2019-06-25 11:00 PDT (History)
7 users (show)

See Also:


Attachments
WIP (8.51 KB, patch)
2019-03-20 07:47 PDT, Aakash Jain
no flags Details | Formatted Diff | Diff
Proposed patch (9.65 KB, patch)
2019-03-20 13:21 PDT, Aakash Jain
no flags Details | Formatted Diff | Diff
Updated patch (9.44 KB, patch)
2019-03-21 12:51 PDT, Aakash Jain
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Aakash Jain 2019-03-20 07:30:19 PDT
When failures are found in run-api-tests, we should retry the API test to ensure that the failures are not because of flaky test. Further, we should run the API tests without the patch as well, to identify whether the failures are pre-existing or new failures introduced by the patch.
Comment 2 EWS Watchlist 2019-03-20 07:49:20 PDT Comment hidden (obsolete)
Comment 3 Jonathan Bedard 2019-03-20 10:11:44 PDT
Comment on attachment 365345 [details]
WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=365345&action=review

> Tools/BuildSlaveSupport/ews-build/steps.py:524
> +        return self.getProperty('patchFailedToBuild') or self.getProperty('patchFailedAPITests')

A bit confused why only API tests are here. Doesn't this logic apply to JSC tests too? And in the future, basically any test suite we're retrying that uses a compiled WebKit?

> Tools/BuildSlaveSupport/ews-build/steps.py:759
> +        d = self.getTestsResults(RunAPITests.name)

d doesn't get used again? Should this be 'defer'?

> Tools/BuildSlaveSupport/ews-build/steps.py:794
> +        failures_with_patch = first_run_failures.intersection(second_run_failures)

Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better.
Comment 4 Aakash Jain 2019-03-20 11:51:24 PDT
Comment on attachment 365345 [details]
WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=365345&action=review

>> Tools/BuildSlaveSupport/ews-build/steps.py:524
>> +        return self.getProperty('patchFailedToBuild') or self.getProperty('patchFailedAPITests')
> 
> A bit confused why only API tests are here. Doesn't this logic apply to JSC tests too? And in the future, basically any test suite we're retrying that uses a compiled WebKit?

You are right. For now, these are the only two code paths which are using CompileWebKitToT. I haven't completed the JSC build/test retry logic.

>> Tools/BuildSlaveSupport/ews-build/steps.py:759
>> +        d = self.getTestsResults(RunAPITests.name)
> 
> d doesn't get used again? Should this be 'defer'?

it's used in all the subsequent lines below 'd.addCallback'.

>> Tools/BuildSlaveSupport/ews-build/steps.py:794
>> +        failures_with_patch = first_run_failures.intersection(second_run_failures)
> 
> Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better.

Yeah, I believe that's what we would want here. If we do union, then in https://ews-build.webkit-uat.org/#/builders/19/builds/2256 EWS would claim 'TestWebKitAPI.ProcessSwap.NumberOfCachedProcesses' as a new failure introduced by the patch, which we don't want. (Note that NumberOfCachedProcesses is a flaky failure as noted in https://bugs.webkit.org/show_bug.cgi?id=195102)

Also, this is similar to https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/tool/bot/patchanalysistask.py#L281
Comment 5 Aakash Jain 2019-03-20 11:53:00 PDT
Sample run for catching API test failures: https://ews-build.webkit-uat.org/#/builders/19/builds/2260
Comment 6 Jonathan Bedard 2019-03-20 12:54:13 PDT
(In reply to Aakash Jain from comment #4)
> ...
> 
> >> Tools/BuildSlaveSupport/ews-build/steps.py:794
> >> +        failures_with_patch = first_run_failures.intersection(second_run_failures)
> > 
> > Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better.
> 
> Yeah, I believe that's what we would want here. If we do union, then in
> https://ews-build.webkit-uat.org/#/builders/19/builds/2256 EWS would claim
> 'TestWebKitAPI.ProcessSwap.NumberOfCachedProcesses' as a new failure
> introduced by the patch, which we don't want. (Note that
> NumberOfCachedProcesses is a flaky failure as noted in
> https://bugs.webkit.org/show_bug.cgi?id=195102)
> 
> Also, this is similar to
> https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/tool/bot/
> patchanalysistask.py#L281

But what about the case were the patch introduces a NEW flakey failure? We would pass EWS, in that case.

Maybe the answer is that we don't have a good way to catch this without talking to a results database.
Comment 7 Aakash Jain 2019-03-20 13:19:33 PDT
> But what about the case were the patch introduces a NEW flakey failure? We
> would pass EWS, in that case.
> 
> Maybe the answer is that we don't have a good way to catch this without
> talking to a results database.
Agreed. It's hard to conclusively detect a flaky test added by the patch (unless we run API test on that patch lot of times). Results database (about API tests history) would be a good way to catch flaky failures.
Comment 8 Aakash Jain 2019-03-20 13:21:48 PDT
Created attachment 365387 [details]
Proposed patch
Comment 9 EWS Watchlist 2019-03-20 13:24:55 PDT Comment hidden (obsolete)
Comment 10 Lucas Forschler 2019-03-20 14:35:07 PDT
Comment on attachment 365387 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=365387&action=review

> Tools/BuildSlaveSupport/ews-build/steps.py:741
> +        return rc

you have a space on line 723 before the return. but omit it here. either is fine, just make them the same.

> Tools/BuildSlaveSupport/ews-build/steps.py:751
> +        return not self.doStepIf(step)

I find this confusing.  do we really need the doStepIf /hideStepIf function if we are just returning a property?

> Tools/BuildSlaveSupport/ews-build/steps.py:780
> +

will we always have first/second/clean run results? (do we have to run the second/clean set if everything in the first run passes?)

> Tools/BuildSlaveSupport/ews-build/steps.py:788
> +            return set([failure.get('name') for failure in result.get("Timedout", [])] +

discuss: Timedout vs TimedOut vs TimeOut?

do we use "Timedout" anywhere else? I think a simple case change is better for this string.

> Tools/BuildSlaveSupport/ews-build/steps.py:836
> +            self._addToLog('stderr', "ERROR: step not found: {}".format(step))

we are mixing single quotes and double quotes in this function... is that on purpose?
looks like this is common across all the addToLog calls.

> Tools/BuildSlaveSupport/ews-build/steps.py:847
> +        if log['type'] == 's':

can you remind me what the possible values are for 'type' ?
Comment 11 Aakash Jain 2019-03-20 14:53:16 PDT
Comment on attachment 365387 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=365387&action=review

>> Tools/BuildSlaveSupport/ews-build/steps.py:741
>> +        return rc
> 
> you have a space on line 723 before the return. but omit it here. either is fine, just make them the same.

Will fix it.

>> Tools/BuildSlaveSupport/ews-build/steps.py:751
>> +        return not self.doStepIf(step)
> 
> I find this confusing.  do we really need the doStepIf /hideStepIf function if we are just returning a property?

This class runs api tests (without patch), it is skipped if that property is not set.

>> Tools/BuildSlaveSupport/ews-build/steps.py:780
>> +
> 
> will we always have first/second/clean run results? (do we have to run the second/clean set if everything in the first run passes?)

Yes we will always have all three results, because this class AnalyzeAPITestsResults is called only if there was a API test retry and there were failures in the retry. Notice "addStepsAfterCurrentStep" in ReRunAPITests.

If everything in first run passes, we do not retry, we do not use 'addStepsAfterCurrentStep' and this class is never invoked.

>> Tools/BuildSlaveSupport/ews-build/steps.py:788
>> +            return set([failure.get('name') for failure in result.get("Timedout", [])] +
> 
> discuss: Timedout vs TimedOut vs TimeOut?
> 
> do we use "Timedout" anywhere else? I think a simple case change is better for this string.

'Timedout' is outputted by run-api-tests itself https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/api_tests/manager.py#L196

>> Tools/BuildSlaveSupport/ews-build/steps.py:836
>> +            self._addToLog('stderr', "ERROR: step not found: {}".format(step))
> 
> we are mixing single quotes and double quotes in this function... is that on purpose?
> looks like this is common across all the addToLog calls.

Will fix the quotes.

>> Tools/BuildSlaveSupport/ews-build/steps.py:847
>> +        if log['type'] == 's':
> 
> can you remind me what the possible values are for 'type' ?

# 's' = stdio, 't' = text, 'h' = html, 'd' = deleted
From https://github.com/buildbot/buildbot/blob/master/master/buildbot/db/model.py#L196
Comment 12 Aakash Jain 2019-03-21 12:51:17 PDT
Created attachment 365597 [details]
Updated patch

Incorporated review comments.
Comment 13 EWS Watchlist 2019-03-21 12:53:34 PDT Comment hidden (obsolete)
Comment 14 WebKit Commit Bot 2019-03-21 16:12:04 PDT
Comment on attachment 365597 [details]
Updated patch

Clearing flags on attachment: 365597

Committed r243342: <https://trac.webkit.org/changeset/243342>
Comment 15 WebKit Commit Bot 2019-03-21 16:12:06 PDT
All reviewed patches have been landed.  Closing bug.
Comment 16 Radar WebKit Bug Importer 2019-03-21 16:14:03 PDT
<rdar://problem/49129320>