Bug 196004

Summary: [ews-build] Retry API test in case of failures
Product: WebKit Reporter: Aakash Jain <aakash_jain>
Component: Tools / TestsAssignee: Aakash Jain <aakash_jain>
Status: RESOLVED FIXED    
Severity: Normal CC: aakash_jain, ap, commit-queue, ews-watchlist, jbedard, lforschler, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: Other   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=199194
Attachments:
Description Flags
WIP
none
Proposed patch
none
Updated patch none

Aakash Jain
Reported 2019-03-20 07:30:19 PDT
When failures are found in run-api-tests, we should retry the API test to ensure that the failures are not because of flaky test. Further, we should run the API tests without the patch as well, to identify whether the failures are pre-existing or new failures introduced by the patch.
Attachments
WIP (8.51 KB, patch)
2019-03-20 07:47 PDT, Aakash Jain
no flags
Proposed patch (9.65 KB, patch)
2019-03-20 13:21 PDT, Aakash Jain
no flags
Updated patch (9.44 KB, patch)
2019-03-21 12:51 PDT, Aakash Jain
no flags
EWS Watchlist
Comment 2 2019-03-20 07:49:20 PDT Comment hidden (obsolete)
Jonathan Bedard
Comment 3 2019-03-20 10:11:44 PDT
Comment on attachment 365345 [details] WIP View in context: https://bugs.webkit.org/attachment.cgi?id=365345&action=review > Tools/BuildSlaveSupport/ews-build/steps.py:524 > + return self.getProperty('patchFailedToBuild') or self.getProperty('patchFailedAPITests') A bit confused why only API tests are here. Doesn't this logic apply to JSC tests too? And in the future, basically any test suite we're retrying that uses a compiled WebKit? > Tools/BuildSlaveSupport/ews-build/steps.py:759 > + d = self.getTestsResults(RunAPITests.name) d doesn't get used again? Should this be 'defer'? > Tools/BuildSlaveSupport/ews-build/steps.py:794 > + failures_with_patch = first_run_failures.intersection(second_run_failures) Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better.
Aakash Jain
Comment 4 2019-03-20 11:51:24 PDT
Comment on attachment 365345 [details] WIP View in context: https://bugs.webkit.org/attachment.cgi?id=365345&action=review >> Tools/BuildSlaveSupport/ews-build/steps.py:524 >> + return self.getProperty('patchFailedToBuild') or self.getProperty('patchFailedAPITests') > > A bit confused why only API tests are here. Doesn't this logic apply to JSC tests too? And in the future, basically any test suite we're retrying that uses a compiled WebKit? You are right. For now, these are the only two code paths which are using CompileWebKitToT. I haven't completed the JSC build/test retry logic. >> Tools/BuildSlaveSupport/ews-build/steps.py:759 >> + d = self.getTestsResults(RunAPITests.name) > > d doesn't get used again? Should this be 'defer'? it's used in all the subsequent lines below 'd.addCallback'. >> Tools/BuildSlaveSupport/ews-build/steps.py:794 >> + failures_with_patch = first_run_failures.intersection(second_run_failures) > > Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better. Yeah, I believe that's what we would want here. If we do union, then in https://ews-build.webkit-uat.org/#/builders/19/builds/2256 EWS would claim 'TestWebKitAPI.ProcessSwap.NumberOfCachedProcesses' as a new failure introduced by the patch, which we don't want. (Note that NumberOfCachedProcesses is a flaky failure as noted in https://bugs.webkit.org/show_bug.cgi?id=195102) Also, this is similar to https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/tool/bot/patchanalysistask.py#L281
Aakash Jain
Comment 5 2019-03-20 11:53:00 PDT
Sample run for catching API test failures: https://ews-build.webkit-uat.org/#/builders/19/builds/2260
Jonathan Bedard
Comment 6 2019-03-20 12:54:13 PDT
(In reply to Aakash Jain from comment #4) > ... > > >> Tools/BuildSlaveSupport/ews-build/steps.py:794 > >> + failures_with_patch = first_run_failures.intersection(second_run_failures) > > > > Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better. > > Yeah, I believe that's what we would want here. If we do union, then in > https://ews-build.webkit-uat.org/#/builders/19/builds/2256 EWS would claim > 'TestWebKitAPI.ProcessSwap.NumberOfCachedProcesses' as a new failure > introduced by the patch, which we don't want. (Note that > NumberOfCachedProcesses is a flaky failure as noted in > https://bugs.webkit.org/show_bug.cgi?id=195102) > > Also, this is similar to > https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/tool/bot/ > patchanalysistask.py#L281 But what about the case were the patch introduces a NEW flakey failure? We would pass EWS, in that case. Maybe the answer is that we don't have a good way to catch this without talking to a results database.
Aakash Jain
Comment 7 2019-03-20 13:19:33 PDT
> But what about the case were the patch introduces a NEW flakey failure? We > would pass EWS, in that case. > > Maybe the answer is that we don't have a good way to catch this without > talking to a results database. Agreed. It's hard to conclusively detect a flaky test added by the patch (unless we run API test on that patch lot of times). Results database (about API tests history) would be a good way to catch flaky failures.
Aakash Jain
Comment 8 2019-03-20 13:21:48 PDT
Created attachment 365387 [details] Proposed patch
EWS Watchlist
Comment 9 2019-03-20 13:24:55 PDT Comment hidden (obsolete)
Lucas Forschler
Comment 10 2019-03-20 14:35:07 PDT
Comment on attachment 365387 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=365387&action=review > Tools/BuildSlaveSupport/ews-build/steps.py:741 > + return rc you have a space on line 723 before the return. but omit it here. either is fine, just make them the same. > Tools/BuildSlaveSupport/ews-build/steps.py:751 > + return not self.doStepIf(step) I find this confusing. do we really need the doStepIf /hideStepIf function if we are just returning a property? > Tools/BuildSlaveSupport/ews-build/steps.py:780 > + will we always have first/second/clean run results? (do we have to run the second/clean set if everything in the first run passes?) > Tools/BuildSlaveSupport/ews-build/steps.py:788 > + return set([failure.get('name') for failure in result.get("Timedout", [])] + discuss: Timedout vs TimedOut vs TimeOut? do we use "Timedout" anywhere else? I think a simple case change is better for this string. > Tools/BuildSlaveSupport/ews-build/steps.py:836 > + self._addToLog('stderr', "ERROR: step not found: {}".format(step)) we are mixing single quotes and double quotes in this function... is that on purpose? looks like this is common across all the addToLog calls. > Tools/BuildSlaveSupport/ews-build/steps.py:847 > + if log['type'] == 's': can you remind me what the possible values are for 'type' ?
Aakash Jain
Comment 11 2019-03-20 14:53:16 PDT
Comment on attachment 365387 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=365387&action=review >> Tools/BuildSlaveSupport/ews-build/steps.py:741 >> + return rc > > you have a space on line 723 before the return. but omit it here. either is fine, just make them the same. Will fix it. >> Tools/BuildSlaveSupport/ews-build/steps.py:751 >> + return not self.doStepIf(step) > > I find this confusing. do we really need the doStepIf /hideStepIf function if we are just returning a property? This class runs api tests (without patch), it is skipped if that property is not set. >> Tools/BuildSlaveSupport/ews-build/steps.py:780 >> + > > will we always have first/second/clean run results? (do we have to run the second/clean set if everything in the first run passes?) Yes we will always have all three results, because this class AnalyzeAPITestsResults is called only if there was a API test retry and there were failures in the retry. Notice "addStepsAfterCurrentStep" in ReRunAPITests. If everything in first run passes, we do not retry, we do not use 'addStepsAfterCurrentStep' and this class is never invoked. >> Tools/BuildSlaveSupport/ews-build/steps.py:788 >> + return set([failure.get('name') for failure in result.get("Timedout", [])] + > > discuss: Timedout vs TimedOut vs TimeOut? > > do we use "Timedout" anywhere else? I think a simple case change is better for this string. 'Timedout' is outputted by run-api-tests itself https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/api_tests/manager.py#L196 >> Tools/BuildSlaveSupport/ews-build/steps.py:836 >> + self._addToLog('stderr', "ERROR: step not found: {}".format(step)) > > we are mixing single quotes and double quotes in this function... is that on purpose? > looks like this is common across all the addToLog calls. Will fix the quotes. >> Tools/BuildSlaveSupport/ews-build/steps.py:847 >> + if log['type'] == 's': > > can you remind me what the possible values are for 'type' ? # 's' = stdio, 't' = text, 'h' = html, 'd' = deleted From https://github.com/buildbot/buildbot/blob/master/master/buildbot/db/model.py#L196
Aakash Jain
Comment 12 2019-03-21 12:51:17 PDT
Created attachment 365597 [details] Updated patch Incorporated review comments.
EWS Watchlist
Comment 13 2019-03-21 12:53:34 PDT Comment hidden (obsolete)
WebKit Commit Bot
Comment 14 2019-03-21 16:12:04 PDT
Comment on attachment 365597 [details] Updated patch Clearing flags on attachment: 365597 Committed r243342: <https://trac.webkit.org/changeset/243342>
WebKit Commit Bot
Comment 15 2019-03-21 16:12:06 PDT
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 16 2019-03-21 16:14:03 PDT
Note You need to log in before you can comment on or make changes to this bug.