Bug 196004

Summary:

[ews-build] Retry API test in case of failures

Product:

WebKit

Reporter:

Aakash Jain <aakash_jain>

Component:

Tools / Tests

Assignee:

Aakash Jain <aakash_jain>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

aakash_jain, ap, commit-queue, ews-watchlist, jbedard, lforschler, webkit-bug-importer

Priority:

Keywords:

InRadar

Version:

Other

Hardware:

Unspecified

OS:

Unspecified

See Also:

https://bugs.webkit.org/show_bug.cgi?id=199194

Attachments:

Description	Flags
WIP	none
Proposed patch	none
Updated patch	none

Aakash Jain

Reported 2019-03-20 07:30:19 PDT

When failures are found in run-api-tests, we should retry the API test to ensure that the failures are not because of flaky test. Further, we should run the API tests without the patch as well, to identify whether the failures are pre-existing or new failures introduced by the patch.

Attachments
WIP (8.51 KB, patch) 2019-03-20 07:47 PDT, Aakash Jain	no flags	Details Formatted Diff Diff
Proposed patch (9.65 KB, patch) 2019-03-20 13:21 PDT, Aakash Jain	no flags	Details Formatted Diff Diff
Updated patch (9.44 KB, patch) 2019-03-21 12:51 PDT, Aakash Jain	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Aakash Jain

Comment 1 2019-03-20 07:47:39 PDT

EWS Watchlist

Comment 2 2019-03-20 07:49:20 PDT Comment hidden (obsolete)

Attachment 365345 [details] did not pass style-queue: ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:708: [RunAPITests.evaluateCommand] Use of super on an old style class [pylint/E1002] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:834: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:841: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:846: [AnalyzeAPITestsResults.getTestsResults] Undefined variable 'json' [pylint/E0602] [5] Total errors found: 4 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.

Jonathan Bedard

Comment 3 2019-03-20 10:11:44 PDT

Comment on attachment 365345 [details] WIP View in context: https://bugs.webkit.org/attachment.cgi?id=365345&action=review > Tools/BuildSlaveSupport/ews-build/steps.py:524 > + return self.getProperty('patchFailedToBuild') or self.getProperty('patchFailedAPITests') A bit confused why only API tests are here. Doesn't this logic apply to JSC tests too? And in the future, basically any test suite we're retrying that uses a compiled WebKit? > Tools/BuildSlaveSupport/ews-build/steps.py:759 > + d = self.getTestsResults(RunAPITests.name) d doesn't get used again? Should this be 'defer'? > Tools/BuildSlaveSupport/ews-build/steps.py:794 > + failures_with_patch = first_run_failures.intersection(second_run_failures) Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better.

Aakash Jain

Comment 4 2019-03-20 11:51:24 PDT

Comment on attachment 365345 [details] WIP View in context: https://bugs.webkit.org/attachment.cgi?id=365345&action=review >> Tools/BuildSlaveSupport/ews-build/steps.py:524 >> + return self.getProperty('patchFailedToBuild') or self.getProperty('patchFailedAPITests') > > A bit confused why only API tests are here. Doesn't this logic apply to JSC tests too? And in the future, basically any test suite we're retrying that uses a compiled WebKit? You are right. For now, these are the only two code paths which are using CompileWebKitToT. I haven't completed the JSC build/test retry logic. >> Tools/BuildSlaveSupport/ews-build/steps.py:759 >> + d = self.getTestsResults(RunAPITests.name) > > d doesn't get used again? Should this be 'defer'? it's used in all the subsequent lines below 'd.addCallback'. >> Tools/BuildSlaveSupport/ews-build/steps.py:794 >> + failures_with_patch = first_run_failures.intersection(second_run_failures) > > Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better. Yeah, I believe that's what we would want here. If we do union, then in https://ews-build.webkit-uat.org/#/builders/19/builds/2256 EWS would claim 'TestWebKitAPI.ProcessSwap.NumberOfCachedProcesses' as a new failure introduced by the patch, which we don't want. (Note that NumberOfCachedProcesses is a flaky failure as noted in https://bugs.webkit.org/show_bug.cgi?id=195102) Also, this is similar to https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/tool/bot/patchanalysistask.py#L281

Aakash Jain

Comment 5 2019-03-20 11:53:00 PDT

Sample run for catching API test failures: https://ews-build.webkit-uat.org/#/builders/19/builds/2260

Jonathan Bedard

Comment 6 2019-03-20 12:54:13 PDT

(In reply to Aakash Jain from comment #4) > ... > > >> Tools/BuildSlaveSupport/ews-build/steps.py:794 > >> + failures_with_patch = first_run_failures.intersection(second_run_failures) > > > > Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better. > > Yeah, I believe that's what we would want here. If we do union, then in > https://ews-build.webkit-uat.org/#/builders/19/builds/2256 EWS would claim > 'TestWebKitAPI.ProcessSwap.NumberOfCachedProcesses' as a new failure > introduced by the patch, which we don't want. (Note that > NumberOfCachedProcesses is a flaky failure as noted in > https://bugs.webkit.org/show_bug.cgi?id=195102) > > Also, this is similar to > https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/tool/bot/ > patchanalysistask.py#L281 But what about the case were the patch introduces a NEW flakey failure? We would pass EWS, in that case. Maybe the answer is that we don't have a good way to catch this without talking to a results database.

Aakash Jain

Comment 7 2019-03-20 13:19:33 PDT

> But what about the case were the patch introduces a NEW flakey failure? We > would pass EWS, in that case. > > Maybe the answer is that we don't have a good way to catch this without > talking to a results database. Agreed. It's hard to conclusively detect a flaky test added by the patch (unless we run API test on that patch lot of times). Results database (about API tests history) would be a good way to catch flaky failures.

Aakash Jain

Comment 8 2019-03-20 13:21:48 PDT

Created attachment 365387 [details] Proposed patch

EWS Watchlist

Comment 9 2019-03-20 13:24:55 PDT Comment hidden (obsolete)

Attachment 365387 [details] did not pass style-queue: ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:714: [RunAPITests.evaluateCommand] Use of super on an old style class [pylint/E1002] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:839: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:846: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] Total errors found: 3 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.

Lucas Forschler

Comment 10 2019-03-20 14:35:07 PDT

Comment on attachment 365387 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=365387&action=review > Tools/BuildSlaveSupport/ews-build/steps.py:741 > + return rc you have a space on line 723 before the return. but omit it here. either is fine, just make them the same. > Tools/BuildSlaveSupport/ews-build/steps.py:751 > + return not self.doStepIf(step) I find this confusing. do we really need the doStepIf /hideStepIf function if we are just returning a property? > Tools/BuildSlaveSupport/ews-build/steps.py:780 > + will we always have first/second/clean run results? (do we have to run the second/clean set if everything in the first run passes?) > Tools/BuildSlaveSupport/ews-build/steps.py:788 > + return set([failure.get('name') for failure in result.get("Timedout", [])] + discuss: Timedout vs TimedOut vs TimeOut? do we use "Timedout" anywhere else? I think a simple case change is better for this string. > Tools/BuildSlaveSupport/ews-build/steps.py:836 > + self._addToLog('stderr', "ERROR: step not found: {}".format(step)) we are mixing single quotes and double quotes in this function... is that on purpose? looks like this is common across all the addToLog calls. > Tools/BuildSlaveSupport/ews-build/steps.py:847 > + if log['type'] == 's': can you remind me what the possible values are for 'type' ?

Aakash Jain

Comment 11 2019-03-20 14:53:16 PDT

Comment on attachment 365387 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=365387&action=review >> Tools/BuildSlaveSupport/ews-build/steps.py:741 >> + return rc > > you have a space on line 723 before the return. but omit it here. either is fine, just make them the same. Will fix it. >> Tools/BuildSlaveSupport/ews-build/steps.py:751 >> + return not self.doStepIf(step) > > I find this confusing. do we really need the doStepIf /hideStepIf function if we are just returning a property? This class runs api tests (without patch), it is skipped if that property is not set. >> Tools/BuildSlaveSupport/ews-build/steps.py:780 >> + > > will we always have first/second/clean run results? (do we have to run the second/clean set if everything in the first run passes?) Yes we will always have all three results, because this class AnalyzeAPITestsResults is called only if there was a API test retry and there were failures in the retry. Notice "addStepsAfterCurrentStep" in ReRunAPITests. If everything in first run passes, we do not retry, we do not use 'addStepsAfterCurrentStep' and this class is never invoked. >> Tools/BuildSlaveSupport/ews-build/steps.py:788 >> + return set([failure.get('name') for failure in result.get("Timedout", [])] + > > discuss: Timedout vs TimedOut vs TimeOut? > > do we use "Timedout" anywhere else? I think a simple case change is better for this string. 'Timedout' is outputted by run-api-tests itself https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/api_tests/manager.py#L196 >> Tools/BuildSlaveSupport/ews-build/steps.py:836 >> + self._addToLog('stderr', "ERROR: step not found: {}".format(step)) > > we are mixing single quotes and double quotes in this function... is that on purpose? > looks like this is common across all the addToLog calls. Will fix the quotes. >> Tools/BuildSlaveSupport/ews-build/steps.py:847 >> + if log['type'] == 's': > > can you remind me what the possible values are for 'type' ? # 's' = stdio, 't' = text, 'h' = html, 'd' = deleted From https://github.com/buildbot/buildbot/blob/master/master/buildbot/db/model.py#L196

Aakash Jain

Comment 12 2019-03-21 12:51:17 PDT

Created attachment 365597 [details] Updated patch Incorporated review comments.

EWS Watchlist

Comment 13 2019-03-21 12:53:34 PDT Comment hidden (obsolete)

Attachment 365597 [details] did not pass style-queue: ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:714: [RunAPITests.evaluateCommand] Use of super on an old style class [pylint/E1002] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:831: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:838: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] Total errors found: 3 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.

WebKit Commit Bot

Comment 14 2019-03-21 16:12:04 PDT

Comment on attachment 365597 [details] Updated patch Clearing flags on attachment: 365597 Committed r243342: <https://trac.webkit.org/changeset/243342>

WebKit Commit Bot

Comment 15 2019-03-21 16:12:06 PDT

All reviewed patches have been landed. Closing bug.

Radar WebKit Bug Importer

Comment 16 2019-03-21 16:14:03 PDT

<rdar://problem/49129320>

Note You need to log in before you can comment on or make changes to this bug.