When failures are found in run-api-tests, we should retry the API test to ensure that the failures are not because of flaky test. Further, we should run the API tests without the patch as well, to identify whether the failures are pre-existing or new failures introduced by the patch.
Created attachment 365345 [details] WIP Sample runs: Handling pre-existing failure in trunk: https://ews-build.webkit-uat.org/#/builders/19/builds/2259 https://ews-build.webkit-uat.org/#/builders/19/builds/2253 Handling pre-existing failure in trunk as well as flaky tests: https://ews-build.webkit-uat.org/#/builders/19/builds/2256 https://ews-build.webkit-uat.org/#/builders/19/builds/2242 Flaky failure in first run with no failures in second run: https://ews-build.webkit-uat.org/#/builders/20/builds/2040 https://ews-build.webkit-uat.org/#/builders/20/builds/2039 No failures: https://ews-build.webkit-uat.org/#/builders/20/builds/2045 https://ews-build.webkit-uat.org/#/builders/20/builds/2041
Attachment 365345 [details] did not pass style-queue: ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:708: [RunAPITests.evaluateCommand] Use of super on an old style class [pylint/E1002] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:834: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:841: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:846: [AnalyzeAPITestsResults.getTestsResults] Undefined variable 'json' [pylint/E0602] [5] Total errors found: 4 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 365345 [details] WIP View in context: https://bugs.webkit.org/attachment.cgi?id=365345&action=review > Tools/BuildSlaveSupport/ews-build/steps.py:524 > + return self.getProperty('patchFailedToBuild') or self.getProperty('patchFailedAPITests') A bit confused why only API tests are here. Doesn't this logic apply to JSC tests too? And in the future, basically any test suite we're retrying that uses a compiled WebKit? > Tools/BuildSlaveSupport/ews-build/steps.py:759 > + d = self.getTestsResults(RunAPITests.name) d doesn't get used again? Should this be 'defer'? > Tools/BuildSlaveSupport/ews-build/steps.py:794 > + failures_with_patch = first_run_failures.intersection(second_run_failures) Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better.
Comment on attachment 365345 [details] WIP View in context: https://bugs.webkit.org/attachment.cgi?id=365345&action=review >> Tools/BuildSlaveSupport/ews-build/steps.py:524 >> + return self.getProperty('patchFailedToBuild') or self.getProperty('patchFailedAPITests') > > A bit confused why only API tests are here. Doesn't this logic apply to JSC tests too? And in the future, basically any test suite we're retrying that uses a compiled WebKit? You are right. For now, these are the only two code paths which are using CompileWebKitToT. I haven't completed the JSC build/test retry logic. >> Tools/BuildSlaveSupport/ews-build/steps.py:759 >> + d = self.getTestsResults(RunAPITests.name) > > d doesn't get used again? Should this be 'defer'? it's used in all the subsequent lines below 'd.addCallback'. >> Tools/BuildSlaveSupport/ews-build/steps.py:794 >> + failures_with_patch = first_run_failures.intersection(second_run_failures) > > Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better. Yeah, I believe that's what we would want here. If we do union, then in https://ews-build.webkit-uat.org/#/builders/19/builds/2256 EWS would claim 'TestWebKitAPI.ProcessSwap.NumberOfCachedProcesses' as a new failure introduced by the patch, which we don't want. (Note that NumberOfCachedProcesses is a flaky failure as noted in https://bugs.webkit.org/show_bug.cgi?id=195102) Also, this is similar to https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/tool/bot/patchanalysistask.py#L281
Sample run for catching API test failures: https://ews-build.webkit-uat.org/#/builders/19/builds/2260
(In reply to Aakash Jain from comment #4) > ... > > >> Tools/BuildSlaveSupport/ews-build/steps.py:794 > >> + failures_with_patch = first_run_failures.intersection(second_run_failures) > > > > Doesn't this throw out flakey failures? Do we really want to do that? It almost feels like union would be better. > > Yeah, I believe that's what we would want here. If we do union, then in > https://ews-build.webkit-uat.org/#/builders/19/builds/2256 EWS would claim > 'TestWebKitAPI.ProcessSwap.NumberOfCachedProcesses' as a new failure > introduced by the patch, which we don't want. (Note that > NumberOfCachedProcesses is a flaky failure as noted in > https://bugs.webkit.org/show_bug.cgi?id=195102) > > Also, this is similar to > https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/tool/bot/ > patchanalysistask.py#L281 But what about the case were the patch introduces a NEW flakey failure? We would pass EWS, in that case. Maybe the answer is that we don't have a good way to catch this without talking to a results database.
> But what about the case were the patch introduces a NEW flakey failure? We > would pass EWS, in that case. > > Maybe the answer is that we don't have a good way to catch this without > talking to a results database. Agreed. It's hard to conclusively detect a flaky test added by the patch (unless we run API test on that patch lot of times). Results database (about API tests history) would be a good way to catch flaky failures.
Created attachment 365387 [details] Proposed patch
Attachment 365387 [details] did not pass style-queue: ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:714: [RunAPITests.evaluateCommand] Use of super on an old style class [pylint/E1002] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:839: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:846: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] Total errors found: 3 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 365387 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=365387&action=review > Tools/BuildSlaveSupport/ews-build/steps.py:741 > + return rc you have a space on line 723 before the return. but omit it here. either is fine, just make them the same. > Tools/BuildSlaveSupport/ews-build/steps.py:751 > + return not self.doStepIf(step) I find this confusing. do we really need the doStepIf /hideStepIf function if we are just returning a property? > Tools/BuildSlaveSupport/ews-build/steps.py:780 > + will we always have first/second/clean run results? (do we have to run the second/clean set if everything in the first run passes?) > Tools/BuildSlaveSupport/ews-build/steps.py:788 > + return set([failure.get('name') for failure in result.get("Timedout", [])] + discuss: Timedout vs TimedOut vs TimeOut? do we use "Timedout" anywhere else? I think a simple case change is better for this string. > Tools/BuildSlaveSupport/ews-build/steps.py:836 > + self._addToLog('stderr', "ERROR: step not found: {}".format(step)) we are mixing single quotes and double quotes in this function... is that on purpose? looks like this is common across all the addToLog calls. > Tools/BuildSlaveSupport/ews-build/steps.py:847 > + if log['type'] == 's': can you remind me what the possible values are for 'type' ?
Comment on attachment 365387 [details] Proposed patch View in context: https://bugs.webkit.org/attachment.cgi?id=365387&action=review >> Tools/BuildSlaveSupport/ews-build/steps.py:741 >> + return rc > > you have a space on line 723 before the return. but omit it here. either is fine, just make them the same. Will fix it. >> Tools/BuildSlaveSupport/ews-build/steps.py:751 >> + return not self.doStepIf(step) > > I find this confusing. do we really need the doStepIf /hideStepIf function if we are just returning a property? This class runs api tests (without patch), it is skipped if that property is not set. >> Tools/BuildSlaveSupport/ews-build/steps.py:780 >> + > > will we always have first/second/clean run results? (do we have to run the second/clean set if everything in the first run passes?) Yes we will always have all three results, because this class AnalyzeAPITestsResults is called only if there was a API test retry and there were failures in the retry. Notice "addStepsAfterCurrentStep" in ReRunAPITests. If everything in first run passes, we do not retry, we do not use 'addStepsAfterCurrentStep' and this class is never invoked. >> Tools/BuildSlaveSupport/ews-build/steps.py:788 >> + return set([failure.get('name') for failure in result.get("Timedout", [])] + > > discuss: Timedout vs TimedOut vs TimeOut? > > do we use "Timedout" anywhere else? I think a simple case change is better for this string. 'Timedout' is outputted by run-api-tests itself https://trac.webkit.org/browser/webkit/trunk/Tools/Scripts/webkitpy/api_tests/manager.py#L196 >> Tools/BuildSlaveSupport/ews-build/steps.py:836 >> + self._addToLog('stderr', "ERROR: step not found: {}".format(step)) > > we are mixing single quotes and double quotes in this function... is that on purpose? > looks like this is common across all the addToLog calls. Will fix the quotes. >> Tools/BuildSlaveSupport/ews-build/steps.py:847 >> + if log['type'] == 's': > > can you remind me what the possible values are for 'type' ? # 's' = stdio, 't' = text, 'h' = html, 'd' = deleted From https://github.com/buildbot/buildbot/blob/master/master/buildbot/db/model.py#L196
Created attachment 365597 [details] Updated patch Incorporated review comments.
Attachment 365597 [details] did not pass style-queue: ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:714: [RunAPITests.evaluateCommand] Use of super on an old style class [pylint/E1002] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:831: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] ERROR: Tools/BuildSlaveSupport/ews-build/steps.py:838: [AnalyzeAPITestsResults.getTestsResults] Instance of 'AnalyzeAPITestsResults' has no 'master' member [pylint/E1101] [5] Total errors found: 3 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 365597 [details] Updated patch Clearing flags on attachment: 365597 Committed r243342: <https://trac.webkit.org/changeset/243342>
All reviewed patches have been landed. Closing bug.
<rdar://problem/49129320>