RESOLVED FIXED 202424
[ews] Add API endpoint to retry failed builds for a patch
https://bugs.webkit.org/show_bug.cgi?id=202424
Summary [ews] Add API endpoint to retry failed builds for a patch
Aakash Jain
Reported 2019-10-01 12:37:57 PDT
Add API endpoint to retry failed builds for a patch. This would allow us to implement the functionality required in Bug 196599: 'EWS should have a way to retry a patch'
Attachments
Patch (7.05 KB, patch)
2019-10-01 13:27 PDT, Aakash Jain
no flags
Patch (7.08 KB, patch)
2019-10-02 11:37 PDT, Aakash Jain
jbedard: review+
Aakash Jain
Comment 1 2019-10-01 13:27:54 PDT
Jonathan Bedard
Comment 2 2019-10-01 14:18:43 PDT
Comment on attachment 379942 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379942&action=review > Tools/BuildSlaveSupport/ews-app/ews/views/retrypatch.py:48 > + except: Can we be be more specific about the kind of errors we're catching here? At the moment, we're catching everything. > Tools/BuildSlaveSupport/ews-app/ews/views/retrypatch.py:51 > + builds_to_retry = StatusBubble().find_failed_builds_for_patch(patch_id) I understand the logic here, I just question it's wisdom. We're automatically retrying only the builds which failed, not the successful ones. Particularly with layout tests, I'm not sure this is a good idea....imagine a case where someone lands a patch which fixes your patch on iOS, but breaks it on MacOS. > Tools/BuildSlaveSupport/ews-app/ews/views/retrypatch.py:58 > + rc = Buildbot.retry_build(build.builder_id, build.number) I don't think rc is very useful here. > Tools/BuildSlaveSupport/ews-app/ews/views/statusbubble.py:212 > + def find_failed_builds_for_patch(self, patch_id): What are our plans to test this function? I feel like we should be able to test this function now.
Aakash Jain
Comment 3 2019-10-02 05:14:25 PDT
Comment on attachment 379942 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379942&action=review >> Tools/BuildSlaveSupport/ews-app/ews/views/retrypatch.py:48 >> + except: > > Can we be be more specific about the kind of errors we're catching here? At the moment, we're catching everything. This is to catch any errors in converting patch_id to integer. It can be ValueError(when patch_id is not a integer), or TypeError (when patch_id is missing). I agree that it's good to catch specific exceptions. However, in this case being generic also seems fine, since if we can't get the patch_id for any reason, there is no point in continuing or even crashing. >> Tools/BuildSlaveSupport/ews-app/ews/views/retrypatch.py:51 >> + builds_to_retry = StatusBubble().find_failed_builds_for_patch(patch_id) > > I understand the logic here, I just question it's wisdom. We're automatically retrying only the builds which failed, not the successful ones. Particularly with layout tests, I'm not sure this is a good idea....imagine a case where someone lands a patch which fixes your patch on iOS, but breaks it on MacOS. This one is not an automatic retry. This is a manual retry, done by pressing a button by engineers. (See #4 in https://lists.webkit.org/pipermail/webkit-dev/2019-September/030804.html) Note that EWS is run on every patch for every configuration by default. If some EWS bubbles are red for a given patch, and any engineer believes that EWS is incorrect (maybe because of flaky tests, infrastructure issue, patch being part of patch-series which was processed too soon etc.), then he/she can press the 'Retry failed builds' button to retry those builds. A new uploaded patch (e.g.: to fix previous failures) is always processed on every EWS queue by default. >> Tools/BuildSlaveSupport/ews-app/ews/views/retrypatch.py:58 >> + rc = Buildbot.retry_build(build.builder_id, build.number) > > I don't think rc is very useful here. The other option I considered was following (but didn't find it as readable as this one). Let me know if you prefer this. if not Buildbot.retry_build(build.builder_id, build.number): >> Tools/BuildSlaveSupport/ews-app/ews/views/statusbubble.py:212 >> + def find_failed_builds_for_patch(self, patch_id): > > What are our plans to test this function? I feel like we should be able to test this function now. I tested in manually with various scenarios (using http://ews.webkit-uat.org/retry/). We don't have unit-tests for django app yet. I will add unit-tests separately in https://bugs.webkit.org/show_bug.cgi?id=202452
Jonathan Bedard
Comment 4 2019-10-02 07:38:14 PDT
Comment on attachment 379942 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=379942&action=review >>> Tools/BuildSlaveSupport/ews-app/ews/views/retrypatch.py:48 >>> + except: >> >> Can we be be more specific about the kind of errors we're catching here? At the moment, we're catching everything. > > This is to catch any errors in converting patch_id to integer. It can be ValueError(when patch_id is not a integer), or TypeError (when patch_id is missing). > > I agree that it's good to catch specific exceptions. However, in this case being generic also seems fine, since if we can't get the patch_id for any reason, there is no point in continuing or even crashing. I think the bigger problem is that since Python is interpreted, this will also catch syntax errors. In that case, the response will not be indicative of the actual error. >>> Tools/BuildSlaveSupport/ews-app/ews/views/retrypatch.py:51 >>> + builds_to_retry = StatusBubble().find_failed_builds_for_patch(patch_id) >> >> I understand the logic here, I just question it's wisdom. We're automatically retrying only the builds which failed, not the successful ones. Particularly with layout tests, I'm not sure this is a good idea....imagine a case where someone lands a patch which fixes your patch on iOS, but breaks it on MacOS. > > This one is not an automatic retry. This is a manual retry, done by pressing a button by engineers. (See #4 in https://lists.webkit.org/pipermail/webkit-dev/2019-September/030804.html) > > Note that EWS is run on every patch for every configuration by default. If some EWS bubbles are red for a given patch, and any engineer believes that EWS is incorrect (maybe because of flaky tests, infrastructure issue, patch being part of patch-series which was processed too soon etc.), then he/she can press the 'Retry failed builds' button to retry those builds. > > A new uploaded patch (e.g.: to fix previous failures) is always processed on every EWS queue by default. I'm ok with this as is, but we should keep an eye on it. I can see it potentially being a problem. >>> Tools/BuildSlaveSupport/ews-app/ews/views/retrypatch.py:58 >>> + rc = Buildbot.retry_build(build.builder_id, build.number) >> >> I don't think rc is very useful here. > > The other option I considered was following (but didn't find it as readable as this one). Let me know if you prefer this. > > if not Buildbot.retry_build(build.builder_id, build.number): I would prefer this other option, but not particularly strongly. (Also note that the WebKit Style guide would say to use 'not rc' instead of 'rc != True'
Aakash Jain
Comment 5 2019-10-02 11:37:29 PDT
Aakash Jain
Comment 6 2019-10-02 13:07:41 PDT
Radar WebKit Bug Importer
Comment 7 2019-10-02 13:08:19 PDT
Note You need to log in before you can comment on or make changes to this bug.