Bug 215944 - ews might mark build as successful if tests fail to run
Summary: ews might mark build as successful if tests fail to run
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: Other
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Aakash Jain
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-08-28 12:41 PDT by Aakash Jain
Modified: 2021-08-30 03:35 PDT
CC List: 5 users

See Also:


Attachments
Patch (3.38 KB, patch)
2020-09-04 04:28 PDT, Aakash Jain
no flags

Description Aakash Jain 2020-08-28 12:41:14 PDT
EWS might mark a build as successful under certain conditions if tests fail to run due to an infrastructure issue (e.g. tests hanging). For example, in https://ews-build.webkit.org/#/builders/24/builds/24143 the tests got stuck and were killed by buildbot after 20 minutes, yet the build was marked successful.

Note that if the tests fail with exit code 254, we already have logic to retry the build. However, if the tests get stuck and are killed by buildbot after 20 minutes, our current logic doesn't handle that well.
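
The distinction described above can be sketched roughly as follows. This is only an illustration, not the actual steps.py code: `RunOutcome`, `classify_test_run`, and the `timed_out` flag are hypothetical names, and the assumption that a killed step reports no usable exit code is mine. Only exit code 254 and the "retry instead of reporting success" behavior come from this bug.

```python
from enum import Enum


class RunOutcome(Enum):
    """Hypothetical outcomes for a finished test step."""
    SUCCESS = 'success'
    FAILURE = 'failure'
    RETRY = 'retry'


# Exit code that, per the description, already triggers a retry.
RETRY_EXIT_CODE = 254


def classify_test_run(exit_code, timed_out=False):
    """Decide what to do with a finished test step (illustrative only).

    exit_code: process exit code, or None if the step was killed
               (assumption: a killed step has no usable exit code).
    timed_out: True if the step was killed after exceeding its time limit.
    """
    # A hung and killed test run says nothing about the patch itself,
    # so it must never be reported as a success.
    if timed_out or exit_code is None:
        return RunOutcome.RETRY
    if exit_code == RETRY_EXIT_CODE:
        return RunOutcome.RETRY
    if exit_code == 0:
        return RunOutcome.SUCCESS
    return RunOutcome.FAILURE
```

The point of the sketch is the ordering: the timeout check comes before any check that could yield a success result.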
Comment 1 Aakash Jain 2020-09-04 04:28:33 PDT
Created attachment 407956
Patch
Comment 2 Jonathan Bedard 2020-09-04 08:56:27 PDT
Comment on attachment 407956
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=407956&action=review

> Tools/BuildSlaveSupport/ews-build/steps.py:2219
> +            return self.retry_build('Unexpected infrastructure issue, retrying build')

Won't this put us into an infinite retry loop? I recall there being a bug about getting out of an infinite retry loop by updating the revision we're testing with.....have we landed that yet?
Comment 3 Aakash Jain 2020-09-04 09:29:41 PDT
(In reply to Jonathan Bedard from comment #2)
> Won't this put us into an infinite retry loop? I recall there being a bug about getting out of an infinite retry loop by updating the revision we're testing with.....have we landed that yet?
You are right. Depending on what caused the issue, it might create an infinite retry loop (e.g. a bad commit causing layout tests to hang would create an infinite retry loop, whereas a bot in a bad state causing tests to hang would cause retries only until the bot is fixed).

Bug 203698, which has the fix to automatically get out of retries, hasn't landed yet (it needs some more testing for edge cases). That patch would update the retry_build() method, which is used by this patch. So when we land that patch, the fix will apply to this case as well.

We can wait to land this patch until after Bug 203698, or we can land it now. In theory this patch is still an improvement, since without it EWS might show misleading green status-bubbles, which is probably worse than no results.
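
For illustration only, one generic way to keep a retry path from looping forever is to cap the number of attempts per patch. This is not the approach in Bug 203698 (comment 2 recalls that one as updating the revision being tested), and it is not how retry_build() works; `RetryTracker`, `should_retry`, and `MAX_RETRIES` are hypothetical names chosen for this sketch.

```python
MAX_RETRIES = 3  # assumed cap, not a value from EWS


class RetryTracker:
    """Hypothetical helper that counts retries per patch and caps them."""

    def __init__(self, max_retries=MAX_RETRIES):
        self.max_retries = max_retries
        self._attempts = {}  # patch id -> retries already granted

    def should_retry(self, patch_id):
        """Return True and record the retry if the patch is under the cap."""
        attempts = self._attempts.get(patch_id, 0)
        if attempts >= self.max_retries:
            # Cap reached: the caller should report a failure instead of
            # scheduling yet another build.
            return False
        self._attempts[patch_id] = attempts + 1
        return True
```

With a cap like this, a bad commit that makes tests hang would stop retrying after a fixed number of builds, while a temporarily broken bot would still get a few chances to recover.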
Comment 4 Jonathan Bedard 2020-09-04 09:51:08 PDT
(In reply to Aakash Jain from comment #3)
> (In reply to Jonathan Bedard from comment #2)
> > Won't this put us into an infinite retry loop? I recall there being a bug about getting out of an infinite retry loop by updating the revision we're testing with.....have we landed that yet?
> You are right. Depending on what caused the issue, it might create an
> infinite retry loop (e.g. a bad commit causing layout tests to hang would
> create an infinite retry loop, whereas a bot in a bad state causing tests
> to hang would cause retries only until the bot is fixed).
> 
> Bug 203698, which has the fix to automatically get out of retries, hasn't
> landed yet (it needs some more testing for edge cases). That patch would
> update the retry_build() method, which is used by this patch. So when we
> land that patch, the fix will apply to this case as well.
> 
> We can wait to land this patch until after Bug 203698, or we can land it
> now. In theory this patch is still an improvement, since without it EWS
> might show misleading green status-bubbles, which is probably worse than
> no results.

How far are we from getting bug 203698 landed? I'm ok landing this one now, but only if we get bug 203698 landed in the next week. It seems like this change will make the case covered by bug 203698 more common.
Comment 5 Aakash Jain 2020-09-04 11:08:04 PDT
(In reply to Jonathan Bedard from comment #4)
> How far are we from getting bug 203698 landed? I'm ok landing this one now, but only if we get bug 203698 landed in the next week. It seems like this change will make the case covered by bug 203698 more common.
I think next week is reasonable for Bug 203698. Can you please review that patch again as well?
Comment 6 Radar WebKit Bug Importer 2020-09-04 12:42:14 PDT
<rdar://problem/68360846>
Comment 7 EWS 2020-09-09 14:30:40 PDT
Committed r266799: <https://trac.webkit.org/changeset/266799>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 407956.