WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
203698
[EWS] Layout testers can go in an infinite RETRY loop when there are 30+ failures on trunk
https://bugs.webkit.org/show_bug.cgi?id=203698
Summary
[EWS] Layout testers can go in an infinite RETRY loop when there are 30+ fail...
Aakash Jain
Reported
2019-10-31 14:28:23 PDT
Layout testers can go in an infinite RETRY loop when there are 30+ build failures on trunk. 30 is the layout-test failure limit we use in ews (--exit-after-n-failures 30). In case there are 30+ failures on trunk, layout-tests, re-run-layout-tests, and run-layout-tests-without-patch will all fail. EWS will simply retry the build, since in case of so many failures, EWS can't reliably tell if the patch introduce any new failures or not. However, the retried build will re-use the already built archive (built on a builder queue) and test using the same revision (on which builder queue built the patch). It will not build again with ToT. (This was done to save the building time) However, since EWS will keep retrying the build on same revision, it will keep failing. Even if the failures are fixed in ToT later, the tester queue will keep retrying the same patch on a given revision, and keep failing, and wouldn't be able to automatically recover (until the patch becomes obsolete, r- or bug is closed). For example, all of the following are retry of same patch on same revision:
https://ews-build.webkit.org/#/builders/24/builds/3152
https://ews-build.webkit.org/#/builders/24/builds/3162
https://ews-build.webkit.org/#/builders/24/builds/3168
https://ews-build.webkit.org/#/builders/24/builds/3174
https://ews-build.webkit.org/#/builders/24/builds/3181
https://ews-build.webkit.org/#/builders/24/builds/3189
https://ews-build.webkit.org/#/builders/24/builds/3195
Attachments
Patch
(13.45 KB, patch)
2019-11-22 15:23 PST
,
Aakash Jain
no flags
Details
Formatted Diff
Diff
Patch for landing
(44.37 KB, patch)
2020-09-14 09:42 PDT
,
Aakash Jain
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Aakash Jain
Comment 1
2019-11-12 06:41:38 PST
Few more examples from mac-debug-wk1 queue:
https://ews-build.webkit.org/#/builders/16/builds/7137
https://ews-build.webkit.org/#/builders/16/builds/7142
https://ews-build.webkit.org/#/builders/16/builds/7143
https://ews-build.webkit.org/#/builders/16/builds/7144
There were 30+ crashes on trunk yesterday due to
https://bugs.webkit.org/show_bug.cgi?id=201908
(e.g.:
https://build.webkit.org/builders/Apple%20High%20Sierra%20Debug%20WK1%20%28Tests%29/builds/11748
)
Aakash Jain
Comment 2
2019-11-22 15:23:42 PST
Created
attachment 384207
[details]
Patch
Jonathan Bedard
Comment 3
2019-11-22 17:44:26 PST
Comment on
attachment 384207
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=384207&action=review
> Tools/BuildSlaveSupport/ews-build/loadConfig.py:142 > + if not isTriggerUsedByAnyBuilder(config, scheduler['name']) and 'build' not in scheduler['name'].lower():
Is this saying that testers cannot trigger other testers?
> Tools/BuildSlaveSupport/ews-build/steps.py:494 > + self.include_revision = include_revision
What is 'revision' for EWS? The git hash?
> Tools/BuildSlaveSupport/ews-build/steps.py:1412 > + self.finished(SUCCESS)
Won't this turn the bubble green for a bit? Also, we still have an infinite loop problem is someone broke the build, right?
Aakash Jain
Comment 4
2019-11-22 18:06:00 PST
Comment on
attachment 384207
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=384207&action=review
>> Tools/BuildSlaveSupport/ews-build/loadConfig.py:142 >> + if not isTriggerUsedByAnyBuilder(config, scheduler['name']) and 'build' not in scheduler['name'].lower(): > > Is this saying that testers cannot trigger other testers?
This was testing that we do not have any unused triggers in config.json. But now it's possible to have a valid case of triggers which no-one in config.json will use directly (since these triggers would be triggered dynamically). Effectively yes, we don't have such a scenario (testers triggering other testers) yet.
>> Tools/BuildSlaveSupport/ews-build/steps.py:494 >> + self.include_revision = include_revision > > What is 'revision' for EWS? The git hash?
yeah, it's git hash
>> Tools/BuildSlaveSupport/ews-build/steps.py:1412 >> + self.finished(SUCCESS) > > Won't this turn the bubble green for a bit? > > Also, we still have an infinite loop problem is someone broke the build, right?
This will turn only the step green. Build would be red/FAILED because of previous layout-test steps failing.
Radar WebKit Bug Importer
Comment 5
2020-07-27 05:29:43 PDT
<
rdar://problem/66157544
>
Jonathan Bedard
Comment 6
2020-09-04 11:30:50 PDT
Comment on
attachment 384207
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=384207&action=review
> Tools/BuildSlaveSupport/ews-build/config.json:349 > + "triggered_by": ["ios-13-sim-build-ews"],
This makes sense to me, but I don't understand why this isn't already here, since ios-simulator-13 has been triggered by ios-13-sim-build-ews for months.
Aakash Jain
Comment 7
2020-09-04 11:40:55 PDT
Comment on
attachment 384207
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=384207&action=review
>> Tools/BuildSlaveSupport/ews-build/config.json:349 >> + "triggered_by": ["ios-13-sim-build-ews"], > > This makes sense to me, but I don't understand why this isn't already here, since ios-simulator-13 has been triggered by ios-13-sim-build-ews for months.
This 'triggered_by' key is used just for passing information. I wasn't able to dynamically extract this information (about who triggered current build) using buildbot. So I am passing this information in a static manner. This information is used by tester build, to trigger back the builder. Yes, iOS-13-Simulator-WK2-Tests-EWS is triggered by iOS-13-Simulator-Build-EWS using the 'triggers' key.
Jonathan Bedard
Comment 8
2020-09-04 12:43:13 PDT
Comment on
attachment 384207
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=384207&action=review
>>> Tools/BuildSlaveSupport/ews-build/config.json:349 >>> + "triggered_by": ["ios-13-sim-build-ews"], >> >> This makes sense to me, but I don't understand why this isn't already here, since ios-simulator-13 has been triggered by ios-13-sim-build-ews for months. > > This 'triggered_by' key is used just for passing information. I wasn't able to dynamically extract this information (about who triggered current build) using buildbot. So I am passing this information in a static manner. This information is used by tester build, to trigger back the builder. > > Yes, iOS-13-Simulator-WK2-Tests-EWS is triggered by iOS-13-Simulator-Build-EWS using the 'triggers' key.
That's unfortunate, but does answer my question!
Aakash Jain
Comment 9
2020-09-14 09:42:24 PDT
Created
attachment 408716
[details]
Patch for landing
Aakash Jain
Comment 10
2020-09-14 09:48:26 PDT
Comment on
attachment 408716
[details]
Patch for landing View in context:
https://bugs.webkit.org/attachment.cgi?id=408716&action=review
Did some more testing, and updated patch accordingly. Also re-based it on ToT.
> Tools/BuildSlaveSupport/ews-app/ews/views/statusbubble.py:156 > + bubble['state'] = 'provisional-fail'
Updated status-bubble code here to ensure that the status-bubble is orange when the build is waiting to be retried.
> Tools/BuildSlaveSupport/ews-build/factories_unittest.py:50 > def test_generic_factory(self):
Updated factories_unittest.py accordingly.
> Tools/BuildSlaveSupport/ews-build/steps.py:2186 > + self.setProperty('build_summary', message)
Added this line to ensure that the buildbot build summary is set properly.
Aakash Jain
Comment 11
2020-09-14 09:52:27 PDT
Sample run: Builder build:
https://ews-build.webkit-uat.org/#/builders/14/builds/1594
Corresponding tester build:
https://ews-build.webkit-uat.org/#/builders/16/builds/122
, it correctly triggered back the builder in step #30. Re-triggered builder build:
https://ews-build.webkit-uat.org/#/builders/14/builds/1607
, it correctly updated to ToT and triggered tester with ToT.
Aakash Jain
Comment 12
2020-09-14 09:55:33 PDT
Corresponding status-bubble for ios-wk2 is correctly orange and hover-over message is also correct in
https://ews.webkit-uat.org/status-bubble/407969/
EWS
Comment 13
2020-09-14 10:47:38 PDT
Committed
r267022
: <
https://trac.webkit.org/changeset/267022
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 408716
[details]
.
Aakash Jain
Comment 14
2020-09-18 10:14:04 PDT
This change seems to be working fine. e.g.:
https://ews-build.webkit.org/#/builders/24/builds/26185
failed with 30+ failures. It correctly re-trigerred the parent builder in:
https://ews-build.webkit.org/#/builders/23/builds/27400
. This correctly triggered the tester again with ToT in
https://ews-build.webkit.org/#/builders/24/builds/26223
, which passed.
David Kilzer (:ddkilzer)
Comment 15
2020-09-22 10:03:46 PDT
(In reply to Aakash Jain from
comment #14
)
> This change seems to be working fine. > > e.g.:
https://ews-build.webkit.org/#/builders/24/builds/26185
failed with > 30+ failures. It correctly re-trigerred the parent builder in: >
https://ews-build.webkit.org/#/builders/23/builds/27400
. This correctly > triggered the tester again with ToT in >
https://ews-build.webkit.org/#/builders/24/builds/26223
, which passed.
Awesome! Thanks Aakash!
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug