Bug 182085

Summary: Bubbles intermittently disappear from bot watchers dashboard
Product: WebKit Reporter: Aakash Jain <aakash_jain>
Component: Tools / TestsAssignee: Aakash Jain <aakash_jain>
Status: RESOLVED FIXED    
Severity: Normal CC: aakash_jain, ap, commit-queue, lforschler, ryanhaddad, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: Other   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Proposed patch
none
Alternative patch none

Description Aakash Jain 2018-01-24 20:51:22 PST
See <rdar://problem/36848949>.

[Error] Assertion Failed
	failureLogURL (BuildbotIteration.js:165)
	appendBuilderQueueStatus (BuildbotBuilderQueueView.js:80)
	(anonymous function) (BuildbotQueueView.js:320)
	(anonymous function)
	forEach
	appendBuildStyle (BuildbotQueueView.js:312)
	update (BuildbotBuilderQueueView.js:116)
Comment 1 Aakash Jain 2018-01-24 20:55:30 PST
Created attachment 332228 [details]
Proposed patch
Comment 2 Alexey Proskuryakov 2018-01-25 13:29:25 PST
Comment on attachment 332228 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=332228&action=review

> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/dashboard/Scripts/BuildbotIteration.js:-165
> -        console.assert(this._firstFailedStep);

This confuses me. How can there be a failure when no steps failed?
Comment 3 Aakash Jain 2018-01-25 13:46:30 PST
> This confuses me. How can there be a failure when no steps failed?

Theoretically it can happen if:
1)the failed step is hidden (see https://trac.webkit.org/browser/webkit/trunk/Tools/BuildSlaveSupport/build.webkit.org-config/public_html/dashboard/Scripts/BuildbotIteration.js#L292)

2)the api call to fetch steps information returned unexpected data. Buildbot 0.8 has single API call while in Buildbot 0.9, steps information is fetched in separate API call.

I had the same confusion and couldn't reproduce the issue. The assertion indicate that there was atleast one such case. We could wait and try to reproduce the issue more reliably and figure out when does such a case happens.

This issue is intermittent and happens only with server running Buildbot 0.9, and anyways in Buildbot 0.9 we return buildPageURLForIteration() which doesn't depend on step information.
Comment 4 Aakash Jain 2018-01-25 17:47:01 PST
I added some logs and reproduced the issue and was able to find a case when overall result (this._results) was failure, while this._firstFailedStep was null. See attached screenshot in <rdar://36848949>.

It has to do with fetching the build info and build steps asynchronously (in Buildbot 0.9) and happens while fetching info from ongoing build (which is about to finish).

The order of events which can cause can be this:
1: fetched steps info (no failed step)
2: one of the build step failed (overall build status is set to failed)
3: fetched build info (found the build status as failed)
Comment 5 Alexey Proskuryakov 2018-01-26 10:07:13 PST
In this situation, does the fix fully correct the user experience, or are we just trading a bad bug for a slightly less bad one? Is it feasible to fully fix the behavior?
Comment 6 Aakash Jain 2018-01-31 15:55:54 PST
Created attachment 332816 [details]
Alternative patch

Simpler patch. It fixes the bug and doesn't change any functionality at all (i.e. returns the same URL which was returned before this code change, for both Buildbot 0.8 and 0.9).

The if (!this._firstFailedStep.logs) statement was added in https://trac.webkit.org/changeset/220120/webkit while adding Buildbot 0.9 support, and relies on the fact that Buildbot 0.9 api doesn't have logs inside _firstFailedStep, and so is always true for Buildbot 0.9.

We can also try to improve this method and construct the URL for the stdio. That would be a more complex fix (making sure this._firstFailedStep is set properly whenever this.failed is true), I am not sure if the additional benefit is worth investing the time in.
Comment 7 Alexey Proskuryakov 2018-01-31 17:39:48 PST
Comment on attachment 332816 [details]
Alternative patch

View in context: https://bugs.webkit.org/attachment.cgi?id=332816&action=review

> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/dashboard/Scripts/BuildbotIteration.js:-167
> -        if (!this._firstFailedStep.logs)

Did you find out why we had this check? I'm not sure if removing it is safe, it was probably added to fix some assertion observed in practice.
Comment 8 Aakash Jain 2018-01-31 18:18:47 PST
> > Tools/BuildSlaveSupport/build.webkit.org-config/public_html/dashboard/Scripts/BuildbotIteration.js:-167
> > -        if (!this._firstFailedStep.logs)
> 
> Did you find out why we had this check? I'm not sure if removing it is safe,
> it was probably added to fix some assertion observed in practice.

Yes, I mentioned it in comment 6 above. Removing it is safe. We never needed this in Buildbot 0.8. I added this line while adding support for Buildbot 0.9.
Comment 9 WebKit Commit Bot 2018-02-01 09:30:30 PST
Comment on attachment 332816 [details]
Alternative patch

Clearing flags on attachment: 332816

Committed r227971: <https://trac.webkit.org/changeset/227971>
Comment 10 Radar WebKit Bug Importer 2018-02-01 10:36:53 PST
<rdar://problem/37121278>