Bug 61061

Summary: TestFailures page blames arbitrary revisions for breaking flaky tests
Product: WebKit Reporter: Adam Roben (:aroben) <aroben>
Component: Tools / TestsAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: dbates, joepeck
Priority: P2 Keywords: InRadar, ToolsHitList
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
URL: http://build.webkit.org/TestFailures/
Bug Depends on:    
Bug Blocks: 61062    
Attachments:
Description Flags
Teach TestFailures to detect possibly flaky tests and list them separately dbates: review+

Description Adam Roben (:aroben) 2011-05-18 08:45:57 PDT
The TestFailures page blames arbitrary revisions for breaking flaky tests. It would be a lot more helpful if it would instead identify the tests as flaky.
Comment 1 Adam Roben (:aroben) 2011-05-18 08:46:02 PDT
<rdar://problem/9452796>
Comment 2 Adam Roben (:aroben) 2011-06-29 09:17:12 PDT
Created attachment 99100 [details]
Teach TestFailures to detect possibly flaky tests and list them separately
Comment 3 Jessie Berlin 2011-06-29 11:09:23 PDT
Comment on attachment 99100 [details]
Teach TestFailures to detect possibly flaky tests and list them separately

Unofficial r=me! Excited to see this land.
Comment 4 Daniel Bates 2011-06-29 12:44:56 PDT
Comment on attachment 99100 [details]
Teach TestFailures to detect possibly flaky tests and list them separately

View in context: https://bugs.webkit.org/attachment.cgi?id=99100&action=review

This looks sane to me.

> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:36
>       * Preiodically calls callback until all current failures have been explained. Callback is

Preiodically => Periodically

(Just thought to mention the misspelling even though this text isn't part of the actual changes)

> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:66
>       * Each build contains just the failures that a) are still occuring on the bots, and b) were new

occuring => occurring

(Just thought to mention the misspelling even though this text isn't part of the actual changes)

> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:133
> +            const minimumRequiredTestRunsWithoutInterestingChanges = 5;

Note: IE doesn't support const (*). I take it we don't expect to access this web app from IE.

(*) The decision against const support in IE9 was remarked in <http://blogs.msdn.com/b/ie/archive/2010/08/25/chakra-interoperability-means-more-than-just-standards.aspx>.
Comment 5 Joseph Pecoraro 2011-06-29 12:49:34 PDT
View in context: https://bugs.webkit.org/attachment.cgi?id=99100&action=review

> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/FlakyLayoutTestDetector.js:35
> +            return false;

Since the expected output of this function is an Array, it might make more sense to return null (still falsey).

> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:108
> +            var newFlakyTests = self._flakinessDetector.incorporateTestResults(nextBuildName, tests, tooManyFailures);
> +            if (newFlakyTests.length) {

If there were tooManyFailures, then this would return "false" (or maybe "null") in the future and the .length check would through an error. This should probably become:

    if (newFlakyTests && newFlakyTests.length) { ... }

> Tools/ChangeLog:31
> +        We tell our caller to keep calling until all current failures have been explained and we've
> +        gone through 5 builds without any new flaky tests being identified.

How far back does this normally go? Just 5-10 builds? 100 builds? Is there a limit on how far back this looks or could it go back to the start (if things were flaky).
Comment 6 Adam Roben (:aroben) 2011-06-29 13:09:24 PDT
(In reply to comment #5)
> View in context: https://bugs.webkit.org/attachment.cgi?id=99100&action=review
> 
> > Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/FlakyLayoutTestDetector.js:35
> > +            return false;
> 
> Since the expected output of this function is an Array, it might make more sense to return null (still falsey).
> 
> > Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:108
> > +            var newFlakyTests = self._flakinessDetector.incorporateTestResults(nextBuildName, tests, tooManyFailures);
> > +            if (newFlakyTests.length) {
> 
> If there were tooManyFailures, then this would return "false" (or maybe "null") in the future and the .length check would through an error. This should probably become:
> 
>     if (newFlakyTests && newFlakyTests.length) { ... }

I think I'll change to returning an empty array in this case. The false is leftover from an earlier version of this function that just returned a boolean.

> > Tools/ChangeLog:31
> > +        We tell our caller to keep calling until all current failures have been explained and we've
> > +        gone through 5 builds without any new flaky tests being identified.
> 
> How far back does this normally go? Just 5-10 builds? 100 builds? Is there a limit on how far back this looks or could it go back to the start (if things were flaky).

It could go all the way back to the start. In practice it only goes back fewer than 20 builds in most cases.
Comment 7 Adam Roben (:aroben) 2011-06-29 14:49:48 PDT
Comment on attachment 99100 [details]
Teach TestFailures to detect possibly flaky tests and list them separately

View in context: https://bugs.webkit.org/attachment.cgi?id=99100&action=review

>> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:36
>>       * Preiodically calls callback until all current failures have been explained. Callback is
> 
> Preiodically => Periodically
> 
> (Just thought to mention the misspelling even though this text isn't part of the actual changes)

I'll fix this.

>> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:66
>>       * Each build contains just the failures that a) are still occuring on the bots, and b) were new
> 
> occuring => occurring
> 
> (Just thought to mention the misspelling even though this text isn't part of the actual changes)

I'll fix this.

>> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:133
>> +            const minimumRequiredTestRunsWithoutInterestingChanges = 5;
> 
> Note: IE doesn't support const (*). I take it we don't expect to access this web app from IE.
> 
> (*) The decision against const support in IE9 was remarked in <http://blogs.msdn.com/b/ie/archive/2010/08/25/chakra-interoperability-means-more-than-just-standards.aspx>.

Correct.
Comment 8 Adam Roben (:aroben) 2011-06-29 15:25:25 PDT
Committed r90054: <http://trac.webkit.org/changeset/90054>