Bug 61061

Summary:

TestFailures page blames arbitrary revisions for breaking flaky tests

Product:

WebKit

Reporter:

Adam Roben (:aroben) <aroben>

Component:

Tools / Tests

Assignee:

Nobody <webkit-unassigned>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

dbates, joepeck

Priority:

Keywords:

InRadar, ToolsHitList

Version:

528+ (Nightly build)

Hardware:

All

OS:

All

URL:

http://build.webkit.org/TestFailures/

Bug Depends on:

Bug Blocks:

61062

Attachments:

Description	Flags
Teach TestFailures to detect possibly flaky tests and list them separately	dbates: review+

Adam Roben (:aroben)

Reported 2011-05-18 08:45:57 PDT

The TestFailures page blames arbitrary revisions for breaking flaky tests. It would be a lot more helpful if it would instead identify the tests as flaky.

Attachments
Teach TestFailures to detect possibly flaky tests and list them separately (21.22 KB, patch) 2011-06-29 09:17 PDT, Adam Roben (:aroben)	dbates: review+	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Adam Roben (:aroben)

Comment 1 2011-05-18 08:46:02 PDT

<rdar://problem/9452796>

Adam Roben (:aroben)

Comment 2 2011-06-29 09:17:12 PDT

Created attachment 99100 [details] Teach TestFailures to detect possibly flaky tests and list them separately

Jessie Berlin

Comment 3 2011-06-29 11:09:23 PDT

Comment on attachment 99100 [details] Teach TestFailures to detect possibly flaky tests and list them separately Unofficial r=me! Excited to see this land.

Daniel Bates

Comment 4 2011-06-29 12:44:56 PDT

Comment on attachment 99100 [details] Teach TestFailures to detect possibly flaky tests and list them separately View in context: https://bugs.webkit.org/attachment.cgi?id=99100&action=review This looks sane to me. > Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:36 > * Preiodically calls callback until all current failures have been explained. Callback is Preiodically => Periodically (Just thought to mention the misspelling even though this text isn't part of the actual changes) > Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:66 > * Each build contains just the failures that a) are still occuring on the bots, and b) were new occuring => occurring (Just thought to mention the misspelling even though this text isn't part of the actual changes) > Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:133 > + const minimumRequiredTestRunsWithoutInterestingChanges = 5; Note: IE doesn't support const (*). I take it we don't expect to access this web app from IE. (*) The decision against const support in IE9 was remarked in <http://blogs.msdn.com/b/ie/archive/2010/08/25/chakra-interoperability-means-more-than-just-standards.aspx>.

Joseph Pecoraro

Comment 5 2011-06-29 12:49:34 PDT

View in context: https://bugs.webkit.org/attachment.cgi?id=99100&action=review > Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/FlakyLayoutTestDetector.js:35 > + return false; Since the expected output of this function is an Array, it might make more sense to return null (still falsey). > Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:108 > + var newFlakyTests = self._flakinessDetector.incorporateTestResults(nextBuildName, tests, tooManyFailures); > + if (newFlakyTests.length) { If there were tooManyFailures, then this would return "false" (or maybe "null") in the future and the .length check would through an error. This should probably become: if (newFlakyTests && newFlakyTests.length) { ... } > Tools/ChangeLog:31 > + We tell our caller to keep calling until all current failures have been explained and we've > + gone through 5 builds without any new flaky tests being identified. How far back does this normally go? Just 5-10 builds? 100 builds? Is there a limit on how far back this looks or could it go back to the start (if things were flaky).

Adam Roben (:aroben)

Comment 6 2011-06-29 13:09:24 PDT

(In reply to comment #5) > View in context: https://bugs.webkit.org/attachment.cgi?id=99100&action=review > > > Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/FlakyLayoutTestDetector.js:35 > > + return false; > > Since the expected output of this function is an Array, it might make more sense to return null (still falsey). > > > Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:108 > > + var newFlakyTests = self._flakinessDetector.incorporateTestResults(nextBuildName, tests, tooManyFailures); > > + if (newFlakyTests.length) { > > If there were tooManyFailures, then this would return "false" (or maybe "null") in the future and the .length check would through an error. This should probably become: > > if (newFlakyTests && newFlakyTests.length) { ... } I think I'll change to returning an empty array in this case. The false is leftover from an earlier version of this function that just returned a boolean. > > Tools/ChangeLog:31 > > + We tell our caller to keep calling until all current failures have been explained and we've > > + gone through 5 builds without any new flaky tests being identified. > > How far back does this normally go? Just 5-10 builds? 100 builds? Is there a limit on how far back this looks or could it go back to the start (if things were flaky). It could go all the way back to the start. In practice it only goes back fewer than 20 builds in most cases.

Adam Roben (:aroben)

Comment 7 2011-06-29 14:49:48 PDT

Comment on attachment 99100 [details] Teach TestFailures to detect possibly flaky tests and list them separately View in context: https://bugs.webkit.org/attachment.cgi?id=99100&action=review >> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:36 >> * Preiodically calls callback until all current failures have been explained. Callback is > > Preiodically => Periodically > > (Just thought to mention the misspelling even though this text isn't part of the actual changes) I'll fix this. >> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:66 >> * Each build contains just the failures that a) are still occuring on the bots, and b) were new > > occuring => occurring > > (Just thought to mention the misspelling even though this text isn't part of the actual changes) I'll fix this. >> Tools/BuildSlaveSupport/build.webkit.org-config/public_html/TestFailures/LayoutTestHistoryAnalyzer.js:133 >> + const minimumRequiredTestRunsWithoutInterestingChanges = 5; > > Note: IE doesn't support const (*). I take it we don't expect to access this web app from IE. > > (*) The decision against const support in IE9 was remarked in <http://blogs.msdn.com/b/ie/archive/2010/08/25/chakra-interoperability-means-more-than-just-standards.aspx>. Correct.

Adam Roben (:aroben)

Comment 8 2011-06-29 15:25:25 PDT

Committed r90054: <http://trac.webkit.org/changeset/90054>

Note You need to log in before you can comment on or make changes to this bug.