138217 – PatchAnalysisTask._test_patch() should have more tests that examine cases where the failure limit is exceeded.

NEW138217

PatchAnalysisTask._test_patch() should have more tests that examine cases where the failure limit is exceeded.

https://bugs.webkit.org/show_bug.cgi?id=138217

Summary PatchAnalysisTask._test_patch() should have more tests that examine cases whe...

Jake Nielsen

Reported 2014-10-30 11:41:33 PDT

commitqueuetask_unittest.py tests this code path. Add more tests there.

Attachments
Adds more test cases, and fixes a bug. (10.44 KB, patch) 2014-10-30 13:32 PDT, Jake Nielsen	dbates: review- dbates: commit-queue-	Details Formatted Diff Diff
Removes useless test (9.97 KB, patch) 2014-11-07 15:10 PST, Jake Nielsen	dbates: review- dbates: commit-queue-	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.

Jake Nielsen

Comment 1 2014-10-30 13:32:09 PDT

Created attachment 240691 [details] Adds more test cases, and fixes a bug.

Alexey Proskuryakov

Comment 2 2014-10-30 13:39:02 PDT

Comment on attachment 240691 [details] Adds more test cases, and fixes a bug. View in context: https://bugs.webkit.org/attachment.cgi?id=240691&action=review > Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:524 > + first_test_failures=[], > + second_test_failures=[], > + clean_test_failures=_lots_of_failing_tests) > + > + self._run_and_expect_patch_analysis_result(commit_queue, PatchAnalysisResult.PASS) We shouldn't be testing clean tests in this case, should we?

Jake Nielsen

Comment 3 2014-10-30 13:41:51 PDT

(In reply to comment #2) > Comment on attachment 240691 [details] > Adds more test cases, and fixes a bug. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=240691&action=review > > > Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:524 > > + first_test_failures=[], > > + second_test_failures=[], > > + clean_test_failures=_lots_of_failing_tests) > > + > > + self._run_and_expect_patch_analysis_result(commit_queue, PatchAnalysisResult.PASS) > > We shouldn't be testing clean tests in this case, should we? expect_clean_tests_to_run defaults to False, I can make it explicit if you want though.

Jake Nielsen

Comment 4 2014-10-31 11:42:53 PDT

bump

Jake Nielsen

Comment 5 2014-11-07 11:34:25 PST

double bump

Ryosuke Niwa

Comment 6 2014-11-07 13:53:48 PST

Comment on attachment 240691 [details] Adds more test cases, and fixes a bug. rs=me

Daniel Bates

Comment 7 2014-11-07 14:07:19 PST

Comment on attachment 240691 [details] Adds more test cases, and fixes a bug. View in context: https://bugs.webkit.org/attachment.cgi?id=240691&action=review >>> Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:524 >>> + self._run_and_expect_patch_analysis_result(commit_queue, PatchAnalysisResult.PASS) >> >> We shouldn't be testing clean tests in this case, should we? > > expect_clean_tests_to_run defaults to False, I can make it explicit if you want though. From my understanding, Alexey is remarking about the usefulness of this unit test given that we will neither build and run layout tests on the "patch" a second time (*) nor build and run layout tests without the "patch" (since there were no test failures when we built and tested the "patch" on the first try); => only the value of the first argument, first_test_failures, is important for the purpose of this test; => the name of this function is a misnomer. One way to resolve this issue is to rename this function to better describe its purpose, maybe test_patch_passes_on_first_try? You may also want to consider adding a comment and/or pick the empty list for the value of argument clean_test_failures to imply to the reader the arbitrariness of the value of the arguments second_test_failures and clean_test_failures. On another note, I'm unclear how beneficial this test case is given that we have the test case CommitQueueTaskTest.test_success_case(). (*) You commented on this behavior on line 510 of this patch. Is this comment incorrect?

Daniel Bates

Comment 8 2014-11-07 14:18:14 PST

Comment on attachment 240691 [details] Adds more test cases, and fixes a bug. View in context: https://bugs.webkit.org/attachment.cgi?id=240691&action=review > Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:532 > + # Unfortunately there are cases where the clean build will randomly fail enough tests to hit the failure limit. Can you provide a concrete example? I know we may not know an exact cause, but do we have an idea what the cause is when the a clean build randomly fails enough tests to hit the failure limit? Do we have so many non-skipped flaky tests?

Jake Nielsen

Comment 9 2014-11-07 15:03:22 PST

(In reply to comment #7) > Comment on attachment 240691 [details] > Adds more test cases, and fixes a bug. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=240691&action=review > > >>> Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:524 > >>> + self._run_and_expect_patch_analysis_result(commit_queue, PatchAnalysisResult.PASS) > >> > >> We shouldn't be testing clean tests in this case, should we? > > > > expect_clean_tests_to_run defaults to False, I can make it explicit if you want though. > > From my understanding, Alexey is remarking about the usefulness of this unit > test given that we will neither build and run layout tests on the "patch" a > second time (*) nor build and run layout tests without the "patch" (since > there were no test failures when we built and tested the "patch" on the > first try); => only the value of the first argument, first_test_failures, is > important for the purpose of this test; => the name of this function is a > misnomer. One way to resolve this issue is to rename this function to better > describe its purpose, maybe test_patch_passes_on_first_try? You may also > want to consider adding a comment and/or pick the empty list for the value > of argument clean_test_failures to imply to the reader the arbitrariness of > the value of the arguments second_test_failures and clean_test_failures. > > On another note, I'm unclear how beneficial this test case is given that we > have the test case CommitQueueTaskTest.test_success_case(). That's a good point, I'll take it out. > > (*) You commented on this behavior on line 510 of this patch. Is this > comment incorrect? Yes, this comment is correct. The test case in question (test_tree_failure_limit_with_patch_that_fixes_redness) ensures the correctness of the comment, but as you pointed out, so does test_success_case.

Jake Nielsen

Comment 10 2014-11-07 15:05:53 PST

(In reply to comment #8) > Comment on attachment 240691 [details] > Adds more test cases, and fixes a bug. > > View in context: > https://bugs.webkit.org/attachment.cgi?id=240691&action=review > > > Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:532 > > + # Unfortunately there are cases where the clean build will randomly fail enough tests to hit the failure limit. > > Can you provide a concrete example? I know we may not know an exact cause, > but do we have an idea what the cause is when the a clean build randomly > fails enough tests to hit the failure limit? Do we have so many non-skipped > flaky tests? Alexey mentioned this case to me in person. As I recall there's a case where the browser can get resized and essentially any test that expects elements to end up at a specific location will fail. I may be misunderstanding/misremembering. Did I get that right Alexey?

Jake Nielsen

Comment 11 2014-11-07 15:10:20 PST

Created attachment 241211 [details] Removes useless test

Alexey Proskuryakov

Comment 12 2014-11-07 15:26:13 PST

It's not flaky tests, it's some flakiness in WebKitTestRunner, or WindowServer, or who know where. See e.g. <https://build.webkit.org/results/Apple%20Mavericks%20Debug%20WK1%20(Tests)/r175752%20(8770)/results.html> - that happens multiple times per week. People were looking into this, but didn't find the root cause yet.

Daniel Bates

Comment 13 2016-04-23 13:18:07 PDT

Comment on attachment 241211 [details] Removes useless test View in context: https://bugs.webkit.org/attachment.cgi?id=241211&action=review This patch is reasonable. It seems excessive to represent _lots_of_failing_tests with 100 failing tests. It seems sufficient to represents this with three failures and use the name of the variable to convey that it represents many test failures. If we make _lots_of_failing_tests be an array of three test names then we should update the other test cases such that we pass an array with less than three entries for {first, second, clean}_test_failures as appropriate for the test. > Tools/Scripts/webkitpy/tool/bot/commitqueuetask_unittest.py:518 > + def test_tree_failure_limit_with_patch_that_potentially_fixes_some_redness(self): How does this test case represent a patch that "potentially fixes some redness"? I mean, the patch fails the same tests in both the first and second tries. The result of the clean build represents an anomaly that we should not experience in theory, but seems to hit in practice according to you and Alexey. We should determine the root cause and fix this anomaly. I can understand the motivation for including a test for this anomaly. I do not see the benefit of this test case other than to serve as documentation that the anomaly can occur. This test case is logically not sound. If the purpose of this test is for documentation purposes then it would be better to document this anomaly in a bug. So that we can fix track it in a searchable manner and ultimately fix it. I do not see the benefit of validating in the webkitpy code that we can handle such a case because if a machine were to get into a state materially consistent with the state excised by this test case (maybe a cosmic ray hit it?) then the machine is messed up and I do not see how we can trust results from it, including results from the continued execution of the webkitpy code.

Daniel Bates

Comment 14 2016-04-23 13:23:29 PDT

(In reply to comment #13) > Comment on attachment 241211 [details] > Removes useless test > > View in context: > https://bugs.webkit.org/attachment.cgi?id=241211&action=review > > This patch is reasonable. [...] To clarify, I'm r-'ing this patch because this patch is several month old and I would like to see a rebased patch. I would also like to see us simplify this test based on remark in the first paragraph of comment #13 and I would like us to consider either renaming or removing the the test case test_tree_failure_limit_with_patch_that_potentially_fixes_some_redness().

Note You need to log in before you can comment on or make changes to this bug.

Status NEW

Resolution

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee

Srinivasan Vijayaraghavan

Reported

2014-10-30 11:41 PDT

Modified

2016-04-23 13:35 PDT History

CC List

8 users Show

URL

Keywords

Depends on

Blocks