Bug 73843

Summary:

nrwt exits early too frequently

Product:

WebKit

Reporter:

Ryosuke Niwa <rniwa>

Component:

Tools / Tests

Assignee:

Dirk Pranke <dpranke>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

abarth, aroben, dpranke, eric, jberlin, lforschler, ojan, ossy, simon.fraser, webkit.review.bot

Priority:

Version:

528+ (Nightly build)

Hardware:

Unspecified

OS:

Unspecified

Bug Depends on:

Bug Blocks:

88680

Attachments:

Description	Flags
Patch	none
Patch	none
change 0 to None	none

Ryosuke Niwa

Reported 2011-12-05 10:41:04 PST

Exiting early after a mere 20+ tests timed out or crashed seems too disruptive. This has happened very frequently on Mac port for example: http://build.webkit.org/builders/SnowLeopard%20Intel%20Release%20%28Tests%29?numbuilds=100 We probably need to tolerate more tests to time out / crash.

Attachments
Patch (1.37 KB, patch) 2011-12-05 10:47 PST, Ryosuke Niwa	no flags	Details Formatted Diff Diff
Patch (2.86 KB, patch) 2012-06-08 17:52 PDT, Dirk Pranke	no flags	Details Formatted Diff Diff
change 0 to None (2.86 KB, patch) 2012-06-08 18:14 PDT, Dirk Pranke	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.

Ryosuke Niwa

Comment 1 2011-12-05 10:47:33 PST

Created attachment 117899 [details] Patch

Adam Roben (:aroben)

Comment 2 2011-12-05 11:02:07 PST

Part of the point of this policy was to avoid slowing down the bots too much when some low-level crash is introduced in WebKit. In many cases, all 20 tests will have crashed for the same reason, and any other tests that would have been run would also have crashed for the same reason, so running more tests doesn't give more diagnostic information. And since we have to wait for Crash Reporter for each crashed test, running more tests can significantly slow down the bots.

Eric Seidel (no email)

Comment 3 2011-12-05 11:03:57 PST

I don't think you want to change this. At least not unless you fix https://bugs.webkit.org/show_bug.cgi?id=37797. But I don't think we want that behavior on the bots anyway.

Ryosuke Niwa

Comment 4 2011-12-05 11:05:00 PST

(In reply to comment #2) > Part of the point of this policy was to avoid slowing down the bots too much when some low-level crash is introduced in WebKit. In many cases, all 20 tests will have crashed for the same reason, and any other tests that would have been run would also have crashed for the same reason, so running more tests doesn't give more diagnostic information. And since we have to wait for Crash Reporter for each crashed test, running more tests can significantly slow down the bots. Right. However, in the world where we have multiple worker processes, 20 might be unrealistically low since many tests that get paged out temporarily will end up taking much longer time to run. This problem might go away once you add more RAM to Mac bots. For example, I haven't had this issue on my MacPro (has 12GB of RAM).

Eric Seidel (no email)

Comment 5 2011-12-05 11:07:04 PST

It is possible we should re-evaluate this for the multi-proesss world.

Dirk Pranke

Comment 6 2011-12-05 12:22:17 PST

I don't think we need to re-evaluate this. I think we need to fix the bugs that are causing these timeouts. For comparison, we don't have these issues on the chromium bots, and no one has found the exit-early behavior to be a problem.

Ojan Vafai

Comment 7 2012-06-08 13:23:08 PDT

(In reply to comment #6) > I don't think we need to re-evaluate this. I think we need to fix the bugs that are causing these timeouts. If we can fix the underlying issues, that's clearly the best resolution. > For comparison, we don't have these issues on the chromium bots, and no one has found the exit-early behavior to be a problem. Is this just because we have a lot of tests listed as expected to timeout/crash flakily so they don't contribute to the count?

Dirk Pranke

Comment 8 2012-06-08 17:52:30 PDT

Created attachment 146669 [details] Patch

Dirk Pranke

Comment 9 2012-06-08 17:53:03 PDT

rniwa / ojan, please take a look?

Dirk Pranke

Comment 10 2012-06-08 18:14:53 PDT

Created attachment 146673 [details] change 0 to None

Dirk Pranke

Comment 11 2012-06-08 18:26:24 PDT

Comment on attachment 146673 [details] change 0 to None setting cq- because we should update the chromium bots to pass these flags first.

Ojan Vafai

Comment 12 2012-06-08 19:42:15 PDT

Comment on attachment 146673 [details] change 0 to None I'd prefer a test showing that None does what we think it does.

Dirk Pranke

Comment 13 2012-06-09 20:17:37 PDT

(In reply to comment #12) > (From update of attachment 146673 [details]) > I'd prefer a test showing that None does what we think it does. I'm not entirely sure what you're looking for? It's the default argument, so the code path is being executed. However, I'm not sure how we would demonstrate that we never exit early? Given that there are many tests that include failures, is that sufficient? E.g., run_webkit_tests_integrationtest.MainTest.test_all shows that we run all of the tests and that N of them fail ...

Ojan Vafai

Comment 14 2012-06-10 09:09:11 PDT

(In reply to comment #13) > (In reply to comment #12) > > (From update of attachment 146673 [details] [details]) > > I'd prefer a test showing that None does what we think it does. > > I'm not entirely sure what you're looking for? It's the default argument, so the code path is being executed. However, I'm not sure how we would demonstrate that we never exit early? Given that there are many tests that include failures, is that sufficient? E.g., run_webkit_tests_integrationtest.MainTest.test_all shows that we run all of the tests and that N of them fail ... You're right. I didn't think it through very well.

WebKit Review Bot

Comment 15 2012-06-11 14:18:15 PDT

Comment on attachment 146673 [details] change 0 to None Clearing flags on attachment: 146673 Committed r120007: <http://trac.webkit.org/changeset/120007>

WebKit Review Bot

Comment 16 2012-06-11 14:18:21 PDT

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.