39203 – [DRT/Chromium] Increase the time out value

RESOLVED FIXED 39203

[DRT/Chromium] Increase the time out value

https://bugs.webkit.org/show_bug.cgi?id=39203

Summary [DRT/Chromium] Increase the time out value

Kent Tamura

Reported 2010-05-17 00:48:44 PDT

[DRT/Chromium] Increase the time out value

Attachments
Proposed patch (1.65 KB, patch) 2010-05-17 00:53 PDT, Kent Tamura	no flags	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Kent Tamura

Comment 1 2010-05-17 00:53:08 PDT

Created attachment 56218 [details] Proposed patch

Tony Chang

Comment 2 2010-05-17 01:01:50 PDT

Ojan would be a good reviewer since he did a lot of work on getting the timeout as low as possible in TestShell. I'm a little confused. I thought when we hit the timeout, we first call notifyDone, and if that fails, we exit(0) the process (which looks like a crash). Is the call to notifyDone failing?

Kent Tamura

Comment 3 2010-05-17 02:16:52 PDT

(In reply to comment #2) > Ojan would be a good reviewer since he did a lot of work on getting the timeout as low as possible in TestShell. > > I'm a little confused. I thought when we hit the timeout, we first call notifyDone, and if that fails, we exit(0) the process (which looks like a crash). Is the call to notifyDone failing? I think DRT/Chromium works as the same as other platforms DRTs. The problem is that, if a DRT process exits before new-run-webkit-tests detects timeout, new-run-webkit-tests assumes the DRT process crashed. (webkitpy/layout_tests/port/server_process.py _read()) The current DRT/Chromium timeout value is 10 sec and new-run-webkit-tests timeout value is 6 sec. I'm not sure why new-run-webkit-tests can't detect timeout correctly. Anyway, making DRT timeout value longer solves the problem.

David Levin

Comment 4 2010-05-17 09:46:38 PDT

It seems like the DRT timeout should be longer than the script timeout (just like how you mentioned it was 10 seconds vs 6 seconds for test_shell and the script).

Eric Seidel (no email)

Comment 5 2010-05-20 01:03:26 PDT

Comment on attachment 56218 [details] Proposed patch OK.

Kent Tamura

Comment 6 2010-05-20 08:46:17 PDT

Comment on attachment 56218 [details] Proposed patch Clearing flags on attachment: 56218 Committed r59839: <http://trac.webkit.org/changeset/59839>

Kent Tamura

Comment 7 2010-05-20 08:46:27 PDT

All reviewed patches have been landed. Closing bug.

WebKit Review Bot

Comment 8 2010-05-20 09:48:08 PDT

http://trac.webkit.org/changeset/59839 might have broken Tiger Intel Release The following changes are on the blame list: http://trac.webkit.org/changeset/59840 http://trac.webkit.org/changeset/59839

Ojan Vafai

Comment 9 2010-05-20 14:25:18 PDT

Dirk is way more familiar with the current state of this code than I am. (In reply to comment #4) > It seems like the DRT timeout should be longer than the script timeout (just like how you mentioned it was 10 seconds vs 6 seconds for test_shell and the script). I actually think it's the other way around. If DRT times out, then it can dump whatever results it has so far and then load the next test. If the script times out, all it can do is kill DRT. new-run-webkit-tests passes the per-test timeout to test_shell when it's running the test, so that value isn't even used for new-run-webkit-tests. (In reply to comment #3) > (In reply to comment #2) > > I'm a little confused. I thought when we hit the timeout, we first call notifyDone, and if that fails, we exit(0) the process (which looks like a crash). Is the call to notifyDone failing? > > The problem is that, if a DRT process exits before new-run-webkit-tests detects timeout, new-run-webkit-tests assumes the DRT process crashed. If this is true, it's a regression or it's a difference between DRT and test_shell. There are two timeout cases: 1. The test times out due to logic error (e.g., forgot to call notifyDone). In this case, DRT can just dump #TIMEOUT and load the next test. 2. The test times out because DRT is hung. In this case, DRT needs to be killed. test_shell has code to also dump #TIMEOUT here in the relevant signal handlers. Does DRT not? #TIMEOUT lets new-run-webkit-tests know that the test timed out instead of crashing, even if DRT/test_shell disapears. Anyways, this patch is fine. We should match Mac DRT. But, I think the underlying bug is still there. Also, FWIW, the conclusion last time that we talked about timeouts on webkit-dev was that all tests would get a long (e.g. 30 seconds) timeout except for tests that are expected to timeout, which will get a much shorter timeout (e.g. a couple seconds). That way, we don't need to manually mark tests as SLOW and timeout tests don't make the test cycle time insanely large.

Dirk Pranke

Comment 10 2010-05-20 16:16:32 PDT

(In reply to comment #9) > Dirk is way more familiar with the current state of this code than I am. > > (In reply to comment #4) > > It seems like the DRT timeout should be longer than the script timeout (just like how you mentioned it was 10 seconds vs 6 seconds for test_shell and the script). > > I actually think it's the other way around. If DRT times out, then it can dump whatever results it has so far and then load the next test. If the script times out, all it can do is kill DRT. > > new-run-webkit-tests passes the per-test timeout to test_shell when it's running the test, so that value isn't even used for new-run-webkit-tests. > The script logic actually varies by port. The WebKit Mac port of new-run-webkit-tests (and I think the Chromium DRT port) use non-blocking I/O and a timeout to enforce timeouts in the script. This way the script can abort before DRT does, and DRT doesn't have to have a concept of a timeout. The Chromium ports (all platforms) pass the timeout to TestShell, and rely on Test Shell to enforce the timeout. At one point there was code in some configurations of new-run-webkit-tests to try and catch timeouts, but not in the regular code path. We do things this way because there is no way to do a non-blocking read using select on Windows from Python. I think (but am not positive) that you can do this in (cygwin?) Perl on Windows, and so WebKit Win doesn't have this problem. So, DRT and test_shell are definitely different. Separately, I am in the middle of trying to change new-run-webkit-tests to detect wedged threads and recover, in order to work around the Python multi-threading issues we're seeing. So it *may* be feasible to change test_shell to match DRT here, but I can't say so conclusively yet.

Kent Tamura

Comment 11 2010-05-20 23:08:30 PDT

(In reply to comment #10) > We do things this way because there is no way to do a non-blocking read using select on Windows from Python. I think (but am not positive) that you can do this in (cygwin?) Perl on Windows, and so WebKit Win doesn't have this problem. Do you mean we have no ways to make new-run-webkit-tests work with Chromium DRT for Windows?

Dirk Pranke

Comment 12 2010-05-21 09:55:14 PDT

(In reply to comment #11) > (In reply to comment #10) > > We do things this way because there is no way to do a non-blocking read using select on Windows from Python. I think (but am not positive) that you can do this in (cygwin?) Perl on Windows, and so WebKit Win doesn't have this problem. > > Do you mean we have no ways to make new-run-webkit-tests work with Chromium DRT for Windows? Well, that's the question. We may have to run it under cygwin. I am looking into other options.

Ojan Vafai

Comment 13 2010-05-21 11:58:38 PDT

(In reply to comment #10) > The WebKit Mac port of new-run-webkit-tests (and I think the Chromium DRT port) use non-blocking I/O and a timeout to enforce timeouts in the script. This way the script can abort before DRT does, and DRT doesn't have to have a concept of a timeout. > > The Chromium ports (all platforms) pass the timeout to TestShell, and rely on Test Shell to enforce the timeout. At one point there was code in some configurations of new-run-webkit-tests to try and catch timeouts, but not in the regular code path. > > We do things this way because there is no way to do a non-blocking read using select on Windows from Python. I think (but am not positive) that you can do this in (cygwin?) Perl on Windows, and so WebKit Win doesn't have this problem. > > So, DRT and test_shell are definitely different. > > Separately, I am in the middle of trying to change new-run-webkit-tests to detect wedged threads and recover, in order to work around the Python multi-threading issues we're seeing. So it *may* be feasible to change test_shell to match DRT here, but I can't say so conclusively yet. I don't quite get all this, but, we should do our best to to maintain the following: For tests that timeout, but don't hang DRT, we first dump the render dump. That way, for timing out tests, we still spit put a -actual.txt file where we can see what the state of the render tree was when it timed out. This is very helpful in debugging timeouts, especially ones that are hard to reproduce. If it's run-webkit-tests that handles the timeout, there's no way for DRT to dump the timeout information. All I'm suggesting really is that both DRT and run-webkit-tests should enforce timeouts. run-webkit-tests's timeout should just be a couple seconds longer than DRTs.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Other

OS All

Product WebKit

Component Tools / Tests

Assignee

Nobody

Reported

2010-05-17 00:48 PDT

Modified

2010-05-21 11:58 PDT History

CC List

8 users Show

URL

Keywords

Depends on

Blocks

38023

Dependencies

tree graph