Bug 29223 - run-webkit-tests has a timeout value that is too low
Summary: run-webkit-tests has a timeout value that is too low
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P2 Normal
Assignee: Andrew Wilson
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-09-12 10:14 PDT by Andrew Wilson
Modified: 2009-09-12 20:37 PDT (History)
3 users (show)

See Also:


Attachments
changed test timeout to 20 secs to avoid conflict with DRT timeout (1.09 KB, patch)
2009-09-12 10:17 PDT, Andrew Wilson
mrowe: review+
commit-queue: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Wilson 2009-09-12 10:14:57 PDT
There are two timeout mechanisms when running layout tests - one is enforced in DumpRenderTree, so that if a script calls waitUntilDone() but does not call notifyDone() within a given timeout period, DRT will flush the current output and fail the test.

run-webkit-tests also has a timeout value. If a test does not generate any output within a given timeout period, run-webkit-tests kills the test and discards any output other than stderr.

run-webkit-tests and DumpRenderTree both have the same timeout period currently (15 seconds) - since the run-webkit-tests timer is started first (before DumpRenderTree is called), it's virtually guaranteed to fire first, which means that timed out tests are almost impossible to debug because there's no test output.

We need to bump up the run-webkit-tests timeout to something like 17 seconds, to give DRT a chance to gracefully terminate the test.
Comment 1 Andrew Wilson 2009-09-12 10:17:37 PDT
Created attachment 39516 [details]
changed test timeout to 20 secs to avoid conflict with DRT timeout
Comment 2 Eric Seidel (no email) 2009-09-12 11:24:08 PDT
This was recently changed... and someone even brought up the "isnt' there a race with them being the same" problem.  I can't remember what the bug number was though.  We could probably figure out from the svn logs.
Comment 3 Andrew Wilson 2009-09-12 12:44:20 PDT
Mark, I see from the ChangeLog that you bumped up the DRT timeouts to match the 15 seconds in run-webkit-tests.

Looks like the original patch (https://trac.webkit.org/changeset/42012/trunk/WebKitTools/Scripts/run-webkit-tests) that updated run-webkit-tests timeouts explicitly kept the timeout in run-webkit-tests greater than the timeouts in DRT to address the same problem that my patch is re-addressing.

Seems like run-webkit-tests timeout needs to be greater than DRT's timeout. Any reason not to make this change (what were you trying to accomplish by having the timeouts be identical)?
Comment 4 Mark Rowe (bdash) 2009-09-12 13:04:35 PDT
Why make it 20 rather than 15.5 or 16?

The fundamental problem here is that we're using two separate timers here for the same purpose:  we want to detect that a test is taking to long.  When we've detected that it is we want to determine why it is taking too long: is it because DRT is hung, if it is responsive and waiting for waitUntilDone to be called.  If it's responsive we want it to give up and dump the current state.  We can do this by removing DRTs timeout completely and having the run-webkit-tests timeout ping DRT in some manner to notify it that it has timed out and to give up.  If DRT is responsive it can do this and respond to run-webkit-tests.  If not, run-webkit-tests knows that DRT is hung and can kill it.
Comment 5 Andrew Wilson 2009-09-12 14:10:37 PDT
(In reply to comment #4)
> Why make it 20 rather than 15.5 or 16?

Because it may take more than .5 or 1.0 seconds for DRT to start up, load the page, and execute page script. 5 seconds is probably too conservative, but typically the run-webkit-tests timeout will not get triggered (it's only triggered if DRT is unresponsive and has to be killed) so there's not much gain in being less conservative.

> 
> The fundamental problem here is that we're using two separate timers here for
> the same purpose:  we want to detect that a test is taking to long.  

I'd disagree. The DRT timer is for detecting when a given test case is failing to call notifyDone (for example, if there's a typo in the test so it's generating a JS exception, or the condition that should trigger the end of the test is not happening).

The timer in run-webkit-tests is for detecting when DRT is unresponsive. 

These are completely separate cases.

> When we've
> detected that it is we want to determine why it is taking too long: is it
> because DRT is hung, if it is responsive and waiting for waitUntilDone to be
> called.  If it's responsive we want it to give up and dump the current state. 
> We can do this by removing DRTs timeout completely and having the
> run-webkit-tests timeout ping DRT in some manner to notify it that it has timed
> out and to give up.  If DRT is responsive it can do this and respond to
> run-webkit-tests.  If not, run-webkit-tests knows that DRT is hung and can kill
> it.

So, I actually made that change (send SIGHUP to DRT) for Mac/Qt, then reverted it. The reasons I reverted it were:

1) You can only signal the process on the Mac/Qt platforms. On Windows (and others?) you can't.
2) It's a far more invasive change. I can get this *exact* behavior with a one-line change to run-webkit-tests. There's no end advantage to stripping out the timeouts in DRT.
Comment 6 Mark Rowe (bdash) 2009-09-12 14:56:21 PDT
(In reply to comment #5)
> (In reply to comment #4)
> > Why make it 20 rather than 15.5 or 16?
> 
> Because it may take more than .5 or 1.0 seconds for DRT to start up, load the
> page, and execute page script. 5 seconds is probably too conservative

Five seconds is probably too conservative for normal tests, but when tests are run under valgrind or GuardMalloc it may not be conservative enough.

> , but typically the run-webkit-tests timeout will not get triggered (it's only
> triggered if DRT is unresponsive and has to be killed) so there's not much gain
> in being less conservative.

It's also triggered on Mac OS X when tests are crashing, so increasing this timeout will substantially increase the run time when tests are crashing.

> So, I actually made that change (send SIGHUP to DRT) for Mac/Qt, then reverted
> it. The reasons I reverted it were:
> 
> 1) You can only signal the process on the Mac/Qt platforms. On Windows (and
> others?) you can't.

Sending a signal isn't what I was proposing. There are plenty of other ways to achieve this with varying levels of portability and complexity.  Writing a message to DRTs stdin is one that comes to mind.

> 2) It's a far more invasive change. I can get this *exact* behavior with a
> one-line change to run-webkit-tests.

This is a ridiculous statement.  It's obvious that you don't get "this *exact* behavior" with your proposed change.  The run-webkit-tests timeout is higher meaning that some situations (eg, the not particularly uncommon case of DRT crashing) are much slower.  It also doesn't have the advantages that removing the DRT timeout has.

> There's no end advantage to stripping out the timeouts in DRT.

There are compelling reasons to remove the timeout from DRT. One is alluded to above.
*) The right duration for the timeout depends on the environment.  When running under GuardMalloc or valgrind code can execute substantially slower.  The timeout in run-webkit-tests already compensates for this.
*) When running the tests on a less powerful device the tests may take longer.  This is much easier to accommodate in run-webkit-tests than it is inside DRT itself.
*) When running DRT under a debugger it is easy to trip the timeout by pausing too long at a breakpoint.

These are just a few that came to mind.  There are probably others.
Comment 7 Andrew Wilson 2009-09-12 15:18:01 PDT
There are indeed some arguments for someone doing a more comprehensive overhaul of the DRT/run-webkit-test timeout interactions.

I would like to suggest that as an interim measure, we increase the run-webkit-test timeout (if you don't like 20 seconds, then I can make it 17, or 16, or whatever). This does not preclude someone from making a more comprehensive change, and would let us debug layout tests that are timing out in the meantime.

My patch is a significant improvement on the status quo (currently, you can't debug timed out tests because no output is generated), and can be submitted immediately to enable us to investigate bugs like #29090 - we should not let the perfect become the enemy of the good.
Comment 8 Mark Rowe (bdash) 2009-09-12 15:31:33 PDT
(In reply to comment #7)
> There are indeed some arguments for someone doing a more comprehensive overhaul
> of the DRT/run-webkit-test timeout interactions.
> 
> I would like to suggest that as an interim measure, we increase the
> run-webkit-test timeout (if you don't like 20 seconds, then I can make it 17,
> or 16, or whatever). 

That seems reasonable to me if there is a bug tracking the follow-up improvements.
Comment 9 Mark Rowe (bdash) 2009-09-12 15:32:56 PDT
Comment on attachment 39516 [details]
changed test timeout to 20 secs to avoid conflict with DRT timeout

r=me
Comment 10 WebKit Commit Bot 2009-09-12 18:27:13 PDT
Comment on attachment 39516 [details]
changed test timeout to 20 secs to avoid conflict with DRT timeout

Rejecting patch 39516 from commit-queue.

This patch will require manual commit. Patch https://bugs.webkit.org/attachment.cgi?id=39516 from bug 29223 failed to download and apply.
Comment 11 Andrew Wilson 2009-09-12 20:33:16 PDT
Committed as r48335.
Comment 12 Andrew Wilson 2009-09-12 20:37:48 PDT
(In reply to comment #8)
> That seems reasonable to me if there is a bug tracking the follow-up
> improvements.

Logged as https://bugs.webkit.org/show_bug.cgi?id=29229