Bug 139671

Summary: Sort out timeout implementations in DRT and WKTR
Product: WebKit Reporter: Alexey Proskuryakov <ap>
Component: Tools / TestsAssignee: Alexey Proskuryakov <ap>
Status: RESOLVED FIXED    
Severity: Normal CC: bfulgham, cgarcia, clopez, commit-queue, dbates, glenn, gyuyoung.kim, ossy, simon.fraser
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=139734
Bug Depends on: 139774, 139734, 139772    
Bug Blocks:    
Attachments:
Description Flags
proposed fix simon.fraser: review+

Description Alexey Proskuryakov 2014-12-15 21:44:37 PST
There are multiple timeout watchdogs for regression tests:

1. One in run-webkit-test. When this fires, the tool gets sampled, and restarted. No partial output is saved. This supports [ Slow ] modifiers.

2. Async tests have their timeouts detected by DumpRenderTree and WebKitTestRunner, as long as main thread returns to event loop. When this timeout occurs, the test is incorrectly reported as failing, not timing out. There is partial output and an error line for the timeout. [ Slow ] modifiers are not respected - this timeout is always 30 seconds.

3. In WebKitTestRunner, UI process detects a timeout and fails. There is no partial output saved, no sample, and the test is reported as failing.

4. There are timeouts for individual messages that UI process sends to WebProcess in WebKitTestRunner.

#3 is no better than #1, and can be eliminated.

#2 should support [ Slow ] modifiers, and should be reported as an actual timeout that it is.

WebKitTestRunner has a lot longer async test timeout than DumpRenderTree. I don't think that this is necessary or desirable.

iOS has a lot longer DumpRenderTree timeout. I don't think that this is necessary or desirable.

Gtk has a very strange timeout computation in WebKitTestRunner, see bug 138831 comment 10.

I'm going to fix all these issues here, except for the Gtk one, which I cannot test.
Comment 1 Alexey Proskuryakov 2014-12-15 21:51:31 PST
Also not fixing: Windows DumpRenderTree. This should be straightforward, but needs building and testing on a Windows machine.
Comment 2 Alexey Proskuryakov 2014-12-15 22:33:29 PST
Created attachment 243350 [details]
proposed fix
Comment 3 WebKit Commit Bot 2014-12-15 22:35:47 PST
Attachment 243350 [details] did not pass style-queue:


ERROR: Tools/Scripts/webkitpy/port/driver.py:367:  multiple statements on one line (semicolon)  [pep8/E702] [5]
Total errors found: 1 in 17 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 Simon Fraser (smfr) 2014-12-15 23:12:15 PST
Comment on attachment 243350 [details]
proposed fix

View in context: https://bugs.webkit.org/attachment.cgi?id=243350&action=review

> Tools/DumpRenderTree/TestRunner.h:437
> +    int m_timeout;

Would be nice to use a comment saying what units this is in, or use a double, or std::chrono something.

> Tools/WebKitTestRunner/InjectedBundle/mac/TestRunnerMac.mm:55
> +    CFTimeInterval interval = m_timeout ? m_timeout / 1000.0 : defaultWaitToDumpWatchdogTimerInterval;

Can't we just initialize m_timeout with defaultWaitToDumpWatchdogTimerInterval ?
Comment 5 Carlos Garcia Campos 2014-12-16 02:05:39 PST
Comment on attachment 243350 [details]
proposed fix

View in context: https://bugs.webkit.org/attachment.cgi?id=243350&action=review

> Tools/WebKitTestRunner/InjectedBundle/gtk/TestRunnerGtk.cpp:53
> +    // FIXME: Use m_timeout, and only fall back to default when it's zero.
>      m_waitToDumpWatchdogTimer.scheduleAfterDelay("[WTR] waitToDumpWatchdogTimerCallback", [this] { waitToDumpWatchdogTimerFired(); },
> -        std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::duration<double>(waitToDumpWatchdogTimerInterval)));
> +        std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::duration<double>(defaultWaitToDumpWatchdogTimerInterval)));

So, what units is m_timeout in? The conversion to microseconds is correct here for the default timeout that is in seconds as a double value. I guess m_timeout is in milliseconds, since you are doing m_timeout/1000 in other cases. In this case that we know the value will not be truncated when converting to milliseconds (test timeouts are not that short), we could use milliseconds.

std::chrono::milliseconds interval = m_timeout ? std::chrono::milliseconds(m_timeout) : std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::duration<double>(defaultWaitToDumpWatchdogTimerInterval));
m_waitToDumpWatchdogTimer.scheduleAfterDelay("[WTR] waitToDumpWatchdogTimerCallback", [this] { waitToDumpWatchdogTimerFired(); }, interval);

Everything would be easier if m_timeout was a std::chrono::milliseconds and initialized to the default as Simon proposes.
Comment 6 Alexey Proskuryakov 2014-12-16 10:12:44 PST
> Would be nice to use a comment saying what units this is in, or use a double, or std::chrono something.

True. There are several more m_timeout variables in DumpRenderTree and WebKitTestRunner though, so it's better to update the types in a separate patch.

> Can't we just initialize m_timeout with defaultWaitToDumpWatchdogTimerInterval ?

Good idea.
Comment 7 Alexey Proskuryakov 2014-12-16 10:31:54 PST
Committed <https://trac.webkit.org/r177363>.
Comment 9 Alexey Proskuryakov 2014-12-16 13:58:48 PST
Sorry about that.

It's not clear to me from the logs what broke. I'd love to help, but someone else will need to analyze what the problems are. Essentially, I made all the fixes blindly, and they worked on Mac from the first time, there wasn't much Mac specific here.

One thing that I noticed earlier about Gtk tests was that they tend to time out unexpectedly quickly. See e.g. <http://webkit-test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=fast%2Fworkers%2Fworker-close-more.html> - why do tests time out in 6 seconds? Perhaps that's somehow related.
Comment 10 Carlos Alberto Lopez Perez 2014-12-16 18:57:52 PST
(In reply to comment #9)
> Sorry about that.
> 
> It's not clear to me from the logs what broke. I'd love to help, but someone
> else will need to analyze what the problems are. Essentially, I made all the
> fixes blindly, and they worked on Mac from the first time, there wasn't much
> Mac specific here.
> 
> One thing that I noticed earlier about Gtk tests was that they tend to time
> out unexpectedly quickly. See e.g.
> <http://webkit-test-results.appspot.com/dashboards/flakiness_dashboard.
> html#tests=fast%2Fworkers%2Fworker-close-more.html> - why do tests time out
> in 6 seconds? Perhaps that's somehow related.

I have detected two main issues:

 1)
 On the GTK port, we define the timeout to 6000ms (release), 12000ms (debug) or 10000ms (if checking memory leaks with valgrind)
 http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/port/gtk.py?rev=177363#L83

 Your patch is substracting 5000ms from this timeout <http://trac.webkit.org/changeset/177363/trunk/Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py>
 Leaving it to 1000ms (1 sec). That don't seems enough for many tests to complete.
  
 Why are you removing those 5 seconds from the default timeout?


 2)
 r177363 breakes all ports that don't enable supports_per_test_timeout() on Tools/Scripts/webkitpy/port/$port.py
 This is because m_timeout is initialized to 0 <http://trac.webkit.org/browser/trunk/Tools/WebKitTestRunner/TestInvocation.cpp?rev=177363#L96>, and never updated with the port default value because the python scripts don't pass the --timeout argument when the previous is not enabled.
 
 Maybe we should just enable supports_per_test_timeout() for all ports at Tools/Scripts/webkitpy/port/base.py ?


So, I have tested, and the patch below will fix the breakage introduced by r177363 on the GTK port:

diff --git a/Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py b/Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py
index 8f72506..1c3fa21 100644
--- a/Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py
+++ b/Tools/Scripts/webkitpy/layout_tests/controllers/single_test_runner.py
@@ -62,11 +62,6 @@ class SingleTestRunner(object):
         self._stop_when_done = stop_when_done
 
         self._timeout = test_input.timeout
-        if self._timeout > 5000:
-            # Timeouts are detected by both script and tool; tool detected timeouts are
-            # better, because they contain partial output. Give the tool some time to
-            # report the timeout instead of being killed.
-            self._timeout = int(self._timeout) - 5000
 
         if self._reference_files:
             # Detect and report a test which has a wrong combination of expectation files.
diff --git a/Tools/Scripts/webkitpy/port/gtk.py b/Tools/Scripts/webkitpy/port/gtk.py
index 338e2d1..ad451be 100644
--- a/Tools/Scripts/webkitpy/port/gtk.py
+++ b/Tools/Scripts/webkitpy/port/gtk.py
@@ -80,6 +80,9 @@ class GtkPort(Port):
             return XorgDriver
         return XvfbDriver
 
+    def supports_per_test_timeout(self):
+        return True
+
     def default_timeout_ms(self):
         # Starting an application under Valgrind takes a lot longer than normal
         # so increase the timeout (empirically 10x is enough to avoid timeouts).



Any comments about it?
Comment 11 Carlos Alberto Lopez Perez 2014-12-16 19:01:35 PST
(In reply to comment #10)
>  1)
>  On the GTK port, we define the timeout to 6000ms (release), 12000ms (debug)
> or 10000ms (if checking memory leaks with valgrind)
>  http://trac.webkit.org/browser/trunk/Tools/Scripts/webkitpy/port/gtk.
> py?rev=177363#L83

I read it wrongly. When using valgrind we are doing a 10x (so 60000ms for release and 120000ms for debug)
Comment 12 Alexey Proskuryakov 2014-12-16 19:48:59 PST
Thank you for looking into this!

>  Why are you removing those 5 seconds from the default timeout?

This is needed to let tool-detected timeout be reported before run-webkit-tests kills the tool. When both have the same timeout, there is a race to detect it.

It is OK to change these numbers to some reasonable values that suit other ports.

>  Maybe we should just enable supports_per_test_timeout() for all ports at Tools/Scripts/webkitpy/port/base.py ?

I'm not sure if DumpRenderTree on Windows supports it yet, but otherwise, this seems like the right thing to do. We could also remove a bunch of code then.

But this is not the best solution to this problem, because we should also keep WebKitTestRunner working when launched manually, without the option.

> This is because m_timeout is initialized to 0

Oops, I tried to have it use the default 30 second timeout in this case, but didn't do that right. Should be an easy fix - I can do that tomorrow if no one beats me to it.
Comment 13 Carlos Alberto Lopez Perez 2014-12-17 05:00:17 PST
(In reply to comment #12)
> Thank you for looking into this!
> 
> >  Why are you removing those 5 seconds from the default timeout?
> 
> This is needed to let tool-detected timeout be reported before
> run-webkit-tests kills the tool. When both have the same timeout, there is a
> race to detect it.
> 
> It is OK to change these numbers to some reasonable values that suit other
> ports.
> 

And instead of doing that (subtracting 5 seconds from the parameter --timeout passed to the tool (WKTR))... what about just making run-webkit-test wait 1 second more? 


diff --git a/Tools/Scripts/webkitpy/port/driver.py b/Tools/Scripts/webkitpy/port/driver.py
index 715bd7e..9f5e3cd 100644
--- a/Tools/Scripts/webkitpy/port/driver.py
+++ b/Tools/Scripts/webkitpy/port/driver.py
@@ -180,6 +180,9 @@ class Driver(object):
 
         command = self._command_from_driver_input(driver_input)
         deadline = test_begin_time + int(driver_input.timeout) / 1000.0
+        # Add an extra second to let tool-detected timeout be reported before run-webkit-tests kills the tool.
+        # When both have the same timeout, there is a race to detect it.
+        deadline = deadline + 1.0
 
         self._server_process.write(command)
         text, audio = self._read_first_block(deadline)  # First block is either text or audio



At least to me, this seems less error prone and guaranteed to work on all ports (no matter which assumptions they make on the number of seconds for a test to timeout)
Comment 14 Alexey Proskuryakov 2014-12-17 09:44:21 PST
> And instead of doing that (subtracting 5 seconds from the parameter --timeout passed to the tool (WKTR))... what about just making run-webkit-test wait 1 second more?

In this case, run-webkit-tests would use a timeout that is different from what is specified in port configuration, or in --time-out-ms argument. That seems like it would introduce too much confusion. Also, it's not driver's job to make policy decisions, it should use whatever timeout is passed to it without further tweaking.
Comment 15 Alexey Proskuryakov 2014-12-17 09:54:54 PST
Fixing the known issues in bug 139734.
Comment 16 Carlos Alberto Lopez Perez 2014-12-17 10:06:29 PST
(In reply to comment #14)
> > And instead of doing that (subtracting 5 seconds from the parameter --timeout passed to the tool (WKTR))... what about just making run-webkit-test wait 1 second more?
> 
> In this case, run-webkit-tests would use a timeout that is different from
> what is specified in port configuration, or in --time-out-ms argument. That
> seems like it would introduce too much confusion.

You can see it also the other way around: With the changes introduced by r177363, the test timeouts faster (-5secs) than what is specified in port configuration, or in --time-out-ms argument.

I think that this actually introduces much more confusion than what I'm proposing.

> Also, it's not driver's
> job to make policy decisions, it should use whatever timeout is passed to it
> without further tweaking.

I don't see any policy decision here. Only a work-around to avoid a race condition. The subtracting of 5 seconds from r177363 isn't also policy, just another workaround.

I think that adding 1 second in run-webkit-tests is better. At least to me it looks cleaner.

If I say --timeou-out-ms 8, and I have to wait 9 seconds for the test to finish (or timeout), I probably won't notice.
But, on the other hand, if I get a timeout after 3 seconds (8-5) probably I will notice something weird.

Don't you agree on that?
Comment 17 Alexey Proskuryakov 2014-12-17 10:30:14 PST
> With the changes introduced by r177363, the test timeouts faster (-5secs) than what is specified in port configuration

This was already the case for many years. WebKitTestRunner has multiple shorter timeouts for specific operations, so a test could time out faster than specified in port configuration.

Mac in particular always had port timeout specified at 35 seconds, but both DumpRenderTree and WebKitTestRunner ignored that, and timed out after 30 seconds. Maybe there was some confusion due to that, but not anything show-stopping.

Additionally, a lot of tests have their own shorter timeouts implemented in JavaScript, which also don't obey port configuration.

I am not very strongly opposed to adding instead of subtracting, but I don't see it as obviously superior.

> I don't see any policy decision here. Only a work-around to avoid a race condition.

A workaround is something that can be made to a bug. But there is no bug here - when both the script and the tool have the same timeout, race is the expected behavior.

We need to have script timeout and tool timeout different, and this cannot be hidden under the carpet in driver.py, whose sole purpose is to translate Python variables into command line arguments, and then stdout/stderr into Python objects.

I posted a patch to bug 139734, let's move this discussion there.