Bug 66991 - Instrument NRWT or DRT to check whether DRT is timing out too early
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 64491
Reported: 2011-08-25 15:39 PDT by Peter Kasting
Modified: 2012-06-08 16:20 PDT

Description Peter Kasting 2011-08-25 15:39:13 PDT
Dirk investigated two reports of mine of strange DRT behavior w.r.t. timeouts:

(1) Sometimes a test will be marked as TIMEOUT but will only claim to have taken a fraction of the available time.  See e.g. http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40DEPS%20-%20chromium.org&tests=fast%2Fforms%2Fform-associated-element-crash3.html where some tests TIMEOUT after 9 seconds when they should get 60.

(2) Sometimes a test will be marked as TIMEOUT without reporting a time at all.  See e.g. http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=fast%2Fdom%2FDOMImplementation%2FcreateDocumentType-err.html .

Dirk's comments to me were as follows:

"So, NRWT (in the Chromium case) decides that a test has timed out
because it is told by DRT that it did ... we pass a timeout value to
use to DRT, and then it tells us whether or not it took too long.

"This means that either (a) we're passing in the wrong value, or (b)
DRT is buggy and timing out too early. I verified that we were passing
in the correct value in at least one try, so I'm wondering if (b) is
the problem here.

"That would also explain how we can get some timeout results with no number.

"It would be straightforward to instrument either NRWT or DRT to check
for this. Can you file a bug for this?"

This is that bug.
Comment 1 Dirk Pranke 2011-08-25 15:44:45 PDT
Adding Tony to this as well ... note that this has definitely been seen on Linux and looks like it happens on Windows as well. I haven't checked whether this is happening on the Mac.

It's a bit of a toss-up as to how to deal with such a thing (short of fixing the bug, obviously). My first thought would be to instrument DRT to print to stderr when it thinks a test has timed out (and to print the timeout we were given), and/or instrument NRWT to log the timeout we passed it and how long the test actually took.

There's a question of whether NRWT should override DRT's interpretation and consider the test some other kind of failure instead of a TIMEOUT, but I don't know that it matters that much. We don't really have any other way to indicate "the testing infrastructure didn't work right" as opposed to "the test failed".
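
For illustration only, the NRWT-side check described in this comment might look roughly like the Python sketch below. It is not NRWT code: `TimeoutCheck`, `run_with_timeout_check`, the `run_test` callback, and the 0.9 tolerance are all hypothetical names and values chosen for the example. The idea is simply to record the timeout handed to the driver, measure the wall-clock time independently, and flag any driver-reported TIMEOUT that arrives well before the timeout could have expired.

```python
import time
from dataclasses import dataclass


@dataclass
class TimeoutCheck:
    """What the harness passed in and what it measured (hypothetical type)."""
    test_name: str
    timeout_secs: float            # timeout value handed to the driver
    elapsed_secs: float            # wall-clock time measured by the harness
    driver_reported_timeout: bool  # what the driver claimed

    def is_suspicious(self, tolerance=0.9):
        # A timeout claim is suspicious when the measured time is well
        # below the timeout the driver was given. The 0.9 tolerance is an
        # arbitrary assumption, not a value taken from NRWT or DRT.
        return (self.driver_reported_timeout
                and self.elapsed_secs < tolerance * self.timeout_secs)


def run_with_timeout_check(test_name, timeout_secs, run_test,
                           clock=time.monotonic):
    # run_test is a stand-in for "run one test in the driver"; it is
    # assumed to return True when the driver says the test timed out.
    start = clock()
    reported = run_test(test_name, timeout_secs)
    elapsed = clock() - start
    return TimeoutCheck(test_name, timeout_secs, elapsed, reported)
```

A harness could log every `TimeoutCheck` and report the suspicious ones separately, which would distinguish "the test really ran for the full timeout" from "something else killed the test and it was labelled TIMEOUT".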
Comment 2 Peter Kasting 2011-08-25 16:13:02 PDT
In http://build.chromium.org/p/chromium.webkit/builders/Webkit%20Mac10.6%20%28CG%29/builds/342/steps/webkit_tests/logs/stdio , searching for pdf-as-background shows this output:

***
2011-08-25 15:54:43,868 26177 single_test_runner.py:221 DEBUG worker/13 fast/images/pdf-as-background.html output stderr lines:


2011-08-25 15:54:43,868 26177 worker.py:158 DEBUG worker/13 killing driver
2011-08-25 15:54:43,869 26177 worker.py:180 DEBUG worker/13 fast/images/pdf-as-background.html failed:
2011-08-25 15:54:43,869 26143 printing.py:469 INFO   fast/images/pdf-as-background.html -> unexpected test timed out
2011-08-25 15:54:43,869 26177 worker.py:182 DEBUG worker/13  Test timed out
***

This suggests that in this case some other error is resulting in "kill test, claim timeout" which (looking at http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=fast%2Fimages%2Fpdf-as-background.html ) is then getting reported as "timeout, 2 seconds" or similar.
Comment 3 Eric Seidel (no email) 2011-08-25 17:12:42 PDT
I wonder if this could relate to bug 63981.
Comment 4 Dirk Pranke 2012-06-08 16:00:38 PDT
I'm not sure if this is still an issue; I haven't seen anything suspicious like this in quite a while, and we use a totally different code path for handling timeouts now. Peter, are you okay with me closing this?
Comment 5 Peter Kasting 2012-06-08 16:20:37 PDT
If you don't think there's value in this bug, closing is fine.