Bug 35358 - new-run-webkit-tests: tests are randomly timing out when run
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
Importance: P2 Normal
Assignee: Dirk Pranke
URL:
Keywords:
Depends on:
Blocks: 34984
Reported: 2010-02-24 13:40 PST by Eric Seidel (no email)
Modified: 2011-04-01 16:22 PDT
See Also:


Attachments
results.json (157.85 KB, text/plain)
2010-02-24 13:40 PST, Eric Seidel (no email)
expectations.json (also needed by the layout test dashboard) (6.08 KB, text/plain)
2010-02-24 13:40 PST, Eric Seidel (no email)

Description Eric Seidel (no email) 2010-02-24 13:40:19 PST
Created attachment 49430
results.json

tests are randomly timing out when run with run-chromium-webkit-tests

I ran:
while true; do run-chromium-webkit-tests --platform=mac-leopard --time-out-ms=15000; done

In a loop overnight.  The results that I got back strongly indicate that there is a problem with the harness, as every run or so at least one test will randomly time out.

See the attached results.json file. (It can be loaded in the layout test dashboard by dropping it in the right place in one's tree, which Ojan could explain.)
Comment 1 Eric Seidel (no email) 2010-02-24 13:40:44 PST
Created attachment 49431
expectations.json (also needed by the layout test dashboard)
Comment 2 Dirk Pranke 2010-02-24 14:36:20 PST
Define "problem with the harness". What makes you think it's the harness's fault, as opposed to DRT doing unpredictable things under timing pressure and concurrency (or apache doing weird things, for that matter)? Or does "problem with the harness" include such issues?
Comment 3 Eric Seidel (no email) 2010-02-24 14:38:56 PST
Problem with the harness would include all that, yes. :)
Comment 4 Dirk Pranke 2010-02-24 14:49:57 PST
Ah, see, I don't consider it to be a problem with the harness if DRT is being flaky under load (since there's nothing the harness can do to fix that except reduce the load). I consider that to be a problem w/ DRT. On the other hand, if it's apache, that's a problem w/ the harness.
Comment 5 Eric Seidel (no email) 2010-02-24 14:51:17 PST
At this point, we don't know what the problem is, or why a test occasionally seems to lock up.
Comment 6 Eric Seidel (no email) 2010-02-24 15:08:56 PST
I appear to get random timeouts even with --num-test-shells=1, which points more fingers at the actual python scripts. Still investigating. I still don't know what actually happens when it "times out": whether it's killing DRT because it's hung, or something else.
Comment 7 Ojan Vafai 2010-02-24 17:07:40 PST
It's unlikely to be the http server, since plenty of non-http tests time out.

It wouldn't be hard to modify the script to distinguish between cases where the test times out versus cases where DRT needs to be killed. All that code is in test_shell_thread.py if you want to go that route.

One way to try and understand this a bit would be to print the output from DRT to stdout (or a file) so you can see if the output from DRT makes sense in the cases where timeouts happen.
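
As a rough illustration of both suggestions above (not the code in test_shell_thread.py), here is a minimal Python 3 sketch that runs a child process under a deadline, keeps whatever output it produced, and reports whether the child finished on its own or had to be killed. The function name, the outcome labels, and the TIMEOUT_MARKER string are all hypothetical; this bug does not say what DRT actually prints when a test times out on its own.

```python
import subprocess
import sys

# Hypothetical marker: the text DRT prints for a self-reported test timeout
# is not given in this bug, so this string is only a placeholder.
TIMEOUT_MARKER = "HYPOTHETICAL-TIMEOUT-MARKER"


def run_and_classify(cmd, deadline_s, marker=TIMEOUT_MARKER):
    """Run cmd and return (outcome, output).

    outcome is one of:
      "completed"    - the child exited on its own without the marker
      "test-timeout" - the child exited on its own and printed the marker
      "killed"       - the child was still running at the deadline and was killed
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout so nothing is lost
        text=True,
    )
    try:
        output, _ = proc.communicate(timeout=deadline_s)
    except subprocess.TimeoutExpired:
        # The child looks hung: kill it, then collect what it printed so far.
        proc.kill()
        output, _ = proc.communicate()
        return "killed", output
    if marker in output:
        return "test-timeout", output
    return "completed", output


if __name__ == "__main__":
    # Stand-in child process; a real run would launch DRT here instead.
    outcome, output = run_and_classify(
        [sys.executable, "-c", "print('child output')"], deadline_s=15
    )
    print(outcome)
    print(output, end="")
```

Logging the outcome next to the captured output for each test would show whether the random timeouts are hung processes or tests that ran to the end but reported a timeout.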
Comment 8 Dirk Pranke 2010-09-23 14:36:34 PDT
Interesting. dglazkov just found a bug in server_process.py that might be contributing to this.

https://bugs.webkit.org/show_bug.cgi?id=46406
Comment 9 Dirk Pranke 2011-04-01 16:22:03 PDT
It might be a combination of the bug Dimitri fixed, the fact that our timeouts are lower than ORWT's by default, and any number of other bugs fixed in the past year. I think things should be okay now; after marking a couple of tests as SLOW I'm no longer seeing timeouts on SL.

Please reopen or file a new bug if we're still seeing issues.