Bug 37643 - new-run-webkit-tests has much higher startup latency than run-webkit-tests
: new-run-webkit-tests has much higher startup latency than run-webkit-tests
: WebKit
Tools / Tests
: 528+ (Nightly build)
: PC Mac OS X 10.5
: P2 Normal
Assigned To:
: 37780
  Show dependency treegraph
Reported: 2010-04-15 00:51 PST by
Modified: 2010-04-20 23:56 PST (History)

Patch (7.06 KB, patch)
2010-04-19 02:25 PST, Eric Seidel
no flags Review Patch | Details | Formatted Diff | Diff
Added small speedup to test gathering (9.12 KB, patch)
2010-04-19 03:50 PST, Eric Seidel
abarth: review-
commit-queue: commit‑queue-
Review Patch | Details | Formatted Diff | Diff


You need to log in before you can comment on or make changes to this bug.

Description From 2010-04-15 00:51:26 PST
Testing with a stopwatch, on my system (2.4GHz Core2 Duo MacBook Pro), the time to the first visible evidence of a test being run is:

run-webkit-tests: 10s
new-run-webkit-tests: 75s

On the other hand, new-run-webkit tests is definitely faster at running through all the layout tests. Total real time (as tested by "time"):

run-webkit-tests: 13m13.386s
new-run-webkit-tests : 18m35.731s

So at the very least, new-run-webkit-tests is giving up a big chunk of its performance advantage to startup latency.
------- Comment #1 From 2010-04-16 16:16:14 PST -------
wow, that's slow. I wonder if there was something wrong with your configuration. Were all your files on a local filesystem? I've only ever seen that kind of slowness on a windows machine running on top of an SMB share.

It is true that new-run-webkit-tests is slower than run-webkit-tests for a couple of reasons. The first has to do with spawning off multiple threads, which obviously takes time. The second is that there is a certain amount of "all at once" processing that new-run-webkit-tests does to compute statistics at the start. I have been tempted to fix the latter but I've never seen that it's been a real problem.

Either way, your numbers are an order of magnitude slower than I would expect, so we should figure out what's going on. If you could post the log output from --verbose from the beginning to when testing started that would be a big help.
------- Comment #2 From 2010-04-17 11:09:12 PST -------
Seems like we could benefit from using Python's profiler to figure out what's taking the time.
------- Comment #3 From 2010-04-19 01:45:40 PST -------
I ran:
python -m cProfile new-run-webkit-tests

and aborted it about 2 minutes in (20% through the tests on my machine).

We spent a whopping 45 SECONDS in time.sleep:
     3011   45.569    0.015   45.569    0.015 {time.sleep}

And 3 seconds making some 100k stat calls:
   109687    3.603    0.000    3.603    0.000 {posix.stat}

50 seconds were spent reading files:
     5441   50.576    0.009   50.576    0.009 {method 'read' of 'file' objects}
------- Comment #4 From 2010-04-19 02:02:39 PST -------
We're spending a lot of time reading image hashes up front.

TestInfo.__init__ does:
            self.image_hash = open(expected_hash_file, "r").read()

(which probably leaks a file handle).

run-webkit-tests reads image hashes when it gets to them.  We could to (and probably should).  There shouldn't be a thread-safety issue because only one thread is ever processing any given test.
------- Comment #5 From 2010-04-19 02:25:19 PST -------
Created an attachment (id=53662) [details]
------- Comment #6 From 2010-04-19 02:29:02 PST -------
With that patch I see the code sleeping for 5 seconds and reading for 10s.
      268    5.258    0.020    5.258    0.020 {time.sleep}
       15    9.412    0.627    9.412    0.627 {method 'read' of 'file' objects}

We're also spending 4s in stat,  1s listing directories and 1s calling "startswith":
   109687    4.061    0.000    4.061    0.000 {posix.stat}
  2842147    1.320    0.000    1.320    0.000 {method 'startswith' of 'str' objects}
     1813    1.168    0.001    1.168    0.001 {posix.listdir}

Obviously still more work to do.
------- Comment #7 From 2010-04-19 03:08:21 PST -------
This will also get easier to debug once bug 37780 lands.
------- Comment #8 From 2010-04-19 03:47:37 PST -------
One thing that would make this latency much easier to deal with is if the meter updated more frequently during things like "Checking build ...".  It seems that the part of the code which is updating the meter should be aware of the fact that we have to build DRT and build ImageDiff and possibly start helper processes.  We may just want to pass a meter object to the port during check_build().
------- Comment #9 From 2010-04-19 03:50:50 PST -------
Created an attachment (id=53667) [details]
Added small speedup to test gathering
------- Comment #10 From 2010-04-19 14:06:23 PST -------
(From update of attachment 53667 [details])
------- Comment #11 From 2010-04-19 14:19:49 PST -------
(From update of attachment 53667 [details])
LGTM, apart from your removal of my beloved one-character variable names ;)
------- Comment #12 From 2010-04-20 06:34:56 PST -------
(From update of attachment 53667 [details])
Rejecting patch 53667 from commit-queue.

Failed to run "['WebKitTools/Scripts/test-webkitpy']" exit_code: 1
Last 500 characters of output:
e 533, in loadTestsFromName
    module = __import__('.'.join(parts_copy))
  File "/Users/eseidel/Projects/CommitQueue/WebKitTools/Scripts/webkitpy/layout_tests/run_webkit_tests_unittest.py", line 36, in <module>
    import webkitpy.layout_tests.run_webkit_tests as run_webkit_tests
  File "/Users/eseidel/Projects/CommitQueue/WebKitTools/Scripts/webkitpy/layout_tests/run_webkit_tests.py", line 130
    with open(self._expected_hash_path, "r") as hash_file:
SyntaxError: invalid syntax

Full output: http://webkit-commit-queue.appspot.com/results/1626581
------- Comment #13 From 2010-04-20 10:20:31 PST -------
weird ... is the commit queue bot running an unusual version of python?
------- Comment #14 From 2010-04-20 12:26:04 PST -------
(From update of attachment 53667 [details])
Btw, this patch doesn't comply with our new unicode overlords.
------- Comment #15 From 2010-04-20 23:01:37 PST -------
I wrote this patch on SL which has python 2.6.  I suspect I need to fix it for 2.5.
------- Comment #16 From 2010-04-20 23:21:54 PST -------
from __future__ import with_statement
which fixed it.
------- Comment #17 From 2010-04-20 23:23:53 PST -------
I'll make our unicode overlords happy an re-post before I commit.
------- Comment #18 From 2010-04-20 23:56:34 PST -------
Landed as r57956.