Created attachment 49430 [details] results.json Tests are randomly timing out when run with run-chromium-webkit-tests. I ran this in a loop overnight: while true; do run-chromium-webkit-tests --platform=mac-leopard --time-out-ms=15000; done The results that I got back strongly indicate that there is a problem with the harness, as every run or so at least one test will randomly time out. See the attached results.json file. (It can be loaded in the layout test dashboard by dropping it in the right place in one's tree, which Ojan could explain.)
Created attachment 49431 [details] expectations.json (also needed by the layout test dashboard)
Define "problem with the harness". What makes you think it's the harness's fault, as opposed to DRT doing unpredictable things under timing pressure and concurrency (or apache doing weird things, for that matter)? Or does "problem with the harness" include such issues?
Problem with the harness would include all that, yes. :)
Ah, see, I don't consider it to be a problem with the harness if DRT is being flaky under load (since there's nothing the harness can do to fix that except reduce the load). I consider that to be a problem w/ DRT. On the other hand, if it's apache, that's a problem w/ the harness.
At this point, we don't know what the problem is, or why a test occasionally seems to lock up.
I appear to get random timeouts even with --num-test-shells=1, which points more fingers at the actual python scripts. Still investigating. I still don't know what actually happens when a test "times out": whether the harness is killing DRT because it's hung, or something else.
It's unlikely to be the http server, since plenty of non-http tests time out. It wouldn't be hard to modify the script to distinguish between cases where the test times out and cases where DRT needs to be killed. All that code is in test_shell_thread.py if you want to go that route. One way to try to understand this a bit would be to print the output from DRT to stdout (or a file) so you can see whether the output from DRT makes sense in the cases where timeouts happen.
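For what it's worth, here is a minimal sketch of that idea: run a child process against a deadline, record whether it finished on its own or had to be killed, and keep whatever output it produced so it can be inspected afterwards. This is not the code in test_shell_thread.py; the function name `run_with_deadline` and the outcome labels "finished" and "killed" are made up for illustration, and it assumes a one-shot child process rather than a long-lived DRT that is fed tests over stdin.

```python
import subprocess
import sys


def run_with_deadline(cmd, deadline_s):
    """Run cmd and return (outcome, output).

    outcome is "finished" if the child exited on its own before the
    deadline, or "killed" if it was still running and we had to kill it.
    output is everything the child wrote to stdout/stderr, including
    anything written before it was killed.
    (Hypothetical helper; names are not taken from the real harness.)
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,
    )
    try:
        output, _ = proc.communicate(timeout=deadline_s)
        return "finished", output
    except subprocess.TimeoutExpired:
        # The child did not exit in time: kill it, then collect
        # whatever it managed to print before the deadline.
        proc.kill()
        output, _ = proc.communicate()
        return "killed", output


if __name__ == "__main__":
    # Stand-in child process; a real run would launch the test binary.
    outcome, output = run_with_deadline(
        [sys.executable, "-c", "print('hello from child')"], 15.0
    )
    print(outcome)
    print(output, end="")
```

Logging the outcome and the captured output per test would at least show whether a "timeout" was a child that never produced output or one that was partway through when the deadline hit.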
Interesting. dglazkov just found a bug in server_process.py that might be contributing to this. https://bugs.webkit.org/show_bug.cgi?id=46406
It might be some combination of the bug Dimitri fixed, the fact that our timeouts are lower than ORWT's by default, and any number of other bugs fixed in the past year. I think things should be okay now; after marking a couple of tests as SLOW I'm no longer seeing timeouts on SL. Please reopen or file a new bug if we're still seeing issues.