Bug 60094

Summary: fast/encoding/parser-tests often timeout on debug bots
Product: WebKit Reporter: Dirk Pranke <dpranke>
Component: Tools / Tests Assignee: Jenn Braithwaite <jennb>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, dglazkov, eric, jennb
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Attachments:
Description Flags
patch none

Comment 1 Jenn Braithwaite 2011-05-04 17:20:19 PDT
I will halve the size of each batch of tests again.
Comment 2 Jenn Braithwaite 2011-05-04 17:49:39 PDT
Created attachment 92351 [details]
patch
Comment 3 Jenn Braithwaite 2011-05-04 17:53:11 PDT
Adam, could you review this, since you reviewed the similar change in bug 51721? This time I halved each batch of tests again.
Comment 4 Adam Barth 2011-05-04 18:02:02 PDT
Comment on attachment 92351 [details]
patch

I don't really understand the details of what this patch is doing, but it looks fine to me.
Comment 5 Jenn Braithwaite 2011-05-04 18:05:59 PDT
(In reply to comment #4)
> (From update of attachment 92351 [details])
> I don't really understand the details of what this patch is doing, but it looks fine to me.


Each batch used to run 10 tests; it now runs 5.  The script takes the first and last test numbers, e.g. runtests(10, 19) is changed to runtests(10, 14), and a new test is added to invoke runtests(15, 19).  The expected results files were also updated so that each batch's expected failures are in the expected results file for that batch.
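
For illustration only, the split described above can be sketched in Python. This is not the code in the patch: the helper name split_batch is hypothetical, and the only thing taken from the comment is the inclusive (first, last) convention of runtests.

```python
def split_batch(first, last):
    """Split the inclusive test range [first, last] into two halves.

    Hypothetical helper; the actual patch edits the runtests() calls
    in the test files by hand.
    """
    if last <= first:
        raise ValueError("need at least two tests to split")
    mid = first + (last - first) // 2
    return [(first, mid), (mid + 1, last)]


# The 10-test batch from the comment becomes two 5-test batches.
for lo, hi in split_batch(10, 19):
    print(f"runtests({lo}, {hi})")
```

Each resulting range would need its own test file and its own expected results file, which is why the expected failures had to be moved along with the split.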
Comment 6 Dirk Pranke 2011-05-04 18:27:52 PDT
I'm sorry, the bug report wasn't clear (editing to fix).

It's not the batch size that's the problem. If you look at the numbers on the dashboard, you'll see that most of the time the batches run in a couple of seconds, and once in a while they'll take a really long time and time out.

We need to figure out why they're timing out, not speed up the average case.
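
A rough sketch of the distinction being drawn here, assuming per-run durations could be exported from the dashboard. The durations and the 30-second timeout below are made-up illustrative values, not data from the bots, and summarize_runs is a hypothetical helper.

```python
from statistics import median


def summarize_runs(durations_s, timeout_s):
    """Return the median duration and the runs that reached the timeout."""
    timed_out = [d for d in durations_s if d >= timeout_s]
    return median(durations_s), timed_out


# Hypothetical numbers: most runs take about two seconds, one is an outlier.
durations = [2.1, 1.9, 2.3, 2.0, 35.0, 2.2]
typical, timed_out = summarize_runs(durations, timeout_s=30.0)
print(f"median: {typical:.2f}s, timed out: {len(timed_out)} of {len(durations)}")
```

With a distribution shaped like this, halving the batch would roughly halve the median but would not by itself explain the occasional run that is an order of magnitude slower.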
Comment 7 Jenn Braithwaite 2011-05-05 11:05:46 PDT
(In reply to comment #6)
> I'm sorry, the bug report wasn't clear (editing to fix).
> 
> It's not the batch size that's the problem. If you look at the numbers on the dashboard, you'll see that most of the time the batches run in a couple of seconds, and once in a while they'll take a really long time and time out.
> 
> We need to figure out why they're timing out, not speed up the average case.

Is there a way to tell from the logs whether the machine was under load when the timeouts occurred?  Reducing the batch sizes will reduce timeouts that occur due to the machine being under load, so it is still worth doing.
Comment 8 Dirk Pranke 2011-05-05 11:20:45 PDT
(In reply to comment #7)
> (In reply to comment #6)
> > I'm sorry, the bug report wasn't clear (editing to fix).
> > 
> > It's not the batch size that's the problem. If you look at the numbers on the dashboard, you'll see that most of the time the batches run in a couple of seconds, and once in a while they'll take a really long time and time out.
> > 
> > We need to figure out why they're timing out, not speed up the average case.
> 
> Is there a way to tell from the logs whether the machine was under load when the timeouts occurred?  Reducing the batch sizes will reduce timeouts that occur due to the machine being under load, so it is still worth doing.

Not really, but because of the way new-run-webkit-tests works, you should assume the machine is under load all the time (because that's the design goal, to get through the tests as quickly as possible).

It would be interesting to see if the tests never timed out if the machine wasn't under load; there is a patch floating around to try that, but it hasn't landed yet.
Comment 9 Dirk Pranke 2011-05-05 15:25:30 PDT
Sure. bug 59570 is the one to watch.
Comment 10 Eric Seidel (no email) 2011-06-02 08:06:43 PDT
Do we know why these tests are slow?
Comment 11 Jenn Braithwaite 2011-06-02 09:50:39 PDT
(In reply to comment #10)
> Do we know why these tests are slow?

I do not know why these tests are occasionally slow.
Comment 12 Dimitri Glazkov (Google) 2011-06-14 11:16:36 PDT
It's interesting, but these tests never failed on upstream Chromium bots: http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40ToT%20-%20webkit.org&tests=fast%2Fencoding%2Fparser-tests
Comment 13 Jenn Braithwaite 2011-07-14 09:59:04 PDT
*** Bug 64498 has been marked as a duplicate of this bug. ***
Comment 14 Adam Barth 2011-10-14 17:24:42 PDT
Comment on attachment 92351 [details]
patch

Please re-nominate for review if this is still a problem.
Comment 15 Dirk Pranke 2013-09-18 17:00:32 PDT
This has been resolved in Blink; it's not clear if this was ever an issue for non-Chromium ports given that they had a much longer timeout.