Bug 34399

Summary: Http testing is flakey on the Windows Test bots
Product: WebKit Reporter: Andras Becsi <abecsi>
Component: Tools / Tests Assignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: bweinstein, cjerdonek, commit-queue, eric, ossy, tkent, zoltan
Priority: P2 Keywords: LayoutTestFailure
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Windows 2000   

Description Andras Becsi 2010-01-31 13:13:59 PST
Recently there have been many sporadic timeouts on the Windows bots: after one http test times out, all the following http tests time out too. It was suggested that the change in http://trac.webkit.org/changeset/53559 caused a regression (discussed here: https://bugs.webkit.org/show_bug.cgi?id=33153) that prevents run-webkit-tests from recovering from an http timeout.
I don't believe this is true, since there are several cases where one or two http tests time out and none of the tests after them do. Examples:

http://build.webkit.org/results/Windows%20Debug%20%28Tests%29/r54107%20%289228%29/results.html
http://build.webkit.org/results/Windows%20Debug%20%28Tests%29/r54106%20%289227%29/results.html
http://build.webkit.org/results/Windows%20Debug%20%28Tests%29/r54100%20%289223%29/results.html
http://build.webkit.org/results/Windows%20Debug%20%28Tests%29/r54056%20%289185%29/results.html
http://build.webkit.org/results/Windows%20Release%20%28Tests%29/r54095%20%288562%29/results.html
http://build.webkit.org/results/Windows%20Release%20%28Tests%29/r54083%20%288555%29/results.html
http://build.webkit.org/results/Windows%20Release%20%28Tests%29/r54069%20%288544%29/results.html
http://build.webkit.org/results/Windows%20Release%20%28Tests%29/r54057%20%288533%29/results.html

In the other runs, all the subsequent http tests time out once one of the following tests times out (there might be 1 or 2 more, but the bots have a very tiny brain):
http/tests/xmlhttprequest/basic-auth.html
http/tests/security/xss-DENIED-xsl-external-entity-redirect.xml
http/tests/security/aboutBlank/xss-DENIED-navigate-opener-document-write.html
http/tests/workers/text-encoding.html
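
To tell the two patterns apart (isolated http timeouts versus a cascade in which every later http test also times out), a run's results could be classified roughly as follows. This is only a minimal Python sketch: the ordered `(test_name, outcome)` list and the function name `find_timeout_cascade` are assumptions made for illustration, not something run-webkit-tests or results.html provides.

```python
from typing import List, Optional, Tuple


def find_timeout_cascade(results: List[Tuple[str, str]]) -> Optional[str]:
    """Return the http test that starts a timeout cascade, or None.

    `results` is a hypothetical ordered list of (test_name, outcome) pairs,
    with outcome being a string such as "pass", "fail" or "timeout".
    A cascade here means: the last two or more http tests of the run all
    timed out, i.e. nothing recovered after the first of them.
    """
    # Only http tests are relevant for this bug.
    http = [(name, outcome) for name, outcome in results
            if name.startswith("http/tests/")]

    # Walk backwards to find where the trailing run of timeouts begins.
    start = None
    for index in range(len(http) - 1, -1, -1):
        if http[index][1] != "timeout":
            break
        start = index

    # A single trailing timeout is treated as isolated, not as a cascade.
    if start is None or len(http) - start < 2:
        return None
    return http[start][0]
```

A run where one http test times out and later http tests pass again yields `None`, which is the case that argues against the "cannot recover from a timeout" theory.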

In addition, there have lately been numerous failed downloads on the Windows Test bots, with stderr output like:
"File 'archives/win-i386-release/54067.zip' not available at master"
and
"File 'archives/win-i386-debug/54065.zip' not available at master"
Comment 1 Brian Weinstein 2010-01-31 13:37:26 PST
Thanks for filing and doing some good legwork on this. I'll investigate further tomorrow.
Comment 2 Csaba Osztrogonác 2010-02-01 04:27:31 PST
I examined every results.html of the Windows Release Test bot between r53559 and r54126. This flakiness first occurred in r53891. ( http://build.webkit.org/results/Windows%20Release%20%28Tests%29/r53891%20%288413%29/results.html )

I think it might be caused by http://trac.webkit.org/changeset/53889

Eric, Kent, is that possible?
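
The search itself is simple: go through the results pages in revision order and note the first one that shows the cascade. A rough Python sketch follows, assuming the per-revision check has already been done by hand or by some other script. The function name `first_flaky_revision` and the revision-to-boolean mapping are hypothetical, and no results.html parsing is shown.

```python
from typing import Dict, Optional


def first_flaky_revision(cascade_by_revision: Dict[int, bool]) -> Optional[int]:
    """Return the lowest revision whose run showed the timeout cascade.

    `cascade_by_revision` is a hypothetical mapping from a revision number
    to True if that run showed the cascade and False otherwise.
    """
    for revision in sorted(cascade_by_revision):
        if cascade_by_revision[revision]:
            return revision
    # No run in the examined range showed the cascade.
    return None
```

Because the failure is sporadic, the first revision found this way only gives an upper bound: the change that introduced the problem may have landed earlier and simply not triggered on the runs in between.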
Comment 3 Brian Weinstein 2010-02-01 09:49:44 PST
(In reply to comment #2)
> I examined all results.html of Windows Release Test bot between r53559 and
> r54126. This flakeyness occured in r53891 first time. (
> http://build.webkit.org/results/Windows%20Release%20%28Tests%29/r53891%20%288413%29/results.html
> )
> 
> I think, it might caused by http://trac.webkit.org/changeset/53889
> 
> Eric, Kent, Is it possible?

It's not the first change I would have guessed, but anything that touches DRT is possible.

I'd say the best course of action would be to try rolling it out, and see if it fixes the issue. If not, we can roll it back in.
Comment 4 Csaba Osztrogonác 2010-02-03 13:20:32 PST
http://trac.webkit.org/changeset/53889 was rolled out by http://trac.webkit.org/changeset/54295. If that doesn't solve the problem, I'll roll it back in.
Comment 5 Csaba Osztrogonác 2010-02-03 15:36:17 PST
(In reply to comment #4)
> http://trac.webkit.org/changeset/53889 rolled out by
> http://trac.webkit.org/changeset/54295. If it won't solve the problem, I'll
> roll back it again.

Unsuccessful experiment. :( So I rolled it back in with http://trac.webkit.org/changeset/54307
Comment 6 Csaba Osztrogonác 2010-02-03 16:37:02 PST
2nd experiment:

http://trac.webkit.org/changeset/53559 and http://trac.webkit.org/changeset/54084 were rolled out by http://trac.webkit.org/changeset/54312

If that doesn't solve the problem, I'll roll them back in.
Comment 7 Csaba Osztrogonác 2010-02-03 17:08:10 PST
(In reply to comment #6)
> 2nd experiment:
> 
> http://trac.webkit.org/changeset/53559 and
> http://trac.webkit.org/changeset/54084 rolled out by
> http://trac.webkit.org/changeset/54312
> 
> If it won't solve the problem, I'll roll back it again.

Unsuccessful experiment again. :( So I rolled them back in with
http://trac.webkit.org/changeset/54314
Comment 8 Andras Becsi 2010-06-30 07:45:43 PDT
This is no longer an issue, and http testing on the Windows bots has seemed stable for a long time now, so I'll close this bug.
Comment 9 Eric Seidel (no email) 2010-06-30 13:40:00 PDT
I'd be surprised if that's actually true. We don't monitor the Windows test bots because they're not stable enough to be part of core. :)
Comment 10 Andras Becsi 2010-06-30 14:18:29 PDT
(In reply to comment #9)
> I'm surprised if that's actually true.  We don't monitor the Windows test bots because they're not stable enough to be part of core. :)

The problem described in this bug (one http test timing out, after which all other http tests also time out) hasn't appeared for a long time now, so the bug seems fixed. This does not mean that the bots are stable in general, but that is outside the scope of this bug.
My previous comment did not make that explicit, so thanks, Eric, for pointing it out.