Bug 35905

Summary: REGRESSION(55699?): media/video-no-autoplay.html times out on Leopard Commit Bot
Product: WebKit
Reporter: Eric Seidel (no email) <eric>
Component: Tools / Tests
Assignee: Simon Fraser (smfr) <simon.fraser>
Status: RESOLVED FIXED
Severity: Normal
CC: beidson, cmarrin, darin, eric.carlson, jhoneycutt, simon.fraser
Priority: P2
Version: 528+ (Nightly build)
Hardware: PC
OS: OS X 10.5
Bug Depends on: 36033, 26391, 35977
Bug Blocks: 31129
Attachments (description: flags):
Hang report from Leopard Commit Bot (media/video-no-autoplay.html): none
Hang report from Leopard Commit Bot (media/video-played-collapse.html): none
Patch: oliver: review+

Description Eric Seidel (no email) 2010-03-08 23:31:21 PST
Created attachment 50279 [details]
Hang report from Leopard Commit Bot (media/video-no-autoplay.html)

REGRESSION: media/video-no-autoplay.html times out on Leopard Commit Bot

Log:
2 patches in commit-queue [50251, 50263]
Cleaning working directory
Updating working directory
Building WebKit
Running Python unit tests
Running Perl unit tests
Running JavaScriptCore tests
Running run-webkit-tests
Failed to run "['WebKitTools/Scripts/run-webkit-tests', '--no-launch-safari', '--exit-after-n-failures=1', '--quiet']" exit_code: 1
.................................................................................................................................................................................................................................................................................................
----------------------------------------------------------------------
Ran 289 tests in 0.920s

OK
/Users/eseidel/Projects/CommitQueue/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/fixChangeLogPatch.......ok
/Users/eseidel/Projects/CommitQueue/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/generatePatchCommand....ok
/Users/eseidel/Projects/CommitQueue/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/gitdiff2svndiff.........ok
/Users/eseidel/Projects/CommitQueue/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseDiff...............ok
/Users/eseidel/Projects/CommitQueue/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parseDiffHeader.........ok
/Users/eseidel/Projects/CommitQueue/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/parsePatch..............ok
/Users/eseidel/Projects/CommitQueue/WebKitTools/Scripts/webkitperl/VCSUtils_unittest/runPatchCommand.........ok
All tests successful.
Files=7, Tests=133,  1 wallclock secs ( 0.24 cusr +  0.07 csys =  0.31 CPU)
Running build-dumprendertree
Compiling Java tests
make: Nothing to be done for `default'.
Running tests from /Users/eseidel/Projects/CommitQueue/LayoutTests
Testing 12455 test cases.
media/video-no-autoplay.html -> timed out
Sampling process 40023 for 10 seconds with 10 milliseconds of run time between samples
Sampling completed, processing symbols...
Sample analysis of process 40023 written to file /Users/eseidel/Library/Logs/DumpRenderTree/HangReport.txt

Exiting early after 1 failures. 9185 tests run.
368.49s total testing time

9184 test cases (99%) succeeded
1 test case (<1%) timed out
9 test cases (<1%) had stderr output
Unabled to successfully build and test
Not proceeding with work item. Sleeping until 2010-03-08 20:51:28 (5 mins).

That was the first time the commit-queue saw the failure, so the regression must have been before 20:46 PST this evening.
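
For triage, the timed-out tests can be pulled out of a run-webkit-tests log like the one above with a few lines of Python. This is only a minimal sketch, assuming the `<test path> -> timed out` line format seen in the log; the function name `find_timed_out_tests` and the sample text are hypothetical and not part of any WebKit tooling.

```python
import re

# Matches lines such as "media/video-no-autoplay.html -> timed out".
# The line format is an assumption taken from the log in this report.
TIMEOUT_LINE = re.compile(r"^\s*(\S+)\s+->\s+timed out\s*$")


def find_timed_out_tests(log_text):
    """Return the test paths reported as timed out, in order of appearance."""
    tests = []
    for line in log_text.splitlines():
        match = TIMEOUT_LINE.match(line)
        if match:
            tests.append(match.group(1))
    return tests


# Hypothetical excerpt modelled on the log above.
sample_log = """Testing 12455 test cases.
media/video-no-autoplay.html -> timed out
Sampling process 40023 for 10 seconds with 10 milliseconds of run time between samples
"""

print(find_timed_out_tests(sample_log))
```

Run over each commit-queue cycle's log, a helper like this would show which media tests time out and how often, which is useful when the failure is intermittent.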
Comment 1 Eric Seidel (no email) 2010-03-08 23:33:13 PST
http://trac.webkit.org/changeset/55699 seems related.

I'll make the commit-bot do a full rebuild of WebKit and see if that fixes the issue.
Comment 2 Eric Seidel (no email) 2010-03-08 23:36:35 PST
The commit-queue was blocked on a red tree most of the afternoon. The last time it cycled without blocking was:
Empty queue
No work item. Sleeping until 2010-03-08 15:37:12 (5 mins).

The last commit it made (and thus the last time it ran the tests successfully) before going empty was http://trac.webkit.org/changeset/55680, which landed at 03/08/10 13:06:08.
Comment 3 Eric Seidel (no email) 2010-03-08 23:38:39 PST
I'm technically on vacation all week, so I will not be checking the queue regularly.  However, others can view its status using http://webkit-commit-queue.appspot.com/

Right now it says "Unabled to successfully build and test" due to this test timeout.
Comment 4 Eric Seidel (no email) 2010-03-09 11:17:12 PST
This seems to have resolved itself for now.  Closing.
Comment 5 Eric Seidel (no email) 2010-03-09 11:24:53 PST
I did a clean build, and then hit a build error in XPathGrammar.h.  Looking at that file's history, it seemed it hadn't changed in weeks, so I did another clean build and now the commit-queue is running fine.  Not sure what caused either the consistent timeout problem or the spurious build error.
Comment 6 Eric Seidel (no email) 2010-03-10 02:03:51 PST
Now media/video-play-empty-events.html -> timed out
is failing.  I think something is still wrong.
Comment 7 Eric Seidel (no email) 2010-03-10 02:05:01 PST
I've seen the no-autoplay test fail again too. I think 55699 may be a real regression.
Comment 8 Eric Seidel (no email) 2010-03-10 10:40:19 PST
This timeout is intermittent.

The bots seem to be seeing timeouts as well, possibly related to this.

This regression, along with intermittent timeouts related to bug 35824, is currently blocking 19 patches in the commit-queue.
Comment 9 Brady Eidson 2010-03-10 10:52:03 PST
(In reply to comment #8)
> This timeout is intermittent.
> 
> The bots seem to be seeing timeouts as well, possibly related to this.
> 
> This regression, along with intermittent timeouts related to bug 35824, is
> currently blocking 19 patches in the commit-queue.


What "timeout" did the fix for 35824 cause, out of curiosity?

I'm aware of the test failures that we're cleaning up on various bots by removing cookies.plists.  But this is the first I've heard of any timeouts.
Comment 10 Eric Seidel (no email) 2010-03-10 10:56:41 PST
(In reply to comment #9)
> What "timeout" did the fix for 35824 cause, out of curiosity?
> 
> I'm aware of the test failures that we're cleaning up on various bots by
> removing cookies.plists.  But this is the first I've heard of any timeouts.

As you correctly read, this bug is not related to cookies.  This bug is about http://trac.webkit.org/changeset/55699, which seems to have caused an intermittent media/video-no-autoplay.html timeout on the commit bot, and may be responsible for other media timeouts on the bots:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r55768%20(11425)/results.html
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r55760%20(11417)/results.html
Comment 11 Darin Adler 2010-03-10 11:03:08 PST
(In reply to comment #10)
> As you correctly read, this bug is not related to cookies.  This bug is about
> http://trac.webkit.org/changeset/55699, which seems to have caused an
> intermittent media/video-no-autoplay.html timeout on the commit bot, and may be
> responsible for other media timeouts on the bots:
> http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r55768%20(11425)/results.html
> http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r55760%20(11417)/results.html

So you've said it's my check-in causing this; that gets my attention. Does anyone have any other data? For example, is this Leopard-only? Is it bot-only or is someone seeing it on their own computer too?
Comment 12 Eric Seidel (no email) 2010-03-10 12:58:18 PST
The Leopard Commit Bot has also seen:
media/video-played-collapse.html -> timed out
since this regression.
Comment 13 Eric Seidel (no email) 2010-03-10 12:58:58 PST
Created attachment 50428 [details]
Hang report from Leopard Commit Bot (media/video-played-collapse.html)
Comment 14 Eric Seidel (no email) 2010-03-10 13:01:43 PST
I checked the logs.  The first failure the commit queue saw was of "media/video-no-autoplay" and was shortly after 2010-03-08 20:30:27
Comment 15 Eric Seidel (no email) 2010-03-10 13:05:08 PST
The bots have seen http/tests/media/video-play-stall.html time out, but we don't have hang reports from the bots.  I don't have data on when the bots first saw http/tests/media/video-play-stall.html time out.
Comment 16 Eric Seidel (no email) 2010-03-10 13:06:31 PST
This looks like a hang down in CG/AppKit/LayerKit.  Perhaps I just need to restart the commit bot?  I've CC'd Eric Carlson and Simon Fraser in case either of them has theories.
Comment 17 Simon Fraser (smfr) 2010-03-10 14:00:16 PST
We disable accelerated compositing on Leopard now (until some future QuickTime version revives it). I don't understand how we're getting into compositing mode.
Comment 18 Darin Adler 2010-03-10 14:54:50 PST
(In reply to comment #17)
> We disable accelerated compositing on Leopard now (until some future QuickTime
> version revives it). I don't understand how we're getting into compositing
> mode.

WebGL was turned on for the buildbots; I think Oliver did it. Since <http://trac.webkit.org/changeset/55685> and <http://trac.webkit.org/changeset/55697>, that means that accelerated compositing is on for the Leopard buildbots.
Comment 19 Simon Fraser (smfr) 2010-03-10 15:13:27 PST
I didn't know that a) cmarrin put the webgl forcing pref back in, and b) that webgl was turned on in DRT.

So maybe this is the CoreVideo crash?
Comment 20 Darin Adler 2010-03-10 15:14:14 PST
(In reply to comment #19)
> I didn't know that a) cmarrin put the webgl forcing pref back in, and b) that
> webgl was turned on in DRT.
> 
> So maybe this is the CoreVideo crash?

Seems likely.
Comment 21 Eric Seidel (no email) 2010-03-10 22:48:57 PST
I agree, this is very reminiscent of the historical CV issues (at least from what I saw of them).

I have two CV-related crash dumps since 3/8/10 on the commit bot.  I'm happy to attach both if that would be helpful.
Comment 22 Eric Seidel (no email) 2010-03-11 09:49:15 PST
This is still the primary cause of build.webkit.org redness, and I believe it is the reason why the commit queue is still stuck at around 19 patches (every so often it actually succeeds in landing one, when this test happens to have passed on the bots and passes twice in a row).

Is someone still investigating or should we roll 55699 out?
Comment 23 Simon Fraser (smfr) 2010-03-11 09:55:55 PST
If anything gets rolled out, it should be 55697.
Comment 24 Eric Seidel (no email) 2010-03-11 10:25:26 PST
I agree.  Let's roll out r55697 until we have a work-around for this evil CV bug.
Comment 25 Simon Fraser (smfr) 2010-03-11 10:37:02 PST
Or we just turn off webgl on the bots. I think Oliver Hunt turned it on.
Comment 26 Simon Fraser (smfr) 2010-03-11 14:33:45 PST
I think this is really just a DRT bug: bug 36033.
Comment 27 Simon Fraser (smfr) 2010-03-11 14:52:39 PST
Created attachment 50546 [details]
Patch
Comment 28 Oliver Hunt 2010-03-11 15:01:18 PST
Comment on attachment 50546 [details]
Patch

r=me
Comment 29 Eric Seidel (no email) 2010-03-15 16:07:39 PDT
Attachment 50546 [details] was posted by a committer and has review+, assigning to Simon Fraser for commit.
Comment 30 Simon Fraser (smfr) 2010-04-02 09:54:11 PDT
Landed in http://trac.webkit.org/changeset/55857