Bug 28845

Summary: REGRESSION: media/video-size-intrinsic-scale.html (and other media tests?) crashing/timing-out intermittently
Product: WebKit
Reporter: Eric Seidel (no email) <eric>
Component: New Bugs
Assignee: Simon Fraser (smfr) <simon.fraser>
Status: RESOLVED FIXED
Severity: Normal
CC: commit-queue, eric.carlson, simon.fraser
Priority: P1
Keywords: InRadar
Version: 528+ (Nightly build)
Hardware: PC
OS: OS X 10.5
Bug Depends on:
Bug Blocks: 29035, 28624, 29037, 38912

Eric Seidel (no email)
Reported 2009-08-31 03:54:58 PDT
Created attachment 38807 [details]
crash report for media/view-src-add-src
media/video-source-add-src.html (and other media tests?) crashing intermittently
As seen in bug 28827 (by the commit queue). Attaching crash log.
Attachments
crash report for media/view-src-add-src (51.30 KB, text/plain)
2009-08-31 03:54 PDT, Eric Seidel (no email)
no flags
same crash, seen with compositing/geometry/clipping-foreground on 8/28 (47.46 KB, text/plain)
2009-08-31 03:57 PDT, Eric Seidel (no email)
no flags
Skip media/video-size-intrinsic-scale.html on leopard and re-enable media/video-source-add-src.html (4.15 KB, patch)
2009-09-16 19:38 PDT, Eric Seidel (no email)
no flags
Eric Seidel (no email)
Comment 1 2009-08-31 03:57:40 PDT
Created attachment 38808 [details]
same crash, seen with compositing/geometry/clipping-foreground on 8/28
Here is another crash report (of the same crash) seen while trying to land bug 28709:
compositing/geometry/clipping-foreground.html -> crashed
Eric Seidel (no email)
Comment 2 2009-08-31 04:00:12 PDT
The first report of this I see locally is from 8/28, so I assume it is a recent regression.
Eric Seidel (no email)
Comment 3 2009-09-01 03:37:14 PDT
Just had media/video-source-add-src.html timeout while trying to land bug 28808. I expect it's a similar issue.
Eric Seidel (no email)
Comment 4 2009-09-01 03:39:33 PDT
Eric Seidel (no email)
Comment 5 2009-09-01 16:16:18 PDT
Hit this again when trying to land bug 28844. :( I think we should consider disabling this test as it seems most prone to crash.
Simon Fraser (smfr)
Comment 6 2009-09-01 16:29:42 PDT
Eric is back next week.
Eric Seidel (no email)
Comment 7 2009-09-01 16:36:44 PDT
media/video-source-add-src.html -> timed out
Saw the timeout again with bug 28776.
Eric Seidel (no email)
Comment 8 2009-09-01 16:39:11 PDT
This could have the same root cause as bug 28624.
Eric Seidel (no email)
Comment 9 2009-09-01 16:49:01 PDT
media/video-source-add-src.html has timed out the last 2 runs on the bots:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r47948%20(4600)/
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r47949%20(4601)/
Time to skip this test. :(
The revision which "started" this recent bout of timeouts is unrelated:
http://trac.webkit.org/changeset/47948
Eric Seidel (no email)
Comment 10 2009-09-01 16:52:44 PDT
I will attach and then commit a patch which moves media/video-source-add-src.html to media/video-source-add-src.html-disabled to make the bots green. Eric can re-enable this test when he returns next week. :(
Simon Fraser (smfr)
Comment 11 2009-09-01 16:54:12 PDT
Why not just add it to the skipped list?
Eric Seidel (no email)
Comment 12 2009-09-01 16:56:21 PDT
I could add it to all the skipped lists if you prefer. That seems like a larger change for little/no? gain. I'm comfortable with either solution.
Eric Seidel (no email)
Comment 13 2009-09-01 17:15:56 PDT
Eric Seidel (no email)
Comment 14 2009-09-01 17:16:37 PDT
Came to consensus with Simon over IRC in #webkit.
Simon Fraser (smfr)
Comment 15 2009-09-01 17:18:14 PDT
Eric Seidel (no email)
Comment 16 2009-09-02 00:47:49 PDT
I saw another example of this crash last night while trying to land bug 28225:
compositing/color-matching/image-color-matching.html -> crashed
So sadly skipping the test did not get rid of the crashing flakiness. It might have gotten rid of the unexplained timeouts, but we'll need to dig more to solve this crasher. :(
Simon Fraser (smfr)
Comment 17 2009-09-02 09:14:17 PDT
This may be a bug in a system framework on 10.5.8. Eric, you could try running 10.6 to see if the crashes happen there.
Eric Seidel (no email)
Comment 18 2009-09-02 13:51:42 PDT
Unfortunately I don't yet have Snow Leopard and it's not officially supported for Google Macs yet (we have some outstanding radars with Apple specific to our setup). That said, even if this is Leopard specific... we may need to work around it for Leopard users. I think we need to start by one of us staring at that crash trace and theorizing what could be hosed in it.
Simon Fraser (smfr)
Comment 19 2009-09-02 16:18:23 PDT
It's something in Core Animation/AppKit teardown. I'll follow up in a week and a half when I get back from vacation.
Eric Seidel (no email)
Comment 20 2009-09-03 15:57:29 PDT
Trying to land bug 28406 was yet another victim to this regression. :(
Eric Seidel (no email)
Comment 21 2009-09-04 00:42:36 PDT
Bug 28961 was yet another victim! Blocked from landing by a flaky media test crash. :(
Eric Seidel (no email)
Comment 22 2009-09-04 03:21:18 PDT
Likely related timeout while trying to land bug 28931. :(
Eric Seidel (no email)
Comment 23 2009-09-06 02:04:17 PDT
Saw the timeout: media/video-source-error.html -> timed out when landing bug 28993 as well. Maybe the timeouts are a separate bug, but I think they're probably related to this one.
Eric Seidel (no email)
Comment 24 2009-09-08 09:10:54 PDT
I went back through the crash logs from the commit-queue for last week. They were all media related. :) I filed separate bugs (bug 29035 and bug 29037) for crashes which are probably caused by this one, but have different stack traces. Hopefully between the 3 separate stack traces what is wrong will be obvious to Eric or Simon. :(
Eric Seidel (no email)
Comment 25 2009-09-09 09:55:28 PDT
video-no-autoplay just crashed on a Leopard bot. w/o a crash log it's difficult to know if it's related or not, but given how many different tests have produced these similar crash dumps, it probably is: http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48210%20(4835)/results.html
Eric Seidel (no email)
Comment 26 2009-09-09 16:00:01 PDT
I'm unsure what to do here. It may be time for me to dive in and try to fix this myself. This bug is the largest pain point with the commit-queue right now. :( The bots just hit this again: http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48231%20(4853)/results.html
Eric Seidel (no email)
Comment 27 2009-09-09 16:14:16 PDT
I wonder if we could catch this by running the media layout tests with NSZombies enabled.
Eric Seidel (no email)
Comment 28 2009-09-09 17:15:27 PDT
> run-webkit-tests --guard media
Testing 92 test cases.
media ............................
media/remove-from-document.html -> failed
media/video-aspect-ratio.html -> failed
media/video-layer-crash.html -> failed
media/video-load-networkState.html -> failed
media/video-loop.html -> failed
media/video-played-ranges-1.html -> failed
media/video-seek-past-end-playing.html -> failed
media/video-seeking.html -> failed
media/video-transformed.html -> failed
media/video-zoom-controls.html -> failed
media/video-zoom.html -> failed
Most of them were timeouts, but a few of them were actual failures which could be memory related. (I removed some of the extra dots from the output)
Eric Seidel (no email)
Comment 29 2009-09-11 09:30:17 PDT
3 more rejections last night, and another crash on the bots. :( http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48301%20(4915)/
Eric Seidel (no email)
Comment 30 2009-09-11 17:26:57 PDT
Another instance on the bots: http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48323%20(4938)/results.html Four more spurious rejections today, landing bug 29210, bug 29159, bug 28831, and bug 26715. :(
Eric Seidel (no email)
Comment 31 2009-09-11 17:28:21 PDT
It might be possible to write a git bisect command which ran the media layout tests 10 times and exited 0 if they all passed. That could help us figure out which revision caused this failure.
Eric Seidel (no email)
Comment 32 2009-09-11 17:37:38 PDT
test.sh:
#!/bin/sh
build-webkit || exit 125 # ignore broken builds
for (( i = 0 ; i <= 10; i++ ))
do
    build-dump-render-tree || exit 125 # ignore builds with a broken DRT build
    run-webkit-tests media || exit 1 # Mark builds as bad if the media tests fail.
done
git bisect start known_bad_revision known_good_revision
git bisect run ./test.sh
known_bad_revision would be something from August 21st or whenever we decided this first started failing.
known_good_revision would be a git commit from sometime before that. maybe august 1st?
Eric Seidel (no email)
Comment 33 2009-09-11 18:14:14 PDT
Here is a more realistic test.sh:
#!/bin/sh
echo "Building WebKit"
WebKitTools/Scripts/build-webkit > /dev/null || exit 125 # Ignore broken builds.
for (( i = 0 ; i <= 10; i++ ))
do
    WebKitTools/Scripts/build-dumprendertree > /dev/null || exit 125 # Ignore revisions with a broken DRT build.
    WebKitTools/Scripts/run-webkit-tests --quiet --no-sample-on-timeout --no-launch-safari --exit-after-n-failures=1 media || exit 1 # Mark builds as bad if the media tests fail.
done
echo "Passed"
Unfortunately that still produces false positives. :( I'm not able to get this to fail reliably enough to test yet.
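The repeat-until-first-failure loop in the test.sh scripts above can be sketched as a small reusable helper. This is only an illustration: run_n_times and its variable names are hypothetical and not part of WebKitTools. It is written in plain POSIX sh because the C-style for (( ... )) loop is a bash extension and only works under #!/bin/sh where sh is really bash. For git bisect run, exit status 0 marks a revision good, 125 skips it, and other statuses from 1 to 127 mark it bad.

```shell
#!/bin/sh
# Hypothetical helper (not a WebKit script): run a command up to N times
# and stop at the first failing run.
# Usage: run_n_times N command [args...]
# Returns 0 only if all N runs succeed, 1 as soon as one run fails.
run_n_times() {
    _runs=$1
    shift
    _i=1
    while [ "$_i" -le "$_runs" ]; do
        "$@" || return 1
        _i=$((_i + 1))
    done
    return 0
}

# Possible use inside a bisect script (not executed here):
#   run_n_times 10 WebKitTools/Scripts/run-webkit-tests --quiet media || exit 1
```

A higher run count lowers the chance that a bad revision passes by luck, at the cost of a slower bisect. It does not help if good revisions also fail intermittently.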
Eric Seidel (no email)
Comment 34 2009-09-11 18:18:22 PDT
I've not yet seen these crashes on the Snow Leopard bot. However the Snow Leopard bot hasn't been around very long, and it's possible that bug 29216 is masking one of these failures.
Eric Seidel (no email)
Comment 35 2009-09-11 18:39:32 PDT
HUZZAH! I found the culprit! media/video-size-intrinsic-scale.html
run-webkit-tests media/video-size-intrinsic-scale.html media/video-size-intrinsic-scale.html ... (pasted 100 more times or so)
crashes reliably for me! Generally it crashes after about the 8th run. We may just be able to skip this test on Leopard to get rid of this bug!
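Instead of pasting the path by hand, the repeated argument list can be generated. A minimal sketch, assuming a POSIX shell and a test path without spaces; repeat_arg is a hypothetical helper name, not an existing script:

```shell
#!/bin/sh
# Hypothetical helper: print ARG on its own line COUNT times.
# Usage: repeat_arg COUNT ARG
repeat_arg() {
    _count=$1
    _arg=$2
    _n=0
    while [ "$_n" -lt "$_count" ]; do
        printf '%s\n' "$_arg"
        _n=$((_n + 1))
    done
}

# Possible use (not executed here). The unquoted substitution is split on
# whitespace, so the test path is passed as 100 separate arguments:
#   run-webkit-tests $(repeat_arg 100 media/video-size-intrinsic-scale.html)
```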
Eric Seidel (no email)
Comment 36 2009-09-11 18:59:48 PDT
OK. Time for me to retire for the weekend. I'll look more at this on Monday, including considering skipping media/video-size-intrinsic-scale.html on Leopard until Eric/Simon have a chance to figure out what's actually wrong.
Simon Fraser (smfr)
Comment 37 2009-09-15 13:02:22 PDT
I ran media/video-source-add-src.html 1000 times with no crash. I did see one hang related to this bug when running the media tests 100 times. Would be nice to know how to make this more reproducible.
Eric Seidel (no email)
Comment 38 2009-09-15 13:14:53 PDT
(In reply to comment #37)
> I ran media/video-source-add-src.html 1000 times with no crash. I did see one
> hang related to this bug when running the media tests 100 times. Would be nice
> to know how to make this more reproducible.
The crash is repeatable for me running media/video-size-intrinsic-scale.html a few times. :) add-src is not the problem test, it just happened to be the test after media/video-size-intrinsic-scale.html.
Simon Fraser (smfr)
Comment 39 2009-09-15 14:52:57 PDT
I can run media/video-size-intrinsic-scale.html lots of times with no issues. Eric, what's your hardware?
Eric Seidel (no email)
Comment 40 2009-09-15 15:04:08 PDT
(In reply to comment #39)
> I can run media/video-size-intrinsic-scale.html lots of times with no issues.
>
> Eric, what's your hardware?
Mac Pro (Quad). I'll send you a system profiler report.
> WebKitTools/Scripts/run-webkit-tests --iterations 100 --no-sample-on-timeout media/video-size-intrinsic-scale.html
Testing 1 test cases 100 times.
media ............
media/video-size-intrinsic-scale.html -> timed out
....................
media/video-size-intrinsic-scale.html -> timed out
............................
media/video-size-intrinsic-scale.html -> timed out
............
media/video-size-intrinsic-scale.html -> timed out
...........................
media/video-size-intrinsic-scale.html -> timed out
.
I'm not seeing it crash today, interesting.
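To compare runs like the one above across machines or build configurations, the number of timeouts or crashes can be tallied from saved runner output. A minimal sketch, assuming the output uses the "test -> outcome" lines seen in this bug; count_outcome and results.txt are hypothetical names:

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that report the given
# outcome, e.g. "timed out", "crashed" or "failed".
# Usage: count_outcome OUTCOME < saved-output
count_outcome() {
    # -e keeps the leading "-" of the pattern from being read as an option.
    # grep exits 1 when nothing matches, so "|| true" keeps the status at 0
    # while still printing a count of 0.
    grep -c -e "-> $1" || true
}

# Possible use (not executed here):
#   run-webkit-tests --iterations 100 media/video-size-intrinsic-scale.html > results.txt
#   count_outcome 'timed out' < results.txt
```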
Simon Fraser (smfr)
Comment 41 2009-09-15 15:08:37 PDT
Interesting, I'm testing on a dual CPU iMac. I'll try on my Mac Pro.
Eric Seidel (no email)
Comment 42 2009-09-15 15:51:27 PDT
Wow, interesting. I did a clean build of Debug (previously I was testing with release):
run-webkit-tests --iterations 100 --no-sample-on-timeout media/video-size-intrinsic-scale.html --debug
passed! I'll try a clean build of Release to check.
Eric Seidel (no email)
Comment 43 2009-09-15 17:39:14 PDT
I did a clean Release build. I'm able to get media/video-size-intrinsic-scale.html to crash repeatably for Release, but not for Debug builds. I'm able to get media/video-size-intrinsic-scale.html to crash running DRT under the debugger in Xcode by passing the full path to media/video-size-intrinsic-scale.html 10 times in the arguments panel for the DumpRenderTree executable in my WebCore target.
Eric Seidel (no email)
Comment 44 2009-09-15 17:48:54 PDT
It would now be possible to write a git bisect command to test if this is a regression. That would be complicated by the fact that a bunch of flags that we would want to use in run-webkit-tests have only recently been added. Probably would be simplest to start by checking out the revision where media/video-size-intrinsic-scale.html was added and see if the test can be made to crash there.
Simon Fraser (smfr)
Comment 45 2009-09-16 12:54:48 PDT
This turns out to be an issue in CoreVideo, tracked via <rdar://problem/7228836>. I can't see any way to work around it, other than running Snow Leopard.
Eric Seidel (no email)
Comment 46 2009-09-16 13:03:36 PDT
Do we know if this is a regression in 10.5.8? Did we start tripping this with some certain tests? Can we skip those tests on Leopard? As far as I can tell this crasher is relatively new. :) I don't know if it's caused by code changes on our part, the 10.5.8 update, or by test additions, but presumably you could tell us given your radar knowledge. :)
Simon Fraser (smfr)
Comment 47 2009-09-16 13:22:29 PDT
The bug is related to the use of CVDisplayLinks, which are used by Core Animation when hardware compositing is enabled. So it's likely that this started to show up more when we turned on hardware-compositing for <video>. From what I can determine, it's a race condition that is simply a function of the frequency with which we turn over CVDisplayLinks. I tried changing the order of teardown of video elements vs. the WebHTMLView, and that did not avoid the bug.
Eric Seidel (no email)
Comment 48 2009-09-16 13:28:42 PDT
(In reply to comment #47)
> The bug is related to the use of CVDisplayLinks, which are used by Core
> Animation when hardware compositing is enabled. So it's likely that this
> started to show up more when we turned on hardware-compositing for <video>.
> From what I can determine, it's a race condition that is simply a function of
> the frequency with which we turn over CVDisplayLinks.
>
> I tried changing the order of teardown of video elements vs. the WebHTMLView,
> and that did not avoid the bug.
Do you know if this bug is triggered by all video tests, or just a couple of them? If it's just a couple of them it would be easy to skip them on Leopard.
Simon Fraser (smfr)
Comment 49 2009-09-16 13:32:36 PDT
I don't see any obvious correlation between occurrences of the bug, and what any specific test is doing (other than triggering hardware). However, I guess we could try skipping media/video-size-intrinsic-scale.html on leopard.
Eric Seidel (no email)
Comment 50 2009-09-16 19:38:24 PDT
Created attachment 39676 [details]
Skip media/video-size-intrinsic-scale.html on leopard and re-enable media/video-source-add-src.html
WebKit Commit Bot
Comment 51 2009-09-17 12:52:18 PDT
Comment on attachment 39676 [details]
Skip media/video-size-intrinsic-scale.html on leopard and re-enable media/video-source-add-src.html
Clearing flags on attachment: 39676
Committed r48485: <http://trac.webkit.org/changeset/48485>
WebKit Commit Bot
Comment 52 2009-09-17 12:52:27 PDT
All reviewed patches have been landed. Closing bug.
Eric Seidel (no email)
Comment 53 2009-09-17 13:34:51 PDT
Hopefully these crashes will all go away now. I'll close the rest of the dependent bugs if those stop too.
Eric Seidel (no email)
Comment 54 2009-09-22 09:57:49 PDT
Triple whammy on the bots just now. :(
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48636%20(5229)/results.html
Tests that timed out:
compositing/color-matching/image-color-matching.html stderr
media/video-source.html stderr
Tests that caused the DumpRenderTree tool to crash:
media/controls-right-click-on-timebar.html stderr
I suspect those are all related to this root cause.