Created attachment 38807 [details] crash report for media/video-source-add-src.html (and other media tests?) crashing intermittently. As seen in bug 28827 (by the commit queue). Attaching crash log.
Created attachment 38808 [details] same crash, seen with compositing/geometry/clipping-foreground on 8/28 Here is another crash report (of the same crash) seen while trying to land bug 28709: compositing/geometry/clipping-foreground.html -> crashed
The first report of this I see locally is from 8/28, so I assume it is a recent regression.
Just had media/video-source-add-src.html time out while trying to land bug 28808. I expect it's a similar issue.
The bots just saw this timeout too! http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r47923%20(4574)/results.html
Hit this again when trying to land bug 28844. :( I think we should consider disabling this test as it seems most prone to crash.
Eric is back next week.
Saw the timeout again with bug 28776:
media/video-source-add-src.html -> timed out
This could have the same root cause as bug 28624.
media/video-source-add-src.html has timed out the last 2 runs on the bots:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r47948%20(4600)/
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r47949%20(4601)/
Time to skip this test. :( The revision which "started" this recent bout of timeouts is unrelated: http://trac.webkit.org/changeset/47948
I will attach and then commit a patch which moves media/video-source-add-src.html to media/video-source-add-src.html-disabled to make the bots green. Eric can re-enable this test when he returns next week. :(
Why not just add it to the skipped list?
I could add it to all the skipped lists if you prefer. That seems like a larger change for little or no gain. I'm comfortable with either solution.
Committed r47951: <http://trac.webkit.org/changeset/47951>
Came to consensus with Simon over IRC in #webkit.
<rdar://problem/7189153>
I saw another example of this crash last night while trying to land bug 28225: compositing/color-matching/image-color-matching.html -> crashed. So sadly, skipping the test did not get rid of the crashing flakiness. It might have gotten rid of the unexplained timeouts, but we'll need to dig more to solve this crasher. :(
This may be a bug in a system framework on 10.5.8. Eric, you could try running 10.6 to see if the crashes happen there.
Unfortunately I don't yet have Snow Leopard, and it's not officially supported for Google Macs yet (we have some outstanding radars with Apple specific to our setup). That said, even if this is Leopard-specific, we may need to work around it for Leopard users. I think we need to start by having one of us stare at that crash trace and theorize about what could be hosed in it.
It's something in Core Animation/AppKit teardown. I'll follow up in a week and a half when I get back from vacation.
Trying to land bug 28406 was yet another victim to this regression. :(
Bug 28961 was yet another victim, blocked from landing by a flaky media test crash. :(
Likely related timeout while trying to land bug 28931. :(
Saw the timeout: media/video-source-error.html -> timed out when landing bug 28993 as well. Maybe the timeouts are a separate bug, but I think they're probably related to this one.
I went back through the crash logs from the commit-queue for last week. They were all media related. :) I filed separate bugs (bug 29035 and bug 29037) for crashes which are probably caused by this one, but have different stack traces. Hopefully between the 3 separate stack traces what is wrong will be obvious to Eric or Simon. :(
video-no-autoplay just crashed on a Leopard bot. w/o a crash log it's difficult to know if it's related or not, but given how many different tests have produced these similar crash dumps, it probably is: http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48210%20(4835)/results.html
I'm unsure what to do here. It may be time for me to dive in and try to fix this myself. This bug is the largest pain point with the commit-queue right now. :( The bots just hit this again: http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48231%20(4853)/results.html
I wonder if we could catch this by running the media layout tests with NSZombies enabled.
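A sketch of what that might look like (NSZombieEnabled is the standard Foundation debugging variable; the exact run-webkit-tests flags here are just one plausible invocation, not tested):

```shell
# With zombies enabled, a message sent to a deallocated Objective-C object
# logs and traps at the offending call site instead of corrupting memory
# and crashing later in an unrelated stack.
export NSZombieEnabled=YES
WebKitTools/Scripts/run-webkit-tests --no-launch-safari media
```

DumpRenderTree should inherit the environment from run-webkit-tests, but that's worth verifying before trusting a clean run.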
> run-webkit-tests --guard media
Testing 92 test cases.
media ............................
media/remove-from-document.html -> failed
media/video-aspect-ratio.html -> failed
media/video-layer-crash.html -> failed
media/video-load-networkState.html -> failed
media/video-loop.html -> failed
media/video-played-ranges-1.html -> failed
media/video-seek-past-end-playing.html -> failed
media/video-seeking.html -> failed
media/video-transformed.html -> failed
media/video-zoom-controls.html -> failed
media/video-zoom.html -> failed

Most of them were timeouts, but a few of them were actual failures which could be memory related. (I removed some of the extra dots from the output.)
3 more rejections last night, and another crash on the bots. :( http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48301%20(4915)/
Another instance on the bots: http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48323%20(4938)/results.html Four more spurious rejections today, landing bug 29210, bug 29159, bug 28831, and bug 26715. :(
It might be possible to write a git bisect command which ran the media layout tests 10 times and exited 0 if they all passed. That could help us figure out which revision caused this failure.
test.sh:

#!/bin/bash
build-webkit || exit 125 # Ignore broken builds.
for (( i = 0 ; i <= 10; i++ ))
do
    build-dumprendertree || exit 125 # Ignore builds with a broken DRT.
    run-webkit-tests media || exit 1 # Mark the revision as bad if the media tests fail.
done

Then:

git bisect start known_bad_revision known_good_revision
git bisect run ./test.sh

known_bad_revision would be something from August 21st, or whenever we decided this first started failing. known_good_revision would be a git commit from sometime before that, maybe August 1st?
Here is a more realistic test.sh:

#!/bin/bash
echo "Building WebKit"
WebKitTools/Scripts/build-webkit > /dev/null || exit 125 # Ignore broken builds.
for (( i = 0 ; i <= 10; i++ ))
do
    WebKitTools/Scripts/build-dumprendertree > /dev/null || exit 125 # Ignore revisions with a broken DRT build.
    WebKitTools/Scripts/run-webkit-tests --quiet --no-sample-on-timeout --no-launch-safari --exit-after-n-failures=1 media || exit 1 # Mark the revision as bad if the media tests fail.
done
echo "Passed"

Unfortunately that still produces false positives. :( I'm not able to get this to fail reliably enough to test with yet.
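For reference, git bisect run interprets the script's exit status as follows: 0 marks the revision good, 125 means "cannot test, skip this revision", and any other value from 1 through 127 marks it bad (which is why the scripts above exit 125 on build failures and 1 on test failures). A toy repository demonstrates the mechanics (all names here are made up for the demo):

```shell
# Build a throwaway repo with five commits; pretend the bug appears at
# commit 4. The bisect command exits 0 (good) while counter < 4 and 1 (bad)
# otherwise, so bisect should converge on "commit 4".
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email bisect@example.com
git config user.name "Bisect Demo"
for n in 1 2 3 4 5; do
    echo "$n" > counter
    git add counter
    git commit -qm "commit $n"
done
git bisect start HEAD HEAD~4   # HEAD (commit 5) is bad, commit 1 is good
git bisect run sh -c 'test "$(cat counter)" -lt 4'
git log -1 --format=%s refs/bisect/bad   # the first bad commit found
```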
I've not yet seen these crashes on the Snow Leopard bot. However the Snow Leopard bot hasn't been around very long, and it's possible that bug 29216 is masking one of these failures.
HUZZAH! I found the culprit: media/video-size-intrinsic-scale.html!

run-webkit-tests media/video-size-intrinsic-scale.html media/video-size-intrinsic-scale.html ... (pasted 100 more times or so)

crashes reliably for me! Generally it crashes after about the 8th run. We may just be able to skip this test on Leopard to get rid of this bug!
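Pasting the path a hundred times works, but the repetition can also be scripted; here's a small sketch (the function name and structure are mine, not anything in the tree):

```shell
# repeat_until_fail: run a command up to $1 times, stopping at the first
# failure and reporting which iteration broke.
repeat_until_fail() {
    count="$1"; shift
    i=1
    while [ "$i" -le "$count" ]; do
        if ! "$@"; then
            echo "failed on iteration $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "passed all $count iterations"
}

# e.g.: repeat_until_fail 100 WebKitTools/Scripts/run-webkit-tests media/video-size-intrinsic-scale.html
```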
OK. Time for me to retire for the weekend. I'll look more at this on Monday, including considering skipping media/video-size-intrinsic-scale.html on Leopard until Eric/Simon have a chance to figure out what's actually wrong.
I ran media/video-source-add-src.html 1000 times with no crash. I did see one hang related to this bug when running the media tests 100 times. Would be nice to know how to make this more reproducible.
(In reply to comment #37)
> I ran media/video-source-add-src.html 1000 times with no crash. I did see one
> hang related to this bug when running the media tests 100 times. Would be nice
> to know how to make this more reproducible.

The crash is repeatable for me running media/video-size-intrinsic-scale.html a few times. :) add-src is not the problem test; it just happened to be the test after media/video-size-intrinsic-scale.html.
I can run media/video-size-intrinsic-scale.html lots of times with no issues. Eric, what's your hardware?
(In reply to comment #39)
> I can run media/video-size-intrinsic-scale.html lots of times with no issues.
>
> Eric, what's your hardware?

Mac Pro (Quad). I'll send you a system profiler report.

> WebKitTools/Scripts/run-webkit-tests --iterations 100 --no-sample-on-timeout media/video-size-intrinsic-scale.html
Testing 1 test cases 100 times.
media ............ media/video-size-intrinsic-scale.html -> timed out
.................... media/video-size-intrinsic-scale.html -> timed out
............................ media/video-size-intrinsic-scale.html -> timed out
............ media/video-size-intrinsic-scale.html -> timed out
........................... media/video-size-intrinsic-scale.html -> timed out
.

I'm not seeing it crash today, interesting.
Interesting, I'm testing on a dual CPU iMac. I'll try on my Mac Pro.
Wow, interesting. I did a clean build of Debug (previously I was testing with Release):

run-webkit-tests --iterations 100 --no-sample-on-timeout media/video-size-intrinsic-scale.html --debug

passed! I'll try a clean build of Release to check.
I did a clean Release build. I'm able to get media/video-size-intrinsic-scale.html to crash repeatably for Release builds, but not for Debug builds. I'm also able to get media/video-size-intrinsic-scale.html to crash running DRT under the debugger in Xcode by passing the full path to media/video-size-intrinsic-scale.html 10 times in the arguments panel for the DumpRenderTree executable in my WebCore target.
It would now be possible to write a git bisect command to test if this is a regression. That would be complicated by the fact that a bunch of flags that we would want to use in run-webkit-tests have only recently been added. Probably would be simplest to start by checking out the revision where media/video-size-intrinsic-scale.html was added and see if the test can be made to crash there.
This turns out to be an issue in CoreVideo, tracked via <rdar://problem/7228836>. I can't see any way to work around it, other than running Snow Leopard.
Do we know if this is a regression in 10.5.8? Did we start tripping this with certain tests? Can we skip those tests on Leopard? As far as I can tell this crasher is relatively new. I don't know if it's caused by code changes on our part, the 10.5.8 update, or by test additions, but presumably you could tell us given your radar knowledge. :)
The bug is related to the use of CVDisplayLinks, which are used by Core Animation when hardware compositing is enabled. So it's likely that this started to show up more when we turned on hardware-compositing for <video>. From what I can determine, it's a race condition that is simply a function of the frequency with which we turn over CVDisplayLinks. I tried changing the order of teardown of video elements vs. the WebHTMLView, and that did not avoid the bug.
(In reply to comment #47) > The bug is related to the use of CVDisplayLinks, which are used by Core > Animation when hardware compositing is enabled. So it's likely that this > started to show up more when we turned on hardware-compositing for <video>. > From what I can determine, it's a race condition that is simply a function of > the frequency with which we turn over CVDisplayLinks. > > I tried changing the order of teardown of video elements vs. the WebHTMLView, > and that did not avoid the bug. Do you know if this bug is triggered by all video tests, or just a couple of them? If it's just a couple of them it would be easy to skip them on Leopard.
I don't see any obvious correlation between occurrences of the bug and what any specific test is doing (other than triggering hardware compositing). However, I guess we could try skipping media/video-size-intrinsic-scale.html on Leopard.
Created attachment 39676 [details] Skip media/video-size-intrinsic-scale.html on leopard and re-enable media/video-source-add-src.html
Comment on attachment 39676 [details] Skip media/video-size-intrinsic-scale.html on leopard and re-enable media/video-source-add-src.html Clearing flags on attachment: 39676 Committed r48485: <http://trac.webkit.org/changeset/48485>
All reviewed patches have been landed. Closing bug.
Hopefully these crashes will all go away now. I'll close the rest of the dependent bugs if those stop too.
Triple whammy on the bots just now. :(
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48636%20(5229)/results.html

Tests that timed out:
  compositing/color-matching/image-color-matching.html
  media/video-source.html

Tests that caused the DumpRenderTree tool to crash:
  media/controls-right-click-on-timebar.html

I suspect those are all related to this root cause.