RESOLVED FIXED Bug 38912
REGRESSION: media/video-loop.html is timing out on the commit-queue Leopard Bot
https://bugs.webkit.org/show_bug.cgi?id=38912
Summary REGRESSION: media/video-loop.html is timing out on the commit-queue Leopard Bot
Eric Seidel (no email)
Reported 2010-05-11 09:12:36 PDT
REGRESSION: media/video-loop.html is timing out on the commit-queue Leopard Bot

% grep -i timed commit-queue.log
media/video-loop.html -> timed out
1 test case (<1%) timed out
media/video-loop.html -> timed out
1 test case (<1%) timed out
media/controls-drag-timebar.html -> timed out
1 test case (<1%) timed out
media/video-loop.html -> timed out
1 test case (<1%) timed out
media/video-loop.html -> timed out
1 test case (<1%) timed out
media/video-loop.html -> timed out
1 test case (<1%) timed out

The first timeout occurred shortly after 2010-05-10 22:05. I upgraded the Qt version on the bot last night (to whatever the latest Leopard software update is), and Simon turned on accelerated compositing for Leopard. One or both of those is likely related.
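For anyone wanting to tally the grep output above per test, here is a minimal sketch. It assumes the log contains result lines of the form "path -> timed out", as shown in the report; the function name count_timeouts and the sample list are illustrative only and are not part of any WebKit script.

```python
import re
from collections import Counter

# Matches result lines such as "media/video-loop.html -> timed out".
# Summary lines like "1 test case (<1%) timed out" do not match.
TIMEOUT_LINE = re.compile(r"^(?P<test>\S+) -> timed out$")


def count_timeouts(log_lines):
    """Return a Counter mapping each test path to its number of timeouts."""
    counts = Counter()
    for line in log_lines:
        match = TIMEOUT_LINE.match(line.strip())
        if match:
            counts[match.group("test")] += 1
    return counts


# A few lines copied from the report, used here as sample input.
sample_log = [
    "media/video-loop.html -> timed out",
    "1 test case (<1%) timed out",
    "media/controls-drag-timebar.html -> timed out",
    "1 test case (<1%) timed out",
    "media/video-loop.html -> timed out",
]

timeouts = count_timeouts(sample_log)
for test, count in timeouts.most_common():
    print(f"{test}: {count}")
```

Sorting by count makes it easier to see which test dominates the timeouts across many commit-queue runs.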
Attachments
ridiculous video corruption of console.app seen on commit-queue bot (501.33 KB, image/tiff)
2010-05-11 11:26 PDT, Eric Seidel (no email)
no flags
system profile from the commit-queue bot (9.22 KB, text/plain)
2010-05-11 12:14 PDT, Eric Seidel (no email)
no flags
system profile from my other leopard machine seeing timeouts (10.64 KB, text/plain)
2010-05-17 13:49 PDT, Eric Seidel (no email)
no flags
Patch (1.85 KB, patch)
2010-08-25 13:32 PDT, Eric Seidel (no email)
no flags
Eric Seidel (no email)
Comment 1 2010-05-11 09:14:52 PDT
Sorry, by Qt I meant QT/QuickTime above. The version installed on the bot after last night is 7.6.6.
Eric Seidel (no email)
Comment 2 2010-05-11 09:22:54 PDT
Very reminiscent of bug 28624 and bug 28845 (which are why we ended up disabling accelerated compositing in the first place). However we haven't seen any crashes on the commit-queue since re-enabling HW compositing, so perhaps these two tests timing out are not related.
Eric Seidel (no email)
Comment 3 2010-05-11 09:33:39 PDT
I was mistaken. HW compositing should have been re-enabled on the bot as of: http://trac.webkit.org/changeset/55118 which was committed 3 months ago, no? So it's possible these hangs are related to Simon's changes last night, or possible they're related to the new 7.6.6 QuickTime I installed on the bot last night.
Eric Seidel (no email)
Comment 4 2010-05-11 09:35:01 PDT
Something doesn't make sense though. If the DRT hack was removed 3 months ago, but I only finally updated the bot to a "safe" QuickTime last night, shouldn't we have seen crashes for the last 3 months if HW Compositing was actually being used on Leopard?
Simon Fraser (smfr)
Comment 5 2010-05-11 10:04:06 PDT
HW compositing is conditional on the QuickTime version, so your updating QT to 7.6.6 had the effect of enabling HW compositing.
Eric Seidel (no email)
Comment 6 2010-05-11 11:16:09 PDT
Failed often enough to cause a false rejection (means it failed once, passed once, and then failed again in a row). https://bugs.webkit.org/show_bug.cgi?id=38800#c6 @simon/eric: Is it possible for me to downgrade QuickTime? Any theories as to what could be hanging? I don't have any samples from the hangs. Should we skip these 2 tests on Leopard?
Eric Seidel (no email)
Comment 7 2010-05-11 11:26:47 PDT
Created attachment 55727 [details]
ridiculous video corruption of console.app seen on commit-queue bot

I generally leave Console.app open on the bot. This morning when I logged into it, there was some strange video corruption of the window backing store. I suspect this may be related to turning on HW compositing on the Leopard bot, and thus I've attached the image to this bug.
Simon Fraser (smfr)
Comment 8 2010-05-11 11:31:59 PDT
Eric: is this bot normally headless?
Eric Seidel (no email)
Comment 9 2010-05-11 11:34:34 PDT
No, but it was restarted headless last night. The monitor happened to be off. I'm also seeing kernel errors from timeouts relating to the NVIDIA driver on the cq bot.

5/11/10 10:15:20 AM kernel NVDA(OpenGL): Channel exception! status = 0xffff info32 = 0xd = GR: SW Notify Error
5/11/10 10:15:20 AM kernel 0000000c
5/11/10 10:15:20 AM kernel 00200000 00008297 00000472 00000000
5/11/10 10:15:20 AM kernel 0000047e 00000f04 b434a7a4 00000003
5/11/10 10:15:20 AM kernel 00000000 00000000 00000002
5/11/10 10:15:33 AM kernel NVChannel(Display): Graphics channel timeout!
5/11/10 10:15:45 AM kernel NVChannel(OpenGL): Graphics channel timeout!
5/11/10 10:15:45 AM kernel NVChannel(Display): Graphics channel timeout!
5/11/10 10:15:57 AM kernel NVChannel(Display): Graphics channel timeout!
5/11/10 10:15:57 AM kernel NVDA(OpenGL): Channel exception! status = 0xffff info32 = 0xd = GR: SW Notify Error
5/11/10 10:15:57 AM kernel 0000000c
5/11/10 10:15:57 AM kernel 00200000 00008297 00000472 00000000
5/11/10 10:15:57 AM kernel 0000047e 00001b0c 1000f010 00000003
5/11/10 10:15:57 AM kernel 00000000 00000000 00000000
5/11/10 10:25:46 AM kernel NVDA(OpenGL): Channel exception! status = 0xffff info32 = 0xd = GR: SW Notify Error
5/11/10 10:25:46 AM kernel 0000000c
5/11/10 10:25:46 AM kernel 00200000 00008297 00000472 00000000
5/11/10 10:25:46 AM kernel 0000047e 00001b0c 1000f010 00000003
5/11/10 10:25:46 AM kernel 00000000 00000000 00000000

These logs started on the first run of run-webkit-tests after upgrading QuickTime on the bot.
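To compare machines (for example one that logs these driver messages against one that does not), it may help to summarize the kernel log by event type. The following is a rough sketch, assuming the Console.app line format shown in the comment above; the regular expression and the function name summarize_driver_events are illustrative and not taken from any existing tool.

```python
import re
from collections import Counter

# Matches the two message kinds seen in the log, e.g.
# "kernel NVChannel(Display): Graphics channel timeout!".
# Hex dump continuation lines ("kernel 0000000c") do not match.
DRIVER_EVENT = re.compile(
    r"kernel (NVDA|NVChannel)\((\w+)\): "
    r"(Channel exception|Graphics channel timeout)!"
)


def summarize_driver_events(lines):
    """Count driver events, keyed by (source(channel), event kind)."""
    events = Counter()
    for line in lines:
        match = DRIVER_EVENT.search(line)
        if match:
            source, channel, kind = match.groups()
            events[(f"{source}({channel})", kind)] += 1
    return events


# Sample lines copied from the kernel log in this comment.
console_lines = [
    "5/11/10 10:15:20 AM kernel NVDA(OpenGL): Channel exception! "
    "status = 0xffff info32 = 0xd = GR: SW Notify Error",
    "5/11/10 10:15:20 AM kernel 0000000c",
    "5/11/10 10:15:33 AM kernel NVChannel(Display): Graphics channel timeout!",
    "5/11/10 10:15:45 AM kernel NVChannel(OpenGL): Graphics channel timeout!",
    "5/11/10 10:15:45 AM kernel NVChannel(Display): Graphics channel timeout!",
]

events = summarize_driver_events(console_lines)
for (origin, kind), count in sorted(events.items()):
    print(f"{origin} {kind}: {count}")
```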
Eric Seidel (no email)
Comment 10 2010-05-11 11:38:30 PDT
The cq bot has a monitor connected to it. I happened to restart it via VNC last night. The monitor is currently off and I haven't gone to find the bot and turn said monitor back on yet. If you believe these failures are caused by head-less-ness, I can restart the bot with the monitor turned on (or just simply turn the monitor on... whatever would affect Mac OS X's head-less-ness oddities).
Simon Fraser (smfr)
Comment 11 2010-05-11 11:47:35 PDT
It would be great to have a bugreporter.apple.com report with your System Configuration and the log, so we can hand it off to the graphics folks.
Eric Seidel (no email)
Comment 12 2010-05-11 11:59:43 PDT
Radar 7969612 filed. Please route accordingly.
Eric Seidel (no email)
Comment 13 2010-05-11 12:10:42 PDT
I've also turned on the monitor on the commit-queue machine and restarted it. We'll see if it behaves better now.
Eric Seidel (no email)
Comment 14 2010-05-11 12:14:17 PDT
Created attachment 55731 [details]
system profile from the commit-queue bot

I removed non-relevant information from the report.
Eric Seidel (no email)
Comment 15 2010-05-12 01:13:20 PDT
No. Restarting with the monitor turned on did not affect the behavior. Still seeing timeouts:

media/video-loop.html -> timed out
media/video-loop.html -> timed out
media/controls-drag-timebar.html -> timed out

And also some failures:

fast/canvas/canvas-arc-small-wide.html -> failed
fast/files/file-reader.html -> failed
fast/canvas/webgl/bug-32888.html -> failed
fast/canvas/webgl/bug-32888.html -> failed

I suspect at least the webgl failures may be related.
Eric Seidel (no email)
Comment 16 2010-05-12 01:15:10 PDT
One temporary solution would be to move the commit-queue to another server I have which has not yet updated to the latest QuickTime. Longer term we may have to re-disable HW compositing if it's not stable across Leopard machines. I'm very interested in your thoughts.
Eric Seidel (no email)
Comment 17 2010-05-12 08:47:18 PDT
Eric Seidel (no email)
Comment 18 2010-05-12 09:27:14 PDT
Do we know if the Leopard Release Buildbot has AX compositing enabled? I'm not seeing tests fail there. Perhaps we need to add some sort of check for the commit-queue's flavor of broken graphics card drivers and disable HW compositing in that case? (if this is believed to be a driver bug).
Simon Fraser (smfr)
Comment 19 2010-05-12 09:33:45 PDT
The leopard bots were all updated to QuickTime 7.6.6, so should all have accel. comp. enabled.
Eric Seidel (no email)
Comment 20 2010-05-12 09:54:08 PDT
Is there a way for me to verify that AX compositing is on on the build bots? Both of my Leopard machines fail AX compositing tests. I'm attempting to determine if I have bad graphics cards in both of them (likely) or if there is some larger problem with AX compositing on leopard.
Simon Fraser (smfr)
Comment 21 2010-05-12 09:58:09 PDT
> Is there a way for me to verify that AX compositing is on on the build bots

See if they run tests in LayoutTests/compositing
Eric Seidel (no email)
Comment 23 2010-05-12 10:25:07 PDT
Unfortunately I cannot switch to using my other mac leopard server due to WebGL failures: https://bugs.webkit.org/show_bug.cgi?id=37018 and one remaining media failure: https://bugs.webkit.org/show_bug.cgi?id=35271
Eric Seidel (no email)
Comment 24 2010-05-12 10:47:13 PDT
Seems the best solution would be to write a function similar to coreVideoHas7228836Fix in WebView.mm, except one which detects if the system has 7969612 (or whatever the final radar number ends up being for this seeming NVIDIA driver issue). Any suggestions from folks how I might detect which systems would not be affected by this? (Since clearly some Leopard installs are not hitting this).
Eric Seidel (no email)
Comment 25 2010-05-12 11:42:30 PDT
I've changed: /System/Library/Frameworks/CoreVideo.framework/Versions/Current/Resources/Info.plist on the bot to have CFBundleVersion of 47.6 instead of 48.6 (so it's less than 48) so that HW compositing should auto-disable in webkit. We'll see if that works around the troubles.
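The reason this edit works can be shown with a small sketch of the version gate. This is an assumption-laden illustration, not the real check: the actual logic lives in coreVideoHas7228836Fix in WebView.mm, and the only detail taken from this bug is that a CoreVideo CFBundleVersion below 48 disables HW compositing. The names bundle_major_version, MIN_COREVIDEO_MAJOR, and compositing_allowed are hypothetical.

```python
# Threshold taken from the comment above: a CoreVideo CFBundleVersion
# "less than 48" causes WebKit to disable HW compositing.
MIN_COREVIDEO_MAJOR = 48


def bundle_major_version(cf_bundle_version):
    """Return the integer major component of a version string like '48.6'."""
    return int(cf_bundle_version.split(".")[0])


def compositing_allowed(cf_bundle_version):
    """Hypothetical stand-in for the runtime CoreVideo version gate."""
    return bundle_major_version(cf_bundle_version) >= MIN_COREVIDEO_MAJOR


# The bot's real version passes the gate; the hacked one does not.
print(compositing_allowed("48.6"))
print(compositing_allowed("47.6"))
```

Comparing only the major component is a simplification; the real function may compare the version differently.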
Eric Seidel (no email)
Comment 26 2010-05-12 17:20:22 PDT
My hack to get WebKit to disable HW compositing worked. It checks the (hacked) CoreVideo version and turns it off correctly. However the same night that I upgraded the bot, some tests seem to have started failing with compositing off. So now that I have HW compositing back off on the commit-queue, the following tests always fail:

compositing/iframes/composited-parent-iframe.html -> failed
compositing/iframes/connect-compositing-iframe2.html -> failed
compositing/iframes/connect-compositing-iframe3.html -> failed
compositing/iframes/connect-compositing-iframe.html -> failed
compositing/iframes/enter-compositing-iframe.html -> failed

run-webkit-tests checks for the compiled symbol "RenderLayer" to determine if compositing is enabled:
http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitperl/features.pm#L71
http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/old-run-webkit-tests#L479

and thus if it should run the compositing tests:
http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/old-run-webkit-tests#L483

Thoughts?
Eric Seidel (no email)
Comment 27 2010-05-12 17:24:42 PDT
I suspect that: http://trac.webkit.org/changeset/59130 is what "broke" these compositing tests to not work with compositing off. Since run-webkit-tests uses a build-time check, and WebKit itself uses a runtime check I'm not sure how these worked before. The new tests dump "GraphicsLayers" which simply don't exist when HW compositing is turned off at runtime as far as I can tell. I guess the previous compositing tests did not depend on dumping GraphicsLayers and thus passed?
Simon Fraser (smfr)
Comment 28 2010-05-12 17:29:27 PDT
We should do what we do on Windows, which is to run DRT with --tell-me-what-you-support, and use the answer to disable tests.
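A rough sketch of how the answer from such a DRT query could drive test skipping follows. Everything about the format is an assumption: it supposes the tool prints one feature name per line, and the feature names and the REQUIRED_FEATURES mapping are hypothetical placeholders, not the actual output of DumpRenderTree.

```python
# Hypothetical mapping from a test directory to the runtime feature it
# needs. Both the directory choices and the feature names are illustrative.
REQUIRED_FEATURES = {
    "compositing": "AcceleratedCompositing",
    "fast/canvas/webgl": "3DRendering",
}


def parse_supported_features(output):
    """Parse assumed one-feature-per-line output into a set of names."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def directories_to_skip(supported):
    """Return test directories whose required feature is not supported."""
    return sorted(
        directory
        for directory, feature in REQUIRED_FEATURES.items()
        if feature not in supported
    )


# Example: a machine that reports 3D rendering but not compositing.
supported = parse_supported_features("3DRendering\n")
skipped = directories_to_skip(supported)
print(skipped)
```

The point of this design is that the skip decision uses the same runtime answer WebKit itself uses, rather than a build-time symbol check.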
Eric Seidel (no email)
Comment 29 2010-05-12 18:19:06 PDT
I think --print-supported-features is a good idea. I would like a shorter term solution since that would take a while. I could consider hacking the commit-queue to simply not run the compositing tests. I could also theoretically find some other hardware to run it on. Is pre-7.6.6 Leopard a supported configuration anymore? aka do we view http://trac.webkit.org/changeset/59130 as a regression such that we should consider skipping these tests on Leopard? Or is this likely only to affect the commit-queue?
Eric Seidel (no email)
Comment 30 2010-05-12 18:31:04 PDT
As a temporary hack, I'm going to try passing --ignore-tests compositing/iframe to run-webkit-tests on the commit-bot.
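To illustrate the effect of such an ignore entry, here is a minimal sketch. It assumes the ignore value is matched as a path prefix, which is a guess about how run-webkit-tests treats --ignore-tests; the helper names is_ignored and filter_tests are hypothetical.

```python
def is_ignored(test_path, ignored_prefixes):
    """Return True if the test path starts with any ignored prefix."""
    return any(test_path.startswith(prefix) for prefix in ignored_prefixes)


def filter_tests(test_paths, ignored_prefixes):
    """Return the tests that would still run after applying the ignores."""
    return [t for t in test_paths if not is_ignored(t, ignored_prefixes)]


# Test paths taken from elsewhere in this bug.
all_tests = [
    "compositing/iframes/composited-parent-iframe.html",
    "compositing/iframes/enter-compositing-iframe.html",
    "compositing/tiling/huge-layer-add-remove-child.html",
    "media/video-loop.html",
]

remaining = filter_tests(all_tests, ["compositing/iframe"])
print(remaining)
```

Under this assumption the prefix only covers the compositing/iframes tests, so other compositing tests and the media tests would still run.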
Eric Seidel (no email)
Comment 31 2010-05-12 19:38:37 PDT
Seems cmarrin already filed a bug asking for --print-supported-features on the Mac. Bug 35897. Bug 36925 is also related.
Eric Seidel (no email)
Comment 32 2010-05-13 11:34:47 PDT
Hacks were put in place to get the commit-queue running again: http://trac.webkit.org/changeset/59364 http://trac.webkit.org/changeset/59375 I'll remove the hack once we can land an alternate solution.
Eric Seidel (no email)
Comment 33 2010-05-17 13:24:44 PDT
I'm able to reproduce these timeouts on my other leopard machine as well. :(
Eric Seidel (no email)
Comment 34 2010-05-17 13:49:38 PDT
Created attachment 56268 [details] system profile from my other leopard machine seeing timeouts
Eric Seidel (no email)
Comment 35 2010-05-17 13:51:12 PDT
I don't see any NVIDIA log messages in the kernel log on my desktop leopard machine which also saw these failures.
Eric Seidel (no email)
Comment 36 2010-05-22 10:30:58 PDT
With compositing disabled on the commit-queue to work around this bug, now compositing/tiling/huge-layer-add-remove-child.html fails. :( I'll have to make another hack to skip that too. I think I'll just skip all of "compositing" since they don't seem to work reliably across machines due to these various bugs.
Eric Seidel (no email)
Comment 37 2010-08-25 13:32:44 PDT
Simon Fraser (smfr)
Comment 38 2010-08-25 13:51:39 PDT
Comment on attachment 65468 [details]
Patch

> + # media tests are also broken on mac leopard due to
> + # a separate CoreVideo bug which causes random crashes/hangs
> + # https://bugs.webkit.org/show_bug.cgi?id=38912
> + tests_to_ignore.append("media")

I don't like vague references to unidentified bugs. The primary Core Video bug was fixed in QuickTime 7.6.6, so if the machine has that version, but is still crashing, then we need a bug report.
Eric Seidel (no email)
Comment 39 2010-08-25 16:33:43 PDT
Comment on attachment 65468 [details]
Patch

I've provided Simon a bug report and crash logs in bug 44643. I'm happy to make further changes to this patch, but for now re-setting r? since I believe I've answered Simon's concern.
Simon Fraser (smfr)
Comment 40 2010-08-25 16:34:34 PDT
Comment on attachment 65468 [details]
Patch

The crashes are not the old Core Video bug, so r=me
Eric Seidel (no email)
Comment 41 2010-08-25 16:37:34 PDT
Comment on attachment 65468 [details]
Patch

Clearing flags on attachment: 65468

Committed r66053: <http://trac.webkit.org/changeset/66053>
Brent Fulgham
Comment 43 2014-01-09 21:02:22 PST
Code has already been landed.