REGRESSION: media/video-loop.html is timing out on the commit-queue Leopard Bot
% grep -i timed commit-queue.log
media/video-loop.html -> timed out
1 test case (<1%) timed out
media/video-loop.html -> timed out
1 test case (<1%) timed out
media/controls-drag-timebar.html -> timed out
1 test case (<1%) timed out
media/video-loop.html -> timed out
1 test case (<1%) timed out
media/video-loop.html -> timed out
1 test case (<1%) timed out
media/video-loop.html -> timed out
1 test case (<1%) timed out
The first timeout occurred shortly after 2010-05-10 22:05. I upgraded the Qt version on the bot last night (to whatever the latest Leopard software update is), and Simon turned on accelerated compositing for Leopard. One or both of those is likely related.
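For anyone who wants a per-test tally rather than raw grep output, here is a minimal sketch. It assumes the log has one entry per line in the form shown above; the function name `count_timeouts` is made up for illustration and is not part of any WebKit tool.

```python
from collections import Counter


def count_timeouts(log_text):
    """Count how often each test is reported as timed out in a log."""
    counts = Counter()
    marker = "-> timed out"
    for line in log_text.splitlines():
        line = line.strip()
        # Only per-test lines end with the marker; summary lines such as
        # "1 test case (<1%) timed out" do not contain "->" and are skipped.
        if line.endswith(marker):
            test_name = line[: -len(marker)].strip()
            counts[test_name] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "media/video-loop.html -> timed out\n"
        "1 test case (<1%) timed out\n"
        "media/controls-drag-timebar.html -> timed out\n"
        "1 test case (<1%) timed out\n"
    )
    for test, hits in count_timeouts(sample).most_common():
        print(test, hits)
```

A tally like this makes it easier to see whether one test dominates the timeouts or whether they are spread across the media directory.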
Sorry, by Qt I meant QT/QuickTime above. The version installed on the bot after last night is 7.6.6.
Very reminiscent of bug 28624 and bug 28845 (which are why we ended up disabling accelerated compositing in the first place). However we haven't seen any crashes on the commit-queue since re-enabling HW compositing, so perhaps these two tests timing out are not related.
I was mistaken. HW compositing should have been re-enabled on the bot as of: http://trac.webkit.org/changeset/55118 which was committed 3 months ago, no? So it's possible these hangs are related to Simon's changes last night, or possibly they're related to the new QuickTime 7.6.6 I installed on the bot last night.
Something doesn't make sense though. If the DRT hack was removed 3 months ago, but I only finally updated the bot to a "safe" QuickTime last night, shouldn't we have seen crashes for the last 3 months if HW compositing was actually being used on Leopard?
HW compositing is conditional on the QuickTime version, so your updating QT to 7.6.6 had the effect of enabling HW compositing.
Failed often enough to cause a false rejection (meaning it failed once, passed once, and then failed again in a row). https://bugs.webkit.org/show_bug.cgi?id=38800#c6 @simon/eric: Is it possible for me to downgrade QuickTime? Any theories as to what could be hanging? I don't have any samples from the hangs. Should we skip these 2 tests on Leopard?
Created attachment 55727 [details] ridiculous video corruption of Console.app seen on commit-queue bot I generally leave Console.app open on the bot. This morning when I logged into it, there was some strange video corruption of the window backing store. I suspect this may be related to turning on HW compositing on the Leopard bot, and thus I've attached the image to this bug.
Eric: is this bot normally headless?
No, but it was restarted headless last night. The monitor happened to be off. I'm also seeing kernel errors from timeouts relating to the NVIDIA driver on the cq bot.
5/11/10 10:15:20 AM kernel NVDA(OpenGL): Channel exception! status = 0xffff info32 = 0xd = GR: SW Notify Error
5/11/10 10:15:20 AM kernel 0000000c
5/11/10 10:15:20 AM kernel 00200000 00008297 00000472 00000000
5/11/10 10:15:20 AM kernel 0000047e 00000f04 b434a7a4 00000003
5/11/10 10:15:20 AM kernel 00000000 00000000 00000002
5/11/10 10:15:33 AM kernel NVChannel(Display): Graphics channel timeout!
5/11/10 10:15:45 AM kernel NVChannel(OpenGL): Graphics channel timeout!
5/11/10 10:15:45 AM kernel NVChannel(Display): Graphics channel timeout!
5/11/10 10:15:57 AM kernel NVChannel(Display): Graphics channel timeout!
5/11/10 10:15:57 AM kernel NVDA(OpenGL): Channel exception! status = 0xffff info32 = 0xd = GR: SW Notify Error
5/11/10 10:15:57 AM kernel 0000000c
5/11/10 10:15:57 AM kernel 00200000 00008297 00000472 00000000
5/11/10 10:15:57 AM kernel 0000047e 00001b0c 1000f010 00000003
5/11/10 10:15:57 AM kernel 00000000 00000000 00000000
5/11/10 10:25:46 AM kernel NVDA(OpenGL): Channel exception! status = 0xffff info32 = 0xd = GR: SW Notify Error
5/11/10 10:25:46 AM kernel 0000000c
5/11/10 10:25:46 AM kernel 00200000 00008297 00000472 00000000
5/11/10 10:25:46 AM kernel 0000047e 00001b0c 1000f010 00000003
5/11/10 10:25:46 AM kernel 00000000 00000000 00000000
These logs started on the first run of run-webkit-tests after upgrading QuickTime on the bot.
The cq bot has a monitor connected to it. I happened to restart it via VNC last night. The monitor is currently off and I haven't gone to find the bot and turn said monitor back on yet. If you believe these failures are caused by headlessness, I can restart the bot with the monitor turned on (or just simply turn the monitor on... whatever would affect Mac OS X's headlessness oddities).
It would be great to have a bugreporter.apple.com report with your System Configuration and the log, so we can hand it off to the graphics folks.
Radar 7969612 filed. Please route accordingly.
I've also turned on the monitor on the commit-queue machine and restarted it. We'll see if it behaves better now.
Created attachment 55731 [details] system profile from the commit-queue bot I removed non-relevant information from the report.
No. Restarting with the monitor turned on did not affect the behavior. Still seeing timeouts:
media/video-loop.html -> timed out
media/video-loop.html -> timed out
media/controls-drag-timebar.html -> timed out
And also some failures:
fast/canvas/canvas-arc-small-wide.html -> failed
fast/files/file-reader.html -> failed
fast/canvas/webgl/bug-32888.html -> failed
fast/canvas/webgl/bug-32888.html -> failed
I suspect at least the WebGL failures may be related.
One temporary solution would be to move the commit-queue to another server I have which has not yet been updated to the latest QuickTime. Longer term we may have to re-disable HW compositing if it's not stable across Leopard machines. I'm very interested in your thoughts.
Claimed another victim: https://bugs.webkit.org/show_bug.cgi?id=38905#c5
Do we know if the Leopard Release Buildbot has AX compositing enabled? I'm not seeing tests fail there. Perhaps we need to add some sort of check for the commit-queue's flavor of broken graphics card drivers and disable HW compositing in that case? (if this is believed to be a driver bug).
The leopard bots were all updated to QuickTime 7.6.6, so should all have accel. comp. enabled.
Is there a way for me to verify that AX compositing is on on the build bots? Both of my Leopard machines fail AX compositing tests. I'm attempting to determine if I have bad graphics cards in both of them (likely) or if there is some larger problem with AX compositing on leopard.
> Is there a way for me to verify that AX compositing is on on the build bots See if they run tests in LayoutTests/compositing
They both run compositing/, yes:
http://build.webkit.org/builders/Leopard%20Intel%20Release%20(Tests)/builds/14494/steps/layout-test/logs/stdio
http://build.webkit.org/builders/Leopard%20Intel%20Debug%20(Tests)/builds/14279/steps/layout-test/logs/stdio
Thanks.
Unfortunately I cannot switch to using my other mac leopard server due to WebGL failures: https://bugs.webkit.org/show_bug.cgi?id=37018 and one remaining media failure: https://bugs.webkit.org/show_bug.cgi?id=35271
Seems the best solution would be to write a function similar to coreVideoHas7228836Fix in WebView.mm, except one which detects whether the system has 7969612 (or whatever the final radar number ends up being for this seeming NVIDIA driver issue). Any suggestions from folks on how I might detect which systems would not be affected by this? (Since clearly some Leopard installs are not hitting this.)
I've changed: /System/Library/Frameworks/CoreVideo.framework/Versions/Current/Resources/Info.plist on the bot to have CFBundleVersion of 47.6 instead of 48.6 (so it's less than 48) so that HW compositing should auto-disable in webkit. We'll see if that works around the troubles.
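To make the version trick concrete, here is a rough sketch of the kind of comparison being relied on. It is not the code in WebView.mm; the helper names are invented for illustration, and the threshold of 48 is taken only from the "less than 48" remark above.

```python
import plistlib


def bundle_version_from_plist(path):
    """Read CFBundleVersion from an Info.plist file (XML or binary)."""
    with open(path, "rb") as plist_file:
        return plistlib.load(plist_file)["CFBundleVersion"]


def major_version(bundle_version):
    """Return the leading integer component, e.g. '47.6' -> 47."""
    return int(bundle_version.split(".")[0])


def compositing_allowed(bundle_version, minimum_major=48):
    """Assumed rule: allow HW compositing only at or above minimum_major."""
    return major_version(bundle_version) >= minimum_major


if __name__ == "__main__":
    for version in ("47.6", "48.6"):
        print(version, compositing_allowed(version))
```

Under that assumed rule, the edited value 47.6 falls below the threshold and the original 48.6 does not, which is why changing the plist is expected to switch compositing off.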
My hack to get WebKit to disable HW compositing worked. It checks the (hacked) CoreVideo version and turns it off correctly. However the same night that I upgraded the bot, some tests seem to have started failing with compositing off. So now that I have HW compositing back off on the commit-queue, the following tests always fail:
compositing/iframes/composited-parent-iframe.html -> failed
compositing/iframes/connect-compositing-iframe2.html -> failed
compositing/iframes/connect-compositing-iframe3.html -> failed
compositing/iframes/connect-compositing-iframe.html -> failed
compositing/iframes/enter-compositing-iframe.html -> failed
run-webkit-tests checks for the compiled symbol "RenderLayer" to determine if compositing is enabled:
http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/webkitperl/features.pm#L71
http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/old-run-webkit-tests#L479
and thus if it should run the compositing tests:
http://trac.webkit.org/browser/trunk/WebKitTools/Scripts/old-run-webkit-tests#L483
Thoughts?
I suspect that http://trac.webkit.org/changeset/59130 is what "broke" these compositing tests so that they no longer work with compositing off. Since run-webkit-tests uses a build-time check and WebKit itself uses a runtime check, I'm not sure how these worked before. The new tests dump "GraphicsLayers", which simply don't exist when HW compositing is turned off at runtime, as far as I can tell. I guess the previous compositing tests did not depend on dumping GraphicsLayers and thus passed?
We should do what we do on Windows, which is to run DRT with --tell-me-what-you-support, and use the answer to disable tests.
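A rough sketch of how the test runner could use such an answer follows. Everything specific here is an assumption: the output format (whitespace-separated feature names), the feature names themselves, and the mapping to test directories are all hypothetical and are not taken from DRT.

```python
# Hypothetical mapping from a reported feature name to the test directory
# that only makes sense when that feature is available at runtime.
FEATURE_TO_TEST_DIR = {
    "AcceleratedCompositing": "compositing",
    "3DRendering": "transforms/3d",
}


def parse_supported_features(output):
    """Assume the tool prints feature names separated by whitespace."""
    return set(output.split())


def directories_to_skip(supported, feature_to_dir=FEATURE_TO_TEST_DIR):
    """Return the test directories whose required feature is missing."""
    return sorted(
        test_dir
        for feature, test_dir in feature_to_dir.items()
        if feature not in supported
    )


if __name__ == "__main__":
    supported = parse_supported_features("3DRendering")
    print(directories_to_skip(supported))
```

The point of asking the binary at runtime is that the answer reflects what the machine actually enabled, which a build-time symbol check cannot know.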
I think --print-supported-features is a good idea. I would like a shorter-term solution since that would take a while. I could consider hacking the commit-queue to simply not run the compositing tests. I could also theoretically find some other hardware to run it on. Is pre-7.6.6 Leopard a supported configuration anymore? That is, do we view http://trac.webkit.org/changeset/59130 as a regression such that we should consider skipping these tests on Leopard? Or is this likely only to affect the commit-queue?
As a temporary hack, I'm going to try passing --ignore-tests compositing/iframe to run-webkit-tests on the commit-bot.
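For illustration, a small sketch of assembling that argument from a list of directories. The `--ignore-tests` flag name comes from the comment above; passing several entries as one comma-separated value is an assumption about how the script parses it, and `ignore_tests_args` is a made-up helper name.

```python
def ignore_tests_args(tests_to_ignore):
    """Build extra run-webkit-tests arguments for tests that should be skipped.

    Assumption: several entries are passed as one comma-separated value.
    """
    if not tests_to_ignore:
        return []
    return ["--ignore-tests", ",".join(tests_to_ignore)]


if __name__ == "__main__":
    args = ["run-webkit-tests"] + ignore_tests_args(["compositing/iframe"])
    print(" ".join(args))
```

Keeping the list in one place makes it easy to add or drop directories as the underlying bugs get fixed.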
Seems cmarrin already filed a bug asking for --print-supported-features on the Mac. Bug 35897. Bug 36925 is also related.
Hacks were put in place to get the commit-queue running again: http://trac.webkit.org/changeset/59364 http://trac.webkit.org/changeset/59375 I'll remove the hack once we can land an alternate solution.
I'm able to reproduce these timeouts on my other leopard machine as well. :(
Created attachment 56268 [details] system profile from my other leopard machine seeing timeouts
I don't see any NVIDIA log messages in the kernel log on my desktop leopard machine which also saw these failures.
With compositing disabled on the commit-queue to work around this bug, now compositing/tiling/huge-layer-add-remove-child.html fails. :( I'll have to make another hack to skip that too. I think I'll just skip all of "compositing" since they don't seem to work reliably across machines due to these various bugs.
Created attachment 65468 [details] Patch
Comment on attachment 65468 [details] Patch
> + # media tests are also broken on mac leopard due to
> + # a separate CoreVideo bug which causes random crashes/hangs
> + # https://bugs.webkit.org/show_bug.cgi?id=38912
> + tests_to_ignore.append("media")
I don't like vague references to unidentified bugs. The primary Core Video bug was fixed in QuickTime 7.6.6, so if the machine has that version, but is still crashing, then we need a bug report.
Comment on attachment 65468 [details] Patch I've provided Simon a bug report and crash logs in bug 44643. I'm happy to make further changes to this patch, but for now re-setting r? since I believe I've answered Simon's concern.
Comment on attachment 65468 [details] Patch The crashes are not the old Core Video bug, so r=me
Comment on attachment 65468 [details] Patch Clearing flags on attachment: 65468 Committed r66053: <http://trac.webkit.org/changeset/66053>
http://trac.webkit.org/changeset/66053 might have broken Leopard Intel Release (Tests)
The following changes are on the blame list:
http://trac.webkit.org/changeset/66056
http://trac.webkit.org/changeset/66057
http://trac.webkit.org/changeset/66058
http://trac.webkit.org/changeset/66060
http://trac.webkit.org/changeset/66053
Code has already been landed.