Summary: | All Mac bots occasionally fail tons of media/ sputnik/ and svg/ tests | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Beth Dakin <bdakin> | ||||||
Component: | Tools / Tests | Assignee: | Nobody <webkit-unassigned> | ||||||
Status: | REOPENED --- | ||||||||
Severity: | Normal | CC: | bdakin, dpranke, eric.carlson, feature-media-reviews, rniwa, simon.fraser, thorton, tonikitoo, webkit-bug-importer, webkit.review.bot | ||||||
Priority: | P2 | Keywords: | InRadar, LayoutTestFailure, MakingBotsRed, Regression | ||||||
Version: | 528+ (Nightly build) | ||||||||
Hardware: | Mac | ||||||||
OS: | Unspecified | ||||||||
Attachments: |
|
Description
Beth Dakin
2012-04-06 16:29:38 PDT
It seems like generated results are being compared to the wrong expected results. Somehow, we're getting out of kilter. Are there example logs for this I can look at? ... Chromium DRT will actually echo the URL for the test back into the output, so there's code in new-run-webkit-tests that actually asserts if the output is for the wrong test. It doesn't look like there's anything equivalent for the non-Chromium ports, so debugging this might be harder than I would like :(. Has anything changed in the test harness in the last week that could lead to this problem? This is a really terrible issue, and I agree it's hard to debug, so maybe we can just figure out what caused it to start happening. I don't know how long these links will be good for, but here's an example: This test is failing: http://trac.webkit.org/export/113594/trunk/LayoutTests/media/unsupported-rtsp.html Here's the pretty diff: http://build.webkit.org/results/Lion%20Release%20(Tests)/r113594%20(7423)/media/unsupported-rtsp-pretty-diff.html The pretty diff is comparing the results of this test: http://trac.webkit.org/export/113594/trunk/LayoutTests/media/svg-as-image-with-media-blocked.html with the expected results of media/unsupported-rtsp.html, and it only thinks that media/unsupported-rtsp.html is failing because the generated results are from media/svg-as-image-with-media-blocked.html On the Lion Release Tests bot, this started between 113371 and 113376. <http://build.webkit.org/builders/Lion%20Release%20%28Tests%29?numbuilds=200> On Snow Leopard Release Tests, this started between 113363 and 113371. <http://build.webkit.org/builders/SnowLeopard%20Intel%20Release%20%28Tests%29?numbuilds=200> The only thing that's changed in the harness in that timeframe that looks even remotely related is the change to run *fewer* workers on the bots: https://bugs.webkit.org/show_bug.cgi?id=83076 It's hard to imagine that that would be a problem, though. I haven't gotten to take much of a look at this yet, will keep investigating. My guess is that there's one test that behaves badly (a media test?), and causes the off-by-one for all the other tests run in the same shard. (In reply to comment #7) > My guess is that there's one test that behaves badly (a media test?), and causes the off-by-one for all the other tests run in the same shard. I think Simon's hypothesis is a good one. I spot-checked a few logs (from builds 7423, 7428, and 7437 on the Lion Release bot) and they all appear to have one worker that starts failing every test starting with media/nodesFromRect-shadowContent.html, so it might be that test. If you then look at the results for 7437, http://build.webkit.org/results/Lion%20Release%20(Tests)/r113622%20(7437)/results.html , you can see that the -actual output for nodesFromRect-shadowContent is repeated in the next failing test, media/remove-from-document-before-load.html . And, interestingly, the -expected for that test says "PASS: A crash did not occur when the media element was removed before loading.". So, I'm guessing that that test is actually crashing, and we're not noticing it, and somehow the internal output buffers in NRWT aren't getting emptied, so we end up reusing the same output twice and everything goes to hell after that. Note that that is the test that is being moved in r113371, so I think we have a pretty good suspect :). Maybe try skipping that test and see if the problem goes away? There's almost certainly also a bug in NRWT that maybe I introduced a few weeks ago when I was reworking the crash log code, so I can start looking for that now that I have a theory, as well. Thanks for the analysis, Dirk! I will try skipping that test, and we'll see how it goes. Skipped with http://trac.webkit.org/changeset/113661 I would live to see a backtrace of this crash, if it is confirmed. I wonder if the above patch fixes things. We would have to unskip the test again and wait for a few bot cycles. Anyways, the changes below work fine on Gtk and Qt builds, and should go it regardless I would say. diff --git a/LayoutTests/media/nodesFromRect-shadowContent.html b/LayoutTests/media/nodesFromRect-shadowContent.html index d55610a..9fdaf7f 100644 --- a/LayoutTests/media/nodesFromRect-shadowContent.html +++ b/LayoutTests/media/nodesFromRect-shadowContent.html @@ -13,11 +13,6 @@ <script type="text/javascript" charset="utf-8"> function runTest() { - if (window.layoutTestController) { - layoutTestController.dumpAsText(); - layoutTestController.waitUntilDone(); - } - var e = {}; // Set up shortcut access to elements @@ -42,12 +37,7 @@ var shadow =['-webkit-media-controls-timeline-container', '-webkit-media-controls-play-button', '-webkit-media-controls-panel', '-webkit-media-controls']; checkShadowContent(clickX, clickY, 10, 10, 20, 20, shadow); - - if (window.layoutTestController) - layoutTestController.notifyDone(); } - - window.onload = runTest; </script> </head> <body id="body"> @@ -58,6 +48,7 @@ testExpected("video.controls", null, '!='); waitForEvent('canplaythrough', function () { runTest(); + endTest(); } ); video.src = findMediaFile("video", "content/test"); </script> diff --git a/LayoutTests/platform/gtk/media/nodesFromRect-shadowContent-expected.txt b/LayoutTests/platform/gtk/media/nodesFromRect-shadowContent-expected.txt index 390e73e..371f393 100644 --- a/LayoutTests/platform/gtk/media/nodesFromRect-shadowContent-expected.txt +++ b/LayoutTests/platform/gtk/media/nodesFromRect-shadowContent-expected.txt @@ -4,4 +4,5 @@ This test only runs in DRT! EXPECTED (video.controls != 'null') OK EVENT(canplaythrough) +END OF TEST Created attachment 136726 [details] (committed r113910, r=eric_carlson) speculative patch After moving the test to this new location, I accidentally left behind an unneeded line: "window.onload = runTest;". It is unneeded since now the test gets triggered by "waitForEvent('canplaythrough'" Also, remove unneeded LTC calls, since they are called by the helper scripts imported (e.g. video-test.js). Lets see if the Mac bots like this more... As well as fixing the test, we should fix the harness to ensure that bad tests can't trip it up. Please file another bug about the problem with the test harness if one doesn't already exist. This was that bug :( (In reply to comment #16) > This was that bug :( Right. Want me to hold on committing the test fix? (In reply to comment #14) > As well as fixing the test, we should fix the harness to ensure that bad tests can't trip it up. I will note that fixing things completely probably includes something like the #URL comment that chromium does ... also, we don't actually know for sure that there's a bug in NRWT at the moment; it could be a bug in DRT. But that's quibbling ... Ok, will push and leave the bug opened. Possible useful chat: <eric_carlson> tonikitoo: the changes look OK, but do you know what was causing so many test failures <eric_carlson> tonikitoo: calling notifyDone() twice? <tonikitoo> eric_carlson, maybe that, and also calling runTests twice? <tonikitoo> from window.onload= runtest and waitTillRead(runTest, ...) <eric_carlson> tonikitoo: oh, I maybe having a test continue to generate output after it claimed to have finished... <tonikitoo> that is my guess (In reply to comment #19) > Ok, will push and leave the bug opened. > > Possible useful chat: > > <eric_carlson> tonikitoo: the changes look OK, but do you know what was causing so many test failures > <eric_carlson> tonikitoo: calling notifyDone() twice? > <tonikitoo> eric_carlson, maybe that, and also calling runTests twice? > <tonikitoo> from window.onload= runtest and waitTillRead(runTest, ...) > <eric_carlson> tonikitoo: oh, I maybe having a test continue to generate output after it claimed to have finished... > <tonikitoo> that is my guess Wait, does this mean that calling runTest() from inside a test can make DRT generate two copies of the output? That seems bad. seems like the patch is still failing on Mac Lion. Investigating... (In reply to comment #21) > seems like the patch is still failing on Mac Lion. Investigating... It seems to need rebaseline: --- /Volumes/Data/slave/lion-intel-release-tests/build/layout-test-results/media/nodesFromRect-shadowContent-expected.txt +++ /Volumes/Data/slave/lion-intel-release-tests/build/layout-test-results/media/nodesFromRect-shadowContent-actual.txt @@ -3,4 +3,6 @@ This test only runs in DRT! EXPECTED (video.controls != 'null') OK +EVENT(canplaythrough) +END OF TEST http://build.webkit.org/results/Lion%20Release%20%28Tests%29/r113910%20%287504%29/media/nodesFromRect-shadowContent-diff.txt Created attachment 136826 [details]
rebaseline
Comment on attachment 136826 [details] rebaseline Clearing flags on attachment: 136826 Committed r113944: <http://trac.webkit.org/changeset/113944> All reviewed patches have been landed. Closing bug. . Has anyone seen this recently? Should this bug be closed? I think we're keeping this open to fix: "As well as fixing the test, we should fix the harness to ensure that bad tests can't trip it up." |