REOPENED 83411
All Mac bots occasionally fail tons of media/ sputnik/ and svg/ tests
https://bugs.webkit.org/show_bug.cgi?id=83411
Summary All Mac bots occasionally fail tons of media/ sputnik/ and svg/ tests
Beth Dakin
Reported 2012-04-06 16:29:38 PDT
Starting around the middle of the day on April 5th, the Mac bots started occasionally failing tons of tests, primarily in media/ sputnik/ and svg/. There will be a number of good runs with the expected number of failures right now (0-10 on Lion bots), and then one or two runs with 500+ failures.
Attachments
(committed r113910, r=eric_carlson) speculative patch (3.78 KB, patch)
2012-04-11 12:43 PDT, Antonio Gomes
no flags
rebaseline (1.31 KB, patch)
2012-04-11 22:29 PDT, Antonio Gomes
no flags
Radar WebKit Bug Importer
Comment 1 2012-04-06 16:30:14 PDT
Beth Dakin
Comment 2 2012-04-06 16:51:25 PDT
It seems like generated results are being compared to the wrong expected results. Somehow, we're getting out of kilter.
Dirk Pranke
Comment 3 2012-04-06 18:10:36 PDT
Are there example logs for this I can look at? ... Chromium DRT will actually echo the URL for the test back into the output, so there's code in new-run-webkit-tests that actually asserts if the output is for the wrong test. It doesn't look like there's anything equivalent for the non-Chromium ports, so debugging this might be harder than I would like :(.
Beth Dakin
Comment 4 2012-04-09 11:49:54 PDT
Has anything changed in the test harness in the last week that could lead to this problem? This is a really terrible issue, and I agree it's hard to debug, so maybe we can just figure out what caused it to start happening. I don't know how long these links will be good for, but here's an example: This test is failing: http://trac.webkit.org/export/113594/trunk/LayoutTests/media/unsupported-rtsp.html Here's the pretty diff: http://build.webkit.org/results/Lion%20Release%20(Tests)/r113594%20(7423)/media/unsupported-rtsp-pretty-diff.html The pretty diff is comparing the results of this test: http://trac.webkit.org/export/113594/trunk/LayoutTests/media/svg-as-image-with-media-blocked.html with the expected results of media/unsupported-rtsp.html, and it only thinks that media/unsupported-rtsp.html is failing because the generated results are from media/svg-as-image-with-media-blocked.html
Simon Fraser (smfr)
Comment 5 2012-04-09 11:53:04 PDT
On the Lion Release Tests bot, this started between 113371 and 113376. <http://build.webkit.org/builders/Lion%20Release%20%28Tests%29?numbuilds=200> On Snow Leopard Release Tests, this started between 113363 and 113371. <http://build.webkit.org/builders/SnowLeopard%20Intel%20Release%20%28Tests%29?numbuilds=200>
Dirk Pranke
Comment 6 2012-04-09 17:31:12 PDT
The only thing that's changed in the harness in that timeframe that looks even remotely related is the change to run *fewer* workers on the bots: https://bugs.webkit.org/show_bug.cgi?id=83076 It's hard to imagine that that would be a problem, though. I haven't gotten to take much of a look at this yet, will keep investigating.
Simon Fraser (smfr)
Comment 7 2012-04-09 17:41:28 PDT
My guess is that there's one test that behaves badly (a media test?), and causes the off-by-one for all the other tests run in the same shard.
Dirk Pranke
Comment 8 2012-04-09 18:41:41 PDT
(In reply to comment #7) > My guess is that there's one test that behaves badly (a media test?), and causes the off-by-one for all the other tests run in the same shard. I think Simon's hypothesis is a good one. I spot-checked a few logs (from builds 7423, 7428, and 7437 on the Lion Release bot) and they all appear to have one worker that starts failing every test starting with media/nodesFromRect-shadowContent.html, so it might be that test. If you then look at the results for 7437, http://build.webkit.org/results/Lion%20Release%20(Tests)/r113622%20(7437)/results.html , you can see that the -actual output for nodesFromRect-shadowContent is repeated in the next failing test, media/remove-from-document-before-load.html . And, interestingly, the -expected for that test says "PASS: A crash did not occur when the media element was removed before loading.". So, I'm guessing that that test is actually crashing, and we're not noticing it, and somehow the internal output buffers in NRWT aren't getting emptied, so we end up reusing the same output twice and everything goes to hell after that. Note that that is the test that is being moved in r113371, so I think we have a pretty good suspect :). Maybe try skipping that test and see if the problem goes away? There's almost certainly also a bug in NRWT that maybe I introduced a few weeks ago when I was reworking the crash log code, so I can start looking for that now that I have a theory, as well.
Beth Dakin
Comment 9 2012-04-09 19:09:00 PDT
Thanks for the analysis, Dirk! I will try skipping that test, and we'll see how it goes.
Beth Dakin
Comment 10 2012-04-09 19:14:11 PDT
Antonio Gomes
Comment 11 2012-04-09 20:17:59 PDT
I would live to see a backtrace of this crash, if it is confirmed.
Antonio Gomes
Comment 12 2012-04-10 13:03:30 PDT
I wonder if the above patch fixes things. We would have to unskip the test again and wait for a few bot cycles. Anyways, the changes below work fine on Gtk and Qt builds, and should go it regardless I would say. diff --git a/LayoutTests/media/nodesFromRect-shadowContent.html b/LayoutTests/media/nodesFromRect-shadowContent.html index d55610a..9fdaf7f 100644 --- a/LayoutTests/media/nodesFromRect-shadowContent.html +++ b/LayoutTests/media/nodesFromRect-shadowContent.html @@ -13,11 +13,6 @@ <script type="text/javascript" charset="utf-8"> function runTest() { - if (window.layoutTestController) { - layoutTestController.dumpAsText(); - layoutTestController.waitUntilDone(); - } - var e = {}; // Set up shortcut access to elements @@ -42,12 +37,7 @@ var shadow =['-webkit-media-controls-timeline-container', '-webkit-media-controls-play-button', '-webkit-media-controls-panel', '-webkit-media-controls']; checkShadowContent(clickX, clickY, 10, 10, 20, 20, shadow); - - if (window.layoutTestController) - layoutTestController.notifyDone(); } - - window.onload = runTest; </script> </head> <body id="body"> @@ -58,6 +48,7 @@ testExpected("video.controls", null, '!='); waitForEvent('canplaythrough', function () { runTest(); + endTest(); } ); video.src = findMediaFile("video", "content/test"); </script> diff --git a/LayoutTests/platform/gtk/media/nodesFromRect-shadowContent-expected.txt b/LayoutTests/platform/gtk/media/nodesFromRect-shadowContent-expected.txt index 390e73e..371f393 100644 --- a/LayoutTests/platform/gtk/media/nodesFromRect-shadowContent-expected.txt +++ b/LayoutTests/platform/gtk/media/nodesFromRect-shadowContent-expected.txt @@ -4,4 +4,5 @@ This test only runs in DRT! EXPECTED (video.controls != 'null') OK EVENT(canplaythrough) +END OF TEST
Antonio Gomes
Comment 13 2012-04-11 12:43:44 PDT
Created attachment 136726 [details] (committed r113910, r=eric_carlson) speculative patch After moving the test to this new location, I accidentally left behind an unneeded line: "window.onload = runTest;". It is unneeded since now the test gets triggered by "waitForEvent('canplaythrough'" Also, remove unneeded LTC calls, since they are called by the helper scripts imported (e.g. video-test.js). Lets see if the Mac bots like this more...
Simon Fraser (smfr)
Comment 14 2012-04-11 13:01:09 PDT
As well as fixing the test, we should fix the harness to ensure that bad tests can't trip it up.
Eric Carlson
Comment 15 2012-04-11 13:47:46 PDT
Please file another bug about the problem with the test harness if one doesn't already exist.
Simon Fraser (smfr)
Comment 16 2012-04-11 13:49:30 PDT
This was that bug :(
Antonio Gomes
Comment 17 2012-04-11 13:51:21 PDT
(In reply to comment #16) > This was that bug :( Right. Want me to hold on committing the test fix?
Dirk Pranke
Comment 18 2012-04-11 14:11:46 PDT
(In reply to comment #14) > As well as fixing the test, we should fix the harness to ensure that bad tests can't trip it up. I will note that fixing things completely probably includes something like the #URL comment that chromium does ... also, we don't actually know for sure that there's a bug in NRWT at the moment; it could be a bug in DRT. But that's quibbling ...
Antonio Gomes
Comment 19 2012-04-11 14:20:35 PDT
Ok, will push and leave the bug opened. Possible useful chat: <eric_carlson> tonikitoo: the changes look OK, but do you know what was causing so many test failures <eric_carlson> tonikitoo: calling notifyDone() twice? <tonikitoo> eric_carlson, maybe that, and also calling runTests twice? <tonikitoo> from window.onload= runtest and waitTillRead(runTest, ...) <eric_carlson> tonikitoo: oh, I maybe having a test continue to generate output after it claimed to have finished... <tonikitoo> that is my guess
Dirk Pranke
Comment 20 2012-04-11 16:09:10 PDT
(In reply to comment #19) > Ok, will push and leave the bug opened. > > Possible useful chat: > > <eric_carlson> tonikitoo: the changes look OK, but do you know what was causing so many test failures > <eric_carlson> tonikitoo: calling notifyDone() twice? > <tonikitoo> eric_carlson, maybe that, and also calling runTests twice? > <tonikitoo> from window.onload= runtest and waitTillRead(runTest, ...) > <eric_carlson> tonikitoo: oh, I maybe having a test continue to generate output after it claimed to have finished... > <tonikitoo> that is my guess Wait, does this mean that calling runTest() from inside a test can make DRT generate two copies of the output? That seems bad.
Antonio Gomes
Comment 21 2012-04-11 21:28:33 PDT
seems like the patch is still failing on Mac Lion. Investigating...
Antonio Gomes
Comment 22 2012-04-11 21:48:39 PDT
(In reply to comment #21) > seems like the patch is still failing on Mac Lion. Investigating... It seems to need rebaseline: --- /Volumes/Data/slave/lion-intel-release-tests/build/layout-test-results/media/nodesFromRect-shadowContent-expected.txt +++ /Volumes/Data/slave/lion-intel-release-tests/build/layout-test-results/media/nodesFromRect-shadowContent-actual.txt @@ -3,4 +3,6 @@ This test only runs in DRT! EXPECTED (video.controls != 'null') OK +EVENT(canplaythrough) +END OF TEST http://build.webkit.org/results/Lion%20Release%20%28Tests%29/r113910%20%287504%29/media/nodesFromRect-shadowContent-diff.txt
Antonio Gomes
Comment 23 2012-04-11 22:29:26 PDT
Created attachment 136826 [details] rebaseline
WebKit Review Bot
Comment 24 2012-04-11 23:05:45 PDT
Comment on attachment 136826 [details] rebaseline Clearing flags on attachment: 136826 Committed r113944: <http://trac.webkit.org/changeset/113944>
WebKit Review Bot
Comment 25 2012-04-11 23:05:52 PDT
All reviewed patches have been landed. Closing bug.
Antonio Gomes
Comment 26 2012-04-12 05:39:02 PDT
.
Tim Horton
Comment 27 2012-09-06 18:32:14 PDT
Has anyone seen this recently? Should this bug be closed?
Simon Fraser (smfr)
Comment 28 2012-09-07 11:02:16 PDT
I think we're keeping this open to fix: "As well as fixing the test, we should fix the harness to ensure that bad tests can't trip it up."
Note You need to log in before you can comment on or make changes to this bug.