RESOLVED FIXED 36908
Several tests in fast/canvas/webgl/ failed randomly on Leopard Commit Bot
https://bugs.webkit.org/show_bug.cgi?id=36908
Summary Several tests in fast/canvas/webgl/ failed randomly on Leopard Commit Bot
Eric Seidel (no email)
Reported 2010-03-31 16:27:19 PDT
fast/canvas/webgl/index-validation.html failed on Leopard Commit Bot Twice actually. Somehow lightning struck twice. https://bugs.webkit.org/show_bug.cgi?id=36876#c3 /tmp/layout-test-results/fast/canvas/webgl/index-validation-actual.txt 44 55 PASS gl.getError() is 0 66 PASS gl.drawElements(gl.TRIANGLES, 3, gl.UNSIGNED_SHORT, 0) is undefined. 7 PASS gl.getError() is 0 7 FAIL gl.getError() should be 0. Was 1286. 88 PASS successfullyParsed is true 99 1010 TEST COMPLETE I suspect it may be related to the commit-bot's hardware?
Attachments
patch (1.15 KB, patch)
2010-04-01 10:30 PDT, Zhenyao Mo
no flags
patch: gather extra information about failure (2.18 KB, patch)
2010-04-02 10:20 PDT, Zhenyao Mo
no flags
patch (1.75 KB, patch)
2010-04-15 10:04 PDT, Zhenyao Mo
no flags
revised patch : responding to Ken Russell's review (1.83 KB, patch)
2010-04-15 10:11 PDT, Zhenyao Mo
abarth: review+
commit-queue: commit-queue-
revised patch: initialize members in the same order as they are declared (1.82 KB, patch)
2010-04-15 11:53 PDT, Zhenyao Mo
no flags
Eric Seidel (no email)
Comment 2 2010-03-31 16:29:01 PDT
I'm happy to send a full hardware report to anyone who might need one.
Eric Seidel (no email)
Comment 3 2010-03-31 16:31:14 PDT
Grepping the commit-queue logs, I see the commit-queue having failed this test 9 times. We just happen to hit two failures in a row while trying to land Adam's patch https://bugs.webkit.org/show_bug.cgi?id=36876#c3
Zhenyao Mo
Comment 4 2010-03-31 20:32:34 PDT
This happened once on my machine before, but not recently. I would suggest to use "antialias:false" in this test to resolve this random failure. However, we need to dig into the reason why this happens and find a fix for it.
Eric Seidel (no email)
Comment 5 2010-04-01 00:21:16 PDT
Eric Seidel (no email)
Comment 6 2010-04-01 00:50:05 PDT
Something happened today to make this failure much more common than previously.
Eric Seidel (no email)
Comment 7 2010-04-01 01:09:44 PDT
Looks like the first time the Leopard Commit Bot ever saw this test fail was around 3:30PM this afternoon.
Eric Seidel (no email)
Comment 8 2010-04-01 01:10:23 PDT
It was shortly after http://trac.webkit.org/changeset/56872, which is probably related. :)
Eric Seidel (no email)
Comment 9 2010-04-01 01:11:36 PDT
This test has failed 39 times on the Leopard Commit Bot since it first failed this afternoon!
Adam Barth
Comment 10 2010-04-01 09:14:18 PDT
This failure rate is unacceptable. Can we rollback that rev and see if it fixes the problem?
Kenneth Russell
Comment 11 2010-04-01 09:35:42 PDT
*Please* don't roll back that revision. It is a large one that we've been trying to land for weeks, and the failure is ultimately caused by driver bugs and not the code. Can we temporarily disable this test on this bot? We may have another workaround we can land in a couple of hours.
Adam Barth
Comment 12 2010-04-01 09:40:03 PDT
> Can we temporarily disable this test on this bot? We don't have a way to disable tests on the commit-queue specifically. We can disable it more generally however. > We may have another workaround we can land in a couple of hours. Ok. By way of context, this test is flaky enough that it often fails twice in a row. We have something like 13k tests. If an appreciable fraction of them failed this often, the entire test suite would be useless.
Kenneth Russell
Comment 13 2010-04-01 09:43:07 PDT
(In reply to comment #12) > > Can we temporarily disable this test on this bot? > > We don't have a way to disable tests on the commit-queue specifically. We can > disable it more generally however. Can we add it to the Skipped list for mac-leopard for the moment? > > We may have another workaround we can land in a couple of hours. > > Ok. > > By way of context, this test is flaky enough that it often fails twice in a > row. We have something like 13k tests. If an appreciable fraction of them > failed this often, the entire test suite would be useless. Understood. I apologize for the flakiness introduced by this checkin.
Zhenyao Mo
Comment 14 2010-04-01 10:30:47 PDT
Created attachment 52310 [details] patch This patch should stabilize the test on the bot. However, we need to dig deeper into the cause of this flaky behavior - very likely hardware/driver bugs - and find a better way to solve it. Will create another bug to track this down.
Zhenyao Mo
Comment 15 2010-04-01 10:43:46 PDT
Just created a new bug to track this issue, but the patch should solve it for now. https://bugs.webkit.org/show_bug.cgi?id=36972
Kenneth Russell
Comment 16 2010-04-01 10:44:15 PDT
Comment on attachment 52310 [details] patch Looks good to me.
Eric Seidel (no email)
Comment 17 2010-04-01 11:12:24 PDT
Comment on attachment 52310 [details] patch So is this work-around we intend to roll out eventually? Or is this a permanent change to this test? The ChangeLog doesn't really say. I'm fine with either of course.
Eric Seidel (no email)
Comment 18 2010-04-01 11:13:58 PDT
(In reply to comment #12) > By way of context, this test is flaky enough that it often fails twice in a > row. We have something like 13k tests. If an appreciable fraction of them > failed this often, the entire test suite would be useless. Actually, the statement is stronger. After Adam's changes to speed up the commit-queue yesterday (which happened right after this regression landed) it now requires a flaky test to fail *three* times in a row for it to cause the commit queue to wrongly reject a patch. :)
Kenneth Russell
Comment 19 2010-04-01 11:15:42 PDT
(In reply to comment #17) > (From update of attachment 52310 [details]) > So is this work-around we intend to roll out eventually? Or is this a > permanent change to this test? The ChangeLog doesn't really say. I'm fine with > either of course. Right now we don't know why this OpenGL error is reported on this test, so the change is basically permanent.
Eric Seidel (no email)
Comment 20 2010-04-01 11:25:14 PDT
Eric Seidel (no email)
Comment 21 2010-04-01 11:29:17 PDT
WebKit Commit Bot
Comment 22 2010-04-01 11:37:24 PDT
Comment on attachment 52310 [details] patch Clearing flags on attachment: 52310 Committed r56925: <http://trac.webkit.org/changeset/56925>
WebKit Commit Bot
Comment 23 2010-04-01 11:37:36 PDT
All reviewed patches have been landed. Closing bug.
Eric Seidel (no email)
Comment 24 2010-04-01 21:23:28 PDT
This test has still failed at least twice on the commit-queue since this fix went in. I can get you the failure diff.
Eric Seidel (no email)
Comment 25 2010-04-01 21:25:13 PDT
The failure: /tmp/layout-test-results/fast/canvas/webgl/index-validation-actual.txt 44 55 PASS gl.getError() is 0 66 PASS gl.drawElements(gl.TRIANGLES, 3, gl.UNSIGNED_SHORT, 0) is undefined. 7 PASS gl.getError() is 0 7 FAIL gl.getError() should be 0. Was 1286. 88 PASS successfullyParsed is true 99 1010 TEST COMPLETE
Zhenyao Mo
Comment 26 2010-04-02 07:56:34 PDT
After the previous patch, the only difference in this test before and after the multisample patch is the stencil buffer is on by default (before by default we only have color and depth buffer). I'll get another patch ready to turn off the stencil buffer for this test shortly.
Zhenyao Mo
Comment 27 2010-04-02 08:02:32 PDT
Also, I am curious, after the previous patch in which I turned off antialias for this test, does the frequency of this random failure reduced on the bot or still the same?
Eric Seidel (no email)
Comment 28 2010-04-02 10:02:45 PDT
It appears to be greatly reduced. There have been 9 failure on the bot since last night. Yesterday afternoon we had 66. The bot may simply be doing less now... but I think the failure incidence was reduced, but not eliminated by your previous change.
Zhenyao Mo
Comment 29 2010-04-02 10:20:51 PDT
Created attachment 52421 [details] patch: gather extra information about failure This patch is an attempt to gather further information about the cause of this failure (sorry, but seems like the bot is the only machine that we get this failure at the moment). With this patch, the failure behavior won't be better or worse.
Zhenyao Mo
Comment 30 2010-04-02 10:22:42 PDT
Eric, when this patch lands, and the bot failed this test again, could you send the output to me? Then I will upload another patch to fix the failure problem (hopefully).
Kenneth Russell
Comment 31 2010-04-02 10:24:11 PDT
Comment on attachment 52421 [details] patch: gather extra information about failure Looks good to me.
Eric Seidel (no email)
Comment 32 2010-04-02 10:27:37 PDT
Comment on attachment 52421 [details] patch: gather extra information about failure OK.
WebKit Commit Bot
Comment 33 2010-04-02 12:22:00 PDT
Comment on attachment 52421 [details] patch: gather extra information about failure Clearing flags on attachment: 52421 Committed r57015: <http://trac.webkit.org/changeset/57015>
WebKit Commit Bot
Comment 34 2010-04-02 12:22:07 PDT
All reviewed patches have been landed. Closing bug.
Zhenyao Mo
Comment 35 2010-04-15 09:59:03 PDT
These tests are still failing on the commit-bot.
Zhenyao Mo
Comment 36 2010-04-15 10:04:07 PDT
Kenneth Russell
Comment 37 2010-04-15 10:07:29 PDT
The patch still has an OOPS about no new tests. Also, I think it's worth separately mentioning in the ChangeLog that this fixes an uninitialized variable bug and restores a glFinish call that used to be present.
Zhenyao Mo
Comment 38 2010-04-15 10:11:23 PDT
Created attachment 53445 [details] revised patch : responding to Ken Russell's review
Kenneth Russell
Comment 39 2010-04-15 10:13:02 PDT
Looks good to me.
Adam Barth
Comment 40 2010-04-15 11:06:15 PDT
Comment on attachment 53445 [details] revised patch : responding to Ken Russell's review Me too.
WebKit Commit Bot
Comment 41 2010-04-15 11:37:14 PDT
Comment on attachment 53445 [details] revised patch : responding to Ken Russell's review Rejecting patch 53445 from commit-queue. Failed to run "['WebKitTools/Scripts/build-webkit', '--debug']" exit_code: 1 Last 500 characters of output: n/yacc /bin/sh -c /Users/eseidel/Projects/CommitQueue/WebKitBuild/WebCore.build/Debug/WebCore.build/Script-5DF50887116F3077005202AB.sh ** BUILD FAILED ** The following build commands failed: WebCore: Distributed-CompileC /Users/eseidel/Projects/CommitQueue/WebKitBuild/WebCore.build/Debug/WebCore.build/Objects-normal/i386/GraphicsContext3DMac.o /Users/eseidel/Projects/CommitQueue/WebCore/platform/graphics/mac/GraphicsContext3DMac.cpp normal i386 c++ com.apple.compilers.gcc.4_2 (1 failure) Full output: http://webkit-commit-queue.appspot.com/results/1711103
Zhenyao Mo
Comment 42 2010-04-15 11:53:23 PDT
Created attachment 53458 [details] revised patch: initialize members in the same order as they are declared
Kenneth Russell
Comment 43 2010-04-15 11:56:16 PDT
LGTM
Adam Barth
Comment 44 2010-04-15 12:01:19 PDT
Comment on attachment 53458 [details] revised patch: initialize members in the same order as they are declared Ah, the old "order of initialization" check. :)
WebKit Commit Bot
Comment 45 2010-04-15 13:08:00 PDT
Comment on attachment 53458 [details] revised patch: initialize members in the same order as they are declared Clearing flags on attachment: 53458 Committed r57664: <http://trac.webkit.org/changeset/57664>
WebKit Commit Bot
Comment 46 2010-04-15 13:08:08 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.